Projects

Create audiobooks, podcasts, voiceovers, and other long-form audio content from scripts and documents.

Projects is where you produce long-form audio. Import a document, assign voices to different speakers or sections, fine-tune pacing, and export a finished audio file. Everything runs locally on your machine with no usage limits.

How It Works

1. Import Your Document

Import an EPUB, PDF, DOCX, or plain text file. Voice Creator Pro extracts the text and preserves chapter structure automatically. You can also paste text directly into the Projects window with Ctrl+V (Cmd+V on macOS) if you already have your script in the clipboard.

| Format | Details |
|--------|---------|
| .epub | Extracts text with chapter structure |
| .pdf | Extracts text content |
| .docx | Extracts text with formatting |
| .txt | Plain text import |

2. Assign Voices

Map different voices to different speakers or sections. You can use any combination of:

  • Built-in voices from the TTS tab
  • Cloned voices from the Clone tab
  • Designed voices from the Design tab

Mix and match freely. For example, assign a cloned narrator voice for prose and a designed character voice for dialogue.

3. Fine-Tune and Export

Adjust pauses, speed, and pacing per section. When you are happy with the result, export to your preferred format:

| Format | Best For |
|--------|----------|
| .m4b | Audiobooks - includes embedded chapter markers so listeners can navigate by section |
| .mp3 | Podcasts, social media, general use |
| .wav | Video editing, professional workflows |
| .flac | Lossless archival |

Features

Multi-Voice Assignment

Assign different voices to different speakers, characters, or sections. Highlight any part of a segment to assign it a different voice from the default, giving you precise control over who speaks what. Create multi-narrator audiobooks and dialogues with distinct voices for each role.
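
One way to picture highlight-based voice assignment is as a segment with a default voice plus a list of character ranges that override it. The sketch below is only an illustration of that idea, not how Voice Creator Pro stores projects; `Segment`, `VoiceSpan`, and the voice names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VoiceSpan:
    """A highlighted range inside a segment (hypothetical structure)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    voice: str


@dataclass
class Segment:
    text: str
    default_voice: str
    spans: list[VoiceSpan] = field(default_factory=list)

    def runs(self) -> list[tuple[str, str]]:
        """Split the text into (voice, text) runs; gaps use the default voice."""
        result: list[tuple[str, str]] = []
        cursor = 0
        for span in sorted(self.spans, key=lambda s: s.start):
            if span.start < cursor or span.end > len(self.text) or span.start >= span.end:
                raise ValueError(f"invalid or overlapping span: {span}")
            if span.start > cursor:
                result.append((self.default_voice, self.text[cursor:span.start]))
            result.append((span.voice, self.text[span.start:span.end]))
            cursor = span.end
        if cursor < len(self.text):
            result.append((self.default_voice, self.text[cursor:]))
        return result


text = 'She whispered, "Follow me," and left.'
quote_start = text.index('"')
quote_end = text.rindex('"') + 1
segment = Segment(text, "narrator", [VoiceSpan(quote_start, quote_end, "character")])
for voice, chunk in segment.runs():
    print(f"{voice}: {chunk!r}")
```

In this example the quoted dialogue is read by the hypothetical `character` voice, while the text before and after it falls back to `narrator`.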

Chapter-Aware

Projects automatically detects chapter boundaries in your source file. Chapters carry through to M4B exports so listeners can navigate by section.
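
For readers curious what chapter markers look like outside the app, the sketch below builds a chapter list in FFmpeg's FFMETADATA text format from chapter titles and durations. It is an illustration under assumptions: the docs do not say how Voice Creator Pro writes M4B chapters internally, and `build_chapter_metadata` and the sample durations are hypothetical.

```python
def escape(value: str) -> str:
    """Backslash-escape the characters FFMETADATA treats as special."""
    for ch in ("\\", "=", ";", "#", "\n"):
        value = value.replace(ch, "\\" + ch)
    return value


def build_chapter_metadata(chapters: list[tuple[str, int]]) -> str:
    """Build FFMETADATA text from (title, duration_ms) pairs in playback order."""
    lines = [";FFMETADATA1"]
    start = 0
    for title, duration_ms in chapters:
        end = start + duration_ms
        lines += [
            "[CHAPTER]",
            "TIMEBASE=1/1000",  # START and END are expressed in milliseconds
            f"START={start}",
            f"END={end}",
            f"title={escape(title)}",
        ]
        start = end  # the next chapter begins where this one ends
    return "\n".join(lines) + "\n"


metadata = build_chapter_metadata([("Chapter 1", 60_000), ("Chapter 2", 90_500)])
print(metadata)
```

A file like this can typically be merged into an audio file with FFmpeg by passing it as a second input and using `-map_metadata 1`; check the FFmpeg documentation for the exact options for your build.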

Granular Pacing Controls

Fine-tune pauses between paragraphs and chapters, adjust speaking speed per section, and control pacing for a natural listening experience. You can also highlight any part of a segment and add a pause before it for precise timing control.


Project Settings

Open Project Settings to configure generation, segmentation, and output options for the entire project.

Generation

| Setting | Default | Description |
|---------|---------|-------------|
| Language | Auto | Output language. Auto detects from the text. |
| Takes per generation | 1 | Number of takes to generate per segment. Extra takes appear in the segment's history so you can pick the best one. |

Generation Parameters

Advanced generation parameters (speed, steps, guidance scale, etc.) can be configured at three levels:

  • Project level - Set defaults in Project Settings that apply to all segments
  • Segment level - Override parameters for individual segments
  • Selection level - Highlight text within a segment to adjust settings for that specific portion, giving you control over emotion and variance in different parts of the speech

The available parameters depend on the selected model. See the Voice Cloning docs for a breakdown of each model's settings.
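
The three levels behave like layered defaults: the most specific level that sets a parameter wins. A minimal sketch of that lookup, assuming each level is a plain dictionary (the function name and the values are hypothetical, and the real parameter set depends on the model):

```python
from collections import ChainMap
from typing import Optional


def resolve_params(
    project: dict,
    segment: Optional[dict] = None,
    selection: Optional[dict] = None,
) -> dict:
    """Merge the three levels; selection overrides segment, which overrides project."""
    # ChainMap searches its maps left to right, so the first map that
    # defines a key supplies the value.
    return dict(ChainMap(selection or {}, segment or {}, project))


project_defaults = {"speed": 1.0, "steps": 32, "guidance_scale": 3.0}
segment_overrides = {"speed": 0.9}
selection_overrides = {"guidance_scale": 4.5}

effective = resolve_params(project_defaults, segment_overrides, selection_overrides)
print(effective["speed"], effective["steps"], effective["guidance_scale"])
# prints: 0.9 32 4.5
```

Anything not overridden at the segment or selection level, such as `steps` here, falls through to the project default.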

Segmentation

| Setting | Default | Description |
|---------|---------|-------------|
| Max chars per segment | 500 | Maximum characters before a segment is auto-split. Paragraphs longer than this are split at sentence boundaries. Set to 0 to disable auto-splitting. Click Re-parse to apply changes to the current document. |
| Clean up imported text | - | Strips decorative symbols and normalizes ligatures and whitespace. Keeps punctuation, dashes, and quotes. |
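
To make these two settings concrete, here is a rough sketch of what text cleanup and sentence-boundary splitting could look like. It is not the app's actual algorithm: `clean_text` and `split_paragraph` are hypothetical names, the sentence detection is a simple regex, and Unicode NFKC normalization is used as a stand-in for ligature handling (it also rewrites some other compatibility characters).

```python
import re
import unicodedata

_SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def clean_text(text: str) -> str:
    """Expand ligatures, drop decorative symbols, collapse runs of spaces."""
    text = unicodedata.normalize("NFKC", text)  # e.g. the "fi" ligature becomes "fi"
    # Category "So" (Symbol, other) covers ornaments such as stars and florets.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "So")
    return re.sub(r"[ \t]+", " ", text).strip()


def split_paragraph(paragraph: str, max_chars: int = 500) -> list[str]:
    """Split a paragraph at sentence boundaries so segments stay under max_chars."""
    if max_chars <= 0 or len(paragraph) <= max_chars:
        return [paragraph]  # 0 disables auto-splitting
    segments: list[str] = []
    current = ""
    for sentence in _SENTENCE_BREAK.split(paragraph):
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    # Note: a single sentence longer than max_chars is kept whole in this sketch.
    return segments


print(split_paragraph("One two three. Four five six. Seven eight nine.", max_chars=30))
# prints: ['One two three. Four five six.', 'Seven eight nine.']
```

Splitting only where a sentence ends keeps each segment a natural unit of speech, which is presumably why the setting breaks at sentence boundaries rather than at a fixed character count.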

Output

| Setting | Default | Description |
|---------|---------|-------------|
| Paragraph gap (ms) | 300 | Silence between paragraphs in milliseconds. |
| Segment gap (ms) | 1000 | Silence between segments in milliseconds. |
| Format | MP3 | Export format (MP3, M4B, WAV, or FLAC). |
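
Gap settings are given in milliseconds, which correspond to a fixed number of silent audio samples at a given sample rate. The sketch below shows that arithmetic for 16-bit PCM audio; the 24,000 Hz sample rate and the function names are assumptions for illustration, not documented values.

```python
def gap_samples(gap_ms: int, sample_rate: int = 24_000) -> int:
    """Number of samples of silence for a gap (sample rate is an assumed value)."""
    return round(gap_ms * sample_rate / 1000)


def silence_pcm16(gap_ms: int, sample_rate: int = 24_000, channels: int = 1) -> bytes:
    """Raw 16-bit PCM silence: two zero bytes per sample per channel."""
    return b"\x00\x00" * gap_samples(gap_ms, sample_rate) * channels


def join_with_gap(clips: list[bytes], gap_ms: int, sample_rate: int = 24_000) -> bytes:
    """Concatenate raw PCM clips with the same silent gap between each pair."""
    return silence_pcm16(gap_ms, sample_rate).join(clips)


print(gap_samples(300))   # prints: 7200
print(gap_samples(1000))  # prints: 24000
```

At the assumed sample rate, the default 300 ms paragraph gap is 7,200 samples and the default 1000 ms segment gap is 24,000 samples.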

Use Cases

Audiobooks

Convert novels, non-fiction, and self-published books into professional audiobooks with chapter markers and multiple narrator voices.

Podcasts

Generate podcast episodes from scripts with different voices for host and guest roles. Export directly to MP3.

E-Learning and Training

Convert manuals, study guides, and training documents into narrated audio that learners can listen to anywhere.

Social Media Voiceovers

Turn scripts into voiceovers for YouTube, TikTok, Instagram, and other platforms.

Documentation and Reports

Turn lengthy documents, research papers, or reports into audio so you can absorb the content while commuting or exercising.

Accessibility

Make written content accessible to people with visual impairments or reading difficulties by converting it to high-quality spoken audio.

Self-Publishing

Self-published authors can create audiobook editions of their work without hiring a narrator or booking studio time.