Clone a Voice
A step-by-step tutorial for cloning any voice in Voice Creator Pro, from recording a reference sample to generating speech.
In this tutorial you will clone a voice from scratch and use it to generate speech. By the end, you will have a saved voice in your library and a generated audio clip that sounds like the original speaker.
The entire process takes under two minutes.
Prerequisites
- Voice Creator Pro installed and running on your machine
- A microphone (if recording your own voice) or an audio file / YouTube link with the voice you want to clone
Step-by-Step Walkthrough
Step 1: Open the Clone Tab
Open Voice Creator Pro and click Lab in the left sidebar, then select the Clone tab at the top. You will see three panels:
- Reference Voice (left) - where you load or record the source voice
- Generate Speech (center) - where you enter text and pick a model
- Output (right) - where you listen to and download results
Step 2: Add a Reference Voice
You have four options. Pick whichever fits your situation:
Option A: Record from microphone. Click the microphone icon in the Reference Voice panel and speak for 3 to 10 seconds. Keep it clean and natural. A quiet room makes a big difference.
Option B: Upload an audio file. Click the upload icon and select a WAV or MP3 file from your computer.
Option C: Import from YouTube. Click the YouTube icon, paste a URL, then click Fetch. Preview the extracted clip and click Use this clip to load it.
Option D: Browse Voice Search. Open Voice Search from the sidebar, browse the community library, and click Import to Clone Library on any voice you like. It will appear in your Clone Library dropdown instantly.
Keep your clip under 15 seconds. We recommend 3 to 10 seconds of clean speech. Longer audio does not produce a better clone: it can reduce quality, takes much longer to process, and, if extremely long, can cause the application to malfunction. For more detail, see the guide How Many Minutes of Audio Do You Need for Voice Cloning?
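If you plan to upload a WAV file, the length guidance above can be checked before you open the app. The following is a minimal sketch and not part of Voice Creator Pro: the function names and the "acceptable" band between 10 and 15 seconds are assumptions made for illustration, and Python's standard-library `wave` module reads uncompressed WAV only (not MP3).

```python
import wave

# Thresholds taken from the guidance above (seconds).
RECOMMENDED_MIN_S = 3.0
RECOMMENDED_MAX_S = 10.0
HARD_MAX_S = 15.0


def wav_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def classify_clip(duration_s):
    """Label a clip length; the labels are illustrative, not app output."""
    if duration_s > HARD_MAX_S:
        return "too long"
    if duration_s < RECOMMENDED_MIN_S:
        return "shorter than recommended"
    if duration_s <= RECOMMENDED_MAX_S:
        return "recommended"
    return "acceptable"  # assumed band: over 10 s but within the 15 s limit
```

For example, `classify_clip(wav_duration_seconds("reference.wav"))` labels a hypothetical file named `reference.wav`.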
Step 3: Verify the Transcript
After you load a reference voice, the built-in speech-to-text model fills in the Transcript field automatically.
This step is critical. Read the transcript and compare it word-for-word with what the speaker actually says. Fix any mistakes before moving on. Even small mismatches between the transcript and the audio will hurt clone quality.
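If you have your own written copy of what the speaker says, a word-level diff can help spot mismatches that are easy to miss by eye. This is a sketch under stated assumptions: it runs outside the app on text you paste in, the helper names are hypothetical, and it ignores case and punctuation, which may be looser than the comparison you want.

```python
import difflib
import re


def words(text):
    """Split text into lowercase words, dropping punctuation (an assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def transcript_mismatches(auto_transcript, heard_text):
    """Return (operation, transcript_words, heard_words) for each difference."""
    a = words(auto_transcript)
    b = words(heard_text)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

An empty list means the two texts contain the same words in the same order. Any entry points to a spot in the Transcript field worth correcting by hand.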
Step 4: Save the Voice (Optional but Recommended)
Click Save Voice and give it a descriptive name (for example, "Sarah - warm narration"). The voice is now stored in your Library and you can reload it anytime from the Library dropdown without re-uploading audio.
Step 5: Pick a Model
Click the model badge (shown in purple) in the Generate Speech panel to choose a TTS model. Here is a quick guide:
| Model | Best for |
|---|---|
| OmniVoice | Widest language support (600+) and expression tags such as [laughter] and [sigh] |
| Chatterbox Multilingual | High-quality conversational speech in 23 languages |
| Chatterbox Turbo | Fast English-only generation. To select this model, choose 'Chatterbox' and then set the quality to 'Lower'. |
| Qwen3 | 10 languages with different voice characteristics |
| NeuTTS | Lightweight English-only option for weaker hardware |
If you are unsure, start with OmniVoice. It covers the most languages and supports expression tags.
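The table can also be read as a small decision rule. The sketch below is only one possible reading: `suggest_model` is a hypothetical helper, not an app feature, and the order in which the conditions are checked is an assumption.

```python
def suggest_model(english_only=False, weak_hardware=False, want_speed=False):
    """Return a starting model name based on the table above (illustrative)."""
    if english_only and weak_hardware:
        return "NeuTTS"            # lightweight English-only option
    if english_only and want_speed:
        return "Chatterbox Turbo"  # fast English-only generation
    return "OmniVoice"             # the tutorial's default when unsure
```

Chatterbox Multilingual and Qwen3 are left out on purpose, because choosing between them depends on how the result sounds, which is easier to judge by ear than by rule.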
Step 6: Enter Your Text
Type or dictate the text you want the cloned voice to say in the Text to speak field.
Keep it to a paragraph or two for quick testing. For longer scripts, use Projects instead.
Want to add personality? If you chose OmniVoice or Chatterbox, click the smiley icon to insert expression tags like [laughter] or [surprise-oh] directly into your text.
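When a script is drafted outside the app, it can help to list the expression tags it contains before pasting it in. This is a minimal sketch that assumes tags are lowercase words, optionally hyphenated, inside square brackets, as in the [laughter] and [surprise-oh] examples. The full set of supported tags is not listed here, so the code does not validate tag names.

```python
import re

# Assumed tag shape: [word] or [word-word], lowercase letters only.
TAG_PATTERN = re.compile(r"\[[a-z]+(?:-[a-z]+)*\]")


def find_expression_tags(text):
    """Return every bracketed tag in the order it appears."""
    return TAG_PATTERN.findall(text)


def strip_expression_tags(text):
    """Return the text with tags removed, for proofreading the spoken words."""
    without_tags = TAG_PATTERN.sub(" ", text)
    return " ".join(without_tags.split())
```

Reading the stripped text aloud is a quick way to confirm the sentence still makes sense without the tags.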
Step 7: Generate
Click the Generate button. The Output panel on the right will play the result once processing finishes.
Not happy with the output? Try these before generating again:
- Switch to a different model
- Re-record or trim your reference audio to a cleaner 3 to 10 second clip. See How to Pick the Right Reference Audio for Voice Cloning for guidance on choosing a good clip.
- Adjust the advanced settings (click the sliders icon next to the language selector)
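If you prefer to trim a reference clip outside the app, the sketch below cuts a window out of an uncompressed WAV file using Python's standard-library `wave` module. It is an illustration rather than the app's own trimming feature: `trim_wav` and its parameters are hypothetical names, and MP3 files would need a different tool.

```python
import wave


def trim_wav(src_path, dst_path, start_s=0.0, length_s=10.0):
    """Copy length_s seconds of audio, starting at start_s, into a new WAV file."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Clamp the start so it never points past the end of the file.
        start_frame = min(int(start_s * rate), src.getnframes())
        src.setpos(start_frame)
        frames = src.readframes(int(length_s * rate))
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width, and rate
        dst.writeframes(frames)  # header frame count is corrected on write
```

For example, `trim_wav("long.wav", "reference.wav", start_s=2.0, length_s=8.0)` keeps seconds 2 through 10 of a hypothetical `long.wav`. Pick a window that contains clean speech, then recheck the transcript against the trimmed audio.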
Step 8: Download or Keep Iterating
Click the download button in the Output panel to save the audio file. Every generation is also stored in the History section below the panels, so you can revisit and compare earlier attempts.
Tips for Best Results
- Keep reference audio short. 3 to 10 seconds of clean speech is the sweet spot. Longer clips do not improve quality and can make it worse. See How Many Minutes of Audio Do You Need for Voice Cloning? for the details.
- Minimize background noise. Record in a quiet room or use a well-isolated clip. Background music, echo, and ambient noise all degrade the clone.
- Match the transcript exactly. This is the single most common cause of poor results. Double-check it every time.
- Experiment with models. Each model handles tone, pacing, and accents differently. Try two or three on the same text to find your favorite.
- Use advanced settings sparingly. The defaults work well for most cases. If you do tweak them, change one setting at a time so you can hear the difference.
Next Steps
- Voice Cloning reference - Full details on every setting and model
- Voice Search - Browse and import community voices
- Projects - Scale up to long-form content with consistent voices
- How to Pick the Right Reference Audio - Deep dive on choosing reference clips