OmniVoice Voice Design Guide: Build a Voice From Scratch (With Accents)
Voice cloning copies a voice that already exists. Voice design creates one that does not. Instead of feeding the model a recording, you describe the voice you want and it builds a match from scratch. That is useful whenever you need a specific character voice but have no reference audio to clone from: narrators, game characters, brand voices, and localized content.
OmniVoice takes a structured approach to voice design. Rather than a free-form text box, it gives you a fixed set of attributes to dial in. This guide covers every attribute, how to combine them, and the feature that makes OmniVoice stand out: accents.
OmniVoice vs Qwen3-TTS: Structured vs Free-Form
Both models can design a voice from scratch, but they take opposite approaches:
| | OmniVoice | Qwen3-TTS |
|---|---|---|
| Input style | Structured attributes (pick from fixed options) | Free-form text description |
| Accents | Ten English accents as direct options | Described in free text, less reliable |
| Best when | You need accents or repeatable results | You want fine, descriptive nuance |
The short version: if you want a specific accent or a result you can reproduce exactly, OmniVoice's structured attributes are the better tool. If you want to describe something nuanced in your own words ("a gravelly, world-weary detective"), Qwen3-TTS free-form design gives you more room. Both are available in Voice Creator Pro, so you can switch based on the job.
As a rule of thumb: OmniVoice excels when you are designing voices for English or Chinese speech and need a specific accent. Qwen3-TTS is the better fit when you need something more dramatic and characterful, like a game character, a villain, or an over-the-top narrator, because its free-form descriptions give you the range to get there. Our Qwen3 TTS prompting guide covers how to write those descriptions, with examples.
The Attributes
OmniVoice voice design is built from five attributes. Every one has an Auto option, which hands that decision to the model. Set the attributes you care about and leave the rest on Auto.
| Attribute | Options |
|---|---|
| Gender | Auto, Male, Female |
| Age | Auto, Child, Teenager, Young adult, Middle-aged, Elderly |
| Pitch | Auto, Very low, Low, Moderate, High, Very high |
| Style | Auto, Whisper |
| Accent | Auto, plus ten English accents (listed below) |
A few of these reward a closer look.
Pitch is independent of gender and age. You can push a young adult voice lower or lift an elderly voice higher to fine-tune the character.
Style currently has just two options. Use Auto for normal delivery, or Whisper for a soft, breathy read. Whisper is well suited to ASMR, intimate narration, suspense, and meditation content.
English Accents
Accents are the standout feature. OmniVoice offers ten English accents, plus Auto: American, British, Australian, Canadian, Indian, Chinese, Korean, Japanese, Portuguese, and Russian.
These generate English speech delivered with the chosen accent, so you can voice an Indian-accented narrator, a British storyteller, or a Russian-accented character without finding and cloning a real speaker.
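If you want to keep track of designs outside the app, for example in a spreadsheet export or a project script, the five attributes map neatly onto a small record type. The sketch below is only an illustration: `VoiceDesign`, `OPTIONS`, and `pinned` are hypothetical names, not part of any OmniVoice or Voice Creator Pro API. The option lists are copied from the attribute table and accent list in this guide.

```python
from dataclasses import dataclass

# Option lists copied from this guide's attribute table and accent list.
OPTIONS = {
    "gender": ["Auto", "Male", "Female"],
    "age": ["Auto", "Child", "Teenager", "Young adult", "Middle-aged", "Elderly"],
    "pitch": ["Auto", "Very low", "Low", "Moderate", "High", "Very high"],
    "style": ["Auto", "Whisper"],
    "accent": ["Auto", "American", "British", "Australian", "Canadian", "Indian",
               "Chinese", "Korean", "Japanese", "Portuguese", "Russian"],
}


@dataclass(frozen=True)
class VoiceDesign:
    """Hypothetical record of one designed voice (not an OmniVoice API)."""
    gender: str = "Auto"
    age: str = "Auto"
    pitch: str = "Auto"
    style: str = "Auto"
    accent: str = "Auto"

    def __post_init__(self):
        # Reject any value that is not offered for that attribute.
        for name, allowed in OPTIONS.items():
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(f"{name}={value!r} is not one of {allowed}")

    def pinned(self):
        """Return only the attributes that are not left on Auto."""
        return {n: getattr(self, n) for n in OPTIONS if getattr(self, n) != "Auto"}


narrator = VoiceDesign(gender="Female", age="Middle-aged",
                       pitch="Moderate", accent="British")
print(narrator.pinned())
```

Every field defaults to Auto, which mirrors the advice to set only the attributes you care about. The `pinned` helper shows at a glance which decisions you made yourself and which you left to the model.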
Designing a Voice: Recipes
Attributes combine, so a handful of choices defines a distinct voice. Set the ones that matter and leave the rest on Auto. A few starting points:
| Voice you want | Gender | Age | Pitch | Style | Accent |
|---|---|---|---|---|---|
| Warm British audiobook narrator | Female | Middle-aged | Moderate | Auto | British |
| Energetic American podcaster | Male | Young adult | Moderate | Auto | American |
| Gentle ASMR reader | Female | Young adult | Low | Whisper | Auto |
| Elderly storyteller | Male | Elderly | Low | Auto | Auto |
| Young Indian-accented explainer voice | Female | Young adult | Moderate | Auto | Indian |
| Suspense narrator | Male | Middle-aged | Very low | Whisper | American |
Treat these as starting points. Change one attribute at a time and regenerate, so you can hear exactly what each control does to the voice.
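The one-attribute-at-a-time approach can be written down as a short test plan. The sketch below takes the "Elderly storyteller" recipe from the table and lists every variant that differs only in pitch. It does not call OmniVoice. The dictionary layout and the `one_change_variants` helper are assumptions made for illustration, and you would enter each variant in the app by hand.

```python
# Pitch options from the attribute table, excluding Auto.
PITCH_OPTIONS = ["Very low", "Low", "Moderate", "High", "Very high"]


def one_change_variants(base, attribute, options):
    """Yield copies of `base` that differ from it only in `attribute`."""
    for option in options:
        if option != base[attribute]:
            yield {**base, attribute: option}


# The "Elderly storyteller" recipe from the table above.
elderly_storyteller = {
    "gender": "Male",
    "age": "Elderly",
    "pitch": "Low",
    "style": "Auto",
    "accent": "Auto",
}

variants = list(one_change_variants(elderly_storyteller, "pitch", PITCH_OPTIONS))
for variant in variants:
    print(variant["pitch"], "->", variant)
```

Because each variant keeps the base gender, age, style, and accent, any difference you hear between generations can be attributed to pitch alone.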
Tips for Better Voice Design
Start with Auto, then narrow. Generate once with most attributes on Auto to hear the model's default, then lock in the attributes you want to change. This is faster than setting all five blind.
Pitch is your fine-tuning dial. Once gender, age, and accent give you the right character, use pitch to nudge it warmer (lower) or brighter (higher) without changing the rest.
Lean on accents for character variety. If you need several distinct voices for a cast, accent is the fastest way to make them feel like different people, even with similar gender and age settings.
Change one thing at a time. Voice design is iterative. Adjusting a single attribute per generation makes it clear what each control contributes, so you can dial in the voice you hear in your head.
Layer expressive tags onto a designed voice. OmniVoice reads inline tags directly in your text, so once you have a voice you like you can add performance without touching the attributes. Tags like [laughter] and [sigh] add non-verbal sounds, while intonation tags like [question-en] and [surprise-ah] shape delivery. Drop them right into the line, for example [laughter] You really got me there.
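A mistyped tag is easy to miss in a long script, so a quick check before generating can save a retake. The sketch below scans text for bracketed tags and flags any that are not in a known list. It assumes tags are lowercase words joined by hyphens inside square brackets, which matches the four tags named in this guide. The real tag set is likely larger, so extend `KNOWN_TAGS` from the official list. The function names are hypothetical.

```python
import re

# Only the four tags mentioned in this guide; the full set is an unknown here.
KNOWN_TAGS = {"laughter", "sigh", "question-en", "surprise-ah"}

# Assumed tag syntax: lowercase words joined by hyphens, wrapped in brackets.
TAG_PATTERN = re.compile(r"\[([a-z]+(?:-[a-z]+)*)\]")


def find_tags(text):
    """Return every bracketed tag name in the order it appears."""
    return TAG_PATTERN.findall(text)


def unknown_tags(text):
    """Return tag names that are not in KNOWN_TAGS (likely typos)."""
    return [tag for tag in find_tags(text) if tag not in KNOWN_TAGS]


line = "[laughter] You really got me there. [sgh]"
print(find_tags(line))
print(unknown_tags(line))
```

Here `[sgh]` is a deliberate typo for `[sigh]`, so it is the only tag reported as unknown.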
Voice design is most stable in English and Chinese. OmniVoice's voice design was trained on English and Chinese data, so designed voices are most reliable in those two languages. You can still synthesize other languages, but expect cloning to be the steadier option outside English and Chinese.
Voice Design vs Cloning vs Prompting
These three get conflated constantly, so to be clear:
- Voice design (this guide): build a new voice from attributes. No recording needed.
- Voice cloning: copy a specific real voice from a short reference clip (3 to 10 seconds). Use this when you want a particular person's voice.
- Voice prompting: control the delivery, emotion, and performance of a voice through text instructions (for example, our DramaBox prompting guide). This shapes how a voice acts, not who it is.
Design Your Voice in Voice Creator Pro
Voice Creator Pro includes OmniVoice voice design with every attribute covered above (gender, age, pitch, style, and all ten accents) in a point-and-click interface, so there is no setup or prompt engineering required.
Designed voices are not just for one-off clips. In Voice Creator Pro, both OmniVoice designed voices and cloned voices work inside projects, so you can use them for full audiobooks, YouTube voiceovers, and other long-form narration. Projects also support multiple speakers, which means you can build a designed cast, give each character its own attributes, and run an entire dialogue or chapter in one place.
The desktop app runs OmniVoice locally and offline on Windows and macOS, with unlimited generations and no subscription. You can also try it for free in your browser, with no GPU or installation required.
Also available on Windows and macOS. One-time purchase, unlimited generations.