Voice Design

Design Custom Voices from Text

Describe the voice you want — its age, gender, tone, accent, and personality — and generate a unique, realistic voice to match. No audio samples required.

No Audio Needed
100% Local
Unlimited Voices

Demo

See It in Action

Watch how you can describe a voice in plain text and generate it instantly — all running locally.

How It Works

Three Steps to Your Custom Voice

01

Describe Your Voice

Write a natural language description — specify age, gender, tone, pitch, energy, and personality traits.

02

Generate Instantly

The AI creates a unique voice matching your description. Every generation produces a distinct result.

03

Refine and Save

Iterate on your description until the voice is perfect, then save it to your library for use anytime.

Examples

Hear What Voice Design Can Do

Each voice below was generated from a text prompt alone — no recordings, no audio samples.

Street Vendor

A rough, fast-talking male voice, mid-thirties, medium pitch with sharp rising inflections, raspy and brash, high energy, suitable for character acting.

Anime Male

A low-pitched, male voice, with dramatic pitch swings, intimidating and mischievous, suitable for anime voice-overs. Add dramatic pauses.

Female Wizard

Gender: female; Age: fifties; Pitch: low pitch with an eerie resonance; Pace: slow and deliberate with dramatic pauses; Emotion: mysterious, commanding; Characteristics: smooth, powerful; Use case: fantasy game dialogue.

Sultry Female

A smooth, alluring young female voice, late twenties, low pitch with a breathy quality, slow deliberate pace, warm and intimate, suitable for late-night radio.

Gruff Warrior

A rough, commanding male voice, mid-forties, deep low pitch, hoarse and gravelly, steady measured pace, serious and intense, suitable for fantasy game dialogue or action trailers.

Child

A bright, curious child's voice, around 8 years old, high pitch with expressive intonation, moderate pace with occasional excited bursts, cheerful and innocent.

Capabilities

Total Creative Control

Design any voice you can imagine — from a warm narrator to an energetic game character — with nothing but a text prompt.

No Samples Needed

Create entirely new voices from text descriptions alone. No recording, no audio files, no microphone required.

Infinite Variety

Every description produces a unique voice. Generate as many as you need — no two are alike.

Fine-Grained Control

Specify age range, gender, accent, speaking pace, warmth, energy level, and emotional quality in your descriptions.

Unlimited Iterations

Refine your descriptions and regenerate as many times as you want. No caps, no cooldowns, no extra cost.

Full REST API

Integrate voice design into your own products via API. Generate voices programmatically for apps, games, or automation pipelines.

100% Local & Private

All voice generation runs on your hardware. Descriptions and output audio never leave your machine.

Design Your Perfect Voice

The quality of your designed voice depends on how you describe it. Our prompting guide walks you through every attribute you can control — with examples, tips, and ready-to-use templates.

Read the Guide

Dimension: Examples
Gender: Male, female, neutral
Age: Child (5–12), teenager (13–18), young adult (19–35), middle-aged (36–55), elderly (55+)
Pitch: High, medium, low, high-pitched, low-pitched
Pace: Fast, medium, slow, fast-paced, slow-paced
Emotion: Cheerful, calm, gentle, serious, lively, composed, soothing
Characteristics: Magnetic, crisp, hoarse, mellow, sweet, rich, powerful
Use case: News broadcast, ad voice-over, audiobook, animation character, voice assistant, documentary narration
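
The dimensions above can be combined into a structured description like the Female Wizard example. The sketch below shows one way to assemble such a prompt in JavaScript. `buildVoiceDescription`, `DIMENSIONS`, and the attribute keys are hypothetical names used for illustration; they are not part of Voice Creator Pro.

```javascript
// Hypothetical helper: joins voice attributes into a "Label: value; ..." description,
// mirroring the structured format of the Female Wizard example prompt.
const DIMENSIONS = [
  ["gender", "Gender"],
  ["age", "Age"],
  ["pitch", "Pitch"],
  ["pace", "Pace"],
  ["emotion", "Emotion"],
  ["characteristics", "Characteristics"],
  ["useCase", "Use case"],
];

function buildVoiceDescription(attrs) {
  // Keep only the dimensions the caller filled in, in a fixed order.
  const parts = DIMENSIONS
    .filter(([key]) => typeof attrs[key] === "string" && attrs[key].trim() !== "")
    .map(([key, label]) => `${label}: ${attrs[key].trim()}`);
  if (parts.length === 0) {
    throw new Error("Provide at least one voice attribute");
  }
  return parts.join("; ") + ".";
}

const description = buildVoiceDescription({
  gender: "female",
  age: "fifties",
  pitch: "low pitch with an eerie resonance",
  useCase: "fantasy game dialogue",
});
console.log(description);
```

Dimensions left out are skipped, so a short description and a fully specified one go through the same helper.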

Voice Design API

Generate and save voices programmatically from your own code. The local REST API lets you create voices on demand — ideal for games that generate NPC voices at runtime, apps with personalized onboarding, or any pipeline that needs voices without manual setup.

POST /api/v1/generation/design
// Request a designed voice from the local API.
const response = await fetch(
  "http://localhost:7862/api/v1/generation/design",
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      // Description of the voice to design
      instruct_text: "A warm female narrator, mid-30s, British accent, calm",
      // Text the designed voice will speak
      target_text: "Hello! I'm your guide for today."
    })
  }
);
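
The request above does not show failure handling. Below is a minimal sketch of a wrapper that validates its inputs and checks the HTTP status. The response format is not documented on this page, so the wrapper returns the raw `Response` and leaves decoding to the caller. `buildDesignRequest` and `designVoice` are hypothetical names, and the injectable `fetchImpl` parameter is a testing convenience, not something the API requires.

```javascript
// Endpoint taken from the example above; adjust host and port if your install differs.
const DESIGN_URL = "http://localhost:7862/api/v1/generation/design";

// Hypothetical helper: builds the fetch options for a design request.
function buildDesignRequest(instructText, targetText) {
  if (!instructText || !targetText) {
    throw new Error("Both a voice description and target text are required");
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ instruct_text: instructText, target_text: targetText }),
  };
}

// Hypothetical wrapper: sends the request and rejects on a non-2xx status.
// fetchImpl is injectable so the function can be exercised without a running server.
async function designVoice(instructText, targetText, fetchImpl = fetch) {
  const response = await fetchImpl(DESIGN_URL, buildDesignRequest(instructText, targetText));
  if (!response.ok) {
    throw new Error(`Voice design request failed with HTTP ${response.status}`);
  }
  return response; // Response format is not documented here; decode as appropriate.
}
```

A caller would typically wrap `designVoice(...)` in `try`/`catch` so that a stopped local server or a rejected request surfaces as a readable error instead of a silent failure.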

Real-Time Streaming

Start playing audio while it's still being generated. No waiting for the full file — ideal for interactive apps and live previews.
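
This page does not document how streaming is exposed. Assuming the audio arrives as a chunked HTTP response body, the standard fetch streams interface can consume it incrementally. The sketch below is written under that assumption; `readAudioStream` and `concatChunks` are hypothetical helper names.

```javascript
// Assumption: the API streams audio bytes in the HTTP response body.
// The streaming transport is not documented on this page, so treat this as a sketch.

// Reads a fetch Response incrementally, handing each chunk to onChunk as it arrives,
// and resolves to the complete audio once the stream ends.
async function readAudioStream(response, onChunk) {
  const reader = response.body.getReader();
  const chunks = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    onChunk(value); // e.g. feed a player or a live preview
    chunks.push(value);
  }
  return concatChunks(chunks);
}

// Joins Uint8Array chunks into one buffer, e.g. to save the full audio afterwards.
function concatChunks(chunks) {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

Passing each chunk to `onChunk` immediately is what allows playback to begin before generation finishes; the joined buffer is only needed if you also want to keep the whole clip.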

Queue-Based Processing

Requests are queued and processed reliably in order. Poll for progress or block until complete — no dropped jobs.
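
The polling option can be sketched as a loop with a capped backoff. The status endpoint and its response shape are not documented on this page, so the caller supplies the status check. `pollUntilDone`, `backoffDelay`, and the `"done"` and `"failed"` state values are all assumptions made for illustration.

```javascript
// Delay before the next poll: doubles each attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 250, maxMs = 4000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Polls a caller-supplied status check until the job finishes.
// Assumption: checkStatus resolves to an object whose `state` is "done" or "failed"
// when the job has finished; the real API's status shape is not documented here.
async function pollUntilDone(checkStatus, maxAttempts = 20) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") return status;
    if (status.state === "failed") throw new Error("Voice generation job failed");
    // Wait before asking again so the local server is not flooded with requests.
    await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
  }
  throw new Error(`Job did not finish after ${maxAttempts} polls`);
}
```

Bounding the number of attempts keeps a stuck job from hanging the caller indefinitely; blocking until completion, the other option mentioned above, needs no loop at all.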

Two Quality Modes

Pick the 1.7B-parameter model for maximum fidelity, or the 0.6B model when speed matters more. Switch per request.

10 Languages

Design voices in English, Chinese, Japanese, Korean, German, French, Russian, Portuguese, Spanish, and Italian — or let auto-detect handle it.
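
Per-request model and language selection could look like the sketch below. Only `instruct_text` and `target_text` appear in the documented example; the `model` and `language` field names, their values, and the `buildDesignBody` helper are placeholders, so check the API documentation for the real parameter names.

```javascript
// The ten languages listed above; "auto" stands in for auto-detect.
const LANGUAGES = ["English", "Chinese", "Japanese", "Korean", "German",
  "French", "Russian", "Portuguese", "Spanish", "Italian"];

// Hypothetical mapping from a quality choice to a model size label.
const MODELS = { quality: "1.7B", speed: "0.6B" };

// Builds a request body with per-request model and language choices.
// Assumption: "model" and "language" are placeholder field names.
function buildDesignBody({ instructText, targetText, mode = "quality", language = "auto" }) {
  if (!Object.keys(MODELS).includes(mode)) {
    throw new Error(`Unknown mode "${mode}"; use "quality" or "speed"`);
  }
  if (language !== "auto" && !LANGUAGES.includes(language)) {
    throw new Error(`Unsupported language "${language}"`);
  }
  return {
    instruct_text: instructText,
    target_text: targetText,
    model: MODELS[mode],
    language,
  };
}
```

Validating the choices before sending keeps a typo in the language or mode from turning into a confusing server-side error.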

FAQ

Common Questions

Voice design lets you create entirely new, unique voices by writing a text description. Instead of cloning an existing voice from audio, you describe the characteristics you want — age, gender, accent, tone — and the AI generates a matching voice.

The more specific, the better. Include details like age range (e.g. 'mid-30s'), gender, accent (e.g. 'British'), tone (e.g. 'warm and authoritative'), speaking pace, and any personality traits. Short descriptions work too, but more detail gives you more control.

Voice cloning replicates an existing voice from an audio sample. Voice design creates entirely new voices from text descriptions — no audio input of any kind is needed. They complement each other: clone real voices, design fictional ones.

Yes. Once you generate a voice you like, save it to your local voice library. It's available for text-to-speech generation anytime, just like a cloned voice.

No. Generate and iterate as many times as you want. There are no monthly caps, per-voice fees, or cooldowns. One-time purchase, unlimited use.

Yes. All voices you design with Voice Creator Pro are yours to use in commercial projects — videos, podcasts, games, apps, audiobooks, and more. No additional licensing required.

Voice Design is ideal when the voice you need doesn't exist in the built-in library and you don't have audio to clone. If you need a specific character or persona that isn't available, Voice Design lets you create it from scratch.

You can create both realistic voices (e.g. 'A young Indian female with a soft, high voice, conversational and calm') and character voices (e.g. 'An angry old pirate captain, shouting' or 'A massive evil ogre'). The more descriptive your prompt, the more control you have over the result. See our prompting guide for tips.

Yes. Like all Voice Creator Pro features, voice design runs entirely on your local hardware. No internet connection is required, and no data is sent to external servers.

Yes. You can generate new voice previews and save them to your library programmatically via the local REST API. This lets you integrate voice design into your own applications and workflows. See our API documentation for endpoint details.

Windows 10 or later, or macOS with Apple Silicon (M1 or later). A modern GPU (NVIDIA recommended on Windows) provides the best performance. The app runs entirely on your hardware with no cloud dependency. CPU-only processing is also supported.

Start Designing Voices Today

One-time purchase. No subscriptions, no usage limits, no cloud dependency. Describe your perfect voice and bring it to life.