Try Supertonic TTS Online for Free
Multilingual text-to-speech across 31 languages with 10 preset voices. Runs entirely in your browser. No signup, no install, completely free.
Why Supertonic
Why Use Supertonic for Text-to-Speech
Supertonic 3 is an MIT-licensed multilingual text-to-speech system from Supertone. It uses a four-stage ONNX pipeline (text encoder, duration predictor, vector estimator, vocoder) and runs entirely on-device through ONNX Runtime Web.
At about 99 million parameters, it is a fraction of the size of most open TTS systems while supporting 31 languages and producing natural, expressive speech.
31 Languages
Cover most major European languages plus Korean, Japanese, Arabic, Hindi, Vietnamese, and Indonesian from a single model.
10 Preset Voices
Five male and five female voice styles (M1 through M5 and F1 through F5) that work across every supported language.
WebGPU + WASM
Uses WebGPU for fast on-device inference where available, with an automatic WebAssembly fallback for Firefox and Safari.
Quality Tuning
Advanced settings let you trade off speed and quality by adjusting the number of denoising steps from 1 to 32.
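If you are wiring a similar quality setting into your own UI, the step range above can be enforced with a small helper. This is a minimal TypeScript sketch, not the tool's actual code: the 1 to 32 range comes from the description above and the default of 8 from the tips further down, while the constant names and `normalizeSteps` are made up for illustration.

```typescript
// Range and default are taken from the page text; the names are illustrative.
const MIN_STEPS = 1;
const MAX_STEPS = 32;
const DEFAULT_STEPS = 8;

// Returns a whole number of denoising steps inside the supported range.
// Missing or non-finite input (undefined, NaN, Infinity) falls back to the default.
function normalizeSteps(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_STEPS;
  }
  const rounded = Math.round(requested);
  return Math.min(MAX_STEPS, Math.max(MIN_STEPS, rounded));
}
```

With this helper, a slider value of 0 becomes 1, a value of 100 becomes 32, and an empty field falls back to 8.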
Get Started
How It Works
Open the free tool
Go to the Supertonic TTS tool in your browser. The first run downloads the model and caches it locally for next time.
Pick a voice and language
Choose one of the 10 preset voices and the target language. Any voice can speak any of the 31 supported languages.
Type your text and generate
Enter your text, tweak the speed or quality steps in advanced settings if you want, then generate. Audio is ready in seconds.
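The download-once, cache-locally behaviour from the first step can be pictured as a cache-first load. The TypeScript sketch below is only an illustration under stated assumptions: the page does not say which storage mechanism the tool uses, so `ModelStore`, `Downloader`, `loadModelBytes`, and `createMemoryStore` are hypothetical names, and the in-memory store stands in for whatever persistent browser storage is really behind it.

```typescript
// Hypothetical persistent store. In a browser this could be backed by the
// Cache API or IndexedDB; the page does not say which mechanism it uses.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, bytes: Uint8Array): Promise<void>;
}

// Hypothetical download function, for example a thin wrapper around fetch.
type Downloader = (url: string) => Promise<Uint8Array>;

// Cache-first load: return stored bytes when present, otherwise download
// once, store the result, and return it.
async function loadModelBytes(
  url: string,
  store: ModelStore,
  download: Downloader,
): Promise<{ bytes: Uint8Array; fromCache: boolean }> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return { bytes: cached, fromCache: true };
  }
  const bytes = await download(url);
  await store.set(url, bytes);
  return { bytes, fromCache: false };
}

// In-memory stand-in for the store, useful for trying the function
// outside a browser. It does not persist across page loads.
function createMemoryStore(): ModelStore {
  const entries = new Map<string, Uint8Array>();
  return {
    get: async (key) => entries.get(key),
    set: async (key, bytes) => {
      entries.set(key, bytes);
    },
  };
}
```

The first call for a given URL goes to the network and reports `fromCache: false`; later calls with the same store return the saved bytes without downloading again.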
Use Cases
Who Is Supertonic For?
Multilingual Creators
Generate voiceovers in 31 languages from a single model. Useful for international podcasts, dubbed video content, and localized social media.
Language Learners
Generate pronunciation examples and listening practice across European, Asian, and Middle Eastern languages without juggling multiple TTS providers.
Developers
Prototype voice features that need to work across many languages. MIT license means you can ship Supertonic in your own products, including commercial ones.
Accessibility
Convert articles, documentation, or learning material into speech for audiences across dozens of languages from one consistent voice catalog.
Getting the Most Out of Supertonic
Tips for Best Results
Match the language to your text
Supertonic uses the selected language to guide pronunciation. Switching to the right language gives much better results than leaving it on English for non-English text.
Tune quality with denoising steps
Lower step counts (4-6) are faster and good for previews. Higher counts (12-32) give the cleanest audio. The default of 8 is a sensible balance.
Use WebGPU where available
Chrome and Edge use your GPU for inference, which is much faster than WebAssembly. If you're on Firefox or Safari and generation feels slow, try a Chromium browser.
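For developers curious how a WebGPU-first, WebAssembly-fallback choice might look in code, here is a minimal TypeScript sketch. It is an assumption about the general approach, not this tool's implementation: `preferredBackends` is a hypothetical helper, and the only facts it relies on are that browsers with WebGPU expose `navigator.gpu` and that ONNX Runtime Web names its execution providers `"webgpu"` and `"wasm"`.

```typescript
// Returns backend names in order of preference.
// `nav` is any navigator-like object; browsers with WebGPU define a `gpu`
// property on it. The helper itself is illustrative, not part of any library.
function preferredBackends(nav: object): string[] {
  return "gpu" in nav ? ["webgpu", "wasm"] : ["wasm"];
}

// Example with plain objects standing in for `navigator`:
const withGpu = preferredBackends({ gpu: {} }); // ["webgpu", "wasm"]
const withoutGpu = preferredBackends({});       // ["wasm"]
```

In a browser you would pass the global `navigator` and hand the result to the `executionProviders` session option in ONNX Runtime Web. A more careful check would also call `navigator.gpu.requestAdapter()`, which can resolve to `null` when no suitable GPU is available.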
Voice Creator Pro
Need more? Go further with Voice Creator Pro.
Voice Creator Pro gives you higher quality models, voice cloning in 600+ languages, emotional speech, and a commercial use license. Try it free in your browser or download the desktop app.
Voice Cloning
Clone any voice from a short audio sample. Zero-shot cloning with no training required.
600+ Languages
Combine multiple open-source models for TTS across 600+ languages
GPU-Accelerated Processing
Faster generation with NVIDIA, Apple Silicon, AMD, and Intel GPU support
Voice Design from Text
Describe a voice in plain text and the AI creates it. No audio samples needed.
Local REST API
Automate voice generation in your own apps and workflows
Commercial License
Full rights to use generated audio in commercial projects
FAQ
Common Questions
Supertonic is an on-device multilingual text-to-speech system from Supertone. Supertonic 3 has 99 million parameters and supports 31 languages with 10 preset voices. Despite being a fraction of the size of larger open TTS systems, it produces natural, high-quality speech and runs entirely on-device with no cloud dependency.
Supertonic is developed by Supertone, a voice AI company. The open-weight ONNX checkpoint is released on Hugging Face under the MIT license, so you can use it freely in personal or commercial projects.
Supertonic 3 supports 31 languages: English, Korean, Japanese, Arabic, Bulgarian, Czech, Danish, German, Greek, Spanish, Estonian, Finnish, French, Hindi, Croatian, Hungarian, Indonesian, Italian, Lithuanian, Latvian, Dutch, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Swedish, Turkish, Ukrainian, and Vietnamese, plus a language-agnostic mode.
Supertonic ships with 10 preset voice styles: five male (M1 through M5) and five female (F1 through F5). Each voice can speak any of the supported languages.
Supertonic is free to use. Supertonic 3 is released under the MIT license, one of the most permissive open-source licenses. On this site it runs directly in your browser with no account, no signup, and no usage caps.
At about 99 million parameters, Supertonic 3 is roughly 7 to 20 times smaller than open TTS systems in the 0.7B to 2B parameter range while staying competitive on quality benchmarks. The smaller model size means faster cold starts, smaller downloads, and lower memory usage, which is what makes browser inference practical. For voice cloning, voice design, and 600+ languages, Voice Creator Pro is the desktop counterpart.
The Supertonic 3 model is about 400 MB on first download and is cached locally afterwards. It runs best on browsers with WebGPU support (Chrome, Edge), where it can use your GPU for inference. WebAssembly is used automatically as a fallback. Any recent laptop or desktop can run it, but the first download takes longer than it would for a smaller model.
Chrome and Edge work best because they support WebGPU acceleration. Firefox and Safari work too but fall back to WebAssembly, which is slower. The first run downloads the model and caches it; subsequent runs are much faster.
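To get a feel for what a roughly 400 MB first download means in practice, here is a small TypeScript estimate. It is a back-of-the-envelope sketch: the 400 MB figure comes from the answer above, the link speeds are illustrative inputs rather than measurements, and `estimateDownloadSeconds` is a hypothetical helper that ignores latency, protocol overhead, and compression.

```typescript
// Approximate first-download size from the text, in megabytes.
const MODEL_DOWNLOAD_MB = 400;

// Estimated seconds to fetch the model at a given link speed in megabits
// per second. Uses 1 byte = 8 bits and nothing else.
function estimateDownloadSeconds(
  linkMbps: number,
  sizeMB: number = MODEL_DOWNLOAD_MB,
): number {
  if (!(linkMbps > 0)) {
    throw new RangeError("linkMbps must be a positive number");
  }
  return (sizeMB * 8) / linkMbps;
}

const onFiftyMbps = estimateDownloadSeconds(50);    // 64 seconds
const onHundredMbps = estimateDownloadSeconds(100); // 32 seconds
```

Because the model is cached after that first run, this cost is paid once per browser rather than on every visit.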