Try MOSS-TTS-Nano Online for Free
Multilingual text-to-speech and voice cloning in 18 languages. Runs entirely in your browser. No signup, no install, completely free.
Why MOSS
Why Use MOSS-TTS-Nano
MOSS-TTS-Nano is a compact, open-source TTS model with just 100 million parameters, released under the Apache 2.0 license by MOSI.AI and the OpenMOSS team. Despite its small size (~715 MB), it delivers multilingual voice cloning in 18 languages with 48kHz stereo output.
MOSS is the only model in the free tool that supports voice cloning in languages other than English, making it the go-to choice for multilingual projects.
Multilingual Voice Cloning
Clone voices in 18 languages including English, Chinese, Japanese, Korean, German, Spanish, French, Arabic, and more.
48kHz Stereo Output
Many TTS models output mono audio at 22kHz or 24kHz. MOSS-TTS-Nano produces rich, natural-sounding stereo audio at a 48kHz sample rate.
Real-Time Streaming
Hear audio as it is generated, with streaming output. Designed to run in real time, even on CPU.
Runs in Your Browser
No download, no server, no signup. Runs locally via WebGPU/WASM with full privacy.
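For developers curious how a page might choose between WebGPU and WASM, here is a minimal TypeScript sketch. It is not the tool's actual code: the `RuntimeEnv` shape and the `pickBackend` function are hypothetical names used for illustration. It relies only on the standard browser globals `navigator.gpu` and `WebAssembly`.

```typescript
// Minimal shape of the globals this sketch inspects (hypothetical helper type).
interface RuntimeEnv {
  navigator?: { gpu?: unknown };
  WebAssembly?: unknown;
}

type Backend = "webgpu" | "wasm" | "unsupported";

// Pick the fastest backend the environment appears to offer.
// The presence of `navigator.gpu` does not guarantee a usable adapter:
// a real app would still call `navigator.gpu.requestAdapter()` and handle null.
function pickBackend(env: RuntimeEnv): Backend {
  if (env.navigator?.gpu !== undefined) return "webgpu";
  if (typeof env.WebAssembly === "object" && env.WebAssembly !== null) return "wasm";
  return "unsupported";
}

// In a browser you would pass the real globals, for example:
// pickBackend(globalThis as unknown as RuntimeEnv)
```

Passing the environment in as an argument keeps the check easy to test outside a browser.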
Get Started
How It Works
MOSS-TTS-Nano supports two modes: text-to-speech with built-in voices, and voice cloning from an audio sample.
Open the free tool and choose your mode
Go to the MOSS-TTS-Nano tool in your browser. Pick a built-in voice for standard TTS, or upload an audio sample for voice cloning. No download or signup required.
Type your text
Enter the text you want spoken. For built-in voices, you can use English, Chinese, or Japanese. For voice cloning, you can generate speech in any of the 18 supported languages.
Generate and listen
Hit generate and your audio streams in real time. Download the result as a high-quality 48kHz stereo audio file.
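To put 48kHz stereo in concrete terms, the short TypeScript sketch below computes the raw data rate of uncompressed PCM audio. The 16-bit sample depth is an assumption made for the arithmetic, since this page does not state the bit depth or file format of the download. The 24kHz mono figure is a typical comparison point for other TTS models.

```typescript
// Bytes per second of uncompressed PCM audio.
function pcmBytesPerSecond(sampleRateHz: number, channels: number, bitsPerSample: number): number {
  return sampleRateHz * channels * (bitsPerSample / 8);
}

// Assumption: 16 bits per sample (not stated on this page).
const mossRate = pcmBytesPerSecond(48_000, 2, 16); // 192000 bytes per second
const monoRate = pcmBytesPerSecond(24_000, 1, 16); // 48000 bytes per second

// One minute of 48kHz stereo audio, in megabytes (10^6 bytes).
const oneMinuteMB = (mossRate * 60) / 1_000_000; // 11.52

console.log(mossRate, monoRate, oneMinuteMB);
```

Under that assumption, 48kHz stereo carries four times the raw data of 24kHz mono, so expect larger files for the same length of speech.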
Use Cases
Who Is MOSS-TTS-Nano For?
Multilingual Content Creators
Create voiceovers in multiple languages using a single cloned voice. Produce content in English, Chinese, Japanese, Korean, Spanish, French, and more without hiring voice actors for each language.
Localization Teams
Generate localized audio for apps, games, and media across 18 languages. Maintain a consistent brand voice across regions with multilingual cloning.
Developers and Researchers
Prototype multilingual voice features, test TTS pipelines, or experiment with a compact open-source model you can inspect and build on. Apache 2.0 licensed for maximum flexibility.
Accessibility
Create personalized synthetic voices for people who speak different languages, helping them communicate in their native language with a familiar voice.
Getting the Most Out of MOSS
Tips for Best Results
Use clean reference audio for cloning
Background noise, music, or room echo will affect the cloned voice. Use a clear recording with minimal background noise and a single speaker for the best results.
Try different built-in voices for each language
MOSS includes built-in voices for English, Chinese, and Japanese. Try different voices to find the best match for your content before turning to cloning.
Streaming works best in Chrome
For the smoothest real-time streaming experience, use Chrome or another Chromium-based browser. Firefox and Safari may have limited support for the WebGPU and WASM features the tool relies on.
Keep reference clips short and focused
A few seconds of clear, natural speech gives MOSS everything it needs to clone a voice. Longer clips do not necessarily improve quality.
Match the language to your use case
For voice cloning, make sure the text you enter matches one of the 18 supported cloning languages. Built-in voice TTS is available in English, Chinese, and Japanese.
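If you are scripting around the tool, the language rule above can be expressed as a small lookup. This TypeScript sketch is illustrative only. The ISO 639-1 codes and the names `Mode`, `BUILTIN_LANGS`, `CLONE_LANGS`, and `isSupported` are this example's own conventions, not part of MOSS-TTS-Nano. The language sets themselves come from the FAQ on this page.

```typescript
type Mode = "builtin" | "clone";

// Built-in voices: English, Chinese, Japanese.
const BUILTIN_LANGS = new Set(["en", "zh", "ja"]);

// Voice cloning: the 18 languages listed in the FAQ, as ISO 639-1 codes
// (the codes are this sketch's assumption; the page lists languages by name).
const CLONE_LANGS = new Set([
  "en", "zh", "ja", "ko", "de", "es", "fr", "it", "hu",
  "ru", "ar", "pl", "pt", "cs", "da", "sv", "el", "tr",
]);

// Returns true when the given mode can speak the given language.
function isSupported(mode: Mode, lang: string): boolean {
  const langs = mode === "builtin" ? BUILTIN_LANGS : CLONE_LANGS;
  return langs.has(lang.toLowerCase());
}

console.log(isSupported("builtin", "ko")); // false: Korean has no built-in voice
console.log(isSupported("clone", "ko"));   // true: Korean is a cloning language
```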
Voice Creator Pro
Need more languages, speed, or a commercial license?
Voice Creator Pro gives you GPU acceleration, voice cloning in 600+ languages, voice design from text descriptions, a local REST API, and a commercial use license. One-time purchase of $49.99.
Voice Cloning in 600+ Languages
Go beyond 18 languages with additional open-source models for TTS across 600+ languages
GPU-Accelerated Processing
Faster generation with NVIDIA, Apple Silicon, AMD, and Intel GPU support
Voice Design from Text
Describe a voice in plain text and the AI creates it. No audio samples needed.
Advanced Voice Cloning
Multiple cloning models including Chatterbox Multilingual for 20+ language support
Local REST API
Automate voice generation in your own apps and workflows
Commercial License
Full rights to use generated audio in commercial projects. One-time $49.99 purchase.
FAQ
Common Questions
What is MOSS-TTS-Nano?
MOSS-TTS-Nano is a compact, open-source text-to-speech model with 100 million parameters. It supports both built-in voices and voice cloning across 18 languages. The model uses an autoregressive audio tokenizer combined with an LLM pipeline to produce high-quality 48kHz stereo audio, and it is designed to run efficiently without a GPU.
Who made MOSS-TTS-Nano?
MOSS-TTS-Nano was developed by MOSI.AI and the OpenMOSS team. It is released under the Apache 2.0 license, making it freely available for both personal and commercial use.
What languages does MOSS-TTS-Nano support?
For built-in voices, MOSS-TTS-Nano supports English, Chinese, and Japanese. For voice cloning, it supports 18 languages: English, Chinese, Japanese, Korean, German, Spanish, French, Italian, Hungarian, Russian, Arabic, Polish, Portuguese, Czech, Danish, Swedish, Greek, and Turkish. MOSS is the only model in the free tool that supports cloning in languages other than English.
How do I clone a voice with MOSS-TTS-Nano?
Upload a short audio sample of the voice you want to clone, then type the text you want spoken. MOSS will generate speech in that voice. For best results, use a clear recording with a single speaker and minimal background noise. You can clone voices in any of the 18 supported languages.
What audio quality does MOSS-TTS-Nano produce?
MOSS-TTS-Nano produces 48kHz stereo audio, which is higher quality than many other TTS models that output mono audio at 22kHz or 24kHz. The result is richer, more natural-sounding speech.
Is MOSS-TTS-Nano free to use?
Yes. MOSS-TTS-Nano is open-source software released under the Apache 2.0 license. On this site, it runs directly in your browser with no account, no signup, and no usage limits.
How is MOSS-TTS-Nano different from MOSS-TTS?
MOSS-TTS-Nano is a smaller, more efficient version designed to run in real time on CPUs and in the browser. The full MOSS-TTS model is larger and may produce higher-fidelity output in some cases, but MOSS-TTS-Nano retains the core capabilities, including multilingual voice cloning and 48kHz stereo output.
Which browsers does MOSS-TTS-Nano work in?
Chrome, Edge, and other Chromium-based browsers work best. Firefox and Safari have limited support for the WebGPU and WASM features that MOSS-TTS-Nano uses for acceleration. For the best streaming experience, use Chrome.