Try MOSS-TTS-Nano Online for Free

Multilingual text-to-speech and voice cloning in 18 languages. Runs entirely in your browser. No signup, no install, completely free.

18 Languages
Voice Cloning
Free and Unlimited

Why MOSS

Why Use MOSS-TTS-Nano

MOSS-TTS-Nano is a compact, open-source TTS model with just 100 million parameters, released under the Apache 2.0 license by MOSI.AI and the OpenMOSS team. Despite its small size (~715 MB), it delivers multilingual voice cloning in 18 languages with 48kHz stereo output.

MOSS is the only model in the free tool that supports voice cloning in languages other than English, making it the go-to choice for multilingual projects.

Multilingual Voice Cloning

Clone voices in 18 languages including English, Chinese, Japanese, Korean, German, Spanish, French, Arabic, and more.

48kHz Stereo Output

Outputs stereo audio at a 48kHz sample rate, compared with the 22kHz or 24kHz mono output of many other TTS models, for richer, more natural-sounding speech.

Real-Time Streaming

Streaming output lets you hear the audio as it is generated. The model is designed to run in real time, even on a CPU.

Runs in Your Browser

No download, no server, no signup. Runs locally via WebGPU/WASM with full privacy.

Get Started

How It Works

MOSS-TTS-Nano supports two modes: text-to-speech with built-in voices, and voice cloning from an audio sample.

1. Open the free tool and choose your mode

Go to the MOSS-TTS-Nano tool in your browser. Pick a built-in voice for standard TTS, or upload an audio sample for voice cloning. No download or signup required.

2. Type your text

Enter the text you want spoken. For built-in voices, you can use English, Chinese, or Japanese. For voice cloning, you can generate speech in any of the 18 supported languages.

3. Generate and listen

Hit generate and your audio streams in real time. Download the result as a high-quality 48kHz stereo audio file.

Use Cases

Who Is MOSS-TTS-Nano For?

Multilingual Content Creators

Create voiceovers in multiple languages using a single cloned voice. Produce content in English, Chinese, Japanese, Korean, Spanish, French, and more without hiring voice actors for each language.

Localization Teams

Generate localized audio for apps, games, and media across 18 languages. Maintain a consistent brand voice across regions with multilingual cloning.

Developers and Researchers

Prototype multilingual voice features, test TTS pipelines, or experiment with a compact open-source model you can inspect and build on. Apache 2.0 licensed for maximum flexibility.

Accessibility

Create personalized synthetic voices for people who speak different languages, helping them communicate in their native language with a familiar voice.

Getting the Most Out of MOSS

Tips for Best Results

Use clean reference audio for cloning

Background noise, music, or room echo will affect the cloned voice. Use a clear recording with minimal background noise and a single speaker for the best results.

Try different built-in voices for each language

MOSS includes built-in voices for English, Chinese, and Japanese. Try different voices to find the best match for your content before turning to cloning.

Streaming works best in Chrome

For the smoothest real-time streaming experience, use Chrome or another Chromium-based browser. Firefox and Safari may have limited support for some features.

Keep reference clips short and focused

A few seconds of clear, natural speech gives MOSS everything it needs to clone a voice. Longer clips do not necessarily improve quality.
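One way to sanity-check a reference clip before uploading it is to read its header locally. The sketch below is not part of MOSS-TTS-Nano or the browser tool. It assumes the clip is an uncompressed WAV file, and the `check_reference_clip` name and the 3 to 15 second window are illustrative guesses at what "a few seconds" means, not documented limits.

```python
import wave


def check_reference_clip(path, min_seconds=3.0, max_seconds=15.0):
    """Return a list of warnings for a WAV reference clip (empty if none).

    The duration window is a hypothetical stand-in for "a few seconds";
    it is not a limit documented by MOSS-TTS-Nano.
    """
    # Read only the header fields needed to compute the clip length.
    with wave.open(path, "rb") as clip:
        frames = clip.getnframes()
        rate = clip.getframerate()
    duration = frames / rate

    warnings = []
    if duration < min_seconds:
        warnings.append(f"clip is short ({duration:.1f} s)")
    if duration > max_seconds:
        warnings.append(f"clip is long ({duration:.1f} s); consider trimming")
    return warnings
```

A check like this only looks at length. It cannot detect background noise, echo, or multiple speakers, so listening to the clip yourself is still the most reliable test.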

Match the language to your use case

For voice cloning, make sure the text you enter matches one of the 18 supported cloning languages. Built-in voice TTS is available in English, Chinese, and Japanese.

Voice Creator Pro

Need more languages, speed, or a commercial license?

Voice Creator Pro gives you GPU acceleration, voice cloning in 600+ languages, voice design from text descriptions, a local REST API, and a commercial use license. One-time purchase of $49.99.

Voice Cloning in 600+ Languages

Go beyond 18 languages with additional open-source models for TTS across 600+ languages

GPU-Accelerated Processing

Faster generation with NVIDIA, Apple Silicon, AMD, and Intel GPU support

Voice Design from Text

Describe a voice in plain text and the AI creates it. No audio samples needed

Advanced Voice Cloning

Multiple cloning models including Chatterbox Multilingual for 20+ language support

Local REST API

Automate voice generation in your own apps and workflows

Commercial License

Full rights to use generated audio in commercial projects. One-time $49.99 purchase

FAQ

Common Questions

What is MOSS-TTS-Nano?

MOSS-TTS-Nano is a compact, open-source text-to-speech model with 100 million parameters. It supports both built-in voices and voice cloning across 18 languages. The model uses an autoregressive audio tokenizer combined with an LLM pipeline to produce high-quality 48kHz stereo audio, and it is designed to run efficiently without a GPU.

Who made MOSS-TTS-Nano?

MOSS-TTS-Nano was developed by MOSI.AI and the OpenMOSS team. It is released under the Apache 2.0 license, making it freely available for both personal and commercial use.

What languages does MOSS-TTS-Nano support?

For built-in voices, MOSS-TTS-Nano supports English, Chinese, and Japanese. For voice cloning, it supports 18 languages: English, Chinese, Japanese, Korean, German, Spanish, French, Italian, Hungarian, Russian, Arabic, Polish, Portuguese, Czech, Danish, Swedish, Greek, and Turkish. MOSS is the only model in the free tool that supports cloning in languages other than English.
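If you plan content in several languages, the two lists above can be captured in a small lookup. This is only an illustrative sketch: the language names are copied from the answer above, while the `modes_for` helper and the mode labels are hypothetical and do not correspond to any MOSS-TTS-Nano API.

```python
# Language lists copied from the text above.
BUILT_IN_LANGUAGES = {"English", "Chinese", "Japanese"}
CLONING_LANGUAGES = BUILT_IN_LANGUAGES | {
    "Korean", "German", "Spanish", "French", "Italian", "Hungarian",
    "Russian", "Arabic", "Polish", "Portuguese", "Czech", "Danish",
    "Swedish", "Greek", "Turkish",
}


def modes_for(language):
    """Return the modes (hypothetical labels) that support a language."""
    modes = []
    if language in BUILT_IN_LANGUAGES:
        modes.append("built-in voice")
    if language in CLONING_LANGUAGES:
        modes.append("voice cloning")
    return modes
```

For example, `modes_for("Korean")` returns only the cloning mode, which matches the guidance that languages other than English, Chinese, and Japanese need a reference clip.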

How do I clone a voice?

Upload a short audio sample of the voice you want to clone, then type the text you want spoken. MOSS will generate speech in that voice. For best results, use a clear recording with a single speaker and minimal background noise. You can clone voices in any of the 18 supported languages.

What audio quality does MOSS-TTS-Nano produce?

MOSS-TTS-Nano produces 48kHz stereo audio, a higher sample rate and channel count than the 22kHz or 24kHz mono output of many other TTS models. The result is richer, more natural-sounding speech.
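To put those numbers in perspective, the raw data rate of each format can be worked out directly. The sketch below assumes uncompressed 16-bit PCM and treats "22kHz" as the common 22,050 Hz rate; both are assumptions for illustration, not details stated about MOSS-TTS-Nano.

```python
def pcm_bytes_per_second(sample_rate_hz, channels, bytes_per_sample=2):
    """Raw (uncompressed) PCM data rate; 16-bit samples are an assumption."""
    return sample_rate_hz * channels * bytes_per_sample


stereo_48k = pcm_bytes_per_second(48_000, channels=2)  # 192000 bytes/s
mono_24k = pcm_bytes_per_second(24_000, channels=1)    # 48000 bytes/s
mono_22k = pcm_bytes_per_second(22_050, channels=1)    # 44100 bytes/s

print(stereo_48k / mono_24k)  # 4.0
```

Under these assumptions, 48kHz stereo carries four times as much raw audio data per second as 24kHz mono. More data does not by itself guarantee better-sounding speech, but it does explain why downloaded files are larger.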

Is MOSS-TTS-Nano free to use?

Yes. MOSS-TTS-Nano is open-source software released under the Apache 2.0 license. On this site, it runs directly in your browser with no account, no signup, and no usage limits.

How is MOSS-TTS-Nano different from the full MOSS-TTS model?

MOSS-TTS-Nano is a smaller, more efficient version designed to run in real time on CPUs and in the browser. The full MOSS-TTS model is larger and may produce higher-fidelity output in some cases, but MOSS-TTS-Nano retains the core capabilities, including multilingual voice cloning and 48kHz stereo output.

Which browsers are supported?

Chrome, Edge, and other Chromium-based browsers work best. Firefox and Safari have limited support for the WebGPU and WASM features that MOSS-TTS-Nano uses for acceleration. For the best streaming experience, use Chrome.