OuteAI has released OuteTTS-0.2-500M, a lightweight text-to-speech (TTS) model with 500 million parameters. Built on Qwen-2.5-0.5B, it was trained on over 5 billion audio prompt tokens and supports English, Chinese, Korean, and Japanese. The release is described as sounding smoother, more natural, and more coherent, with upgraded voice cloning that improves diversity and accuracy; one user reports cloning their own voice from a 10-second clip. The model is reportedly small enough to run on devices with limited resources, such as a Raspberry Pi, which makes it a notable lightweight option among TTS models.
Just tried OuteTTS v0.2 and WOW - voice cloned myself from a 10s clip and it's mind-blowing! 500M param model with multilingual support & stellar results. - So small it can even run on a Raspberry Pi! - Supports EN/CN/KR/JP languages. https://t.co/mu9a4ht164
🔥 Exciting update for TTS fans! Meet OuteTTS-0.2-500M, the latest lightweight TTS model: 🎙️ Smoother, more natural, and more coherent sound than ever! 🌍 Supports multiple languages: Chinese, Korean, Japanese & more! 🧑‍🎤 Upgraded voice cloning with better diversity & accuracy.… https://t.co/lE0J8xaOkX
OuteTTS-0.2-500M, a 500M parameter text-to-speech model just released by @OuteAI . Built on Qwen-2.5-0.5B, trained on over 5 billion audio prompt tokens with multilingual capabilities for English, Chinese, Japanese, and Korean. → The model offers improved voice cloning,… https://t.co/vq3JLzdnT8