OuteAI has released OuteTTS-0.1-350M, a text-to-speech (TTS) synthesis model that takes a pure language modeling approach and needs no external adapters. The model is built on the LLaMa architecture and supports zero-shot voice cloning, so voices can be cloned without prior training. It runs on-device through llama.cpp. Separately, FishAudio has introduced Fish Agent v0.1 3B, a speech-to-speech model trained on 700,000 hours of multilingual audio. It is a continue-pretrained version of Qwen-2.5-3B-Instruct, trained on 200B audio and text tokens. It also supports zero-shot voice cloning, accepts both text and audio as input, and produces audio output; the announcement describes it as ultra-fast.
OuteAI’s new Smol TTS, OuteTTS-0.1-350M, lets you clone voices instantly without training. Built on LLaMa technology and free to use, it runs smoothly on your device. https://t.co/YaOKSo5j5u
Wow! New Speech to Speech model - Fish Agent v0.1 3B by @FishAudio 🔥 > Trained on 700K hours of multilingual audio > Continue-pretrained version of Qwen-2.5-3B-Instruct for 200B audio & text tokens > Zero-shot voice cloning > Text + audio input/ Audio output > Ultra-fast… https://t.co/UvdwxGUm4w
OuteTTS-0.1-350M Released: A Novel Text-to-Speech (TTS) Synthesis Model that Leverages Pure Language Modeling without External Adapters https://t.co/O8fJF6Rpfg #TextToSpeech #AIAdvancements #OuteTTS #VoiceCloning #SpeechSynthesis #ai #news #llm #ml #research #ainews #innovati… https://t.co/Wc6slvIR6P