F5-TTS is a fully non-autoregressive text-to-speech system based on flow matching with a Diffusion Transformer (DiT). It is open source and free to use, runs locally on the user's own machine, and performs zero-shot voice cloning from a voice sample you supply. Users have demonstrated this by generating speech in the voices of Peter Griffin, Donald Trump, Jarvis, and Scarlett Johansson's character from the movie 'Her'. F5-TTS can be installed and run with Python, and it has been paired with Dreamtalk for lip sync; both tools are available on Pinokio. An MLX port, f5-tts-mlx, installs with pip and generates speech locally from the command line.
Super easy to generate speech with F5 TTS + MLX locally thanks to @lllucas
1. pip install f5-tts-mlx
2. python -m f5_tts_mlx.generate --text "Hello world"
3. afplay output.wav (🔊)
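The generate and playback steps above could also be driven from a script. The following is a minimal sketch, not part of the f5-tts-mlx package: the helper names `build_generate_command` and `speak` are hypothetical, only the `--text` flag quoted in the tweet is used, and the output path `output.wav` is assumed from the tweet's playback step.

```python
import shutil
import subprocess
import sys


def build_generate_command(text):
    """Return the argv list for the generate step quoted in the tweet."""
    # Only the --text flag appears in the source; no other flags are assumed.
    return [sys.executable, "-m", "f5_tts_mlx.generate", "--text", text]


def speak(text, wav_path="output.wav"):
    """Generate speech, then play it with afplay if that tool is on PATH."""
    # Requires f5-tts-mlx to be installed (step 1 of the tweet).
    subprocess.run(build_generate_command(text), check=True)
    # Assumption: the audio is written to output.wav in the current
    # directory, as the tweet's playback step implies.
    player = shutil.which("afplay")
    if player is not None:
        subprocess.run([player, wav_path], check=True)
```

Calling `speak("Hello world")` would run the same two commands as steps 2 and 3. Passing the text as a list element rather than through a shell string avoids quoting problems when the text contains quotes or spaces.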
AI Mashup on steroids --- F5-TTS (voice clone) + Dreamtalk (lip sync), both available on pinokio! https://t.co/Vy3qgcVqxg
This is the best open source text-to-speech I’ve ever heard. It’s free and you can run it locally. It’s also unrestricted as you can give it any voice sample. In this video I have it speak as Peter Griffin, Trump, Jarvis, & Scarjo’s voice from HER. More in thread 👇 https://t.co/yHUbQmroYc