ElevenLabs has launched Eleven v3 (alpha), a new version of its text-to-speech model that supports more than 70 languages, over double the 33 supported by v2. The model offers enhanced expressiveness, including multi-speaker dialogue with contextual awareness and inline audio tags such as [excited], [sighs], [laughing], and [whispers]. The company has also announced a competition inviting users to showcase their best voice generations created with Eleven v3, with winners receiving Meta Ray-Ban AI Glasses.

Other companies are advancing voice AI in parallel: Cartesia AI's Ink-Whisper is a fast, affordable streaming speech-to-text model optimized for real-world conditions, and HeyGen's Avatar IV update improves gesture control, facial expressions, and script-driven animation. ChatGPT's voice mode also received a 2025 update that makes conversations more human-like, with subtler intonation and real-time responsiveness. Separately, ElevenLabs introduced Speech Studio, a tool offering dozens of voices and tones for text-to-speech conversion.
Eleven v3 (alpha) is the most expressive Text to Speech model. v3 introduces:
• Multi-speaker dialogue with contextual awareness
• Support for 70+ languages, up from 33 in v2
• Audio tags such as [excited], [sighs], [laughing], and [whispers]
https://t.co/7YJJwwNDTP
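As an illustration of how the inline audio tags might be used, here is a minimal sketch against the ElevenLabs REST text-to-speech endpoint. The model identifier ("eleven_v3"), the placeholder voice ID and API key, and the assumption that bracketed tags are passed through in the request text are all unconfirmed details inferred from the announcement, not official API documentation.

```python
# Minimal sketch: generating an expressive clip with inline audio tags.
# Assumptions: the v1 text-to-speech endpoint accepts an Eleven v3 model id
# (guessed here as "eleven_v3") and interprets bracketed tags written inline
# with the script; VOICE_ID and API_KEY are placeholders.
import requests

API_KEY = "your-elevenlabs-api-key"   # placeholder
VOICE_ID = "your-voice-id"            # placeholder

# Audio tags such as [excited], [whispers], [laughing] are written inline.
script = (
    "[excited] We just launched the new model! "
    "[whispers] And it supports over seventy languages. "
    "[laughing] I still can't believe it."
)

response = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": script,
        "model_id": "eleven_v3",  # assumed identifier for Eleven v3 (alpha)
    },
)
response.raise_for_status()

# The endpoint returns audio bytes; write them out as an mp3 file.
with open("output.mp3", "wb") as f:
    f.write(response.content)
```

Multi-speaker dialogue would follow the same pattern, with each speaker's lines rendered under a different voice ID or via a dedicated dialogue endpoint if one is available.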
The latest Avatar IV update on @HeyGen_Official is massive 🔥🗣️ You can now control gestures via prompt, facial expressions are enhanced, and the script drives the animation. Huge upgrade. https://t.co/A3bmiw8BD6
ChatGPT Voice Update: Exciting New Features You Can't Miss — ChatGPT's voice mode just got a major glow-up in 2025! The Advanced Voice update brings smoother, more human-like conversations with subtler intonation and real-time https://t.co/GwwMhv2HHh