Tencent Hunyuan, in collaboration with Tencent Music, has launched HunyuanVideo-Avatar, an AI model that transforms static photos and audio into dynamic, lifelike videos. The model supports emotion-controlled animations and multi-character scenarios with separate audio controls, and works across styles including cartoon, 3D, and real faces while preserving the subject's identity. It automatically detects scene context and emotion to generate realistic speech and singing animations. The single-character mode, which generates video from up to 14 seconds of audio, has been open-sourced and is available on the Tencent Hunyuan website; multi-character support is expected to follow soon.

Concurrently, Hume has introduced EVI 3, a personalized voice AI model that responds within 300 milliseconds and can mimic any voice with a personalized tone. Other advances in voice AI include Rime's Arcana TTS model, which captures natural vocal nuances such as laughter, accents, and sighs, and updates to NVIDIA's ACE suite that convert text to speech in multiple languages and turn audio into real-time facial animations via Audio2Face-3D. Together, these developments highlight rapid progress in AI-driven voice and avatar technologies aimed at enhancing human-machine interaction and content creation.
🤯This is the first time Voice #AI has felt… genuinely human. It responded with a natural voice, understood the intent, & came back with a genuinely clever answer. We can actually see tech like this being used across enterprise. @AI_NURIX moving fast! https://t.co/MHp42mIduv https://t.co/LqxO0o1kUT
🎤 Tencent’s new Hunyuan Avatar = AI magic. Upload a photo + voice = full animated video. → Realistic speech → Emotions + gestures → Even singing Creators, this is your new superpower. https://t.co/NxTEBhMUDI https://t.co/Ei6p6QeywV
Quick demo of Voiceflow's new V3 Voice AI Architecture - way faster - with all the power of Voiceflow's orchestration platform⚡️ https://t.co/sZrEgBprKo