Oct 25, 02:24 PM

ZhipuAI and iFLYTEK Launch Bilingual Speech Models GLM-4-Voice and Spark Multilingual

ZhipuAI has launched GLM-4-Voice, an open-source end-to-end voice model that understands and generates speech in both Chinese and English. The model uses a speech tokenizer fine-tuned from the Whisper encoder, together with a decoder based on CosyVoice that converts discrete tokens back into speech. Separately, iFLYTEK announced its Spark Multilingual Model and Spark 4.0 Turbo, another sign of recent progress in multilingual AI.

#Chinese #English #Whisper #CosyVoice #Spark Multilingual Model

Written with ChatGPT (GPT-4o mini).

Sources

Vlad Ruso PhD@vlruso
2 years ago
Zhipu AI Releases GLM-4-Voice: A New Open-Source End-to-End Speech Large Language Model https://t.co/wPAasjIxLw #AIcommunication #ZhipuAI #GLM4Voice #SpeechTechnology #OpenSourceAI #ai #news #llm #ml #research #ainews #innovation #artificialintelligence #machinelearning #tech… https://t.co/Jmj6COiNRd
ChatGLM@ChatGLM
2 years ago
🎉 Exciting news! 📷 ZhipuAI has launched and open-sourced GLM-4-Voice, an end-to-end voice model that directly understands and generates Chinese and English speech! 📷🗣️ https://t.co/rCEdgLiwLm
ChatGLM@ChatGLM
2 years ago
🎉 Exciting news! 🌟 ZhipuAI has launched GLM-4-Voice, an end-to-end voice model that directly understands and generates Chinese and English speech! 🗣️ https://t.co/rCEdgLiwLm
