Dataocean AI, in collaboration with Tsinghua University, has launched Dolphin, an advanced open-source Automatic Speech Recognition (ASR) model. Dolphin supports 40 Eastern languages and 22 Chinese dialects and draws on 21.2 million hours of data, of which 7.4 million hours are open data. The model is released under the Apache 2.0 license and is intended to improve multilingual speech recognition. Separately, ByteDanceOSS has introduced MegaTTS3, an open Text-to-Speech (TTS) model that supports English and Chinese and offers high-quality voice cloning and accent intensity control. Meanwhile, Roblox has launched an open-source voice safety classifier that now supports seven new languages and has surpassed 23,000 downloads on GitHub and Hugging Face.
Dolphin: Advanced Multilingual ASR Model for Eastern Languages and Dialects #DolphinASR #MultilingualTechnology #SpeechRecognition #EasternLanguages #AIInnovation https://t.co/hGqjtc9Ykr https://t.co/w8wK4B30rP
Researchers from Dataocean AI and Tsinghua University Introduce Dolphin: A Multilingual Automatic Speech Recognition (ASR) Model Optimized for Eastern Languages and Dialects. Researchers from Dataocean AI and Tsinghua University have introduced Dolphin, a comprehensive… https://t.co/Pcyh1I2LfA
Our newly launched open-source voice safety classifier now supports 7 new languages with significant accuracy improvements. We're thrilled to see our community engaging with the model, which has now surpassed 23K downloads on GitHub and Hugging Face. https://t.co/sY81G32b9t