Sesame Labs has released its Conversational Speech Model (CSM), an open-source AI voice generation system licensed under Apache 2.0. Trained on 1 million hours of data, the model supports voice cloning, watermarking, and real-time synthesis. Built on the Llama architecture, CSM aims to produce more natural and engaging AI-generated speech by taking conversation history, tone, and rhythm into account. It reached #1 on Hugging Face within 24 hours of its release. Sesame Labs, co-founded by Brendan Iribe, is known for its viral virtual assistant, Maya, and CSM is positioned to improve AI voice assistants.
how the fuck do you use sesame csm? is there a guide? so far getting very poor results even when providing context. it just speaks a couple of words from the beginning of the text i give it in a very weird voice and then silence.
.@sesame now #1 on @huggingface 🚀 https://t.co/jnV8OaEs7i https://t.co/Lht7kNlWyq https://t.co/B2M2KyMMzE