Microsoft has introduced its first fully in-house artificial-intelligence models, MAI-Voice-1 for speech and the MAI-1-preview large language model, signaling a move to lessen its reliance on OpenAI technology. The work is led by Mustafa Suleyman, the DeepMind co-founder who now heads the Microsoft AI division. MAI-Voice-1 can generate a full minute of high-fidelity audio in under a second on a single GPU. The model already powers Copilot Daily and Microsoft’s podcast features and is available in Copilot Labs, where developers can test its expressive, multilingual speech synthesis. MAI-1-preview, trained on roughly 15,000 Nvidia H100 GPUs, is Microsoft’s first text foundation model trained end to end in-house. It is being benchmarked on the public LMArena site, and Microsoft plans to integrate it into select Copilot text tasks in the coming weeks while offering early access to external developers. The rollout underscores Microsoft’s dual strategy of advancing proprietary AI while maintaining its $13 billion partnership with OpenAI. Suleyman said the company has a five-year roadmap that includes expanding its data-center capacity with Nvidia’s next-generation GB200 chips, with the aim of reaching billions of users through Microsoft products even as competition with OpenAI intensifies.
Microsoft AI releases its first in-house models, MAI-Voice-1 and MAI-1-preview. - MAI-Voice-1 generates high-fidelity, expressive audio for single- and multi-speaker scenarios, powering Copilot Daily and Podcasts, and available in Copilot Labs for testing expressive speech and https://t.co/qNWatjiLgH https://t.co/wCgoH8Vaed
Microsoft has released MAI-Voice-1 and MAI-1-preview. This release comes with Microsoft’s first in-house trained voice and text models. What do you think of the MAI-1 family of models? https://t.co/ockM98NNxF
Microsoft introduces a pair of in-house AI models https://t.co/gRkDapLgl5