OpenAI has taken its Realtime API out of beta and made the service generally available, expanding the WebSocket-based platform with support for Model Context Protocol (MCP) servers, Session Initiation Protocol (SIP) phone calling, image inputs and new WebRTC video capabilities. The update is designed to help developers deploy low-latency, production-grade voice agents that can draw on additional context and integrate more easily with external tools. Alongside the API upgrade, the company unveiled gpt-realtime, described as its most advanced speech-to-speech model to date. Trained in collaboration with enterprise customers for tasks such as customer support and virtual assistance, the model delivers more natural intonation, handles language switching and recognises non-verbal cues. Internal benchmarks show MultiChallenge audio instruction-following accuracy of 30.5%, up from 20.6% for the December 2024 model, while its Big Bench Audio reasoning score improved to 82.8% from 65.6%. The launch also introduces two exclusive voices, Cedar and Marin, and refines asynchronous function calling so conversations can continue while long-running tasks complete. According to OpenAI, the Realtime API now processes audio in a single end-to-end model, avoiding the latency penalties of separate speech-to-text and text-to-speech pipelines. gpt-realtime is priced 20% below the earlier gpt-4o preview at US$32 per million audio input tokens and US$64 per million output tokens, with fine-grained context controls intended to cut costs for longer sessions. The new model and the enhanced Realtime API are available to all developers globally starting 28 August 2025.
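The pricing figures above can be worked through with a little arithmetic. The following is a minimal sketch, not an official calculator: the function names and the sample token counts are hypothetical, and it assumes the quoted 20% reduction applies equally to the input and output rates.

```python
# Per-million-token audio prices quoted in the article (US dollars).
AUDIO_INPUT_USD_PER_M = 32.0
AUDIO_OUTPUT_USD_PER_M = 64.0
QUOTED_DISCOUNT = 0.20  # "priced 20% below" the earlier preview model


def implied_previous_price(current_price: float, discount: float = QUOTED_DISCOUNT) -> float:
    """Back out the earlier price implied by a percentage reduction.

    Assumption: the discount applies uniformly to this rate.
    """
    return current_price / (1.0 - discount)


def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the audio cost of one session in US dollars."""
    return (
        input_tokens * AUDIO_INPUT_USD_PER_M
        + output_tokens * AUDIO_OUTPUT_USD_PER_M
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical session: 500k audio input tokens, 250k audio output tokens.
    print(f"Session cost: ${session_cost(500_000, 250_000):.2f}")
    print(f"Implied earlier input rate: ${implied_previous_price(AUDIO_INPUT_USD_PER_M):.2f}")
    print(f"Implied earlier output rate: ${implied_previous_price(AUDIO_OUTPUT_USD_PER_M):.2f}")
```

Because output tokens cost twice as much as input tokens, a session's cost is weighted toward how much the model speaks. That is one reason the context controls mentioned above matter for longer sessions.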
OpenAI just dropped major upgrades to the realtime API - and a new speech model (GPT-realtime). Here's what they announced + what it means for startups: https://t.co/rHzEtxtxYW
OpenAI makes Realtime API generally available with new features, including MCP support, and launches gpt-realtime, its most advanced speech-to-speech model (@sabrinaa_ortiz / ZDNET) https://t.co/y3DbQZHPcX https://t.co/LRIWXejydt
Huge Realtime API release today! Details below, but TLDR:
- GA (out of beta)
- better instruction following, naturalness, audio
- MCP support
- new voices
- SIP (telephony) support
- new WebRTC APIs and video support
Demos: https://t.co/td6Cx2Eh0g, or call 425-800-0042