OpenAI has introduced two new audio models, 'gpt-4o-transcribe' and 'gpt-4o-mini-tts', advancing real-time transcription and speech synthesis. The 'gpt-4o-transcribe' model supports more than 100 languages and achieves an English word error rate of just 2.46%, while 'gpt-4o-mini-tts' lets developers customize the emotional expression of generated speech, a notable step forward in voice AI. OpenAI has also updated ChatGPT's voice mode, which now interrupts less frequently, allowing users to pause naturally mid-conversation. This update is available to all users; paid users additionally get an improved model personality described as more engaging, direct, and concise. Together, the updates reflect OpenAI's push toward more natural and effective voice interaction.
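For context, here is a minimal sketch of how these model names would plug into the OpenAI Python SDK. It only builds the keyword arguments that would be passed to `client.audio.speech.create()` (for TTS, where an `instructions` string steers emotional delivery) and `client.audio.transcriptions.create()` (for speech-to-text); the voice name and instruction text are illustrative assumptions, and no API call is made here.

```python
def tts_request(text: str, style: str) -> dict:
    """Kwargs for client.audio.speech.create() with the new TTS model.

    `style` maps to the `instructions` parameter, which is how
    gpt-4o-mini-tts customizes emotional expression.
    """
    return {
        "model": "gpt-4o-mini-tts",
        "voice": "alloy",        # one of the built-in voices (assumed choice)
        "input": text,
        "instructions": style,   # e.g. "Speak in a warm, upbeat tone."
    }


def transcribe_request(audio_path: str) -> dict:
    """Kwargs for client.audio.transcriptions.create() with the new STT model.

    In a real call, `file` would be an open binary file handle,
    not a path string.
    """
    return {
        "model": "gpt-4o-transcribe",
        "file": audio_path,
    }


req = tts_request("Hello there!", "Speak in a warm, upbeat tone.")
print(req["model"])  # → gpt-4o-mini-tts
```

In actual use these dicts would be unpacked into the SDK calls, e.g. `client.audio.speech.create(**req)` with an authenticated `OpenAI()` client.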
ChatGPT's Advanced Voice Mode has been updated and is rolling out now. OpenAI's AVM now has a better personality and will interrupt less. https://t.co/pzQN26pQZ3 https://t.co/Wpq1uJKklt
🆕 ChatGPT’s voice mode now interrupts you less so you can take pauses and gather your thoughts. This update is available for all users. For paid users, this new version also improves the model’s personality to be more engaging, direct, and concise. Give it a try and let us know what you think.
Nice. @OpenAI’s ChatGPT Advanced Voice Mode (AVM) just got an update, shipping today; enhanced personality + smoother convos + fewer interruptions. Feelin' the AGI vibe. https://t.co/55bPlsqr1E