Nov 21, 08:53 PM

OpenAI's Updated GPT-4o: Performance Drops in Artificial Analysis Quality Index, GPQA Diamond, and MATH Scores Despite Speed Gains

OpenAI has released an updated version of its GPT-4o model that shows mixed results compared with the previous version from August 2024. The Artificial Analysis Quality Index fell from 77 to 71, the GPQA Diamond score dropped from 51% to 39%, and the MATH benchmark score declined from 78% to 69%. However, output speed rose from approximately 80 tokens per second to 180 tokens per second, and the new version shows improved creative writing capabilities. Competition in the AI landscape remains intense, particularly from Google’s Gemini model, and analysts note that forthcoming large models from OpenAI, Google, and Anthropic are not delivering the expected performance gains despite advances in training data and computing power. Some users have expressed dissatisfaction with the update, suggesting that constant adjustments could harm the model's overall quality.

#OpenAI #Artificial Analysis Quality Index #GPQA Diamond #Google #Gemini #Anthropic

Written with ChatGPT (GPT-4o mini).
