gpt-5 plays Pokémon — 3x faster progress than o3: https://t.co/JUxIK8CB1y
GPT-5 just completed Pokémon Red in about 7 days, damn... impressive. It took only 6,470 steps, far fewer than o3, which took 18,184. Another key point: o3 took 15 days to finish, while GPT-5 did it in less than half that time. It also beat Claude and Gemini by a big margin. https://t.co/QE8Exg0aiK https://t.co/qizJLmbmvu
In terms of efficiency, GPT-5 completed Pokémon Red with only about a third of o3's steps (roughly two-thirds fewer). In this simple and abstract example, that works out to a step-efficiency gain of nearly 200%. https://t.co/3XKmZ8xISP https://t.co/NbC95NBCWE
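As a rough sanity check on those figures, here is a minimal Python sketch using only the step counts quoted above; the exact definition of "efficiency gain" (steps saved relative to GPT-5's count) is an assumption for illustration, not how the run itself reports it.

    # Back-of-the-envelope check of the step-efficiency figures quoted above.
    gpt5_steps = 6_470    # GPT-5's reported decision steps
    o3_steps = 18_184     # o3's reported decision steps

    step_fraction = gpt5_steps / o3_steps        # ~0.36: GPT-5 used about a third of o3's steps
    steps_saved = 1 - step_fraction              # ~0.64: roughly two-thirds fewer steps
    efficiency_gain = o3_steps / gpt5_steps - 1  # ~1.81: ~180%, in the ballpark of the quoted ~200%

    print(f"GPT-5 used {step_fraction:.0%} of o3's steps ({steps_saved:.0%} fewer)")
    print(f"Approximate step-efficiency gain: {efficiency_gain:.0%}")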
OpenAI’s experimental GPT-5 agent has completed the 1996 video game “Pokémon Red” in 6,470 in-game decision steps, according to developers involved in the test. The run required roughly one week of continuous play, less than half the 15 days logged by the company’s earlier o3 model. The result marks a nearly three-fold reduction in the number of steps compared with o3’s 18,184-step play-through, translating to an estimated gain of close to 200 percent in step efficiency. Observers said GPT-5 also outperformed rival large language models, including Anthropic’s Claude and Google’s Gemini, although detailed figures for those systems were not disclosed. While beating a vintage role-playing game is far removed from commercial applications, researchers view the exercise as a proxy for an AI system’s ability to plan, adapt and optimise over long sequences of actions, skills that could translate to robotics, code generation and complex decision-making tasks.