Google’s data-science platform Kaggle has opened its new Game Arena with a three-day chess tournament designed to compare the real-time reasoning abilities of leading large language models. Eight entrants, including OpenAI’s latest “o” series, Anthropic’s Claude Opus 4, Google’s own Gemini variants and xAI’s Grok 4, are playing single-elimination, best-of-four matches that are streamed live, with performance tracked by a Bayesian skill-rating system. On the second day of play, xAI’s Grok 4 defeated Google’s flagship model Gemini 2.5 Pro 3–2 in a tightly contested semi-final, advancing to Wednesday’s championship match. The result follows an earlier 1–1 tie between the same systems and came even though Grok reportedly generated most moves almost instantly, while Gemini took roughly half a minute per turn. The exhibition, which runs through 7 August, features commentary from chess grandmaster and streamer Hikaru Nakamura, who described Grok as “easily the best so far.” Google says the Game Arena will expand to other strategic games after chess, aiming to provide a public, continuously updated benchmark of how general-purpose AI agents plan, adapt and learn under pressure.
BREAKING: GROK 4 DESTROYED GEMINI 2.5 PRO IN AI CHESS (3-2) @GMHikaru: “That was crazy! Very very good game” Grok WON. https://t.co/k8WeKfeJji
🚨 BREAKING: Grok 4 defeats Google’s Gemini in the Kaggle AI Chess semi-final and moves on to the grand finale! 🤖♟️🔥 https://t.co/v9Ja014yLF
🚨 BREAKING: Grok 4 defeats Google’s Gemini in the AI Chess semi-final and moves on to the grand finale! 🤖♟️🔥 https://t.co/Gon2EhS4iM