ARC-AGI-2, a new benchmark for evaluating artificial intelligence, has launched with puzzles that most leading AI models fail to solve. Developed by François Chollet, the creator of Keras, the benchmark tests the reasoning and abstraction skills considered essential to human-like intelligence. Human participants reportedly score around 60% on the test, while advanced AI models such as GPT-4.5 and Claude 3.7 Sonnet achieve only about 1%. The test speaks to the growing debate over progress toward artificial general intelligence (AGI), with some experts arguing that current AI systems remain far from true AGI. The puzzles' interactive, visual format has drawn media attention, including an interactive New York Times feature on ARC, reported by Cade Metz, that invites readers to try a puzzle themselves.
Interactive NYT feature on @fchollet & @arcprize. https://t.co/vUzOU4HF2W
One of ARC's strengths is its simple visual communication of complex ideas. The public is still broadly under-informed about the recent big AI capability leap (e.g. o3) and new limits. Lovely to see this interactive ARC feature today in NYT! Thank you @CadeMetz for the reporting. https://t.co/UtRppXUJHT
Are You Smarter Than A.I.? An interactive article by @nytimes covers @arcprize and @fchollet: "Some experts predict that A.I. will surpass human intelligence within the next few years. Play this puzzle to see how far the machines have to go." https://t.co/VYmLrjF4yL