X, the social-media and technology company controlled by Elon Musk, has spotlighted an artificial-intelligence benchmark dubbed “Humanity’s Last Exam,” which was developed by the Center for AI Safety and Scale AI. The test comprises about 2,500 questions designed to differentiate advanced reasoning systems from less capable models, with mathematics accounting for roughly 41% of the material and the remainder spread across the sciences and the humanities. Musk said the benchmark reflects a shortage of sufficiently difficult evaluation material as large language models master existing academic tests. He argued that, ultimately, the only meaningful assessment for AI will be its ability to build products that obey the laws of physics, citing examples such as cars, rockets and pharmaceuticals. “Physics is the law, everything else is a recommendation,” he said, adding that reality itself will become the definitive proving ground for future systems.
🚨ELON: WE’RE RUNNING OUT OF QUESTIONS TO ASK OUR AI... REALITY IS THE ULTIMATE TEST "We're running out of actual test questions to ask. Questions that are ridiculously hard for humans are becoming trivial for AI. Physics is the law—you can't break physics. The ultimate https://t.co/6fYB4SziUR https://t.co/mW07uL4pUX
🚨 ELON MUSK on AI: "Physics is the law. Everything else? Just a recommendation." He says truly intelligent AI must be grounded in reality making predictions that align with the real world. No hype. Just physics. #AI #ElonMusk #Grok4 #TechNews https://t.co/Rgb7o7gQnQ
🚨ELON MUSK: "We're running out of test questions to ask AI. Everything has become trivial. The ultimate reasoning test for the AI is going to be reality. Can it create a car, or a rocket, or a new medication? Does it work? Reality is the ultimate judge." https://t.co/FpGHQz4JB8