Jul 10, 04:24 AM

Musk’s X Launches 2,500-Question Benchmark to Probe AI Limits

X, the social-media and technology company controlled by Elon Musk, has introduced a new artificial-intelligence benchmark dubbed “Humanity’s Last Exam.” The test comprises about 2,500 questions designed to differentiate advanced reasoning systems from less capable models, with mathematics accounting for roughly 41% of the material and the remainder spread across sciences and the humanities. Musk said the release reflects a shortage of sufficiently difficult evaluation material as large language models master existing academic tests. He argued that, ultimately, the only meaningful assessment for AI will be its ability to build products that obey the laws of physics, citing examples such as cars, rockets and pharmaceuticals. “Physics is the law—everything else is a recommendation,” he said, adding that reality itself will become the definitive proving ground for future systems.


Written with ChatGPT.
