
A team of AI researchers, including Yann LeCun, has announced the launch of LiveBench, a general-purpose live LLM benchmark. LiveBench aims to address the limitations of existing LLM benchmarks by using contamination-free test data and objective scoring: questions are refreshed regularly and drawn from recently released sources, so models cannot simply have memorized the answers during training, and responses are graded against ground-truth answers rather than by an LLM judge. Developed in collaboration with Abacus AI, the benchmark is designed to be lightweight and easy to run, with around 200 questions per category, making it a more robust tool for evaluating AI models.
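LiveBench's actual grading code ships with its repository; as a rough illustration of what "objective scoring" means in practice, the hypothetical Python sketch below grades a model's answer against a known ground-truth value using a deterministic comparison, so rerunning the grader always produces the same score and no judge model is involved. All names in the sketch are illustrative, not LiveBench's API.

```python
# Illustrative sketch only: shows the idea of "objective scoring",
# i.e. grading against a known ground-truth answer with deterministic
# rules instead of an LLM judge. Function names are hypothetical.
import re


def normalize(text: str) -> str:
    """Lowercase, trim, and collapse whitespace so trivial formatting
    differences do not affect the score."""
    return re.sub(r"\s+", " ", text.strip().lower())


def score_answer(model_output: str, ground_truth: str) -> float:
    """Return 1.0 for a normalized exact match, else 0.0.

    Deterministic comparison against ground truth is what makes the
    score objective and reproducible across runs."""
    return 1.0 if normalize(model_output) == normalize(ground_truth) else 0.0


if __name__ == "__main__":
    # Hypothetical question/answer pair for demonstration.
    print(score_answer("  The answer is 42 ", "the answer is 42"))  # 1.0
    print(score_answer("43", "42"))                                 # 0.0
```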
