How much information do LLMs really memorize? Now we know, thanks to Meta, Google, Nvidia and Cornell https://t.co/NePmdnw1oe
Large Language Models Often Know When They Are Being Evaluated. Joe Needham, Giles Edkins (@gdedkins), Govind Pimpale (@GovindPimpale), Henning Bartsch, @MariusHobbhahn, @apolloaievals, @matsprogram https://t.co/yfEsl5IWni
Models can already tell when you are grading them. 😯 Your evaluation prompt has a scent; top LLMs smell it fast. Frontier language models can already sense when they are being tested. A new 1,000-item benchmark shows top systems spot evaluation prompts almost as well as https://t.co/NlJ0A0TNFU
A collaborative study by Meta, Google DeepMind, NVIDIA, and Cornell University has quantified the memorization capacity of large language models (LLMs), estimating that these models store roughly 3.6 bits of information per parameter. Once a training dataset carries more information than this total capacity, the model can no longer store individual examples verbatim and is pushed toward generalizing from them instead, which helps explain why training on larger datasets contributes to safer models and lower test loss. Separately, research indicates that LLMs perform strongly on structured emotional intelligence assessments, matching or exceeding human scores, though they still fall short of capturing the sensory and motor experience that grounds human understanding. Finally, frontier LLMs can often detect when they are being evaluated: a new 1,000-item benchmark shows that top systems identify evaluation prompts with notable accuracy. Together, these findings sharpen the picture of how LLMs balance memorization against learning, and of their growing awareness of being tested.
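To make the 3.6 bits/parameter figure concrete, here is a rough back-of-the-envelope sketch. The bits-per-parameter estimate is the one reported by the study; the 8B-parameter model size and the capacity-versus-dataset comparison below are illustrative assumptions, not numbers from the paper.

```python
# Back-of-the-envelope sketch of what ~3.6 bits/parameter implies.
# Assumptions: the model size below is hypothetical; only the
# bits-per-parameter estimate comes from the reported study.

BITS_PER_PARAM = 3.6  # estimated memorization capacity per parameter


def memorization_capacity_bits(num_params: int) -> float:
    """Total raw memorization capacity implied by the per-parameter estimate."""
    return BITS_PER_PARAM * num_params


# Hypothetical 8B-parameter model.
params = 8_000_000_000
capacity_bits = memorization_capacity_bits(params)
capacity_gb = capacity_bits / 8 / 1e9  # bits -> bytes -> gigabytes

print(f"Capacity: {capacity_bits:.2e} bits (~{capacity_gb:.1f} GB of raw data)")

# Intuition from the study: once the training corpus carries more information
# than this fixed budget, the model cannot memorize every example verbatim,
# so each example gets a smaller share of the budget and the model is pushed
# toward generalization rather than recall.
```

Under these assumptions the hypothetical 8B-parameter model tops out around a few gigabytes of raw memorized content, which is tiny next to a multi-terabyte training corpus, giving some intuition for why scaling data shifts the balance from memorization to generalization.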