Researchers are studying whether transformers can learn to implicitly reason over parametric knowledge, a skill that challenges even advanced language models. The study focuses on two representative reasoning types, composition and comparison, and finds that transformers can acquire this skill, but only through grokking: extended training far beyond the point of overfitting.
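To make the grokking setup concrete, here is a minimal sketch (not the authors' code) of the kind of experiment the paper describes: a small transformer is trained on synthetic two-hop "composition" queries and training continues long after it fits the training set, while held-out in-distribution accuracy is monitored for delayed generalization. All sizes, hyperparameters, and the data construction below are illustrative assumptions; the paper's actual setup also trains on atomic facts and distinguishes ID from OOD entities, which this sketch omits.

```python
# Sketch of a grokking-style run on two-hop composition (assumed setup, not the paper's code).
import torch
import torch.nn as nn

torch.manual_seed(0)
N_ENT, N_REL = 200, 20                      # entities and relations (assumed sizes)
VOCAB = N_ENT + N_REL                       # entities and relations share one vocabulary

# Each relation is a random function entity -> entity (the "atomic facts").
rel_map = torch.randint(0, N_ENT, (N_REL, N_ENT))

# Two-hop queries (h, r1, r2) with answer r2(r1(h)); random ID train/test split.
queries = torch.stack(torch.meshgrid(
    torch.arange(N_ENT), torch.arange(N_REL), torch.arange(N_REL),
    indexing="ij"), dim=-1).reshape(-1, 3)
perm = torch.randperm(len(queries))
train_q, test_q = queries[perm[:20000]], queries[perm[20000:22000]]

def answers(q):
    hop1 = rel_map[q[:, 1], q[:, 0]]        # r1(h)
    return rel_map[q[:, 2], hop1]           # r2(r1(h))

def tokens(q):                              # entity id, then offset relation ids
    return torch.stack([q[:, 0], q[:, 1] + N_ENT, q[:, 2] + N_ENT], dim=1)

class TinyTransformer(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, d)
        self.pos = nn.Parameter(torch.zeros(3, d))
        layer = nn.TransformerEncoderLayer(d, nhead=4, dim_feedforward=512,
                                           dropout=0.0, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=4)
        self.out = nn.Linear(d, N_ENT)
    def forward(self, x):
        h = self.enc(self.emb(x) + self.pos)
        return self.out(h[:, -1])            # predict the tail entity

model = TinyTransformer()
# Weight decay is commonly reported as important for grokking-style generalization.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.1)
loss_fn = nn.CrossEntropyLoss()
x_tr, y_tr = tokens(train_q), answers(train_q)
x_te, y_te = tokens(test_q), answers(test_q)

for step in range(1, 50001):                 # keep going far past fitting the train set
    idx = torch.randint(0, len(x_tr), (512,))
    loss = loss_fn(model(x_tr[idx]), y_tr[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        model.eval()
        with torch.no_grad():
            tr = (model(x_tr[:2000]).argmax(-1) == y_tr[:2000]).float().mean()
            te = (model(x_te).argmax(-1) == y_te).float().mean()
        model.train()
        print(f"step {step}: train acc {tr:.2f}, held-out ID acc {te:.2f}")
```

The signature grokking pattern would be train accuracy saturating early while held-out ID accuracy stays near chance for many steps before climbing, which is why runs like this must be extended well beyond the usual early-stopping point.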
Thanks @_akhaliq for sharing our work. We study whether transformers can implicitly reason over their parametric knowledge, a skill that even today's most capable LLMs struggle with. We find that transformers can learn implicit reasoning, but only through grokking, i.e.,… https://t.co/hGMQGBXUef https://t.co/jQXokfOqrU
[LG] Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization B Wang, X Yue, Y Su, H Sun [The Ohio State University & CMU] (2024) https://t.co/w9Klf0vpXr - The paper studies whether transformers can learn to implicitly reason over… https://t.co/3jQh0tXUC3
this paper ought to have been called: "deep transformers are shallow learners" (curse the insufferable token "grok".) it confirms transformers - can (*at great effort*) learn ID (in-distribution) composition - cannot learn OOD (out-of-distribution) composition. grok is glib but also... https://t.co/tMamMojjSV https://t.co/4OkBf8FrIK