🏷️:Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs 🔗:https://t.co/4nwvH8Ipxd https://t.co/r51aDd8BxG
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs, Wang et al.: https://t.co/RzLSNENMer #DeepLearning #ChatGPT #LLM https://t.co/3L3mrgMJQY
A really weird paper by vibe, but an interesting thesis (if I understand it): LLMs don't do search well (and default to a primitive strategy of maximizing novelty rather than updating on a node's fertility) because early layers dominate decision-making (at least here); o1 copes w/ it https://t.co/pFYzmlBUm1 https://t.co/y0hQt1RmOK
Recent discussions highlight a phenomenon termed 'underthinking' in o1-like large language models (LLMs), contrasting it with the previously noted problem of overthinking. Wang et al. observe that these models switch between lines of reasoning too readily, abandoning promising reasoning paths before exploring them in depth, which hurts performance on hard problems. The commentary above frames this as a failure of search: the models appear to default to a strategy that maximizes novelty rather than updating on how fertile a partial solution has already proven to be, a behavior the tweet attributes to early layers dominating the decision-making process. The implications of these findings are still being explored.
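To make the commentary's framing concrete, here is a minimal sketch, not the paper's method, contrasting the two selection heuristics the tweet describes: a "novelty-first" selector that always expands the least-visited branch, and a "fertility-updating" selector that keeps a running value estimate per branch and mostly revisits branches whose rollouts have paid off. The toy tree, rewards, and budgets are made up for illustration.

```python
import random

random.seed(0)

# Hypothetical search tree: internal nodes map to children; leaves carry a reward.
TREE = {
    "root": ["A", "B", "C"],
    "A": ["A1", "A2"],
    "B": ["B1", "B2"],
    "C": ["C1", "C2"],
}
REWARD = {"A1": 0.1, "A2": 0.2, "B1": 0.9, "B2": 0.8, "C1": 0.1, "C2": 0.0}


def rollout(node):
    """Descend randomly to a leaf and return its reward."""
    while node in TREE:
        node = random.choice(TREE[node])
    return REWARD[node]


def novelty_first(budget=30):
    """Always expand the least-visited child of the root (pure novelty seeking)."""
    visits = {c: 0 for c in TREE["root"]}
    best = 0.0
    for _ in range(budget):
        child = min(visits, key=visits.get)  # maximize novelty only
        visits[child] += 1
        best = max(best, rollout(child))
    return best


def fertility_updating(budget=30, explore=0.2):
    """Mostly revisit the child whose rollouts have paid off so far (update on fertility)."""
    visits = {c: 0 for c in TREE["root"]}
    value = {c: 0.0 for c in TREE["root"]}
    best = 0.0
    for _ in range(budget):
        if random.random() < explore:
            child = random.choice(list(visits))  # occasional exploration
        else:
            child = max(value, key=value.get)    # exploit estimated fertility
        r = rollout(child)
        visits[child] += 1
        value[child] += (r - value[child]) / visits[child]  # running mean of rewards
        best = max(best, r)
    return best


print("novelty-first best reward:     ", novelty_first())
print("fertility-updating best reward:", fertility_updating())
```

Under the tweet's reading, an underthinking reasoning trace behaves like the first loop: each new thought chases an unvisited branch instead of deepening the branch that has already shown promise.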