Researchers from the Max Planck Institute for Intelligent Systems and the University of Maryland have introduced 'Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach,' a method for strengthening the reasoning capabilities of language models. The technique scales computation at inference time by iterating a recurrent block, letting the model reason implicitly in latent space. Unlike approaches that spend extra compute by generating more tokens, it requires no specialized training data and works with small context windows. The team scaled a proof-of-concept model to 3.5 billion parameters and 800 billion training tokens; it improved performance on reasoning benchmarks, with gains continuing up to a test-time computation load equivalent to that of a 50-billion-parameter model.
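To make the recurrent-depth idea concrete, here is a minimal PyTorch sketch, assuming a toy setup: the class name `RecurrentDepthLM`, the use of a single `TransformerEncoderLayer` as the iterated core, and all dimensions are illustrative assumptions, not the authors' architecture. The point it shows is that the same weights can be run for more or fewer latent iterations at inference, so test-time compute scales without generating extra tokens.

```python
# Illustrative sketch only: iterate a recurrent block in latent space at test time.
import torch
import torch.nn as nn

class RecurrentDepthLM(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512, n_heads=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)            # tokens -> initial latent features
        self.core = nn.TransformerEncoderLayer(                   # the block iterated at inference
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.inject = nn.Linear(2 * d_model, d_model)             # re-inject the embedded input each step
        self.head = nn.Linear(d_model, vocab_size)                # latent state -> token logits

    def forward(self, input_ids, num_iterations=4):
        e = self.embed(input_ids)
        s = torch.randn_like(e)                                   # random initial latent state
        for _ in range(num_iterations):                           # more iterations = more test-time compute
            s = self.core(self.inject(torch.cat([s, e], dim=-1)))
        return self.head(s)

# The same model can be asked for more latent reasoning steps without retraining:
model = RecurrentDepthLM()
tokens = torch.randint(0, 32000, (1, 16))
logits_cheap = model(tokens, num_iterations=2)
logits_deep = model(tokens, num_iterations=32)
```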
How are AI models tested? Jaemin Han reveals the secrets behind LLM benchmarking and why creating your own might actually be the smartest move.
Jason Wei just posted a chart showing how rapidly AI has progressed over the past five years, and it really is worth a thousand words: you can see performance on a range of AI evaluation metrics (also called "benchmarks") improving quickly over time. What is a "benchmark"? Much like a school exam, it is how we test AI… https://t.co/YJwNSum975
The problem is that sequentially editing knowledge in LLMs leads to performance decline; this paper addresses that degradation during extensive sequential knowledge updates. It introduces ENCORE, a method combining Most Probable Early Stopping (MPES) with norm-constrained… https://t.co/o1o4s8jjIC
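For readers unfamiliar with the setting, here is a hypothetical sketch of the sequential-editing evaluation loop that this line of work studies; it is not the ENCORE algorithm. The function names `apply_edit` and `evaluate` are placeholder parameters standing in for any knowledge-editing method and any held-out benchmark.

```python
# Hypothetical sketch of sequential knowledge editing: apply edits one by one
# and re-measure downstream accuracy, which is where degradation accumulates.
from typing import Callable, List, Tuple

def sequential_editing_run(
    model,
    edits: List[Tuple[str, str]],        # (prompt, new_answer) pairs to write into the model
    apply_edit: Callable,                # placeholder for a knowledge-editing method
    evaluate: Callable,                  # placeholder for a held-out benchmark score
    eval_every: int = 100,
) -> List[Tuple[int, float]]:
    """Apply edits sequentially, logging (num_edits, accuracy) over time."""
    history = []
    for i, (prompt, new_answer) in enumerate(edits, start=1):
        model = apply_edit(model, prompt, new_answer)   # each edit perturbs the weights slightly
        if i % eval_every == 0:
            history.append((i, evaluate(model)))        # accuracy typically decays as edits pile up
    return history
```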