Meta AI has introduced ReasonIR-8B, a retriever model trained specifically for reasoning-intensive information retrieval. Built on LLaMA3.1-8B, it is trained with a synthetic data generation pipeline that pairs challenging queries with plausible but misleading negatives, sharpening its reasoning ability. The model outperforms existing retrievers and rerankers on the BRIGHT benchmark, reaching an nDCG@10 of 36.9 while being 200 times more compute-efficient than leading large language model rerankers.

Separately, research on reinforcement learning with verifiable reward (RLVR) shows that large language models can substantially improve their mathematical reasoning from a single training example: with one-shot RLVR, Qwen2.5-Math-1.5B's accuracy on the MATH500 benchmark rose from 36.0% to 73.6%, and the gains generalize across tasks. Together, these results point to more data- and compute-efficient training and retrieval for reasoning-heavy AI workloads.
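For context on the headline metric, nDCG@10 measures how well the top ten retrieved documents are ordered relative to an ideal ordering of their relevance labels. Here is a minimal sketch of the computation with toy relevance values (illustrative only, not ReasonIR's evaluation code):

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k ranked results."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """nDCG@k: DCG of the ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Graded relevance of retrieved documents, in the order the retriever ranked them.
print(ndcg_at_k([3, 2, 0, 1, 0], k=10))  # ~0.985: near-ideal ordering
```

A score of 36.9 on BRIGHT means the average nDCG@10 across its queries is 0.369, reflecting how hard reasoning-intensive retrieval remains.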
LLMs often require large datasets for Reinforcement Learning with Verifiable Reward (RLVR) training, raising questions about data efficiency. This paper demonstrates that RLVR with just one training example (1-shot RLVR) can significantly enhance mathematical reasoning, matching https://t.co/6bhNfWDtUX
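The "verifiable" part of RLVR is what makes the one-shot setting plausible: the reward comes from checking the model's final answer against a known ground truth rather than from a learned reward model. A minimal sketch under that assumption (the \boxed{} convention and the extraction helper are illustrative, not the paper's code):

```python
import re

def extract_final_answer(completion: str) -> str | None:
    """Pull the contents of the last \\boxed{...} from a model completion."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 iff the extracted answer equals the reference."""
    answer = extract_final_answer(completion)
    return 1.0 if answer is not None and answer == ground_truth else 0.0

# One-shot RLVR repeatedly samples completions for a *single* training
# problem and reinforces the ones whose verifiable reward is 1.
print(verifiable_reward(r"... so the result is \boxed{42}", "42"))  # 1.0
```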
Reasoning & test-time scaling don't just matter for generating text with LLMs — @RulinShao, @ray_qiaorui & team show how these are key to retrieval quality. ReasonIR is SoTA on reasoning-intensive retrieval across multiple test-time compute budgets! https://t.co/oJnFlyyWYG
Meta AI Introduces ReasonIR-8B: A Reasoning-Focused Retriever Optimized for Efficiency and RAG Performance

Meta AI has released ReasonIR-8B, a retriever model designed explicitly for reasoning-intensive information retrieval. Trained from LLaMA3.1-8B, the model establishes new https://t.co/RRq16XZPuA
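At inference time a retriever like this works as a bi-encoder: queries and documents are embedded separately and scored by similarity, which is what keeps it far cheaper than LLM rerankers that must process every query-document pair. A generic sketch of that pattern (encode() is a placeholder for the released model's embedding call, not ReasonIR's actual API):

```python
import numpy as np

def encode(texts: list[str]) -> np.ndarray:
    """Placeholder: return one L2-normalized embedding per input text."""
    rng = np.random.default_rng(0)  # stand-in for a real model forward pass
    vecs = rng.normal(size=(len(texts), 8))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def retrieve(query: str, docs: list[str], k: int = 10) -> list[tuple[float, str]]:
    """Score documents by cosine similarity to the query; return the top-k."""
    q = encode([query])[0]
    scores = encode(docs) @ q  # cosine similarity, since vectors are normalized
    top = np.argsort(-scores)[:k]
    return [(float(scores[i]), docs[i]) for i in top]

docs = ["Proof of the triangle inequality.", "A recipe for sourdough bread."]
print(retrieve("Why does |a + b| <= |a| + |b| hold?", docs, k=1))
```

Because document embeddings can be precomputed and indexed, each query costs one encoder pass plus a similarity search, versus one full LLM pass per candidate document for a reranker.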