Recent work on large language models (LLMs) targets long-standing limitations such as computational expense and limited adaptability to new tasks, with configurable foundation models emerging as one response. On long-context tasks, inference techniques such as chunked prefill and margin generation improve performance, yielding an average 7.5% accuracy gain on reasoning skills. Methods that extend context post-training, crucial for models like OpenAI o1, are also under controlled study. Other work makes text embedders few-shot learners by exploiting the in-context learning abilities of decoder-only LLMs. Finally, ARES, a two-stage algorithm alternating reinforcement learning and supervised fine-tuning, reports a 70% win rate in rationale reasoning against baselines and a 2.5% average increase in performance.
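To make the chunked-prefill-with-margins pattern concrete, here is a minimal Python sketch under stated assumptions: the checkpoint name and prompt templates are illustrative placeholders, and where the actual pattern reuses the KV cache across prefilled chunks, this sketch simply re-prompts the model per chunk to produce a "margin" note and then answers from the collected notes.

```python
# Minimal sketch of chunked prefill + margin generation (illustrative only).
# Assumptions: an HF-style causal LM; checkpoint and prompts are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def generate_text(prompt: str, max_new_tokens: int) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0, inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

def answer_with_margins(context: str, question: str, chunk_tokens: int = 4096) -> str:
    ids = tokenizer(context)["input_ids"]
    margins = []
    for i in range(0, len(ids), chunk_tokens):
        chunk = tokenizer.decode(ids[i:i + chunk_tokens])
        # Per-chunk "margin": a short note on anything relevant to the question.
        margins.append(generate_text(
            f"{chunk}\n\nNote any information above relevant to: {question}\n", 64))
    # The final answer conditions on the margins rather than the full context.
    notes = "\n".join(margins)
    return generate_text(f"Notes:\n{notes}\n\nQuestion: {question}\nAnswer:", 128)
```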
Paper - "ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning" A Two-stage algorithm for enhancing multi-modal chain-of-thought reasoning in LMMs 📈 Results: • 70% win rate in rationale reasoning against baselines (GPT-4 judged) • 2.5% average increase in… https://t.co/PMYPAxYjJQ
Long context is central to models like OpenAI o1, but rarely seen in natural data. Extension methods grow the context window by post-training open LLMs. A tutorial and controlled study of long-context extension methods. https://t.co/MWN3aHu1O5 https://t.co/FlvN1kflxn
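One widely used method in this space is linear position interpolation, which rescales positions on a longer sequence back into the model's training range before RoPE is applied. The sketch below is an illustration of that idea, not code from the study; the parameter names are assumptions.

```python
import torch

# Sketch of linear position interpolation (PI) for RoPE-based models:
# positions beyond the training length are compressed back into the
# trained range before the rotary angles are computed.

def rope_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies, one per pair of dimensions.
    return 1.0 / base ** (torch.arange(0, head_dim, 2).float() / head_dim)

def rope_angles(seq_len: int, head_dim: int, train_len: int) -> torch.Tensor:
    positions = torch.arange(seq_len).float()
    if seq_len > train_len:
        positions = positions * (train_len / seq_len)  # interpolate into [0, train_len)
    return torch.outer(positions, rope_frequencies(head_dim))

# Example: a model trained at 4k positions run on a 16k-token sequence.
angles = rope_angles(seq_len=16384, head_dim=128, train_len=4096)
print(angles.shape)  # torch.Size([16384, 64])
```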
Making Text Embedders Few-Shot Learners
discuss: https://t.co/MEkGLDuJD3
Large language models (LLMs) with decoder-only architectures demonstrate remarkable in-context learning (ICL) capabilities. This feature enables them to effectively handle both familiar and novel tasks by… https://t.co/BzL5aeAoD3
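The core idea is to place task demonstrations in front of the query before embedding it, so the encoder conditions on examples at inference time. Below is a hedged sketch of that prompt construction; the checkpoint and template are illustrative stand-ins, not the exact format any specific embedder expects.

```python
# Sketch of few-shot prompting for an embedder: demonstrations are
# concatenated before the query prior to encoding. The checkpoint and
# template are illustrative assumptions, not a specific model's format.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-base-en-v1.5")  # stand-in encoder

def few_shot_query(task: str, examples: list[tuple[str, str]], query: str) -> str:
    demos = "\n".join(f"Query: {q}\nPassage: {p}" for q, p in examples)
    return f"Instruct: {task}\n{demos}\nQuery: {query}"

prompt = few_shot_query(
    "Retrieve passages that answer the question.",
    [("What is RoPE?", "Rotary position embeddings encode token positions...")],
    "How do LLMs extend context length?",
)
embedding = model.encode(prompt)
print(embedding.shape)  # e.g. (768,)
```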