Recent advances in machine learning have highlighted the effectiveness of long-context in-context learning (ICL) for large language models (LLMs): thousands of training examples can now be placed directly in the model's context. This approach shows significant gains over traditional few-shot prompting and is competitive with fine-tuning. Separately, researchers have found that training LLMs on unfamiliar text increases hallucination, prompting solutions like the FLAME methodology, which aims to improve factuality in AI systems.
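To make the many-shot setup concrete, here is a minimal sketch of packing a large example set into a single prompt; `complete(prompt)` stands in for any long-context LLM call and is an assumption, not a specific API:

```python
# Minimal sketch of long-context (many-shot) ICL: pack thousands of labeled
# examples directly into one prompt instead of the handful used in few-shot
# prompting. `complete(prompt)` stands in for any long-context LLM API call.

def build_many_shot_prompt(examples, query, instruction="Classify the sentiment."):
    """Serialize (text, label) pairs into a single long prompt."""
    shots = "\n\n".join(f"Text: {t}\nLabel: {y}" for t, y in examples)
    return f"{instruction}\n\n{shots}\n\nText: {query}\nLabel:"

# With a long context window, `examples` can hold thousands of pairs
# rather than the usual 5-10 shots of traditional few-shot prompting.
examples = [
    ("great movie, loved every minute", "positive"),
    ("a waste of two hours", "negative"),
] * 1000  # ~2,000 in-context examples

prompt = build_many_shot_prompt(examples, "an instant classic")
# answer = complete(prompt)  # hypothetical long-context LLM call
```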
The latest ML/AI research (Part 1):
▪️ The Instruction Hierarchy
▪️ Multi-Head Mixture-of-Experts
▪️ AdvPrompter
▪️ SnapKV
▪️ XC-CACHE
▪️ Make Your LLM Fully Utilize the Context
🧵 https://t.co/1GDSdkvcOY
"Make Your LLM Fully Utilize the Context" 📌 This research from Microsoft partially solves "lost-in-the-middle" problems in LLMs, where LLMs struggle to fully utilize information located within long contexts, particularly in the middle sections. 📌 The paper hypothesizes that… https://t.co/ccto8RtW7m
.@Meta + others find that training LLMs on new knowledge/unfamiliar text increases hallucination.

In building agentic products, it can be 1 step forward and 2 steps back: adding new capabilities can break the entire system.

They propose FLAME - factuality-aware… https://t.co/M3p03h8yxw
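The tweet is truncated, but the finding above suggests a concrete principle: for fact-seeking instructions, keep tuning targets within what the model already knows. A hedged sketch of that idea (not FLAME's actual pipeline; `is_fact_seeking` and `model_generate` are hypothetical helpers):

```python
# Hedged sketch of factuality-aware data selection, illustrating the
# principle only (not FLAME's actual pipeline): for fact-seeking
# instructions, prefer SFT targets generated by the model itself, so
# tuning does not push it onto unfamiliar facts it will later hallucinate.
# `is_fact_seeking` and `model_generate` are hypothetical stand-ins.

def is_fact_seeking(instruction: str) -> bool:
    """Hypothetical classifier; a real pipeline might use an LLM judge."""
    return any(w in instruction.lower() for w in ("who", "when", "where", "which year"))

def model_generate(instruction: str) -> str:
    """Placeholder for sampling from the model being aligned."""
    return f"<model's own answer to: {instruction!r}>"

def build_sft_pair(instruction: str, human_response: str) -> dict:
    if is_fact_seeking(instruction):
        # Stay within knowledge the model already has for factual queries.
        target = model_generate(instruction)
    else:
        # Style/reasoning instructions can keep human-written targets.
        target = human_response
    return {"instruction": instruction, "response": target}

pair = build_sft_pair("Who wrote 'Middlemarch'?", "George Eliot wrote it in 1871-72.")
```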