Researchers from the University of Waterloo, Carnegie Mellon University (CMU), and the Vector Institute have introduced Critique Fine-Tuning (CFT), an approach that strengthens the reasoning of large language models (LLMs) by training them to critique candidate responses rather than to imitate reference answers. The release is accompanied by WebInstruct-CFT, a dataset of instruction-critique pairs detailed below. Two related threads round out the picture: RealCritic, a benchmark that evaluates LLM critiques by their ability to improve solutions rather than by verdict accuracy alone, and a set of lessons on aligning LLM judges with human evaluations through automatic prompt optimization.
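To make the CFT training signal concrete, here is a minimal sketch of how an instruction-critique pair might be turned into a supervised example. The prompt template and field names are illustrative assumptions, not the paper's exact format:

```python
# Minimal sketch of building a CFT training example, assuming a simple
# prompt template. Field names and wording are illustrative assumptions,
# not the paper's exact format.

def build_cft_example(instruction: str, candidate_response: str, critique: str) -> dict:
    """In CFT the model reads a query plus a candidate response and is
    trained to emit a critique, instead of imitating a gold answer (SFT)."""
    prompt = (
        f"Question:\n{instruction}\n\n"
        f"Candidate solution:\n{candidate_response}\n\n"
        "Critique the solution above, pointing out any errors."
    )
    # As in standard instruction tuning, the loss would be applied only
    # to the target tokens, here the critique.
    return {"input": prompt, "target": critique}

example = build_cft_example(
    instruction="Compute 17 * 24.",
    candidate_response="17 * 24 = 398",
    critique="Incorrect: 17 * 4 is 68, not 58, so the total is 340 + 68 = 408.",
)
print(example["input"])
```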
Ensuring that an LLM judge aligns with human judgment is a critical challenge for evaluation. One thread of the discussion explores various automatic prompt optimization techniques for closing this gap and summarizes the insights gained along the way.
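The original post is truncated, but automatic prompt optimization for judge alignment typically amounts to a search loop over candidate judge prompts scored against human labels. The sketch below shows one common variant, hill climbing; `llm_judge` and `propose_revision` are placeholders for model calls, not a specific library API:

```python
# Hedged sketch of an automatic prompt optimization loop for aligning an
# LLM judge with human labels. `llm_judge` and `propose_revision` are
# placeholder callables standing in for model calls.

def agreement(prompt: str, labeled: list[tuple[str, str]], llm_judge) -> float:
    """Fraction of human-labeled examples where the judge's verdict matches."""
    hits = sum(llm_judge(prompt, example) == label for example, label in labeled)
    return hits / len(labeled)

def optimize_judge_prompt(seed: str, labeled: list[tuple[str, str]],
                          llm_judge, propose_revision, steps: int = 20) -> str:
    best, best_score = seed, agreement(seed, labeled, llm_judge)
    for _ in range(steps):
        # Ask an LLM to rewrite the prompt, e.g. conditioned on the
        # examples the current prompt misjudged.
        candidate = propose_revision(best, labeled)
        score = agreement(candidate, labeled, llm_judge)
        if score > best_score:  # hill climb: keep only improvements
            best, best_score = candidate, score
    return best
```

In practice the loop stops when agreement with the human labels plateaus on a held-out set, to avoid overfitting the judge prompt to the labeled examples.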
The RealCritic benchmark measures LLM critique quality by correction success, not just verdict accuracy: in a closed-loop setup, a critique is handed back to a model that revises the solution, and the critique is scored by whether the revision actually improves. The benchmark covers self-critique, cross-critique, and iterative critique scenarios.
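A closed-loop score of this kind can be sketched as follows. This is a simplified illustration of the idea, not the benchmark's own API: `critic`, `solve_with_critique`, and `is_correct` are assumed placeholders for a critic model, a revision step, and a task-specific answer checker:

```python
# Sketch of closed-loop critique scoring in the spirit of RealCritic.
# All three callables are assumptions, not the benchmark's API.

def correction_success_rate(items, critic, solve_with_critique, is_correct) -> float:
    """Score a critic by whether its critiques turn draft answers into correct ones."""
    fixed = 0
    for question, draft_answer in items:
        critique = critic(question, draft_answer)
        revised = solve_with_critique(question, draft_answer, critique)
        fixed += is_correct(question, revised)
    # Correction success: fraction of revised solutions that end up correct,
    # rather than merely checking the critic's right/wrong verdict.
    return fixed / len(items)
```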
WebInstruct-CFT: teaching LLMs to critique
- 600K instruction-critique pairs
- 65% math, with the remainder drawn from business and the sciences
- detailed critiques generated with GPT-4
- released in three sizes: 4K, 50K, and 600K examples
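If the dataset is published in the usual way on the Hugging Face Hub, one of the three subsets could be loaded roughly as follows. The repository id and subset name are assumptions based on the announcement, so check the actual dataset card before use:

```python
# Hedged sketch: loading a WebInstruct-CFT subset with the `datasets`
# library. The repository id and subset name are assumptions; consult
# the dataset card for the real identifiers and field names.
from datasets import load_dataset

ds = load_dataset("TIGER-Lab/WebInstruct-CFT", "WebInstruct-CFT-4K", split="train")
print(ds[0])  # expected: an instruction plus candidate response, and its critique
```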