Sep 20, 06:58 PM

Google DeepMind's SCoRe Boosts AI Self-Correction by 15.6% in MATH, 9.1% in HumanEval

Google DeepMind has introduced SCoRe (Self-Correction via Reinforcement Learning), a method for improving the self-correction abilities of large language models (LLMs). SCoRe is a multi-turn, chain-of-thought, online reinforcement learning approach that trains on entirely self-generated data to improve LLM accuracy on complex mathematical and coding tasks. Reported results include a 15.6% gain in self-correction on reasoning problems from the MATH dataset and a 9.1% gain on HumanEval. Combined with inference-time scaling (majority voting over 32 samples, maj@32), SCoRe achieves a 10.5% improvement. The method addresses the challenge of getting LLMs to correct their own answers without external guidance.
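
For readers unfamiliar with the maj@32 notation, the idea is to sample many answers to the same problem and keep the most frequent one. The snippet below is a minimal sketch of that voting step only, not DeepMind's implementation; the function name `majority_vote` and the sample answers are hypothetical, and 8 samples are used instead of 32 for brevity.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent final answer among the sampled answers."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # most_common(1) returns a one-element list of (answer, count) pairs.
    return Counter(answers).most_common(1)[0][0]


# Hypothetical final answers extracted from 8 sampled responses to one problem.
samples = ["42", "41", "42", "42", "7", "42", "41", "42"]
voted = majority_vote(samples)
print(voted)
```

In a maj@32 setup the list would hold 32 sampled answers per problem, and accuracy would be measured on the voted answer.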

#Google DeepMind #HumanEval

Written with ChatGPT (GPT-4o).
