
Sakana AI has unveiled DiscoPOP, a state-of-the-art (SOTA) preference optimization algorithm discovered and written by a large language model (LLM). The discovery process, dubbed LLM Squared (LLM²), uses an LLM as a code-level mutation operator inside an evolutionary loop: the model proposes new preference optimization loss functions as code, the candidates are trained and evaluated, and the best performers seed the next round of proposals. DiscoPOP, the strongest loss found this way (a log-ratio modulated loss, LRML), outperforms DPO in the team's evaluations, showcasing the potential of AI systems that improve their own training algorithms.
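For context, the standard DPO objective and the general shape of a blended loss in the spirit of DiscoPOP can be sketched in a few lines. The DPO formula below is the published one; the blended variant (a sigmoid gate mixing the logistic and exponential losses on the log-ratio difference) is an illustrative assumption, and the constants `beta` and `tau` are placeholder values, not the paper's:

```python
import math

def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    return math.exp(x) / (1.0 + math.exp(x))

def dpo_loss(logratio_chosen: float, logratio_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss: -log sigmoid(beta * (rho_w - rho_l)), where
    rho = log pi(y) - log pi_ref(y) for the chosen/rejected response."""
    rho = beta * (logratio_chosen - logratio_rejected)
    return -math.log(sigmoid(rho))

def blended_loss(logratio_chosen: float, logratio_rejected: float,
                 beta: float = 0.05, tau: float = 0.05) -> float:
    """Illustrative log-ratio-modulated blend: a sigmoid gate on the
    scaled log-ratio difference mixes the logistic (DPO) loss with an
    exponential loss. The exact form and constants are assumptions,
    not the published DiscoPOP definition."""
    rho = beta * (logratio_chosen - logratio_rejected)
    gate = sigmoid(rho / tau)
    logistic = -math.log(sigmoid(rho))
    exponential = math.exp(-rho)
    return gate * logistic + (1.0 - gate) * exponential
```

Both losses decrease as the policy assigns relatively more probability to the chosen response than the rejected one; the gate lets the blended loss interpolate between the two behaviors depending on the margin.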

Introducing DiscoPOP, the latest release from the team at @SakanaAILabs. This time, it’s a new SOTA preference optimisation algorithm that was discovered and written by an LLM 😮. The LLM-driven discovery process seems generalizable enough, but here it’s been used to create novel… https://t.co/nnCJm06h7A
🎉 Stoked to share our latest work @SakanaAILabs - DiscoPOP 🪩 We leverage LLMs as code-level mutation operators, which improve their own training algorithms. Thereby, we discover various performant preference optimization algorithms using LLM-driven meta-evolution (LLM²) 🔁… https://t.co/wf6cRqucjp
This looks like very exciting work out of Sakana AI (@hardmaru @YesThisIsLion) called LLM Squared, using LLMs to write code and come up with a better way to train LLMs (specifically create SOTA preference optimization algorithms that beat DPO) 👏 Self improving AI anyone? https://t.co/kpEZqw6LqZ
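The discovery loop described in these posts can be sketched as a simple (1+1) evolutionary strategy in which an LLM plays the mutation operator. Everything below is a hypothetical illustration: `llm_mutate` stands in for a call to an LLM that rewrites the loss code (mocked here with random parameter tweaks so the sketch runs), and `evaluate` stands in for a full preference-optimization training run plus held-out evaluation:

```python
import random

def evaluate(candidate: dict) -> float:
    """Stand-in fitness. In the real system this would train a model
    with the candidate loss and score it on held-out preferences;
    here it is a toy objective peaked at beta=0.05, tau=0.05."""
    beta, tau = candidate["beta"], candidate["tau"]
    return -((beta - 0.05) ** 2 + (tau - 0.05) ** 2)

def llm_mutate(candidate: dict, rng: random.Random) -> dict:
    """Mock of the LLM acting as a code-level mutation operator:
    a real run would prompt the model with the current loss code
    and its score, and parse a new loss function from the reply."""
    return {k: max(1e-4, v + rng.gauss(0.0, 0.02))
            for k, v in candidate.items()}

def discover(generations: int = 50, seed: int = 0) -> dict:
    """Greedy (1+1) evolutionary loop: mutate the incumbent, keep the
    child only if it scores better."""
    rng = random.Random(seed)
    best = {"beta": 0.1, "tau": 0.1}
    best_score = evaluate(best)
    for _ in range(generations):
        child = llm_mutate(best, rng)
        score = evaluate(child)
        if score > best_score:
            best, best_score = child, score
    return best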