May 7, 02:12 PM

OpenDevin Launches AI Agents, Achieves 21% Resolve Rate, 17% Improvement on SWE-Bench

Two open-source autonomous coding agents have posted new results on the SWE-bench benchmark. SWE-agent, an open-source LLM-based system, resolves 12.5% of the software engineering issues in SWE-bench and reaches a pass@6 rate of 32.67% on SWE-bench Lite. Meanwhile, OpenDevin, an open-source project inspired by Devin and aimed at executing complex engineering tasks, has released OpenDevin CodeAct 1.0, its new default agent, which reports a 21% unassisted resolve rate on SWE-Bench Lite, a 17% relative improvement over the previous state of the art set by SWE-agent.
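The 21% and 17% figures are easy to conflate, so a short calculation may help. The sketch below assumes the 17% is a relative improvement (not percentage points) and works backward to the baseline that assumption implies. The function name `implied_baseline` is purely illustrative, and the baseline it prints is derived from the two quoted numbers, not a figure reported in the sources.

```python
def implied_baseline(new_rate: float, relative_gain: float) -> float:
    """Return the baseline rate implied by a new rate and a relative gain."""
    return new_rate / (1.0 + relative_gain)


new_rate = 0.21       # resolve rate quoted in the summary
relative_gain = 0.17  # ASSUMPTION: the 17% is relative, not percentage points

baseline = implied_baseline(new_rate, relative_gain)
absolute_gain = new_rate - baseline

print(f"implied baseline: {baseline:.1%}")
print(f"absolute gain: {absolute_gain * 100:.1f} percentage points")
```

Under that assumption the implied baseline is roughly 18%, so the gain is about 3 percentage points in absolute terms.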

#Lite #OpenDevin #Devin #OpenDevin CodeAct

Written with ChatGPT (GPT-4).

Sources

Tianjun Zhang@tianjun_zhang
2 years ago
🤔 How do we build an environment to evaluate AI coding Agents like Devin, SWE-agent, and Open-Devin? Check out 🤖R2E: Converting any Github Repository into a Programming Agent Environment, we can even use LLMs to optimize the speed of the code! 🚀 Want to use R2E to evaluate… https://t.co/JiX1IdbybC https://t.co/G2OMGxCwES
Marco Mascorro@Mascobot
2 years ago
Nice work! 21% on SWE bench from OpenDevin https://t.co/dfsgoZCRXT
Graham Neubig@gneubig
2 years ago
Exciting news! We released a new agent OpenDevin CodeAct 1.0, with state-of-the-art scores on SWE-Bench (Lite). See @xingyaow_'s thread, or our blog for details: https://t.co/UMap9ey0kt This is the new default agent in OpenDevin, happy coding! https://t.co/QvojwdaiPP https://t.co/2Gz6N0UoNi https://t.co/HNTKldTB7S
