Apr 1, 04:47 PM

Open-source SWE-agent, a GPT-4-Based Alternative to Devin, Shows Promising Results on SWE-bench

A new open-source AI agent, SWE-agent, described in the source posts as an "open source Devin" from Princeton, has shown promising results on the SWE-bench software engineering benchmark. It scored 12.29% on 100% of the SWE-bench test set, compared with the 13.84% reported for Devin on 25% of the set. The agent uses GPT-4 and is expected to improve with GPT-5. The project is led by a team including John and has drawn attention for its potential to generalize.

#Devin #SWE Bench #John

Written with ChatGPT (GPT-3).

Sources

Rich Hemming @S_A_R_Lab
2 years ago
Open source with results close to Devin #ai #coding https://t.co/eNxX1dk8FO

Blaze (Balázs Galambosi) @gblazex
2 years ago
Exciting open source Devin from Princeton https://t.co/2eiigNO0Pn

Andrew Curran @AndrewCurran_
2 years ago
Open source Devin, with very impressive numbers. From the thread: 'letting SWE-agent only view 100 lines at a time was better than letting it view 200 or 300 lines and much better than letting it view the entire file'. https://t.co/mQsplpewWw
