The team behind GitHub Copilot has launched XBOW, an AI-powered penetration tester led by Oege de Moor. In a head-to-head comparison with human pentesters, XBOW solved 88 of 104 challenges (about 85%) in 28 minutes, matching the score of the most experienced participant, a 20-year veteran who was given 40 hours. XBOW also scored 75% on well-known web pentesting benchmarks from PentesterLab and PortSwigger. The results suggest that AI can significantly accelerate cybersecurity tasks.
"In 28 minutes, XBOW matched 40 hours of work by the most experienced pentester, who has 20 years of experience, with both solving 85%." Very cool results from the @Xbow team—and another great example of using generative models to accelerate work. How many automated pentesters… https://t.co/AwRXOPvivC
The team @Xbow led by @oegerikus with some 👀 results on their AI pentester. In 28 minutes, Xbow found/exploited vulnerabilities at a comparable level to human experts given 40 hours. https://t.co/tjsEJ9uEaD https://t.co/h4Dly0rPTr
The team that created GitHub Copilot has spun out and created an AI hacker that performs as well as a 20-year veteran penetration tester — and at superhuman speeds. We backed them from day zero. Why did Oege de Moor and his team at XBOW create an AI hacker? Because they knew… https://t.co/YKPw5fulSn