Sources
Clint Gibler: 🤖 LLM Agents can Autonomously Hack Websites. Academic paper testing several LLMs across vulnerability classes like XSS, CSRF, SQL injection, and more. → Uses OpenAI Assistants API, LangChain, and Playwright. GPT-4 wins. https://t.co/z1axioMI3X https://t.co/fgtHvCLJ3w
chrisrohlf: I wrote down some quick thoughts on that "LLM Agents can Autonomously Hack Websites" paper that has been going around. TL;DR: no data, lack of transparency in methodology, no baseline testing against traditional penetration testing tools. https://t.co/is1bGAeuGY
Bart de Witte: Reminds me of that @open_phil sponsored paper that argued that we should ban powerful open source LLMs because they increase the risk of bio-terror, but forgot to include Google search in their paper as a control group. https://t.co/Z0vtTiWYyA



