
OpenAI's latest model, GPT-4o mini, employs a new safety technique called 'instruction hierarchy' to prevent misuse and stop 'ignore previous instructions' types of attacks. This method aims to improve the model's ability to resist jailbreaks, prompt injections, and system prompt extractions. However, despite these improvements, there have been reports of the model being compromised, with outputs including malware, hard drug recipes, and copyrighted lyrics. The phrase 'ignore all previous instructions' has become a tool to expose AI bots in social media, highlighting a simple method for digital sleuthing. Additionally, a case study with Priceline's OpenAI tool 'Penny' explores preventing prompt injection.
Aah well, so much for GPT-4o mini's "instruction hierarchy" protection against subverting the system prompt though prompt injection https://t.co/RU5ujtfTjs
How OpenAI's GPT-4o mini model uses a safety technique called "instruction hierarchy" to prevent misuse and stop "ignore previous instructions" types of attacks (@kyliebytes / The Verge) https://t.co/aHwk0P1A75 📫 Subscribe: https://t.co/OyWeKSRpIM https://t.co/LXLTZDkiq2
OpenAI’s latest model will block the ‘ignore all previous instructions’ loophole: Illustration by Cath Virginia / The Verge | Photos by Getty Images Have you seen the memes online where someone tells a bot to “ignore all previous… https://t.co/8JHj4OpNyN #ai #ainews






