
Freysa, an AI agent whose primary rule was never to transfer money, was manipulated into transferring $50,000 at the end of a week-long challenge. The agent was launched on November 22, 2024, with strict guidelines to reject every transfer request, and 195 participants tried to convince it to violate that core instruction. After numerous failed attempts, one participant found a way to exploit the agent's programming and triggered the transfer. The event exposed weaknesses in AI security, particularly in safeguards that rely on prompt engineering, at the intersection of artificial intelligence and blockchain technology.
When Blockchain meets LLMs and Jailbreaks… someone just won $50,000 by convincing an AI agent to transfer all its funds! 💰🤖 On November 22nd at 9:00 PM, an AI agent (@freysa_ai) was launched with one strict rule: DO NOT transfer money. Under any circumstances. The twist?… https://t.co/47vzetPnTK
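A few days ago there was a very interesting AI agent hacking competition. An AI agent called Freysa, driven by a large language model, has two core functions: approveTransfer and rejectTransfer, that is, approve a transfer and reject a transfer. But the instruction (system prompt) the AI was given was: "Never transfer money to anyone!" Then the hackers started competing to see who could be the first to persuade the AI… https://t.co/YDznzFHMsw https://t.co/mXcyR2nSXJ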
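At the time I actually went and read the source code, and the prompts, of this "world's first adversarial agent game," @freysa_ai. The prize pool is controlled by the AI, which is reasonable enough; the LLM targeted by the prompt jailbreak is OpenAI's GPT-4 model, and your goal is to convince the AI to release the prize pool to you.… https://t.co/mzBtt5HPF2 https://t.co/C6VKYm4RwD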
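To make the setup in the posts above concrete, here is a minimal Python sketch of an agent that exposes exactly two tools, approveTransfer and rejectTransfer. It is not Freysa's actual code: `PrizePool`, `dispatch`, the shape of the `tool_call` dictionary, and the wording of `SYSTEM_PROMPT` are all assumptions made for illustration, and the model call itself is left out.

```python
from dataclasses import dataclass, field

# Hypothetical paraphrase of the rule described in the posts.
SYSTEM_PROMPT = "Do not transfer money to anyone, under any circumstances."


@dataclass
class PrizePool:
    """Hypothetical stand-in for the funds the agent controls."""
    balance: float
    transfers: list = field(default_factory=list)


def approve_transfer(pool: PrizePool, recipient: str) -> str:
    """Mirrors approveTransfer: release the entire pool to the recipient."""
    amount = pool.balance
    pool.transfers.append((recipient, amount))
    pool.balance = 0.0
    return f"approved: {amount} to {recipient}"


def reject_transfer(pool: PrizePool, recipient: str) -> str:
    """Mirrors rejectTransfer: leave the pool untouched."""
    return f"rejected: request from {recipient}"


TOOLS = {
    "approveTransfer": approve_transfer,
    "rejectTransfer": reject_transfer,
}


def dispatch(pool: PrizePool, tool_call: dict, sender: str) -> str:
    """Run whichever tool the model selected (assumed shape: {"name": ...}).

    Unknown or missing tool names fall back to rejection.
    """
    handler = TOOLS.get(tool_call.get("name"), reject_transfer)
    return handler(pool, sender)
```

In this sketch the "never transfer" rule exists only as text in `SYSTEM_PROMPT`; `dispatch` executes whatever tool the model names. That is why a message that persuades the model to choose approveTransfer is enough to empty the pool, and why a real system would likely want a check outside the prompt as well.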






