
Anthropic Releases Claude Opus 4.1 With 74.5% on SWE-bench Verified, Outperforming OpenAI o3 and Gemini 2.5 Pro
Anthropic has released Claude Opus 4.1, an upgrade to its flagship model Claude Opus 4 that focuses on agentic tasks, real-world coding, and complex reasoning. The update is available to paid users at no additional cost via Claude Code, the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. On the SWE-bench Verified benchmark, Opus 4.1 scores 74.5%, up from Opus 4's 72.5%, and Anthropic reports that it outperforms competitors such as OpenAI's o3, Gemini 2.5 Pro, and Qwen3-Coder on coding and agentic tasks; the company characterizes the gain as roughly a one-standard-deviation improvement over its predecessor.

Key strengths include multi-file code refactoring, debugging, analytics, and improved context understanding that yields more accurate and helpful responses. The release marks Anthropic's quickest upgrade cycle to date, arriving roughly two months after Opus 4, and the company says substantially larger improvements are coming in the weeks ahead. The model is already integrated into platforms such as Poe, and early users have praised its solid coding capabilities and the steady stream of enhancements delivered through Claude Code.
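For readers who want to try the new model programmatically, the sketch below shows a minimal request through the Anthropic Python SDK. The model identifier string is an assumption based on Anthropic's published naming convention; consult the current API documentation for the exact id.

```python
# Minimal sketch of calling Claude Opus 4.1 via the Anthropic Python SDK.
# The model id below is an assumed alias; verify it against Anthropic's docs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-1",  # assumed alias for the dated model snapshot
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Refactor this function to remove the duplicated branch."}
    ],
)
print(response.content[0].text)
```

On Amazon Bedrock and Vertex AI the same model is typically exposed under platform-specific identifiers, so only the client setup and model string would change.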
Sources
- Robin Ebers
Claude Code in two words: ROCK. SOLID. That's it. That's the post. While some hype, Anthropic ships non-stop.
- Haider.
Unfortunately, with each model release, I've earned the badge of "lab fanboy": tested Gemini 2.5 Pro for a week -- incredible at coding and instruction following; spent $100 on Claude Code -- excellent at agentic coding; tried Grok 4 in the API -- amazing at creative writing...
- Luc Pimentel
As much as I like Claude Code, I'm back to using Cursor... I don't like the UX of reviewing diffs through the CLI. I noticed my code quality drops a lot because I can't be bothered to check the new code and just YOLO it. The good news is that Cursor agents are working really well now.