OpenAI’s newly released GPT-5 model is drawing mixed reviews from software developers, who say the system excels at technical reasoning and project planning but lags rivals on raw coding accuracy. Early users interviewed by WIRED and discussing their tests online report that Anthropic’s latest Claude Opus 4.1 and Claude Sonnet 4 continue to generate cleaner, more reliable code.

Cost is emerging as GPT-5’s principal advantage. Sayash Kapoor, a Princeton University researcher benchmarking large language models, says a standard SWE-bench test costs about $30 to run with GPT-5 set to medium verbosity, compared with roughly $400 for the same test on Claude Opus 4.1. Yet in Kapoor’s trials, GPT-5 reproduced results from scientific papers only 27 percent of the time, versus 51 percent for Opus.

Some engineers praise GPT-5’s ability to digest complex briefs and return end-to-end solutions in a single pass, but others criticise its tendency to generate redundant code and hallucinate details such as URLs. Anthropic argues that real-world performance depends on outcome-based pricing, noting that highly deliberative models can quickly consume tokens. The early feedback underscores a trade-off developers face: lower operating costs with GPT-5 versus higher accuracy from competing models.
GPT-5 is a really good model for coding and also much cheaper compared to Anthropic Opus 4.1. If Anthropic doesn’t find a way to lower their prices, people will easily switch to GPT-5 without a doubt. https://t.co/RIduH6wyPq https://t.co/qkOMoXaDns
Thoughts on my second test with GPT-5 vs Sonnet-4 🤔 Task was to implement “Gemini multi speaker speech API” Neither model knew how from training data GPT-5 excels at searching for and learning from new information ✅ whereas Sonnet-4 fails at this unfortunately 😔(2 tests) https://t.co/ptzExXVzoA https://t.co/w2goXwt8sc