OpenAI and Anthropic, the two most heavily funded developers of large language models, have published the results of a joint safety study in which each company tested the other's newest systems. The unprecedented cross-lab collaboration, announced 27 August, gave researchers temporary reciprocal API access and is intended to establish a benchmark for third-party scrutiny of frontier AI models.

The findings highlight contrasting risk profiles. Anthropic's Claude Opus 4 and Sonnet 4 refused to answer up to 70 percent of questions when uncertain, whereas OpenAI's o3 and o4-mini attempted to respond far more often but produced hallucinations at a higher rate. Executives from both firms said the optimal approach likely lies between the two: refusing more often when uncertain while fabricating less.

OpenAI co-founder Wojciech Zaremba and Anthropic researcher Nicholas Carlini said they hope to repeat the exercise with future models and encouraged other labs to participate, arguing that shared evaluations can mitigate the commercial pressures that might otherwise lead companies to cut safety corners. The collaboration comes amid intensifying competition for talent, data-center capacity and government contracts.