OpenAI is building an artificial-intelligence system dubbed “Universal Verifier” that automatically assesses the answers generated by its large language models, The Information reported on Monday. The tool is intended to score responses not only on well-defined tasks such as coding and mathematics but also in more subjective areas, including business decision-making and creative writing. By acting as an internal reviewer, the verifier provides reinforcement-learning feedback that could reduce the company’s reliance on human annotators and accelerate model improvements. People familiar with the project said the technology is being integrated into the training pipeline for GPT-5, the planned successor to GPT-4. The work extends research described in the paper “Prover-Verifier Games Improve Legibility of LLM Outputs,” which describes a training scheme in which a dedicated verifier model checks each reasoning chain before an answer is accepted.
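OpenAI has not published details of the Universal Verifier’s internals, but the general pattern the article describes (a verifier model scoring a prover model’s candidate answers, with the score used as a training or selection signal) can be sketched in a few lines. The sketch below is purely illustrative: the function names (`generate_response`, `verifier_score`, `best_of_n`) and the toy scoring rule are assumptions, not OpenAI’s API, and real systems would call actual LLMs where the stubs appear.

```python
import random

def generate_response(prompt: str, seed: int) -> str:
    """Stand-in for the prover LLM: returns one candidate answer.
    (Hypothetical stub; a real system would sample from a model.)"""
    rng = random.Random(seed)
    return f"answer-{rng.randint(0, 9)} to {prompt!r}"

def verifier_score(prompt: str, response: str) -> float:
    """Stand-in for the verifier LLM: returns a score in [0, 1].
    A real verifier would judge the correctness and legibility of
    the reasoning chain; here a toy rule prefers even digits."""
    digit = int(response.split("-")[1][0])
    return 1.0 if digit % 2 == 0 else 0.2

def best_of_n(prompt: str, n: int = 8) -> tuple[str, float]:
    """Sample n candidates and keep the one the verifier rates highest.
    The same (response, score) pairs could instead feed an RL update,
    using the verifier's score as the reward signal."""
    candidates = [generate_response(prompt, seed) for seed in range(n)]
    scored = [(resp, verifier_score(prompt, resp)) for resp in candidates]
    return max(scored, key=lambda pair: pair[1])

response, score = best_of_n("2 + 2 = ?")
print(response, score)
```

Best-of-n selection is the simplest way to use such a score at inference time; using it as a reward during reinforcement learning, as the report suggests GPT-5’s training does, replaces the human annotator who would otherwise grade each response.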
Wow! It took OpenAI five model generations to figure out that one LLM's answers can be "verified" by another LLM. Google figured it out after Gemini's first generation and called it "double check this answer". https://t.co/4a5zadcUYQ
Just wrapped another Cal Newport book and it got me thinking: in the future, AI will grind on rote tasks, so humans can focus on the magic. At @sevensevensix we’re wiring more AI into Cerebro to surface what matters and amplify the work only people can do. https://t.co/GqigDwvlFk
OpenAI's mysterious “Universal Verifier” is supposedly plugged into the GPT-5 training loop. And OpenAI published a paper earlier, "Prover-Verifier Games Improve Legibility of LLM Outputs", showing a pipeline where a verifier model scores each reasoning chain https://t.co/rfcxgc2Iyg https://t.co/R1Lp8S4U1u