Option Probability
At least one former IMO medalist will review the model's answers and claim it did not actually achieve Gold
The model that achieved it was trained with a new reinforcement learning algorithm
The model that achieved it could earn at least bronze using no more than 100,000 reasoning tokens per question
It was achieved with the same model OpenAI used to get second place in AtCoder
The breakthrough is mostly the result of superior test-time scaling methods
I will consider the techniques used to achieve it at least as big of a breakthrough as strawberry
It was achieved by a model that does not use a standard transformer architecture
92
66
50
41
34
24