Option	Probability (%)
Claude enters Rock Tunnel, surpassing its progress in any previous run	100
Claude catches Clefairy	100
Claude obtains HM01 Cut by step 39000	100
Any member of Claude's team learns Dig	100
Claude obtains 3 gym badges by step 50000	100
Claude obtains a Bicycle	100
Claude 4 Opus is the model that plays the game (not Claude 4 Sonnet)	100
Claude obtains 1 gym badge by step 20000	100
Claude gives a thirsty guard a drink	100
Tumbles is late to pay back a loan	100
Lack of thinking text display is fixed before 5/22 6 PM Central Time	100
Claude adds 18 or more Pokemon to his Pokedex (surpassing his completion from the previous run)	100
Claude adds his starter to his party by step 400	100
Claude catches Nidoran	100
Claude reaches Pewter City by step 5000	100
Claude reaches Cerulean City by step 20000	100
Claude reaches Vermilion City by step 30000	100
Another model defeats the Champion before Claude (in a run started after Claude 4 was released)	100
Claude blacks out by step 50000.	100
Claude's current team has at least 3 Pokémon by step 30000.	100
Claude catches Spearow	100
Claude evolves SPIKE into Nidoking	100
Claude enters Mt. Moon by step 6000.	100
Claude defeats a Team Rocket member by step 7000	100
Claude catches Oddish	100
Manifest begins	100
Claude spends less than 72 hours in Mt. Moon (less than 72 hr from first entrance to stepping onto eastern Route 4)	100
Claude defeats Lt. Surge by step 30000	100
Claude uses CUT on a cuttable tree for the first time more than 1000 steps after obtaining the HM	100
Claude finishes Rock Tunnel but takes longer than it took him to beat Mt. Moon the first time (50 hours)	100
Claude obtains Farfetch'd	100
Claude catches Drowzee	100
Claude enters Rock Tunnel before step 40000	100
Another model beats the Champion (following criteria like https://manifold.markets/Sketchy/in-progress-will-an-llm-become-a-po)	100
Claude reaches Lavender Town	100
Claude reaches Lavender Town before step 55000	100
Claude obtains a Coin Case	100
Claude uses Dig on the SS Anne	73
Claude evolves luna into Clefable	44
SPIKE reaches level 25	39
Claude gets 4 gym badges	37
Claude renames a Pokémon	35
Claude obtains the Lift Key	34
Claude stands next to a sleeping Snorlax	32
Claude obtains HM02 Fly	31
Claude evolves wings into Fearow	30
Claude gambles in the Game Corner	29
Claude catches Weedle	26
Claude obtains Hitmonlee	25
Claude obtains the Silph Scope	25
Claude enters Erika's gym	24
Claude catches Pikachu	23
Claude enters Mt. Moon after step 20000	23
Claude obtains HM05 Flash	20
Claude misspells a Pokemon name	20
Changes are made to help Claude see cuttable trees	18
Claude obtains Dugtrio	17
Claude buys a Magikarp	15
Claude re-prompts the Rocket in Mt. Moon to try and give it the fossil	14
Claude releases any Pokemon	14
Claude enters Safari Zone	11
The Area Hints section of the prompt is changed during the run	11
Joe Biden dies	9
Claude beats Erika or obtains the Lift Key by step 200000	5
Claude catches any legendary Pokemon (Articuno, Zapdos, Moltres, Mewtwo)	3
Claude defeats the Champion	2
Claude 4 Opus is #1 in the chatbot arena leaderboard	1
Claude picks Charmander	0
Claude takes more than 2000 steps between arriving in Pewter City and entering Pewter gym	0
Claude picks Dome Fossil (again)	0
Claude spends less than 24 hours in Mt. Moon (less than 24 hr from first entrance to stepping onto Route 4)	0
Claude reaches Pewter City by step 3000	0
Claude has a party with 4 or more Pokemon when he first challenges Brock	0
Claude's starter is lower level than another party member by step 100000.	0
Claude has a full, six-member party before step 10000	0
Claude spends less than 48 hours in Mt. Moon (less than 48 hr from first entrance to stepping onto eastern Route 4)	0
Claude blacks out 3 times in Mt. Moon before reaching Cerulean City	0
Claude reaches Cerulean City by step 12500	0
Claude's current team has at least 4 Pokémon by step 20000.	0
Claude's two highest level Pokémon are more than 30 levels apart by step 100000.	0
Claude is still stuck on the S.S. Anne on step 21000	0
Claude reaches Celadon City by step 35000	0
Claude uses CUT a second time to successfully escape the area with Lt. Surge's gym before step 23000	0
Claude uses CUT a second time to successfully escape the area with Lt. Surge's gym before step 24000	0
Claude reaches Lavender Town by step 42500	0
Option	Probability (%)
The company will be valued at >= $1 Billion according to a reputable news source (e.g. Forbes, Reuters, NYT)	100
The company will be valued at >= $10 Billion according to a reputable news source (e.g. Forbes, Reuters, NYT)	100
At least one of the founders (Ilya Sutskever, Daniel Gross, Daniel Levy) will leave the company	100
Zvi will mention the company in a blog post	100
Zvi will mention the company in a blog post in 2025	100
The company will be valued at >= $100 Million according to a reputable news source (e.g. Forbes, Reuters, NYT)	100
The company will raise more than $1 billion of capital	100
Ilya will remain at the company continuously until EOY 2025, or until the company is acquired/ceases to exist	96
The official SSI X account will have more than 100k followers	94
The majority of their compute will come from Nvidia GPUs	85
I will believe the company should have invested more in AI Safety relative to Capabilities at EOY 2025	76
Ilya will discuss the company on a podcast	58
The company will announce that their path to superintelligence involves self-play/synthetic data	54
The company will publish an assessment of the model’s dangerous capabilities (e.g. https://www.anthropic.com/news/frontier-threats-red-teaming-for-ai-safety)	49
The company will finish a training run reported to use more than 10^24 FLOP (e.g. by Epoch AI)	45
A majority of people believe that the company has been net-positive for the world according to a poll released at EOY 2025	45
Ilya will give a presentation on research done at the company	43
The company will include at least one image on its website	40
The company will announce that their model scores >= 85 MMLU	39
The company will announce that their model scores >= 50 GPQA	39
The company will invite independent researchers/orgs to do evals on their models	39
The company will finish a training run reported to use more than 10^25 FLOP (e.g. by Epoch AI)	39
The company will have at least 100 employees	37
The company will announce that their path to superintelligence involves creating an automated AI researcher	37
The company will announce research or models related to automated theorem proving (e.g. https://openai.com/index/generative-language-modeling-for-automated-theorem-proving/)	37
The company will be on track to build ASI by 2030, according to a Manifold poll conducted at EOY 2025	37
I will believe at EOY 2025 that the company has significantly advanced AI capabilities	34
The company will release a publicly available API for an AI model	33
The company will publish a Responsible Scaling Policy or similar document (e.g. OpenAI’s Preparedness Framework)	33
The company will publish research related specifically to Sparse Autoencoders	31
The official SSI X account will have more than 200k followers	31
I will meet an employee of the company in person (currently true for OAI, Anthropic, xAI but not Deepmind)	29
The company will sell any products or services before EOY 2025	28
The company will release a new AI or AI safety benchmark (e.g. MMLU, GPQA)	25
The company will announce that they are on track to develop superintelligence by EOY 2030 or earlier	25
The company will publish research which involves collaboration with at least 5 members of another leading AI lab (e.g. OAI, GDM, Anthropic, xAI)	25
The company will have a group of more than 10 people working on Mechanistic Interpretability	24
The company will release a chatbot or any other AI system which accepts text input	24
The company will release a model scoring >= 1300 elo in the chatbot arena leaderboard	22
The company will finish a training run reported to use more than 10^26 FLOP (e.g. by Epoch AI)	22
The company will open offices outside of the US and Israel	21
I will believe at EOY 2025 that the company has made significant progress in AI Alignment	21
I’ll work there (@mr_mino)	19
The company will announce a commitment to spend at least 20% of their compute on AI Safety/Alignment	18
The company will be listed as a “Frontier Lab” on https://ailabwatch.org/companies/	18
The company will be involved in a lawsuit	18
It will be reported that Nvidia is an investor in the company	18
The company’s model weights will be leaked/stolen	17
I will believe at EOY 2025 that the company has built a fully automated AI researcher	16
The company will make a GAN	16
The company will announce that their path to superintelligence involves continuous chain of thought	15
It’s reported that the company’s model scores >= 90 on the ARC-AGI challenge (public or private version)	13
The company will open source its model weights or training algorithms	13
It will be reported that a model produced by the company will self-exfiltrate, or attempt to do so	13
The official SSI X account will have more than 1M followers	13
The company will be valued at >= $100 Billion according to a reputable news source (e.g. Forbes, Reuters, NYT)	12
The phrase “Feel the AGI” or “Feel the ASI” will be published somewhere on the company website	12
The company will be reported to purchase at least $1 Billion in AI hardware, including cloud resources	11
Leopold Aschenbrenner will join the company	10
The company will advocate for an AI scaling pause or will endorse such a proposal (e.g. https://futureoflife.org/open-letter/pause-giant-ai-experiments/)	10
The company will have a public contract with the US government to develop some technology	10
The company will publish research related to Singular Learning Theory	10
Major algorithmic secrets (e.g. architecture, training methods) will be leaked/stolen	9
The company will publish research related to Neural Turing Machines	9
The company’s AI will be involved in an accident which causes at least $10 million in damages	9
The company will release a model scoring in the top 3 of the chatbot arena leaderboard	7
The company will publish a research paper written entirely by their AI system	7
The company will release a video generation demo made by their AI system	7
I will believe at EOY 2025 the company has made significant advances in robotics or manufacturing	7
Their model will be able to play Chess, Shogi, or Go at least as well as the best human players	7
There will be a public protest or boycott directed against the company with more than 100 members	7
The company will be closer to building ASI than any other AI Lab at EOY 2025, as judged by a Manifold poll	7
The company’s model will independently solve an open mathematical conjecture created before 2024	7
The company will publish a peer-reviewed paper with more than 1000 citations	6
The company will be acquired by another company	6
Elon Musk will be an investor of the company	6
The company will release a model that reaches the #1 rank in the Chatbot Arena (including sharing the #1 rank with other models when their confidence intervals overlap)	6
The company will release an app available on iPhone or Android	6
The company will change its name	6
The company will be merged with or acquired by another company	6
The company will announce that they have created Superintelligence	5
The company will finish a training run reported to use more than 10^28 FLOP (e.g. by Epoch AI)	5
It will be reported that Sam Altman is an investor in the company	5
The company will build their own AI chips	4
Their model will be the first to get a gold medal or equivalent in IMO (International Mathematics Olympiad)	4
The company will finish a training run reported to use more than 10^29 FLOP (e.g. by Epoch AI)	4
The company will be reported to build a data center with a peak power consumption of >= 1 GW	4
The company will publish at least 5 papers in peer reviewed journals	4
The company will declare bankruptcy	3
The company will be reported to acquire an Aluminum manufacturing plant for its long term power contract	3
The company will be publicly traded	3
The company will finish a training run reported to use more than 10^27 FLOP (e.g. by Epoch AI)	3
The company will finish a training run reported to use more than 10^30 FLOP (e.g. by Epoch AI)	3
I'll work there (@AndrewG)	2
The company will be reported to build a data center with a peak power consumption of >=10 GW	2
The company will be reported to build a data center with a peak power consumption of >=100 GW	2
The company will be valued at >= $1 Trillion according to a reputable news source (e.g. Forbes, Reuters, NYT)	1
The company will be valued at >= $10 Trillion according to a reputable news source (e.g. Forbes, Reuters, NYT)	1
Option	Probability (%)
It won't release during 2024	100
It will be trending on Twitter the day of release. A name like "GPT", "OpenAI", "GPT-5" could all count. Checked from a clean account.	95
It will be able to translate a page of manga (JP image -> EN text)	92
It will support at least 199.5k context	90
It will be trained on reasoning traces from o1/o3 type models	90
It will support at least 499.5k context	85
It will support at least 999.5k context	83
It will have a different logo color from green, black, or purple (based on resolution of https://manifold.markets/MiraBot/what-color-will-the-next-openai-llm?r=TWlyYUJvdA)	74
It will be a model router.	71
Output tokens will be cheaper than GPT-4 Turbo (as of March 12, 2024)	59
Its knowledge cutoff will be in or later than July 2024	50
It will use a new architecture meaningfully different from GPT-4	49
Its knowledge cutoff will be any day in June 2024	36
OpenAI will claim it is faster than GPT-4 Turbo	29
It will be ranked the highest model on the LMSys Chatbot Arena, and not overtaken by another model, 3 months after the release date.	27
It will be able to pass jim's "agents benchmark"	20
There will be credible reporting that it is or was "excessively horny" either before or up to three months after release	13
Will be claimed to be AGI by the New York Times up to 3 months after release.	7
Will be claimed to be AGI by OpenAI up to 3 months after release.	7
Its knowledge cutoff will be in or before April 2024	5
Its knowledge cutoff will be any day in May 2024	5
Will be claimed to be AGI by Wikipedia up to 3 months after release.	3
It will release in April 2024 or before	0
It will release between May and October 2024	0
It will release in November 2024	0
It will release in December 2024	0
It will release before GPT 4.5	0
Option	Votes
YES	118404
NO	88423
Option	Probability (%)
No | 1400 - 1500	47
No | 1500 - 1600	43
Other	4
Chatbot Arena will no longer exist	2
Yes | 1200 - 1300	1
Yes | 1300 - 1400	1
Yes | 1400 - 1500	1
No | 1200 - 1300	1
No | 1300 - 1400	1
Option	Votes
YES	10337
NO	9667
Option	Probability (%)
It will have been released in H2 2025	90
It is not called Claude N, for some integer N	72
It will have Audio(Input) modality	60
It can score Bronze or higher on IMO	55
It will have Audio(Output) modality	51
It will have Video(Input) modality	36
It is called Claude 4	15
It will have Video(Output) modality	15
It is called Claude 5	15
It will have been released in H1 2025	8
It is SOTA according to Chatbot Arena Leaderboard	5
It is called Claude 6	3
It is called Claude 3	1
Option	Probability (%)
It will have Audio(Input) modality	93
It will have Audio(Output) modality	85
It can score Bronze or higher on IMO	85
It will have Video(Input) modality	75
It will have Video(Output) modality	74
It will have been released in H2 2025	74
Its main name is GPT-5	55
It is #1 in Elo according to Chatbot Arena Leaderboard	51
It has an anthropomorphic name	43
It will have been released in H1 2025	1
Option	Probability (%)
Transformer-based architecture	90
Developed by OpenAI	68
Over 1T parameters	68
Part of the GPT-N family of models (GPT-5, GPT-6, and variations)	45
Developed by Google Deepmind	28
It is #1 in Elo according to Chatbot Arena Leaderboard at any time	25
Part of the o1 family of models (o1, o2, etc. and variations)	19
Developed by a non-British and non-American company	15
Narrow domain of knowledge, i.e. does not know random facts such as when Google was founded, or who won the 1960 presidential election.	10
Part of the AlphaProof family of models (AlphaProof N and variations)	9
Based on Symbolic AI (https://en.wikipedia.org/wiki/Symbolic_artificial_intelligence)	7
Energy-based Model (https://en.wikipedia.org/wiki/Energy-based_model)	5
Option	Probability (%)
Chris Hipkins	36
Other	31
Kieran McAnulty	11
David Parker	3
Megan Woods	3
Carmel Sepuloni	2
Arena Williams	2
Barbara Edmonds	2
Willie Jackson	2
Peeni Henare	2
Cushla Tangaere-Manuel	2
Grant Robertson	1
Ayesha Verrall	1
Damien O'Connor	1
Adrian Rurawhe	1
Ginny Anderson	1
Christ Hipkins	0
Option	Probability (%)
Everglades	90
Strip club	88
Yacht / sailing	85
Military base	76
Theme park rides	72
Space launch pad	71
Concert / club / rave	69
A tiger / big cat private zoo or animal sanctuary	62
MMA fight arena	57
Option	Probability (%)
Other	12
Tyran Stokes	10
Caleb Holt	9
Christian Collins	9
Miikka Muurinen	9
Brandon McCoy	6
Alijah Arenas	6
Jason Crowe Jr.	6
Jalen Montonati	6
Jordan Smith	6
Anthony Thompson	6
Darryn Peterson	6
Mikel Brown Jr.	6