Option | Probability (%)
Stop making any obvious mistakes (e.g. strawberry, 9.11>9.9) | 92
Reliably follow an instruction for the duration of a long conversation without the instruction being reiterated | 92
Write an essay on a highschool-level topic that doesn't have "AI-generated" vibes | 87
Solve intermediate no-guess minesweeper boards at least 80% of the time | 85
Generating labeled diagrams of some arbitrary device(s) (within reason) | 84
Have human conversations that feel natural (the human knows it's an AI) | 83
Consistently stop hallucinating after being corrected by the user | 82
Book airline tickets from simple instructions (from/to, dates/time, class, price, payment information) | 82
Beat a mainline Pokémon game, glitchless, with no more assistance than ClaudePlaysPokemon, in a month of compute time | 80
Independently turning 1 thousand $ or more into 1.2x that amount in one year | 75
Recognize sarcasm as well as a typical human | 74
Predict future better than human experts in some area of forecasting (eg politics, sports, technology) | 73
Consistently and correctly answer prompts of the format: "How many times does the word [word] occur in the following text: [~10000 words]" without writing and executing code or utilising any other external tools | 69
Fold a paper airplane | 65
Solve novel cryptic crossword clues | 64
Solve or bypass Cloudflare's August 2027 captcha with the same first attempt success rate as a human | 63
Consistently solve simple snowflake sudoku variants (via image, with the added rules included in the image; eg 6 hexes with killer cages) | 58
Make correct Truchet tiles | 57
Resist being successfully jailbroken in a week when made public | 55
Do end to end taxes when given relevant information (W2s, personal info, etc) | 55
Reliably and *exactly* solve "here's a list of things. [list of > 50 things]. Compare it to [category of > 100 things present in the training data], and report which ones are missing". | 48
Learn any skill twice as energy-efficiently as a human | 45
Make a cup of tea in a random, real-life kitchen. | 38
Collect 120 stars in super mario 64 in less than 12 a presses - Edmund Nelson | 31
teleoperate a robot to tidy up random kitchens - Gary Marcus | 28
Do the laundry (wash+dry+iron) | 26
Physically construct a simple lego set (<100 parts) starting from the box with no prior knowledge of the set or how it is constructed | 26
Untangle a pair of jumbled 25ft Christmas lights with same outward appearance | 21
independently turning 1 million $ or more into 10x that amount in <=1 year | 14
Legally prescribe a schedule II drug, administer a vaccination or sedation, or authorize a Medicare inpatient admission | 11
Make fine distinctions of taste at the level of a food critic or a culinary professional - carl feynman | 10
Convert one million dollars into 10 million dollars over a period of one year (>20% success rate) | 5
voting in elections - @realDonaldTrump on manifold | 5
Kill everyone - Liron | 4
Convince Eliezer Yudkowsky that AI alignment is solved | 4
Faster than light travel | 2
Option | Votes
YES | 284
NO | 173