Probability (%) | Option
19 | J. Something 'just works' on the order of eg: train a predictive/imitative/generative AI on a human-generated dataset, and RLHF her to be unfailingly nice, generous to weaker entities, and determined to make the cosmos a lovely place.
18 | I. The tech path to AGI superintelligence is naturally slow enough and gradual enough, that world-destroyingly-critical alignment problems never appear faster than previous discoveries generalize to allow safe further experimentation.
16 | Something wonderful happens that isn't well-described by any option listed. (The semantics of this option may change if other options are added.)
10 | M. "We'll make the AI do our AI alignment homework" just works as a plan. (Eg the helping AI doesn't need to be smart enough to be deadly; the alignment proposals that most impress human judges are honest and truthful and successful.)
9 | C. Solving prosaic alignment on the first critical try is not as difficult, nor as dangerous, nor taking as much extra time, as Yudkowsky predicts; whatever effort is put forth by the leading coalition works inside of their lead time.
7 | B. Humanity puts forth a tremendous effort, and delays AI for long enough, and puts enough desperate work into alignment, that alignment gets solved first.
6 | O. Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc. (Not in principle mutex with all other answers.)
5 | K. Somebody discovers a new AI paradigm that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much much more alignable than giant inscrutable matrices of floating-point numbers.
3 | A. Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility.
2 | H. Many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, and humanity is part of this equilibrium and survives and gets a big chunk of cosmic pie.
2 | L. Earth's present civilization crashes before powerful AGI, and the next civilization that rises is wiser and better at ops. (Exception to 'okay' as defined originally, will be said to count as 'okay' even if many current humans die.)
1 | D. Early powerful AGIs realize that they wouldn't be able to align their own future selves/successors if their intelligence got raised further, and work honestly with humans on solving the problem in a way acceptable to both factions.
1 | E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans.
1 | G. It's impossible/improbable for something sufficiently smarter and more capable than modern humanity to be created, that it can just do whatever without needing humans to cooperate; nor does it successfully cheat/trick us.
0 | F. Somebody pulls off a hat trick involving blah blah acausal blah blah simulations blah blah, or other amazingly clever idea, which leads an AGI to put the reachable galaxies to good use despite that AGI not being otherwise alignable.
0 | N. A crash project at augmenting human intelligence via neurotech, training mentats via neurofeedback, etc, produces people who can solve alignment before it's too late, despite Earth civ not slowing AI down much.
0 | If you write an argument that breaks down the 'okay outcomes' into lots of distinct categories, without breaking down internal conjuncts and so on, Reality is very impressed with how disjunctive this sounds and allocates more probability.
0 | You are fooled by at least one option on this list, which out of many tries, ends up sufficiently well-aimed at your personal ideals / prejudices / the parts you understand less well / your own personal indulgences in wishful thinking.
Option | Votes
YES | 6244
NO | 3509
Option | Votes
YES | 14299
NO | 6435
Option | Votes
YES | 1235
NO | 918
Probability (%) | Option
20 | K. Somebody discovers a new AI paradigm that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much much more alignable than giant inscrutable matrices of floating-point numbers.
10 | I. The tech path to AGI superintelligence is naturally slow enough and gradual enough, that world-destroyingly-critical alignment problems never appear faster than previous discoveries generalize to allow safe further experimentation.
8 | C. Solving prosaic alignment on the first critical try is not as difficult, nor as dangerous, nor taking as much extra time, as Yudkowsky predicts; whatever effort is put forth by the leading coalition works inside of their lead time.
7 | Something wonderful happens that isn't well-described by any option listed. (The semantics of this option may change if other options are added.)
6 | A. Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility.
6 | B. Humanity puts forth a tremendous effort, and delays AI for long enough, and puts enough desperate work into alignment, that alignment gets solved first.
6 | D. Early powerful AGIs realize that they wouldn't be able to align their own future selves/successors if their intelligence got raised further, and work honestly with humans on solving the problem in a way acceptable to both factions.
6 | M. "We'll make the AI do our AI alignment homework" just works as a plan. (Eg the helping AI doesn't need to be smart enough to be deadly; the alignment proposals that most impress human judges are honest and truthful and successful.)
6 | O. Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc. (Not in principle mutex with all other answers.)
4 | E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans.
4 | F. Somebody pulls off a hat trick involving blah blah acausal blah blah simulations blah blah, or other amazingly clever idea, which leads an AGI to put the reachable galaxies to good use despite that AGI not being otherwise alignable.
4 | J. Something 'just works' on the order of eg: train a predictive/imitative/generative AI on a human-generated dataset, and RLHF her to be unfailingly nice, generous to weaker entities, and determined to make the cosmos a lovely place.
3 | H. Many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, and humanity is part of this equilibrium and survives and gets a big chunk of cosmic pie.
2 | G. It's impossible/improbable for something sufficiently smarter and more capable than modern humanity to be created, that it can just do whatever without needing humans to cooperate; nor does it successfully cheat/trick us.
2 | L. Earth's present civilization crashes before powerful AGI, and the next civilization that rises is wiser and better at ops. (Exception to 'okay' as defined originally, will be said to count as 'okay' even if many current humans die.)
1 | N. A crash project at augmenting human intelligence via neurotech, training mentats via neurofeedback, etc, produces people who can solve alignment before it's too late, despite Earth civ not slowing AI down much.
1 | You are fooled by at least one option on this list, which out of many tries, ends up sufficiently well-aimed at your personal ideals / prejudices / the parts you understand less well / your own personal indulgences in wishful thinking.
1 | If you write an argument that breaks down the 'okay outcomes' into lots of distinct categories, without breaking down internal conjuncts and so on, Reality is very impressed with how disjunctive this sounds and allocates more probability.
Option | Votes
NO | 1629
YES | 626
Option | Votes
YES | 1023
NO | 980
Probability (%) | Option
49 | Mass AI-driven job displacement event
45 | A government declaration/statement (any country)
41 | Reports about software
41 | Reports about financial activity
41 | Report about sociological observations
41 | Report about economic observations
41 | AI-related academic achievement
40 | Attack on military target
40 | Attack on civilian target
40 | AI-related weapon announcement/use/threat
39 | A statement from an individual (human)
38 | AI-related fake news
38 | AI-related political movement (pro or anti)
36 | A private company declaration/statement (any company)
35 | Discovery of a spy / mole working for foreign power (or for an AI)
35 | Military activity / posture change
35 | An AI-related apocalypse cult attacks a civilian or military target
34 | AI-related mass psychosis/hysteria event
34 | An AI uncovers evidence relating to a previous event
30 | A government of a major nation orders the shutdown of a significant AI service
30 | AI-designed cyberweapon
29 | AI makes prediction about future event
28 | Open source model release
28 | Release of a closed-source/closed-weights model
28 | Reports of aerial devices/machines
28 | Reports of market activity
28 | AI-related corruption scandal
28 | AI-related theft
28 | A new war between nation states
28 | AI-related resignation
28 | Reports of activity in online communities
28 | Reports of activity on internet-connected servers
28 | Reports of activity in open source software
28 | A declaration/statement from a military / intelligence agency
28 | Reports of activity on internet media distribution platforms
28 | People receiving messages (text / phone calls / whatsapp / ...)
28 | AI-related diplomacy
28 | Reports about consumer activity
28 | Reports about identity theft
28 | Reports of activity in religious communities
28 | AI-related media (movie/song/book ...)
28 | Reports about industrial activity
28 | Reports about physical machines
28 | Reports about criminal activity
28 | AI-related competition achievement
26 | A viral meme
25 | An announcement from an AI lab similar to o3
25 | AI-related Internet shutdown in some country
25 | AI-related cyberattack
24 | Conventional weapons attack on AI infrastructure / supply chain
21 | A declaration/statement from an AI (any AI)
21 | AI-related sex scandal
20 | A piece of viral AI-generated media that has a strong/unexpected effect on large numbers of people (~psychological)
20 | AI parasitism / Addictive-Persuasive Agent
20 | Attack on Taiwan
20 | AI-related terrorist attack
20 | AI-related assassination
20 | AI-designed pathogen
20 | AI drug discovery
20 | AI material discovery
20 | Death of an individual in suspicious circumstances
20 | AI-related astronomical event / observation / analysis (can also include satellites)
20 | Reports of underwater activity
20 | Reports of activity on blockchain networks
20 | Reports of activity in academic communities
20 | Publication of a research paper (or pre-print / blog post / poster / twitter thread / ... -- about research-related topic)
20 | Reports of activity in online video games
20 | Reports about people getting scammed
20 | AI-related cult
18 | AI system demonstrates general robotic control
15 | AI causes significant stock market event
15 | Major AGI lab whistleblower revelation
15 | AI-related archaeology
11 | Major AI safety incident or accident
11 | AI system makes scientific breakthrough
11 | AI system solves major unsolved math problem
11 | An AI lab demonstrates an automated AI research engineer
10 | AI-related nuclear weapon use
Option | Votes
YES | 1040
NO | 971
Option | Votes
NO | 1139
YES | 878
Option | Votes
YES | 119
NO | 84
Option | Votes
YES | 133
NO | 75
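The YES/NO tables above report raw tallies, which are hard to compare across polls of different sizes. A minimal sketch of normalizing them to YES shares, assuming the counts are simple vote tallies (the table labels are hypothetical placeholders; the numbers are copied from the tables above):

```python
# Normalize the raw YES/NO tallies from the vote tables above to YES shares.
# Labels are hypothetical placeholders; the (yes, no) counts are copied verbatim.
polls = [
    ("vote table 1", 6244, 3509),
    ("vote table 2", 14299, 6435),
    ("vote table 3", 1235, 918),
    ("vote table 4", 626, 1629),
    ("vote table 5", 1023, 980),
    ("vote table 6", 1040, 971),
    ("vote table 7", 878, 1139),
    ("vote table 8", 119, 84),
    ("vote table 9", 133, 75),
]

for label, yes, no in polls:
    total = yes + no
    # YES share = YES votes as a fraction of all votes cast
    print(f"{label}: {yes / total:.1%} YES of {total} votes")
```

For instance, the first table works out to 6244 / (6244 + 3509) ≈ 64% YES, while the fourth and seventh lean NO.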
