Option | Probability (%)
J. Something 'just works' on the order of eg: train a predictive/imitative/generative AI on a human-generated dataset, and RLHF her to be unfailingly nice, generous to weaker entities, and determined to make the cosmos a lovely place. | 17
I. The tech path to AGI superintelligence is naturally slow enough and gradual enough, that world-destroyingly-critical alignment problems never appear faster than previous discoveries generalize to allow safe further experimentation. | 13
Something wonderful happens that isn't well-described by any option listed. (The semantics of this option may change if other options are added.) | 13
M. "We'll make the AI do our AI alignment homework" just works as a plan. (Eg the helping AI doesn't need to be smart enough to be deadly; the alignment proposals that most impress human judges are honest and truthful and successful.) | 12
O. Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc. (Not in principle mutex with all other answers.) | 11
C. Solving prosaic alignment on the first critical try is not as difficult, nor as dangerous, nor taking as much extra time, as Yudkowsky predicts; whatever effort is put forth by the leading coalition works inside of their lead time. | 10
B. Humanity puts forth a tremendous effort, and delays AI for long enough, and puts enough desperate work into alignment, that alignment gets solved first. | 8
K. Somebody discovers a new AI paradigm that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much much more alignable than giant inscrutable matrices of floating-point numbers. | 8
A. Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility. | 3
D. Early powerful AGIs realize that they wouldn't be able to align their own future selves/successors if their intelligence got raised further, and work honestly with humans on solving the problem in a way acceptable to both factions. | 1
E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans. | 1
H. Many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, and humanity is part of this equilibrium and survives and gets a big chunk of cosmic pie. | 1
L. Earth's present civilization crashes before powerful AGI, and the next civilization that rises is wiser and better at ops. (Exception to 'okay' as defined originally, will be said to count as 'okay' even if many current humans die.) | 1
N. A crash project at augmenting human intelligence via neurotech, training mentats via neurofeedback, etc, produces people who can solve alignment before it's too late, despite Earth civ not slowing AI down much. | 1
F. Somebody pulls off a hat trick involving blah blah acausal blah blah simulations blah blah, or other amazingly clever idea, which leads an AGI to put the reachable galaxies to good use despite that AGI not being otherwise alignable. | 0
G. It's impossible/improbable for something sufficiently smarter and more capable than modern humanity to be created, that it can just do whatever without needing humans to cooperate; nor does it successfully cheat/trick us. | 0
If you write an argument that breaks down the 'okay outcomes' into lots of distinct categories, without breaking down internal conjuncts and so on, Reality is very impressed with how disjunctive this sounds and allocates more probability. | 0
You are fooled by at least one option on this list, which out of many tries, ends up sufficiently well-aimed at your personal ideals / prejudices / the parts you understand less well / your own personal indulgences in wishful thinking. | 0
Option | Probability (%)
The Pentagon cuts ties with Anthropic | 100
The Pentagon declares Anthropic a "supply chain risk" | 100
Anthropic files a lawsuit against the federal government | 100
The Department of War signs a deal to use OpenAI models instead | 94
The supply chain risk designation is officially issued | 89
OpenAI signs a contract substantially weaker than Anthropic's requirements | 85
An injunction is granted against the supply chain risk designation | 80
A judge grants an injunction against the Department of War | 77
Anthropic's "supply chain risk" designation removed before 2027 | 69
Anthropic's "supply chain risk" designation removed before July | 58
An injunction is granted against the supply chain risk designation, and survives unblocked by other courts for 6 months | 57
Resignation letter signed by at least 5 OpenAI employees | 50
The government offers the same terms to Anthropic as to OpenAI | 26
A US company with >$100B total valuation cancels some Pentagon contract and cites Anthropic's supply chain risk designation as its stated justification | 19
Anthropic's contract is publicly available | 12
OpenAI's new contract is publicly available | 11
The Pentagon invokes the Defense Production Act | 6
Autonomous-weapon and surveillance safeguards on Claude are removed for the Pentagon | 6
Anthropic gives the government unfettered access of its own accord (it "caves in") | 6
The Pentagon designates Anthropic a supply chain risk AND invokes the Defense Production Act | 6
Dario Amodei leaves or is removed from Anthropic | 4
Resignation letter signed by at least 5 Anthropic researchers | 3
Will Amazon have to divest from / break with Anthropic in 2026? | 3
Anthropic stops advancing AI capabilities | 2
The Pentagon continues to use Anthropic services without the requested changes (autonomous weapons + mass surveillance of Americans) | 0
The Pentagon and Anthropic come to some form of mutual settlement by the Friday deadline | 0
Option | Probability (%)
Sexism and racism, among other forms of prejudice, are responsible for worse health outcomes, and it’s not overly dramatic for people to treat those issues as public health/safety concerns. | 91
Prediction markets are good | 91
Tenet (Christopher Nolan film) is underrated | 86
[*] ...and things will improve in the future | 84
Scientific racism is bad, actually. (Also it's not scientific.) | 81
We should be doing much more to pursue human genetic engineering to prevent diseases and aging. | 78
The Fermi paradox isn't a paradox, and the solution is obviously just that intelligent life is rare. | 78
The way quantum mechanics is explained to the lay public is very misleading. | 77
Authoritarian populism is bad actually | 77
Prolonged school closures because of COVID were socially devastating. | 76
Nuclear power is by far the best solution to climate change. [N] | 74
Most organized religions are false | 74
The Many Worlds Interpretation of quantum mechanics | 72
Humans have a responsibility to figure out what, if anything, we can do about wildlife suffering. | 72
Pineapple pizza tastes good | 72
First-past-the-post electoral systems are not merely flawed but outright less democratic than proportional or preferential alternatives | 71
Liberal democracy is good actually | 71
Physician-assisted suicide should be legal in most countries | 70
Peeing in the shower is good and everyone should do it | 70
It would actually be a good thing if automation eliminated all jobs. | 67
We need a bigger welfare state than we have now. | 67
Many amphetamines and psychedelics have tremendous therapeutic value when guided by an established practitioner. | 65
The proliferation of microplastics will be viewed as more harmful to the environment than burning fossil fuels, in the long term | 65
Free will doesn't require the ability to do otherwise. | 60
Metaculus will take over Manifold in more serious topics, and Manifold will be known as the "unserious" prediction market site | 60
American agents are in the highest positions in government for more than half the world. | 60
We should give every American food stamps, in a fixed dollar amount, with no means testing or work requirements or disqualification for criminal convictions. | 59
Dialetheism (the claim that some propositions are both true and false) is itself both true and false. | 58
Dream analysis is a legitimate means of gaining personal insight. | 54
Given what we know about the social and health effects of being fired, even if abolishing at-will employment has efficiency costs, it is likely worth it. | 54
Mobile UX will be a key factor in explaining the stories of Manifold and Metaculus. | 50
The overall state of the world is pretty good... [*] | 50
If a developed nation moves from democratic to authoritarian government today, it should be expected to end up poorer, weaker, sicker, and stupider. | 50
California is wildly overrated. | 49
Factory farming is horrific, but it is not wrong to eat meat. | 46
The United States doesn't need a strong third party. | 46
Political libertarianism | 46
Racial colorblindness is the only way to defeat racism | 45
People will look back on using animal products as a moral disgrace on the level of chattel slavery. | 44
There's a reasonable chance of a militant green/communist movement that gains popular support in the coming decade. | 44
[N], and to the extent climate activists are promoting other kinds of solutions, they are actively making the situation worse by diverting attention and resources from nuclear power. | 44
Being a billionaire is morally wrong. | 44
Eating meat is morally wrong in most cases. | 44
You should bet NO on this option | 42
The Windows kernel is better than Linux; it’s just all the bloat piled on top that makes it worse | 41
White people are the least racist of any racial group | 38
Technology is not making our lives easier or more fulfilling. | 36
COVID lockdowns didn’t save many lives; in fact they may have caused net increases in global deaths and life years lost. | 35
Light mode is unironically better than dark mode for most websites | 33
Some people have genuine psychic capabilities | 33
God is evil | 33
A sandwich is a type of hot dog | 32
Climate change is significantly more concerning than AI development. | 31
Astrology is a legitimate means of gaining personal insight. | 30
It's acceptable for our systems of punishment to be retributive in part | 27
Mereological nihilism (composite objects don't exist) | 26
China not having real democracy does more good than harm | 26
AI will not be as capable as humans this century, and will certainly not give us genuine existential concerns | 23
Governments should not support parents for having children that they cannot take care of | 23
Reincarnation is a real phenomenon | 22
Dentistry is mostly wasted effort. | 22
Moral hazard isn’t real, and all the purported instances of it can be chalked up to coincidence or confounding variables | 22
Donald Trump would have been a better president than Joe Biden | 21
Mass surveillance (security cameras everywhere) has more positives than negatives | 19
Future generations will say that on balance the world reacted appropriately after learning that fossil fuels cause climate change, i.e. that the balance between addressing the problem and slowing economies was just about right. | 14
The next American moon landing will be faked | 13
SBF didn't intentionally commit fraud | 13
Humans don't have free will. | 11
AI art is better than human art | 9
Souls/spirits are real and can appear to the living sometimes | 8
Communism just wasn't implemented well; next time it will work | 8
The first American moon landing was faked | 7
The human race should voluntarily choose to go extinct via nonviolent means (antinatalism). | 7
LK-99 room-temperature, ambient-pressure superconductivity pre-print will replicate before 2025 | 5
Astrology is actually true. | 5
Option | Votes
YES | 3594
NO | 1876
Option | Probability (%)
K. Somebody discovers a new AI paradigm that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much much more alignable than giant inscrutable matrices of floating-point numbers. | 19
I. The tech path to AGI superintelligence is naturally slow enough and gradual enough, that world-destroyingly-critical alignment problems never appear faster than previous discoveries generalize to allow safe further experimentation. | 10
C. Solving prosaic alignment on the first critical try is not as difficult, nor as dangerous, nor taking as much extra time, as Yudkowsky predicts; whatever effort is put forth by the leading coalition works inside of their lead time. | 8
D. Early powerful AGIs realize that they wouldn't be able to align their own future selves/successors if their intelligence got raised further, and work honestly with humans on solving the problem in a way acceptable to both factions. | 7
Something wonderful happens that isn't well-described by any option listed. (The semantics of this option may change if other options are added.) | 7
A. Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility. | 6
B. Humanity puts forth a tremendous effort, and delays AI for long enough, and puts enough desperate work into alignment, that alignment gets solved first. | 6
M. "We'll make the AI do our AI alignment homework" just works as a plan. (Eg the helping AI doesn't need to be smart enough to be deadly; the alignment proposals that most impress human judges are honest and truthful and successful.) | 6
O. Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc. (Not in principle mutex with all other answers.) | 6
E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans. | 5
F. Somebody pulls off a hat trick involving blah blah acausal blah blah simulations blah blah, or other amazingly clever idea, which leads an AGI to put the reachable galaxies to good use despite that AGI not being otherwise alignable. | 4
J. Something 'just works' on the order of eg: train a predictive/imitative/generative AI on a human-generated dataset, and RLHF her to be unfailingly nice, generous to weaker entities, and determined to make the cosmos a lovely place. | 4
H. Many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, and humanity is part of this equilibrium and survives and gets a big chunk of cosmic pie. | 3
L. Earth's present civilization crashes before powerful AGI, and the next civilization that rises is wiser and better at ops. (Exception to 'okay' as defined originally, will be said to count as 'okay' even if many current humans die.) | 3
G. It's impossible/improbable for something sufficiently smarter and more capable than modern humanity to be created, that it can just do whatever without needing humans to cooperate; nor does it successfully cheat/trick us. | 2
N. A crash project at augmenting human intelligence via neurotech, training mentats via neurofeedback, etc, produces people who can solve alignment before it's too late, despite Earth civ not slowing AI down much. | 1
You are fooled by at least one option on this list, which out of many tries, ends up sufficiently well-aimed at your personal ideals / prejudices / the parts you understand less well / your own personal indulgences in wishful thinking. | 1
If you write an argument that breaks down the 'okay outcomes' into lots of distinct categories, without breaking down internal conjuncts and so on, Reality is very impressed with how disjunctive this sounds and allocates more probability. | 1
Option | Votes
YES | 220
NO | 28
Option | Votes
YES | 1259
NO | 922
Option | Probability (%)
Other | 28
Democratic | 20
Republican | 19
No election in 2100 (e.g. because the US ceases to exist) | 19
Independent / no party preference | 13
Option | Votes
YES | 169
NO | 59
Option | Votes
YES | 155
NO | 86
Option | Votes
YES | 200
NO | 76
Option | Votes
YES | 105
NO | 95