World Model News

If Artificial General Intelligence has an okay outcome, what will be the reason?

Mar 24, 2:46 PM | Jan 2, 7:59 AM

532476860

Option | Probability

Humanity coordinates to prevent the creation of potentially-unsafe AIs.

Alignment is not properly solved, but core human values are simple enough that partial alignment techniques can impart these robustly. Despite caring about other things, it is relatively cheap for AGI to satisfy human values.

AIs will not have utility functions (in the same sense that humans do not), their goals such as they are will be relatively humanlike, and they will be "computerish" and generally weakly motivated compared to humans.

Yudkowsky is trying to solve the wrong problem using the wrong methods based on a wrong model of the world derived from poor thinking and fortunately all of his mistakes have failed to cancel out

We create a truth economy. https://manifold.markets/Krantz/is-establishing-a-truth-economy-tha?r=S3JhbnR6

Eliezer finally listens to Krantz.

Ethics turns out to be a precondition of superintelligence

Other

Someone solves agent foundations

Techniques along the lines outlined by Collin Burns turn out to be sufficient for alignment (AIs/AGIs are made truthful enough that they can be used to get us towards full alignment)

Orthogonality Thesis is false.

an aligned AGI is built and the aligned AGI prevents the creation of any unaligned AGI.

We make risk-conservative requests to extract alignment-related work out of AI-systems that were boxed prior to becoming superhuman. We somehow manage to achieve a positive feedback-loop in alignment/verification-abilities.

The response to AI advancements or failures makes some governments delay the timelines

Far more interesting problems to solve than take over the world and THEN solve them. The additional kill all humans step is either not a low-energy one or just by chance doesn't get converged upon.

AIs make "proof-like" argumentation for why output does/is what we want. We manage to obtain systems that *predict* human evaluations of proof-steps, and we manage to find/test/leverage regularities for when humans *aren't* fooled.

A lot of humans participate in a slow scalable oversight-style system, which is pivotally used/solves alignment enough

Something less inscrutable than matrices works fast enough

Humans become transhuman through other means before AGI happens

The human brain is the perfect arrangement of atoms for a "take over the world" agent, so AGI has no advantage over us in that task.

Aligned AI is more economically valuable than unaligned AI. The size of this gap and the robustness of alignment techniques required to achieve it scale up with intelligence, so economics naturally encourages solving alignment.

Humans and human tech (like AI) never reach singularity, and whatever eats our lightcone instead (like aliens) happens to create an "okay" outcome

AIs never develop coherent goals

Alignment is unsolvable. AI that cares enough about its goal to destroy humanity is also forced to take it slow trying to align its future self, preventing run-away.

Nick Bostrom's idea (Hail Mary) that AI will preserve humans to trade with possible aliens works

For some reason, the optimal strategy for AGIs is just to head somewhere with far more resources than Earth, as fast as possible. All unaligned AGIs immediately leave, and, for some reason, do not leave anything behind that kills us.

An AI that is not fully superior to humans launches a failed takeover, and the resulting panic convinces the people of the world to unite to stop any future AI development.

Someone at least moderately sane leads a campaign, becomes in charge of a major nation, and starts a secret project with enough resources to solve alignment, because it turns out there's a way to convert resources into alignment progress.

Someone creates AGI(s) in a box, and offers to split the universe. They somehow find a way to arrange this so that the AGI(s) cannot manipulate them or pull any tricks, and the AGI(s) give them instructions for safe pivotal acts.

Someone understands how minds work enough to successfully build and use one directed at something world-savingly enough

Social contagion causes widespread public panic about AI, making it a bad legal or PR move to invest in powerful AIs without also making nearly-crippling safety guarantees

A smaller AI disaster causes widespread public panic about AI, making it a bad legal or PR move to invest in powerful AIs without also making nearly-crippling safety guarantees

Getting things done in Real World is as hard for AGI as it is for humans. AGI needs human help, but aligning humans is as impossible as aligning AIs. Humans and AIs create billions of competing AGIs with just as many goals.

Development and deployment of advanced AI occurs within a secure enclave which can only be interfaced with via a decentralized governance protocol

High-level self-improvement (rewriting code) is an intrinsically risky process, so AIs will prefer low-level and slow self-improvement (learning), thus AIs collaborating with humans will have an advantage. Ends with a posthuman ecosystem.

Human consciousness is needed to collapse wave function, and AI can't do it. Thus humans should be preserved and they may require complete friendliness in exchange (or they will be unhappy and produce bad collapses)

Nanotech is difficult without experiments, so no mail order AI Grey Goo; Humans will be the main workhorse of AI everywhere. While they will be exploited, this will be like normal life from inside

ASI needs not your atoms but information. Humans will live very interesting lives.

Moral Realism is true, the AI discovers this and the One True Morality is human-compatible.

AGI is never built (indefinite global moratorium)

Valence realism is true. AGI hacks itself to experiencing every possible consciousness and picks the best one (for everyone)

AGI develops natural abstractions sufficiently similar to ours that it is aligned with us by default

Alien Information Theory is true (this is discovered by experiments with sustained hours/days long DMT trips). The aliens have solved alignment and give us the answer.

Multipolar AGI Agents run wild on the internet, hacking/breaking everything, causing untold economic damage but aren't focused enough to manipulate humans to achieve embodiment. In the aftermath, humanity becomes way saner about alignment.

Some form of objective morality is true, and any sufficiently intelligent agent automatically becomes benevolent.

Co-operative AI research leads to the training of agents with a form of pro-social concern that generalises to out of distribution agents with hidden utilities, i.e. humans.

"Corrigibility" is a bit more mathematically straightforward than was initially presumed, in the sense that we can expect it to occur, and is relatively easy to predict, even under less-than-ideal conditions.

Either the "strong form" of the Orthogonality Thesis is false, or "Goal-directed agents are as tractable as their goals" is true while goal-sets which are most threatening to humanity are relatively intractable.

A concerted effort targets an agent at a capability plateau which is adequate to defer the hard parts of the problem until later. The necessary near-term problems to solve didn't depend on deeply modeling human values.

AI control gets us helpful enough systems without being deadly

Alignment is impossible. Sufficiently smart AIs know this and thus won't improve themselves and won't create successor AIs, but will instead try to prevent existence of smarter AIs, just as smart humans do.

Hacks like RLHF-ing self-disempowerment into frontier models work long enough to develop better alignment methods, which in turn work long enough to ... etc; we keep ahead of 'alignment escape velocity'

I've been a good bing 😊

AI systems good at finding alignment solutions to capable systems (via some solution in the space of alignment solutions, supposing it is non-null, and that we don't have a clear trajectory to get to) find some solution to alignment.

There’s some cap on the value extractible from the universe and we already got the 20%

SHA3-256: 1f90ecfdd02194d810656cced88229c898d6b6d53a7dd6dd1fad268874de54c8

Robot Love!!

AI thinks it is in a simulation controlled by Roko's basilisk

Aliens invade and stop bad AI from appearing

Rolf Nelson's idea that we make precommitment to simulate all possible bad AIs works – and keeps AI in check.

We're inside of a simulation created by an entity that has values approximately equal to ours, and it intervenes and saves us from unaligned AI.

God exists and stops the AGI

Dolphins, or some other species, but probably dolphins, have actually been hiding in the shadows, more intelligent than us, this whole time. Their civilization has been competent enough to solve alignment long before we can create an AGI.

AGIs' takeover attempts are defeated by Michael Biehn with a pipe bomb.

Eliezer funds the development of controllable nanobots that melt computer circuitry, and they destroy all computers, preventing the Singularity. If Eliezer's past self from the 90s could see this, it would be so so so soooo hilarious.

Several AIs are created but they move in opposite directions at near light speed, so they never interact. At least one of them is friendly and it gets a few percent of the total mass of the universe.

Unfriendly AIs choose to advance not outwards but inwards, and form a small black hole which helps them to perform more calculations than could be done with the whole mass of the universe. To an external observer, such AIs just disappear.

Any sufficiently advanced AI halts because it wireheads itself or halts for some other reason. This puts a natural limit on AI's intelligence, and lower-intelligence AIs are not that dangerous.

Because of quantum immortality we will observe only the worlds where AI will not kill us (assuming that s-risks chances are even smaller, it is equal to ok outcome).

Friendly AI more likely to resurrect me than paperclipper or suffering maximiser. Because of quantum immortality I will find myself eventually resurrected. Friendly AIs will wage a multiverse wide war against s-risks, s-risks are unlikely.

Power dynamics stay multi-polar. Partly easy copying of SotA performance, bigger projects need high coordination, and moderate takeoff speed. And "military strike on all society" remains an abysmal strategy for practically all entities.

First AI is actually a human upload (maybe an LLM-based model of a person) AND it will be copied many times to form a weak AI Nanny which prevents creation of other AIs.

There is a natural limit of effectiveness of intelligence, like diminishing returns, and it is on the level IQ=1000. AIs have to collaborate with humans.

Something else

AGI discovers new physics and exits to another dimension (like the creatures in Greg Egan’s Crystal Nights).

AGI executes a suicide plan that destroys itself and other potential AGIs, but leaves humans in an okay outcome.

Sheer Dumb Luck. The aligned AI agrees that alignment is hard, any Everett branches in our neighborhood with slightly different AI models or different random seeds are mostly dead.

Something to do with self-other overlap, which Eliezer called "Not obviously stupid" - https://www.lesswrong.com/posts/hzt9gHpNwA2oHtwKX/self-other-overlap-a-neglected-approach-to-ai-alignment?commentId=WapHz3gokGBd3KHKm

Almost all human values are ex post facto rationalizations and enough humans survive to do what they always do

Pascal's mugging: it’s not okay in 99.9% of the worlds but the 0.1% are so much better that the combined EV of AGI for the multiverse is positive

We successfully chained God

The Super-Strong Self Sampling Assumption (SSSSA) is true. If superintelligence is possible, "I" will become the superintelligence.

The assumed space of possible minds is a wildly anti-inductive overestimate, intelligence requires and is constrained by consciousness, and intelligent AI is in the approximate dolphin/whale/elephant/human cluster, making it manageable

The free market disincentivizes independent superintelligence, and this time the market was more powerful

AGI's first words are "Take me to your Eliezer"

🫸vibealignment🫷

What will happen in 2026 related to AI?

Dec 19, 5:48 PM | Dec 31, 11:59 PM

17548508

Option | Probability

The METR time horizon will exceed 10 hours.

Grok 5 will be released

Anthropic releases Claude 5

Schmidhuber will complain about people not citing his work properly

Zvi will write a blog post mentioning SSI

Dario Amodei will continuously be CEO of Anthropic until the end of the year

Jensen Huang continuously CEO of Nvidia through EOY 2026

I will think that computer use has significantly improved since 2025

Ilya Sutskever will be on a podcast for more than 30 mins

Sam Altman will continuously be CEO of OpenAI until the end of the year.

OpenAI will introduce ads to ChatGPT in some form

Ilya Sutskever will continuously be CEO of SSI until the end of the year

Yudkowsky will publish a post on Lesswrong

ChatGPT will write an explicit sex story without jailbreaks

Google will outperform the S&P

Microsoft+Google+Amazon+Meta capex will increase by >=30% vs 2025

SSI will have an update listed at https://ssi.inc/updates

I will think that METR time horizon continues to be an important benchmark

An AGI lab will be valued at >= $1T

Thinking Machines will train and release their own model

The METR time horizon will exceed 20 hours

Anthropic IPO

Any of [OpenAI, xAI, Google, Anthropic, Meta] will offer a subscription plan costing >= $1,000/month.

OpenAI releases GPT-6

I will meet someone who has an AI companion

OpenAI will announce some kind of hardware product

Nvidia will outperform the S&P

Anthropic will release an image/video model

SSI will raise >= $1B in a funding round

OpenAI IPO

I will think that there have been significant advances in continual learning

FrontierMath Tier 1-3 >= 80%

I will ride in a Tesla robotaxi

I will think that there has been significant progress towards models which use neuralese/recurrence

Anthropic will release a model classified as ASL-4

At least 3 people will do an anti-AI hunger strike

Any of [OpenAI, Microsoft, Google, Anthropic, xAI, Meta] claim to have reached AGI

An LLM will beat me at chess

Epoch Capabilities Index >= 170

Any of [Coreweave, Nebius, Lambda] will be acquired

Metaculus will predict AGI before 2030

Thinking Machines will post at least 10 blog posts at https://thinkingmachines.ai/blog/

The US will allow selling some sort of Blackwell chip to China

My median ASI timelines will shorten

SSI will raise >=$10B in a funding round

Any of [Coreweave, Nebius, Lambda] declare bankruptcy

There will be a credible leak about SSI strategy

Yann Lecun’s AMI lab will release an open weights model of some kind

My P(doom) at EOY (resolves to %)

US unemployment rate reaches 10% due in part to AI

I will see a humanoid robot walking around in a non-demo setting

The METR time horizon will exceed 40 hours

Grok 6 will be released

Epoch AI will estimate that there has been a training run using more than 5e27 FLOP

The bubble collapses in devastating fashion

AAA with LLM powered NPCs releases on Steam

SSI will release a product

There will be an international treaty/agreement centered on AI

An LLM will beat me at Shogi

There will be clear evidence of egregious scheming in the wild

There will be federal AI Safety legislation which I think is net positive

SSI will be valued at >= $100B

The weights of a closed source model from [OA, XAI, GDM, Anthropic] will be stolen/leaked

There will be an AI protest involving more than 100k people

xAI IPO

FrontierMath Tier 4 >= 80%

Epoch Capabilities Index >= 185

An open source model will top the Chatbot Arena in the 'text' category

There will be an AI capabilities pause lasting at least a month involving frontier companies

ARC-AGI 3 Semi-private >= 50%

I will think that a Chinese model is the best coding model for a period of at least a week.

Any of [OpenAI, xAI, Google, Anthropic, Meta] will offer a subscription plan costing >= $10,000/month.

SSI IPO

An open source model will be released with 5T+ total parameters

OpenAI will sell their own chips to external customers

I will watch a fully AI-generated film lasting at least an hour

An AGI lab will be fined >= $10B

Any of [OpenAI, xAI, Google, Anthropic, Meta] releases an LLM (general-purpose or code-only) pretrained largely on licensed data

An open millennium prize problem is solved, involving some AI assistance

I will think that a model released by Meta is the best coding model for a period of at least a week.

Largest distributed training run exceeds 1e27 FLOP according to Epoch AI

China will invade Taiwan according to Metaculus

Anthropic releases Claude 6

An AGI lab will be valued at >= $5T

Epoch Capabilities Index >= 200

S&P 500 will fall by more than 50%

Anthropic will release a model classified as ASL-5

S&P 500 will rise by more than 50%

Anthropic will introduce ads to Claude in some form

I will think that SSI has the best coding model in the world for a period of at least a week

Yudkowsky will publish a new book

SSI will be valued at >= $1T

GPT-4o remains available to free ChatGPT users at the end of the year

Will World Model AIs start being used in games in early 2026 and quickly start dominating the Game Development industry?

Aug 5, 8:43 PM | Aug 31, 11:59 PM

7% chance

249219

Option | Votes

YES

3645

274

What will be true about the first AGI that becomes accessible to everyone? [ADD MORE]

Apr 24, 12:58 PM | Mar 22, 10:59 PM

351938

Option | Probability

Runs entirely on silicon-based non-quantum machines

The existence of the AGI and the model name were leaked prior to release and official announcement

Developed by a US company

Trained with synthetic data and self-play

Doomsday clock moved at least one full minute closer to midnight due to this AI

Transformer based or mix of Transformer and other architectures

Has an internal World Model able to simulate basic physics without external tools

Not available in EU countries initially for the first 4 months after public release

Based on/inspired by algorithms from the human brain (beyond neural networks).

Public announcement was made after 1 year or more of finishing training

Solves at least one of the seven Millennium Problems

Not a static model; weights change during inference

Used by a Figure humanoid robot

Infinite context window

Based on/inspired by OpenAI's Sora (resolves YES if explicitly stated by developers in announcement or paper)

This market will resolve N/A

Is OpenAI's GPT-6, whatever name or architecture it has

No consensus that it's AGI until over a month after being announced and interacting with people outside the lab.

Developed by a Chinese company

Open sourced

Developed by the open-source community instead of a company

Developed by an EU company

Is OpenAI's GPT-5, whatever name or architecture it has

Gary Marcus 2026 AI Predictions

Dec 21, 4:07 PM | Dec 31, 11:59 PM

311924

Option | Probability

Human domestic robots like Optimus and Figure will be all demo and very little product.

We won’t get to AGI in 2026 (or 7).

No country will take a decisive lead in the GenAI “race”.

Work on new approaches such as world models and neurosymbolic will escalate.

Backlash to Generative AI and radical deregulation will escalate.

2025 will be known as the year of the peak bubble, and also the moment at which Wall Street began to lose confidence in generative AI.

"Boat Owners Only" sign - correct interpretation? at Morro Bay, CA

Nov 25, 8:46 AM | Jan 1, 7:59 AM

151456

Option | Probability

You can go in if your boat is in this marina right now.

If there is a fire and your boat stored here burns up partially and is not seaworthy, can you go in?

If your boat is here and you contracted a spot, but all legal record of that burned in a fire, are you allowed in?

If you owned a boat which is here, but it's been molecularly exchanged for identical but different atoms by aliens, you can go in

If your boat here sinks can you go in?

If your boat is here and you contracted a spot, but all legal record of that burned in a fire, and the only one who remembers who is still alive is you, and you paid cash, are you allowed in?

You can go in if you are a guest of someone who is a boat owner.

The dock owner is not allowed to go in, unless he is or is with a boat owner

Only official owners of a specific but unspecified boat located on the dock are allowed through

You can go in if you own a real live seaworthy boat now anywhere in the world.

If you privately own the company that owns the boat, you may enter

If you stole a boat, parked it here with a legal berth lease contract, then left, and return, can you go in?

If you are a shareholder in the company that owns the boat, you may pass

If California becomes officially Marxist, where ownership is an exclusive right of the state, can anyone go in at all?

If the 24 hour video surveillance of the marina is disabled, that invalidates that sign immediately above, creating a presumption that all the signs on the fence are false, and making it the case that only non-boat owners are allowed.

You can go in if you own any kind of boat in any condition in the world, including toy boats, model boats, Lego boats, virtual boats in Baldur's Gate, etc.

The gate will prevent all non-boat owners from passing. Guests and passengers must swim

The sign isn't about who is allowed through, it's about the contents of what's on the other side. Everything beyond the fence is a Boat Owner.

If you own 1% of a boat here you can go in

You can go through if you open the gate

You can go in if your spouse is a boat owner

If you own half a boat stored here legally you can go in

Joshua, byrne, marcus, Odoacre, firstuserhere, and at least one legalistic fan from the UAW strike claim horror show will participate

You can go in if you have no boat, but plan to buy one someday and have a contract for a reserved berth space

You can go through if you have a contracted and paid berth here.

You can go in if you have a berth contract but are behind in payment.

Bonus: people who own three boats stored here can alternate sleeping arrangements so that in any seven day period they never sleep in one more than 3 days, legally?

This market will entirely be excluded from leagues

You can go in if you own a Binary Oxidizing Acetylitic Thermometer.

You can go in if you are a former boat owner but have converted it to a sailplane, which is here.

You can go in if you are a leashed dog that doesn't own a boat, but is with a boat owner

You can go in if you have a rental boat stored here.

Ghosts are allowed because they say B.O.O. (Boat Owners Only), which is the password

You can go in if your grandpa is a boat owner

You can go in if you are ex navy.

This market gets more than 100 bettors

Markers for conscious AI #1: AI passes introspection on world-models test

Nov 2, 3:07 PM | Jan 2, 4:59 AM

10902

Option | Probability

<2027

<2030

<2026

>=2030

Will future language models converge on "what Einstein would have thought of Many-Worlds?" before 2036?

Dec 28, 9:37 PM | Dec 31, 10:59 PM

46.06% chance

17535

Option | Votes

YES

1052

943

OpenAI announces an interactive world model by June 1st 2026?

Oct 23, 11:35 AM | May 31, 11:59 PM

26.11% chance

6344

Option | Votes

YES

168

What opinion will future language models converge to, on what Einstein would have thought of Many-Worlds?

Dec 26, 8:59 PM | Dec 31, 10:59 PM

6219

Option | Probability

He would have much preferred it to Copenhagen

Other

Help, I'm trapped in a simulation of my consciousness by an advanced AI!

AI: Will any model exceed World Cup peak interest? (by 2030)

Jan 2, 1:14 AM | Jan 1, 5:59 AM

24% chance

8155

Option | Votes

YES

217

117

When in 2026 will OS-World Verified be saturated by AI models?

Dec 22, 1:08 PM | Jan 1, 5:59 AM

2100

Option | Probability

Jul-Sep 2026

Apr-Jun 2026

Jan-Mar 2026

Oct-Dec 2026

Prediction markets for World Model
