Google DeepMind has unveiled Genie 3, a general-purpose “world model” that generates fully interactive virtual environments from a single text prompt. The model streams at 720p and up to 24 frames per second while preserving visual and physical consistency for several minutes. A short-term memory of roughly one minute keeps objects stable when they leave and re-enter the camera’s view, and users can navigate with standard controls or alter the scene in real time (changing the weather, adding characters or triggering other events) without restarting the simulation. DeepMind positions Genie 3 as a tool for training embodied AI agents and robots, calling it a “crucial stepping stone” toward artificial general intelligence. In internal tests, the company’s SIMA agent completed navigation tasks inside Genie-generated warehouses, underscoring the model’s potential as a high-fidelity sandbox for self-play and reinforcement learning. Genie 3 is a sharp jump from last year’s Genie 2: resolution rises from 360p to 720p, frame rate from 15 fps to 24 fps, and interactive sessions extend from seconds to minutes. The upgrade narrows the gap between AI-generated video and conventional game engines, hinting at future applications in gaming, education and content prototyping. The system remains in a closed research preview, available to a select group of academics and creators. DeepMind has not yet provided a timetable for broader access or commercial deployment.
If y’all are watching this world model from Google (Genie 3) and still don’t believe we’re living in a simulation… What other evidence do you require!? https://t.co/iqT4qPU0td
#Genie3 https://t.co/GX6avQJlvb
A world within a world with Genie 3. Watch the complete video. Vision encoders will finally see motion the way the world delivers it, with these physics-consistent simulations. https://t.co/QIR5LCbztg https://t.co/aCYsquKrdQ