Google DeepMind on Tuesday introduced Genie 3, its most advanced “world model,” capable of generating real-time, fully navigable 3D environments from a single text prompt. The system renders at 720p and 24 frames per second and can preserve visual and physical consistency for several minutes, an order-of-magnitude jump from the 10- to 20-second, 360p output of last year’s Genie 2.

Genie 3 keeps track of objects for roughly one minute via a short-term “world memory,” allowing users to leave and revisit a scene without losing context. A new feature for “promptable world events” lets creators alter weather, add characters or otherwise reshape the scene on the fly, while end-to-end control latency is about 50 milliseconds, according to the research team.

DeepMind executives said the technology is intended not only for next-generation game and media production but also for training embodied AI agents and robots in simulated environments, a capability they describe as a critical step toward artificial general intelligence. Early tests show the model supporting DeepMind’s SIMA agent in completing goal-oriented tasks inside the generated worlds.

The software remains in a restricted research preview, with access limited to select academics and creators while the company works on longer simulation horizons, multi-agent interactions and geographic accuracy. DeepMind has not yet disclosed hardware requirements or a timeline for broader release.
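The embodied-agent use case described above amounts to a standard observe-and-act loop between an agent and the generated world. The Python sketch below is purely illustrative: Genie 3 has no public API, so `GeneratedWorld`, `RandomAgent`, `inject_event` and every other name here are hypothetical stand-ins that only echo the specs reported in this piece (720p frames, 24 fps, text-prompted scenes, promptable world events).

```python
# Minimal sketch of training/evaluating an agent inside a text-prompted world.
# NOTE: Genie 3 exposes no public API; GeneratedWorld, RandomAgent and the
# action set are hypothetical stand-ins mirroring only the reported specs
# (720p frames at 24 fps, text-prompted scenes, promptable world events).
import random
import time

FRAME_BYTES = 720 * 1280 * 3   # one 720p RGB frame, per the reported resolution
FRAME_INTERVAL = 1.0 / 24      # 24 fps, per the reported frame rate


class GeneratedWorld:
    """Stand-in for a text-prompted, navigable world model (stubbed)."""

    def __init__(self, prompt: str):
        self.prompt = prompt
        self.frames_rendered = 0

    def reset(self) -> bytes:
        """Start a new episode and return the first rendered frame."""
        self.frames_rendered = 0
        return bytes(FRAME_BYTES)          # placeholder pixel buffer

    def step(self, action: str) -> bytes:
        """Advance one frame in response to a navigation action."""
        self.frames_rendered += 1
        return bytes(FRAME_BYTES)          # placeholder pixel buffer

    def inject_event(self, event: str) -> None:
        """Stand-in for a 'promptable world event' (e.g. changing the weather)."""
        print(f"[world event] {event}")


class RandomAgent:
    """Trivial policy; a SIMA-style agent would pick actions toward a goal."""

    ACTIONS = ["forward", "back", "left", "right", "jump"]

    def act(self, frame: bytes) -> str:
        return random.choice(self.ACTIONS)


def run_episode(prompt: str, goal: str, seconds: int = 3) -> None:
    world = GeneratedWorld(prompt)
    agent = RandomAgent()
    frame = world.reset()
    world.inject_event("a rainstorm begins")        # promptable world event
    for _ in range(seconds * 24):                   # play for a few seconds at 24 fps
        action = agent.act(frame)
        frame = world.step(action)
        time.sleep(FRAME_INTERVAL)                  # pace the loop at ~24 fps
    print(f"goal: {goal!r}; frames rendered: {world.frames_rendered}")


if __name__ == "__main__":
    run_episode("a warehouse with stacked crates", goal="reach the loading dock")
```

The loop structure is the point, not the stubs: the world model plays the role a hand-built simulator would normally play, which is why DeepMind frames generated environments as a path to training embodied agents at scale.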
Genie 3 by Google DeepMind is insane. But it doesn't just generate interactive AI spatial worlds from text. It also steers images & videos and chains actions to hit complex goals. 10 wild examples: 1. Step into "Nighthawks" by Edward Hopper https://t.co/2NfkaN4ERB
From “The Lawnmower Man” to Genie 3: the future we imagined with real-time virtual worlds: https://t.co/cogoNEoExj by Un informático en el lado del mal #infosec #cybersecurity #technology #news
Video-gen AI virtual salespeople are going to be insane https://t.co/0PeeyjAFX7