Google DeepMind has introduced Still-Moving, a method that lets users transfer any image customization technique to video models, covering personalization with DreamBooth, stylization with StyleDrop, and control with ControlNet in a single framework, while also letting users control the amount of generated motion. Separately, a new approach called Gen2Act leverages off-the-shelf video-generation models to generate human demonstration videos and translate them into robot actions: the generated videos serve as zero-shot demonstrations, and a policy conditioned on them enables robots to perform diverse real-world manipulation tasks.
Introducing Gen2Act! 🤖🎥 Off-the-shelf video generation models can provide zero-shot human demonstrations to control a robot. The visual world model can show how a human might do many different tasks, and we created a policy that can follow these generated video plans. 🧵👇 https://t.co/MwXMi1Zuk1
Gen2Act leverages off-the-shelf video-generation models to generate targeted *human* videos of manipulation tasks. The Gen2Act policy is conditioned on motion cues extracted from the generated videos, enabling it to act in diverse and unseen real-world scenarios. 🧵👇 https://t.co/VS1dpGvGNa
How can we connect advances in video generation foundation models to low-level robot actions? We propose a simple but powerful idea: condition robot policies directly on generated human videos! Video Generation 🤝 Robot Actions Check out Homanga’s thread: https://t.co/7sV3KEUONH
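To make the "condition robot policies directly on generated human videos" idea concrete, here is a minimal sketch of a video-conditioned policy. Everything in it, the module names, dimensions, transformer fusion, and action-chunk readout, is an illustrative assumption rather than the published Gen2Act architecture (which, per the thread, also uses motion cues from the generated video; details are in the paper).

```python
# Hypothetical sketch: a policy that consumes (a) frames of a generated human
# demonstration video and (b) the robot's current camera observation, and
# predicts a short chunk of low-level actions. Not the official Gen2Act code.
import torch
import torch.nn as nn


class FrameEncoder(nn.Module):
    """Encodes RGB frames into feature vectors (assumed small CNN backbone)."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, 3, H, W) -> (B, T, feat_dim)
        b, t, c, h, w = frames.shape
        feats = self.net(frames.reshape(b * t, c, h, w))
        return feats.reshape(b, t, -1)


class VideoConditionedPolicy(nn.Module):
    """Maps a generated human video ("video plan") plus the robot's current
    observation to an action chunk, following the video-plan-to-actions idea."""

    def __init__(self, feat_dim: int = 256, action_dim: int = 7, horizon: int = 8):
        super().__init__()
        self.video_encoder = FrameEncoder(feat_dim)
        self.obs_encoder = FrameEncoder(feat_dim)
        # Transformer fuses the video-plan tokens with the current observation token.
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.action_head = nn.Linear(feat_dim, action_dim * horizon)
        self.action_dim, self.horizon = action_dim, horizon

    def forward(self, generated_video: torch.Tensor, robot_obs: torch.Tensor) -> torch.Tensor:
        plan_tokens = self.video_encoder(generated_video)      # (B, T, D)
        obs_token = self.obs_encoder(robot_obs.unsqueeze(1))    # (B, 1, D)
        fused = self.fusion(torch.cat([plan_tokens, obs_token], dim=1))
        # Read out the action chunk from the observation token's position.
        actions = self.action_head(fused[:, -1])
        return actions.reshape(-1, self.horizon, self.action_dim)


if __name__ == "__main__":
    policy = VideoConditionedPolicy()
    video = torch.randn(1, 16, 3, 96, 96)  # generated human demonstration frames
    obs = torch.randn(1, 3, 96, 96)        # robot's current camera image
    print(policy(video, obs).shape)        # torch.Size([1, 8, 7])
```

In this framing, the video-generation model acts as a visual world model that produces the plan zero-shot, and only the comparatively small policy above needs to be trained on robot data to follow it.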