OpenVLA, an open vision-language-action model for robotics, lets developers command robots with natural language and images, making it affordable to customize policies for multi-task, multi-object environments. Open-TeleVision, a real-time teleoperation system, streams stereo vision to a VR headset and lets an operator control a robot from across the United States, delivering highly precise, smooth bimanual manipulation with active egocentric vision, demonstrated by inserting 12 cans in a row without interruption. RoboPack, a framework that integrates tactile-informed state estimation, dynamics prediction, and planning, helps robots understand world dynamics through combined visual and tactile sensing for contact-rich tasks like packing. EquiBot is a generalizable, data-efficient method for visuomotor policy learning that lets robots learn household tasks from as little as five minutes of human video while staying robust to changes in object shape, lighting, and scene makeup.
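For a sense of how a vision-language-action model like OpenVLA is queried in practice, here is a minimal sketch following the usage example published with its HuggingFace release; the dummy image stands in for a real camera frame, and argument names such as `unnorm_key` come from that release and may change:

```python
# Sketch of one OpenVLA inference step: image + instruction in, 7-DoF action out.
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to("cuda:0")

image = Image.new("RGB", (224, 224))  # stand-in for a real camera frame
prompt = "In: What action should the robot take to pick up the red cup?\nOut:"

# predict_action returns an end-effector delta (xyz, rpy, gripper),
# un-normalized with the statistics of the named training dataset.
inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
print(action)  # would be sent to the robot controller
```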
Want a robot that learns household tasks by watching you? EquiBot is a ✨ generalizable and 🚰 data-efficient method for visuomotor policy learning, robust to changes in object shapes, lighting, and scene makeup, even from just 5 mins of human videos. 🧵↓ https://t.co/vjzQ5fUP21
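The "Equi" in EquiBot refers to its equivariant policy architecture: transforming the observed scene by a rotation, scale, or translation transforms the predicted action the same way, which is what buys the robustness claimed above. A toy numpy sketch of that property (the centroid "policy" here is purely illustrative, not EquiBot's network):

```python
# Check the equivariance property f(T(x)) = T(f(x)) for a toy point-cloud policy.
import numpy as np

def toy_policy(points: np.ndarray) -> np.ndarray:
    """Toy equivariant 'policy': predict a grasp point at the cloud centroid."""
    return points.mean(axis=0)

rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3))      # observed object point cloud

# Random similarity transform: rotation R (via QR), uniform scale s, translation t.
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R *= np.sign(np.linalg.det(R))          # ensure a proper rotation (det = +1)
s, t = 2.5, rng.normal(size=3)

lhs = toy_policy(points @ R.T * s + t)  # act on the transformed scene
rhs = toy_policy(points) @ R.T * s + t  # transform the original action
assert np.allclose(lhs, rhs)            # equivariance: both agree
```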
#RSS24 Can robots better understand world dynamics through visual and tactile sensing? 🤖 We introduce RoboPack, a framework that integrates tactile-informed state estimation, dynamics prediction, and planning for complex tasks like packing. 🧵1/N https://t.co/tMPDUNahmI https://t.co/RYehRSTjP0
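The predict-then-plan loop that such a framework rests on can be sketched generically: a dynamics model rolls out sampled action sequences and the planner executes the best first action. The linear dynamics and quadratic cost below are stand-ins, not RoboPack's learned tactile-informed model:

```python
# Random-shooting MPC with a (stand-in) learned dynamics model.
import numpy as np

rng = np.random.default_rng(0)

def dynamics(state: np.ndarray, action: np.ndarray) -> np.ndarray:
    """Stand-in for a learned model fusing visual and tactile state estimates."""
    return state + 0.1 * action

def cost(state: np.ndarray, goal: np.ndarray) -> float:
    """Task cost, e.g. distance of the tracked object from its packed pose."""
    return float(np.sum((state - goal) ** 2))

def plan(state, goal, horizon=10, n_samples=256):
    """Sample action sequences, roll each out, return the best first action."""
    best_action, best_cost = None, np.inf
    for _ in range(n_samples):
        seq = rng.normal(size=(horizon, state.shape[0]))
        s, c = state, 0.0
        for a in seq:
            s = dynamics(s, a)
            c += cost(s, goal)
        if c < best_cost:
            best_cost, best_action = c, seq[0]
    return best_action

state, goal = np.zeros(3), np.array([1.0, -0.5, 0.25])
for _ in range(50):                      # receding-horizon execution loop
    state = dynamics(state, plan(state, goal))
print(state, "->", goal)                 # state converges toward the goal
```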
Introducing Open-TeleVision: https://t.co/tm4exWTXsL with a fully autonomous policy video👇. We can perform a long-horizon task, inserting 12 cans nonstop without any interruption. We offer: 🤖 Highly precise and smooth bimanual manipulation. 📺 Active egocentric vision (with… https://t.co/q8qz6EnodW
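One small but essential piece of any such VR teleoperation loop is retargeting: mapping the operator's tracked wrist pose into the robot's reachable workspace every frame. A hedged sketch of that step, with made-up workspace limits and hypothetical headset/robot I/O (Open-TeleVision itself also streams stereo video back to the headset):

```python
# Per-frame retargeting of a headset-tracked wrist to a robot target position.
import numpy as np

WORKSPACE_MIN = np.array([0.2, -0.4, 0.05])   # assumed robot reach limits (meters)
WORKSPACE_MAX = np.array([0.8, 0.4, 0.6])
SCALE = 1.0                                   # human-to-robot motion scale

def retarget(wrist_pos: np.ndarray, origin: np.ndarray) -> np.ndarray:
    """Map a headset-frame wrist position to a clamped robot target."""
    target = SCALE * (wrist_pos - origin) + (WORKSPACE_MIN + WORKSPACE_MAX) / 2
    return np.clip(target, WORKSPACE_MIN, WORKSPACE_MAX)

# One simulated frame: operator's wrist 10 cm to the right of the calibration origin.
origin = np.array([0.0, 0.0, 1.4])            # captured when teleop starts
wrist = origin + np.array([0.0, 0.10, 0.0])
print(retarget(wrist, origin))                # target handed to the arm's IK solver
```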