Sources
Center for Data Innovation: 📰 OpenVLA, a vision-language-action model for robotics, lets developers control robots with natural language & images, aiding customization in multi-task environments with multiple objects affordably. https://t.co/Z9vuF7GYP0
fly51fly: [RO] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy https://t.co/ebM3pszYAY - The paper proposes LLaRA, a framework to convert a pretrained vision language model (VLM) into a robot action policy using curated instruction tuning datasets. - LLaRA first… https://t.co/IAsCPF47n7
Kumara Kahatapitiya: Introducing LLaRA ✨ A complete recipe for converting a VLM into a robot policy: from data curation, finetuning to real-robot execution, all open-sourced NOW! Our experiments show the benefits of auxiliary data (e.g. spatial/temporal reasoning) on learning policy. Have fun! https://t.co/PUJ9p9vrEz