Google has broadened the multimodal capabilities of its Gemini artificial-intelligence app, allowing users to upload and analyse their own video files directly in chat. The update, which began rolling out on 18–19 June, is now available on Android and iOS and is also reaching the web client, according to company engineers involved in the release. Gemini can watch and listen to uploaded clips, transcribe overlapping voices, summarise content and answer follow-up questions, capabilities that previously applied only to images, documents and YouTube links. Early testers said the system accurately distilled a 39-minute keynote speech and identified individual speakers in crowded audio tracks, bringing the product closer to a full-service digital assistant.

The launch coincides with a wider push to weave Gemini deeper into Google's mobile software. A "Scheduled Actions" tool that triggers prompts at preset times is being made more widely available on Android, and the latest Android 16 QPR1 beta introduces a redesigned Pixel Launcher search bar and a new Gemini launch animation. Together, the additions underscore Google's efforts to compete with OpenAI by embedding its generative models across hardware and services.
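The consumer app handles uploads through its chat interface, but comparable video analysis has already been open to developers through the Gemini API's Files endpoint. The sketch below is a minimal illustration using the google-generativeai Python SDK; the file name "keynote.mp4", the API-key placeholder and the model choice are assumptions for illustration, not details from this release.

```python
# Minimal sketch: summarising an uploaded video with the Gemini API.
# Assumes the google-generativeai SDK; file name, key and model are placeholders.
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key from Google AI Studio (assumed)

# Upload the clip via the Files API; large videos are processed asynchronously.
video = genai.upload_file(path="keynote.mp4")
while video.state.name == "PROCESSING":
    time.sleep(10)
    video = genai.get_file(video.name)
if video.state.name == "FAILED":
    raise RuntimeError("Video processing failed")

# Pass the processed file alongside a text prompt in a single request.
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    [video, "Summarise this keynote and list the distinct speakers."]
)
print(response.text)
```

The polling loop reflects how the Files API works: an uploaded video must finish server-side processing before it can be referenced in a prompt, so the client waits until the file leaves the PROCESSING state.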
NEW 🔥: Gemini now supports video uploads across web and mobile apps! https://t.co/GMiZ8wWB16 https://t.co/AmWrjG1FOX
Gemini can take video as input! love it https://t.co/8NCFhT9k4L
🚨Breaking: Gemini now supports video upload. Just gave it a 39-minute Andrej Karpathy latest keynote video and it summarized it well. This was really needed. https://t.co/j5hASpIUg0