WHY IS NO ONE TALKING ABOUT THIS?? The Gemma 3n model was one of the best surprises for me. The fact that you can run it on edge devices with just 2GB of RAM is impressive. A few weeks back, I was on holiday and used the Gemini Live feature a lot. But I kept running into https://t.co/0XPBcNFmon
Gemini's got a glow-up: → Text → Image → Video All on 2GB RAM 🤯 Multimodal just went mainstream. https://t.co/IKa29JkVce
Google has released Gemma 3n, a multimodal language model trained specifically for mobile. Remarkably, the 5B model can even understand video, with a memory footprint comparable to a 2B model - roughly 1.5x faster response on mobile devices - reduces memory usage via techniques such as per-layer embeddings and key-value cache sharing - understands and processes audio, text, and images, and even video - will be built into Android and Chrome - https://t.co/m9zF0B4EJG https://t.co/NGf4gDXRb3
Google DeepMind has introduced Gemma 3n, a new multimodal AI model designed specifically for mobile on-device applications. The model cuts RAM usage nearly threefold, enabling it to run efficiently on devices with as little as 2GB of RAM. Gemma 3n supports complex AI tasks across text, images, audio, and video, making it a versatile tool for mobile and edge computing. It achieves faster response times—approximately 1.5 times quicker on mobile devices—through techniques such as per-layer embeddings and key-value cache sharing. The model, which is around 5 billion parameters in size but with memory usage comparable to a 2 billion parameter model, is expected to be integrated into the Android and Chrome platforms. This development marks a shift in AI inference from centralized data centers to decentralized, user-end devices, promoting broader accessibility and real-time AI capabilities on smartphones and laptops. Gemma 3n is currently available in early preview.
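A quick back-of-the-envelope sketch of the headline claim — a ~5B-parameter model with the resident memory of a ~2B one. The split below (how many parameters live in per-layer embeddings that can be streamed from fast storage versus the core transformer weights that must stay in RAM) is an illustrative assumption for the arithmetic, not Gemma 3n's published breakdown:

```python
BYTES_PER_PARAM = 2  # fp16/bf16 weights, no quantization

def ram_gb(params: float) -> float:
    """Approximate weight memory in GB for a given parameter count."""
    return params * BYTES_PER_PARAM / 1024**3

total_params = 5e9                     # full model size, per the announcement
ple_params = 3e9                       # assumed share held in per-layer embeddings
core_params = total_params - ple_params

naive = ram_gb(total_params)           # every weight resident in RAM
# If per-layer embeddings are loaded on demand from fast storage,
# only the core transformer weights need to stay resident:
with_ple = ram_gb(core_params)

print(f"all weights resident: {naive:.1f} GB")
print(f"with PLE streaming:   {with_ple:.1f} GB")  # comparable to a ~2B model
```

Under these assumed numbers, resident weight memory drops from about 9.3 GB to about 3.7 GB — roughly what a standalone 2B model would need — and further quantization is what brings the smallest configurations within reach of 2GB-RAM devices.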