Google DeepMind has officially released Gemma 3n, a compact and powerful multimodal AI model designed for edge devices. Gemma 3n processes text, image, audio, and video inputs and can run with as little as 2GB of RAM, making it suitable for mobile and on-device applications. The model comes in two variants, E2B and E4B, whose raw parameter counts are roughly 5B and 8B but which run with memory footprints comparable to traditional 2B and 4B models. The E4B variant is the first model under 10 billion parameters to surpass a score of 1300 on the LMArena benchmark. Gemma 3n supports text in 140 languages and multimodal understanding across 35 languages. The architecture incorporates innovations such as MatFormer (Matryoshka Transformer) and Per-Layer Embeddings (PLE) to optimize performance and memory usage. The model is openly available through major platforms including Hugging Face Transformers, llama.cpp, Together AI, and others, with free fine-tuning options available via Colab notebooks. Developers are encouraged to build applications on Gemma 3n, including through a Kaggle Impact Challenge focused on projects in accessibility, education, and health. The model’s design emphasizes privacy and offline functionality by enabling full multimodal AI processing without cloud dependence.
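For developers taking the Hugging Face Transformers route mentioned above, a minimal text-generation sketch might look like the following. The checkpoint name `google/gemma-3n-E2B-it`, the `text-generation` pipeline task, and the `device_map` setting are assumptions; check the official model card for the exact identifiers before running.

```python
# Minimal sketch: loading a Gemma 3n checkpoint for text generation
# via Hugging Face Transformers. Model ID and pipeline task are assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3n-E2B-it",  # assumed E2B instruction-tuned checkpoint
    device_map="auto",               # place weights on GPU if one is available
)

prompt = "Explain in one sentence why on-device inference helps privacy."
output = generator(prompt, max_new_tokens=64)
print(output[0]["generated_text"])
```

Using the smaller E2B variant here keeps the memory footprint closest to the ~2GB figure quoted above; the same pattern should apply to E4B on machines with more headroom.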
💎 Gemma 3n can understand audio, video, images, text, code, and more - in hundreds of languages. And it's only 4B parameters in size, which means you can even run it locally! 👇 If you're building with Gemma, or have an idea, there's no better way to share it with the world: https://t.co/0xdJl9rifq
Gemma 3n (text-only) is now available on Clarifai! 🎉 Available in two variants: ⚡ E2B for mobile-friendly tasks ⚡ E4B for full performance on larger devices Uses PLE caching, MatFormer, and conditional param loading. 👉 https://t.co/vjmqvVfPkn 👉 https://t.co/IXu2yRNWw1 https://t.co/xw8boBjeFa
Gemma 3n plus @UnslothAI: 1.5x faster, 50% less VRAM, and 4x longer context lengths. https://t.co/gEnH8ZRGaN https://t.co/CXyPTcPpcJ
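For context on the Unsloth claim above, here is a hedged sketch of what a LoRA fine-tuning setup for Gemma 3n with Unsloth typically looks like. The checkpoint name `unsloth/gemma-3n-E4B-it`, the loader class, and the LoRA hyperparameters are assumptions, not the exact configuration behind the quoted speed and VRAM numbers.

```python
# Hedged sketch: LoRA fine-tuning setup for Gemma 3n with Unsloth.
# Checkpoint name and hyperparameters are assumptions, not the config
# behind the 1.5x / 50% VRAM / 4x context figures in the post.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3n-E4B-it",  # assumed Unsloth-hosted checkpoint
    max_seq_length=4096,                   # longer context is one of the quoted gains
    load_in_4bit=True,                     # 4-bit quantization to reduce VRAM use
)

# Attach LoRA adapters so only a small fraction of weights are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# Training itself would then proceed with a standard TRL SFTTrainer loop.
```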