Google DeepMind has officially released Gemma 3n, an open-source multimodal AI model designed for edge devices. Gemma 3n accepts text, image, audio, and video inputs, enabling comprehensive multimodal understanding. The model is optimized to run with as little as 2 GB of RAM, making it suitable for deployment on mobile and other low-memory devices, and it is available across major open-source platforms including Hugging Face, Kaggle, llama.cpp, Together AI, and others. Gemma 3n is notable for being the first model under 10 billion parameters to achieve a score exceeding 1300 on LMArena, a crowdsourced human-preference leaderboard for language models. The model comes in two sizes, E2B and E4B, which contain roughly 5 billion and 8 billion raw parameters respectively but run with a memory footprint closer to that of traditional 2B and 4B models. Key architectural features include the MatFormer (Matryoshka Transformer) architecture with elastic, nested model sizes and Per-Layer Embeddings (PLE) that reduce accelerator memory usage. Gemma 3n supports 140 languages and is designed to run entirely on-device, providing real-time, private AI experiences without reliance on cloud connectivity. This release marks a step forward in mobile-first AI, offering powerful capabilities for developers and users who need efficient, offline multimodal AI solutions.
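For developers who want to try the model from the Hugging Face distribution, the sketch below shows one plausible way to query it through the transformers pipeline API. It is a minimal sketch, assuming a transformers release that includes Gemma 3n support and that the instruction-tuned checkpoint is published under the repo id google/gemma-3n-E2B-it; check the model card for the exact identifier and supported tasks.

```python
# Minimal sketch: query Gemma 3n through the Hugging Face transformers pipeline.
# Assumptions: a recent transformers release with Gemma 3n support, and that the
# instruction-tuned checkpoint is published as "google/gemma-3n-E2B-it"
# (verify the exact repo id and task name on the model card).
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",             # multimodal chat-style task; accepts message lists
    model="google/gemma-3n-E2B-it",   # assumed repo id
    torch_dtype=torch.bfloat16,       # lower-precision weights to keep memory small
    device_map="auto",                # place the model on GPU if available, else CPU
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder image URL
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

out = pipe(text=messages, max_new_tokens=64)
# The pipeline returns the chat history with the assistant's reply appended as the last turn.
print(out[0]["generated_text"][-1]["content"])
```

For fully offline use on low-memory devices, the same checkpoints are also distributed in formats consumable by llama.cpp and other on-device runtimes, as noted above.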
The AI in Snapdragon X Series processors offers many advantages! - no subscription and no usage limits - AI that is always available, even without an internet connection - enhanced security - optimal performance - a better user experience Commercial partnership https://t.co/Q4kWpRIq54
Gemma 3n has some interesting features, almost all geared towards efficient inference. A detailed breakdown of the model mechanics: - The 2B model is actually 5B parameters, but half of them are in the per-layer embeddings. These are look-up tables (just like input embeddings). …
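To make the per-layer-embedding idea concrete, the schematic sketch below shows how a per-layer look-up table can add a token-indexed vector to a transformer layer's hidden states. Because a look-up table is only ever gathered (never multiplied through like a weight matrix), it can sit in cheaper memory and be fetched on demand. This is an illustrative toy under made-up dimensions and module names, not the actual Gemma 3n implementation.

```python
# Schematic illustration of per-layer embeddings (PLE); not the real Gemma 3n code.
# Each layer owns a look-up table indexed by token id, exactly like an input embedding,
# and its output is simply added to that layer's hidden states.
import torch
import torch.nn as nn

class ToyLayerWithPLE(nn.Module):
    def __init__(self, vocab_size: int, hidden_dim: int, ple_dim: int):
        super().__init__()
        self.per_layer_embedding = nn.Embedding(vocab_size, ple_dim)  # the look-up table
        self.ple_proj = nn.Linear(ple_dim, hidden_dim, bias=False)    # map PLE into hidden size
        self.ffn = nn.Sequential(
            nn.Linear(hidden_dim, 4 * hidden_dim), nn.GELU(),
            nn.Linear(4 * hidden_dim, hidden_dim),
        )

    def forward(self, hidden_states: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        # Embedding lookups are pure gathers: only the rows for the current token ids are
        # touched, so the full table could live off-accelerator and be fetched per layer.
        ple = self.per_layer_embedding(token_ids)
        hidden_states = hidden_states + self.ple_proj(ple)
        return hidden_states + self.ffn(hidden_states)

# Toy usage: batch of 2 sequences, 8 tokens each.
layer = ToyLayerWithPLE(vocab_size=32_000, hidden_dim=256, ple_dim=64)
token_ids = torch.randint(0, 32_000, (2, 8))
hidden = torch.randn(2, 8, 256)
print(layer(hidden, token_ids).shape)  # torch.Size([2, 8, 256])
```

The practical point of this design, as the post notes, is that a large share of the raw parameter count lives in these tables rather than in compute-heavy weight matrices, which is why the 5B-parameter E2B model can run in a much smaller accelerator-memory footprint.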
Gemma 3n is Google’s new mobile-first multimodal AI—text, image, audio, even video on 2 GB devices. Edge AI just went premium. #EdgeAI #MobileAI https://t.co/F0WScR52Mg https://t.co/SKDNwAuzA4