Google DeepMind has officially released Gemma 3n, an open-source multimodal AI model designed for edge devices. Gemma 3n accepts text, image, audio, and video inputs, enabling comprehensive multimodal understanding. The model is optimized to run with as little as 2 GB of RAM, making it suitable for deployment on mobile and other low-memory devices, and it is available across major open-source platforms including Hugging Face, Kaggle, llama.cpp, Together AI, and others. Gemma 3n is notable for being the first model under 10 billion parameters to achieve a score exceeding 1300 on LMArena, a crowdsourced human-preference leaderboard for language models. The model comes in two sizes, E2B and E4B, which contain roughly 5 billion and 8 billion raw parameters respectively but run with a memory footprint closer to that of traditional 2B and 4B models. Key architectural features include the MatFormer (Matryoshka Transformer) architecture with elastic, nested model sizes and Per-Layer Embeddings (PLE) that reduce accelerator memory usage. Gemma 3n supports 140 languages and is designed to run entirely on-device, providing real-time, private AI experiences without reliance on cloud connectivity. This release marks a step forward in mobile-first AI, offering powerful capabilities for developers and users who need efficient, offline multimodal AI solutions.
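For developers who want to try the model from the Hugging Face distribution, the sketch below shows one plausible way to query it through the transformers pipeline API. It is a minimal sketch, assuming a transformers release that includes Gemma 3n support and that the instruction-tuned checkpoint is published under the repo id google/gemma-3n-E2B-it; check the model card for the exact identifier and supported tasks.

```python
# Minimal sketch: query Gemma 3n through the Hugging Face transformers pipeline.
# Assumptions: a recent transformers release with Gemma 3n support, and that the
# instruction-tuned checkpoint is published as "google/gemma-3n-E2B-it"
# (verify the exact repo id and task name on the model card).
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",             # multimodal chat-style task; accepts message lists
    model="google/gemma-3n-E2B-it",   # assumed repo id
    torch_dtype=torch.bfloat16,       # lower-precision weights to keep memory small
    device_map="auto",                # place the model on GPU if available, else CPU
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder image URL
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

out = pipe(text=messages, max_new_tokens=64)
# The pipeline returns the chat history with the assistant's reply appended as the last turn.
print(out[0]["generated_text"][-1]["content"])
```

For fully offline use on low-memory devices, the same checkpoints are also distributed in formats consumable by llama.cpp and other on-device runtimes, as noted above.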
The AI in Snapdragon X Series processors offers many advantages! - no subscription and no usage limits - AI that is always available, even without an internet connection - enhanced security - optimal performance - a better user experience Commercial partnership https://t.co/Q4kWpRIq54
Gemma 3n has some interesting features, almost all geared towards efficient inference. A detailed breakdown of the model mechanics: - The 2B model is actually 5B parameters, but half of them are in the per-layer embeddings. These are look-up tables (just like input embeddings). …
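To make the per-layer-embedding idea concrete, the schematic sketch below shows how a per-layer look-up table can add a token-indexed vector to a transformer layer's hidden states. Because a look-up table is only ever gathered (never multiplied through like a weight matrix), it can sit in cheaper memory and be fetched on demand. This is an illustrative toy under made-up dimensions and module names, not the actual Gemma 3n implementation.

```python
# Schematic illustration of per-layer embeddings (PLE); not the real Gemma 3n code.
# Each layer owns a look-up table indexed by token id, exactly like an input embedding,
# and its output is simply added to that layer's hidden states.
import torch
import torch.nn as nn

class ToyLayerWithPLE(nn.Module):
    def __init__(self, vocab_size: int, hidden_dim: int, ple_dim: int):
        super().__init__()
        self.per_layer_embedding = nn.Embedding(vocab_size, ple_dim)  # the look-up table
        self.ple_proj = nn.Linear(ple_dim, hidden_dim, bias=False)    # map PLE into hidden size
        self.ffn = nn.Sequential(
            nn.Linear(hidden_dim, 4 * hidden_dim), nn.GELU(),
            nn.Linear(4 * hidden_dim, hidden_dim),
        )

    def forward(self, hidden_states: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        # Embedding lookups are pure gathers: only the rows for the current token ids are
        # touched, so the full table could live off-accelerator and be fetched per layer.
        ple = self.per_layer_embedding(token_ids)
        hidden_states = hidden_states + self.ple_proj(ple)
        return hidden_states + self.ffn(hidden_states)

# Toy usage: batch of 2 sequences, 8 tokens each.
layer = ToyLayerWithPLE(vocab_size=32_000, hidden_dim=256, ple_dim=64)
token_ids = torch.randint(0, 32_000, (2, 8))
hidden = torch.randn(2, 8, 256)
print(layer(hidden, token_ids).shape)  # torch.Size([2, 8, 256])
```

The practical point of this design, as the post notes, is that a large share of the raw parameter count lives in these tables rather than in compute-heavy weight matrices, which is why the 5B-parameter E2B model can run in a much smaller accelerator-memory footprint.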
Gemma 3n is Google’s new mobile-first multimodal AI—text, image, audio, even video on 2 GB devices. Edge AI just went premium. #EdgeAI #MobileAI https://t.co/F0WScR52Mg https://t.co/SKDNwAuzA4