Google DeepMind has officially released Gemma 3n, a compact and powerful multimodal AI model designed for edge devices. Gemma 3n processes text, image, audio, and video inputs and can run with as little as 2GB of RAM, making it suitable for mobile and on-device applications. The model comes in two variants, E2B and E4B, whose raw parameter counts are roughly 5B and 8B but which run with memory footprints comparable to traditional 2B and 4B models. The E4B variant is the first model under 10 billion parameters to surpass a score of 1300 on the LMArena benchmark. Gemma 3n supports text in 140 languages and multimodal understanding across 35 languages. The architecture incorporates innovations such as MatFormer (Matryoshka Transformer) and Per-Layer Embeddings (PLE) to optimize performance and memory usage. The model is openly available through major platforms including Hugging Face Transformers, llama.cpp, Together AI, and others, with free fine-tuning options available via Colab notebooks. Developers are encouraged to build applications on Gemma 3n, including through a Kaggle Impact Challenge focused on projects in accessibility, education, and health. The model’s design emphasizes privacy and offline functionality by enabling full multimodal AI processing without cloud dependence.
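For developers taking the Hugging Face Transformers route mentioned above, a minimal text-generation sketch might look like the following. The checkpoint name `google/gemma-3n-E2B-it`, the `text-generation` pipeline task, and the `device_map` setting are assumptions; check the official model card for the exact identifiers before running.

```python
# Minimal sketch: loading a Gemma 3n checkpoint for text generation
# via Hugging Face Transformers. Model ID and pipeline task are assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3n-E2B-it",  # assumed E2B instruction-tuned checkpoint
    device_map="auto",               # place weights on GPU if one is available
)

prompt = "Explain in one sentence why on-device inference helps privacy."
output = generator(prompt, max_new_tokens=64)
print(output[0]["generated_text"])
```

Using the smaller E2B variant here keeps the memory footprint closest to the ~2GB figure quoted above; the same pattern should apply to E4B on machines with more headroom.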
💎 Gemma 3n can understand audio, video, images, text, code, and more - in hundreds of languages. And it's only 4B parameters in size, which means you can even run it locally! 👇 If you're building with Gemma, or have an idea, there's no better way to share it with the world: https://t.co/0xdJl9rifq
Gemma 3n (text-only) is now available on Clarifai! 🎉 Available in two variants: ⚡ E2B for mobile-friendly tasks ⚡ E4B for full performance on larger devices Uses PLE caching, MatFormer, and conditional param loading. 👉 https://t.co/vjmqvVfPkn 👉 https://t.co/IXu2yRNWw1 https://t.co/xw8boBjeFa
Gemma 3n plus @UnslothAI: 1.5x faster, 50% less VRAM, and 4x longer context lengths. https://t.co/gEnH8ZRGaN https://t.co/CXyPTcPpcJ
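For context on the Unsloth claim above, here is a hedged sketch of what a LoRA fine-tuning setup for Gemma 3n with Unsloth typically looks like. The checkpoint name `unsloth/gemma-3n-E4B-it`, the loader class, and the LoRA hyperparameters are assumptions, not the exact configuration behind the quoted speed and VRAM numbers.

```python
# Hedged sketch: LoRA fine-tuning setup for Gemma 3n with Unsloth.
# Checkpoint name and hyperparameters are assumptions, not the config
# behind the 1.5x / 50% VRAM / 4x context figures in the post.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3n-E4B-it",  # assumed Unsloth-hosted checkpoint
    max_seq_length=4096,                   # longer context is one of the quoted gains
    load_in_4bit=True,                     # 4-bit quantization to reduce VRAM use
)

# Attach LoRA adapters so only a small fraction of weights are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# Training itself would then proceed with a standard TRL SFTTrainer loop.
```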