Google has launched Gemma 3n, a multimodal model designed for mobile and edge devices. Gemma 3n can process text, images, audio, and video, and operates efficiently on devices with as little as 2GB of RAM. The model reduces RAM usage by nearly 3x, responds roughly 1.5x faster on mobile, and is available in early preview; it will be integrated into the Android and Chrome platforms.

Sarvam AI, an Indian startup, has released Sarvam-M, a 24-billion-parameter open-weights hybrid language model built on Mistral Small. Sarvam-M performs strongly on math, programming, and Indian-language tasks, improving by over 86% on the romanised Indian-language GSM-8K benchmark. It outperforms Llama-4 Scout, is comparable to larger models such as Llama-3.3 70B, and is accessible via API and available for download.

Nvidia has introduced AceReason-Nemotron, a family of reasoning models built on the DeepSeek-R1-Distill-Qwen models and available in 7B and 14B variants. The models advance math and code reasoning through reinforcement learning applied in two stages, first on math-only prompts and then on code-only prompts, yielding substantial improvements in benchmark accuracy (a minimal sketch of this staged recipe follows at the end of this digest).

ByteDance has released BAGEL, a unified multimodal AI model that matches GPT-4o and Gemini 2.0 capabilities with only 7 billion parameters. The fully open-source model supports multiple modalities and includes a 'Thinking mode' for advanced reasoning.

Additional research includes Meta-PerSER, a personalized speech emotion recognition framework using meta-learning, and Adaptive Cognition Policy Optimization (ACPO), a reinforcement learning approach for efficient large language model reasoning.
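The math-then-code RL recipe described above is simple enough to show in outline. Below is a minimal, hypothetical sketch of such a staged curriculum: `rl_update`, the reward stubs, and all other names are placeholders standing in for a real policy-gradient step (e.g., PPO or GRPO) and real verifiers, not Nvidia's actual pipeline.

```python
import random
from typing import Callable, List

def sample(prompts: List[str], k: int = 4) -> List[str]:
    """Draw a small batch of prompts (placeholder for the real sampler)."""
    return random.sample(prompts, k=min(k, len(prompts)))

def rl_update(model: dict, batch: List[str], reward_fn: Callable[[str], float]) -> None:
    """Stand-in for one policy-gradient step; here it only tracks reward."""
    rewards = [reward_fn(p) for p in batch]
    model["steps"] += 1
    model["avg_reward"] = sum(rewards) / len(rewards)

# Verifiable-reward stubs: exact-answer checking for math, unit tests for code.
def verify_math_answer(prompt: str) -> float:
    return 1.0  # 1.0 if the sampled solution's final answer matches, else 0.0

def run_unit_tests(prompt: str) -> float:
    return 1.0  # fraction of hidden unit tests the sampled program passes

def staged_rl_curriculum(model, math_prompts, code_prompts,
                         math_steps=1000, code_steps=1000):
    # Stage 1: reinforcement learning on math-only prompts.
    for _ in range(math_steps):
        rl_update(model, sample(math_prompts), reward_fn=verify_math_answer)
    # Stage 2: continue RL on code-only prompts.
    for _ in range(code_steps):
        rl_update(model, sample(code_prompts), reward_fn=run_unit_tests)
    return model

model = {"steps": 0, "avg_reward": 0.0}
staged_rl_curriculum(model,
                     math_prompts=["Compute 17 * 24.", "Solve x + 3 = 10."],
                     code_prompts=["Write a function that reverses a string."],
                     math_steps=3, code_steps=3)
```

Keeping the two reward signals in separate sequential stages, rather than mixing math and code prompts in one run, is the distinctive design choice the digest item describes.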
LLMs converge to a universal embedding space. This space is a geometric representation of the concept space used by human brains, as manifested in human language, i.e., the pretraining data: "a universal semantic structure conjectured by the Platonic Representation Hypothesis." https://t.co/0184v2gCUl
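One crude way to probe this claim (not the linked paper's method) is to check whether two independently trained embedding models assign the same relative geometry to the same sentences: if their pairwise-similarity structures correlate strongly, the spaces are at least loosely aligned. This sketch assumes the `sentence-transformers` and `scipy` packages; the two model names are just example choices.

```python
import numpy as np
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

sentences = [
    "The cat sat on the mat.",
    "A dog chased the ball.",
    "Stock markets fell sharply today.",
    "The theorem follows by induction.",
    "She cooked pasta for dinner.",
]

def similarity_matrix(model_name: str) -> np.ndarray:
    """Encode the sentences and return their cosine-similarity matrix."""
    emb = SentenceTransformer(model_name).encode(sentences, normalize_embeddings=True)
    return emb @ emb.T

# Two independently trained embedding models.
sim_a = similarity_matrix("all-MiniLM-L6-v2")
sim_b = similarity_matrix("all-mpnet-base-v2")

# Compare only the off-diagonal structure: do both models agree on
# which sentence pairs are similar and which are not?
iu = np.triu_indices(len(sentences), k=1)
rho, _ = spearmanr(sim_a[iu], sim_b[iu])
print(f"Spearman correlation of pairwise similarities: {rho:.3f}")
```

A high correlation only shows that both models rank sentence pairs similarly, which is a much weaker statement than the hypothesis itself, but it is the cheapest sanity check.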
I spent a few hours today trying to reverse engineer Google's new Gemma 3n model, which was published to HuggingFace as a compiled binary. I wanted to figure out how exactly they cram a model supposedly within striking distance of Claude 3.7 Sonnet on LMArena into 2GB of RAM.
Large language models that perform explicit reasoning during inference degrade Vision-Language Navigation accuracy. The paper proposes Aux-Think, which uses Chain-of-Thought supervision during training so models internalize reasoning patterns, then predicts actions directly at inference. https://t.co/Ev2EjOzf6k
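A minimal PyTorch sketch of this train-time-only reasoning idea (my reading of the abstract, not the authors' code): an auxiliary head is supervised on Chain-of-Thought tokens during training while the action head is supervised on actions; at inference only the action head runs. All module names and dimensions below are hypothetical, and the single-token CoT head is a stand-in for a full sequence decoder.

```python
import torch
import torch.nn as nn

class AuxThinkPolicy(nn.Module):
    """Shared encoder with two heads: an action head (always used) and an
    auxiliary CoT head that provides training-time supervision only."""

    def __init__(self, obs_dim=512, hidden=256, n_actions=6, vocab=32000):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.action_head = nn.Linear(hidden, n_actions)  # used at inference
        self.cot_head = nn.Linear(hidden, vocab)         # training-only auxiliary

    def forward(self, obs):
        h = self.encoder(obs)
        return self.action_head(h), self.cot_head(h)

    @torch.no_grad()
    def act(self, obs):
        # Inference path: predict the action directly, no reasoning tokens.
        return self.action_head(self.encoder(obs)).argmax(dim=-1)

def training_step(model, obs, action_target, cot_target, aux_weight=0.5):
    action_logits, cot_logits = model(obs)
    loss_action = nn.functional.cross_entropy(action_logits, action_target)
    loss_cot = nn.functional.cross_entropy(cot_logits, cot_target)  # CoT supervision
    return loss_action + aux_weight * loss_cot

# Dummy batch to show the shapes.
model = AuxThinkPolicy()
obs = torch.randn(8, 512)
loss = training_step(model, obs,
                     action_target=torch.randint(0, 6, (8,)),
                     cot_target=torch.randint(0, 32000, (8,)))
loss.backward()
print(model.act(obs))  # inference touches only the encoder and action head
```

The point of the split is that the reasoning signal shapes the shared encoder during training, while inference pays none of the latency or error-accumulation cost of generating reasoning tokens.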