Hugging Face has released SmolLM3, a 3-billion-parameter multilingual language model that supports long-context processing up to 128,000 tokens and features dual-mode reasoning capabilities. Trained on 11.2 trillion tokens, SmolLM3 matches the performance of larger 4-billion-parameter models such as Google's Gemma3 and surpasses models like Llama-3.2-3B and Qwen2.5-3B. The model supports six languages and is designed for efficient inference, demonstrated by its fast performance on Apple's M4 Max chip. Hugging Face has also open-sourced the training methodology, which relies on public datasets and frameworks. Additionally, SmolLM3 has day-zero support in the mlx-lm library, including performance improvements and support for LoRA fine-tuning with low learning rates. The release is part of a broader update to mlx-lm, which now includes several new models from Baidu, Microsoft, TII, Google, OpenBMB, and Apple.
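The day-zero mlx-lm support means the model can be run locally on Apple Silicon with a few lines of Python. Below is a minimal sketch, assuming the HuggingFaceTB/SmolLM3-3B checkpoint and a recent mlx-lm release; the prompt text is only an illustration.

```python
from mlx_lm import load, generate

# Download and load the 3B checkpoint from the Hugging Face Hub.
model, tokenizer = load("HuggingFaceTB/SmolLM3-3B")

# Build a chat-formatted prompt using the model's own chat template.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Give me a two-sentence summary of LoRA."}],
    tokenize=False,
    add_generation_prompt=True,
)

# Generate up to 256 new tokens and print the result.
print(generate(model, tokenizer, prompt=prompt, max_tokens=256))
```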
SmolLM3 LoRA fine-tuning on MLX unlocked! Use a small LR!

```yaml
lr_schedule:
  name: cosine_decay
  warmup: 0                     # 0 for no warmup
  warmup_init: 2e-5             # 0 if not specified
  arguments: [2e-5, 500, 1e-5]  # passed to the scheduler
```

Learning after a few iters! These new small models https://t.co/S1NTN4fSsq https://t.co/VaKsTgq9Fo
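For context, that schedule fragment slots into an mlx-lm LoRA training config. The following is a hypothetical sketch of a complete YAML file: only the lr_schedule block comes from the tweet above, while the model path, data directory, layer count, and iteration count are placeholder assumptions.

```yaml
# lora_config.yaml -- hypothetical sketch; only lr_schedule is taken from the
# tweet above, all other values are placeholders to adjust for your setup.
model: "HuggingFaceTB/SmolLM3-3B"
train: true
data: "path/to/data"        # directory containing train.jsonl / valid.jsonl
fine_tune_type: lora
num_layers: 16              # apply LoRA adapters to the last 16 layers
batch_size: 2
iters: 500
learning_rate: 2e-5         # the "small LR" the tweet recommends

lr_schedule:
  name: cosine_decay
  warmup: 0                     # 0 for no warmup
  warmup_init: 2e-5             # 0 if not specified
  arguments: [2e-5, 500, 1e-5]  # decay from 2e-5 toward 1e-5 over 500 steps
```

Such a file would be passed to the trainer with something like `mlx_lm.lora --config lora_config.yaml`; check the mlx-lm LoRA documentation for the exact option names supported by your version.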
With this approach, a 14B-parameter model holds >76% accuracy even on inputs that balloon to 3.5M tokens, all while costing only O(N) in compute. 🤯 LLMs usually freeze or slow down as soon as a prompt spills past their context window. MemAgent turns that long prompt into https://t.co/4EKWbCQVE2
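The O(N) claim follows from how such a memory agent reads: the model only ever attends to one fixed-size chunk plus a bounded memory, so each step costs roughly the same and the number of steps grows linearly with input length. The following is a rough, hypothetical Python sketch of that reading loop; the `chat` callable, prompts, and chunk size are all placeholders, not MemAgent's actual implementation.

```python
def answer_long_input(chat, question: str, tokens: list[str],
                      chunk_size: int = 4096) -> str:
    """Hypothetical MemAgent-style loop: read a long input chunk by chunk,
    keeping only a bounded natural-language memory between steps."""
    memory = ""  # fixed-budget memory the model overwrites at each step
    for start in range(0, len(tokens), chunk_size):
        chunk = " ".join(tokens[start:start + chunk_size])
        # Each call sees O(chunk_size + |memory|) tokens, independent of the
        # total length N, so N / chunk_size calls give O(N) total compute.
        memory = chat(
            f"Question: {question}\n"
            f"Current memory: {memory}\n"
            f"New evidence: {chunk}\n"
            "Rewrite the memory, keeping only what is needed to answer."
        )
    return chat(f"Question: {question}\nMemory: {memory}\nAnswer the question.")
```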
Hugging Face releases SmolLM3, a 3B LLM that outperforms Llama-3.2-3B and Qwen2.5-3B while matching the larger 4B model Gemma3. In addition to open-sourcing the model itself, they open-sourced the method for training it with public datasets and training frameworks:
- 3B model trained on 11T tokens
- Instruct model with dual-mode reasoning
- Multilingual, supporting 6 languages
- Context length up to 128K
https://t.co/lYCaGEcJjT
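The dual-mode reasoning mentioned above is toggled through the chat template. Here is a hedged sketch with mlx-lm, assuming the `/think` and `/no_think` system-prompt flags SmolLM3 uses to switch extended reasoning on and off; the question is purely illustrative.

```python
from mlx_lm import load, generate

model, tokenizer = load("HuggingFaceTB/SmolLM3-3B")

def ask(question: str, think: bool) -> str:
    messages = [
        # "/think" requests an extended reasoning trace, "/no_think" skips it.
        {"role": "system", "content": "/think" if think else "/no_think"},
        {"role": "user", "content": question},
    ]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    return generate(model, tokenizer, prompt=prompt, max_tokens=512)

# Same question, with and without the extended reasoning mode.
print(ask("How many 0.25 L glasses fill a 2 L bottle?", think=True))
print(ask("How many 0.25 L glasses fill a 2 L bottle?", think=False))
```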