
Hugging Face has introduced SmolLM, a family of small language models with 135 million, 360 million, and 1.7 billion parameters; the 360M model is presented as state-of-the-art in the sub-500-million-parameter range. All sizes have instruction-tuned variants and are licensed under Apache 2.0. The models are optimized for on-device inference, including real-time, in-browser text generation via the Instant SmolLM demo with WebGPU support, allowing efficient performance even on older devices. The 360M model in particular is reported to generate around 100 tokens per second on older MacBook Pro models using llama.cpp, and a Q8-quantized build of SmolLM 360M is available as an MLC checkpoint. The models' quality is attributed to curated pretraining data emphasizing educational content, well-tuned hyperparameters, and simple conversations added to the fine-tuning mix, improving their conversational abilities.
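Back-of-the-envelope math helps explain why a Q8 build of a 360M-parameter model can run in a browser tab: Q8 quantization stores roughly one byte per weight, halving the footprint of FP16. The sketch below is illustrative only (it counts weight storage and ignores activations and the KV cache), and the helper name is ours, not from any SmolLM tooling.

```python
# Approximate weight-storage footprint of SmolLM 360M at different precisions.
# Illustrative math only; real runtime memory also includes activations and KV cache.

PARAMS = 360_000_000  # SmolLM 360M parameter count

def footprint_mb(params: int, bytes_per_weight: float) -> float:
    """Weight storage in megabytes for a given precision (bytes per weight)."""
    return params * bytes_per_weight / 1e6

fp16_mb = footprint_mb(PARAMS, 2)  # FP16: 2 bytes per weight
q8_mb = footprint_mb(PARAMS, 1)    # Q8: ~1 byte per weight

print(f"FP16: ~{fp16_mb:.0f} MB, Q8: ~{q8_mb:.0f} MB")  # ~720 MB vs ~360 MB
```

Halving the weight footprint is what makes shipping the model over the network and fitting it into browser/WebGPU memory budgets practical on older hardware.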




🚀 New model alert! Introducing SmolLM 1.7b-instruct! A small but mighty language model. Install it in LocalAI with `local-ai run smollm` and get ready to chat! #LocalAI #SmolLM #LLM #AI #NLP 🚀🔥🤖
New SmolLM model with MLC checkpoint available, running across WebGPU, mobile and more https://t.co/Ag9pdsY9qZ
It's beautiful to see how far you can get with a 360M model (5x smaller than GPT-2!) with a few tricks:
- curate the pretraining data for educational content
- choose well tuned hyperparameters
- and latest: add simple conversations to the SFT mix
Result: 🤏SmolLM https://t.co/wQkH7ZPArr