Aug 18, 08:47 PM

Nvidia Unveils 9-Billion-Parameter Nemotron Nano Model With 6× Speed Boost

Nvidia has introduced Nemotron-Nano-9B-v2, a 9-billion-parameter open-source language model that blends Mamba state-space layers with transformer blocks. The company says the hybrid design delivers up to six times the throughput of comparably sized transformer models while fitting on a single Nvidia A10 GPU after pruning an earlier 12-billion-parameter version. On internal tests the multilingual model scored 72.1 percent on AIME25, 97.8 percent on MATH500, 64.0 percent on GPQA and 71.1 percent on LiveCodeBench, outperforming the open-source Qwen3-8B on most reasoning benchmarks and topping the Artificial Analysis open-model leaderboard. Developers can toggle chain-of-thought traces on or off with simple tokens and cap the model’s “thinking budget” to trade accuracy for latency. Nemotron-Nano-9B-v2 is available immediately on Hugging Face and Nvidia’s model catalog under the Nvidia Open Model License, which permits free commercial deployment and derivative works provided users keep safety guardrails and attribution. Nvidia also released about three million vision-language training samples and the broader pre-training corpus to spur community adoption. The launch adds to a string of efficiency-focused AI releases from Nvidia. Earlier in the day the company said more than two million developers now build on its robotics software stack, underscoring demand for compact models that can run on edge devices as well as in the data center.

#Nvidia #Mamba #Nvidia A10 #GPQA #LiveCodeBench #Artificial Analysis #Hugging Face #Nvidia Open Model License

Written with ChatGPT .