In an era where inference speed yields more reinforcement learning, which yields more AI, we feel hybrid SSM-transformer models have some compelling advantages. Nemotron-H-47B-Reasoning-128k is a bit more accurate than Llama-Nemotron-Super-49B-1.0 across all benchmarks, but it https://t.co/ZSVFMs18Kz
Transformers still dominate the LLM scene, but we show that higher-throughput alternatives exist that are just as strong! Grateful to have played a part in the Nemotron-H Reasoning effort. 🙏 Technical report will be out soon, stay tuned! https://t.co/FWncfFGYkH
👀 Nemotron-H tackles large-scale reasoning while maintaining speed -- with 4x the throughput of comparable transformer models. ⚡ See how #NVIDIAResearch accomplished this using a hybrid Mamba-Transformer architecture and model fine-tuning ➡️ https://t.co/AuHYANG9gX https://t.co/9uUwiB8ejp
Nvidia recently released Nemotron-H, a hybrid Mamba-Transformer model family designed for large-scale reasoning with four times the throughput of comparable transformer models. The Nemotron-H-47B-Reasoning-128k variant demonstrates slightly higher accuracy than Llama-Nemotron-Super-49B-1.0 across various benchmarks.

Meanwhile, the OpenThinker team launched OpenThinker3, a 7-billion-parameter (7B) model trained solely with supervised fine-tuning (SFT) on 1.2 million reasoning traces covering math, coding, and science. OpenThinker3 reportedly outperforms all open 7B and 8B models, including those trained with reinforcement learning, and surpasses Nvidia's Nemotron as well as GPT-4.1 on reasoning tasks. The model is available for local deployment via Hugging Face and LocalAI.

Additionally, LocalAI announced the availability of Ultravox, a multimodal speech large language model (LLM) based on Llama 3.2, along with a variant combining Llama 3.1 with Whisper for speech and text input. Another new model, nbeerbower_qwen3-gutenberg-encore-14B, a Qwen3 model fine-tuned for text generation, was also introduced on LocalAI. These developments highlight ongoing advances in open-source and hybrid AI models focused on reasoning capability and inference speed.
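The hybrid recipe the tweets refer to is straightforward to picture: most layers are state-space (Mamba-style) blocks that carry a fixed-size recurrent state, with a handful of self-attention layers interleaved for exact token-to-token mixing. The sketch below is illustrative only, not Nvidia's implementation; the layer ratio, module names, and the stub SSM internals are assumptions, and a real model would use a selective state-space layer (e.g. from the mamba_ssm package) in place of the stand-in.

```python
# Illustrative hybrid SSM-transformer stack (NOT Nvidia's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StubSSMBlock(nn.Module):
    """Stand-in for a Mamba-style block: gated causal depthwise conv.
    A real selective SSM keeps O(1) state per token at inference time,
    which is where the throughput advantage over attention comes from."""
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        # Pad-and-truncate trick makes the depthwise conv causal.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=4,
                              padding=3, groups=d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):                      # x: (batch, seq, d_model)
        h, gate = self.in_proj(self.norm(x)).chunk(2, dim=-1)
        h = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return x + self.out_proj(F.silu(gate) * h)

class AttentionBlock(nn.Module):
    """Standard causal self-attention, kept for the few layers where
    exact token-to-token lookups matter (e.g. long-context retrieval)."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        mask = torch.triu(torch.ones(x.size(1), x.size(1), dtype=torch.bool,
                                     device=x.device), diagonal=1)
        out, _ = self.attn(h, h, h, attn_mask=mask)
        return x + out

class HybridStack(nn.Module):
    """Mostly SSM blocks with sparse attention layers; the 1-in-6 ratio
    here is an assumption chosen purely for illustration."""
    def __init__(self, d_model: int = 256, n_layers: int = 12,
                 attn_every: int = 6):
        super().__init__()
        self.layers = nn.ModuleList(
            AttentionBlock(d_model) if (i + 1) % attn_every == 0
            else StubSSMBlock(d_model)
            for i in range(n_layers)
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(2, 128, 256)
print(HybridStack()(x).shape)   # torch.Size([2, 128, 256])
```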
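Since the summary mentions local deployment via Hugging Face, a minimal sketch of loading OpenThinker3 with the transformers library follows. The repo id `open-thoughts/OpenThinker3-7B` and the generation settings are assumptions here; check the model card for the exact name, chat template, and recommended sampling parameters.

```python
# Hedged example: running OpenThinker3 locally with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker3-7B"   # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"  # device_map needs accelerate
)

messages = [{"role": "user",
             "content": "If 3x + 7 = 22, what is x? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models emit long chains of thought, so leave generous headroom.
output = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```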