ZyphraAI has released the Zamba2-7B model, a state-of-the-art small language model that surpasses dense transformers in both inference speed and training cost efficiency. The Zamba2-7B, part of the Zamba2 series, offers state-of-the-art performance and unparalleled inference efficiency, making it the best LLM in the ≤8B range. The model, developed in collaboration with NVIDIA, outperforms other notable models such as Llama 3.2 11b, Mistral-7B, Llama 3.1 8B, and Google Gemma 7B. This hybrid SSM-attention architecture model challenges the traditional transformers architecture, setting a new benchmark in the AI and machine learning industry, outperforming models from AIatMeta, Gemma 2, and MistralAI.
NVIDIA just quietly released a fine-tuned version of Llama 3.1 70B that outperforms GPT4o and Sonnet 3.5 on multiple benchmarks. This model was trained using RLHF (specifically, REINFORCE), Llama-3.1-Nemotron-70B-Reward and HelpSteer2-Preference prompts on a… https://t.co/3J2soEPEsT
Nvidia’s Nemotron 70B is destroying GPT4o & Sonnet 3.5 on Arena Hard, MT-Bench, AlpacaEval 2 🔥 Arena Hard Nemotron: 85.0 3.5: 79.2 4o: 79.3 AlpacaEval 2: Nemotron: 57.6 3.5: 52.4 4o: 57.5 MT Bench Nemotron: 8.98 3.5: 8.81 4o: 8.74 LFG!! 🔥🔥 https://t.co/D0yiIO1Lqd
We're soo back!: Nvidia Nemotron 70B - beats Llama 3.1 405B, GPT4o & Claude 3.5 Sonnet! 🔥 Evals (Nemotron 70B vs Claude 3.5 vs GPT4o) > Arena Hard - 85.0 vs 79.2 vs 79.3 > AlpacaEval 2 LC - 57.6 vs 52.4 vs 57.5 > MT Bench - 8.98 vs 8.81 vs 8.74 Secret Sauce? RLHF (REINFORCE)… https://t.co/yaiDErBUFR