The Allen Institute for AI (AI2) has released Tülu 3 405B, an open-weight, post-trained model with 405 billion parameters that reportedly surpasses DeepSeek V3 and OpenAI's GPT-4o on several key benchmarks. The model applies Reinforcement Learning from Verifiable Rewards (RLVR) and demonstrates that this post-training approach scales effectively to this size. Built on the Llama 3.1 architecture, Tülu 3 405B is positioned as the final model in the Tülu 3 family. The release is part of a broader trend in the AI community toward open-weight models and advances in post-training techniques.