PrimeIntellect has released INTELLECT-2, a 32-billion-parameter reasoning model and the first of its scale trained using globally distributed asynchronous reinforcement learning. Built on the Qwen/QwQ-32B base model, it was post-trained with Group Relative Policy Optimization (GRPO). INTELLECT-2 is optimized for mathematical reasoning and coding tasks, outperforming its base model on benchmarks such as AIME24, LiveCodeBench, and GPQA-Diamond. The model is open source under the Apache 2.0 license and compatible with popular inference frameworks, including llama.cpp and vLLM. The release marks a milestone for decentralized AI training, leveraging community-contributed GPUs for large-scale model development. Separately, NVIDIA has highlighted advances in multi-data-center large language model training, achieving over 96% scaling efficiency across global GPU clusters using its NeMo and Megatron-Core technologies.
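The GRPO step is worth unpacking: unlike PPO, it trains no separate value network. For each prompt, the trainer samples a group of completions, scores each one (e.g. with a verifiable math or code reward), and takes a completion's advantage to be its reward normalized against the group's mean and standard deviation. Below is a minimal sketch of that advantage computation; it illustrates the general GRPO recipe, not PrimeIntellect's actual training code, and the example rewards are hypothetical.

```python
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO advantage: normalize each completion's reward against the
    statistics of its own sampling group, replacing the learned
    value-function baseline that PPO would use."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against uniform groups
    return [(r - mean) / std for r in rewards]

# Hypothetical rewards for one prompt's group of four sampled completions
# (1.0 = verified-correct answer, 0.0 = incorrect).
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))  # [1.0, -1.0, -1.0, 1.0]
```

On the inference side, the weights load like any other Qwen-family checkpoint; a hedged sketch using vLLM, assuming the public Hugging Face repo id PrimeIntellect/INTELLECT-2:

```python
from vllm import LLM, SamplingParams

# Repo id assumed from the public release; at bf16 a 32B model needs
# roughly 64 GB of GPU memory, so a quantized variant may be preferable.
llm = LLM(model="PrimeIntellect/INTELLECT-2")
params = SamplingParams(temperature=0.6, max_tokens=2048)
outputs = llm.generate(["Prove that the sum of two odd integers is even."], params)
print(outputs[0].outputs[0].text)
```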
PrimeIntellect Launches INTELLECT-2: A 32B Decentralized Reasoning Model #DecentralizedAI #INTELLECT2 #AITraining #OpenSourceAI #InnovationInAI https://t.co/57T2OpZ9bO https://t.co/mzB6fphx4k
Open-sourcing novel infrastructure to train a 32-billion-parameter model using community-contributed GPUs is a major milestone for DeAI. Congrats to our friends at Prime Intellect. https://t.co/tNt7Vutemp
PrimeIntellect Releases INTELLECT-2: A 32B Reasoning Model Trained via Distributed Asynchronous Reinforcement Learning PrimeIntellect has released INTELLECT-2, a 32-billion parameter reasoning model post-trained using Group Relative Policy Optimization (GRPO) within a https://t.co/Cehw6VLNNx