[CL] DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition Z.Z. Ren, Z. Shao, J. Song, H. Xin... [DeepSeek-AI] (2025) https://t.co/3BlWUrkR5R https://t.co/gsHfgVesGA
Open source is literally killing it. 5 models released just this week: 1. Microsoft released three Phi-4 reasoning models. The new 14B model outperforms OpenAI o1-mini and matches models 5x larger on math and science tasks while running on your laptop. Download from Hugging Face. https://t.co/OZYN4WJYJw
DeepSeek-Prover-V2: A New Era in Automated Mathematical Reasoning https://t.co/0Ddd8HU1aF
Chinese AI start-up DeepSeek has released DeepSeek-Prover-V2, an advanced domain-specific AI model designed for automated mathematical reasoning. The model reportedly solves approximately 90% of miniF2F problems, surpasses state-of-the-art performance on PutnamBench, and successfully addresses formal problems from AIME 24 and 25. DeepSeek-Prover-V2 was quietly uploaded to the Hugging Face platform, contributing to the open-source AI ecosystem.

Concurrently, Microsoft has launched a series of small language models focused on reasoning: Phi-4-mini-reasoning, Phi-4-reasoning, and Phi-4-reasoning-plus. These models combine data curation, supervised fine-tuning, and targeted reinforcement learning to achieve strong reasoning capabilities. Microsoft's new 14-billion-parameter Phi-4-reasoning model reportedly outperforms OpenAI's o1-mini and matches the performance of models five times larger on math and science tasks, while being efficient enough to run on laptops and mobile devices.

Both DeepSeek's and Microsoft's releases highlight advances in AI-driven mathematical and logical reasoning, with reinforcement learning playing a key role in improving model performance.
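For readers unfamiliar with what a "formal problem" looks like, the subgoal-decomposition idea in the paper's title can be sketched in Lean 4, the proof language used by benchmarks such as miniF2F and PutnamBench: a theorem's proof is split into intermediate `have` subgoals, each of which can be attacked independently. The toy theorem below is an illustration of that pattern only, not an example drawn from the DeepSeek-Prover-V2 release.

```lean
-- Toy illustration of subgoal decomposition in Lean 4 (core tactics only).
-- The main goal is broken into two `have` subgoals, then chained together.
theorem demo (a b : Nat) (h : a = b) : a + a = b + b := by
  -- Subgoal 1: rewrite the left summand using the hypothesis.
  have h1 : a + a = b + a := by rw [h]
  -- Subgoal 2: rewrite the right summand the same way.
  have h2 : b + a = b + b := by rw [h]
  -- Chain the two subgoal proofs to close the main goal.
  exact h1.trans h2
```

In the paper's pipeline, a general-purpose model proposes such a decomposition (with subgoals left open) and a prover model is trained, via reinforcement learning, to fill in each piece.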