Microsoft has announced the release of its new Phi-3 series of language models, featuring advancements in AI capabilities. The Phi-3-mini, a 3.8 billion parameter model trained on 3.3 trillion tokens, is designed to rival much larger models such as Mixtral 8x7B and GPT-3.5. The Phi-3-medium, with 14 billion parameters trained on 4.8 trillion tokens, achieves 78% on the MMLU benchmark and 8.9 on MT-bench, outperforming Llama-3 8B, GPT-3.5, and the Mixtral 8x7B MoE on most benchmarks; even the Phi-3-mini beats Llama-3 8B on MMLU and HellaSwag. The series also includes the 7 billion parameter Phi-3-small, which scores 75.3 on MMLU, likewise surpassing Llama-3 8B. These models are part of Microsoft's contribution to the open-source community and share a similar architecture with Llama-2.
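The announcement itself contains no usage code; the following is a minimal sketch of how a Phi-3 checkpoint might be loaded and queried with Hugging Face transformers. The repo ID `microsoft/Phi-3-mini-4k-instruct`, the bf16 dtype, and the chat-style prompt are assumptions for illustration, not details from the release.

```python
# Sketch (assumed setup, not from the announcement): load phi-3-mini via
# Hugging Face transformers and run a single chat-style generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed Hugging Face repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 3.8B parameters fit on a single GPU in bf16
    device_map="auto",           # requires the `accelerate` package
    trust_remote_code=True,      # Phi-3 ships custom modeling code
)

# Build the prompt with the tokenizer's chat template.
messages = [
    {"role": "user", "content": "Explain the MMLU benchmark in one sentence."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```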
Small Models For The Win!! Phi-3 7B just dropped and beats Llama-3 8B handily. With an MMLU of 75.3, it's coming close to 70B SOTA models!! 🤯 I wouldn't be surprised if we ended up with a 7B model that beats GPT-4 by the end of the year. LLMs are now pretty universally… https://t.co/2BzjDr03xL
Microsoft just released Phi-3. Phi-3 14B beats Llama-3 8B, GPT-3.5, and Mixtral 8x7B MoE in most of the benchmarks. Even the Phi-3 mini beats Llama-3 8B in MMLU and HellaSwag. https://t.co/P3GzWxfJaR
Microsoft just released the technical report of phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens. It rivals models such as Mixtral 8x7B and GPT-3.5. Link to the paper in 🧵 https://t.co/EcB7SANkRJ