A recent Microsoft paper reports estimated sizes for several large language models (LLMs), putting GPT-4o-mini at roughly 8 billion parameters. Other estimates in the paper include Claude 3.5 Sonnet at about 175 billion parameters, GPT-4 at about 1.76 trillion, GPT-4o at about 200 billion, o1-preview at about 300 billion, and o1-mini at about 100 billion. If the estimates hold, they suggest that OpenAI's distillation techniques are effective, given how much smaller GPT-4o-mini is than its larger counterparts.
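The paper does not say how GPT-4o-mini was trained, but the standard recipe for shrinking a model this way is knowledge distillation. Below is a minimal sketch of the classic soft-label distillation loss; the temperature, mixing weight, and toy teacher/student modules are illustrative assumptions, not anything from the paper or from OpenAI.

```python
# Minimal knowledge-distillation sketch: a small "student" is trained to
# match a larger "teacher" as well as the ground-truth labels. All shapes
# and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft KL term against the teacher with a hard CE term."""
    # Soft targets: match the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: a frozen teacher and a trainable student over a 100-way vocab.
teacher = nn.Linear(64, 100)
student = nn.Linear(64, 100)
x = torch.randn(32, 64)               # batch of 32 hidden states
labels = torch.randint(0, 100, (32,))
with torch.no_grad():
    t_logits = teacher(x)             # teacher is not updated
loss = distillation_loss(student(x), t_logits, labels)
loss.backward()
```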
Looks like GPT-4o mini is just an 8B model! At least that's what Microsoft suggests in their paper https://t.co/qteqSggBN9 https://t.co/A0QxLezzxX
GPT-4o-mini is just 8B! A Microsoft paper has estimated various LLM (mostly GPT) model sizes. Full list with dates:
- Claude 3.5 Sonnet (2024-10-22) ≈ 175B
- ChatGPT ≈ 175B
- GPT-4 ≈ 1.76T
- GPT-4o (2024-05-13) ≈ 200B
- GPT-4o-mini (2024-05-13) ≈ 8B
- o1-mini… https://t.co/AbnpJMZVns https://t.co/QB3VpGQNAa
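As a rough sanity check on what these estimates would mean in practice, here is a back-of-the-envelope calculation of weight memory at fp16. The parameter counts come from the list above; the bytes-per-parameter figure is just the standard dtype size, and everything else (activations, KV cache, optimizer state) is ignored.

```python
# Approximate weight storage implied by the paper's parameter estimates.
# Counts are taken from the tweet above; this ignores activation and
# KV-cache memory, so real serving footprints are larger.
ESTIMATES = {
    "Claude 3.5 Sonnet": 175e9,
    "ChatGPT": 175e9,
    "GPT-4": 1.76e12,
    "GPT-4o": 200e9,
    "GPT-4o-mini": 8e9,
}

def weight_memory_gb(params: float, bytes_per_param: int = 2) -> float:
    """Weight storage in GB, assuming fp16/bf16 (2 bytes per parameter)."""
    return params * bytes_per_param / 1e9

for name, n in ESTIMATES.items():
    print(f"{name}: ~{weight_memory_gb(n):,.0f} GB in fp16")
# An ~8B model needs only ~16 GB for weights, i.e. roughly one high-end
# consumer GPU, which would be consistent with GPT-4o-mini's low price.
```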