DeepSeek, a Chinese AI firm, has released DeepSeek-V3, an open-weights large language model (LLM) with 671 billion total parameters. The model reportedly surpasses competitors such as GPT-4o and Llama 3.1 405B on key benchmarks, particularly in coding and math tasks. DeepSeek-V3 uses a mixture-of-experts architecture that activates only 37 billion parameters per token, and the company says it trained the model on second-tier graphics processing units at a fraction of the cost of its U.S. counterparts. In recent weeks, three Asian labs have introduced frontier models with open weights and permissive licenses: DeepSeek-V3, MiniMax-Text-01 with 456 billion parameters, and InternLM3-8B-Instruct. This development signals a notable advance in China's AI capabilities, and the permissive licenses let others build on these models, for example by using their outputs to label training data.
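The cost advantage comes from sparse activation: in a mixture-of-experts layer, a learned router sends each token to a small subset of expert feed-forward networks, so only a fraction of the model's total parameters do any work on a given forward pass. Below is a minimal sketch of top-k expert routing in PyTorch; the class name, dimensions, and expert count are hypothetical, and it omits DeepSeek-V3's specific refinements (such as its load-balancing strategy), illustrating only the general pattern.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Illustrative top-k mixture-of-experts layer.

    All sizes here are made up for the example; DeepSeek-V3's actual
    architecture is more involved than this sketch.
    """
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router produces one score per expert for each token.
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.router(x)                          # (B, S, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # keep k experts per token
        weights = F.softmax(weights, dim=-1)             # normalize kept scores
        out = torch.zeros_like(x)
        # Only the selected experts run for each token, so most parameters
        # stay inactive on any single forward pass.
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                            # tokens routed to expert e
            if mask.any():
                token_mask = mask.any(dim=-1)            # (B, S)
                w = (weights * mask).sum(dim=-1)[token_mask].unsqueeze(-1)
                out[token_mask] += w * expert(x[token_mask])
        return out

layer = MoELayer()
x = torch.randn(2, 16, 512)   # (batch, seq, d_model)
y = layer(x)                  # only 2 of the 8 experts run per token
print(y.shape)                # torch.Size([2, 16, 512])
```

With top_k=2 of 8 experts, each token touches roughly a quarter of the expert parameters; the same principle, at much larger scale, is how DeepSeek-V3 activates 37 billion of its 671 billion parameters per token.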