Tencent has unveiled Hunyuan-Large, a new open-source large language model with 389 billion total parameters, of which 52 billion are activated per token. The model uses mixture-of-experts (MoE) routing and was trained with 1.5 trillion tokens of synthetic data; it is reported to outperform Meta's 405-billion-parameter Llama 3.1 across a range of academic benchmarks. Hunyuan-Large offers strong multilingual capabilities and a 128,000-token context length, and incorporates techniques aimed at improving inference throughput and efficiency. The release includes pre-trained, instruct, and FP8 checkpoints, making it a notable addition to the open-source NLP landscape.
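The gap between total parameters (389B) and activated parameters (52B) comes from top-k expert routing: a gating network scores all experts per token, but only the top-scoring few actually run. Below is a minimal, illustrative sketch of that idea in NumPy; all names and dimensions are invented for the example and do not reflect Tencent's actual implementation.

```python
import numpy as np

def top_k_moe_forward(x, gate_w, experts, k=2):
    """Route a token vector x to the top-k experts by gate score.

    Only k of len(experts) expert networks execute, which is why an
    MoE model activates far fewer parameters per token than its
    total parameter count suggests.
    """
    logits = x @ gate_w                       # one gate score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the chosen experts
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy setup: 8 tiny "experts", each a random linear map (hypothetical).
rng = np.random.default_rng(0)
d, n_experts = 16, 8
experts = [(lambda W: (lambda x: x @ W))(rng.standard_normal((d, d)))
           for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))
y = top_k_moe_forward(rng.standard_normal(d), gate_w, experts, k=2)
```

In this toy setup only 2 of the 8 expert weight matrices are multiplied per token, mirroring (at miniature scale) how a 389B-parameter model can run with roughly 52B parameters active.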
.@TencentGlobal released the open-source #MoE large language model #Hunyuan-large with a total of 389 billion parameters, making it the largest in the industry. https://t.co/nVN40PjqK6
Tencent Releases Hunyuan-Large (Hunyuan-MoE-A52B) Model: A New Open-Source Transformer-based MoE Model with a Total of 389 Billion Parameters and 52 Billion Active Parameters Tencent has taken a significant step forward by releasing Hunyuan-Large, which is claimed to be the… https://t.co/foR7MjHZe5