
The 01.AI team has announced significant advancements in its Yi model series, with notable gains in long-context capability and overall performance. The Yi-34B-200K model in particular improved from 89.3% to 99.8% on the Needle-in-a-Haystack test, a result the team attributes to continued pretraining on a 5-billion-token long-context data mixture. The Yi model family, which spans language and multimodal models, is built on pretrained base language models at the 6B and 34B parameter scales, extended into chat models with strong multi-dimensional capabilities. A technical report detailing these advancements has been released, covering the base models, chat models, vision-language models, and the 200K long-context model, among others. The results suggest that the 34B models can beat GPT-3.5-class models in benchmarks and human evaluations, and the team's next goal is a 7B model that surpasses GPT-3.5. If that milestone is reached, the team argues, 75% of enterprise AI tasks could be performed with a local large language model (LLM), reducing the need for expensive hosted LLMs.
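For context, the Needle-in-a-Haystack test measures whether a model can retrieve a single planted fact (the "needle") from an otherwise irrelevant long context, with the needle placed at varying depths. Below is a minimal sketch of how such an evaluation works; the filler text, needle, depth grid, string-match scoring, and the `query_model` placeholder are all illustrative assumptions, not the exact protocol used in the Yi report.

```python
# Sketch of a Needle-in-a-Haystack style long-context evaluation.
# Everything here is a simplified stand-in: swap `query_model` for a real
# call to whatever local or hosted model you want to test.

NEEDLE = "The best thing to do in San Francisco is to eat a sandwich in Dolores Park."
QUESTION = "What is the best thing to do in San Francisco?"
FILLER = "The quick brown fox jumps over the lazy dog. " * 2000  # stand-in corpus

def build_haystack(depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    cut = int(len(FILLER) * depth)
    return FILLER[:cut] + NEEDLE + " " + FILLER[cut:]

def query_model(prompt: str) -> str:
    """Placeholder: wire this up to your own inference endpoint."""
    raise NotImplementedError("replace with a real model call")

def run_eval(depths=(0.0, 0.25, 0.5, 0.75, 1.0)) -> float:
    """Return the fraction of depths at which the needle was retrieved."""
    hits = 0
    for depth in depths:
        prompt = build_haystack(depth) + f"\n\nQuestion: {QUESTION}\nAnswer:"
        answer = query_model(prompt)
        # Simple substring scoring; published harnesses typically use
        # graded or judge-based scoring instead.
        hits += "Dolores Park" in answer
    return hits / len(depths)
```

A full harness sweeps both needle depth and total context length (out to 200K tokens for Yi-34B-200K) and reports a retrieval score per cell, which is where headline figures like 99.8% come from.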
The new Yi models show that 34B models can beat GPT 3.5 class models in benchmarks and human evaluation. Next stop: 7B models that beat GPT 3.5. If we achieve this milestone, 75% of enterprise AI tasks can be performed using a local LLM. We don't need expensive LLMs hosted…
Wondering what's powering the trending #yi34b? Curious about the tech behind it? At @01AI_Yi, we've written a tech report revealing the inside scoop on: ✅Base and chat models ✅200K long context model ✅Depth-upscaled model ✅Vision-language model Enjoy! https://t.co/O2C0fWS1Bc
Yi: Open Foundation Models by 01.AI Young et al.: https://t.co/6WNbmPWjKC #ArtificialIntelligence #DeepLearning #MachineLearning https://t.co/KiQ5or99X3
