
xAI has released Grok-1, a 314B-parameter Mixture-of-Experts (MoE) transformer, as open source under the Apache 2.0 license in March 2024. Despite being far larger than most open models, Grok-1's performance has been questioned, with some comparisons suggesting it performs worse than the much smaller Mixtral. The released checkpoint is a base model that has not been fine-tuned. Its router applies a softmax over all 8 experts and then selects the top 2, whereas Mixtral selects the top 2 experts first and applies the softmax over only those. The weights have been made available on Hugging Face, but concerns have been raised about the model's effectiveness and the lack of information on its training data. The initial announcement highlighted a 73% score on MMLU, and the tokenizer has a vocab size of 131,072. Grok-1 is notable for its open weights, a move towards more open AI models.
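The routing difference is easiest to see side by side. The sketch below is not xAI's or Mistral's code; it is a minimal JAX illustration, assuming hypothetical gate functions over the same router logits, of softmax-then-top-2 (Grok-1 style) versus top-2-then-softmax (Mixtral style).

```python
# Minimal sketch of the two gating orders described above (illustrative only).
import jax
import jax.numpy as jnp

NUM_EXPERTS = 8  # Grok-1 has 8 experts
TOP_K = 2        # 2 experts active per token

def grok_style_gate(router_logits):
    """Softmax over all 8 expert logits, then keep the top-2 probabilities."""
    probs = jax.nn.softmax(router_logits, axis=-1)    # shape (..., 8)
    top_p, top_idx = jax.lax.top_k(probs, TOP_K)      # selected weights sum to < 1
    return top_p, top_idx

def mixtral_style_gate(router_logits):
    """Select the top-2 logits first, then softmax over only those two."""
    top_logits, top_idx = jax.lax.top_k(router_logits, TOP_K)
    top_p = jax.nn.softmax(top_logits, axis=-1)       # selected weights sum to 1
    return top_p, top_idx

# Same router logits, different expert weights:
logits = jnp.array([2.0, 1.0, 0.5, 0.1, -0.3, -1.0, -1.5, -2.0])
print(grok_style_gate(logits))     # weights roughly (0.52, 0.19), experts 0 and 1
print(mixtral_style_gate(logits))  # weights roughly (0.73, 0.27), experts 0 and 1
```

Both orders pick the same two experts here; what changes is the mixing weights, since the Grok-1-style gate keeps unnormalized probabilities from the full 8-way softmax while the Mixtral-style gate renormalizes over the two selected experts.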





Grok-1 is out on Hugging Face https://t.co/6GBcgnD1ZI
Musk's Grok AI was just released open source in a way that is more open than most other open models (it has open weights) but less than what is needed to reproduce it (there is no information on training data). Won't change much, there are stronger open source models out there. https://t.co/eFouYFVRmN
pretty badass https://t.co/P1tp1hmzxN