Contextual AI, in collaboration with the Allen Institute for AI, has released OLMoE, a state-of-the-art, fully open-source Mixture-of-Experts (MoE) language model. OLMoE, led by Niklas Muennighoff, has roughly 7 billion total parameters but activates only about 1 billion per input token, making it highly efficient. The model was pretrained on 5 trillion tokens and is designed to rival more costly dense models such as Gemma and Llama. The release includes model weights, training data, code, and logs, and aims to provide a cost-effective yet powerful tool for language model research and application. Part of the OLMo family, OLMoE offers a superior performance-to-cost ratio, using 64 experts per layer with 8 active for each token.
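The routing scheme described above (64 experts per layer, 8 selected per token) can be pictured with a short PyTorch sketch of generic top-k MoE routing. This is an illustrative simplification under assumed dimensions, not the actual OLMoE code; the class name and layer sizes below are hypothetical.

```python
# A simplified sketch of top-k mixture-of-experts routing: 64 experts per layer,
# 8 chosen per token. NOT the OLMoE implementation; dimensions and names are
# hypothetical placeholders chosen only to make the example runnable.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    def __init__(self, d_model=1024, d_ff=2048, n_experts=64, k=8):
        super().__init__()
        self.k = k
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (n_tokens, d_model)
        logits = self.router(x)                        # (n_tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)     # keep the 8 best experts per token
        weights = F.softmax(weights, dim=-1)           # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # only selected experts are evaluated,
            for e in idx[:, slot].unique().tolist():   # so most parameters stay inactive
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

tokens = torch.randn(4, 1024)          # four token embeddings
print(TopKMoELayer()(tokens).shape)    # torch.Size([4, 1024])
```

This captures why an MoE model can have many total parameters while keeping per-token compute close to that of a much smaller dense model: only the routed experts run for each token.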
OLMoE 1x7B by @allen_ai - apache licensed, 64 experts, 8 active - trained on 5T tokens, matches Gemma, Llama in perf, with orders of magnitude faster speed ⚡
> 1.3B Active and 6.9B Total - 5x fewer parameters than the comparative dense model
> 64 experts per layer, 8 active… https://t.co/qabrRX7wYC
🐣Welcome the newest member to the OLMo family, OLMoE! This Mixture-of-Experts model is 100% open — it's efficient, performant, and ready for experimentation. Learn more on our blog: https://t.co/K6OAVEBmFx https://t.co/32O2AFjHJO
We’re proud to share our latest research, led by our own @Muennighoff and in partnership with @allen_ai: Introducing OLMoE, a best-in-class fully open source mixture-of-experts (MoE) language model with 1B active parameters that beats comparable LLMs and rivals many larger… https://t.co/R0ebuDygVE