
Mistral AI has released two new models, Codestral Mamba and MathΣtral, both under the Apache 2.0 license. Codestral Mamba, a 7B-parameter model based on the Mamba2 architecture, is designed for efficient code generation and achieves 75% on the HumanEval benchmark for Python coding. It supports context lengths of up to 256K tokens and offers linear-time inference, making it well suited to local code assistants and copilot applications.

MathΣtral, also a 7B-parameter model, focuses on mathematical reasoning and scientific discovery. It features a 32K context window and outperforms Minerva 540B by over 20% on the MATH benchmark, achieving 56.6% pass@1, 68.4% with majority voting, and 74.6% with a reward model. These releases underscore Mistral AI's commitment to building specialized, high-performance models for specific tasks.
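Because Codestral Mamba ships under Apache 2.0 and targets local code-assistant use, a quick local test is natural. Below is a minimal sketch using Hugging Face transformers; the hub id `mistralai/Mamba-Codestral-7B-v0.1` and Mamba2 support in the installed transformers version are assumptions, not details from the announcement above, so check the model card for the officially supported loading path.

```python
# Minimal sketch: prompting Codestral Mamba locally via Hugging Face
# transformers. ASSUMPTIONS: the hub id "mistralai/Mamba-Codestral-7B-v0.1"
# and a transformers version recent enough to include Mamba2 support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mamba-Codestral-7B-v0.1"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # keeps a 7B model within a single large GPU
    device_map="auto",           # requires the `accelerate` package
)

prompt = "def fibonacci(n: int) -> int:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy completion. Mamba's recurrent state is what gives the linear-time
# inference mentioned above, in contrast to the quadratic cost of attention
# over long contexts.
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```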
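The three MATH numbers differ only in how a final answer is chosen: pass@1 scores a single sample, majority voting samples many solutions and keeps the most frequent final answer, and a reward model instead picks the candidate it scores highest. Here is a toy sketch of majority voting (maj@k); the `sample_answer` callable is a hypothetical stand-in for "run the model once and parse the final answer", not part of Mistral's evaluation harness.

```python
# Toy illustration of majority voting (maj@k) over sampled answers.
import random
from collections import Counter
from typing import Callable

def majority_vote(sample_answer: Callable[[], str], k: int = 64) -> str:
    """Sample k final answers and return the most frequent one."""
    answers = [sample_answer() for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]

# A sampler that is right only 60% of the time still wins the vote almost
# always at k=64, which is why maj@k scores exceed pass@1.
random.seed(0)
sample_answer = lambda: "42" if random.random() < 0.6 else str(random.randint(0, 9))
print(majority_vote(sample_answer, k=64))  # -> "42"
```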

Exciting news! Mistral AI and NVIDIA have revealed the 12B NeMo model in the latest article. Dive into the details of this cutting-edge AI advancement here: https://t.co/ueSntRYsrc
Mistral’s new Codestral Mamba to aid longer code generation https://t.co/wA8ecOlK25