
Cartesia AI has announced Rene 1.3B, a new on-device model in its Mamba-2 language model series, released under the Apache 2.0 license. Rene is built on state space models (SSMs), with alternating Mamba-2 and MLP layers, and ships with custom SSM kernels in MLX. It runs efficiently on-device at 80-120 tokens per second, reaching almost 200 tokens per second on an M2 Ultra. The release marks a milestone in Cartesia AI's broader initiative to build more efficient AI architectures that deploy at the edge and operate independently of data centers, and it showcases how compact and capable on-device intelligence can be.
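To illustrate the "alternating Mamba-2 and MLP layers" pattern mentioned above, here is a minimal, purely illustrative sketch. This is not Cartesia's code: real Mamba-2 blocks use learned, input-dependent SSM parameters and optimized kernels (e.g. in MLX), whereas the scalar toy weights here are assumptions chosen only to keep the example self-contained.

```python
# Hypothetical sketch of a Rene-style layer stack: state-space (SSM)
# mixing layers interleaved with position-wise MLP layers.
# All weights are toy scalars, not learned parameters.

def ssm_layer(xs, a=0.5, b=1.0):
    """Linear state-space recurrence over the sequence: h_t = a*h_{t-1} + b*x_t."""
    h, out = 0.0, []
    for x in xs:
        h = a * h + b * x
        out.append(h)
    return out

def mlp_layer(xs, w1=2.0, w2=0.5):
    """Tiny position-wise MLP: linear -> ReLU -> linear, applied per token."""
    return [max(0.0, w1 * x) * w2 for x in xs]

def alternating_stack(xs, depth=4):
    """Alternate SSM and MLP layers, mirroring the block pattern
    described for Rene (Mamba-2 layer, then MLP layer, repeated)."""
    for i in range(depth):
        xs = ssm_layer(xs) if i % 2 == 0 else mlp_layer(xs)
    return xs

print(alternating_stack([1.0, 0.0, 0.0]))  # → [1.0, 1.0, 0.75]
```

The key property the sketch preserves is that the SSM layer mixes information *across* sequence positions with a fixed-size recurrent state (which is what makes SSM inference fast on-device), while the MLP transforms each position independently.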
🤖 From this week's issue: AI21 debuted Jamba 1.5 Mini and Jamba 1.5 Large, which are built on their novel SSM-Transformer architecture. https://t.co/0BRaqoKweP
📰 We’re making waves! 🌊 Our full on-chain deployment of GPT-2 on the Internet Computer is getting major coverage. Check out the latest features in @Cointelegraph and @crypto_news! 🚀 (Full article in the comments below!) #DeAI #DCD #ICP


