Feb 19, 01:30 PM

Groq Inc. Unveils LPU™: 483 Tokens/Sec at $0.8/Million, Challenging Nvidia

Groq Inc. has drawn attention in the AI hardware industry with its LPU™ Inference Engine, which runs the Mixtral model at nearly 483 tokens per second. The tech community has responded with enthusiasm, pointing to the potential to sharply reduce latency and cost in large language model (LLM) applications. Responses arrive almost instantly, which opens up new use cases and improves the user experience. Groq's results are attributed to custom hardware combined with a software-defined network that treats all chips as a single unit. The company, founded by former members of Google's TPU team, charges $0.8 per 1 million tokens, described as significantly cheaper than competing offerings. The technology is also described as accessible: the Mixtral model it serves is not closed-source, unlike Google's Gemini Ultra, which is said to handle 500k tokens. With the Mixtral 8x7b-32k model reaching roughly 500 tokens per second, the tech community anticipates that Groq's advances could be a game-changer and could challenge Nvidia's dominance in the GPU market.
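To put the headline figures in perspective, here is a rough back-of-the-envelope sketch of what 483 tokens per second and $0.8 per 1 million tokens would mean for a single reply. It assumes a flat per-token price and a constant generation rate, neither of which the summary confirms; the function names and the 500-token reply length are hypothetical, chosen only for illustration.

```python
# Figures quoted in the summary; treated as reported values, not measurements.
TOKENS_PER_SECOND = 483
USD_PER_MILLION_TOKENS = 0.80


def generation_seconds(num_tokens, tokens_per_second=TOKENS_PER_SECOND):
    """Time to generate num_tokens, assuming a constant generation rate."""
    return num_tokens / tokens_per_second


def generation_cost_usd(num_tokens, usd_per_million=USD_PER_MILLION_TOKENS):
    """Cost of num_tokens, assuming one flat price for every token."""
    return num_tokens * usd_per_million / 1_000_000


if __name__ == "__main__":
    reply_tokens = 500  # hypothetical reply length, for illustration only
    print(f"time: {generation_seconds(reply_tokens):.2f} s")
    print(f"cost: ${generation_cost_usd(reply_tokens):.4f}")
```

Under these assumptions, a 500-token reply would take about one second and cost about $0.0004.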

#Mixtral #Groq #Google TPU #Google #Gemini Ultra #Nvidia

Written with ChatGPT (GPT-4).
