Andrej Karpathy has provided an early assessment of Grok 3, stating that Grok 3 + Thinking feels roughly on par with OpenAI's strongest models, specifically o1-pro, which costs $200 per month. He rated it slightly better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. Karpathy's review highlights the model's state-of-the-art capabilities, particularly in its blend of search and reasoning through DeepSearch. Remarkably, the model comes from a team that started from scratch about a year ago, underscoring how quickly it reached the state of the art. Other commentators echoed Karpathy's sentiments, emphasizing Grok 3's impressive performance and potential impact on the AI landscape.
“Grok 3 + Thinking feels somewhere around the state of the art territory of OpenAI's strongest models (o1-pro, $200/month), and slightly better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. Which is quite incredible considering that the team started from scratch ~1 year ago” https://t.co/b9giSF7RNt
Informative Grok 3 vibe check with insightful commentary by the great Andrej Karpathy. Personally, I couldn't test thinking/deep-searching Grok yet - looks like that would raise my opinion of the model a lot. Hope that'll be available soon. Oh, and Grok 2 open source when? https://t.co/NVnpty51nU
Karpathy: Grok 3 + Thinking feels somewhere SOTA territory of OAI o1 pro ($200/month), and better than DeepSeek-R1 + Gemini 2.0 Flash Thinking. All this from a team started from scratch ~1 year ago... this timescale to SOTA is most impressive & unprecedented. https://t.co/8MCx79nJsQ