Grok 3 Beta, developed by xAI (Elon Musk's AI company), has achieved top rankings on several key benchmarks, outperforming competitors such as OpenAI's GPT-4 and DeepSeek V3. It scored 69.1% on CorpFin, 88.1% on CaseLaw, and 78.8% on TaxEval, results that point to strong finance, legal, and tax reasoning and mark it as a leading model in these areas. Grok 3 Beta is also reported to surpass Gemini 2.5 Pro and GPT-4o; the comparison score cited is 67.0%. In addition, Grok 3 is recognized for its minimal censorship and free-speech approach, which distinguishes it from other models on the market. The launch of the Grok 3 API has also been announced, although some note that it may already be outdated compared to other available models.
Just Grok it https://t.co/buXlomb5J4
Claude Sonnet and Gemini 2.5 Pro are still better than GPT 4.1 and Grok 3 in Cursor
Perplexity's Sonar API is tied with Gemini-2.5 Pro for the #1 spot in the LM Search Arena leaderboard, which measures the quality of web-search grounded LLM completions. Congrats to @GoogleDeepMind for a great model. Lots to do to keep improving our Sonar models and our search index! https://t.co/BO3IlMOkl3