

Grok's new 1.5 Vision model is competitive with or outperforms leading multimodal models such as GPT-4V, Claude, and Gemini, according to benchmarks recently published by xAI. With vision added, Grok can process visual information in addition to text, and the benchmarks cover areas such as multi-disciplinary reasoning, math, diagrams, text reading, charts, and documents. This marks Grok's move into multimodal functionality, with promising early results. The vision model is expected to be integrated into the Grok chat in the medium term, with other features planned for release in the near future.
Vision coming to Grok https://t.co/uoPUsUOf3K
NEWS: Grok can now process a variety of visual information in addition to text! It is competitive with or outperforms existing multimodal models across benchmarks in areas like multi-disciplinary reasoning, math, diagrams, text reading, charts, and documents https://t.co/2hDR8qRlT8 https://t.co/mX4Zm0G7Y6
Grok 1.5 Vision's capabilities are competitive with GPT-4V, Claude, and Gemini, according to benchmarks just published by xAI https://t.co/xZ8Aeu6oGJ