OpenAI's latest multimodal language model, known as o3, has demonstrated an advanced ability to geolocate images with remarkable precision. Users report that o3 can analyze visual elements such as topography, vegetation, architecture, ridges, and contours to identify exact locations, even in complex environments like the Pyrenean valleys where hundreds of similar landscapes exist. This capability surpasses other multimodal large language models, which have struggled with such tasks. The model achieves this by zooming into images, cropping, enhancing details, and conducting rapid searches before providing precise map coordinates along with explanations of its reasoning. This development highlights a significant advancement in AI's visual reasoning and geographic identification skills.
ChatGPT puede deducir la ubicación de una fotografía con las nuevas capacidades del modelo o3 https://t.co/jR7cxWCUCu
Ok, OpenAI o3 is insane. It crops and zooms into the image 🔍, finds every little clue, runs quick searches— Then drops the exact map coordinates with a full explanation of how it got there. Wow. 🤯 https://t.co/wCC7lwbLE6
Les nouveaux modèles d'intelligence artificielle d'OpenAI, notamment o3, montrent une capacité surprenante à déterminer la localisation géographique d'une photo en se basant uniquement sur son contenu visuel. Cette prouesse technique, bien ... https://t.co/iZ9gIpauzi