Cloud-infrastructure provider Cloudflare said in a 4 Aug. blog post that answer-engine startup Perplexity AI is disguising its web-crawling activity to get around blocks on websites that explicitly prohibit automated scraping. According to Cloudflare, when the declared “PerplexityBot” is disallowed by robots.txt files or blocked by firewall rules, Perplexity alters its user-agent string to mimic a standard Chrome browser and rotates IP addresses across multiple autonomous-system numbers. Cloudflare said Perplexity’s official bots generate about 20–25 million requests a day, while undeclared crawlers add a further 3–6 million requests across tens of thousands of domains. Cloudflare removed Perplexity from its Verified Bots list and rolled out new signatures to block what it calls “stealth crawling.” It contrasted the behaviour with OpenAI’s ChatGPT crawler, which it says respects site directives. San Francisco-based Perplexity, valued at roughly US$18 billion after a recent US$100 million funding round, rejected the accusations. The startup argued that Cloudflare had wrongly attributed to it traffic from BrowserBase, a third-party cloud-browser service that Perplexity says it uses sparingly, and said its requests are made in real time on behalf of users rather than for bulk data harvesting. Perplexity called for technical dialogue instead of public “naming and shaming.” The spat highlights growing friction between artificial-intelligence firms hungry for web content and publishers seeking to control the use of their material. Cloudflare said more than 2.5 million sites now deploy its tools to block AI training or challenge non-compliant crawlers, signalling mounting pressure on AI companies to adopt transparent and consensual data-access practices.
Accused of plundering websites, Perplexity defends itself clumsily ➡️ https://t.co/HcifDqIVba https://t.co/ul1bSuBYUt
Google says AI is boosting Search. Yes, but… via @MrDannyGoodwin https://t.co/U28Cz5fidW https://t.co/V9XAIzUtLL
Google says AI not causing traffic declines & that it cares more about web’s health than competition https://t.co/tkB4xbhAWg by @technacity