From open scraping to permission by default
On July 1, 2025, Cloudflare became the first major internet infrastructure provider to block AI crawlers by default. The owner of every new domain is now asked at sign-up whether to allow or deny AI bots, rather than having to opt out later. Cloudflare handles traffic for roughly 20 percent of the web, so the shift moves a large share of sites from an open-scraping baseline to a permission-based one. The change reframes AI access as something a publisher grants on purpose, not something that happens silently. For content owners, the practical effect is that doing nothing now defaults toward control rather than exposure.
How pay per crawl and the 402 response work
Pay per crawl, launched in private beta alongside the default-blocking change, gives publishers three settings per crawler: allow free access, charge a domain-wide price, or block outright. When a crawler requests a charged page, it either signals payment intent in its request headers and receives an HTTP 200, or it gets an HTTP 402 'Payment Required' response stating the price. Cloudflare acts as the merchant of record, handling billing between AI companies and publishers. Customizable 402 responses are now available to all paid Cloudflare customers, and the company reports its network serves more than one billion 402 responses to AI crawlers per day.
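The request flow described above can be sketched as a small decision function. This is a minimal illustration in Python, not Cloudflare's implementation: the `Policy` enum, the `handle_crawl` function, and the header names `x-max-price`, `x-price`, and `x-charged` are hypothetical placeholders invented for the example. Two further assumptions are made for the sketch: that a crawler signals payment intent by stating the most it will pay, and that a blocked crawler receives a 403.

```python
from decimal import Decimal
from enum import Enum


class Policy(Enum):
    ALLOW = "allow"    # free access
    CHARGE = "charge"  # the domain-wide price applies
    BLOCK = "block"    # deny outright


def handle_crawl(policy, price, request_headers):
    """Return (status_code, response_headers) for one crawler request.

    All header names are hypothetical placeholders, not a documented API.
    """
    if policy is Policy.ALLOW:
        return 200, {}
    if policy is Policy.BLOCK:
        # Assumption: blocking is expressed as 403 Forbidden.
        return 403, {}
    # CHARGE: serve the page only if the crawler signalled it will pay enough.
    offered = request_headers.get("x-max-price")
    if offered is not None and Decimal(offered) >= price:
        return 200, {"x-charged": str(price)}
    # No payment intent, or the offer is too low: quote the price instead.
    return 402, {"x-price": str(price)}


price = Decimal("0.01")
print(handle_crawl(Policy.CHARGE, price, {}))
print(handle_crawl(Policy.CHARGE, price, {"x-max-price": "0.05"}))
```

The first call carries no payment intent, so it gets a 402 that quotes the price. The second offers more than the price, so it gets a 200. Prices are held as `Decimal` rather than `float` so that small per-request amounts compare exactly.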
The crawl-to-refer gap that makes this matter
Cloudflare's own data shows why publishers are pushing back: AI crawlers take far more than they send back. In July 2025, Anthropic's ClaudeBot crawled roughly 38,000 pages for every visitor it referred, while OpenAI's ratio reached about 3,700 to 1 earlier in the year, and Perplexity ran lower at generally under 400 to 1. Google's traditional search crawl-to-refer ratio stayed between roughly 3 to 1 and 30 to 1 over the same period. User-driven AI crawling, where a bot fetches a page in response to a live user query, grew more than fifteen-fold year over year, so the volume side of the imbalance is accelerating.
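A site owner can estimate the same ratio from their own logs by dividing pages crawled by visits referred. The sketch below assumes those two counts have already been extracted per bot; the function name, the bot labels, and the sample counts are hypothetical and chosen only to illustrate the scale, not taken from Cloudflare's data.

```python
def crawl_to_refer(crawled_pages, referred_visits):
    """Pages crawled per referred visit; infinite if nothing was referred."""
    if referred_visits == 0:
        return float("inf")
    return crawled_pages / referred_visits


# Hypothetical per-bot counts for one site: (pages crawled, visits referred).
sample = {
    "ai_training_bot": (380_000, 10),
    "search_bot": (3_000, 200),
}

for bot, (crawls, referrals) in sample.items():
    print(f"{bot}: {crawl_to_refer(crawls, referrals):,.0f} to 1")
```

A bot that crawls but never refers has no finite ratio, which is why the zero case is handled explicitly rather than left to raise a division error.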
What content owners should weigh
The first decision is posture: block, allow, or charge. The second is whether to treat training crawlers differently from search or inference crawlers, which is possible because Cloudflare lets AI companies declare crawler purpose. Charging only makes sense if your content is distinctive enough that an AI lab would rather pay than skip it, so commodity pages have little leverage while proprietary data, archives, or expert content have more. Blocking AI search crawlers can also cut a site out of AI-generated answers, so owners trading on AI visibility should separate the bots that may refer traffic from the bots that only train. The honest read is that pricing power is still being established and few publishers have public rate cards yet.
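One way to make these trade-offs explicit is to write the posture down as a rule keyed on declared crawler purpose. The sketch below is one possible policy, not a recommendation or a Cloudflare feature: the `Crawler` type, the `choose_posture` function, the purpose labels, and the two boolean inputs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crawler:
    name: str
    purpose: str  # self-declared: "training", "search", or "inference"


def choose_posture(crawler, content_is_distinctive, wants_ai_visibility):
    """Return "allow", "charge", or "block" for one crawler.

    Illustrative policy only; a real site would tune these rules.
    """
    if crawler.purpose in ("search", "inference"):
        # These bots may refer traffic or surface the site in AI answers.
        return "allow" if wants_ai_visibility else "block"
    # Training crawlers, and any undeclared purpose, fall through to here.
    # Charging only has leverage when the content is hard to substitute.
    return "charge" if content_is_distinctive else "block"


bots = [Crawler("example-train-bot", "training"),
        Crawler("example-search-bot", "search")]
for bot in bots:
    print(bot.name, choose_posture(bot, content_is_distinctive=True,
                                   wants_ai_visibility=True))
```

Treating an undeclared purpose like training is a deliberately conservative choice in this sketch; a site that prioritizes reach could default the other way.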
Key takeaways
- Cloudflare blocks AI crawlers by default for new domains as of July 1, 2025, covering a network that carries about 20 percent of web traffic.
- Pay per crawl lets publishers allow, charge, or block AI bots, using HTTP 402 responses and request-header payment intent, with Cloudflare as merchant of record; the feature remains in private beta.
- Cloudflare's network serves over one billion HTTP 402 'Payment Required' responses to AI crawlers daily, and user-driven AI crawling grew more than fifteen-fold year over year.
- Crawl-to-refer ratios are heavily lopsided (ClaudeBot near 38,000:1 in July 2025), so owners should decide posture per crawler and weigh AI-search visibility before blocking everything.
