Key Takeaways

  • On domains created after September 15, 2026, Cloudflare will by default block Training and Agent AI crawlers on pages that show ads while allowing Search-classified crawlers.
  • Cloudflare sorts AI crawlers into three behavior buckets (Search, Agent, Training) and offers three mitigation levels (Block, Block on pages with ads, Allow) that site owners can set per behavior type.
  • Cloudflare will support a pay-per-crawl model using HTTP 402; each AI crawler request must present payment intent to receive HTTP 200, otherwise it receives a 402 with pricing details.
  • Site owners must audit Cloudflare Security AI bot settings, test zone-level and subdomain behavior with HTTP clients, and decide per-content pricing or blocking to preserve ad revenue and SEO visibility.

From September 15, 2026, Cloudflare will change default rules for how AI crawlers can access new domains on its network.[2]

If you monetize with ads, those defaults often mean Training and Agent-style AI bots are blocked from key pages unless you explicitly opt out.[2]

For SEO, AI-driven discovery, and potential pay-per-crawl revenue, AI access is no longer a free default but a configuration and business decision.[1]

💡 Key takeaway: Treat AI crawl policy as seriously as robots.txt and ad stack configuration.[2]


1. How Cloudflare’s new defaults block AI crawlers

On new domains created after September 15, 2026, Cloudflare will:[2]

  • Allow crawlers it classifies as Search by default
  • Block Training and Agent crawlers on pages that show ads
  • Apply these behaviors automatically, unless you change the policies

Result: a new ad-supported blog or media site could be invisible to many AI tools that summarize or reuse content, even though classic web search remains open.[2]

Cloudflare groups AI bot behavior into three buckets:[2]

  • Search – indexing to answer questions later
  • Agent – real-time fetches for assistants or users
  • Training – crawling to train or fine-tune models, including mixed-purpose bots

These labels drive which bots are blocked, allowed, or potentially charged in the future.[2][4]

⚠️ Key point: AI crawlers are judged by behavior, not just user agent strings; unverified bots can be swept into these categories.[2]

Mixed-purpose crawlers are treated as Training whenever a site chooses to block AI training, including under the older “Block AI bots” setting. This closes the loophole where an LLM vendor could present its crawler as “search” while still harvesting data.[2][4]
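The mixed-purpose rule can be read as a small decision function. The sketch below is only an illustration of that reading: the function name `treated_as_training` and the purpose labels are hypothetical, and the case where a site does not block training is not spelled out in the text above, so the sketch simply does not force the crawler into Training there.

```python
def treated_as_training(purposes: set, site_blocks_training: bool) -> bool:
    """Return True if a crawler should be handled under the Training policy.

    `purposes` holds hypothetical labels such as {"search", "training"}.
    """
    if "training" not in purposes:
        # Pure Search or Agent crawlers never fall under the Training rule.
        return False
    if purposes == {"training"}:
        # A training-only crawler is always Training.
        return True
    # Mixed-purpose crawler: counted as Training whenever the site blocks training.
    # ASSUMPTION: when training is not blocked, this sketch leaves it unforced.
    return site_blocks_training
```

Under this reading, a crawler that both indexes for search and collects training data cannot escape a training block by calling itself a search bot.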

For each behavior type, Cloudflare offers three mitigation levels:[2]

  • Block – across the entire zone
  • Block on pages with ads – via automated ad-page detection
  • Allow – no additional blocking
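To make the interaction between behavior types and mitigation levels concrete, here is a minimal Python sketch of the policy as a lookup table. It is a mental model only, not Cloudflare's implementation or API: the enum names, `DEFAULT_POLICY`, and `is_blocked` are hypothetical, and ad-page detection is reduced to a boolean the caller supplies.

```python
from enum import Enum


class Behavior(Enum):
    SEARCH = "search"
    AGENT = "agent"
    TRAINING = "training"


class Mitigation(Enum):
    BLOCK = "block"                          # blocked across the entire zone
    BLOCK_ON_AD_PAGES = "block_on_ad_pages"  # blocked only where ads are detected
    ALLOW = "allow"                          # no additional blocking


# Defaults described in this article for new domains (hypothetical encoding).
DEFAULT_POLICY = {
    Behavior.SEARCH: Mitigation.ALLOW,
    Behavior.AGENT: Mitigation.BLOCK_ON_AD_PAGES,
    Behavior.TRAINING: Mitigation.BLOCK_ON_AD_PAGES,
}


def is_blocked(behavior: Behavior, page_has_ads: bool, policy=DEFAULT_POLICY) -> bool:
    """Decide whether a crawler of the given behavior type is blocked on a page."""
    mitigation = policy[behavior]
    if mitigation is Mitigation.BLOCK:
        return True
    if mitigation is Mitigation.BLOCK_ON_AD_PAGES:
        return page_has_ads
    return False
```

With the defaults, `is_blocked(Behavior.TRAINING, page_has_ads=True)` is true while `is_blocked(Behavior.SEARCH, page_has_ads=True)` is false, which is the split between AI reuse and classic search described above.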

This lets you draw lines between:[2][4]

  • Monetized articles
  • Free-to-share docs or marketing pages
  • Fully protected premium areas

For many new ad-reliant sites, the “Block on pages with ads” default for Training and Agent crawlers will mean AI tools cannot read or reuse large parts of the site unless someone changes those settings.[1][2]


2. Why Cloudflare is tightening AI access and introducing pay-per-crawl

Cloudflare frames these changes as moving beyond a binary “open to all AI” vs “total walled garden.”[4]

Its stated principle: content creators should be “in the driver’s seat” about which crawlers access their work and on what terms.[4]

After discussions with news organizations, platforms, and other publishers, Cloudflare heard demand for a third path: allow AI crawlers, but only with compensation.[4]

📊 Data point: Because Cloudflare fronts a large share of web traffic, its defaults have outsized impact on AI companies’ access to high-quality training data.[3][4]

To support that third path, Cloudflare is rolling out a pay-per-crawl model using HTTP 402 (Payment Required).[4]

Each AI crawler request either:[4]

  • Presents valid payment intent and receives content (HTTP 200), or
  • Gets a 402 response with pricing information

Domain owners can then choose, per crawler:[4]

  • Allow – free access
  • Charge – require per-request payment
  • Block – deny with no option to pay
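The request flow and the per-crawler choices can be sketched as one decision function. This is a simplified model under stated assumptions, not Cloudflare's actual protocol: `CrawlResponse`, `respond`, and the `price_usd` field are hypothetical names, the default price is made up for illustration, and the use of 403 for a blocked crawler follows the status code this article mentions for blocked requests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlResponse:
    status: int
    price_usd: Optional[float] = None  # hypothetical pricing field, set only on 402


def respond(rule: str, has_payment_intent: bool, price_usd: float = 0.01) -> CrawlResponse:
    """Model the outcome of one crawler request under a per-crawler rule.

    `rule` is one of "allow", "charge", or "block".
    """
    if rule == "allow":
        return CrawlResponse(200)  # free access
    if rule == "block":
        return CrawlResponse(403)  # denied, with no option to pay
    if rule == "charge":
        if has_payment_intent:
            return CrawlResponse(200)  # payment intent presented, content served
        return CrawlResponse(402, price_usd=price_usd)  # Payment Required plus pricing
    raise ValueError(f"unknown rule: {rule!r}")
```

The key property is that a 402 is not a dead end: it tells the crawler what access would cost, whereas a blocked crawler gets no offer at all.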

Commentary notes that Cloudflare is “digging its heels in” on AI crawl access, signaling that these defaults and monetization tools could reshape how much AI vendors must pay for training data at scale.[1][3]

For AI companies, this raises the cost and complexity of assembling training and inference corpora.[1][4]

For site owners, it creates leverage—but also new work: deciding which AI behaviors (Search, Agent, Training) to block, allow, or bill.[4]

💼 Key takeaway: Your crawl policy is now also a pricing strategy for LLM vendors.[4]


3. Practical implications and configuration checklist

Before September 15, 2026, review your AI bot policies—especially if you run ads or depend on AI-driven discovery.[2]

Otherwise, new domains may default to blocking Training and Agent crawlers on ad pages in ways that conflict with how you expect assistants like ChatGPT, Claude, or Perplexity to surface your content.[2][4]

Separate AI training from SEO:[2][4]

  • Search-classified bots stay allowed by default
  • Training and mixed-purpose crawlers can be blocked or later charged

This preserves visibility in traditional search while limiting free model training on your work.

📊 Operational checklist:[2][4]

  • Audit “Block AI bots” and AI bot policies in Cloudflare Security settings
  • Decide per behavior type: Block, Block on ad pages, or Allow
  • Document which crawlers matter for your business (AI search vs full-web LLMs)

Expect edge cases. A Cloudflare community user reported ClaudeBot and GPTBot being blocked on a custom domain even with “Block AI bots” set to allow and AI Crawl Control configured permissively, while the platform subdomain worked normally.[5]

Zone-level configuration changes were required, showing how managed rules or origin behavior can override your intent.[5]

⚠️ Key point: Never assume a toggle guarantees access—test it like any production change.[5]

Test your site as an AI crawler would:[4][5]

  • Use curl or HTTP clients with known AI user agents
  • Hit both main domains and subdomains
  • Log status codes, especially 403 (blocked) and 402 (payment required)
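The test steps above can be scripted with the Python standard library. The sketch below is a starting point rather than a complete test harness: the user-agent strings are simplified placeholders (real crawlers publish longer strings), the URLs in the usage comment are examples, and the helper names are hypothetical. Because crawlers are judged by more than their user agent string, a spoofed request may not be treated exactly like the real crawler.

```python
import urllib.error
import urllib.request

# Simplified placeholder user agents; substitute the strings each vendor documents.
USER_AGENTS = {"GPTBot": "GPTBot", "ClaudeBot": "ClaudeBot"}


def interpret(status: int) -> str:
    """Map a status code to the outcomes this article cares about."""
    if status == 200:
        return "allowed"
    if status == 402:
        return "payment required"
    if status == 403:
        return "blocked"
    return f"other ({status})"


def probe(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Send one GET request with the given User-Agent and return the status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code is still the result we want to log.
        return err.code


def run_checks(urls, user_agents, probe_fn=probe):
    """Probe every URL with every user agent and collect the results."""
    results = []
    for url in urls:
        for name, user_agent in user_agents.items():
            status = probe_fn(url, user_agent)
            results.append((url, name, status, interpret(status)))
    return results


# Example usage (performs real network requests):
# for row in run_checks(["https://example.com/", "https://blog.example.com/"], USER_AGENTS):
#     print(row)
```

Running the same check against the main domain and each subdomain makes it easy to spot the kind of mismatch described in the community report above.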

Then align AI access with business goals:[4]

  • High-value, ad-supported, or premium content: consider Block or pay-per-crawl
  • Docs, FAQs, and marketing pages: allow Search and selected Agent bots to maximize LLM visibility

💡 Key takeaway: Treat AI bots as another audience segment—some you court, some you meter, some you exclude.[4]


Conclusion: From assumed access to negotiated terms

Cloudflare’s shift toward blocking many AI crawlers—especially on ad-supported pages—moves the web from automatic AI access to negotiated terms.[1][2]

By understanding the behavior buckets, 2026 defaults, and pay-per-crawl options, you can choose where to block, where to allow, and where to charge, instead of letting AI vendors unilaterally set the rules.[2][4]

Audit your Cloudflare AI bot settings, test how leading AI crawlers see your site, and define a clear policy—block, allow, or monetize—that matches your SEO, brand, and revenue strategy.[1][4]

Frequently Asked Questions

What exactly will change on September 15, 2026?

Cloudflare will apply default AI crawl rules to new domains created after September 15, 2026: Search-classified crawlers are allowed by default, while Training and Agent crawlers are blocked on ad-detected pages unless you override those settings. These defaults are applied at the zone level and can be configured to Block, Block on pages with ads, or Allow for each behavior type, and mixed-purpose crawlers are treated as Training when Training is blocked. The change means many ad-supported articles and media pages will be inaccessible to assistant-style and model-training crawlers by default, so site owners must proactively set policies to control visibility and monetization.

How does Cloudflare’s pay-per-crawl (HTTP 402) work?

Cloudflare’s pay-per-crawl sends a 402 Payment Required response to crawlers that do not present valid payment intent, with pricing metadata included; crawlers that support payment intent receive a 200 and the content. Domain owners can set per-crawler rules to Allow (free), Charge (require per-request payment), or Block (deny outright).

What immediate actions should site owners take?

Audit your Cloudflare AI bot and Security settings now, especially for ad-supported zones, and test access from both main domains and subdomains using curl or HTTP clients spoofing known AI user agents. Document which crawlers you want to allow, block, or charge, and implement zone-level rules and monitoring to confirm behavior matches your revenue and SEO objectives.

Sources & References (10)

Generated by CoreProse in 1m 59s