Cloudflare breaks down AI crawler permissions: search, agent, and training are no longer one-size-fits-all

On July 1, 2026, Cloudflare announced new AI traffic management options in a post on its official blog, "Your site, your rules: new AI traffic options for all customers." Website administrators can now handle search, agent, and training crawlers separately, and are no longer limited to "allow all" or "block all."

The new options are available to all Cloudflare customers, including those on the free plan. Websites can manage search indexing, real-time visits made on behalf of users, and model training separately.

The three types of crawlers deliver very different value

Search refers to crawling that builds an index for later queries; websites typically expect impressions and return traffic from it. Agent refers to automated access on behalf of a real user, visiting pages to complete that user's current task; how this differs from an ordinary chatbot is covered in this site's comparison of AI agents, chatbots, and workflows. Training refers to using content to train or fine-tune models, so the data is absorbed into the model's capabilities over the long term.

This classification is closer to webmasters' real interests than the question of "is it an AI bot." A website can retain search visibility while rejecting model training. It can also allow agents to perform tasks for users without opening the door to all automated traffic.
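The per-category idea can be sketched as a small data model. This is only an illustration: `Category`, `SitePolicy`, and `allows` are hypothetical names chosen for this example and are not part of any Cloudflare API.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The three crawler categories described in the article."""
    SEARCH = "search"
    AGENT = "agent"
    TRAINING = "training"


@dataclass(frozen=True)
class SitePolicy:
    """Hypothetical per-category switches; True means the category is allowed."""
    search: bool
    agent: bool
    training: bool

    def allows(self, category: Category) -> bool:
        # Look up the switch whose field name matches the category value.
        return getattr(self, category.value)


# Keep search visibility and agent access, but refuse model training.
policy = SitePolicy(search=True, agent=True, training=False)
```

With a model like this, each category is an independent decision, which is the difference from a single "block all AI bots" switch.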

From September 15, the default rules for new domains will change

Cloudflare plans to enable new default values for newly onboarded domains starting September 15, 2026: on pages that display ads, Training and Agent are blocked by default, while Search is allowed by default. Existing customers can confirm or change their choices in advance in the security settings.

Multipurpose crawlers deserve special attention. If the same crawler is used for both Search and Training, the system applies the stricter of the two rules. Blocking Training outright may therefore also reduce search exposure through such crawlers, so it is not advisable to enable blocks in bulk before checking the category details.
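The "stricter rule wins" behavior for multipurpose crawlers can be illustrated with a minimal sketch. The article does not describe how Cloudflare implements this, so the `Action` enum, the `DEFAULTS` table, and the `resolve` function below are assumptions made purely for illustration; the default values mirror the ones described above for ad-displaying pages on new domains.

```python
from enum import Enum


class Action(Enum):
    ALLOW = 0
    BLOCK = 1  # a higher value means a stricter action


# Assumed defaults, taken from the article's description of ad-displaying pages.
DEFAULTS = {
    "search": Action.ALLOW,
    "agent": Action.BLOCK,
    "training": Action.BLOCK,
}


def resolve(purposes, rules=DEFAULTS):
    """Return the strictest action among the rules for a crawler's purposes."""
    if not purposes:
        raise ValueError("crawler must declare at least one purpose")
    return max((rules[p] for p in purposes), key=lambda action: action.value)


search_only = resolve({"search"})
search_and_training = resolve({"search", "training"})
```

Under these assumed defaults, a search-only crawler is allowed, while a crawler that serves both Search and Training is blocked. That is why checking each crawler's categories before blocking Training matters for search exposure.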

What three things should webmasters do first?

  1. Check which pages rely on search for traffic and which contain paid, advertising, or high-value original content.
  2. Set Search, Agent, and Training separately rather than sticking to the old all-or-nothing blocking approach.
  3. After making adjustments, keep monitoring search indexing, referral traffic, and crawler visits, then tighten the rules based on actual impact.
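For the monitoring step, one simple approach is to tally crawler requests by category and outcome before and after a rule change. The sketch below assumes request records have already been exported as dictionaries; the field names `category` and `action` are hypothetical and would need to match whatever your log export actually provides.

```python
from collections import Counter


def tally(records):
    """Count requests per (category, action) pair."""
    return Counter((r["category"], r["action"]) for r in records)


# Hypothetical sample records; real field names depend on your log export.
sample = [
    {"category": "search", "action": "allow"},
    {"category": "training", "action": "block"},
    {"category": "search", "action": "allow"},
]
counts = tally(sample)
```

Comparing these counts across time windows gives a rough view of whether a tightened rule changed crawler traffic in the way you intended.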

Cloudflare is also expanding BotBase and content usage signals, allowing websites to distinguish between "real-time interaction," "indexing and reference," and "full summaries or copying." This continues the thread from this site's earlier coverage of Cloudflare AI Index content control: whether content can be used by AI is shifting from a vague default to a decision that site owners can categorize, express, and adjust.
