robots.txt
robots.txt block (14 of 14 blocked)
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: Diffbot
Disallow: /
Choose allow or block for GPTBot, ClaudeBot, Google-Extended, PerplexityBot and other AI bots.
The tool generates a clean robots.txt block with the correct User-agent and Disallow or Allow lines for each choice.
Copy the block into the robots.txt at your domain root, then re-check it to confirm the rules are live.
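The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual code: the function name `build_robots_block` and the `choices` mapping are hypothetical, and the sketch assumes each bot gets a single site-wide `Allow: /` or `Disallow: /` rule.

```python
# Hypothetical helper: illustrates the generator's idea, not its real code.
def build_robots_block(choices: dict[str, bool]) -> str:
    """Build a robots.txt block from per-bot choices.

    `choices` maps a user-agent token to True (allow) or False (block).
    """
    groups = []
    for agent, allowed in choices.items():
        directive = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{directive}")
    # A blank line separates one User-agent group from the next.
    return "\n\n".join(groups) + "\n"


# Example choices (assumed for illustration): block two bots, allow one.
choices = {
    "GPTBot": False,
    "PerplexityBot": True,
    "Google-Extended": False,
}
print(build_robots_block(choices))
```

The printed text can be pasted into robots.txt as is. Keeping one group per bot makes it easy to flip a single choice later without touching the others.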
The tool builds a copy-paste robots.txt block where you decide, per AI crawler, whether to allow or block access. Toggle bots like GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, Google-Extended, PerplexityBot, CCBot, Bytespider and Amazonbot, and the tool writes the matching User-agent and Disallow or Allow rules. Everything runs in your browser; nothing is uploaded.
The tool covers the common training and answer-engine crawlers: OpenAI's GPTBot, ChatGPT-User and OAI-SearchBot, Anthropic's ClaudeBot, anthropic-ai and Claude-Web, Google-Extended (Gemini / Vertex training), PerplexityBot, Common Crawl's CCBot, ByteDance's Bytespider, Amazonbot, Applebot-Extended, Meta's FacebookBot and Diffbot. Each has its own toggle so you can allow some and block others.
It depends on your goals. Allowing answer-engine crawlers like OAI-SearchBot and PerplexityBot can help your content surface and be cited in AI answers, driving referral traffic. Blocking crawlers protects content you do not want used for training or summarization. Many sites allow answer-engine bots that drive citations while blocking pure training crawlers, and the toggles let you take that nuanced position.
Not entirely. Blocking GPTBot stops OpenAI from crawling your site for model training, but content already in the model or surfaced via other sources may still appear. OpenAI uses separate agents for retrieval: OAI-SearchBot crawls for ChatGPT search results, and ChatGPT-User fetches a page when a user asks ChatGPT to, so block those as well if you want to stop search and on-demand fetching too.
Add the rules to the robots.txt file at the root of your domain (https://yourdomain.com/robots.txt). You can paste them alongside your existing rules — just avoid duplicate User-agent groups for the same bot. After deploying, re-fetch your robots.txt to confirm the new rules are live and reachable.
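Before deploying a merged file, it can help to check it for duplicate User-agent groups and to confirm that the rules behave as intended. The sketch below uses Python's standard `urllib.robotparser`. The helper names `duplicate_agents` and `is_allowed`, the sample `merged` content, and the `/admin/` rule are assumptions made for illustration.

```python
from collections import Counter
from urllib.robotparser import RobotFileParser


def duplicate_agents(robots_txt: str) -> list[str]:
    """Return user-agent tokens named on more than one User-agent line."""
    agents = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            agents.append(value.strip().lower())
    counts = Counter(agents)
    return sorted(agent for agent, n in counts.items() if n > 1)


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical merged file: an existing rule plus two pasted AI-bot groups.
merged = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

print(duplicate_agents(merged))
print(is_allowed(merged, "GPTBot", "https://yourdomain.com/page"))
print(is_allowed(merged, "Googlebot", "https://yourdomain.com/page"))
```

To check the deployed file rather than a local string, `RobotFileParser` also provides `set_url()` and `read()`, which fetch robots.txt over the network before `can_fetch()` is called.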
Major, well-behaved crawlers such as GPTBot, ClaudeBot, Google-Extended and PerplexityBot publicly commit to honoring robots.txt directives. However, robots.txt is a request, not an enforcement mechanism, so less reputable scrapers may ignore it. For stronger control, combine robots.txt with server-side blocking by user agent or IP.
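Server-side blocking by user agent could look roughly like the following WSGI middleware. It is a minimal sketch under stated assumptions: the `BLOCKED_TOKENS` list and the names `is_blocked` and `block_ai_bots` are hypothetical, and matching is a simple case-insensitive substring test on the User-Agent header.

```python
# Hypothetical blocklist: adjust to the bots you actually want to refuse.
BLOCKED_TOKENS = ("gptbot", "ccbot", "bytespider")


def is_blocked(user_agent: str, tokens=BLOCKED_TOKENS) -> bool:
    """Return True if the User-Agent header contains a blocked token."""
    ua = user_agent.lower()
    return any(token in ua for token in tokens)


def block_ai_bots(app):
    """Wrap a WSGI app and answer 403 Forbidden for blocked user agents."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else reaches the wrapped application unchanged.
        return app(environ, start_response)
    return middleware
```

A User-Agent header can be spoofed, so this only stops scrapers that identify themselves honestly. That is why IP-based blocking is the stronger complement for scrapers that do not.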
No. Google-Extended only governs Gemini and Vertex AI training data, not Googlebot, which handles Search indexing. Blocking AI-specific bots in this tool does not touch Googlebot, so your normal search visibility is unaffected. To manage classic search crawling, edit the Googlebot rules separately.