robots.txt
A file at the website root that tells search engine bots and AI crawlers which pages they are allowed or forbidden to access
What is robots.txt?
robots.txt is a plain-text file placed at a website's root directory (e.g. https://example.com/robots.txt) that tells search engine bots and AI crawlers which URLs they may crawl and which they should not access.
Relationship to AI crawlers
AI services such as ChatGPT, Claude, and Perplexity operate their own crawlers. Blocking these crawlers in robots.txt excludes the site from that AI system's crawling, indexing, and training pipelines, lowering the likelihood that the brand is mentioned in AI-generated answers.
| AI Crawler | User-agent identifier |
|---|---|
| ChatGPT | GPTBot |
| Claude (Anthropic) | ClaudeBot |
| Perplexity | PerplexityBot |
| Google AI | Google-Extended |
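Conversely, a site that wants to opt a specific AI crawler out can block it with a Disallow rule. For example, to exclude GPTBot from the entire site (shown here only as an illustration of the blocking syntax, not as a recommendation):

User-agent: GPTBot
Disallow: /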
Role in AIVS
trensee AIVS evaluates GPTBot, ClaudeBot, and PerplexityBot access independently within the AI Infra pillar (3 points each, 9 points maximum). All three must be allowed to score the maximum.
Recommended configuration
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
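A quick way to verify that a robots.txt actually grants access to the crawlers above is Python's standard urllib.robotparser. The sketch below parses an inline copy of the recommended configuration rather than fetching it from a live site; example.com is a placeholder URL.

```python
# Check whether each AI crawler user-agent may fetch the site root,
# given the recommended robots.txt configuration above.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for bot in ("GPTBot", "ClaudeBot", "PerplexityBot"):
    allowed = parser.can_fetch(bot, "https://example.com/")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

To audit a live site instead, call parser.set_url("https://example.com/robots.txt") followed by parser.read() in place of parser.parse(...).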