How to Block OpenAI’s Web Crawler from Crawling Your Site
Just as you can use robots.txt to block search engines and other bots from crawling your site, you can do the same with OpenAI's GPTBot.
GPTBot is OpenAI’s web crawler and can be identified by the following user agent token and full user-agent string.
User agent token: GPTBot
Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)
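As a rough illustration of how the token might be used on the server side, the Python sketch below checks whether a request's User-Agent header value contains it. The function name is_gptbot, the case-insensitive match, and the sample browser string are assumptions made for this example. Any client can send an arbitrary User-Agent header, so a match is a hint rather than proof.

```python
GPTBOT_TOKEN = "GPTBot"


def is_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent value contains the GPTBot token.

    Hypothetical helper; the case-insensitive comparison is an assumption.
    """
    return GPTBOT_TOKEN.lower() in user_agent.lower()


# Full user-agent string as given in the article.
full_ua = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
    "compatible; GPTBot/1.0; +https://openai.com/gptbot)"
)
# Illustrative browser string, made up for this example.
browser_ua = "Mozilla/5.0 (X11; Linux x86_64) Firefox/121.0"

print(is_gptbot(full_ua))
print(is_gptbot(browser_ua))
```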
To block GPTBot from crawling your site entirely, you can add the following to your robots.txt file:
User-agent: GPTBot
Disallow: /
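One way to sanity-check a rule like this before deploying it is to feed it to Python's standard-library urllib.robotparser. This is a minimal sketch: the helper is_allowed, the example.com URL, and the SomeOtherBot agent name are placeholders chosen for illustration, and a real crawler may interpret robots.txt slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The rule from the article, one directive per line.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: parse robots.txt text and test a single URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The short token is passed here rather than the full user-agent string.
gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/any/page")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/any/page")

print("GPTBot allowed:", gptbot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Under this parser, the rule should affect only GPTBot; agents with no matching group remain allowed by default.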
If you want to allow GPTBot to crawl certain areas of your site but block it from others, you can do this:
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
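The partial rules can be checked the same way. The sketch below again uses Python's urllib.robotparser; the example.com host and the page.html paths are made up for illustration, and the result reflects how this particular parser reads the rules.

```python
from urllib.robotparser import RobotFileParser

# The partial-access rules from the article, one directive per line.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths, one under each directory named in the rules.
paths = ["/directory-1/page.html", "/directory-2/page.html"]
results = {
    path: parser.can_fetch("GPTBot", "https://example.com" + path)
    for path in paths
}

for path, allowed in results.items():
    print(f"{path}: {'allowed' if allowed else 'blocked'}")
```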