How to Block OpenAI’s Web Crawler from Crawling Your Site

Roger Stringer

August 10, 2023


Just as you can use robots.txt to block search engines and various bots from crawling your site, you can do the same with OpenAI's GPTBot.

GPTBot is OpenAI’s web crawler. It can be identified by the following user agent token and full user-agent string.

User agent token: GPTBot
Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)
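If you want to spot GPTBot in your own server logs or request handlers, a simple substring check on the User-Agent header is usually enough. Here's a minimal sketch in Python; `is_gptbot` is a hypothetical helper name, not something OpenAI provides, and keep in mind that any client can spoof its user agent, so treat this as a hint rather than proof.

```python
def is_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the GPTBot token.

    Hypothetical helper for illustration: a case-insensitive substring match.
    """
    return "gptbot" in user_agent.lower()


# The full user-agent string quoted above
full_ua = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
    "compatible; GPTBot/1.0; +https://openai.com/gptbot)"
)

print(is_gptbot(full_ua))
print(is_gptbot("Mozilla/5.0 (X11; Linux x86_64)"))
```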

To block GPTBot from crawling your site entirely, you can add the following to your robots.txt file:

User-agent: GPTBot
Disallow: /
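
If you'd like to sanity-check a rule like this before deploying it, Python's standard library includes `urllib.robotparser`. The sketch below parses the rules from a string instead of fetching them; `can_crawl` and the `example.com` URLs are placeholders I made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above, held in a string for testing
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Pass the user agent token, not the full user-agent string
print(can_crawl(ROBOTS_TXT, "GPTBot", "https://example.com/any/page"))
print(can_crawl(ROBOTS_TXT, "SomeOtherBot", "https://example.com/any/page"))
```

Run as-is, this should report that GPTBot is blocked everywhere while an agent with no matching rule is still allowed.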

If you want to allow GPTBot to crawl certain areas of your site but block it from others, you can do this:

User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
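
The partial rules can be checked the same way with Python's `urllib.robotparser`. This is only a sketch: `gptbot_allowed` is a hypothetical helper and the page paths are invented for the test. Note that a path matching neither rule is treated as allowed by this parser.

```python
from urllib.robotparser import RobotFileParser

# The partial-access rules from above
PARTIAL_RULES = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""

rules = RobotFileParser()
rules.parse(PARTIAL_RULES.splitlines())


def gptbot_allowed(path: str) -> bool:
    """Report whether the GPTBot token may fetch the given path."""
    return rules.can_fetch("GPTBot", path)


# Made-up page paths, one per case
for path in ("/directory-1/page.html", "/directory-2/page.html", "/elsewhere/"):
    print(path, gptbot_allowed(path))
```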
