With the rapid growth of Artificial Intelligence (AI) and tools like ChatGPT becoming popular, it’s important to protect your website from harmful AI bots. These bots are constantly scanning websites to gather information, and there are new ones popping up all the time. In this article, I’ll show you how to stop AI robots from stealing your website content.
Some bots, like search engine bots, help your site get noticed. But others can cause problems. Bad bots might steal your content, take your data, or slow down your site.
If you run a business, this can hurt your ability to compete, damage your reputation, and even cost you money.
What’s an AI Robot?
How to Stop AI Robots From Stealing Your Website Content
Blocking AI bots involves editing the robots.txt file for your website.
A robots.txt file is a small text file that tells search engine bots (like Google’s crawlers) which pages or sections of a website they are allowed or not allowed to visit. It helps website owners control how their site is crawled and indexed.
However, like a sign on a property, it doesn’t physically block access; it only states your instructions. Responsible and ethical bots, like those used by search engines, will respect your wishes and avoid crawling the areas you’ve restricted. On the other hand, less trustworthy or malicious bots might ignore the robots.txt file entirely and continue to access your site.
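To make the "sign on a property" idea concrete, here is a minimal sketch of how a well-behaved bot might read your rules before crawling. It uses Python's built-in urllib.robotparser module. The sample rules, the is_allowed helper name, and the example.com address are my own assumptions for illustration; real crawlers use their own code, and a bad bot simply skips this check.

```python
from urllib.robotparser import RobotFileParser

# Sample rules (an assumption for this sketch): block GPTBot everywhere,
# and leave every other bot unrestricted.
RULES = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant bot named `user_agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/my-post/"  # placeholder address
    print("GPTBot allowed:", is_allowed(RULES, "GPTBot", page))
    print("Googlebot allowed:", is_allowed(RULES, "Googlebot", page))
```

Nothing in this check is enforced by your server. It only shows what a bot that chooses to follow robots.txt would conclude.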
Here’s what a robots.txt file looks like:
https://www.ChristinaHills.com/robots.txt
Now, there are two options for editing or creating a robots.txt file to block bots…
Option 1: Using A WordPress SEO Plugin (Recommended Method)
A few SEO plugins (Yoast SEO, Rank Math, AIOSEO) will allow you to edit the robots.txt file from within the WordPress dashboard.
(BEFORE DOING THIS, MAKE SURE YOU ARE COMFORTABLE WITH EDITING CODE)
1. Yoast SEO by Team Yoast
The Yoast SEO Plugin offers an intuitive interface for managing your robots.txt file within its dashboard. Users can easily edit the robots.txt file to control search engine crawling and indexing, making it straightforward to block or allow specific parts of your site.
- Click on “Yoast SEO” in your WordPress admin menu to expand it.
- Click on “Tools.”
- Click on the “File Editor” link.
- On the next screen, you’ll see the “robots.txt” at the top. Here, you can edit your existing robots.txt file (or create a new one by clicking on the “Create robots.txt file” button).
- Add these lines (without deleting the existing content), for example, to block ChatGPT:
User-agent: GPTBot
Disallow: /
- See the section below called “What to Add to the Robots.txt File” to get a full list of bots to block.
- Once done, click on the “Save changes to robots.txt” button.
2. Rank Math SEO by Rank Math
The Rank Math Plugin provides a dedicated section under its SEO settings where users can modify the robots.txt file. This feature allows users to tailor their crawling directives, enabling or disabling access to specific parts of their website for search engine bots.
- Click on “Rank Math SEO” in your WordPress admin menu to expand it.
- Click on “Dashboard,” then click “Advanced mode” in the upper right to get more options.
- Then click on “General Settings.”
- Click the “Edit robots.txt” tab. (If you don’t already have a robots.txt file, Rank Math SEO will automatically add it to your site with some default content.)
- Add these lines (without deleting the existing content), for example, to block ChatGPT:
User-agent: GPTBot
Disallow: /
- To get a full list of bots to block, see the section titled “What to Add to the Robots.txt File.”
- Once done, click on the “Save changes to robots.txt” button.
3. All-in-One SEO by the All-in-One SEO Team
The All in One SEO Plugin and tool kit includes a robots.txt editor in its suite of SEO tools. It allows users to create and modify the robots.txt file directly from the plugin’s settings, providing a convenient way to manage how search engines interact with the site’s content.
- Click “All in One SEO” in your WordPress admin menu to expand it.
- Click on “Tools.”
- Under the “Robots.txt Editor” tab, turn on the “Enable Custom Robots.txt” option to open the custom robots.txt rule builder.
- Use the rule builder to enter your desired info. For example, use these to block ChatGPT:
User Agent: GPTBot
Directive: Disallow
Value: /
- Review the Custom Robots.txt Preview right below the builder to make sure the rules are displayed correctly.
- See the section below called “What to Add to the Robots.txt File” to get a full list of bots to block.
- Once done, click on the “Save changes to robots.txt” button.
What to Add to the Robots.txt File
To prevent AI bots from scraping your website content, you need to edit the robots.txt file and add the following:
# This code will block AI robots from scraping your site

# CCBot = CommonCrawl (used to train AI bots)
User-agent: CCBot
Disallow: /

# GPTBot = ChatGPT
User-agent: GPTBot
Disallow: /

# ChatGPT-User = ChatGPT Plugins
User-agent: ChatGPT-User
Disallow: /

# Google-Extended = Google Bard (now Gemini)
User-agent: Google-Extended
Disallow: /

# Apple bot
User-agent: Applebot-Extended
Disallow: /

# Used with Claude
User-agent: anthropic-ai
Disallow: /

User-agent: ClaudeBot
Disallow: /

# Omgili and Omgilibot are from webz.io
User-agent: Omgilibot
Disallow: /

User-agent: Omgili
Disallow: /

# From Facebook and Meta
User-agent: FacebookBot
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: Meta-ExternalFetcher
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-ai
Disallow: /
A good article that discusses this topic more is here: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/
Here is an example of the robots.txt file from the Wall Street Journal. If you scroll down to the bottom, you can see where they have blocked these user agents (bots).
https://www.wsj.com/robots.txt
By adding these lines, you are requesting that the following bots NOT crawl your site. OpenAI and Google typically adhere to robots.txt protocols, but not all AI bots will follow them.
Google-Extended = Google Bard (now Gemini)
ChatGPT-User = ChatGPT Plugins
GPTBot = ChatGPT
CCBot = CommonCrawl (used to train AI bots)
Note that these AI bots change a lot, so you will need to update your robots.txt file as new AI tools come out.
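Because the list keeps changing, one option is to keep the bot names in a single list and regenerate the rules whenever you add a new one. The sketch below is only an illustration of that idea: the AI_BOTS list and the build_block_rules name are my own choices, not part of any plugin or standard tool.

```python
# Bot names to block. Edit this list as new AI crawlers appear.
AI_BOTS = [
    "CCBot",
    "GPTBot",
    "ChatGPT-User",
    "Google-Extended",
    "Applebot-Extended",
    "anthropic-ai",
    "ClaudeBot",
    "Omgilibot",
    "Omgili",
    "FacebookBot",
    "Meta-ExternalAgent",
    "Meta-ExternalFetcher",
    "Bytespider",
    "PerplexityBot",
    "Perplexity-ai",
]


def build_block_rules(bots):
    """Return robots.txt text asking each listed bot not to crawl the site."""
    # One "User-agent / Disallow" group per bot, separated by blank lines.
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in bots]
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Print the rules so they can be pasted into a robots.txt editor.
    print(build_block_rules(AI_BOTS))
```

You can paste the printed text into your SEO plugin's robots.txt editor, below your existing rules.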
Option 2: From the cPanel Inside Your Hosting Account (More Techie)
There’s also a way to update the robots.txt file in your hosting account’s cPanel. Please note: this method is more technical and requires you to be comfortable working with server files in your hosting account. We don’t recommend it for anyone who isn’t comfortable using cPanel.
Each hosting company’s dashboard may look slightly different, but here’s the general process.
Once you’re logged in to your hosting account, there’s usually a link or button that says “cPanel” or “Control Panel” in your account dashboard. Clicking that link will take you directly to your cPanel dashboard.
How to update an existing robots.txt file:
- Access your cPanel.
- Go to Files – File Manager.
- Locate your robots.txt file and select it by clicking on the file name once. (Typically, you can find it in the “public_html” folder, but the location depends on how your website is set up.)
- In the top menu, click on “Download” to download the current file to your computer as a backup, just in case.
- While the robots.txt file is still selected, click on Edit in the top menu. This will let you edit the robots.txt file. Add these lines (without deleting the existing content), for example, to block ChatGPT:
User-agent: GPTBot
Disallow: /
- See the section above called “What to Add to the Robots.txt File” to get a full list of bots to block.
- Once done, click on the “Save Changes” button and close the window.
If you are using an FTP tool to access your website files, you can do the same by downloading the current robots.txt file, adding the desired lines to the file, and uploading the updated robots.txt file back to your site.
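If you work with a downloaded copy of the file, the backup-then-add routine above can be scripted. This is a rough sketch under my own assumptions: the append_block_rules name and the ".bak" backup suffix are made up for illustration, and the script only looks at bot names, so it will not touch a bot that is already listed, even if its existing rule is different from "Disallow: /".

```python
import shutil
from pathlib import Path


def append_block_rules(robots_path, bots):
    """Back up robots.txt, then append a Disallow group for each bot not
    already named in the file. Returns the list of bots that were added."""
    path = Path(robots_path)
    original = path.read_text(encoding="utf-8")

    # Collect the user agents already named in the file (case-insensitive),
    # ignoring anything after a "#" comment marker.
    existing = set()
    for line in original.splitlines():
        key, _, value = line.split("#", 1)[0].partition(":")
        if key.strip().lower() == "user-agent":
            existing.add(value.strip().lower())

    missing = [bot for bot in bots if bot.lower() not in existing]
    if not missing:
        return []

    # Keep a copy of the original file, like the manual "Download" backup step.
    shutil.copy2(path, path.with_name(path.name + ".bak"))

    # Existing content is kept; new groups go at the end after a blank line.
    addition = "\n\n".join(f"User-agent: {bot}\nDisallow: /" for bot in missing)
    base = original.rstrip("\n")
    new_text = (base + "\n\n" if base else "") + addition + "\n"
    path.write_text(new_text, encoding="utf-8")
    return missing
```

For example, append_block_rules("robots.txt", ["GPTBot", "CCBot"]) would add only the bots that are not in the file yet. You would still need to upload the updated file back to your site.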
How to create a robots.txt file:
If you don’t already have a robots.txt file, you can create one:
- Create a document called “robots.txt” in your text editor (Notepad, TextEdit, etc.).
- Add these lines, for example, to block ChatGPT:
User-agent: GPTBot
Disallow: /
- Save.
- Access your cPanel.
- Go to Files – File Manager.
- Go to the folder where you want to upload the robots.txt file. Typically, this is the “public_html” folder, but the location depends on how your website is set up. The file needs to end up in your site’s root folder so that it loads at yourdomain.com/robots.txt.
- In the top menu, click on Upload and upload the robots.txt file you just created.
If you are using an FTP tool to access your website files, you can do the same by uploading the new robots.txt file to the appropriate folder within your site.
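Before uploading a new file, it can help to double-check that it covers every bot you meant to block. Here is a small sketch that reads the text of your draft and lists any bots that would still be allowed. It relies on Python's built-in urllib.robotparser module, and the still_allowed name is my own invention. Keep in mind this only tests how a rule-following bot would read the file.

```python
from urllib.robotparser import RobotFileParser


def still_allowed(robots_text, bots, path="/"):
    """Return the bots from `bots` that the draft rules would still
    let fetch `path` (the site root by default)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [bot for bot in bots if parser.can_fetch(bot, path)]


if __name__ == "__main__":
    # A draft that only blocks GPTBot so far.
    draft = "User-agent: GPTBot\nDisallow: /\n"
    print(still_allowed(draft, ["GPTBot", "CCBot", "ClaudeBot"]))
```

Any name that shows up in the result is a bot your draft does not block yet, so you can add it before uploading.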
Final Thoughts…
More AI robots are crawling the web than ever before, and new ones are popping up all the time. Most are helpful, but some can be out to cause damage. That’s why it’s important to take steps to stop AI robots from stealing your website content. However, robots.txt works on the honor system: it relies on bots choosing to respect your wishes. This is not guaranteed!
Using a plugin is a quick and user-friendly way to update your robots.txt file. This approach is especially helpful for those who may not be familiar with navigating their hosting cPanel.
However, if you’re comfortable working with your hosting’s cPanel, you can directly edit your robots.txt file using Option 2. This method gives you more control and is a great way to ensure that your changes are implemented exactly as you need them.
For more information on how to keep your website safe from other security threats, read my article on How to Keep Your Site Safe from Spammers and Hackers.
You’ve worked hard on your website, so protect it!
DISCLAIMER: We’ve done our best to research and put together this article. However, we cannot guarantee that the techniques we share will work for everyone and in all instances. Please do your own due diligence before implementing anything mentioned in this article.