3

How can I prevent a dedicated image folder from being crawled or indexed by search bots, such as Google Search, Google Images, or Bing's web and image bots?

I would like to name that folder /donotindex/. It will hold all the images I upload for users' profile pictures and other purposes. Also, what's the easiest way to prevent anyone from browsing the files in this folder?

Stephen Ostermiller
tvhex

3 Answers

5

In your robots.txt file, assuming this folder is at the root, include the following:

User-agent: *
Disallow: /donotindex/
...

The ellipsis on line 3 simply stands in for any other Disallow or Allow directives for all bots. If none exist, omit it.

This tells all mainstream bots not to crawl the folder, so they won't fetch or index what's in it. It doesn't guarantee your content stays out of the index, though; for example, if a picture's URL is copied and linked to from another page, the URL might still get discovered. Plus, not all bots are polite enough to obey robots.txt directives.
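As a quick sanity check of the rule above, here is a minimal sketch using Python's standard-library urllib.robotparser. The example.com URLs and the is_allowed helper are assumptions made up for illustration, and the check only shows how a compliant parser reads the rule; it says nothing about bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The same two directives as in the answer above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /donotindex/
"""


def is_allowed(url, user_agent="*"):
    """Return True if a robots.txt-compliant bot may fetch the URL (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not a real site layout.
    for url in (
        "https://example.com/donotindex/avatar.png",
        "https://example.com/images/logo.png",
    ):
        print(url, "allowed" if is_allowed(url, "Googlebot") else "blocked")
```

Because the rule is declared for User-agent: *, any bot name without its own section in the file falls back to it.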

In WordPress, there are a few things you can do to prevent your images being linked to. This article describes the various techniques, mostly using the WPShield Content Protector plugin, which can help you do the following:

  • Disable Right Click and Dragging Images
  • Hide Image URLs
  • Disable Image Hotlinking

If you want your folder inside your web hosting panel to be accessible on the backend only to you (the admin) and not to anyone else who might be working on your website, check with your web host on how to restrict that folder access.

Stephen Ostermiller
Henry Visotski
1

The easiest way to prevent browsing of the folder is to add an index.html file to the directory, say a minimal valid HTML page with text saying "browsing this directory is not welcome".

If you have access to the server config files, I suggest you also add an X-Robots-Tag response header, so the images won't be indexed even if hotlinked, e.g.

<Directory "<path to donotindex>">
 Header set X-Robots-Tag "noimageindex"
</Directory>

As the folder is dedicated to images, that should suffice. To apply this to a folder that contains other content, you might want to add a FilesMatch section:

<Directory "<path to donotindex>">
 <FilesMatch "\.(jpg|png|gif|webp|avif|ico|bmp)$">
  Header set X-Robots-Tag "noimageindex"
 </FilesMatch>
</Directory>

and change the extensions to those in actual and permitted use.
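To see which file names the pattern above would actually cover, here is a small sketch that tries the same regular expression in Python. It is only an approximation: Apache evaluates FilesMatch with PCRE, and Python's re module is assumed to behave the same way for a pattern this simple. The gets_header helper and the sample file names are hypothetical.

```python
import re

# Same pattern as in the FilesMatch section above.
IMAGE_PATTERN = re.compile(r"\.(jpg|png|gif|webp|avif|ico|bmp)$")


def gets_header(filename):
    """Return True if the pattern matches, i.e. the header would be set (hypothetical helper)."""
    return IMAGE_PATTERN.search(filename) is not None


if __name__ == "__main__":
    # Sample names are made up for illustration.
    for name in ("avatar.png", "avatar.jpeg", "avatar.JPG", "notes.txt"):
        print(name, gets_header(name))
```

The pattern is case-sensitive and lists jpg but not jpeg, so names such as avatar.JPG or avatar.jpeg would not match until the extension list is adjusted.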

Uri Raz
0

Good point in this reply: https://webmasters.stackexchange.com/a/141437/134770. This is a pretty classic use case for the robots.txt file. There is a good learning reference here if you're not familiar with all aspects and uses of robots.txt: https://moz.com/learn/seo/robotstxt

The "robots.txt" file tells search engine bots which pages or directories on your site they may or may not crawl (as you may know).

robots.txt code

The "User-agent: *" line states that the rules that follow apply to all search engine bots. The "Disallow: /donotindex/" line then names the directory those bots should not crawl, which covers the "donotindex" folder and everything inside it.

John Paul