Security issues related to the robots.txt file

Did you ever blurt out "I didn't steal any cookies" and then wonder how your mom figured out that you had, in fact, been stealing cookies before dinner?

Some robots.txt security issues work the same way as the "cookies" example above: by voluntarily denying something, you reveal exactly the information you don't want others to have. Because the robots.txt file is readable by everyone, it should not be used to hide specific files or directories on your server.

For example, if you're trying to stop search engines from indexing a file named "list_of_my_passwords.txt" and a folder with sensitive information named "secrets_folder", adding their full names as follows should be avoided whenever possible:

Directory structure:
/list_of_my_passwords.txt
/secrets_folder/
 
robots.txt:
User-agent: *
Disallow: /list_of_my_passwords.txt
Disallow: /secrets_folder/
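
To illustrate why this is risky, here is a minimal Python sketch of how little effort it takes to pull every excluded path out of a robots.txt file. The helper name disallowed_paths is hypothetical, and the file content is simply the example above pasted in as a string rather than fetched from a real server.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return every non-empty Disallow value found in robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# The example robots.txt from above, as any visitor would see it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /list_of_my_passwords.txt
Disallow: /secrets_folder/
"""

print(disallowed_paths(ROBOTS_TXT))
# ['/list_of_my_passwords.txt', '/secrets_folder/']
```

Anyone, not only a well-behaved crawler, can run the same few lines against a public robots.txt file.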
 

Instead, move your sensitive files and directories into a subdirectory and exclude only that subdirectory. As in the following example, excluding a directory with a non-descriptive name such as "folder_a" is a better solution.

 
New directory structure:
/folder_a/list_of_my_passwords.txt
/folder_a/secrets_folder/
 
 
New robots.txt:
User-agent: *
Disallow: /folder_a/
 
If you're unable to reorganize your directory structure, yet have a strong need to exclude certain directories from indexes, use only partial names in the robots.txt file. A Disallow rule matches every path that begins with the given value, so a short prefix is enough. Although this is not the best solution, it at least makes the full directory names much harder to guess. For example, to exclude "secrets_folder" and "list_of_my_passwords.txt", use the following names (provided that no other files or directories in the web root start with those characters, since they would be excluded as well):
 
robots.txt:
User-agent: *
Disallow: /se
Disallow: /li
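
The prefix behaviour can be checked with Python's standard urllib.robotparser module. This is only a sketch under stated assumptions: the rules are the ones from the example above, and "/services.html" and "/index.html" are hypothetical pages added to show what else a short prefix catches.

```python
from urllib.robotparser import RobotFileParser

# Load the partial-name rules directly from a list of lines (no network access).
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /se",
    "Disallow: /li",
])

# can_fetch() returns False when a path is excluded for the given user agent.
for path in [
    "/secrets_folder/",
    "/list_of_my_passwords.txt",
    "/services.html",  # hypothetical page that also starts with "/se"
    "/index.html",     # hypothetical page that matches neither prefix
]:
    print(path, rules.can_fetch("*", path))
```

In this sketch only "/index.html" remains fetchable. The hypothetical "/services.html" is excluded along with the sensitive items, which is why the prefixes must not collide with content you do want indexed.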
 
Do not use the robots.txt file to protect or hide information.

Misunderstanding or misusing any technology or tool, even a paperclip, can be a security risk. Use the robots.txt file only for what it was intended to do: to suggest to robots how to index content on your web server. If you really want to protect data on your web servers, you must use other security measures, such as not using default pages for sensitive directories, disabling directory browsing, password protection, or even a firewall, depending on the desired level of security.


Copyright © 2006-2011 Website Designers R Us, a DOT Specialist Company. All rights reserved.