Controlling Indexing and Spidering of Online Files

--Contributed by Linda K. Lewis

NOTE:  These methods work only if the bot follows accepted standards and respects the META tags and the robots.txt file.

 
1.  To prevent caching of files, insert the following META tag in the <head> section of every HTML file:

   <meta name="robots" content="noarchive">

2.  To prevent indexing and spidering by TGN/MyFamily, insert the following META tag in
     every HTML file:

  <meta name="MyFamilyBot" content="noindex,nofollow">

3.  To prevent MyFamily from indexing your files, upload a robots.txt file to your root
      directory (where your index.html default welcome page is located).

     Using a plain-text editor (Notepad, Note Tab, etc.), create a text file called
     robots.txt containing the following text. If you use a word processor such as Word,
     save the file as plain text:
 

User-Agent: *
Disallow: /images/
Disallow: /includes/

User-Agent: MyFamilyBot
Disallow: /

Explanation:  Each entry starts with a User-Agent line naming the bot, followed by one or more Disallow lines. Keep the lines of an entry together, with no blank line between them, and separate entries with a blank line. List all the rules for a given bot in a single entry, since some bots read only the first entry that matches them.
The first entry tells ALL bots NOT to index the images folder or the includes folder.
The second entry tells MyFamilyBot (the ancestry bot) not to index anything at all on the website.
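
Before uploading, the rules can be checked locally. The following is a minimal sketch using Python's standard urllib.robotparser module; the site address example.com, the page paths, and the bot name "SomeOtherBot" are placeholders chosen for illustration, not names from this page. It shows how one standards-following parser reads the rules; a real bot may behave differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the text, with each entry's lines kept together.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /images/
Disallow: /includes/

User-Agent: MyFamilyBot
Disallow: /
"""

# Parse the rules from memory instead of fetching them from a live site.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# (bot name, URL) pairs to test; all of these are placeholder values.
checks = [
    ("SomeOtherBot", "http://example.com/index.html"),
    ("SomeOtherBot", "http://example.com/images/photo.jpg"),
    ("SomeOtherBot", "http://example.com/includes/header.html"),
    ("MyFamilyBot", "http://example.com/index.html"),
]

for bot, url in checks:
    verdict = "allowed" if parser.can_fetch(bot, url) else "blocked"
    print(f"{bot:13} {url:42} {verdict}")
```

With these rules, the first check should report "allowed" and the other three "blocked". To test a file that is already online, set_url() and read() can be used in place of parse().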

Add further entries in the same way to prevent other bots from indexing.
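
For the META tags in items 1 and 2, one way to confirm that they are actually present in a page is to parse the page and list its robots-related META tags. The following is a minimal sketch using Python's standard html.parser module; the class name RobotsMetaFinder, the function robots_meta, and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of META tags addressed to the given bot names."""

    def __init__(self, bot_names=("robots", "myfamilybot")):
        super().__init__()
        self.bot_names = {name.lower() for name in bot_names}
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in self.bot_names:
            content = attributes.get("content") or ""
            # Split "noindex,nofollow" into its individual directives.
            self.found[name] = [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]


def robots_meta(html_text):
    """Return a dict mapping each matching META name to its directives."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return finder.found


# A made-up sample page carrying both tags from items 1 and 2.
SAMPLE = """<html><head>
<title>Sample page</title>
<meta name="robots" content="noarchive">
<meta name="MyFamilyBot" content="noindex,nofollow">
</head><body>Hello</body></html>"""

print(robots_meta(SAMPLE))
# {'robots': ['noarchive'], 'myfamilybot': ['noindex', 'nofollow']}
```

An empty result for a page means neither tag was found, so that page still needs the tags added.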




Copyright Ellen Pack & the USGW-Helppages Mail List 2007 - All Rights Reserved

All files on this site are copyrighted by their creator and/or contributor. They may
be downloaded for personal use, or linked to, but may not be reproduced on
another site without specific permission from Ellen Pack and/or their contributor.