
Controlling Indexing and Spidering of Online Files
--Contributed by Linda K. Lewis
NOTE: These methods work only if the bot follows accepted standards and respects the META tags and the robots.txt file.
1. To prevent caching of files, insert the following META tag in the head section of every HTML file:
<meta name="robots" content="noarchive">
2. To prevent indexing and spidering by TGN/MyFamily, insert the following META tag in the head section of every HTML file:
<meta name="MyFamilyBot" content="noindex,nofollow">
3. To prevent MyFamily from indexing your files, upload a robots.txt file to your root directory (where your index.html default welcome page is located).
Using a text editor (Notepad, Note Tab, etc.) or a word processing program such as Word set to save as plain text, create a file called robots.txt containing the following text:
User-Agent: *
Disallow: /images/

User-Agent: *
Disallow: /includes/

User-Agent: MyFamilyBot
Disallow: /
Explanation: Each entry consists of two lines: the User-Agent line and the Disallow line.
The first pair tells ALL bots NOT to index the images folder.
The second pair tells ALL bots NOT to index the includes folder.
The third pair tells MyFamilyBot (the ancestry bot) NOT to index anything at all on the website.
Add additional entries to prevent other bots from indexing.
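Before uploading, it can help to check that the rules say what you intend. The following is only a rough sketch using Python's standard urllib.robotparser module; it is not part of the original instructions. The helper name is_allowed, the bot name ExampleBot, and the sample file paths are made up for illustration. One more assumption: the two Disallow lines for all bots are grouped under a single User-Agent: * entry here, because Python's parser uses only the first User-Agent: * entry it reads. Other bots may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Rules from this page. Assumption: both "all bots" Disallow lines are
# grouped under one User-Agent: * entry so Python's parser sees both.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /images/
Disallow: /includes/

User-Agent: MyFamilyBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if a standards-following bot may fetch the given path."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; nothing is downloaded.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # ExampleBot and the paths below are hypothetical test values.
    checks = [
        ("ExampleBot", "/index.html"),
        ("ExampleBot", "/images/photo.jpg"),
        ("ExampleBot", "/includes/header.html"),
        ("MyFamilyBot", "/index.html"),
    ]
    for agent, path in checks:
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, path) else "blocked"
        print(agent, path, verdict)
```

This only shows how a well-behaved bot should read the file. As the NOTE at the top says, a bot that ignores robots.txt will not be stopped by it.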
Copyright Ellen Pack & the USGW-Helppages Mail List 2007 - All Rights Reserved
All files on this site are copyrighted by their creator and/or contributor. They may be downloaded for personal use, or linked to, but may not be reproduced on another site without specific permission from Ellen Pack and/or their contributor.