Using the robots.txt file
A robots.txt file should be placed in the root directory of your website. It tells the ‘robots’ of search engines which parts of your site they may visit once your site has been submitted to them.
Using the information in the file, the robot will ‘spider’ your website accordingly.
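Because the file always sits in the root directory, a robot can work out its address from the address of any page on the site. Here is a minimal Python sketch of that idea. The site www.example.com and the helper name robots_url are assumptions made up for illustration, not part of any real robot.

```python
from urllib.parse import urljoin


def robots_url(page_url):
    """Return the address where a robot would look for robots.txt.

    Hypothetical helper: the name is chosen for this illustration only.
    """
    # The leading slash makes urljoin resolve against the site root,
    # whatever folder the page itself lives in.
    return urljoin(page_url, "/robots.txt")


# www.example.com is a placeholder site, not a real deployment.
print(robots_url("http://www.example.com/products/widgets.html"))
```

Whichever page the robot starts from, the result points at the root of the site, which is why a robots.txt file saved in a sub-folder would not be found.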
A robots.txt file is made up of lines that follow a simple formula:
Item: Property
An example of this could be:
User-agent: googlebot
This would then tell the googlebot, the robot of google.com, to follow the instructions that come directly after it, e.g.
Disallow: /contactus.html
The line above would tell the googlebot not to spider the contactus.html file on the website.
The Disallow command tells the spider what not to spider when visiting the site. The example below would stop the robot from ‘spidering’ anything in the images folder:
Disallow: /images/
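One way to see how a robot might read these rules is to feed them to the robots.txt parser that ships with Python (urllib.robotparser). This is only a sketch: combining the two Disallow lines under one User-agent line, the site www.example.com, the robot name otherbot and the helper name allowed are all assumptions made for illustration, and real search engine robots may interpret rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# The rules from the text, combined into one file for illustration.
RULES = """\
User-agent: googlebot
Disallow: /contactus.html
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(agent, path):
    """Hypothetical helper: may this robot visit this path?"""
    # www.example.com is a placeholder site name.
    return parser.can_fetch(agent, "http://www.example.com" + path)


for path in ("/index.html", "/contactus.html", "/images/logo.gif"):
    print("googlebot", path, allowed("googlebot", path))

# A robot that is not named in the file is not restricted by these rules.
print("otherbot", "/contactus.html", allowed("otherbot", "/contactus.html"))
```

Note that a Disallow line matches the start of a path, so /images/ covers every file inside that folder.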
The reason why you may not want a spider to visit certain pages or folders on your site is simple: the content of some of your folders may not be relevant to the rest of your site, and the search engine spider may therefore think that your site is less relevant to the subject matter that you are trying to promote.
Here are a few examples of valid robots.txt files to use on your website.
Robots.txt a
User-agent: *
Disallow:
This would allow all search engine robots to spider the whole of your website.
Robots.txt b
User-agent: *
Disallow: /cgi-bin/
This would disallow all robots from visiting your /cgi-bin folder.
Robots.txt c
User-agent: googlebot
Disallow: /cgi-bin/
Disallow: /forum/
This would disallow the googlebot spider from visiting your /cgi-bin and /forum folders.
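The three example files can be compared side by side with Python's built-in urllib.robotparser module. Treat this as a rough sketch under stated assumptions: the site www.example.com, the robot name otherbot, the sample paths and the helper name can_visit are invented for illustration, and the parser's answers are not a guarantee of how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# The three example files from the text, keyed by their labels.
EXAMPLES = {
    "a": "User-agent: *\nDisallow:\n",
    "b": "User-agent: *\nDisallow: /cgi-bin/\n",
    "c": "User-agent: googlebot\nDisallow: /cgi-bin/\nDisallow: /forum/\n",
}


def can_visit(robots_txt, agent, path):
    """Hypothetical helper: parse one file and test one robot and path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # www.example.com is a placeholder site name.
    return parser.can_fetch(agent, "http://www.example.com" + path)


# Sample paths are made up; only the folder names come from the examples.
for label, text in EXAMPLES.items():
    for agent in ("googlebot", "otherbot"):
        for path in ("/index.html", "/cgi-bin/form.cgi", "/forum/"):
            print(label, agent, path, can_visit(text, agent, path))
```

File a lets every robot visit everything, file b keeps every robot out of /cgi-bin/, and file c restricts only the googlebot, leaving other robots free to visit both folders.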
James Welch