May 24, 2011

What Is a Robots.txt File and How to Use It?


A robots.txt file is the first thing a search engine spider or bot will look for when it visits your website, so it's vital that you have one there, even in its most basic form.



These files (which follow the Robots Exclusion Protocol) help to control how a spider moves through your site. You can block certain sections, or even the whole site, from being spidered, which is useful if, for example, you do not want your site picked up before it has been tested.




Because spidering is so much quicker than it was years ago, there is a high chance that your site will be spidered soon after it goes up. If the site is not ready, with all the bells and whistles functioning, you could end up with pages indexed and traffic coming to a site that is not working correctly.
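
To make the "block the whole site" case concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The rules text, the helper name is_allowed and the bot name "AnyBot" are assumptions made for illustration, not something from a real site. Keep in mind that robots.txt is a request that well-behaved spiders honour, not a lock.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical pre-launch rules: every spider is asked to stay out of
# every path on the site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Both checks print False: no spider is invited in yet.
    print(is_allowed(BLOCK_ALL, "Googlebot", "http://www.mysite.com/"))
    print(is_allowed(BLOCK_ALL, "AnyBot", "http://www.mysite.com/membership/"))
```

Once the site has been tested, swapping "Disallow: /" for an empty "Disallow:" opens it up again.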


In its simplest form, the robots.txt file (make sure the file name is lower case) will look like:


User-agent: *
Disallow:


The code above simply tells the spider to go ahead and spider all areas.


These two lines are saved into a .txt file named robots.txt and uploaded to the root of your server. That really is all you have to do to satisfy the search engine spiders and bots, which quickly find the file and carry on doing what they need to do.
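
If you would rather generate the file than type it by hand, the sketch below writes the two-line version into a local folder and reads it back to confirm that it allows everything. The folder argument site_root and the helper names are hypothetical, and uploading the file to your server is left to whatever FTP or deployment tool you already use.

```python
from pathlib import Path
from urllib.robotparser import RobotFileParser

# The basic two-line file: an empty Disallow blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def write_robots_txt(site_root: Path, rules: str = ALLOW_ALL) -> Path:
    """Write the rules to <site_root>/robots.txt (lower-case file name)."""
    target = site_root / "robots.txt"
    target.write_text(rules, encoding="utf-8")
    return target


def allows(rules_file: Path, agent: str, url: str) -> bool:
    """Parse a saved robots.txt and report whether `agent` may crawl `url`."""
    parser = RobotFileParser()
    parser.parse(rules_file.read_text(encoding="utf-8").splitlines())
    return parser.can_fetch(agent, url)
```

Calling write_robots_txt(Path("public_html")) would create public_html/robots.txt, assuming that folder already exists on your machine.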


The URL for your robots.txt file needs to be "http://www.mysite.com/robots.txt", with your own domain in place of www.mysite.com.
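
Spiders only look for the file at the root of the host, never inside a sub-folder. As a rough illustration, the hypothetical helper below works out where the robots.txt for any page on a site should live; the page URL in the example is made up.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Prints http://www.mysite.com/robots.txt
    print(robots_txt_url("http://www.mysite.com/membership/index.html"))
```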


The User-agent: line is where you name the spider that a set of rules applies to; the * means the rules apply to every spider. The Disallow: line tells those bots which pages or folders they are not allowed to crawl. An empty Disallow: blocks nothing.


For example, say you had a membership area (called "http://www.mysite.com/membership/") that you did not want Google to spider.


Your robots.txt file would be:


User-agent: Googlebot
Disallow: /membership/


So this version of the robots.txt tells just Google not to spider the membership section of your site; all of the other search engine bots still can.
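
You can check rules like these before uploading them. The sketch below feeds the Googlebot example to Python's standard urllib.robotparser module; the second bot name ("Bingbot") and the about.html page are assumptions added purely for comparison.

```python
from urllib.robotparser import RobotFileParser

# The Googlebot-only rules from the example above.
GOOGLE_ONLY = """\
User-agent: Googlebot
Disallow: /membership/
"""


def can_crawl(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    members = "http://www.mysite.com/membership/"
    print(can_crawl(GOOGLE_ONLY, "Googlebot", members))  # False
    print(can_crawl(GOOGLE_ONLY, "Googlebot", "http://www.mysite.com/about.html"))  # True
    print(can_crawl(GOOGLE_ONLY, "Bingbot", members))  # True
```

To keep every spider out of the membership area, not just Google, the User-agent line would be * instead of Googlebot.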


Most webmasters and site owners will never change their robots.txt file, so once you have done yours you can just leave it be and never touch it again.


But the fact that a search engine expects to see one means that you do need to have a robots.txt file. Even if it's just the basic version, which tells all bots they can go to all places, you may as well make sure you are covered and have this file live on your site.


I am Rahul Bhatia and I work for Matrix Webtex Pvt. Ltd, an SEO, web optimisation and web design company serving India.





All Rights Reserved © 2011 TechMyriad - Community of Technology