Bill Gates is my home boy
« on: May 18, 2011, 08:51:48 pm »
Robots.txt is a regular text file that, through its name, has special meaning to the majority of "honorable" robots on the web. By defining a few rules in this text file, you can instruct robots not to crawl and index certain files or directories within your site, or the site as a whole. For example, you may not want Google to crawl the /images directory of your site, as it's both meaningless to you and a waste of your site's bandwidth. Robots.txt lets you tell Google just that. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site, i.e. it is not a firewall or a kind of password protection. Putting up a robots.txt file is something like putting a "Please, do not enter" note on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
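To make the /images example a bit more concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The rules text, the example.com URLs, and the build_parser helper are all made up for illustration; a real crawler would fetch the file from the site instead of parsing a string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: ask Googlebot to stay out of /images/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /images/
"""

def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before requesting a URL.
print(parser.can_fetch("Googlebot", "http://example.com/images/logo.png"))  # False
print(parser.can_fetch("Googlebot", "http://example.com/index.html"))       # True
# No rule names this agent and there is no "*" entry, so it is allowed.
print(parser.can_fetch("OtherBot", "http://example.com/images/logo.png"))   # True
```

Note that nothing here stops a crawler that simply never calls can_fetch, which is the "unlocked door" point made above.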
Forum's Grocer
« Reply #1 on: May 28, 2011, 07:22:39 pm »
Robots.txt is related to search engine robots and is not specific to PHP projects. It contains instructions for the robots that crawl your website, such as which URLs you don't want search engines to crawl.
My Name is Enigo Montoya
« Reply #3 on: Jun 03, 2011, 08:22:03 am »
The primary purpose of a robots.txt file is to control which parts of your site compliant searchbots crawl and index. Add a robots.txt file only when you want to keep unwanted web pages from being indexed. The file is always placed in the root folder of the website, where searchbots look for it.
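Since the file lives in the site root, its address can be worked out from any page URL on the same host. A small sketch of that, where the robots_url helper and the example.com address are only illustrative names:

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://www.example.com/blog/post.php?id=7"))
# http://www.example.com/robots.txt
```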
I wish I was an Oscar winner
« Reply #4 on: Jun 12, 2011, 02:00:24 pm »
In that file you tell search engines which pages to index and which not to index.
"How I made $1 million in ten minutes" Author
Yeah, I completely agree.
Metal slug addict
« Reply #6 on: Jul 26, 2011, 11:54:08 am »
My friends above have covered almost everything, but if you want to learn about it in more depth, visit robotstxt dot org.
Spy Agent
« Reply #7 on: Aug 19, 2011, 09:27:33 am »
It is not necessary to use a robots.txt file if you don't want to hide any pages from crawlers.
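A quick way to see this with Python's urllib.robotparser: below, an empty rule set stands in for a site that restricts nothing (that stand-in, and the example.com URL, are assumptions of this sketch).

```python
from urllib.robotparser import RobotFileParser

# An empty rule set, standing in for a site that restricts nothing.
parser = RobotFileParser()
parser.parse([])

# With no Disallow rules, every path is open to every crawler.
print(parser.can_fetch("Googlebot", "http://example.com/any/page.html"))  # True
```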