

Links and robots

wineo
Fri 14 April 2006, 06:04 am GMT +0200
I have a script on my error404 page that emails me the file that was not found, the referring page and the user agent that requested the file. Recently I have been receiving the following messages:

Quote
File not found: www. site .com/'http: //www. link .com'     [no spaces obviously]
Refering page:
User agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

So there is no referring page, obviously because it is Googlebot, but my question is this: why is Googlebot appending the external link to the site domain with single quotes around it? I presume that the bot sees <a href='...' >, with single quotes, and for some reason appends the link to the main site URL and tries to see if that page exists. I have now changed as many links as possible to <a href="..." >, so that there are double quotes around the URL, to try to prevent this.

Does anyone have any idea why this bot is behaving like this?
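To illustrate the presumption above, here is a minimal Python sketch of what a standard URL resolver does when the quote characters end up as part of the link target. It is not Googlebot's actual parsing, which is not known here; www.site.com and www.link.com are just the placeholders from the quoted message. Because the string starts with a quote rather than a scheme, it is not recognized as an absolute URL and is resolved against the site instead.

```python
from urllib.parse import urljoin, urlsplit

# Placeholder page URL, taken from the quoted error message.
BASE = "http://www.site.com/"

# A correctly extracted href is absolute, so the base is ignored.
clean = urljoin(BASE, "http://www.link.com")

# If the single quotes are mistakenly kept as part of the href value,
# the string no longer starts with a scheme ("'http" is not a valid
# scheme name), so it is treated as a path relative to the base.
quoted = urljoin(BASE, "'http://www.link.com'")

print(clean)
print(quoted)
print(urlsplit(quoted).netloc)  # host of the mis-resolved URL
```

The last line shows the host of the mis-resolved URL, which is the site's own host rather than the external one. That is consistent with the 404 requests arriving on the linking site.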

Nikolas
Fri 14 April 2006, 10:35 am GMT +0200
This is very strange for Google. I think that this might probably be a spam bot that uses the Googlebot headers, or a bug in Googlebot 2.

But anyway, if you don't have a link to www. site .com/'http: //www. link .com' on your site, there is no problem.

Also keep in mind that Googlebot continues to visit nonexistent pages for a long time.

olaf
Fri 14 April 2006, 02:25 pm GMT +0200
Right, Googlebot spiders your site without any referer information. Check the corresponding IP address; I don't think that this one is from Google.

Olaf
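One way to check the IP address, as a rough Python sketch: do a reverse DNS lookup on the requesting IP, see whether the hostname falls under googlebot.com or google.com, then do a forward lookup on that hostname and confirm it maps back to the same IP. The function names here are made up for illustration, and the resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Hostname suffixes expected for a genuine Google crawler.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Return the primary hostname for an IP (PTR lookup)."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host):
    """Return the list of IPv4 addresses for a hostname."""
    return socket.gethostbyname_ex(host)[2]


def is_genuine_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Hypothetical helper: True only if reverse and forward DNS agree."""
    try:
        host = reverse(ip).lower().rstrip(".")
        if not host.endswith(GOOGLE_CRAWLER_DOMAINS):
            return False
        # The forward lookup guards against a spoofed PTR record.
        return ip in forward(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses.
        return False
```

Calling is_genuine_googlebot with the IP taken from the server log for the 404 request would then indicate whether the visitor is only borrowing the Googlebot user agent.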
