5, September 2008

does anyone have a list of search engine bot IP#'s? - webmaster forum

 
Webdigity webmaster forums
Topic: does anyone have a list of search engine bot IP#'s?

Author Topic: does anyone have a list of search engine bot IP#'s?  (Read 1092 times)
aka J Love
Community Supporter ?
Bill Gates is my home boy
*****
Gender: Male
Posts: 884
1636 credits
Members referred : 4



« on: Jan 03, 2006, 07:33:10 AM »

Does anyone have a good list of the IP addresses of the various search engine bots, like Google, Yahoo, MSN, etc.?

I am a metal monkey!
Administrator
Community Supporter ?
Jedai Sword Master
*****
Gender: Male
Posts: 8037
41179 credits
Members referred : 3



« Reply #1 on: Jan 03, 2006, 08:41:39 AM »

There is no such list.

The only way to identify a crawler visit is by the user agent string that the visitor's "browser" sends.

aka J Love



« Reply #2 on: Jan 03, 2006, 08:56:31 AM »

What exactly about the browser they use tips you off? I can monitor this with HCL (Help Center Live), so that is no problem; I just need to know what to look for.

I am a metal monkey!
Administrator



« Reply #3 on: Jan 03, 2006, 10:11:47 AM »
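As a rough sketch of what to look for: a crawler normally announces itself in the User-Agent request header, which PHP exposes as $_SERVER['HTTP_USER_AGENT']. Google's crawler includes "Googlebot", Yahoo's includes "Slurp" and MSN's includes "msnbot". The helper name looks_like_crawler and the short token list below are assumptions for illustration only, and since anyone can send a forged user agent, a match is a hint rather than proof.

```php
<?php
// Hypothetical helper: guess whether a request comes from a search engine
// crawler by looking for well-known tokens in its User-Agent header.
// The token list is an illustrative assumption, not a complete registry.
function looks_like_crawler($userAgent)
{
    $tokens = array('googlebot', 'slurp', 'msnbot');
    $ua = strtolower($userAgent);
    foreach ($tokens as $token) {
        if (strpos($ua, $token) !== false) {
            return true;
        }
    }
    return false;
}

// The header can be missing entirely, so fall back to an empty string.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
echo looks_like_crawler($ua) ? "crawler\n" : "visitor\n";
```

A visitor whose user agent is blank falls through as "visitor" here, so blank entries would still need to be checked some other way.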


My Name is Enigo Montoya
*
Posts: 32
216 credits
Members referred : 0


« Reply #4 on: Jan 17, 2006, 09:35:46 AM »

method, what will you do with that?

Cloaking?
aka J Love



« Reply #5 on: Jan 17, 2006, 10:10:05 AM »

No. When I am watching my visitors with Help Center Live I like to know who's who, but search bots show up as blank, so it's useless.

Sandwich Artist
*
Posts: 24
56 credits
Members referred : 0


« Reply #6 on: Mar 13, 2006, 03:30:19 PM »

Umm, there was a list compiled a while ago on this site: http://johnny.ihackstuff.com (it's a really slow site; I think the list was in the forums).
Novice Spammer
***
Posts: 103
552 credits
Members referred : 0



« Reply #7 on: Mar 25, 2006, 04:05:07 PM »

I have a list of a lot of spider IPs (it's not all spiders, but it's a good start).

http://www.xavierforum.com/files/spider_ips.txt
I am a metal monkey!
Administrator



« Reply #8 on: Mar 25, 2006, 05:36:54 PM »

Quote: "I have a list of a lot of spider IPs (it's not all spiders, but it's a good start). http://www.xavierforum.com/files/spider_ips.txt"

Interesting list.

Did you use some kind of software to populate it?

Novice Spammer



« Reply #9 on: Mar 25, 2006, 05:46:22 PM »

Well, I looked at my server logs for the last year and that list is the result. :)
I am a metal monkey!
Administrator



« Reply #10 on: Mar 25, 2006, 05:59:49 PM »

Oh, I see. You did it the hard way, I guess :)

You can also have a script do this job.

Novice Spammer



« Reply #11 on: Mar 25, 2006, 07:33:56 PM »

I have a spider trap too; I simply needed to speed up the work a little bit.

This is what my .htaccess looks like:

Code:
RewriteEngine on
# Serve robots.php whenever robots.txt is requested
RewriteRule ^robots\.txt$ robots.php [L,NC]

Then I have a script called robots.php:

Code:
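<?php
$server = "localhost"; // MySQL hostname
$username = ""; // MySQL username
$password = ""; // MySQL password
$dbname = ""; // MySQL db name

$db = mysql_connect($server, $username, $password) or die(mysql_error());
mysql_select_db($dbname) or die(mysql_error());

// Escape the request values before putting them into the query
$ip = mysql_real_escape_string($_SERVER['REMOTE_ADDR']);
$ua = mysql_real_escape_string(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '');

// Log every client that asks for robots.txt as a spider
mysql_query("INSERT INTO spiderips VALUES(INET_ATON('$ip'),'$ua',". time() .")");

mysql_close();

// Serve the rules below as plain text, like a real robots.txt
header('Content-Type: text/plain');
?>
User-agent: *
Disallow: /norobots/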

Every time a spider visits my robots.txt I catch the spider. Then I have this code on some other pages (not all of my pages):

Code:
<?php
$server = "localhost"; // MySQL hostname
$username = ""; // MySQL username
$password = ""; // MySQL password
$dbname = ""; // MySQL db name

$db = mysql_connect($server, $username, $password) or die(mysql_error());
mysql_select_db($dbname) or die(mysql_error());

// Escape the request values before putting them into the queries
$ip = mysql_real_escape_string($_SERVER['REMOTE_ADDR']);
$ua = mysql_real_escape_string(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '');

// Is this visitor already known as a spider, by IP or by user agent?
$spider_info = mysql_query("SELECT * FROM spiderips WHERE ip = INET_ATON('$ip') OR useragent = '$ua'");
if ($spider_info && mysql_num_rows($spider_info))
{
    // Try to add the IP; if the INSERT fails (e.g. the IP is already stored), refresh that row instead
    if (!mysql_query("INSERT INTO spiderips VALUES(INET_ATON('$ip'),'$ua',". time() .")"))
    {
        mysql_query("UPDATE spiderips SET useragent = '$ua', lastvisit = ". time() ." WHERE ip = INET_ATON('$ip')");
    }
}
mysql_close();
?>

This way I get the user agents of the spiders when one of them visits my robots.txt, and then I collect more IP addresses by matching later visitors against the user agents already in the database. I also get the last time a spider visited one of my sites.
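To make the two steps easier to follow, here is a small sketch of the same idea with a plain PHP array standing in for the spiderips table, so it runs without MySQL. The function names, the ExampleBot user agent and the addresses (taken from the documentation-only IP ranges) are all made up for illustration; it mirrors the logic described above rather than reproducing the scripts.

```php
<?php
// In-memory stand-in for the spiderips table: IP => array(useragent, lastvisit).
// All names here are hypothetical; the scripts above use MySQL instead.

// Called for robots.txt requests: whoever asks is treated as a spider.
function trap_robots_request(array &$spiders, $ip, $ua, $now)
{
    $spiders[$ip] = array('useragent' => $ua, 'lastvisit' => $now);
}

// Called on ordinary pages: record the visit only if the IP or the
// user agent is already known, which is how new spider IPs get discovered.
function track_page_request(array &$spiders, $ip, $ua, $now)
{
    $known = isset($spiders[$ip]);
    foreach ($spiders as $row) {
        if ($row['useragent'] === $ua) {
            $known = true;
            break;
        }
    }
    if ($known) {
        $spiders[$ip] = array('useragent' => $ua, 'lastvisit' => $now);
    }
    return $known;
}

$spiders = array();
trap_robots_request($spiders, '192.0.2.10', 'ExampleBot/1.0', 1000);
// Same user agent from a new address: picked up as another spider IP.
track_page_request($spiders, '192.0.2.11', 'ExampleBot/1.0', 1010);
// Unknown address and unknown user agent: ignored.
track_page_request($spiders, '198.51.100.7', 'Mozilla/4.0 (compatible; MSIE 6.0)', 1020);
print_r(array_keys($spiders));
```

One thing the sketch makes visible: if a spider with a blank user agent ever lands in the table, every later visitor with a blank user agent will match too, so it may be worth skipping the user agent comparison when the string is empty.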
I am a metal monkey!
Administrator



« Reply #12 on: Mar 25, 2006, 09:36:50 PM »

That's very clever. Congrats, and thanks for sharing.

BTW you can make this work better if you use the LOW_PRIORITY modifier in your inserts and updates :)

Instead of this:

Code:
$spider_info = mysql_query("SELECT * FROM spiderips WHERE ip = INET_ATON('". $_SERVER['REMOTE_ADDR'] ."') OR useragent = '". $_SERVER['HTTP_USER_AGENT'] ."'");
if (mysql_num_rows($spider_info))
{
if (!mysql_query("INSERT INTO spiderips VALUES(INET_ATON('". $_SERVER['REMOTE_ADDR'] ."'),'". $_SERVER['HTTP_USER_AGENT'] ."',". time() .")"))
{
mysql_query("UPDATE spiderips SET useragent = '". $_SERVER['HTTP_USER_AGENT'] ."', lastvisit = '". time() ."' WHERE ip = INET_ATON('". $_SERVER['REMOTE_ADDR'] ."')");
}
}

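you can use this in place of the INSERT/UPDATE pair (keep the SELECT check around it, otherwise every visitor gets recorded; REPLACE only overwrites an existing row when ip is a primary or unique key):

Code:
mysql_query("REPLACE LOW_PRIORITY INTO spiderips VALUES(INET_ATON('". mysql_real_escape_string($_SERVER['REMOTE_ADDR']) ."'),'". mysql_real_escape_string($_SERVER['HTTP_USER_AGENT']) ."',". time() .")");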
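If the mysqli extension is available, the same REPLACE can be written as a prepared statement, so the user agent string never becomes part of the SQL text. This is only a sketch: the function name record_spider is made up, the column names ip, useragent and lastvisit are taken from the queries earlier in the thread, and it assumes ip is the table's PRIMARY KEY or a UNIQUE index. Without such a key REPLACE just behaves like a plain INSERT.

```php
<?php
// Hypothetical helper: store or refresh one spider row using a prepared
// statement. Assumes a spiderips table with columns ip, useragent and
// lastvisit, where ip is a PRIMARY KEY or UNIQUE index.
function record_spider(mysqli $db, $ip, $userAgent)
{
    $stmt = $db->prepare(
        "REPLACE LOW_PRIORITY INTO spiderips (ip, useragent, lastvisit)
         VALUES (INET_ATON(?), ?, ?)"
    );
    if ($stmt === false) {
        return false; // prepare failed, e.g. the table does not exist
    }
    $now = time();
    // 's' = string, 'i' = integer; bind_param needs variables, not literals
    $stmt->bind_param('ssi', $ip, $userAgent, $now);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

It would be called as record_spider($db, $_SERVER['REMOTE_ADDR'], $ua), where $db is an open mysqli connection. Note that LOW_PRIORITY only has an effect on storage engines that use table-level locking, such as MyISAM.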

Chicken-run Manager
*
Posts: 9
58 credits
Members referred : 0


« Reply #13 on: Mar 25, 2006, 09:48:23 PM »

Cool script, I will use it too.

Thanks for sharing :)