Topic: Question, Tutorial: Detecting Googlebot with PHP (Read 647 times)
aka J Love Community Supporter?
Bill Gates is my home boy
Gender:
Posts: 884
1636 credits Members referred : 4
« on: Sep 24, 2007, 12:49:11 AM »
Just to say this first: I'm not asking for help with the tutorial "Detecting Googlebot with PHP" itself.
I did follow it, and I can successfully detect when Googlebot hits the script, so it's a great tutorial. However, my question is: can the same concept be used for MSNBot and Yahoo! Slurp? I have tried the following strings and they don't seem to pick up the visits (when the bots are obviously crawling):
Yahoo! Slurp: string for user agent = help.yahoo.com/help/us/ysearch/slurp (found from wiki article on slurp!)
MSNBot/Livebot: string for user agent = search.live.com (found from webalizer statistics)
I assume that, since detection of them is working on webdigity, I must be doing something wrong, such as using the wrong strings. Here is the PHP code that actually detects them:
Code:
<?php
$spiders = dbquery("SELECT * FROM ".PRE."users WHERE is_spider = '1'");
if (dbrows($spiders)){
    while ($bot = dbarray($spiders)){
        if (strstr($_SERVER['HTTP_USER_AGENT'], $bot['username']) == true){
            $host = gethostbyaddr(USER_IP);
            if (substr($host, (strlen($host)-13)) == $bot['location']){
                $spider_visited = dbquery("UPDATE LOW_PRIORITY ".PRE."users SET `lastvisit` = '".time()."', `ip` = '".USER_IP."' WHERE `id` = '".$bot['id']."'");
            }
        }
    }
}
?>
« Reply #3 on: Sep 24, 2007, 08:56:01 PM »
Thanks guys, I have updated the details for both bots and am waiting on visits now. And Nik, I followed your exact tutorial. Does it show strstr in the tutorial? I know why to use strpos (much faster, of course). Thanks for the tip and the help, as always.
This is the code I'm using now, Nik. Mind verifying? I have changed it a bit:
$spiders = dbquery("SELECT * FROM ".PRE."users WHERE is_spider = '1'");
if (dbrows($spiders)){
while ($bot = dbarray($spiders)){
if (strpos($_SERVER['HTTP_USER_AGENT'], $bot['username'])){
« Reply #5 on: Sep 25, 2007, 01:42:21 AM »
I don't suppose it would be necessary for the spiders, but when users visit the site it checks their IP against what is stored; if it has changed, it updates their stored hostname as well. So the hostname lookup won't happen every time, which avoids the "every page load" issue. The reason for storing it for users is the blacklist, which can ban by IP, email, and hostname.
Thanks for the tip on strpos. I've noticed that it behaves differently when not using the strict operators "===" or "!==", so I've taken that down as a fundamental note. Thanks again, Nikolas!
Still working on MSNBot and Yahoo! Slurp; they haven't visited yet.
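For anyone reading along, the strpos point can be seen in a short sketch. The user-agent string and the needles below are example values chosen for illustration, not data from the tutorial. strpos returns the integer offset of the match, or false when there is none, so a match at offset 0 looks like "not found" to a loose check.

```php
<?php
// Example user agent that starts with the bot name, so the match is at offset 0.
$agent = 'msnbot/1.0 (+http://search.msn.com/msnbot.htm)';

// Loose check: strpos() returns 0 here, and 0 is falsy, so the match is missed.
$looseMatch = strpos($agent, 'msnbot') ? true : false;

// Strict check: only a real `false` means "not found".
$strictMatch = strpos($agent, 'msnbot') !== false;

// A needle that is genuinely absent is reported as not found.
$absent = strpos($agent, 'Googlebot') !== false;

var_dump($looseMatch, $strictMatch, $absent);
// bool(false)
// bool(true)
// bool(false)
```

The loose form only goes wrong when the needle sits at the very start of the string, which is why it can appear to work for one bot and fail for another.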
« Reply #6 on: Sep 26, 2007, 09:56:14 PM »
Yeah, I don't think those two solutions for MSN and Yahoo are working, guys :S. Here is the data it's pulling for them, and the evaluated code:
Note: "Data" below is case sensitive.
Original PHP Code that Loops for All Spiders.
//Added for detection of spiders
$spiders = dbquery("SELECT * FROM ".PRE."users WHERE is_spider = '1'");
if (dbrows($spiders)){
while ($bot = dbarray($spiders)){
if (strpos($_SERVER['HTTP_USER_AGENT'], $bot['username'])!==false){
$host = gethostbyaddr(USER_IP);
if (substr($host, (strlen($host)-13)) == $bot['location']){
$spider_visited = dbquery("UPDATE ".PRE."users SET `lastvisit` = '".time()."' WHERE `id` = '".$bot['id']."'");
if ($bot['ip'] != USER_IP) dbquery("UPDATE LOW_PRIORITY ".PRE."users SET `ip` = '".USER_IP."', `hostname` = '$host' WHERE `id` = '".$bot['id']."'");
if (substr($host, (strlen($host)-13)) == "Yahoo! Slurp"){
$spider_visited = dbquery("UPDATE ".PRE."users SET `lastvisit` = '".time()."' WHERE `id` = '".$bot['id']."'");
if ($bot['ip'] != USER_IP) dbquery("UPDATE LOW_PRIORITY ".PRE."users SET `ip` = '".USER_IP."', `hostname` = '$host' WHERE `id` = '".$bot['id']."'");
} }
With this, MSNBot and Yahoo! haven't visited yet (according to this script), but I've been watching my stats closely for a few days, and they are visiting there.
« Last Edit: Sep 26, 2007, 11:23:22 PM by Meth0d »
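One thing that may be worth checking in the code above: the hostname test compares only the last 13 characters of the reverse-DNS name, and "googlebot.com" happens to be exactly 13 characters long. A stored location of any other length can never equal a 13-character slice. Below is a minimal sketch of a suffix test that does not depend on a fixed length. The helper name host_ends_with and the sample hostnames are assumptions made up for illustration; they are not from the tutorial, and the suffixes are not confirmed crawler domains.

```php
<?php
// Hypothetical helper: true when $host ends with $suffix, whatever its length.
function host_ends_with($host, $suffix)
{
    $len = strlen($suffix);
    if ($len === 0 || $len > strlen($host)) {
        return false;
    }
    // A negative start offset makes substr() count from the end of the string.
    return strcasecmp(substr($host, -$len), $suffix) === 0;
}

// Sample hostnames are made up for illustration.
var_dump(host_ends_with('crawl-1-2-3-4.googlebot.com', 'googlebot.com'));   // 13-character suffix
var_dump(host_ends_with('bot-5-6-7-8.search.live.com', 'search.live.com')); // 15-character suffix

// The fixed-length test from the thread: a 13-character slice cannot equal a 15-character string.
$host = 'bot-5-6-7-8.search.live.com';
var_dump(substr($host, strlen($host) - 13) == 'search.live.com');
// bool(true)
// bool(true)
// bool(false)
```

If something like this were used, storing the suffix with a leading dot (for example ".googlebot.com") would keep a hostname such as "fakegooglebot.com" from matching.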