Programming Languages: Analyzing Apache Access Logs Yourself with PHP

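For a real task at work, I wanted to pull specific information out of the access log: requests for files that could not be found (404), visits that arrived from Google, Yahoo, or elsewhere, and visits by search-engine spiders. The principle is simple: open the file, filter out the unwanted records, split each remaining record into fields, and list the results. Almost all of it is done with a single PHP function, preg_match(). The source code follows; study it yourself!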
<html>
<head>
<title>
Simple tools for website logs
</title>
</head>
<body>
<form name="my_form" method="post">
Select a report type:<br>
<select name="type">
  <option value="">Pages not found (404)</option>
  <option value="yahoo">Access from yahoo</option>
  <option value="google">Access from google</option>
  <option value="msn">Access from Msn</option>
  <option value="robot">Access by robots</option>
</select>
 
<input type="submit" name="submit" value="get the result">
</form>
<table border=1>
<tr bgcolor="#FFCCFF">
    <td><font color="#000000">ClientIP</font></td>
    <td><font color="#000000">AccessTime</font></td>
    <td><font color="#000000">TargetPage</font></td>
    <td><font color="#000000">Code</font></td>
    <td><font color="#000000">FromURL</font></td>
    <td><font color="#000000">Client ENV</font></td>
</tr>
<?php
// The log is expected at <document root>/logs/access_log.
$doc_path = $_SERVER["DOCUMENT_ROOT"];
if (substr($doc_path, -1) != "/") {
    $doc_path = $doc_path . "/";
}
$log_file = $doc_path . 'logs/access_log';

// Read the form value explicitly (register_globals no longer exists) and
// accept only the values the form offers, since it ends up inside a regex.
$type = isset($_POST['type']) ? $_POST['type'] : '';
if (!in_array($type, array('', 'yahoo', 'google', 'msn', 'robot'), true)) {
    $type = '';
}

// Decide whether a log line belongs in the selected report.
function line_wanted($type, $line) {
    if ($type == 'yahoo') {
        // Visits from Yahoo, excluding its spider (Slurp).
        return preg_match('/yahoo/i', $line) && !preg_match('/slurp/i', $line);
    }
    if ($type == 'robot') {
        // Spider visits, excluding requests for robots.txt itself.
        return !preg_match('/robots\.txt/i', $line)
            && preg_match('/slurp|msnbot|googlebot|psbot/i', $line);
    }
    if ($type != '') {
        // Visits from google or msn, excluding googlebot or msnbot.
        $q = preg_quote($type, '/');
        return preg_match("/$q/i", $line) && !preg_match("/{$q}bot/i", $line);
    }
    // Default report: 404 responses, excluding robots.txt.
    return preg_match('/ 404 /', $line) && !preg_match('/robots\.txt/', $line);
}

// Groups used below: 1 = client IP, 4 = time, 7 = target page,
// 9 = status code, 13 = referrer, 15 = user agent.
$pattern = '/([0-9.]+)?([ -]+)?(\[)?([0-9a-zA-Z+: \/]+)?(\])?( "GET \/)?([a-z0-9A-Z.\/\?&=%_\-:+]+)?( HTTP\/1\.[012]" )?([0-9.]+)?( )?([0-9.\-]+)?( ")?([a-z0-9A-Z.\/\?&=%_\-:+]+)?(" ")?(.*)/i';

$lines = is_readable($log_file) ? file($log_file) : array();
foreach ($lines as $line) {
    if (!line_wanted($type, $line)) {
        continue;
    }
    preg_match($pattern, $line, $matches);
    echo "<tr>";
    foreach (array(1, 4, 7, 9, 13, 15) as $i) {
        // Escape every field: log lines contain client-supplied text.
        echo "<td>" . htmlspecialchars(isset($matches[$i]) ? $matches[$i] : '') . "</td>";
    }
    echo "</tr>\n";
}
?>
</table>
</body>
</html>
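
As a compact illustration of the "split each record into fields" step described in the introduction, here is a minimal sketch that parses a single log line with preg_match() and named groups. It assumes the log uses Apache's "combined" format; the function name parse_combined_log_line() and the sample record are hypothetical and do not come from the original script.

```php
<?php
// Hypothetical helper (not part of the original script): split one line of
// an Apache "combined" format log into named fields. Returns null when the
// line does not fit the assumed layout.
function parse_combined_log_line($line) {
    $pattern = '/^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
             . '"(?P<method>[A-Z]+) (?P<target>\S+)[^"]*" '
             . '(?P<code>\d{3}) (?P<bytes>\d+|-) '
             . '"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // Keep only the named fields, dropping the numeric duplicates.
    $fields = array();
    foreach (array('ip', 'time', 'method', 'target', 'code', 'bytes', 'referer', 'agent') as $key) {
        $fields[$key] = $m[$key];
    }
    return $fields;
}

// Made-up sample record, for illustration only.
$sample = '192.0.2.10 - - [10/Oct/2005:13:55:36 +0800] '
        . '"GET /missing.html HTTP/1.1" 404 209 '
        . '"http://www.google.com/search?q=php" "Mozilla/4.0 (compatible)"';

$fields = parse_combined_log_line($sample);
echo $fields['code'], ' ', $fields['target'], "\n"; // prints "404 /missing.html"
```

Named groups make the later filtering easier to read than numeric indexes, for example checking $fields['code'] === '404' or searching $fields['referer'] for "google". Lines in a different log format will simply return null under this sketch.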

Article URL: http://com.8s8s.com/it/it26545.htm