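I rarely write PHP scripts, so having to take apart a fairly sizable Apache log file this time was a good chance to practice.

The approach: open the log file and read it in one line at a time, split each line with a regular expression, and insert the pieces into a database so they can be reused later.

First, create a table to hold the log.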

CREATE TABLE if not exists `access_log` (
`sid` bigint(20) unsigned NOT NULL auto_increment COMMENT 'serial number',
`remote_host` varchar(300) NOT NULL COMMENT 'IP that connected (client IP)',
`logname` varchar(300) NOT NULL COMMENT 'whether the user connected through the Apache login mechanism (identd)',
`user` varchar(300) NOT NULL COMMENT 'user name at login (if the previous field has one)',
`time` varchar(300) NOT NULL COMMENT 'time of the request',
`method` varchar(300) NULL comment 'request method (GET / POST ...etc.)',
`request` varchar(3000) NULL comment 'URL',
`protocol` varchar(300) NULL comment 'protocol',
`status` varchar(300) NOT NULL COMMENT 'HTTP status code (200/404/500...etc.)',
`bytes` varchar(300) NOT NULL comment 'size of the returned page',
`referer` varchar(300) NULL COMMENT 'page that linked to this one',
`user_agent` varchar(255) null comment 'client browser',
`seclength` int NULL COMMENT 'custom response time field (seconds)',
`microlength` int NULL COMMENT 'custom response time field (microseconds)',
`rawdata` varchar(1024) NOT NULL COMMENT 'original log line (unparsed)',
`filename` varchar(100) not null comment 'name of the original log file',
primary key (`sid`)
) default charset=utf8 comment='parsed website log records';
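
Since the point of loading the log into a table is to reuse it later, note that the `time` column keeps Apache's timestamp as raw text. Below is a minimal sketch of how that text could be turned into a sortable datetime string, assuming the log uses Apache's default `%t` format. The helper name `apacheTimeToUtc` and the sample value are made up for illustration; neither is part of the original script.

```php
<?php
// Hypothetical helper (not part of the original script): converts Apache's
// default %t timestamp text into a MySQL-style DATETIME string in UTC.
function apacheTimeToUtc(string $time): ?string {
    // d/M/Y:H:i:s O matches text such as "30/Oct/2016:13:45:02 +0800"
    $dt = DateTime::createFromFormat('d/M/Y:H:i:s O', $time);
    if ($dt === false) {
        return null; // the text was not in the expected format
    }
    return $dt->setTimezone(new DateTimeZone('UTC'))->format('Y-m-d H:i:s');
}

// Made-up sample value, for illustration only
echo apacheTimeToUtc('30/Oct/2016:13:45:02 +0800'), "\n"; // prints 2016-10-30 05:45:02
```

Keeping the raw text and converting on demand leaves the table exactly as defined above; storing the converted value in an extra column would be another option.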

Next, parse the log with a PHP program. I put the script at D:\Projects\a.php.

<?php
set_time_limit(0); // Running this in a browser timed out halfway, so I set this, but it didn't seem to help XD
ini_set('max_execution_time', 0); // Same reason, so I set this as well, but it still didn't seem to help XD

 // Function that connects to the database
function getDbh($dbName=null, $host=null, $account=null, $password=null) {
    // Returns $dbh on success, false on failure
    // Note: DB_HOST, MAIN_DB, DB_ACCOUNT and DB_PASSWORD are fallback constants that must be defined elsewhere
    $dbh = false;
    try{
        $dsn = "mysql:host=".(empty($host) ? DB_HOST : $host).";dbname=".(empty($dbName) ? MAIN_DB : $dbName);
        $dbh = new PDO($dsn,empty($account) ? DB_ACCOUNT : $account,empty($password) ? DB_PASSWORD : $password);
        $dbh->exec("SET NAMES 'big5';");
    }catch(Exception $ex){
        // Connection failed: leave $dbh as false
    }
    return $dbh;
}

// Log files to parse
$arr = Array(
    "20161030-ssl-access.log",
    "20161030-www-access.log"
);

// Apache log Parse Pattern
$pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) \[([^\]]+)\] "(.*) (.*) (.*)" ([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)" \*\*([0-9]*)\/([0-9]*)\*\*$/';

// db connection
$dbh = getDbh("logParse");
if (!$dbh) {
    die("Cannot connect to the database");
}
foreach ($arr as $filename) {
    $rowCount = 0;
    $handle = fopen("D:/Projects/convertLog/log/".$filename, "r") or die("Cannot open ".$filename);

    // Prepare the INSERT once per file; it is executed again for every matching line
    $sql = "insert into access_log (remote_host,logname,user,time,method,request,protocol,status,bytes,referer,user_agent, seclength, microlength, rawdata, filename) values (:remote_host,:logname,:user,:time,:method,:request,:protocol,:status,:bytes,:referer,:user_agent, :seclength, :microlength, :whole_match, :filename)";
    $sth = $dbh->prepare($sql);

    // Read the file one line at a time
    while (($line = fgets($handle, 16384)) !== false) {
        $rowCount++;
        $str = trim($line);
        if ($str === '') {
            continue; // skip blank lines
        }
        if (preg_match($pattern,$str,$matches)) {
            list($whole_match,$remote_host,$logname,$user,$time,
                $method,$request,$protocol,$status,$bytes,$referer,
                $user_agent, $seclength, $microlength) = $matches;
            if ($remote_host != "127.0.0.1") {// I skip some IPs that I don't want to parse and store
                $sth->bindParam(':remote_host', $remote_host, PDO::PARAM_STR);
                $sth->bindParam(':logname', $logname, PDO::PARAM_STR);
                $sth->bindParam(':user', $user, PDO::PARAM_STR);
                $sth->bindParam(':time', $time, PDO::PARAM_STR);
                $sth->bindParam(':method', $method, PDO::PARAM_STR);
                $sth->bindParam(':request', $request, PDO::PARAM_STR);
                $sth->bindParam(':protocol', $protocol, PDO::PARAM_STR);
                $sth->bindParam(':status', $status, PDO::PARAM_STR);
                $sth->bindParam(':bytes', $bytes, PDO::PARAM_STR);
                $sth->bindParam(':referer', $referer, PDO::PARAM_STR);
                $sth->bindParam(':user_agent', $user_agent, PDO::PARAM_STR);
                $sth->bindParam(':seclength', $seclength, PDO::PARAM_STR);
                $sth->bindParam(':microlength', $microlength, PDO::PARAM_STR);
                $sth->bindParam(':whole_match', $whole_match, PDO::PARAM_STR);
                $sth->bindParam(':filename', $filename, PDO::PARAM_STR);
                $sth->execute();
            }
        }
        else {
            // The line did not match the pattern
            echo "[Error Parsing]".$str."\n";
        }
        echo $rowCount."\n";
    }

    fclose($handle);
}

// db connection close
$dbh= null;

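
To make it easier to see what the pattern captures, here is a small standalone sketch that runs the same regular expression on a single line. The sample log line is made up, and `parseLogLine` is a hypothetical helper that is not part of the script above. The trailing `**seconds/microseconds**` part presumably comes from a custom Apache LogFormat ending in something like `**%T/%D**`, which is my assumption about the server configuration.

```php
<?php
// Same pattern as in the script above
$pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) \[([^\]]+)\] "(.*) (.*) (.*)" ([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)" \*\*([0-9]*)\/([0-9]*)\*\*$/';

// Hypothetical helper: returns the captured fields keyed by column name,
// or null when the line does not match the pattern.
function parseLogLine(string $line, string $pattern): ?array {
    if (!preg_match($pattern, trim($line), $matches)) {
        return null;
    }
    $names = array('remote_host', 'logname', 'user', 'time', 'method',
        'request', 'protocol', 'status', 'bytes', 'referer',
        'user_agent', 'seclength', 'microlength');
    // $matches[0] is the whole line; the 13 capture groups follow it
    return array_combine($names, array_slice($matches, 1));
}

// Made-up sample line, for illustration only
$sample = '203.0.113.7 - - [30/Oct/2016:13:45:02 +0800] "GET /index.php?id=1 HTTP/1.1" 200 5120 "http://example.com/" "Mozilla/5.0 (Windows NT 10.0)" **0/15320**';

$fields = parseLogLine($sample, $pattern);
print_r($fields);
```

A line that lacks the trailing `**.../...**` part, such as one written in the stock combined format, returns null here, which corresponds to the `[Error Parsing]` branch in the script.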
 

Then, in a DOS window, change to the PHP installation directory and run the script:

php -q D:\Projects\a.php

Run this way, it goes through to the end without timing out, and it is also faster than running it in the browser.
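
One likely reason the command line behaves better: the command-line PHP defaults `max_execution_time` to 0, meaning no time limit. A minimal sketch of a guard that could sit at the top of such a script is shown below; `runningFromCli` is a hypothetical helper name, not something from the original script.

```php
<?php
// Hypothetical guard (not in the original script): stop early unless the
// script is being run by the command-line PHP binary.
function runningFromCli(): bool {
    return PHP_SAPI === 'cli';
}

if (!runningFromCli()) {
    exit("Please run this script from the command line.\n");
}

// Show the time limit currently in effect (0 means unlimited)
echo "max_execution_time = ", ini_get('max_execution_time'), "\n";
```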

Done, that's a wrap~

 

 
