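To collect website access logs, we built a log collection system that uses a JavaScript probe to capture request data. This avoids the large volume of irrelevant entries that come with collecting the web server's own access logs (requests for js, css and similar resources, which accounted for roughly 70% of the total).
Let's first look at the overall flow diagram: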
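Install nginx and edit the configuration file (/etc/nginx/conf.d/default.conf):
server {
    listen 80;
    server_name spark2;
    location / {
        root /data/nginx/app;
        index index.html index.htm;
        # access_log takes a log path or "off"; "access_log on;" would write to a file literally named "on"
        access_log /var/log/nginx/access.log;
    }
}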
Add the HTML pages index.html and content.html.
index.html:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>首頁</title>
</head>
<body>
    <a href="content.html">hello nginx</a>
    <script type="text/javascript" src="track.js"></script>
</body>
</html>
content.html:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>內容</title>
</head>
<body>
    <h3>Come take a look at the content</h3>
    <script type="text/javascript" src="track.js"></script>
</body>
</html>
Start nginx (service nginx start).
Embed the JS in the page:
<script type="text/javascript">
    var _maq = _maq || [];
    _maq.push(['_setAccount', 'zx5352']);
    (function() {
        var ma = document.createElement('script');
        ma.type = 'text/javascript';
        ma.async = true;
        ma.src = 'http://flow.itcast.zx/ma.js';
        var s = document.getElementsByTagName('script')[0];
        s.parentNode.insertBefore(ma, s);
    })();
</script>
track.js
(function () {
    var params = {};
    // Data from the document object
    if (document) {
        params.domain = document.domain || '';
        params.url = document.URL || '';
        params.title = document.title || '';
        params.referrer = document.referrer || '';
    }
    // Data from the window object
    if (window && window.screen) {
        params.sh = window.screen.height || 0;
        params.sw = window.screen.width || 0;
        params.cd = window.screen.colorDepth || 0;
    }
    // Data from the navigator object
    if (navigator) {
        params.lang = navigator.language || '';
    }
    // Parse the _maq configuration (typeof avoids a ReferenceError when the page never defined _maq)
    if (typeof _maq !== 'undefined' && _maq) {
        for (var j = 0; j < _maq.length; j++) {
            switch (_maq[j][0]) {
                case '_setAccount':
                    params.account = _maq[j][1];
                    break;
                default:
                    break;
            }
        }
    }
    // Build the query string
    var args = '';
    for (var i in params) {
        if (args != '') {
            args += '&';
        }
        args += i + '=' + encodeURIComponent(params[i]);
    }
    // Request the backend script through an Image object
    var img = new Image(1, 1);
    img.src = 'http://spark3/log.gif?' + args;
})();
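The `_maq` handling in track.js is a small command queue: the page pushes `[command, value]` pairs before the tracker runs, and the tracker reads them out afterwards. Below is a minimal sketch of that step in isolation. The helper name `readMaqConfig` is hypothetical and does not appear in track.js; it only illustrates the pattern.

```javascript
// Hypothetical helper (not part of track.js): reads settings out of a
// _maq-style command queue, where each entry looks like ['_setAccount', 'zx5352'].
function readMaqConfig(queue) {
  var config = {};
  if (!Array.isArray(queue)) {
    return config; // the page never defined the queue
  }
  for (var i = 0; i < queue.length; i++) {
    var entry = queue[i];
    if (!Array.isArray(entry)) {
      continue; // ignore malformed entries
    }
    switch (entry[0]) {
      case '_setAccount':
        config.account = entry[1];
        break;
      default:
        break; // unknown commands are skipped, as in track.js
    }
  }
  return config;
}

var exampleQueue = [['_setAccount', 'zx5352'], ['_somethingElse', 1]];
console.log(readMaqConfig(exampleQueue).account);
```

Supporting a further command would then only mean adding another `case` branch.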
The URL requested by the JS (shown here with the parameters unencoded):
http://spark3/log.gif?domain=spark2&url=http://spark2/content.html&title=內容&referrer=http://spark2/&sh=768&sw=1366&cd=24&lang=zh-CN&account=hll
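On the wire the parameter values are percent-encoded, because track.js passes each value through `encodeURIComponent`. The sketch below repeats the same joining logic as a standalone function so the encoded form can be inspected. The function name `buildQuery` and the reduced sample object are illustrative assumptions; the values are taken from the example URL.

```javascript
// Mirrors the joining loop in track.js: key=encodeURIComponent(value), joined with '&'.
// buildQuery is an illustrative name, not something track.js defines.
function buildQuery(params) {
  return Object.keys(params).map(function (key) {
    return key + '=' + encodeURIComponent(params[key]);
  }).join('&');
}

// A subset of the parameters from the example URL.
var sample = {
  domain: 'spark2',
  url: 'http://spark2/content.html',
  referrer: 'http://spark2/',
  sw: 1366
};

var query = buildQuery(sample);
console.log('http://spark3/log.gif?' + query);
// prints http://spark3/log.gif?domain=spark2&url=http%3A%2F%2Fspark2%2Fcontent.html&referrer=http%3A%2F%2Fspark2%2F&sw=1366
```

This encoding is why the log server has to unescape each argument before writing it to the log.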
3: Setting up the log server
1. Install the dependencies
yum -y install gcc perl pcre-devel openssl openssl-devel
2. Upload LuaJIT-2.0.4.tar.gz and install LuaJIT
tar -zxvf LuaJIT-2.0.4.tar.gz -C /usr/local/src/
cd /usr/local/src/LuaJIT-2.0.4/
make && make install PREFIX=/usr/local/luajit
3. Set the environment variables
export LUAJIT_LIB=/usr/local/luajit/lib
export LUAJIT_INC=/usr/local/luajit/include/luajit-2.0
4. Create a modules directory to hold the nginx modules
mkdir -p /usr/local/nginx/modules
5. Upload openresty-1.9.7.3.tar.gz and the modules it depends on: lua-nginx-module-0.10.0.tar.gz, set-misc-nginx-module-0.29.tar.gz, ngx_devel_kit-0.2.19.tar.gz and echo-nginx-module-0.58.tar.gz
6. Extract the dependency modules directly into /usr/local/nginx/modules; they do not need to be compiled or installed separately
tar -zxvf lua-nginx-module-0.10.0.tar.gz -C /usr/local/nginx/modules/
tar -zxvf set-misc-nginx-module-0.29.tar.gz -C /usr/local/nginx/modules/
tar -zxvf ngx_devel_kit-0.2.19.tar.gz -C /usr/local/nginx/modules/
tar -zxvf echo-nginx-module-0.58.tar.gz -C /usr/local/nginx/modules/
7. Extract openresty-1.9.7.3.tar.gz
tar -zxvf openresty-1.9.7.3.tar.gz -C /usr/local/src/
cd /usr/local/src/openresty-1.9.7.3/
8. Compile and install openresty
./configure --prefix=/usr/local/openresty --with-luajit && make && make install
9. Upload nginx-1.8.1.tar.gz and extract it
tar -zxvf nginx-1.8.1.tar.gz -C /usr/local/src/
cd /usr/local/src/nginx-1.8.1/
10. Compile nginx with support for the additional modules
./configure --prefix=/usr/local/nginx \
    --with-ld-opt="-Wl,-rpath,/usr/local/luajit/lib" \
    --add-module=/usr/local/nginx/modules/ngx_devel_kit-0.2.19 \
    --add-module=/usr/local/nginx/modules/lua-nginx-module-0.10.0 \
    --add-module=/usr/local/nginx/modules/set-misc-nginx-module-0.29 \
    --add-module=/usr/local/nginx/modules/echo-nginx-module-0.58
make -j2
make install
11. Edit the nginx configuration file
worker_processes 2;

events {
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;

    log_format tick "$msec^A$remote_addr^A$u_domain^A$u_url^A$u_title^A$u_referrer^A$u_sh^A$u_sw^A$u_cd^A$u_lang^A$http_user_agent^A$u_utrace^A$u_account";
    access_log logs/access.log tick;

    sendfile on;
    keepalive_timeout 65;

    server {
        listen 80;
        server_name localhost;

        # Matches the path requested by track.js (http://spark3/log.gif)
        location /log.gif {
            # Disguise the response as a gif file
            default_type image/gif;
            # Turn off access_log for this location; logging is done through a subrequest
            access_log off;

            access_by_lua "
                -- The user-tracking cookie is named __utrace
                local uid = ngx.var.cookie___utrace
                if not uid then
                    -- If it is missing, generate a tracking cookie as md5(timestamp + IP + client info)
                    uid = ngx.md5(ngx.now() .. ngx.var.remote_addr .. ngx.var.http_user_agent)
                end
                ngx.header['Set-Cookie'] = {'__utrace=' .. uid .. '; path=/'}
                if ngx.var.arg_domain then
                    -- Issue a subrequest to /i-log to record the log entry, passing along the parameters and the tracking cookie
                    ngx.location.capture('/i-log?' .. ngx.var.args .. '&utrace=' .. uid)
                end
            ";

            # Do not cache this request
            add_header Expires "Fri, 01 Jan 1980 00:00:00 GMT";
            add_header Pragma "no-cache";
            add_header Cache-Control "no-cache, max-age=0, must-revalidate";

            # Return a 1x1 empty gif image
            empty_gif;
        }

        location /i-log {
            # Internal location; direct external access is not allowed
            internal;

            # Set the variables; note that the values need to be unescaped
            set_unescape_uri $u_domain $arg_domain;
            set_unescape_uri $u_url $arg_url;
            set_unescape_uri $u_title $arg_title;
            set_unescape_uri $u_referrer $arg_referrer;
            set_unescape_uri $u_sh $arg_sh;
            set_unescape_uri $u_sw $arg_sw;
            set_unescape_uri $u_cd $arg_cd;
            set_unescape_uri $u_lang $arg_lang;
            set_unescape_uri $u_utrace $arg_utrace;
            set_unescape_uri $u_account $arg_account;

            # Turn on logging for subrequests
            log_subrequest on;
            # Write the log to track.log (the file logstash reads) in the tick format; in practice it is best to add a buffer
            access_log /var/nginx_logs/track.log tick;

            # Output an empty string
            echo '';
        }
    }
}
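To make the data flow through /i-log easier to follow, the sketch below imitates in plain JavaScript what the `set_unescape_uri` directives and the `tick` log format do together: decode each query argument, then join the fields with the delimiter in the order given by `log_format`. It is only an illustration, not how nginx implements it. The names `parseArgs` and `toTickLine` are hypothetical, `decodeURIComponent` is used as a stand-in for nginx's unescaping, and the delimiter is written as the two visible characters `^A`, as in the configuration above.

```javascript
// Illustration only: field order copied from the "tick" log_format.
var DELIM = '^A';
var ARG_ORDER = ['domain', 'url', 'title', 'referrer', 'sh', 'sw', 'cd', 'lang'];

// Split a raw query string into decoded key/value pairs
// (stand-in for the set_unescape_uri directives).
function parseArgs(query) {
  var args = {};
  query.split('&').forEach(function (pair) {
    if (pair === '') {
      return;
    }
    var eq = pair.indexOf('=');
    var key = eq === -1 ? pair : pair.slice(0, eq);
    var value = eq === -1 ? '' : pair.slice(eq + 1);
    args[key] = decodeURIComponent(value);
  });
  return args;
}

// Build one log line; a missing argument becomes an empty field.
function toTickLine(msec, remoteAddr, userAgent, query) {
  var args = parseArgs(query);
  var fields = [msec, remoteAddr];
  ARG_ORDER.forEach(function (name) {
    fields.push(args[name] || '');
  });
  fields.push(userAgent, args.utrace || '', args.account || '');
  return fields.join(DELIM);
}

console.log(toTickLine(
  '1489718385.448',
  '192.168.154.2',
  'UA',
  'domain=spark2&url=http%3A%2F%2Fspark2%2F&sh=768&sw=1366&cd=24&lang=zh-CN&account=zx5352&utrace=abc'
));
// prints 1489718385.448^A192.168.154.2^Aspark2^Ahttp://spark2/^A^A^A768^A1366^A24^Azh-CN^AUA^Aabc^Azx5352
```

The empty title and referrer show up as consecutive delimiters, which is the same shape as an entry for a page visited without a referrer.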
View the log:
1489718383.170^A192.168.154.2^Aspark2^Ahttp://spark2/^A\xE6\xA3\xA3\xE6\xA0\xAD\xE3\x80\x89^A^A768^A1366^A24^Azh-CN^AMozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36^A0f21f45cf2c1ba459e9812ee3de17d8a^Azx5352
1489718385.448^A192.168.154.2^Aspark2^Ahttp://spark2/content.html^A\xE5\x86\x85\xE5\xAE\xB9^Ahttp://spark2/^A768^A1366^A24^Azh-CN^AMozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36^A0f21f45cf2c1ba459e9812ee3de17d8a^Azx5352
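Each record is a flat list of 13 fields in the order given by the `tick` format, with non-ASCII bytes written as `\xHH` escapes. Below is a minimal sketch of turning one such line back into a named record. The names `decodeNginxEscapes` and `parseTickLine` are hypothetical, and the delimiter is assumed to be the two visible characters `^A` as displayed here; if the real file contains the control character instead, `DELIM` would need to be `'\x01'`.

```javascript
// Field names follow the order of the "tick" log_format.
var FIELDS = ['msec', 'remote_addr', 'domain', 'url', 'title', 'referrer',
  'sh', 'sw', 'cd', 'lang', 'user_agent', 'utrace', 'account'];
var DELIM = '^A'; // assumption: delimiter as displayed in the sample log

// Turn \xHH escapes back into bytes and decode the result as UTF-8.
function decodeNginxEscapes(text) {
  var bytes = [];
  var encoder = new TextEncoder();
  var i = 0;
  while (i < text.length) {
    var match = /^\\x([0-9A-Fa-f]{2})/.exec(text.slice(i, i + 4));
    if (match) {
      bytes.push(parseInt(match[1], 16));
      i += 4;
    } else {
      var ch = String.fromCodePoint(text.codePointAt(i));
      encoder.encode(ch).forEach(function (b) { bytes.push(b); });
      i += ch.length;
    }
  }
  return new TextDecoder('utf-8').decode(Uint8Array.from(bytes));
}

// Split a log line into a record keyed by field name.
function parseTickLine(line) {
  var parts = line.split(DELIM);
  var record = {};
  FIELDS.forEach(function (name, index) {
    record[name] = parts[index] !== undefined ? parts[index] : '';
  });
  record.title = decodeNginxEscapes(record.title);
  // $msec is seconds with millisecond resolution.
  record.time = new Date(Math.round(parseFloat(record.msec) * 1000));
  return record;
}

var sampleLine = '1489718385.448^A192.168.154.2^Aspark2^Ahttp://spark2/content.html^A' +
  '\\xE5\\x86\\x85\\xE5\\xAE\\xB9^Ahttp://spark2/^A768^A1366^A24^Azh-CN^AMozilla/5.0^A' +
  '0f21f45cf2c1ba459e9812ee3de17d8a^Azx5352';
var parsed = parseTickLine(sampleLine);
console.log(parsed.url, parsed.title, parsed.account);
```

The same split-by-position approach could be applied to the `message` field further downstream, since the field order is fixed by the log format.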
4: Log collection
logstash configuration file:
input {
    file {
        type => "syslog"
        path => "/var/nginx_logs/track.log"
        discover_interval => 10
        start_position => "beginning"
    }
}
output {
    stdout { codec => rubydebug }
}
[root@spark3 logstash]# bin/logstash -f config/log.conf
Log events printed to the screen by logstash:
{
    "message" => "1489718383.170^A192.168.154.2^Aspark2^Ahttp://spark2/^A\\xE6\\xA3\\xA3\\xE6\\xA0\\xAD\\xE3\\x80\\x89^A^A768^A1366^A24^Azh-CN^AMozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36^A0f21f45cf2c1ba459e9812ee3de17d8a^Azx5352",
    "@version" => "1",
    "@timestamp" => "2017-03-17T03:12:34.380Z",
    "path" => "/var/nginx_logs/track.log",
    "host" => "spark3",
    "type" => "syslog"
}
{
    "message" => "1489718385.448^A192.168.154.2^Aspark2^Ahttp://spark2/content.html^A\\xE5\\x86\\x85\\xE5\\xAE\\xB9^Ahttp://spark2/^A768^A1366^A24^Azh-CN^AMozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36^A0f21f45cf2c1ba459e9812ee3de17d8a^Azx5352",
    "@version" => "1",
    "@timestamp" => "2017-03-17T03:12:34.906Z",
    "path" => "/var/nginx_logs/track.log",
    "host" => "spark3",
    "type" => "syslog"
}