OpenResty + Lua + Kafka: Building a Log Collection System, and Pitfalls Hit During Deployment

********************* Deployment **************************

1: Scenario Description

High-traffic online services, and any nginx service that needs to report logs, produce a huge volume of log data every day, and that data is very valuable. It can serve metrics reporting, user-behavior analysis, interface quality, performance monitoring, and similar needs. With the traditional way of recording nginx logs, however, the data stays scattered across the individual nginx machines, and at high volume the log writes themselves put real pressure on the disks.
We need to collect and aggregate these nginx logs in one place, and the collection process and its output must satisfy the following requirements:
Different businesses can consume the data, e.g. monitoring, analytics/statistics, and recommendation services.
The data is available in real time.
High performance is guaranteed.

2: Technical Approach

Thanks to the high performance of OpenResty and Kafka, we can meet these requirements in a very lightweight and efficient way. The architecture is as follows:
(architecture diagram)
Approach:
1: Online requests hit nginx, and Lua prepares the log record: normalizing the log format, filtering out invalid requests, grouping, and so on.
2: The nginx logs of different businesses are split into different topics.
3: A Lua producer sends the logs to the Kafka cluster asynchronously.
4: Each business group interested in a given log stream consumes its topic in real time.

3: Related Projects
openresty: http://openresty.org
kafka: http://kafka.apache.org
lua-resty-kafka: https://github.com/doujiang24/lua-resty-kafka

4: Installation and Configuration
To keep things simple we deploy everything on a single machine; a cluster setup is similar.
1) Install the OpenResty build dependencies:
apt-get install libreadline-dev libncurses5-dev libpcre3-dev libssl-dev perl make build-essential
# or
yum install readline-devel pcre-devel openssl-devel gcc

2) Download and build OpenResty:
# 1: Install OpenResty:
cd /opt/nginx/ # directory holding the installation files
wget https://openresty.org/download/openresty-1.9.7.4.tar.gz
tar -xzf openresty-1.9.7.4.tar.gz -C /opt/nginx/
cd openresty-1.9.7.4

# Configure:
# install into /opt/openresty (default is /usr/local).
./configure --prefix=/opt/openresty \
--with-luajit \
--without-http_redis2_module \
--with-http_iconv_module
make
make install

3) Install lua-resty-kafka

# Download lua-resty-kafka:
wget https://github.com/doujiang24/lua-resty-kafka/archive/master.zip -O lua-resty-kafka-master.zip
unzip lua-resty-kafka-master.zip -d /opt/nginx/

# Copy lua-resty-kafka into OpenResty
mkdir /opt/openresty/lualib/kafka
cp -rf /opt/nginx/lua-resty-kafka-master/lib/resty /opt/openresty/lualib/kafka/

4) Install a single-node Kafka
cd /opt/nginx/
wget http://apache.fayea.com/kafka/0.9.0.1/kafka_2.10-0.9.0.1.tgz
tar xvf kafka_2.10-0.9.0.1.tgz
cd kafka_2.10-0.9.0.1

# Start a single-node ZooKeeper
nohup sh bin/zookeeper-server-start.sh config/zookeeper.properties > ./zk.log 2>&1 &
# Bind the broker IP -- this is required
# edit host.name in config/server.properties
host.name={your_server_ip}
# Start the Kafka server
nohup sh bin/kafka-server-start.sh config/server.properties > ./server.log 2>&1 &
# Create a test topic
sh bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic test1 --partitions 1 --replication-factor 1

5: Configure and Run

Edit /opt/openresty/nginx/conf/nginx.conf so that nginx logs are written to Kafka. The full configuration is below:
worker_processes 12;

events { 
use epoll; 
worker_connections 65535; 
}

http { 
include mime.types; 
default_type application/octet-stream; 
sendfile on; 
keepalive_timeout 0; 
gzip on; 
gzip_min_length 1k; 
gzip_buffers 4 8k; 
gzip_http_version 1.1; 
gzip_types text/plain application/x-javascript text/css application/xml application/X-JSON; 
charset UTF-8; 
# backend upstream servers
upstream rc {
server 10.10.*.15:8080 weight=5 max_fails=3;
server 10.10.*.16:8080 weight=5 max_fails=3;
server 10.16.*.54:8080 weight=5 max_fails=3;
server 10.16.*.55:8080 weight=5 max_fails=3;
server 10.10.*.113:8080 weight=5 max_fails=3;
server 10.10.*.137:8080 weight=6 max_fails=3;
server 10.10.*.138:8080 weight=6 max_fails=3;
server 10.10.*.33:8080 weight=4 max_fails=3;
# maximum number of idle keepalive connections
keepalive 32;
}

# search path for the Lua libraries
lua_package_path "/opt/openresty/lualib/kafka/?.lua;;";

server {  
    listen       80;  
    server_name  localhost;  
    location /favicon.ico {  
        root   html;  
            index  index.html index.htm;  
    }  
    location / {  
        proxy_connect_timeout 8;  
        proxy_send_timeout 8;  
        proxy_read_timeout 8;  
        proxy_buffer_size 4k;  
        proxy_buffers 512 8k;  
        proxy_busy_buffers_size 8k;  
        proxy_temp_file_write_size 64k;  
        proxy_next_upstream http_500 http_502  http_503 http_504  error timeout invalid_header;  
        root   html;  
        index  index.html index.htm;  
        proxy_pass http://rc;  
        proxy_http_version 1.1;  
        proxy_set_header Connection "";  
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;  
        # log_by_lua runs in the log phase, after the request finishes,
        # so it does not interfere with the proxy_pass mechanism
        log_by_lua '
            -- load the required Lua modules
            local cjson = require "cjson"
            local producer = require "resty.kafka.producer"
            -- Kafka broker list; the IP must match the host.name configured on the broker
            local broker_list = {
                { host = "10.10.78.52", port = 9092 },
            }
            -- gather the log fields into a table for JSON encoding
            local log_json = {}
            log_json["uri"]=ngx.var.uri  
            log_json["args"]=ngx.var.args  
            log_json["host"]=ngx.var.host  
            log_json["request_body"]=ngx.var.request_body  
            log_json["remote_addr"] = ngx.var.remote_addr  
            log_json["remote_user"] = ngx.var.remote_user  
            log_json["time_local"] = ngx.var.time_local  
            log_json["status"] = ngx.var.status  
            log_json["body_bytes_sent"] = ngx.var.body_bytes_sent  
            log_json["http_referer"] = ngx.var.http_referer  
            log_json["http_user_agent"] = ngx.var.http_user_agent  
            log_json["http_x_forwarded_for"] = ngx.var.http_x_forwarded_for  
            log_json["upstream_response_time"] = ngx.var.upstream_response_time  
            log_json["request_time"] = ngx.var.request_time  
            -- serialize the table to a JSON string
            local message = cjson.encode(log_json)
            -- create an async Kafka producer
            local bp = producer:new(broker_list, { producer_type = "async" })
            -- send the message; the second send argument is the routing key:
            -- with a nil key, writes go to one partition for a period of time
            -- with a key set, the hash of the key picks the partition
            local ok, err = bp:send("test1", nil, message)

            if not ok then  
                ngx.log(ngx.ERR, "kafka send err:", err)  
                return  
            end  
        ';  
    }  
    error_page   500 502 503 504  /50x.html;  
    location = /50x.html {  
        root   html;  
    }  
}

 

}
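The routing-key behavior of `bp:send` (nil key spreads writes over time, a fixed key pins a partition) can be illustrated with a generic hash partitioner; the exact partitioner used by lua-resty-kafka may differ in detail:

```python
import zlib

_rr = [0]  # round-robin state for keyless sends

def pick_partition(key, num_partitions):
    """Generic sketch of Kafka-style routing: a key hashes to a stable
    partition; with no key, writes rotate across partitions over time."""
    if key is None:
        _rr[0] = (_rr[0] + 1) % num_partitions
        return _rr[0]
    return zlib.crc32(key.encode()) % num_partitions

# the same key always lands on the same partition
assert pick_partition("user-42", 4) == pick_partition("user-42", 4)
```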

6: Check & Run

# check the config; this only validates the nginx configuration itself --
# Lua runtime errors are written to nginx's error.log
./nginx -t -c /opt/openresty/nginx/conf/nginx.conf
# start
./nginx -c /opt/openresty/nginx/conf/nginx.conf
# reload
./nginx -s reload

7: Testing

1: Send any HTTP request to this nginx, for example:

http://10.10.78.52/m/personal/AC8E3BC7-6130-447B-A9D6-DF11CB74C3EF/rc/v1?passport=83FBC7337D681E679FFBA1B913E22A0D@qq.sohu.com&page=2&size=10

2: Check that the upstream proxying works normally.
3: Check that the corresponding Kafka topic is receiving the log messages:

# consume the topic from the beginning
sh kafka-console-consumer.sh --zookeeper 10.10.78.52:2181 --topic test1 --from-beginning

Observed result:
(screenshot: consumer output)
4: ab load test

# single nginx + upstream:
ab -n 10000 -c 100 -k "http://10.10.34.15/m/personal/AC8E3BC7-6130-447B-A9D6-DF11CB74C3EF/rc/v1?passport=83FBC7337D681E679FFBA1B913E22A0D@qq.sohu.com&page=2&size=10"

# Results
Server Software: nginx 
Server Hostname: 10.10.34.15 
Server Port: 80 
Document Path: /m/personal/AC8E3BC7-6130-447B-A9D6-DF11CB74C3EF/rc/v1?passport=83FBC7337D681E679FFBA1B913E22A0D@qq.sohu.com 
Document Length: 13810 bytes 
Concurrency Level: 100 
Time taken for tests: 2.148996 seconds 
Complete requests: 10000 
Failed requests: 9982 
(Connect: 0, Length: 9982, Exceptions: 0) 
Write errors: 0 
Keep-Alive requests: 0 
Total transferred: 227090611 bytes 
HTML transferred: 225500642 bytes 
Requests per second: 4653.34 [#/sec] (mean) 
Time per request: 21.490 [ms] (mean) 
Time per request: 0.215 [ms] (mean, across all concurrent requests) 
Transfer rate: 103196.10 [Kbytes/sec] received 
Connection Times (ms) 
min mean[+/-sd] median max 
Connect: 0 0 0.1 0 2 
Processing: 5 20 23.6 16 701 
Waiting: 4 17 20.8 13 686 
Total: 5 20 23.6 16 701 
Percentage of the requests served within a certain time (ms) 
50% 16 
66% 20 
75% 22 
80% 25 
90% 33 
95% 41 
98% 48 
99% 69 
100% 701 (longest request)
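Two remarks on reading this report. The high "Failed requests" count is only ab flagging responses whose byte length differs from the first response, which is expected for dynamic content. And the headline figures follow from the raw counts:

```python
# Recompute ab's derived metrics from the reported raw numbers.
requests = 10000
concurrency = 100
total_time = 2.148996  # seconds, "Time taken for tests"

rps = requests / total_time                                  # "Requests per second"
per_request_ms = concurrency * total_time / requests * 1000  # mean "Time per request"
print(round(rps, 2), round(per_request_ms, 3))
```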

# single nginx + upstream + log_by_lua Kafka hook:
ab -n 10000 -c 100 -k "http://10.10.78.52/m/personal/AC8E3BC7-6130-447B-A9D6-DF11CB74C3EF/rc/v1?passport=83FBC7337D681E679FFBA1B913E22A0D@qq.sohu.com&page=2&size=10"

# Results
Server Software: openresty/1.9.7.4 
Server Hostname: 10.10.78.52 
Server Port: 80 
Document Path: /m/personal/AC8E3BC7-6130-447B-A9D6-DF11CB74C3EF/rc/v1?passport=83FBC7337D681E679FFBA1B913E22A0D@qq.sohu.com 
Document Length: 34396 bytes 
Concurrency Level: 100 
Time taken for tests: 2.234785 seconds 
Complete requests: 10000 
Failed requests: 9981 
(Connect: 0, Length: 9981, Exceptions: 0) 
Write errors: 0 
Keep-Alive requests: 0 
Total transferred: 229781343 bytes 
HTML transferred: 228071374 bytes 
Requests per second: 4474.70 [#/sec] (mean) 
Time per request: 22.348 [ms] (mean) 
Time per request: 0.223 [ms] (mean, across all concurrent requests) 
Transfer rate: 100410.10 [Kbytes/sec] received 
Connection Times (ms) 
min mean[+/-sd] median max 
Connect: 0 0 0.2 0 3 
Processing: 6 20 27.6 17 1504 
Waiting: 5 15 12.0 14 237 
Total: 6 20 27.6 17 1504 
Percentage of the requests served within a certain time (ms) 
50% 17 
66% 19 
75% 21 
80% 23 
90% 28 
95% 34 
98% 46 
99% 67 
100% 1504 (longest request)
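Comparing the two runs gives a rough figure for the cost of the Kafka logging hook in this test, a throughput drop of a few percent:

```python
# Throughput figures copied from the two ab reports above.
rps_plain = 4653.34  # upstream only
rps_kafka = 4474.70  # upstream + log_by_lua Kafka producer

overhead_pct = (rps_plain - rps_kafka) / rps_plain * 100
print(f"throughput overhead: {overhead_pct:.1f}%")  # → throughput overhead: 3.8%
```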

 

********************* The Core Module **************************

The nginx configuration file is as follows:

#user  nobody;
worker_processes  1;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;

#pid        logs/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       mime.types;
    default_type  application/octet-stream;

    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    upstream myServer {
    server 192.168.0.109:8080 weight=1;
    }

    lua_package_path "/opt/openresty/lualib/kafka/?.lua;;";
    lua_need_request_body on;

    server {
        listen       80;
        server_name  localhost;

        #charset koi8-r;

        #access_log  logs/host.access.log  main;

        location /test1 {
            # forward requests to the custom upstream group
            proxy_pass http://myServer;
        }

        location /test2 {

            # log_by_lua runs in the log phase, after the request finishes,
            # so it does not interfere with the proxy_pass mechanism
            log_by_lua '
                -- load the required Lua modules
                local topic = "test"
                local cjson = require "cjson"
                local producer = require "resty.kafka.producer"
                -- Kafka broker list; the IPs must match the host.name configured on each broker
                local broker_list = {
                    { host = "192.168.0.109", port = 9092 },
                    { host = "192.168.0.110", port = 9092 },
                    { host = "192.168.0.101", port = 9092 }
                }
                -- gather the log fields into a table for JSON encoding
                local log_json = {}
                log_json["uri"] = ngx.var.uri
                log_json["args"] = ngx.req.get_uri_args()
                log_json["host"] = ngx.var.host
                log_json["request_body"] = ngx.var.request_body
                log_json["remote_addr"] = ngx.var.remote_addr
                log_json["remote_user"] = ngx.var.remote_user
                log_json["time_local"] = ngx.var.time_local
                log_json["status"] = ngx.var.status
                log_json["body_bytes_sent"] = ngx.var.body_bytes_sent
                log_json["http_referer"] = ngx.var.http_referer
                log_json["http_user_agent"] = ngx.var.http_user_agent
                log_json["http_x_forwarded_for"] = ngx.var.http_x_forwarded_for
                log_json["upstream_response_time"] = ngx.var.upstream_response_time
                log_json["request_time"] = ngx.var.request_time
                -- serialize the table to a JSON string
                local message = cjson.encode(log_json)
                -- create an async Kafka producer
                local bp = producer:new(broker_list, { producer_type = "async" })
                -- send the message; the second send argument is the routing key:
                -- with a nil key, writes go to one partition for a period of time
                -- with a key set, the hash of the key picks the partition
                local ok, err = bp:send(topic, nil, message)

                if not ok then
                    ngx.log(ngx.ERR, "kafka send err:", err)
                    return
                end
            ';
        }


        #error_page  404              /404.html;

        # redirect server error pages to the static page /50x.html
        #
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }

        # proxy the PHP scripts to Apache listening on 127.0.0.1:80
        #
        #location ~ \.php$ {
        #    proxy_pass   http://127.0.0.1;
        #}

        # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
        #
        #location ~ \.php$ {
        #    root           html;
        #    fastcgi_pass   127.0.0.1:9000;
        #    fastcgi_index  index.php;
        #    fastcgi_param  SCRIPT_FILENAME  /scripts$fastcgi_script_name;
        #    include        fastcgi_params;
        #}

        # deny access to .htaccess files, if Apache's document root
        # concurs with nginx's one
        #
        #location ~ /\.ht {
        #    deny  all;
        #}
    }


    # another virtual host using mix of IP-, name-, and port-based configuration
    #
    #server {
    #    listen       8000;
    #    listen       somename:8080;
    #    server_name  somename  alias  another.alias;

    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}


    # HTTPS server
    #
    #server {
    #    listen       443 ssl;
    #    server_name  localhost;

    #    ssl_certificate      cert.pem;
    #    ssl_certificate_key  cert.key;

    #    ssl_session_cache    shared:SSL:1m;
    #    ssl_session_timeout  5m;

    #    ssl_ciphers  HIGH:!aNULL:!MD5;
    #    ssl_prefer_server_ciphers  on;

    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}

}
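To develop downstream consumers without a live nginx in front, the record emitted by the log_by_lua block can be mirrored in a few lines of Python; the field names match the Lua table, and the sample values are made up:

```python
import json

# mirror of the log_json table built in log_by_lua (sample values fabricated)
record = {
    "uri": "/test2",
    "args": {"page": "2", "size": "10"},
    "host": "192.168.0.109",
    "remote_addr": "10.0.0.7",
    "time_local": "20/Apr/2018:10:00:00 +0800",
    "status": "200",
    "body_bytes_sent": "1024",
    "request_time": "0.012",
}

message = json.dumps(record)  # plays the role of cjson.encode(log_json)
assert json.loads(message)["uri"] == "/test2"
```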

 

********************* Pitfalls **************************

Problem:

Using the OpenResty nginx Lua script on server1 to write data into Kafka on server5 failed with a "no resolver defined to resolve "xxxxx"" error, where xxxxx is the hostname of one of the machines. After a day of digging, the cause turned up.

Root cause:

It turns out that OpenResty does not consult the hosts-file mappings. The Kafka client first connects to a broker by IP and asks it for the cluster metadata kept in ZooKeeper, where the broker address is recorded as a hostname (e.g. kafka236:1111). The Lua client therefore gets back the hostname kafka236, cannot resolve it through the hosts file, and fails with the "no resolver" error.

Solution
If a DNS service is available (for example on the router), simply add a record for the hostname there and point nginx at that DNS server in its configuration; if none is available, you need to set up your own DNS service.

nginx.conf configuration:
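The screenshot referenced here is not reproduced; on the nginx side the change amounts to a `resolver` directive in the http block pointing at the internal DNS server (the address below is hypothetical):

```nginx
http {
    # hypothetical internal DNS server that resolves the broker hostnames
    # (e.g. kafka236) that the cluster metadata hands back to the Lua client
    resolver 10.10.0.2 valid=30s;
}
```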

 

DNS configuration:
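Likewise the DNS screenshot is not reproduced; conceptually it is just a record mapping each broker hostname to its IP, e.g. in dnsmasq syntax (values hypothetical):

```
# /etc/dnsmasq.conf -- hypothetical broker hostname -> IP mapping
address=/kafka236/10.10.78.53
```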

 

Notes:
1. If the Kafka server is configured with an IP or a domain name, a Kafka client on the same machine cannot connect via localhost (unless the server itself is also configured with localhost).

2. If the Kafka listener is configured with an IP, an IP address is recorded in ZooKeeper.

   If the listener is configured with a hostname, a hostname is recorded in ZooKeeper.

   If advertised.listeners is set to a hostname, ZooKeeper records the hostname, regardless of what the listener is set to.
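Note 2 can be summarized directly in server.properties terms (the values below are illustrative, not from the original deployment):

```properties
# config/server.properties -- what ends up registered in ZooKeeper

# listener bound to an IP: ZooKeeper records the IP
listeners=PLAINTEXT://10.10.78.52:9092

# listener bound to a hostname: ZooKeeper records the hostname
#listeners=PLAINTEXT://kafka236:9092

# advertised.listeners overrides listeners for what is registered
#advertised.listeners=PLAINTEXT://kafka236:9092
```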

 

A later finding:
The older openresty-1.7.10.2 could reach Kafka whether the brokers were configured with hostnames or IPs.

The newer openresty-1.13.6.2 cannot reach Kafka when the brokers are configured with hostnames; only IPs work, even with a resolver configured.
