Building an EFK Logging System

Reference: dapeng log collection, processing, and query applications

Recently, following dapeng's log management system, I decided to build an EFK stack of my own. Here I record the pitfalls I ran into along the way; many thanks to Ever_00, 洋洋_3720, and the other experts for their support and help.

Technology Selection

Back to the topic: we again chose fluent-bit + fluentd + kafka + elasticsearch as the logging stack. dapeng services already integrate a single-node fluent-bit that collects the log files of each docker container and forwards them to fluentd; fluentd acts as a relay, funneling all logs into kafka to smooth out traffic peaks, and then forwards the buffered data from kafka to elasticsearch for storage. We did not use Kibana in this setup; elasticsearch-head provides the log viewing UI instead.

Non-dapeng services need to modify their own Dockerfile to package the modified fluent-bit into the container, and run sh /opt/fluent-bit/fluent-bit.sh when the service starts.
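For a non-dapeng service, the Dockerfile change can be sketched as below. This is a minimal sketch: the base image, the fluent-bit copy path, and the service launch command are assumptions for illustration, not dapeng's actual build.

```dockerfile
# Minimal sketch for a non-dapeng service (base image and paths are assumptions).
FROM openjdk:8-jre
# Bundle the modified fluent-bit (binary, config, startup script) into the image.
COPY fluent-bit/ /opt/fluent-bit/
COPY build/service.jar /app/service.jar
# Start fluent-bit in the background, then launch the service itself.
CMD sh /opt/fluent-bit/fluent-bit.sh & exec java -jar /app/service.jar
```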

fluent-bit log collection configuration

fluent-bit-dapeng.conf

[SERVICE]
    Flush        5
    Daemon       On
    Log_Level    error
    Log_File     /fluent-bit/log/fluent-bit.log
    Parsers_File parse_dapeng.conf

[INPUT]
    Name tail
    Path /dapeng-container/logs/*.log
    Exclude_Path  /dapeng-container/logs/fluent*.log,/dapeng-container/logs/console.log,/dapeng-container/logs/gc*.log
    Tag  dapeng
    Multiline  on
    Buffer_Chunk_Size 2m
    buffer_max_size  30m
    Mem_Buf_Limit  32m
    DB.Sync  Normal
    db_count 400
    Parser_Firstline dapeng_multiline
    db  /fluent-bit/db/logs.db

[FILTER]
    Name record_modifier
    Match *
    Record hostname ${soa_container_ip}
    Record tag ${serviceName}

[OUTPUT]
    Name  Forward
    Match *
    Host  fluentd
    Port  24224
    HostStandby fluentdStandby
    PortStandby 24224

In a dapeng service, the serviceName, soa_container_ip, fluentd, and fluentdStandby settings are required for every service. Path and Exclude_Path specify which logs to collect and which to skip, and can be overridden through environment variables:

fluentBitLogPath=/dapeng-container/logs/*.log
fluentBitLogPathExclude=/dapeng-container/logs/fluent*.log,/dapeng-container/logs/console.log,/dapeng-container/logs/gc*.log

The fluent-bit-dapeng.conf above also needs to be mounted into the container at /opt/fluent-bit/etc/fluent-bit.conf:

environment:
       - serviceName=payment
       - container_ip=${host_ip}
       - soa_container_port=${payment_port}
       - soa_container_ip=${host_ip}
       - host_ip=${host_ip}
       - soa_service_timeout=60000
       - JAVA_OPTS=-Dname=payment -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8  -Dio.netty.leakDetectionLevel=advanced
       - kafka_consumer_host=${kafka_host_ip}:9092
       - kafka_producer_host=${kafka_host_ip}:9092
     env_file:
       - .envs/application.env
       - .envs/common.env
     volumes:
       - "/data/logs/payment:/dapeng-container/logs"
       - "/data/var/fluent/order/:/fluent-bit/db/"
       - "./config/fluent-bit-dapeng.conf:/opt/fluent-bit/etc/fluent-bit.conf"
       - "/data/var/shm:/data/shm"
     ports:
       - "${payment_port}:${payment_port}"
     extra_hosts:
       - "fluentd:${fluentd_host}"
       - "fluentdStandby:${fluentdStandby_host}"
       - "db-master:${mysql_host_ip}"
       - "soa_zookeeper:${zookeeper_host_ip}"
       - "redis_host:${redis_host_ip}"

Inside the dapeng service container, parse_dapeng.conf looks like this:

[PARSER]
    Name        dapeng_multiline
    Format      regex
    Regex       (?<logtime>\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2} \d{1,3}) (?<threadPool>.*) (?<level>.*) \[(?<sessionTid>.*)\] - (?<message>.*)

The Regex parses each log line with a regular expression to extract the fields we need, such as logtime and message.
The parsing expression can also be set through an environment variable:

fluentbitParserRegex=(?<logtime>^\d{2}-\d{2} \d{2}:\d{2}:\d{2} \d{3}) (?<threadPool>[^ ]+|Check idle connection Thread) (?<level>[^ ]+) \[(?<sessionTid>\w*)\] - (?<message>.*)
Note: although dapeng integrates fluent-bit, it is disabled by default; enable it with the environment variable:
fluent_bit_enable=true
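fluent-bit's parser uses Onigmo-style named groups `(?<name>...)`, while Python spells them `(?P<name>...)`. As a quick sanity check of the expression above, here is a sketch in Python run against a made-up log line (all values are illustrative):

```python
import re

# fluentbitParserRegex from above, transliterated to Python's named-group syntax.
pattern = re.compile(
    r"(?P<logtime>^\d{2}-\d{2} \d{2}:\d{2}:\d{2} \d{3}) "
    r"(?P<threadPool>[^ ]+|Check idle connection Thread) "
    r"(?P<level>[^ ]+) \[(?P<sessionTid>\w*)\] - (?P<message>.*)"
)

# A made-up log line in the dapeng layout.
line = "05-21 10:15:30 123 dapeng-biz-pool-0 INFO [ac120002abcdef] - request handled"
m = pattern.match(line)
print(m.group("logtime"))   # 05-21 10:15:30 123
print(m.group("level"))     # INFO
print(m.group("message"))   # request handled
```

Lines that fail to match this pattern keep their raw content in the log field instead of the parsed fields, which is exactly the symptom described in the TODO section at the end.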

The fluentd image

First, prepare the fluentd image. The fluentd Dockerfile is as follows:

FROM fluent/fluentd:v1.2
# add the elasticsearch and kafka plugins
RUN  fluent-gem install fluent-plugin-elasticsearch
RUN  fluent-gem install fluent-plugin-kafka
CMD exec fluentd -c /fluentd/etc/${FLUENTD_CONF} -p /fluentd/plugins $FLUENTD_OPT
  1. Build the image (note: run from the directory containing the Dockerfile; the trailing . is the build context)
    docker build -t docker.****.com:80/basic/fluentd:v1.2 .
  2. Push it to the private docker registry
    docker push docker.****.com:80/basic/fluentd:v1.2
  3. Configure fluentd in the dc-all.yml file (dapeng's source-compose wraps docker-compose)
fluentd:
    container_name: fluentd
    image: docker.****.com:80/basic/fluentd:v1.2
    restart: on-failure:3
    volumes:
      - /data/var/fluentd/log:/fluentd/log
      - /data/var/fluentd/etc:/fluentd/etc
    environment:
      - LANG=zh_CN.UTF-8
      - TZ=CST-8
    ports:
      - "24224:24224"
    labels:
      - project.source=
      - project.extra=public-image
      - project.depends=
      - project.owner=

The fluentd configuration lives under /data/var/fluentd/etc.
fluent.conf configures the fluentd forwarder.
In theory two fluentd instances should be started, one for each of jobs 1 and 2 below; here we merge them into a single service for now.

# 1. Collect logs and send them to kafka (topic: efk)
# start 8 worker threads; worker ports count upward from 24225
<system>
        log_level error
        flush_thread_count 8
        workers 8
</system>
<source>
  @type  forward
  port  24224
</source>
<source>
  @type monitor_agent
  port 24225
</source>

<match dapeng>
  @type kafka_buffered
  brokers <kafka-host>:9092
  topic_key efk
  buffer_type file
  buffer_path /tmp/buffer
  flush_interval 5s
  default_topic efk
  output_data_type json
  compression_codec gzip
  max_send_retries 3
  required_acks -1
  discard_kafka_delivery_failed true
</match>
# 1. End: collect logs and send them to kafka (topic: efk)

# 2. Consume log messages from kafka (topic: efk, consumer group: efk-consumer) and send them to elasticsearch
#<system>
#        log_level error
#        flush_thread_count 2
#        workers 2
#</system>
#<source>
#  @type monitor_agent
#  port 24225
#</source>
<source>
  @type kafka_group
  brokers <kafka-host>:9092
  consumer_group efk-consumer
  topics efk
  format json
  start_from_beginning false
  max_wait_time 5
  max_bytes 1500000
</source>

<match>
    @type elasticsearch
    hosts <elasticsearch-host>:9200
    index_name dapeng_log_index
    type_name  dapeng_log
    #content_type application/x-ndjson
    buffer_type file
    buffer_path /tmp/buffer_file
    buffer_chunk_limit 10m
    buffer_queue_limit 512
    flush_mode interval
    flush_interval 5s
    request_timeout 5s
    flush_thread_count 2
    reload_on_failure true
    resurrect_after 30s
    reconnect_on_error true
    with_transporter_log true
    logstash_format true
    logstash_prefix dapeng_log_index
    template_name dapeng_log_index
    template_file  /fluentd/etc/template.json
    num_threads 2
    utc_index  false
</match>
# 2. End: consume kafka log messages into elasticsearch
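With logstash_format true and logstash_prefix dapeng_log_index, the elasticsearch plugin writes each day's events into a dated index (in local time, since utc_index is false). A small Python sketch of the naming scheme:

```python
from datetime import date

def daily_index(day: date, prefix: str = "dapeng_log_index") -> str:
    """Name of the daily index produced by logstash_format."""
    return prefix + "-" + day.strftime("%Y.%m.%d")

print(daily_index(date(2018, 5, 10)))  # dapeng_log_index-2018.05.10
```

The daily index-rotation script further below must generate names in exactly this form, or the pre-created indices will never receive any data.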

template.json configures the template elasticsearch uses when creating the index:

{
  "template": "dapeng_log_index-*",
  "mappings": {
    "dapeng_log": {
      "properties": {
        "logtime": {
          "type": "date",
          "format": "MM-dd HH:mm:ss SSS"
        },
        "threadPool": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "level": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "tag": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "message": {
          "type": "keyword",
          "ignore_above": 2048,
          "norms": false,
          "index_options": "docs"
        },
        "hostname": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "sessionTid": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "log": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        }
      }
    }
  },
  "settings": {
    "index": {
      "max_result_window": "100000000",
      "number_of_shards": "3",
      "number_of_replicas": "1",
      "codec": "best_compression",
      "translog": {
        "sync_interval": "60s",
        "durability": "async",
        "flush_threshold_size": "1024mb"
      },
      "merge":{
        "policy":{
          "max_merged_segment": "2gb"
        }
      },
      "refresh_interval": "10s"
    }
  },
  "warmers": {}
}
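As a quick local consistency check, the field names mapped in template.json should cover the parser's named groups plus the hostname and tag records added by the record_modifier filter, as well as fluent-bit's raw log field. A Python sketch:

```python
import re

# Named groups from the dapeng_multiline parser regex.
parser_regex = (r"(?<logtime>\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2} \d{1,3}) "
                r"(?<threadPool>.*) (?<level>.*) \[(?<sessionTid>.*)\] - (?<message>.*)")
parsed_fields = set(re.findall(r"\(\?<(\w+)>", parser_regex))

# Fields added outside the parser: the record_modifier filter (hostname, tag)
# and fluent-bit's raw log field.
extra_fields = {"hostname", "tag", "log"}

# Field names mapped in template.json.
mapped_fields = {"logtime", "threadPool", "level", "tag",
                 "message", "hostname", "sessionTid", "log"}

print(parsed_fields | extra_fields == mapped_fields)  # True
```

If a field is parsed but not mapped, elasticsearch will dynamically map it with default settings, bypassing the keyword/norms tuning in the template.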

Preparing the elasticsearch image

  1. elasticsearch configuration in the dc-all.yml file
elasticsearch:
    image: elasticsearch:6.7.1
    container_name: elasticsearch
    restart: on-failure:3
    environment:
      - LANG=zh_CN.UTF-8
      - TZ=CST-8
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - /data/var/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    ports:
      - "9200:9200"
      - "9300:9300"
    labels:
      - project.source=
      - project.extra=public-image
      - project.depends=
      - project.owner=

elasticsearch.yml enables CORS so that elasticsearch-head can access elasticsearch:

cluster.name: "docker-cluster"
network.host: 0.0.0.0

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers: "X-Requested-With, Content-Type, Content-Length, X-User"

elasticsearch fails to start with the error:

max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

That is, the maximum number of virtual memory areas available to the elasticsearch user is too low; at least 262144 is required.

Run sudo vi /etc/sysctl.conf and append the line vm.max_map_count=262144 at the end, then run sudo sysctl -p to reload the configuration and restart elasticsearch.

Preparing the elasticsearch-head image

  1. First, clone the elasticsearch-head project into /data/workspace
    git clone git://github.com/mobz/elasticsearch-head.git
  2. Configure elasticsearch-head in the dc-all.yml file
elasticsearch-head:
    image: mobz/elasticsearch-head:5
    container_name: elasticsearch-head
    restart: on-failure:3
    environment:
      - LANG=zh_CN.UTF-8
      - TZ=CST-8
    volumes:
      - /data/workspace/elasticsearch-head/Gruntfile.js:/usr/src/app/Gruntfile.js
      - /data/workspace/elasticsearch-head/_site/app.js:/usr/src/app/_site/app.js
    ports:
      - "9100:9100"
    labels:
      - project.source=
      - project.extra=public-image
      - project.depends=
      - project.owner=

Gruntfile.js needs the following change at line 97:

connect: {
        server: {
                options: {
                        hostname: '0.0.0.0',
                        port: 9100,
                        base: '.',
                        keepalive: true
                }
        }
}

app.js needs a change at line 4379: replace localhost with the elasticsearch cluster address

/** Replace localhost with the elasticsearch cluster address; in a Docker deployment this is usually the elasticsearch host's address */
this.base_uri = this.config.base_uri || this.prefs.get("app-base_uri") || "http://<elasticsearch-host>:9200/";

Starting the services

Once all of the services above are up, visit http://<elasticsearch-head-host>:9100/ and you should see the interface below (the cluster health is yellow because I have not set up replicas)

(screenshot: elasticsearch-head cluster overview)

Of course, no logs are visible at first because no index has been created yet. We can add a scheduled task that creates each day's index in advance and cleans up old ones:
autoIndex4DapengLog.sh: keep seven days of indices, keep the most recent three days open, and create the next day's index

#!/bin/bash
#
# Close and delete old indices

# @date 2018-05-10 18:00:00
# @description Copyright (c) 2015, github.com/dapeng-soa All Rights Reserved.


date=`date -d "2 days ago" +%Y.%m.%d`
date1=`date -d "6 days ago" +%Y.%m.%d`
echo $date
echo $date1
#close the index from two days ago
curl -H "Content-Type: application/json" -XPOST http://<elasticsearch-host>:9200/dapeng_log_index-$date/_close
#delete the index from six days ago
curl -H "Content-Type: application/json" -XDELETE "http://<elasticsearch-host>:9200/dapeng_log_index-$date1"
#create tomorrow's index
tomorrow=`date -d tomorrow +%Y.%m.%d`
# elasticsearch servers on which the index should be created
ipList=(<elasticsearch-host>:9200)
for i in ${ipList[@]};do
curl -H "Content-Type: application/json" -XPUT http://$i/dapeng_log_index-$tomorrow -d'
{
  "mappings": {
    "_default_": {
            "_all": {
                "enabled": "false"
            }
        },
    "dapeng_log": {
      "properties": {
        "logtime": {
          "type": "date",
          "format": "MM-dd HH:mm:ss SSS"
        },
        "threadPool": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "level": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "tag": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "message": {
          "type": "keyword",
          "ignore_above": 2048,
          "norms": false,
          "index_options": "docs"
        },
        "hostname": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "sessionTid": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        },
        "log": {
          "type": "keyword",
          "norms": false,
          "index_options": "docs"
        }
      }
    }
  },
  "settings": {
    "index": {
      "max_result_window": "100000000",
      "number_of_shards": "3",
      "number_of_replicas": "1",
      "codec": "best_compression",
      "translog": {
        "sync_interval": "60s",
        "durability": "async",
        "flush_threshold_size": "1024mb"
      },
      "merge":{
        "policy":{
          "max_merged_segment": "2gb"
        }
      },
      "refresh_interval": "10s"

    }
  },
  "warmers": {}
}'
response=`curl -H "Content-Type: application/json" -s "http://$i/_cat/indices?v" |grep open | grep dapeng_log_index-$tomorrow |wc -l`

echo -e "\n"

if [ "$response" == 1 ];then
    break
else
    continue
fi
done;
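The date arithmetic in the script above can be summarized with a small Python sketch: close the index from two days ago, delete the one from six days ago, and create tomorrow's.

```python
from datetime import date, timedelta

def rotation_targets(today: date) -> dict:
    """Mirror of the date arithmetic in autoIndex4DapengLog.sh."""
    fmt = "%Y.%m.%d"
    return {
        "close":  "dapeng_log_index-" + (today - timedelta(days=2)).strftime(fmt),
        "delete": "dapeng_log_index-" + (today - timedelta(days=6)).strftime(fmt),
        "create": "dapeng_log_index-" + (today + timedelta(days=1)).strftime(fmt),
    }

targets = rotation_targets(date(2018, 5, 10))
print(targets["close"])   # dapeng_log_index-2018.05.08
print(targets["delete"])  # dapeng_log_index-2018.05.04
print(targets["create"])  # dapeng_log_index-2018.05.11
```

On any given day this leaves roughly seven indices on disk, of which the three most recent (yesterday, today, tomorrow) remain open for queries.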

Use crontab -e to add this command as a scheduled task that runs at 23:00 every day and creates the next day's index:

0 23 * * *    (cd /data/workspace/elasticsearch-head/; sh autoIndex4DapengLog.sh) > /data/workspace/elasticsearch-head/autoIndex4DapengLog.log

You can now view the log data in elasticsearch-head:

(screenshot: log records in elasticsearch-head)

If you want to keep elasticsearch's built-in fields (such as _index, _id, _score) out of the table, modify elasticsearch-head/_site/app.js at line 2038:

_data_handler: function(store) {
        // keep only our custom fields; drop unused ones from the result set
        var customFields = ["logtime", "hostname", "tag", "sessionTid", "threadPool", "level", "message", "log"];
        store.columns = customFields;
        //store.columns = store.columns.filter(i => customFields.indexOf(i) > -1);
        this.tools.text(store.summary);
        this.headers.empty().append(this._header_template(store.columns));
        this.body.empty().append(this._body_template(store.data, store.columns));
        this._reflow();
},

Note that the fields in customFields must match those used when the index was created, and some of them are produced by the fluent-bit parser.

TODO

  1. As the screenshot shows, some records have empty fields. Pasting the value of their log field into https://regex101.com/ reveals that it does not match the parsing Regex above, so those fields were never extracted and the unparsed content stays in log; such entries should be filtered out later.
  2. Build a real-time production-incident alerting system on top of the current logging system.