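An ELK Cluster in Practice at Hundreds of Millions of Page Views

Preface

I maintained ELK years ago, but site log traffic and the number of servers were small back then, so a single machine was basically enough.

Now, however, there are 400+ web servers alone, Nginx logs exceed 50 GB per day, and other business systems add their own logs on top, so the earlier single-machine ELK clearly cannot support the current workload.

Planning

The business currently runs on Alibaba Cloud plus a self-hosted data center. Alibaba Cloud serves production traffic, the self-hosted data center is the disaster recovery site, and production logs are shipped to the self-hosted data center in as close to real time as possible for analysis.

Architecture diagram

Overview

1. Centralized log processing

At first I collected and processed logs on every machine with filebeat + logstash and sent them to elasticsearch. But Logstash is itself a Java application and fairly memory-hungry, and the maintenance cost was high.

Later I switched to rsyslog: all servers send their logs to a single Alibaba Cloud server for storage, which then forwards them via rsyslog to the self-hosted data center. Synchronization is essentially at the millisecond level.

2. Decoupling

Initially Logstash wrote logs straight into Elasticsearch. Whenever the Elasticsearch cluster needed a restart or maintenance, the logs had nowhere to go, and some loss was unavoidable.

With Redis in between, data can be buffered even while Elasticsearch is under maintenance. It also draws a clear line between logstash-shipper and logstash-indexer.

3. Multiple instances per host

The self-hosted data center set aside 3 servers with 256 GB of RAM each for the ELK cluster, but the official recommendation is not to give the JVM more than 32 GB of heap. Roughly, once the heap exceeds about 32 GB the JVM can no longer use compressed object pointers and falls back to ordinary pointers, which consume more resources.

For example, a 48 GB heap may even perform worse than a 20 GB one. See the official link for the detailed explanation.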

https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html

因此爲了避免浪費資源我在每臺機器上部署了兩個Elasticsearch節點,共6個node.bootstrap

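4. Log types collected

  • nginx-access.log
  • nginx-error.log
  • php-error.log
  • php-slow.log
  • action.log: the first four are self-explanatory; this one is a JSON-format log, written by the developers, that records key actions users perform in the admin back end.

5. Server planning

Host names and public IPs have been anonymized. The ELK stack runs on 3 physical machines hosting 6 Elasticsearch cluster nodes, 6 logstash-indexers, 1 logstash-shipper, 1 redis, 1 kibana, and 1 nginx.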

Hostname       Spec                   Role                                       Notes
rsyslog-relay  8 cores, 16G, 1T       Collects the required logs from all        CentOS 7.3
                                      servers; central storage and forwarding
elk01          16 cores, 256G, 9.8T   nginx, kibana, rsyslog-server,             All 3 servers have identical specs.
                                      logstash-shipper, elk01-indexer,           All run CentOS 7.3.
                                      elk01-indexer2, elk01-elasticsearch,       Disks: 12 x 1.8T 10k SAS, RAID 0+1.
                                      elk01-elasticsearch2
elk02          16 cores, 256G, 9.8T   elk02-indexer, elk02-indexer2,
                                      elk02-elasticsearch, elk02-elasticsearch2
elk03          16 cores, 256G, 9.8T   elk03-indexer, elk03-indexer2,
                                      elk03-elasticsearch, elk03-elasticsearch2

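Deployment

rsyslog

Upgrade

Logs are stored centrally in the local data center. CentOS 7.3 ships rsyslog v7, so upgrade to v8 first, because the v8 rsyslog-relp module retransmits messages, which prevents data loss.

Remove the existing version, add the v8 yum repository, and install the new version.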

[root@rsyslog-relay ~]# rpm -qa|grep rsyslog
rsyslog-7.4.7-16.el7.x86_64
[root@rsyslog-relay ~]# yum remove -y rsyslog-7.4.7-16.el7.x86_64
[root@rsyslog-relay ~]# vim /etc/yum.repos.d/rsyslog_v8.repo
[rsyslog_v8]
baseurl = http://rpms.adiscon.com/v8-stable/epel-$releasever/$basearch
enabled = 1
gpgcheck = 1
gpgkey = http://rpms.adiscon.com/RPM-GPG-KEY-Adiscon
name = Rsyslog version 8 repository

[root@rsyslog-relay ~]# yum install rsyslog rsyslog-relp -y
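To confirm the upgrade took effect before touching any configuration, a small check on the installed package name can help. This is a minimal sketch; rsyslog_major is a hypothetical helper, and it only understands package strings shaped like the rpm output shown above.

```shell
# rsyslog_major PKG
# Extracts the major version from an rpm package name such as
# rsyslog-7.4.7-16.el7.x86_64.
rsyslog_major() {
    echo "$1" | sed -n 's/^rsyslog-\([0-9][0-9]*\)\..*/\1/p'
}

pkg="rsyslog-7.4.7-16.el7.x86_64"   # the package string from the rpm output above
if [ "$(rsyslog_major "$pkg")" -ge 8 ]; then
    echo "v8 or newer"
else
    echo "needs upgrade"
fi
```

On a real host the package string would come from rpm, for example rsyslog_major "$(rpm -q rsyslog)".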

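rsyslog client configuration

The nginx log file names are split by business area into admin back-end access logs, front-end access logs, and payment logs, but they are tagged by type with just two tags: nginx-access and nginx-error.

The configuration file is /etc/rsyslog.conf: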

$ModLoad imuxsock
$ModLoad imjournal
$ModLoad imfile
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$IncludeConfig /etc/rsyslog.d/*.conf
*.info;mail.none;authpriv.none;cron.none                /var/log/messages
authpriv.*                                              /var/log/secure
mail.*                                                  -/var/log/maillog
cron.*                                                  /var/log/cron
*.emerg                                                 :omusrmsg:*
uucp,news.crit                                          /var/log/spooler
local7.*                                                /var/log/boot.log

##########Start Nginx Log File#################
$InputFileName /var/log/nginx/access-admin.log
$InputFileTag site-web1-nginx-access:
$InputFileStateFile site-web1-nginx-access
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1

$InputFileName /var/log/nginx/error-admin.log
$InputFileTag site-web1-nginx-error:
$InputFileStateFile site-web1-nginx-error
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1

$InputFileName /var/log/nginx/access-frontend.log
$InputFileTag site-web1-nginx-access:
$InputFileStateFile site-web1-nginx-access
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1

$InputFileName /var/log/nginx/error-frontend.log
$InputFileTag site-web1-nginx-error:
$InputFileStateFile site-web1-nginx-error
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1

$InputFileName /var/log/nginx/access-pay.log
$InputFileTag site-web1-nginx-access:
$InputFileStateFile site-web1-nginx-access
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1

$InputFileName /var/log/nginx/error-pay.log
$InputFileTag site-web1-nginx-error:
$InputFileStateFile site-web1-nginx-error
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1
######################End Of Nginx Log File################

######################Start Of Action Log File#############
$InputFileName /var/log/php-fpm/action_log.log
$InputFileTag site-web1-action:
$InputFileStateFile site-web1-action
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1
######################End Of Action Log File###############

#####################Start PHP Log File###################
$InputFileName /var/log/php-fpm/www-slow.log
$InputFileTag site-web1-php-slow:
$InputFileStateFile site-web1-php-slow
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1
$InputFileReadMode 2

$InputFileName /var/log/php-fpm/error.log
$InputFileTag site-web1-php-error:
$InputFileStateFile site-web1-php-error
$InputFileSeverity debug
$InputRunFileMonitor
$InputFilePollInterval 1

$WorkDirectory /var/lib/rsyslog
$ActionQueueType LinkedList
$ActionQueueFileName srvrfwd
$ActionResumeRetryCount -1
$ActionQueueSaveOnShutdown on
####################End Of PHP log File#####################

###################Start Log Forward##############################################
if $programname == 'site-web1-nginx-access' then @@<Alibaba Cloud rsyslog server internal address>:514
if $programname == 'site-web1-nginx-error' then @@<Alibaba Cloud rsyslog server internal address>:514
if $programname == 'site-web1-php-slow' then @@<Alibaba Cloud rsyslog server internal address>:514
if $programname == 'site-web1-php-error' then @@<Alibaba Cloud rsyslog server internal address>:514
if $programname == 'site-web1-action' then @@<Alibaba Cloud rsyslog server internal address>:514
###################End Of log Forward##############################################
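The per-file input blocks above are highly repetitive, so with 400+ web servers it may be worth generating them. The following is a minimal sketch, not the author's tooling: imfile_stanza is a hypothetical helper, and it derives the state-file name from the log file's basename so that every monitored file gets its own state file (that naming scheme is an assumption of this sketch).

```shell
# imfile_stanza FILE TAG
# Prints one legacy-syntax imfile input block for FILE, tagged TAG.
# The state-file name is derived from the file's basename (assumption).
imfile_stanza() {
    file=$1; tag=$2
    state="state-$(basename "$file" .log)"
    printf '$InputFileName %s\n' "$file"
    printf '$InputFileTag %s:\n' "$tag"
    printf '$InputFileStateFile %s\n' "$state"
    printf '$InputFileSeverity debug\n'
    printf '$InputRunFileMonitor\n'
    printf '$InputFilePollInterval 1\n\n'
}

# The three access logs of site-web1, all carrying the same tag
for name in admin frontend pay; do
    imfile_stanza "/var/log/nginx/access-${name}.log" site-web1-nginx-access
done
```

The output can be pasted into /etc/rsyslog.conf or redirected into a file under /etc/rsyslog.d/.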

rsyslog relay server configuration on Alibaba Cloud (rsyslog-relay)

1. The main configuration file is /etc/rsyslog.conf. The per-server configuration files are placed in /etc/rsyslog.d/ with names ending in .conf, and are pulled in by the $IncludeConfig directive of the main configuration file.

$ModLoad omrelp
$ModLoad imudp
$UDPServerRun 514
$ModLoad imtcp
$InputTCPServerRun 514
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$IncludeConfig /etc/rsyslog.d/*.conf
$umask 0022
*.info;mail.none;authpriv.none;cron.none                /var/log/messages
authpriv.*                                              /var/log/secure
mail.*                                                  -/var/log/maillog
cron.*                                                  /var/log/cron
*.emerg                                                 :omusrmsg:*
uucp,news.crit                                          /var/log/spooler
local7.*                                                /var/log/boot.log
*.* :omrelp:<rsyslog server in the self-hosted data center>:20514

2. Per-server configuration file. The one for site-web1 looks like this:

$template site-web1-nginx-access,"/data/rsyslog/nginx/site/site-web1-nginx-access.log"
$template site-web1-nginx-error,"/data/rsyslog/nginx/site/site-web1-nginx-error.log"
$template site-web1-php-slow,"/data/rsyslog/php/site/site-web1-php-slow.log"
$template site-web1-php-error,"/data/rsyslog/php/site/site-web1-php-error.log"
$template site-web1-action,"/data/rsyslog/php/site/site-web1-action.log"

if $programname == 'site-web1-nginx-access' then ?site-web1-nginx-access
if $programname == 'site-web1-nginx-error' then ?site-web1-nginx-error
if $programname == 'site-web1-php-slow' then ?site-web1-php-slow
if $programname == 'site-web1-php-error' then ?site-web1-php-error
if $programname == 'site-web1-action' then ?site-web1-action
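Since every web server needs one such file on the relay, the file body can be produced from the host name alone. The sketch below assumes all hosts follow exactly the site-web1 layout shown above; relay_conf and the host name site-web2 are hypothetical.

```shell
# relay_conf HOST
# Prints the rsyslog.d file body for HOST: one dynamic-file template and one
# routing rule per log type, following the site-web1 layout.
relay_conf() {
    host=$1
    for entry in nginx:nginx-access nginx:nginx-error php:php-slow php:php-error php:action; do
        dir=${entry%%:*}; kind=${entry#*:}
        printf '$template %s-%s,"/data/rsyslog/%s/site/%s-%s.log"\n' \
            "$host" "$kind" "$dir" "$host" "$kind"
    done
    echo
    for kind in nginx-access nginx-error php-slow php-error action; do
        printf "if \$programname == '%s-%s' then ?%s-%s\n" "$host" "$kind" "$host" "$kind"
    done
}

relay_conf site-web2
```

Redirecting the output to /etc/rsyslog.d/site-web2.conf and restarting rsyslog would register the new host.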

rsyslog configuration in the self-hosted data center (elk01)

The main configuration file /etc/rsyslog.conf is shown below. The site configuration files in /etc/rsyslog.d/ are identical to those on the relay server.

$ModLoad imrelp
$ModLoad omrelp
$InputRELPServerRun 20514
$WorkDirectory /var/lib/rsyslog
$DirCreateMode 0755
$FileCreateMode 0644
$FileOwner logstash
$DirOwner logstash
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$IncludeConfig /etc/rsyslog.d/*.conf
$OmitLocalLogging on
$IMJournalStateFile imjournal.state
*.info;mail.none;authpriv.none;cron.none                /var/log/messages
authpriv.*                                              /var/log/secure
mail.*                                                  -/var/log/maillog
cron.*                                                  /var/log/cron
*.emerg                                                 :omusrmsg:*
uucp,news.crit                                          /var/log/spooler
local7.*                                                /var/log/boot.log
$PrivDropToGroup logstash

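At this point the logs of all servers are collected centrally through rsyslog.


ELK installation

Everything was installed via yum at the latest 6.x version. Per the plan, nginx on elk01 and redis on elk02 were also installed via yum, so the details are not repeated here.

elasticsearch yum repository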

[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

logstash yum repository

[logstash-6.x]
name=Elastic repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

kibana yum repository

[kibana-6.x]
name=Kibana repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

logstash-shipper configuration

After installation, Logstash's configuration lives in /etc/logstash by default. Copy it to serve as the logstash-shipper configuration directory.

cp -r /etc/logstash /etc/logstash-shipper
chown -R logstash.logstash /etc/logstash-shipper

Main configuration file: /etc/logstash-shipper/logstash.yml

path.data: /var/lib/logstash-shipper
path.config: /etc/logstash-shipper/conf.d
path.logs: /var/log/logstash/shipper

Create the directories and give ownership to the logstash user:

mkdir -p /var/lib/logstash-shipper && chown logstash.logstash /var/lib/logstash-shipper
mkdir -p /var/log/logstash/shipper && chown logstash.logstash /var/log/logstash/shipper

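Site configuration file: /etc/logstash-shipper/conf.d/shipper.conf

The site configuration currently runs to more than 3,000 lines, so pasting all of it would be excessive. The configuration for one site is shown here for reference.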

input {
    file {
         path => "/data/rsyslog/php/site/site-*-php-error.log"
         type => "site-php-error"
         sincedb_path => "/data/sincedb/site"
         }

    file {
         path => "/data/rsyslog/php/site/site-*-php-slow.log"
         type => "site-php-slow"
         sincedb_path => "/data/sincedb/site"
         }

    file {
         path => "/data/rsyslog/nginx/site/site-*-nginx-error.log"
         type => "site-nginx-error"
         sincedb_path => "/data/sincedb/site"
         }

    file {
         path => "/data/rsyslog/nginx/site/site-*-nginx-access.log"
         type => "site-nginx-access"
         sincedb_path => "/data/sincedb/site"
         }

    file {
         path => "/data/rsyslog/php/site/site*action.log"
         type => "site-action"
         sincedb_path => "/data/sincedb/site"
         }
}

output {
       redis {
         host => "<elk02 internal address>"
         port => "6379"
         db => "8"
         data_type => "list"
         # must be the same key the indexers read from
         key => "kosun-log"
    }
}
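Because Redis sits between the shipper and the indexers, the length of the list is a direct measure of how far the indexers have fallen behind. The sketch below is one way to watch it: queue_depth wraps the standard redis-cli LLEN command with the port and db number from the output block above, while depth_status and its threshold are hypothetical.

```shell
# queue_depth HOST KEY
# Asks Redis (port 6379, db 8, as configured above) for the length of list KEY.
queue_depth() {
    redis-cli -h "$1" -p 6379 -n 8 LLEN "$2"
}

# depth_status DEPTH LIMIT
# Classifies a queue depth against LIMIT, a hypothetical alert threshold.
depth_status() {
    if [ "$1" -gt "$2" ]; then
        echo "backlog"
    else
        echo "ok"
    fi
}

depth_status 120 100000      # small queue: indexers are keeping up
depth_status 250000 100000   # large queue: indexers are behind or stopped
```

In practice the two would be combined, for example depth_status "$(queue_depth REDIS_HOST REDIS_KEY)" 100000, with the host and key replaced by the values from the Logstash configuration.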

Service unit file: /etc/systemd/system/logstash-shipper.service

[Unit]
Description=logstash-shipper

[Service]
Type=simple
User=logstash
Group=logstash
# Load env vars from /etc/default/ and /etc/sysconfig/ if they exist.
# Prefixing the path with '-' makes it try to load, but if the file doesn't
# exist, it continues onward.
EnvironmentFile=-/etc/default/logstash
EnvironmentFile=-/etc/sysconfig/logstash
ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/etc/logstash-shipper"
Restart=always
WorkingDirectory=/
Nice=19
LimitNOFILE=16384

[Install]
WantedBy=multi-user.target

logstash-indexer configuration

Each server runs two indexers.

logstash-indexer1

Copy the configuration directory:

cp -r /etc/logstash /etc/logstash-indexer
chown -R logstash.logstash /etc/logstash-indexer

Main configuration file: /etc/logstash-indexer/logstash.yml

path.data: /var/lib/logstash
path.config: /etc/logstash-indexer/conf.d
path.logs: /var/log/logstash/indexer

Create the directories and give ownership to the logstash user:

mkdir -p /var/log/logstash/indexer && chown logstash.logstash /var/log/logstash/indexer

Each log type has its own configuration file under /etc/logstash-indexer/conf.d:

/etc/logstash-indexer/conf.d
├── action.conf            application-defined user action log
├── nginx_access.conf      nginx access log
├── nginx_error.conf       nginx error log
├── php_error.conf         PHP error log
└── php_slow.conf          PHP slow log

The configuration files are as follows.

action.conf

input {
    redis {
        host => "<elk02 internal IP>"
        port => "6379"
        db => "8"
        data_type => "list"
        key => "kosun-log"
    }
}

filter {
    if [type] =~ '^.+action' {
      mutate {
      gsub => [ "message", "^.+-action: ", "" ]
             }
     json {
       source => "message"
           }
      date {
          match => ["time", "yyyy-MM-dd HH:mm:ss"]
          target => "@timestamp"
          "locale" => "en"
          timezone => "Asia/Shanghai"
          remove_field => ["time"]
            }
     }
}

output {
    if [type] =~ '^.+action' {
        elasticsearch {
            hosts => ["elasticsearch:9200"]
            index => "logstash-action-%{+YYYY.MM.dd}"
        }
    }
}
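To see what the gsub step in this filter does before the json filter runs, the same regular expression can be tried on a sample line with sed. This is only an illustration: strip_prefix is a hypothetical helper, and the sample line (host name, fields) is made up to resemble what rsyslog writes in the traditional file format.

```shell
# strip_prefix: removes everything up to and including "-action: ",
# the same regular expression the mutate/gsub step applies to "message".
strip_prefix() {
    sed -E 's/^.+-action: //'
}

# Hypothetical line as stored by rsyslog on the ELK host
line='Jan 10 12:00:01 web1 site-web1-action: {"time":"2018-01-10 12:00:01","user":"admin"}'
printf '%s\n' "$line" | strip_prefix
```

What remains is the bare JSON document, which the json filter can parse; its "time" field is then matched by the date filter's yyyy-MM-dd HH:mm:ss pattern.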

 

nginx-access

input {
    redis {
        host => "<elk02 internal address>"
        port => "6379"
        db => "8"
        data_type => "list"
        key => "kosun-log"
    }
}

filter {
    if [type] =~ '^.+nginx-access' {
        fingerprint {
            method => "SHA1"
            key    => "^.+nginx-access"
        }
        grok {
            match => [ "message" , "%{COMBINEDAPACHELOG} %{DATA:msec} %{QUOTEDSTRING:x_forward_ip} %{DATA:server_name} %{DATA:request_time} %{DATA:upstream_response_time} %{DATA:scheme} %{GREEDYDATA:extra_fields}" ]
            overwrite => [ "message" ]
        }

        date {
            match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss.SSS Z"]
            #target => "@timestamp"
            "locale" => "en"
            timezone => "Asia/Shanghai"
        }

        mutate {
            gsub => ["agent", "\"", ""]
            gsub => ["referrer", "\"", ""]
            gsub => ["x_forward_ip", "\"", ""]
            gsub => ["extra_fields", "\"", ""]
        }

        if [extra_fields] =~ /^{.*}$/ {
            mutate {
                gsub => ["extra_fields", "\"","",
                    "extra_fields", "\\x0A","",
                    "extra_fields", "\\x22",'\"',
                    "extra_fields", "(\\)",""
                    ]
            }

            json {
                source => "extra_fields"
                target => "extra_fields_json"
            }
        }

        geoip {
            source => "clientip"
            fields => ["city_name","location"]
        }
    }
}

output {
    if [type] =~ '^.+nginx-access' {
        elasticsearch {
            hosts => ["elasticsearch:9200"]
            index => "logstash-%{type}-%{+YYYY.MM.dd}"
            document_type => "%{type}"
           # flush_size => 50000
           # idle_flush_time => 10
            sniffing => true
            template_overwrite => true
            document_id => "%{fingerprint}"
        }
    }
}
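Using the fingerprint as document_id makes indexing idempotent: a line that is delivered twice, for example after a restart, gets the same ID and overwrites the existing document instead of creating a duplicate. The sketch below shows only that property. It uses plain SHA-1 from sha1sum as a stand-in, so its digests will not equal the ones produced by the keyed fingerprint filter above; doc_id and the sample lines are hypothetical.

```shell
# doc_id LINE
# Derives a stable ID from the content of a log line.
# Plain SHA-1 is a stand-in here; the fingerprint filter above is configured
# with a key, so its actual digest values will differ.
doc_id() {
    printf '%s' "$1" | sha1sum | cut -d ' ' -f 1
}

a=$(doc_id 'GET /index.html 200')
b=$(doc_id 'GET /index.html 200')   # the same line delivered a second time
c=$(doc_id 'GET /login 302')
[ "$a" = "$b" ] && echo "replayed line maps to the same id"
[ "$a" != "$c" ] && echo "different line maps to a different id"
```

The trade-off is that two genuinely distinct requests with byte-identical log lines would also collapse into one document.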

nginx-error

input {
       redis {
         host => "<elk02 internal address>"
         port => "6379"
         db => "8"
         data_type => "list"
         key => "kosun-log"
       }
}

filter {
   if [type] =~ '^.+nginx-error' {
     fingerprint {
       method => "SHA1"
         key    => "^.+nginx-error"
        }
  grok {
    match => {
        "message" => [
            "(?<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[%{DATA:err_severity}\] (%{NUMBER:pid:int}#%{NUMBER}: \*%{NUMBER}|\*%{NUMBER}) %{DATA:err_message}(?:, client: (?<clientip>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:server})(?:, request: %{QS:request})?(?:, host: %{QS:client_ip})?(?:, referrer: \"%{URI:referrer})?",

            "%{DATESTAMP:timestamp} \[%{DATA:err_severity}\] %{GREEDYDATA:err_message}"
        ]
      }
    }
    date {
        match => ["timestamp" , "YYYY/MM/dd HH:mm:ss"]
        "locale" => "en"
        timezone => "Asia/Shanghai"
        remove_field => [ "timestamp" ]
        }
     }
}

output {
      if [type] =~ '^.+nginx-error' {
        elasticsearch {
           hosts => ["elasticsearch:9200"]
           index => "logstash-nginx-error-%{+YYYY.MM.dd}"
           document_id => "%{fingerprint}"
     }
   }
}

php-error

input {
       redis {
         host => "<elk02 internal address>"
         port => "6379"
         db => "8"
         data_type => "list"
         key => "kosun-log"
       }
}

output {
    if [type] =~ '^.+php-error' {
       elasticsearch {
       hosts => ["elasticsearch:9200"]
       index => "logstash-php-error-%{+YYYY.MM.dd}"
     }
  }
}

php-slow

input {
       redis {
         host => "<elk02 internal address>"
         port => "6379"
         db => "8"
         data_type => "list"
         key => "kosun-log"
       }
}

filter {
    if [type] =~ '^.+slow' {
        multiline {
            pattern => "\[\d{2}-"
            negate => true
            what => "previous"
        }
    }
}

output {
     if [type] =~ '^.+slow' {
       elasticsearch {
       hosts => ["elasticsearch:9200"]
       index => "logstash-php-slow-%{+YYYY.MM.dd}"
     }
  }
}
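The multiline settings in this filter (negate => true, what => "previous") mean: any line that does not contain the "[dd-" date marker belongs to the event before it. The awk sketch below imitates that grouping on a made-up slow-log excerpt; merge_multiline is a hypothetical helper, and the joined lines are separated by a literal \n so that each event prints on one row.

```shell
# merge_multiline: joins every line that does NOT contain "[dd-" onto the
# preceding event, mirroring negate => true with what => "previous".
# Joined lines are separated by a literal backslash-n.
merge_multiline() {
    awk '
        /\[[0-9][0-9]-/ { if (buf != "") print buf; buf = $0; next }
        { buf = (buf == "") ? $0 : buf "\\n" $0 }
        END { if (buf != "") print buf }
    '
}

# Hypothetical php-fpm slow-log excerpt containing two events
printf '%s\n' \
    '[10-Jan-2018 12:00:01]  [pool www] pid 1234' \
    'script_filename = /var/www/index.php' \
    '[0x00007f1] sleep() /var/www/index.php:12' \
    '[10-Jan-2018 12:00:09]  [pool www] pid 1240' \
    'script_filename = /var/www/pay.php' | merge_multiline
```

The five input lines come out as two rows, one per slow-log entry, which is what ends up as one document each in Elasticsearch.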

Service unit file: /etc/systemd/system/logstash-indexer.service

[Unit]
Description=logstash-indexer

[Service]
Type=simple
User=logstash
Group=logstash
# Load env vars from /etc/default/ and /etc/sysconfig/ if they exist.
# Prefixing the path with '-' makes it try to load, but if the file doesn't
# exist, it continues onward.
EnvironmentFile=-/etc/default/logstash
EnvironmentFile=-/etc/sysconfig/logstash
ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/etc/logstash-indexer"
Restart=always
WorkingDirectory=/
Nice=19
LimitNOFILE=16384

[Install]
WantedBy=multi-user.target

logstash-indexer2 uses the same configuration files; only the directories and the unit file need to be changed accordingly.
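A minimal sketch of that cloning step follows. It copies the settings directory and rewrites the three path.* entries so the second instance does not share a data directory with the first. clone_indexer and the target paths it builds from the instance name are assumptions of this sketch, not the author's actual layout.

```shell
# clone_indexer SRC DST NAME
# Copies the settings directory SRC to DST and rewrites the three path.*
# settings in logstash.yml. The paths built from NAME are assumptions.
clone_indexer() {
    src=$1; dst=$2; name=$3
    cp -r "$src" "$dst"
    sed -e "s#^path\.data: .*#path.data: /var/lib/logstash-${name}#" \
        -e "s#^path\.config: .*#path.config: ${dst}/conf.d#" \
        -e "s#^path\.logs: .*#path.logs: /var/log/logstash/${name}#" \
        "$src/logstash.yml" > "$dst/logstash.yml"
}
```

A call such as clone_indexer /etc/logstash-indexer /etc/logstash-indexer2 indexer2 would then be followed by creating the new data and log directories, fixing ownership, and copying the unit file with --path.settings pointed at the new directory.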

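Elasticsearch cluster deployment

Each server runs two Elasticsearch instances: one installed via yum, the other unpacked from the downloaded archive.

The yum installation puts its configuration files in /etc/elasticsearch.

Instance 1 on elk01 is used as the example.

Set the heap size in jvm.options to 31g: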

-Xms31g
-Xmx31g

Main configuration file: /etc/elasticsearch/elasticsearch.yml

#============================cluster setting==============================
cluster.name: elk
cluster.routing.allocation.same_shard.host: true
#============================node setting=================================
node.name: elk01
node.master: true
node.data: true
#============================path setting=================================
path.data: /data/es-data
path.logs: /var/log/elasticsearch
#============================memory setting===============================
bootstrap.memory_lock: false
#============================network setting==============================
network.host: <elk01 internal address>
http.port: 9200
transport.tcp.port: 9300
#============================thread_pool setting==========================
thread_pool.search.queue_size: 10000
#============================discovery setting============================
discovery.zen.ping.unicast.hosts: ["elk01:9300", "elk02:9300", "elk03:9300"]
discovery.zen.minimum_master_nodes: 2
#============================gateway setting==============================
gateway.recover_after_nodes: 4
gateway.recover_after_time: 5m
gateway.expected_nodes: 5
indices.recovery.max_bytes_per_sec: 800mb
http.cors.enabled: true
http.cors.allow-origin: "*"
xpack.security.enabled: false
xpack.monitoring.enabled: true
xpack.graph.enabled: false
xpack.watcher.enabled: false
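The value of discovery.zen.minimum_master_nodes follows the usual majority rule, floor(N / 2) + 1, where N is the number of master-eligible nodes. The sketch below assumes that only the first instance on each of the three servers is master-eligible, as on elk01; quorum is a hypothetical helper.

```shell
# quorum N: smallest majority of N master-eligible nodes, floor(N / 2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

quorum 3   # three master-eligible nodes, one per physical server
```

With three master-eligible nodes the result is 2, matching the setting in the configuration above.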

Create the data directory and set ownership:

mkdir -p /data/es-data && chown elasticsearch.elasticsearch /data/es-data

Service unit file: shipped with the yum package at /usr/lib/systemd/system/elasticsearch.service; no changes needed.

Instance 2 on elk01

This instance runs from the downloaded archive.

Download the archive, unpack it, move it into place, and set ownership:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.1.2.zip
unzip elasticsearch-6.1.2.zip
mv elasticsearch-6.1.2 /usr/local/
chown -R elasticsearch.elasticsearch /usr/local/elasticsearch-6.1.2

Set the heap size in /usr/local/elasticsearch-6.1.2/config/jvm.options to 31g.

Configuration file: /usr/local/elasticsearch-6.1.2/config/elasticsearch.yml

#============================cluster setting==============================
cluster.name: elk
cluster.routing.allocation.same_shard.host: true
#node.max_local_storage_nodes: 2
#============================node setting=================================
node.name: elk01-2
node.master: false
node.data: true
#============================path setting=================================
path.data: /data/es-data2
path.logs: /var/log/elasticsearch2
#============================memory setting===============================
bootstrap.memory_lock: false
#============================network setting==============================
network.host: <elk01 internal IP>
http.port: 9201
transport.tcp.port: 9301
#============================thread_pool setting==========================
thread_pool.search.queue_size: 10000
#============================discovery setting============================
discovery.zen.ping.unicast.hosts: ["elk01:9300", "elk02:9300", "elk03:9300"]
discovery.zen.minimum_master_nodes: 2
#============================gateway setting==============================
gateway.recover_after_nodes: 4
gateway.recover_after_time: 5m
gateway.expected_nodes: 5
indices.recovery.max_bytes_per_sec: 800mb
http.cors.enabled: true
http.cors.allow-origin: "*"
xpack.security.enabled: false
xpack.monitoring.enabled: true
xpack.graph.enabled: false
xpack.watcher.enabled: false

Create the directories and set ownership:

mkdir -p /data/es-data2 && chown elasticsearch.elasticsearch /data/es-data2

mkdir -p /var/log/elasticsearch2 && chown elasticsearch.elasticsearch /var/log/elasticsearch2

Service init script: /etc/init.d/elasticsearch2

#!/bin/bash
#
# elasticsearch <summary>
#
# chkconfig:   2345 80 20
# description: Starts and stops a single elasticsearch instance on this system
#

### BEGIN INIT INFO
# Provides: Elasticsearch
# Required-Start: $network $named
# Required-Stop: $network $named
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: This service manages the elasticsearch daemon
# Description: Elasticsearch is a very scalable, schema-free and high-performance search solution supporting multi-tenancy and near realtime search.
### END INIT INFO

#
# init.d / servicectl compatibility (openSUSE)
#
if [ -f /etc/rc.status ]; then
    . /etc/rc.status
    rc_reset
fi

#
# Source function library.
#
if [ -f /etc/rc.d/init.d/functions ]; then
    . /etc/rc.d/init.d/functions
fi

# Sets the default values for elasticsearch variables used in this script
ES_HOME="/usr/share/elasticsearch2"
MAX_OPEN_FILES=65536
MAX_MAP_COUNT=262144
ES_PATH_CONF="/etc/elasticsearch2"

PID_DIR="/var/run/elasticsearch2"

# Source the default env file
ES_ENV_FILE="/etc/sysconfig/elasticsearch"
if [ -f "$ES_ENV_FILE" ]; then
    . "$ES_ENV_FILE"
fi

exec="$ES_HOME/bin/elasticsearch"
prog="elasticsearch"
pidfile="$PID_DIR/${prog}.pid"

export ES_JAVA_OPTS
export JAVA_HOME
export ES_PATH_CONF
export ES_STARTUP_SLEEP_TIME

lockfile=/var/lock/subsys/$prog

start() {

    # Ensure that the PID_DIR exists (it is cleaned at OS startup time)
    if [ -n "$PID_DIR" ] && [ ! -e "$PID_DIR" ]; then
        mkdir -p "$PID_DIR" && chown elasticsearch:elasticsearch "$PID_DIR"
    fi
    if [ -n "$pidfile" ] && [ ! -e "$pidfile" ]; then
        touch "$pidfile" && chown elasticsearch:elasticsearch "$pidfile"
    fi

    cd $ES_HOME
    echo -n $"Starting $prog: "
    # if not running, start it up here, usually something like "daemon $exec"
    #daemon --user elasticsearch --pidfile $pidfile $exec -p $pidfile -d
    su - elasticsearch -c "${exec} -p ${pidfile} -d"

    retval=$?
    echo
    [ $retval -eq 0 ] && touch $lockfile
    return $retval
}

stop() {
    echo -n $"Stopping $prog: "
    # stop it here, often "killproc $prog"
    killproc -p $pidfile -d 86400 $prog
    retval=$?
    echo
    [ $retval -eq 0 ] && rm -f $lockfile
    return $retval
}

restart() {
    stop
    start
}

reload() {
    restart
}

force_reload() {
    restart
}

rh_status() {
    # run checks to determine if the service is running or use generic status
    status -p $pidfile $prog
}

rh_status_q() {
    rh_status >/dev/null 2>&1
}


case "$1" in
    start)
        rh_status_q && exit 0
        $1
        ;;
    stop)
        rh_status_q || exit 0
        $1
        ;;
    restart)
        $1
        ;;
    reload)
        rh_status_q || exit 7
        $1
        ;;
    force-reload)
        force_reload
        ;;
    status)
        rh_status
        ;;
    condrestart|try-restart)
        rh_status_q || exit 0
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload}"
        exit 2
esac
exit $?
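Once both instances are running on all three servers, the cluster health API can confirm that all 6 nodes have joined. The sketch below extracts single fields from a _cluster/health response with sed; it is not a real JSON parser, health_field is a hypothetical helper, and the sample response is abbreviated and made up.

```shell
# health_field NAME: reads a _cluster/health JSON document on stdin and prints
# the value of one top-level field, with quotes stripped.
# A sed sketch for flat responses, not a general JSON parser.
health_field() {
    sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\{0,1\}\([^\",}]*\).*/\1/p"
}

# Hypothetical response body, abbreviated
sample='{"cluster_name":"elk","status":"green","number_of_nodes":6}'
printf '%s\n' "$sample" | health_field status
printf '%s\n' "$sample" | health_field number_of_nodes
```

Against the live cluster the input would come from something like curl -s http://elasticsearch:9200/_cluster/health, and number_of_nodes should read 6.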

Kibana

The hostname elasticsearch is resolved through the hosts file to the machine's own internal IP.

Configuration file: /etc/kibana/kibana.yml

egrep -v '^$|#' /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://elasticsearch:9200"
elasticsearch.requestTimeout: 3000000

Nginx server configuration

server {
    listen 80;
    server_name <domain name>;

    access_log /var/log/nginx/access-elk.log;
    error_log /var/log/nginx/error-elk.log;

    location ~ /.well-known/acme-challenge {
        allow all;

        content_by_lua_block {
            auto_ssl:challenge_server()
        }
    }

    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl http2;
    server_name <domain name>;

    ssl_certificate_by_lua_block {
        auto_ssl:ssl_certificate()
    }

    ssl_certificate /etc/nginx/ssl/elk.crt;
    ssl_certificate_key /etc/nginx/ssl/elk.key;
    ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:ECDHE-RSA-AES128-GCM-SHA256:AES256+EECDH:DHE-RSA-AES128-GCM-SHA256:AES256+EDH:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:HIGH:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";
    ssl_prefer_server_ciphers on;
    ssl_session_cache shared:SSL:16m;
    ssl_dhparam /etc/nginx/ssl/dhparam.pem;

    access_log /var/log/nginx/access-elk.log;
    error_log /var/log/nginx/error-elk.log warn;

    allow <IP range permitted to access>;
    deny all;

    location / {
        proxy_pass http://elasticsearch:5601;
        proxy_set_header Host $host;
    }
}

 

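That basically completes the configuration and deployment part. Later posts will cover common day-to-day operations,

such as one-click deployment scripts, adding and removing configuration, batch deployment, common index and shard operations, performance monitoring, tuning, and scaling out.

To start, here is an x-pack monitoring screenshot of the cluster after about one week of operation: 2,448,834,111 records collected in total, 3 TB of data, about 300 million records per day on average.

Closing remarks:

This post took nearly a week of on-and-off work to put together. I originally wanted to annotate every line of the configuration files. As the saying goes, the first step is always the hardest; at least I have now made a start on cnblogs.

Right now NetEase Cloud Music has shuffled to "Plastic Bag" sung by Qiao Shan, which resonates quite a bit. It suddenly strikes me that he is a chubby singer held back by comedy.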
