Since I am targeting an ELK stack architecture, everything below is written around elasticsearch + redis + logstash + kibana.
These notes come from working through 《logstash 最佳實踐》 (Logstash Best Practice).
The pipeline configuration file is what launches a Logstash task; it is made up of three sections: input, filter, and output.
The input section specifies which files or sources to listen on. In my ELK stack architecture there are only two input types: file and redis.
```
input {
    # pull events from the redis list that serves as the broker
    redis {
        host      => "127.0.0.1"
        port      => 6379
        password  => "123456"
        key       => "logstash-queue"
        data_type => "list"
        db        => 0
    }
    # tail a log file
    file {
        type           => "nginx-access"
        path           => "/usr/local/nginx/logs/access.log"
        # read from the beginning the first time the file is seen; the current
        # offset is persisted in sincedb so restarts resume from where they left off
        start_position => beginning
        sincedb_path   => "/var/log/logstash/sincedb/nginx"
        # merge lines that do not start with a digit into the previous event
        codec          => multiline {
            pattern => "^\d+"
            negate  => true
            what    => "previous"
        }
    }
}
```
Note: with input.file.codec, if a log entry can span multiple lines, the lines can be split on ^\d+ as above; the embedded line breaks end up as \n in the merged message.
The most commonly used matching method in the filter section is grok (regular-expression matching).
The predefined patterns live under logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns; they are referenced like %{IPORHOST:client}.
If they do not cover your case, you can write your own regex. To check that a regex is correct, use Dev Tools > Grok Debugger in Kibana.
You can also verify it at http://grokdebug.herokuapp.com/.
```
filter {
    if [type] == "nginx-access" {
        grok {
            # the standard combined access-log format, plus anything trailing it
            match => { "message" => "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}" }
        }
    } else if [type] == "nginx-error" {
        grok {
            # nginx error logs have no predefined pattern, so the regex is written by hand
            match => ["message" , "(?<timestamp>%{YEAR}[./-]%{MONTHNUM}[./-]%{MONTHDAY}[- ]%{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:errormessage}(?:, client: (?<clientip>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:server}?)(?:, request: %{QS:request})?(?:, upstream: (?<upstream>\"%{URI}\"|%{QS}))?(?:, host: %{QS:request_host})?(?:, referrer: \"%{URI:referrer}\")?"]
        }
    }
}
```
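The nginx-error rule above embeds its custom regex inline with (?<field>...) captures. When a fragment is reused across several rules, grok can also register it as a named pattern via pattern_definitions. A minimal sketch, where the NGINXTIME pattern name is made up for illustration:

```
filter {
    grok {
        # define a reusable pattern local to this grok block
        pattern_definitions => {
            "NGINXTIME" => "%{YEAR}[./-]%{MONTHNUM}[./-]%{MONTHDAY}[- ]%{TIME}"
        }
        match => { "message" => "%{NGINXTIME:timestamp} \[%{LOGLEVEL:severity}\] %{GREEDYDATA:errormessage}" }
    }
}
```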
Optimization
If the application writes its logs in a structured format to begin with, the resources spent on matching log content can be saved. But not every piece of software lets you configure its log format, so this is of limited use.
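As a minimal sketch of the idea (the path and type here are made up, and it assumes nginx's log_format has been set to write one JSON object per line):

```
input {
    file {
        type  => "nginx-access-json"
        path  => "/usr/local/nginx/logs/access_json.log"
        # each line is already a JSON object, so no grok matching is needed downstream
        codec => "json"
    }
}
```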
```
output {
    redis {
        host      => "127.0.0.1"
        port      => 6379
        password  => "123456"
        key       => "logstash-queue"
        data_type => "list"
        db        => 4
    }
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
    }
}
```
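In the redis-brokered setup, a shipper instance would keep only the redis output and the indexer only the elasticsearch one. Outputs can also be routed conditionally, for example one index per log type; a sketch, with illustrative index names:

```
output {
    if [type] == "nginx-access" {
        elasticsearch {
            hosts => ["http://localhost:9200"]
            index => "nginx-access-%{+YYYY.MM.dd}"
        }
    } else if [type] == "nginx-error" {
        elasticsearch {
            hosts => ["http://localhost:9200"]
            index => "nginx-error-%{+YYYY.MM.dd}"
        }
    }
}
```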
Elasticsearch supports full-text indexing, but out of the box its analysis is geared to English, which does not meet our (Chinese) needs; we have to bring in the ik tokenizer plugin to get there.
1. Handling an entry that spans many lines
Merge the lines with input.codec; take an nginx default-format log as an example:
```
2019/09/23 10:39:01 [error] 4130#0: *1 FastCGI sent in stderr: "PHP message: PHP Warning: require(/var/www/study/tp5-study/public/../thinkphp/base.php): failed to open stream: No such file or directory in /var/www/study/tp5-study/public/index.php on line 16
PHP message: PHP Stack trace:
PHP message: PHP 1. {main}() /var/www/study/tp5-study/public/index.php:0
PHP message: PHP Fatal error: require(): Failed opening required '/var/www/study/tp5-study/public/../thinkphp/base.php' (include_path='.:') in /var/www/study/tp5-study/public/index.php on line 16
PHP message: PHP Stack trace:
PHP message: PHP 1. {main}() /var/www/study/tp5-study/public/index.php:0" while reading response header from upstream, client: 192.168.33.1, server: tp5.study.me, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "tp5.study.me", referrer: "http://tp5.study.me/"
2019/09/23 10:40:14 [error] 4130#0: *7 FastCGI sent in stderr: "PHP message: PHP Warning: require(/var/www/study/tp5-study/public/../thinkphp/base.php): failed to open stream: No such file or directory in /var/www/study/tp5-study/public/index.php on line 16
PHP message: PHP Stack trace:
PHP message: PHP 1. {main}() /var/www/study/tp5-study/public/index.php:0
PHP message: PHP Fatal error: require(): Failed opening required '/var/www/study/tp5-study/public/../thinkphp/base.php' (include_path='.:') in /var/www/study/tp5-study/public/index.php on line 16
PHP message: PHP Stack trace:
PHP message: PHP 1. {main}() /var/www/study/tp5-study/public/index.php:0" while reading response header from upstream, client: 192.168.33.1, server: tp5.study.me, request: "GET /favicon.ico HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "tp5.study.me"
```
As the sample shows, every log entry starts with a date, so we can split entries on lines that begin with a digit: with negate => true and what => "previous", any line that does not match ^\d+ is appended to the previous line, so each entry becomes a single event.
```
input {
    stdin {
        codec => multiline {
            pattern => "^\d+"
            negate  => true
            what    => "previous"
        }
    }
}
```
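To try it out, run Logstash with this input plus an output { stdout { codec => rubydebug } } section and paste the sample above into the console; each entry should come out as one event.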
2. By default every event carries a message field holding the raw, unmatched log line. Once the content has been extracted into fields, there is no need to keep the raw data as well.
```
filter {
    grok {
        match => ["message" , "(?<timestamp>%{YEAR}[./-]%{MONTHNUM}[./-]%{MONTHDAY}[- ]%{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:message}(?:, client: (?<clientip>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:server}?)(?:, request: %{QS:request})?(?:, upstream: (?<upstream>\"%{URI}\"|%{QS}))?(?:, host: %{QS:request_host})?(?:, referrer: \"%{URI:referrer}\")?"]
        overwrite => ["message"]
    }
}
```
The rewrite is done with overwrite: the pattern captures the interesting part as %{GREEDYDATA:message}, and overwrite => ["message"] tells grok to replace the original field with that capture. Note that overwrite is an option of the grok filter itself, so it must sit inside filter.grok.
3. Every harvested event has an @timestamp, and I would like the time of old data to be written into it.
Note: @timestamp is a built-in Logstash field (by default the moment the event was read) and modifying it is not recommended, so the timestamp field is used instead; the difference is that timestamp holds the time parsed out of the log line.
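If you do want the parsed time to replace @timestamp, for example when importing old logs, Logstash's stock date filter does exactly that. A minimal sketch, assuming the timestamp field captured by the grok above in nginx's 2019/09/23 10:39:01 format:

```
filter {
    date {
        # parse the grok-captured field and write it into @timestamp (the default target)
        match => ["timestamp", "yyyy/MM/dd HH:mm:ss"]
    }
}
```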