http://www.ttlsa.com/elk/elk-logstash-configuration-syntax/
https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html
References:
https://www.kancloud.cn/hanxt/elk/155901
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
Regular expression reference:
https://github.com/kkos/oniguruma/blob/master/doc/RE
The meaning of "grok": to understand something completely using your feelings rather than considering the facts (to perceive intuitively rather than by reasoning).
The built-in patterns live in /usr/local/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.1.2/patterns, or you can browse them directly at https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns

$ ls /usr/local/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-4.1.2/patterns/
aws     bind  exim       grok-patterns  httpd  junos         maven        mcollective-patterns  nagios      rails  ruby
bacula  bro   firewalls  haproxy        java   linux-syslog  mcollective  mongodb               postgresql  redis  squid
For example, Apache log parsing: a Logstash filter that parses an Apache access log.
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
The definitions of Logstash's built-in patterns (patterns can reference, i.e. nest, other patterns).
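For example, COMBINEDAPACHELOG is itself built by nesting other patterns. An excerpt from the grok-patterns file (older releases; the exact definitions vary slightly between versions):

COMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}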
Another example:
%{IP:client} means: match the log content with the IP pattern and store the matched text under the key client.
input {
  file {
    path => "/var/log/http.log"
  }
}
filter {
  grok {
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
output {
  stdout { codec => rubydebug }
}
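For instance, given the sample line used in the grok documentation:

55.3.244.1 GET /index.html 15824 0.043

this filter adds the fields client => "55.3.244.1", method => "GET", request => "/index.html", bytes => "15824" and duration => "0.043" to the event.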
References:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
https://doc.yonyoucloud.com/doc/logstash-best-practice-cn/filter/grok.html
Suppose we only need the request_time field. By default, match just copies the matched content into the new field while the original message field is kept as well, which duplicates the data. To solve this, drop the message field:
input { stdin {} }
filter {
  grok {
    match => { "message" => "\s+(?<request_time>\d+(?:\.\d+)?)\s+" }
  }
}
output {
  stdout { codec => rubydebug }
}

Input:
begin 123.456 end

Output:
{
        "@version" => "1",
            "host" => "ip-70.32.1.32.hosted.by.gigenet.com",
      "@timestamp" => 2017-11-29T03:47:15.377Z,
    "request_time" => "123.456",
         "message" => "begin 123.456 end"
}
input { stdin {} }
filter {
  grok {
    match => { "message" => "\s+(?<request_time>\d+(?:\.\d+)?)\s+" }
    remove_field => ["message"]
  }
}
output {
  stdout { codec => rubydebug }
}

Input:
begin 123.456 end

Output:
{
        "@version" => "1",
            "host" => "ip-70.32.1.32.hosted.by.gigenet.com",
      "@timestamp" => 2017-11-29T03:51:01.135Z,
    "request_time" => "123.456"
}
Reference: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
Custom patterns can be written to a pattern file, or specified directly inline, as in the previous example.
$ cat /var/sample.log
Jan 1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>

$ cat ./patterns/postfix
POSTFIX_QUEUEID [0-9A-F]{10,11}

input {
  file {
    path => "/var/sample.log"
  }
}
filter {
  grok {
    patterns_dir => ["./patterns"]
    match => { "message" => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:syslog_message}" }
  }
}
output {
  stdout { codec => rubydebug }
}
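Newer versions of the grok filter also accept inline custom patterns through the pattern_definitions option, so the separate pattern file is not required; a sketch equivalent to the example above:

filter {
  grok {
    pattern_definitions => { "POSTFIX_QUEUEID" => "[0-9A-F]{10,11}" }
    match => { "message" => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:syslog_message}" }
  }
}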
Reference: http://blog.51cto.com/irow10/1828077 (the formatting there is broken; it is fixed here).
input { stdin {} }
filter {
  grok {
    match => { "message" => "%{IPORHOST:addre} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"%{WORD:http_method} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:status} (?:%{NUMBER:bytes}|-) \"(?:%{URI:http_referer}|-)\" \"%{GREEDYDATA:User_Agent}\"" }
    remove_field => ["message"]
  }
  date {
    match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
  }
}
output {
  stdout { codec => rubydebug }
}
Input:
192.168.10.97 - - [19/Jul/2016:16:28:52 +0800] "GET / HTTP/1.1" 200 23 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"

Output:
{
        "request" => "/",
           "auth" => "-",
          "ident" => "-",
     "User_Agent" => "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36",
          "addre" => "192.168.10.97",
     "@timestamp" => 2016-07-19T08:28:52.000Z,
    "http_method" => "GET",
          "bytes" => "23",
       "@version" => "1",
           "host" => "no190.pp100.net",
    "httpversion" => "1.1",
      "timestamp" => "19/Jul/2016:16:28:52 +0800",
         "status" => "200"
}
Grok online debugger
Reference: http://grokdebug.herokuapp.com/
Sample log line:
192.168.10.97 - - [19/Jul/2016:16:28:52 +0800] "GET / HTTP/1.1" 200 23 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"

Grok pattern:
%{IPORHOST:addre} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] \"%{WORD:http_method} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:status} (?:%{NUMBER:bytes}|-) \"(?:%{URI:http_referer}|-)\" \"%{GREEDYDATA:User_Agent}\"
Reference: https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html (fields stored under [@metadata] are available while the event is being processed but are not included in the output event).
input { stdin {} }
filter {
  mutate { add_field => { "show" => "This data will be in the output" } }
  mutate { add_field => { "[@metadata][test]" => "Hello" } }
}
output {
  if [@metadata][test] == "Hello" {
    stdout { codec => rubydebug }
  }
}
Input:
sdf

Output:
{
      "@version" => "1",
          "host" => "ip-70.32.1.32.hosted.by.gigenet.com",
          "show" => "This data will be in the output",
    "@timestamp" => 2017-11-29T09:23:44.160Z,
       "message" => "sdf"
}
Reference: http://www.21yunwei.com/archives/5296
input {
  file {
    path => "/logs/nginx/access.log"
    type => "nginx"
    start_position => "beginning"
    add_field => { "key" => "value" }
    codec => "json"
  }
}
output {
  stdout {
    codec => rubydebug {}
  }
}
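A sketch of what the json codec does here, assuming nginx is configured to write one JSON object per line (the log format and field names below are hypothetical): a line such as

{"remote_addr":"192.168.10.97","request":"GET / HTTP/1.1","status":200}

would be decoded straight into event fields, roughly:

{
    "remote_addr" => "192.168.10.97",
        "request" => "GET / HTTP/1.1",
         "status" => 200,
            "key" => "value",
           "type" => "nginx",
           "path" => "/logs/nginx/access.log",
       "@version" => "1",
     "@timestamp" => ...,
           "host" => ...
}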
References:
http://blog.51cto.com/irow10/1828077
https://segmentfault.com/a/1190000011721483
https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html
The date plugin can parse a field with a defined date format (and use it to set @timestamp).
The mutate plugin can add and remove fields and rewrite field values and formats.
The kv plugin can split key=value pairs in a field into separate fields (see below).
For example, with the mutate plugin's lowercase option we can convert a "log-level" field (extracted by grok) to lowercase:
filter {
  grok { ... }
  mutate {
    lowercase => [ "log-level" ]
  }
}
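For instance, if grok had extracted a log-level field with the (hypothetical) value "WARNING", the event would contain "log-level" => "warning" after this mutate.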
The kv filter tells Logstash how to handle key-value formatted data: the kv plugin can split it into individual fields.
filter {
  kv {
    source => "metadata"
    trim => "\""
    include_keys => [ "level", "service", "customerid", "queryid" ]
    target => "kv"
  }
}
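A sketch of the result, assuming the metadata field contains the following (hypothetical) content:

level="INFO" service="payment" customerid="12345" queryid="q-001" foo="bar"

the filter keeps only the included keys and places them under the kv target field, roughly:

"kv" => {
         "level" => "INFO",
       "service" => "payment",
    "customerid" => "12345",
       "queryid" => "q-001"
}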