Elastic Stack (based on the official 7.0.0 documentation)

Lucene

  • A TextField is indexed (and analyzed), but its value can be left unstored.
  • TermQuery does not analyze the query string; it treats the query as a single fixed term.
  • Updating a document deletes the old one and indexes a new one (the internal document ID changes), so updates are expensive.
  • Boolean clauses: BooleanClause

`+` means MUST, `-` means MUST_NOT, and an unmarked clause means SHOULD (see the sketch below).
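A minimal Java sketch of that mapping (field names and terms are made up):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

// Equivalent to the query string: +title:elastic -title:solr body:search
BooleanQuery q = new BooleanQuery.Builder()
    .add(new TermQuery(new Term("title", "elastic")), BooleanClause.Occur.MUST)     // +
    .add(new TermQuery(new Term("title", "solr")), BooleanClause.Occur.MUST_NOT)    // -
    .add(new TermQuery(new Term("body", "search")), BooleanClause.Occur.SHOULD)     // unmarked
    .build();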

Beats

I started with Filebeat, but the official site first sent me to Getting Started with the Elastic Stack.

Getting started with the Elastic Stack

This little tutorial uses Metricbeat to collect server metrics and Kibana to visualize them.

Install Elasticsearch (port 9200)

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.0.1-linux-x86_64.tar.gz
tar -xzvf elasticsearch-7.0.1-linux-x86_64.tar.gz
cd elasticsearch-7.0.1
./bin/elasticsearch

curl http://127.0.0.1:9200

Install Kibana (port 5601)

Kibana is dedicated to Elasticsearch: it searches and visualizes the data in ES. The tutorial suggests installing Kibana on the same machine as Elasticsearch.

In the config file you need to set the address of the ES cluster.
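A minimal sketch of the relevant kibana.yml settings (host values assumed):

# kibana.yml
server.port: 5601
elasticsearch.hosts: ["http://localhost:9200"]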

curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-7.0.1-linux-x86_64.tar.gz
tar xzvf kibana-7.0.1-linux-x86_64.tar.gz
cd kibana-7.0.1-linux-x86_64/
./bin/kibana

Install Metricbeat (one of the various Beats)

A Beat is a collection agent installed on servers. It usually outputs to Elasticsearch or Logstash; the Beat itself cannot do parsing.

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.0.1-linux-x86_64.tar.gz
tar xzvf metricbeat-7.0.1-linux-x86_64.tar.gz
  • Use the system module to collect system metrics such as CPU and memory.
  • Enable it with ./metricbeat modules enable system
  • Loading the sample Kibana dashboards is disabled by default in the Metricbeat config file:
#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
  • As the config comments above say, run the setup command to load the dashboards: ./metricbeat setup -e. The -e flag sends the log output to stderr instead of syslog, i.e. prints it to the console so you can see it.
  • Then start it: ./metricbeat -e (the whole flow is recapped below)
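The whole Metricbeat flow in one place (assuming ES and Kibana run on localhost with default ports):

./metricbeat modules enable system   # turn on the system module
./metricbeat setup -e                # load the index template and sample dashboards
./metricbeat -e                      # start shipping metrics to Elasticsearch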

Logstash

If the data a Beat collects needs extra processing, it has to go through Logstash (which is really just parsing).

curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-7.0.1.tar.gz
tar -xzvf logstash-7.0.1.tar.gz
  • Create a Logstash pipeline: a pipe that takes in Beats data on one end and pushes it into ES on the other.
  • The pipe is a config file, e.g. demo-metrics-pipeline.conf, listening on port 5044:
input {
  beats {
    port => 5044
  }
}

# The filter part of this file is commented out to indicate that it
# is optional.
# filter {
#
# }

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}
  • Start it: ./bin/logstash -f demo-metrics-pipeline.conf

You also have to configure the Beat to send its output to Logstash.

filter

Metricbeat collects the full cmdline arguments of each process, which is too long, so parse it. Use grok:

filter {
  if [system][process] {
    if [system][process][cmdline] {
      grok {
        match => { 
          "[system][process][cmdline]" => "^%{PATH:[system][process][cmdline_path]}"
        }
        remove_field => "[system][process][cmdline]" 
      }
    }
  }
}

More on parsing with grok later.
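Roughly what that filter does to an event (values made up for illustration):

# Before: "[system][process][cmdline]" => "/usr/share/java/bin/java -Xmx1g -jar metrics.jar"
# After:  "[system][process][cmdline_path]" => "/usr/share/java/bin/java"
#         (the original cmdline field is dropped by remove_field)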


The main topic: Filebeat

  • Filebeat is an agent; it outputs to Elasticsearch or Logstash.
  • Configure inputs; there can be several, e.g. /var/log/*.log.
  • How it works:
  • One or more inputs collect log lines and hand everything to the libbeat platform, which does the unified output.

Install Filebeat

  • Just download the tar.gz from the official site and install it (containers come later).

Configure Filebeat

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/*.log
  • /var/log/*/*.log fetches every .log file from the subfolders of /var/log; fetching all files at every nesting level is not supported yet.
  • Usually the output is either Elasticsearch or Logstash for extra processing. It can also go to Kafka (a sketch follows the Elasticsearch example below).
  • You can also use the sample Kibana dashboards that Filebeat provides.
  • Output to Elasticsearch:
output.elasticsearch:
  hosts: ["myEShost:9200"]
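A minimal sketch of the Kafka output mentioned above (broker address and topic name are assumptions):

output.kafka:
  hosts: ["kafka1:9092"]
  topic: "filebeat"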
  • Point Filebeat at Kibana (needed for loading the sample dashboards):
setup.kibana:
  host: "mykibanahost:5601"
  • If ES and Kibana have security enabled, configure credentials (the passwords here should be stored encrypted, not hard-coded):
  • If no credentials are specified for Kibana, the Elasticsearch credentials are used.
  • Beats can now be managed from Kibana via Beats central management; this feature is not GA yet.
output.elasticsearch:
  hosts: ["myEShost:9200"]
  username: "filebeat_internal"
  password: "YOUR_PASSWORD"
setup.kibana:
  host: "mykibanahost:5601"
  username: "my_kibana_user"
  password: "YOUR_PASSWORD"

Detailed configuration information

Filebeat's predefined modules

./filebeat modules list
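Enabling a module works the same way as with Metricbeat, e.g.:

./filebeat modules enable system
./filebeat setup -e   # load the index template and sample dashboards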

File is inactive: /var/log/boot.log. Closing because close_inactive of 5m0s reached. — this message means the file has nothing new.

Configuring Filebeat to output to Logstash

  • Configure Logstash to accept messages from Filebeat and do further processing on the collected data.
  • Filebeat keeps a cursor for each log it reads, in data/registry/filebeat/data.json.
# This is filebeat.yml: send output to port 5044, the port Logstash listens on by default
#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["127.0.0.1:5044"]
  • For this configuration, you must load the index template into Elasticsearch manually because the options for auto loading the template are only available for the Elasticsearch output. ?? What is an index template? (answered in the next section)

Loading the index template

  • In Elasticsearch, an index template defines the settings and mappings that determine how fields are analyzed.
  • The Filebeat package ships with a recommended index template file.
#==================== Elasticsearch template setting ==========================
# One shard by default — this is why Filebeat's indices in ES have a single shard
setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false
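Because our output is Logstash, the template has to be loaded by temporarily pointing setup at Elasticsearch; per the Filebeat docs, roughly:

./filebeat setup --index-management -E output.logstash.enabled=false -E 'output.elasticsearch.hosts=["localhost:9200"]'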

Directory layout

The directories and log locations differ between rpm and tar installs; check the official docs for details.

Some logs require the Beat to run as root in order to collect them.

How Filebeat works

Understanding these concepts helps you make informed choices when configuring.

  • Filebeat has two main components: inputs and harvesters.
  • A harvester reads a single file, line by line, and sends the lines to the output. Each file gets its own harvester; the harvester opens and closes the file, so the file stays open while the harvester is working. When there is nothing left to read, close_inactive eventually triggers.
  • An input manages the harvesters and finds the sources to read. If the input type is log, each matching log file gets a harvester.
  • By default, Filebeat keeps a file open until close_inactive is reached (tunable, as sketched below).
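A sketch of tuning that per input (5m mirrors the default seen in the log message earlier):

filebeat.inputs:
- type: log
  paths:
    - /var/log/*.log
  close_inactive: 5m   # close the harvester after 5 minutes with no new lines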

Logstash

Since someone already collects the logs for us into Kafka, the first step is hooking Logstash up to it.

  • Logstash is very capable: it can collect all kinds of data and hand it to downstream consumers (ES, Kibana).
  • Installing Logstash requires JAVA_HOME to be configured.
  • Input, filter, output:
# Takes data from stdin and pushes it to stdout; -e lets you pass the config inline for a quick test
cd logstash-7.0.1
bin/logstash -e 'input { stdin { } } output { stdout {} }'

# Type something (here: 什麼鬼) and the result looks like this
什麼鬼
{
       "message" => "什麼鬼",
      "@version" => "1",
    "@timestamp" => 2019-05-07T02:00:39.581Z,
          "host" => "node1"
}
  • Use Filebeat to read some sample data, with output set to Logstash.
# Logstash pipeline config: print to stdout first to take a look
input {
  beats {
    port => 5044
  }
}

# rubydebug pretty-prints events with a Ruby printing library to make the output readable
output {
  stdout { codec => rubydebug }
}
  • bin/logstash -f first-pipeline.conf --config.test_and_exit checks whether the config file is valid.
  • Run bin/logstash -f first-pipeline.conf --config.reload.automatic; the config.reload.automatic option reloads changed config files automatically, without restarting Logstash.
  • Resolve a geographic location from the client IP with the geoip plugin.
  • Logstash supports multiple inputs and outputs; it can even connect directly to Twitter (no way to test that here), and it can also output straight to a file.


filter {
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }   # parse the Apache log line first; this yields clientip, verb, etc.
  geoip { source => "clientip" }
}

{
            "ecs" => {
        "version" => "1.0.0"
    },
          "input" => {
        "type" => "log"
    },
          "agent" => {
        "ephemeral_id" => "860d92a1-9fdb-4b41-8898-75021e3edaaf",
             "version" => "7.0.0",
            "hostname" => "node1",
                  "id" => "c389aa98-534d-4f37-ba62-189148baa6a3",
                "type" => "filebeat"
    },
        "request" => "/robots.txt",
           "verb" => "GET",
           "host" => {
             "hostname" => "node1",
        "containerized" => true,
         "architecture" => "x86_64",
                   "os" => {
              "kernel" => "3.10.0-693.el7.x86_64",
            "codename" => "Maipo",
              "family" => "redhat",
            "platform" => "rhel",
             "version" => "7.4 (Maipo)",
                "name" => "Red Hat Enterprise Linux Server"
        },
                   "id" => "b441ff6952f647e7a366c69db8ea6664",
                 "name" => "node1"
    },
          "ident" => "-",
      "timestamp" => "04/Jan/2015:05:27:05 +0000",
           "auth" => "-",
           "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
       "referrer" => "\"-\"",
       "@version" => "1",
       "response" => "200",
    "httpversion" => "1.1",
        "message" => "218.30.103.62 - - [04/Jan/2015:05:27:05 +0000] \"GET /robots.txt HTTP/1.1\" 200 - \"-\" \"Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)\"",
       "clientip" => "218.30.103.62",
          "geoip" => {
           "region_code" => "BJ",
              "latitude" => 39.9288,
                    "ip" => "218.30.103.62",
              "location" => {
            "lat" => 39.9288,
            "lon" => 116.3889
        },
           "region_name" => "Beijing",
             "longitude" => 116.3889,
             "city_name" => "Beijing",
              "timezone" => "Asia/Shanghai",
         "country_code3" => "CN",
         "country_code2" => "CN",
          "country_name" => "China",
        "continent_code" => "AS"
    },
     "@timestamp" => 2019-05-07T03:43:18.368Z,
            "log" => {
          "file" => {
            "path" => "/itoa/elastic-stack/test-cas/logstash-demo/logstash-tutorial.log"
        },
        "offset" => 19301
    }
}

How Logstash works

Input sources

  • file: similar to tail -f
  • syslog: listens on the well-known port 514 for syslog messages and parses according to the RFC3164 format
  • redis: reads from a redis server, using both redis channels and redis lists. Redis is often used as a "broker" in a centralized Logstash installation, which queues Logstash events from remote Logstash "shippers". — similar to a message queue
  • beats: processes events sent by Beats.

Since we use Kafka at work, first some notes on Logstash reading from Kafka.

  • It uses a Kafka client to pull messages from Kafka. You can start multiple Logstash instances consuming from Kafka and group them into one consumer group. Ideally consumer_threads matches the number of Kafka partitions — that's the optimal setting (see the sketch below).
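A minimal sketch of such a Kafka input (broker, topic, and group names are assumptions):

input {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092"
    topics            => ["app-logs"]
    group_id          => "logstash"    # shared by all instances to form one consumer group
    consumer_threads  => 3             # ideally threads × instances == partition count
  }
}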

Notes on Logstash's various config files

Escape sequences are not enabled by default in config files, so `\t` can't be parsed; you have to change this setting in the settings file (see below).
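The setting lives in logstash.yml:

# logstash.yml
config.support_escapes: true   # interpret \n, \t, etc. in quoted pipeline strings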

Official documentation

What happens when Logstash shuts down

It performs a few steps before exiting:

  • Stop all inputs, filters, and outputs
  • Process all in-flight events
  • Terminate its own process

Things that can delay a shutdown:

  • An input that receives data extremely slowly
  • An output that never manages to connect

An unsafe shutdown can lose data.

Elasticsearch

  • The number of shards of an ES index must be decided up front, because which shard a document goes to is computed from the shard count; if the shard count could be changed, reads would no longer find the document's shard. The routing formula below shows why.
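The routing formula from the ES docs (where _routing defaults to the document's _id):

shard_num = hash(_routing) % num_primary_shards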

Kibana
