1. Introduction
What is ELK? ELK is an acronym for three pieces of software: Elasticsearch, Logstash and Kibana. Elasticsearch is the search engine used to store and search the data; Logstash is a data collection and processing pipeline that can analyze, tokenize, collect and filter specific data, and is generally used to process logs; Kibana visualizes the processed data and provides a web interface that makes it easy to retrieve the data we want from Elasticsearch. Elasticsearch itself is a highly scalable open-source full-text search and analytics engine: it supports real-time full-text search, can be deployed in a distributed fashion for high availability, exposes a RESTful API, and can handle large volumes of log data.
Elasticsearch is implemented in Java on top of the Lucene framework. Lucene is a mature, free, open-source search library for Java; essentially it only provides a programming API, so to build a search engine with Lucene, users have to develop their own shell around it and call its API to implement full-text indexing and search. Elasticsearch is exactly that: a search engine that uses Lucene as its information-retrieval library.
Basic components of Elasticsearch
Index: a container for documents, i.e. a collection of documents with similar properties, roughly comparable to a table in a relational database. In Elasticsearch, index names must be lowercase.
Type: a logical partition inside an index whose meaning is entirely up to the user; an index can define one or more types. Generally speaking, a type is a predefinition for documents that share the same fields.
Document: the atomic unit of indexing and searching in Lucene. A document contains one or more fields, i.e. it is a container for fields, and is represented as JSON. A field consists of a name and one or more values; a field with multiple values is usually called a multi-valued field.
Mapping: before raw content is stored as documents it has to be analyzed, for example tokenized and stripped of certain words; the mapping defines how this analysis is carried out. Beyond that, ES (Elasticsearch) also uses the mapping for things such as how the content of a field is sorted. A small sketch follows this list.
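To make these terms concrete, here is a minimal sketch that creates a hypothetical index named blog with one type named article and an explicit mapping, then stores a single JSON document in it. The index, type and field names are made up for illustration, and an ES 6.x node is assumed to be listening on localhost:9200.

# Create an index named "blog" and define a mapping for a type named "article"
curl -XPUT http://localhost:9200/blog -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "article": {
      "properties": {
        "title":  { "type": "text" },
        "tags":   { "type": "keyword" },
        "posted": { "type": "date" }
      }
    }
  }
}'

# Store one document of type "article"; title, tags and posted are its fields
curl -XPUT http://localhost:9200/blog/article/1 -H 'Content-Type: application/json' -d '
{"title": "hello elasticsearch", "tags": ["demo"], "posted": "2020-10-01"}'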
Elasticsearch cluster components
cluster: an ES cluster is identified by its cluster name, which defaults to "elasticsearch". Nodes rely on this name to decide which cluster to join, and a node can belong to only one cluster.
Node: a host running a single ES instance is a node. Nodes store data and take part in the cluster's indexing and search operations. A node is identified by its node name.
Shard: the physical storage units an index is split into; each shard is itself an independent, complete index. When an index is created, ES splits it into 5 shards by default; the number can be customized at creation time but cannot be changed afterwards. There are two kinds of shards: primary shards and replicas. Replicas provide data redundancy and load balancing for queries. The number of replicas per primary shard is configurable and can be changed dynamically, as shown in the sketch below.
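As a hedged example of the shard settings just described (the index name myindex-demo is hypothetical, and a node is assumed on localhost:9200), the shard count is fixed when the index is created, while the replica count can be adjusted later through the _settings endpoint:

# Create an index with 3 primary shards and 2 replicas per primary (the shard count cannot be changed afterwards)
curl -XPUT http://localhost:9200/myindex-demo -H 'Content-Type: application/json' -d '
{"settings": {"number_of_shards": 3, "number_of_replicas": 2}}'

# The replica count, however, can be changed dynamically at any time
curl -XPUT http://localhost:9200/myindex-demo/_settings -H 'Content-Type: application/json' -d '
{"number_of_replicas": 1}'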
How an ES cluster works
At start-up, a node looks for the other nodes of the same cluster on 9300/tcp, by unicast (or, in older versions, multicast), and establishes communication with them. All nodes in the cluster elect one master node, which is responsible for managing the cluster state and for deciding, cluster-wide, how the shards are distributed. From the user's point of view, every node can receive and respond to any kind of user request.
A cluster has a status: green, yellow or red. Green means the cluster is healthy: the shards on each node are exactly as we defined them. Yellow means the cluster is sub-healthy: some shards may not match what we defined, for example because a node went down and the shards on it disappeared with it; a yellow cluster usually recovers to green easily. Red means the cluster is unhealthy, for example 2 out of 3 nodes are down, which means the shards on those two nodes are lost, and lost shards mean lost data; so red indicates the cluster is at risk of data loss.
2. Elasticsearch cluster deployment
Environment
When a service runs in distributed or cluster mode, the first thing to do is to synchronize the time on all nodes; this is a basic rule for any cluster. Second, name resolution inside a cluster cannot and should not depend on an external DNS service, because if that DNS service goes down it disrupts communication across the whole cluster; so if name resolution is needed, the hosts file should be the first choice for resolving node names. Finally, if the nodes need to copy data to each other, passwordless SSH trust should also be set up. These three points are the minimum prerequisites for most clusters; a minimal sketch follows the table below.
Name | IP address | Ports |
es1 | 192.168.0.41 | 9200/9300 |
es2 | 192.168.0.42 | 9200/9300 |
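A minimal sketch of these prerequisites for the two hosts in the table, assuming chrony for time synchronization and that es1/es2 correspond to the node01/node02 host names used later in this article (run on every node):

# Resolve the node names via /etc/hosts instead of an external DNS service
cat >> /etc/hosts <<'EOF'
192.168.0.41 node01 es1
192.168.0.42 node02 es2
EOF

# Synchronize time on all nodes (assuming chrony is used)
yum install -y chrony
systemctl enable --now chronyd

# Set up ssh mutual trust between the nodes
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
ssh-copy-id root@node01
ssh-copy-id root@node02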
Install the JDK on every node
yum install -y java-1.8.0-openjdk-devel
Note: different es versions require different JDK versions; check the official documentation for the JDK version required by your es version.
Export JAVA_HOME
Verify the java version and the JAVA_HOME environment variable
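A minimal sketch of these two steps, assuming the OpenJDK 1.8 package installed above; the exact JAVA_HOME path may differ, so adjust it to the directory actually present under /usr/lib/jvm:

# Export JAVA_HOME system-wide via a profile.d snippet
cat > /etc/profile.d/java.sh <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk    # adjust to the installed JDK directory
export PATH=$JAVA_HOME/bin:$PATH
EOF
source /etc/profile.d/java.sh

# Verify the java version and the JAVA_HOME environment variable
java -version
echo $JAVA_HOME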
Download the elasticsearch rpm package
[root@node01 ~]# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.12.rpm
--2020-10-01 20:44:29--  https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.12.rpm
Resolving artifacts.elastic.co (artifacts.elastic.co)... 151.101.110.222, 2a04:4e42:36::734
Connecting to artifacts.elastic.co (artifacts.elastic.co)|151.101.110.222|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 148681336 (142M) [application/octet-stream]
Saving to: ‘elasticsearch-6.8.12.rpm’

100%[==========================================================================>] 148,681,336 133MB/s   in 1.1s

2020-10-01 20:45:07 (133 MB/s) - ‘elasticsearch-6.8.12.rpm’ saved [148681336/148681336]
Install the elasticsearch rpm package
[root@node01 ~]# ll
total 145200
-rw-r--r-- 1 root root 148681336 Aug 18 19:38 elasticsearch-6.8.12.rpm
[root@node01 ~]# yum install ./elasticsearch-6.8.12.rpm
Loaded plugins: fastestmirror
Examining ./elasticsearch-6.8.12.rpm: elasticsearch-6.8.12-1.noarch
Marking ./elasticsearch-6.8.12.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package elasticsearch.noarch 0:6.8.12-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===================================================================================================================================
 Package                    Arch                  Version                    Repository                                      Size
===================================================================================================================================
Installing:
 elasticsearch              noarch                6.8.12-1                   /elasticsearch-6.8.12                          229 M

Transaction Summary
===================================================================================================================================
Install  1 Package

Total size: 229 M
Installed size: 229 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Creating elasticsearch group... OK
Creating elasticsearch user... OK
  Installing : elasticsearch-6.8.12-1.noarch                                                                                   1/1
### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
 sudo systemctl daemon-reload
 sudo systemctl enable elasticsearch.service
### You can start elasticsearch service by executing
 sudo systemctl start elasticsearch.service
Created elasticsearch keystore in /etc/elasticsearch
  Verifying  : elasticsearch-6.8.12-1.noarch                                                                                   1/1

Installed:
  elasticsearch.noarch 0:6.8.12-1

Complete!
[root@node01 ~]#
Edit the configuration file
Note: the main es configuration file is /etc/elasticsearch/elasticsearch.yml. The four settings that really matter are cluster.name, node.name, path.data and path.logs. cluster.name is the cluster name; the hosts of a cluster rely on this setting to decide whether they belong to the same cluster, so it must be identical on every node of the cluster. node.name identifies the node; this name must be unique within the cluster, i.e. no two nodes of the same cluster may share it. path.data specifies the directory where es stores its data; it is recommended to use the same directory on every node to simplify management, and ideally to mount dedicated storage on it. path.logs specifies the directory for the es logs.
Note: bootstrap.memory_lock: true makes es allocate, at start-up, all of the memory defined in jvm.options immediately; it is disabled by default. To enable it, we need to check that the node has enough memory, configure the start-up script so es is not restricted in memory, and also raise the resource limits for the elasticsearch user. network.host specifies the IP addresses es listens on; 0.0.0.0 means all available addresses on the host. http.port specifies the port that serves user requests. discovery.zen.ping.unicast.hosts lists the hosts to contact by unicast for node discovery. discovery.zen.minimum_master_nodes sets the minimum number of master-eligible nodes; it defaults to 1 if not specified.
The complete configuration
[root@node01 ~]# cat /etc/elasticsearch/elasticsearch.yml
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: test-els-cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node01
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /els/data
#
# Path to log files:
#
path.logs: /els/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["node01", "node02"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 1
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
[root@node01 ~]#
Create the data and log directories and change their owner and group to elasticsearch
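A sketch of this step, using the path.data and path.logs values from the configuration above:

# Create the data and log directories defined in elasticsearch.yml
mkdir -p /els/data /els/logs

# Hand them over to the elasticsearch user and group created by the rpm
chown -R elasticsearch:elasticsearch /els/data /els/logs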
Copy the configuration file to the same location on the other node, change node.name to that node's name, and on that node also create the data and log directories and change their owner and group to elasticsearch
Note: the only difference between the es configuration on node02 and the one on node01 is the node name; everything else is identical.
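A sketch of this step, assuming root ssh access from node01 to node02:

# Copy the configuration to node02 and change only the node name
scp /etc/elasticsearch/elasticsearch.yml node02:/etc/elasticsearch/elasticsearch.yml
ssh node02 "sed -i 's/^node.name: node01/node.name: node02/' /etc/elasticsearch/elasticsearch.yml"

# Create the data and log directories on node02 and fix their ownership as well
ssh node02 "mkdir -p /els/data /els/logs && chown -R elasticsearch:elasticsearch /els/data /els/logs"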
Start es on node01 and node02 and enable it to start at boot
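A sketch of the start-up commands (run on both nodes), followed by a quick check that the two ports are listening:

# Reload systemd unit files, start elasticsearch and enable it at boot
systemctl daemon-reload
systemctl start elasticsearch.service
systemctl enable elasticsearch.service

# Confirm that 9200 (HTTP) and 9300 (transport) are in the LISTEN state
ss -tnl | grep -E '9200|9300'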
Note: 9200 and 9300 on both node01 and node02 should now be in the LISTEN state; 9200 is the port that serves user requests, and 9300 is the port the cluster nodes use to communicate with each other. At this point the 2-node es cluster is up.
Verification: query port 9200 on node01 and on node02 and check whether the responses contain the same cluster_name and cluster_uuid
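A sketch of the check; the root endpoint of each node reports cluster_name and cluster_uuid:

curl http://node01:9200/?pretty
curl http://node02:9200/?pretty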
Note: querying port 9200 on node01 and node02 returns the same cluster_name and cluster_uuid, which confirms that node01 and node02 belong to the same cluster.
View the _cat endpoints provided by the es API
[root@node01 ~]# curl http://node02:9200/_cat
=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates
[root@node01 ~]#
View the cluster's node information
[root@node01 ~]# curl http://node02:9200/_cat/nodes
192.168.0.42 19 96 1 0.00 0.05 0.05 mdi - node02
192.168.0.41 15 96 1 0.03 0.04 0.05 mdi * node01
[root@node01 ~]#
Note: the node marked with * is the master node.
View the cluster health status
[root@node01 ~]# curl http://node02:9200/_cat/health
1601559464 13:37:44 test-els-cluster green 2 2 0 0 0 0 0 0 - 100.0%
[root@node01 ~]#
View the cluster's index information
[root@node01 ~]# curl http://node02:9200/_cat/indices
[root@node01 ~]#
Note: the output is empty because the cluster does not hold any data yet.
View the cluster's shard information
[root@node01 ~]# curl http://node02:9200/_cat/shards
[root@node01 ~]#
Get document 1 of type test in the index myindex
[root@node01 ~]# curl http://node02:9200/myindex/test/1
{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_expression","resource.id":"myindex","index_uuid":"_na_","index":"myindex"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_expression","resource.id":"myindex","index_uuid":"_na_","index":"myindex"},"status":404}[root@node01 ~]#
[root@node01 ~]# curl http://node02:9200/myindex/test/1?pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "index_not_found_exception",
        "reason" : "no such index",
        "resource.type" : "index_expression",
        "resource.id" : "myindex",
        "index_uuid" : "_na_",
        "index" : "myindex"
      }
    ],
    "type" : "index_not_found_exception",
    "reason" : "no such index",
    "resource.type" : "index_expression",
    "resource.id" : "myindex",
    "index_uuid" : "_na_",
    "index" : "myindex"
  },
  "status" : 404
}
[root@node01 ~]#
Note: ?pretty asks for human-readable JSON output. The response above tells us that the requested index does not exist.
Add a document to a specific index in the es cluster
[root@node01 ~]# curl -XPUT http://node01:9200/myindex/test/1 -d '
{"name":"zhangsan","age":18,"gender":"nan"}'
{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}[root@node01 ~]#
Note: writing the document to the index fails because the Content-Type header is not supported; the fix is to set the header explicitly.
[root@node01 ~]# curl -XPUT http://node01:9200/myindex/test/1 -H 'content-Type:application/json' -d '
{"name":"zhangsan","age":18,"gender":"nan"}'
{"_index":"myindex","_type":"test","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}[root@node01 ~]#
Verification: get document 1 of type test in the index myindex and check whether it returns the data we just wrote
[root@node01 ~]# curl http://node01:9200/myindex/test/1?pretty
{
  "_index" : "myindex",
  "_type" : "test",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "zhangsan",
    "age" : 18,
    "gender" : "nan"
  }
}
[root@node01 ~]#
Note: the response returns the document we just wrote.
Now check the cluster's index and shard information again
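The same _cat endpoints as before can be used; adding ?v prints a header line in the output:

# Index overview: health, status, number of primary shards and replicas, document count
curl http://node01:9200/_cat/indices?v

# Per-shard view: shows on which node each primary (p) and replica (r) shard is placed
curl http://node01:9200/_cat/shards?v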
Note: the es cluster now contains an index named myindex with status green; the shard listing shows 5 primary shards and 5 replica shards, and for every shard the primary and its replica are placed on different nodes.
Search across all indices and types
Note: jq pretty-prints JSON output, with the same effect as ?pretty. A query-string search (?q=) run against all indices and types returns the matching documents in the hits section when there is a match, and an empty hit list otherwise, as shown below.
[root@node01 ~]# curl http://node01:9200/_search?q=age:19|jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   135  100   135    0     0   2906      0 --:--:-- --:--:-- --:--:--  2934
{
  "took": 37,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
[root@node01 ~]# curl http://node01:9200/_search?q=age:18|jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   247  100   247    0     0  10795      0 --:--:-- --:--:-- --:--:-- 11227
{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "myindex",
        "_type": "test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan",
          "age": 18,
          "gender": "nan"
        }
      }
    ]
  }
}
[root@node01 ~]#
Note: to search within a specific index, just add the index name to the URL in front of _search.
Note: if there are several indices, a * wildcard can also be used to match them by their common name pattern, as shown below.
[root@node01 ~]# curl http://node01:9200/*/_search?q=age:18|jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   247  100   247    0     0   8253      0 --:--:-- --:--:-- --:--:--  8517
{
  "took": 20,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "myindex",
        "_type": "test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan",
          "age": 18,
          "gender": "nan"
        }
      }
    ]
  }
}
[root@node01 ~]# curl http://node01:9200/my*/_search?q=age:18|jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   247  100   247    0     0   7843      0 --:--:-- --:--:-- --:--:--  7967
{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "myindex",
        "_type": "test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan",
          "age": 18,
          "gender": "nan"
        }
      }
    ]
  }
}
[root@node01 ~]#
Search a specific type within a single specified index
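A minimal sketch, restricting the same query-string search to the test type of the myindex index:

# Search only documents of type "test" in the index "myindex"
curl http://node01:9200/myindex/test/_search?q=age:18|jq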
Note: these are the common command-line operations against an es cluster. In practice we normally do not search from the command line; we use a web interface, and the command line is only for testing. The es cluster is now up; next we can use logstash to collect data from specified sources, send it to es, and then use the kibana web interface to display the data stored in es.