Setting up an elasticsearch cluster and failover (3)

Date: 2020-02-02


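1. Picking up from the previous post

The previous post showed that a single ES node reports a yellow status and that building a cluster resolves it. If resources are limited and you have only one machine, cp the elasticsearch directory and start a second ES instance on the same machine with the same command. Checking cluster health again then shows unassigned_shards going down and active_shards going up. What follows covers an introduction to ES, the ES configuration file, and building the ES cluster.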

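2. Introduction to ES

1) elasticsearch (ES below) is a distributed, highly scalable full-text search and analytics engine. It is built on Lucene, yet it is simple to use, offers a rich API, and is easy to deploy. elasticsearch ships with sensible defaults for most settings, so you can try it out without tedious configuration. Most of the time these defaults are enough to run a production cluster.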

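2) Shards and horizontal scaling

elasticsearch is designed for building highly available and scalable systems. You can scale by buying better servers (vertical scaling) or by buying more servers (horizontal scaling). elasticsearch does get better performance from more powerful hardware, but vertical scaling has its limits. Real scaling is horizontal: adding nodes spreads the load and increases reliability. With most databases, scaling horizontally means making major changes to your application to take advantage of the new machines. elasticsearch, by contrast, is distributed by nature: it knows how to manage nodes to provide scalability and high availability.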

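3) Clusters and nodes

A node is a running elasticsearch instance. A cluster is a group of nodes with the same cluster.name that work together, share data, and provide failover and scaling. When a node joins or is removed, the cluster notices and rebalances the data. One node in the cluster is elected master; it manages cluster-level changes such as creating or deleting indices and adding or removing nodes. A single node can also form a cluster on its own.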

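4) Node communication

We can talk to any node in the cluster, including the master. Every node knows which node each document lives on and can forward a request to the node that holds the data we need. The node we talk to collects the data returned by the other nodes and sends the combined result back to the client. elasticsearch manages all of this transparently.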

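5) Shards and replica shards

Shards are how elasticsearch distributes data across your cluster. Think of a shard as a container for data. Documents are stored in shards, and shards are allocated to the nodes in your cluster. When the cluster grows or shrinks, elasticsearch automatically migrates shards between nodes to keep the cluster balanced.

A shard is a low-level worker unit that holds just a slice of all the data in an index. Our documents are stored and indexed in shards, but our applications do not talk to shards directly; they talk to the index instead. Shards in elasticsearch are either primary shards or replica shards. A replica shard is simply a copy of a primary shard: it provides a redundant copy of the data for protection after a hardware failure, and it also serves read-only requests such as search and retrieval. Both the number of primary shards and the number of replica shards can be set in the configuration file, but the number of primary shards can only be defined when the index is created and cannot be changed afterwards. A primary shard and its replica are never placed on the same node.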
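One way to see why the primary shard count cannot change after index creation is to look at how a document is mapped to a shard: the shard is chosen from a hash of the routing value (the document ID by default) modulo the number of primary shards. The sketch below only illustrates that idea. `pick_shard` is a hypothetical name, and `zlib.crc32` is a stand-in hash chosen for this example, not the hash elasticsearch itself uses.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """Map a routing value (by default the document ID) to a shard number."""
    # crc32 is an assumption for illustration; elasticsearch uses its own hash.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


if __name__ == "__main__":
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        # The same ID may map to a different shard once the count changes,
        # which is why existing documents could no longer be found.
        print(doc_id, pick_shard(doc_id, 5), pick_shard(doc_id, 6))
```

Because the modulus is part of the mapping, changing the number of primary shards would send lookups for existing documents to the wrong shard.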

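6) Cluster ecology:

(1) Nodes can be added to or removed from a cluster to scale it out or in

(2) The number of primary shards is fixed once the index has been created, but the number of replica shards can be changed at any time

(3) A primary shard and its replica are never placed on the same node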
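The third rule is what leaves replicas unassigned on a single machine. As a rough sketch of the arithmetic, assuming every data node has room and at least one data node is up (the function name `shard_counts` is hypothetical):

```python
def shard_counts(primaries: int, replicas: int, data_nodes: int) -> dict:
    """Estimate active and unassigned shard copies for one cluster."""
    copies_per_shard = 1 + replicas  # the primary plus its replicas
    # Copies of the same shard need distinct nodes, so at most one per node.
    placed_per_shard = min(copies_per_shard, data_nodes)
    total = primaries * copies_per_shard
    active = primaries * placed_per_shard
    return {"total": total, "active": active, "unassigned": total - active}


if __name__ == "__main__":
    print(shard_counts(primaries=5, replicas=1, data_nodes=1))
    print(shard_counts(primaries=5, replicas=1, data_nodes=2))
```

With one data node the replicas have nowhere to go; adding a second data node lets every copy be placed.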

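7) Cluster health

ES reports one of three color states: green, yellow, red.

green: all primary shards and all replica shards are available

yellow: all primary shards are available, but not all replica shards are

red: not all primary shards are available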
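The three rules can be written as a small decision function. This is only a sketch of the definitions above, not how elasticsearch computes the status internally; `cluster_status` and its input shape (one pair of booleans per shard) are assumptions made for the example.

```python
def cluster_status(shards: list) -> str:
    """Return green, yellow, or red for a list of (primary_ok, replicas_ok) pairs."""
    if not all(primary_ok for primary_ok, _ in shards):
        return "red"      # at least one primary shard is unavailable
    if not all(replicas_ok for _, replicas_ok in shards):
        return "yellow"   # all primaries are up, some replica is not
    return "green"        # every primary and every replica is available


if __name__ == "__main__":
    print(cluster_status([(True, True), (True, False)]))
```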

[root@ossec-server plugins]# curl -XGET http://localhost:9200/_cluster/health\?pretty

{

"cluster_name" : "elasticsearch", # name of the cluster

"status" : "yellow", # status of the cluster

"timed_out" : false,

"number_of_nodes" : 2, # number of nodes

"number_of_data_nodes" : 1, # number of data nodes

"active_primary_shards" : 216, # number of primary shards; 216 indices

"active_shards" : 216, # 216 active shards in total

"relocating_shards" : 0,

"initializing_shards" : 0,

"unassigned_shards" : 216, # shards not assigned to any node; this happens when replicas are configured but only one machine is deployed

"number_of_pending_tasks" : 0,

"number_of_in_flight_fetch" : 0

}
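When the health check is scripted rather than read by eye, the response body can be parsed as JSON. The sketch below assumes the body has already been fetched (for example with curl as above); `summarize_health` is a hypothetical helper, and the `#` notes in the listing above are the author's annotations, not part of the real response.

```python
import json


def summarize_health(body: str) -> str:
    """Turn a _cluster/health JSON body into a one-line summary."""
    health = json.loads(body)
    return "{status}: {nodes} nodes, {active} active shards, {unassigned} unassigned".format(
        status=health["status"],
        nodes=health["number_of_nodes"],
        active=health["active_shards"],
        unassigned=health["unassigned_shards"],
    )


if __name__ == "__main__":
    sample = (
        '{"status": "yellow", "number_of_nodes": 2, '
        '"active_shards": 216, "unassigned_shards": 216}'
    )
    print(summarize_health(sample))
```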

3. The elasticsearch configuration file in detail

[root@elasticsearch-node02 config]# tree

├── elasticsearch.yml

└── logging.yml

The first is the main ES configuration file and the second is the logging configuration file. The rest of this section covers elasticsearch.yml.

cluster.name: elasticsearch

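Sets the ES cluster name; the default is elasticsearch. ES automatically discovers other ES instances on the same network segment, so if several clusters share a segment, this property is what tells them apart.

node.name: "Franz Kafka"

The node name. By default a name is picked at random from a list in the name.txt file, found in the config folder inside the ES jar; the authors put plenty of amusing names in it.

node.master: true

Whether this node is eligible to be elected master; the default is true. By default the first machine in the cluster becomes master, and if that machine goes down a new master is elected.

node.data: true

Whether this node stores index data; the default is true.

index.number_of_shards: 5

The default number of shards per index; the default is 5.

index.number_of_replicas: 1

The default number of replicas per index; the default is 1.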

path.conf: /path/to/conf

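Path where configuration files are stored; the default is the config folder under the ES root directory.

path.data: /path/to/data

Path where index data is stored; the default is the data folder under the ES root directory. Several paths can be given, separated by commas, for example:

path.data: /path/to/data1,/path/to/data2

path.work: /path/to/work

Path where temporary files are stored; the default is the work folder under the ES root directory.

path.logs: /path/to/logs

Path where log files are stored; the default is the logs folder under the ES root directory

path.plugins: /path/to/plugins

Path where plugins are stored; the default is the plugins folder under the ES root directory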

bootstrap.mlockall: true

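Set to true to lock memory. ES slows down once the JVM starts swapping, so make sure it never swaps: set the ES_MIN_MEM and ES_MAX_MEM environment variables to the same value and make sure the machine has enough memory for ES. The elasticsearch process must also be allowed to lock memory, which on linux can be done with the `ulimit -l unlimited` command.

network.bind_host: 192.168.0.1

The IP address to bind to, either IPv4 or IPv6; the default is 0.0.0.0.

network.publish_host: 192.168.0.1

The IP address other nodes use to reach this node. If it is not set it is determined automatically; the value must be a real IP address.

network.host: 192.168.0.1

Sets both of the parameters above, bind_host and publish_host, at once.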

transport.tcp.port: 9300

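TCP port for communication between nodes; the default is 9300.

transport.tcp.compress: true

Whether to compress data sent over TCP; the default is false (no compression).

http.port: 9200

HTTP port for serving clients; the default is 9200.

http.max_content_length: 100mb

Maximum size of the request content; the default is 100mb

http.enabled: false

Whether to serve clients over HTTP; the default is true (enabled).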

gateway.type: local

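The gateway type. The default, local, is the local file system. It can be set to the local file system, a distributed file system, hadoop HDFS, or amazon s3; how to set up the other file systems will be covered in detail another time.

gateway.recover_after_nodes: 1

Start data recovery once N nodes in the cluster are up; the default is 1.

gateway.recover_after_time: 5m

How long to wait before the initial data recovery process starts; the default is 5 minutes.

gateway.expected_nodes: 2

The number of nodes expected in the cluster; the default is 2. As soon as these N nodes are up, data recovery starts immediately.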
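The way the three gateway settings interact can be sketched as a single decision. This is a reading of the descriptions above, not elasticsearch code: `should_start_recovery` is a hypothetical name, `waited_seconds` is assumed to count from the moment recover_after_nodes was reached, and the defaults are the ones listed above (5m written as 300 seconds).

```python
def should_start_recovery(nodes_up: int, waited_seconds: float,
                          recover_after_nodes: int = 1,
                          recover_after_seconds: float = 300.0,
                          expected_nodes: int = 2) -> bool:
    """Decide whether data recovery may start."""
    if nodes_up >= expected_nodes:
        return True   # everyone expected is here, no need to wait
    if nodes_up < recover_after_nodes:
        return False  # too few nodes to recover at all
    # Enough nodes to recover, but give the missing ones time to show up.
    return waited_seconds >= recover_after_seconds


if __name__ == "__main__":
    print(should_start_recovery(nodes_up=1, waited_seconds=10))
    print(should_start_recovery(nodes_up=2, waited_seconds=10))
```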

cluster.routing.allocation.node_initial_primaries_recoveries: 4

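Number of concurrent recovery threads during initial data recovery; the default is 4.

cluster.routing.allocation.node_concurrent_recoveries: 2

Number of concurrent recovery threads when nodes are added or removed or when shards are rebalanced; the default is 2.

indices.recovery.max_size_per_sec: 0

Bandwidth limit during data recovery, for example 100mb; the default is 0, meaning unlimited.

indices.recovery.concurrent_streams: 5

Limits the maximum number of concurrent streams opened when recovering data from other shards; the default is 5.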

discovery.zen.minimum_master_nodes: 1

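Ensures that a node can see N other master-eligible nodes in the cluster. The default is 1; for larger clusters a higher value (2-4) is appropriate

discovery.zen.ping.timeout: 3s

Ping timeout when automatically discovering other nodes in the cluster; the default is 3 seconds. On a poor network a higher value helps avoid discovery errors.

discovery.zen.ping.multicast.enabled: false

Whether to enable multicast node discovery; the default is true.

discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]

The initial list of master nodes in the cluster; nodes newly joining the cluster are discovered through these nodes.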
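The three entry forms in that list (bare host, host:port, host[portX-portY]) can be told apart with a small parser. This is a sketch for illustration only: `parse_unicast_host` is a hypothetical helper, falling back to 9300 for a bare host is an assumption based on the transport.tcp.port default above, and IPv6 addresses are not handled.

```python
import re

DEFAULT_TRANSPORT_PORT = 9300  # assumption: the transport.tcp.port default

_HOST_SPEC = re.compile(
    r"^(?P<host>[^:\[\]]+)"
    r"(?::(?P<port>\d+)|\[(?P<lo>\d+)-(?P<hi>\d+)\])?$"
)


def parse_unicast_host(spec: str):
    """Split one unicast host entry into (host, list of candidate ports)."""
    match = _HOST_SPEC.match(spec)
    if match is None:
        raise ValueError("unrecognized host entry: %r" % spec)
    if match.group("port"):
        ports = [int(match.group("port"))]
    elif match.group("lo"):
        # A port range is inclusive on both ends.
        ports = list(range(int(match.group("lo")), int(match.group("hi")) + 1))
    else:
        ports = [DEFAULT_TRANSPORT_PORT]
    return match.group("host"), ports


if __name__ == "__main__":
    for entry in ("host1", "host2:9301", "host3[9300-9302]"):
        print(parse_unicast_host(entry))
```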

Below are some slow-log settings for queries

index.search.slowlog.level: TRACE

index.search.slowlog.threshold.query.warn: 10s

index.search.slowlog.threshold.query.info: 5s

index.search.slowlog.threshold.query.debug: 2s

index.search.slowlog.threshold.query.trace: 500ms

index.search.slowlog.threshold.fetch.warn: 1s

index.search.slowlog.threshold.fetch.info: 800ms

index.search.slowlog.threshold.fetch.debug: 500ms

index.search.slowlog.threshold.fetch.trace: 200ms
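The thresholds form a ladder: a slow query is logged at the most severe level whose threshold it reaches. A minimal sketch using the query thresholds listed above (`slowlog_level` is a hypothetical name, and whether a query that takes exactly the threshold time counts is an assumption here):

```python
# Query-phase thresholds from the settings above, most severe first.
QUERY_THRESHOLDS_MS = [
    ("warn", 10_000),
    ("info", 5_000),
    ("debug", 2_000),
    ("trace", 500),
]


def slowlog_level(elapsed_ms: float, thresholds=QUERY_THRESHOLDS_MS):
    """Return the most severe level whose threshold was reached, else None."""
    for level, limit_ms in thresholds:
        if elapsed_ms >= limit_ms:
            return level
    return None  # fast enough, nothing is logged


if __name__ == "__main__":
    for took in (100, 700, 2500, 6000, 12000):
        print(took, slowlog_level(took))
```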

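4. Building the ES cluster

1) Building an elasticsearch cluster is simple: as long as cluster.name is the same and the machines are on the same network segment, the ES instances discover each other on startup and form a cluster. elasticsearch discovers nodes by multicast, so it can take a while before a new node is found. Once the cluster is built, cluster health goes from yellow back to green

2) By default elasticsearch binds to 0.0.0.0; HTTP listens on a port in [9200-9300] and node-to-node communication uses a port in [9300-9400]. (The range means that if a port is already in use, the next one is tried automatically.)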
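The "try the next port" behaviour can be imitated in a few lines. This is only an illustration of the idea, not how elasticsearch implements it; `first_free_port` is a hypothetical helper that probes by binding on the loopback address.

```python
import socket


def first_free_port(start: int, end: int, host: str = "127.0.0.1"):
    """Return the first port in [start, end] that can be bound, or None."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is taken (or not permitted); try the next one
            return port
    return None


if __name__ == "__main__":
    # With one ES instance already on 9200, a second one would land on 9201.
    print(first_free_port(9200, 9300))
```

This matches the setup in this post, where the second instance on the same machine ends up on 9201 and 9301.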

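3) A cluster should ideally have 3 or more nodes, which makes failover easier

4) By default elasticsearch puts plugins, logs and, most importantly, your data under the installation directory. This can lead to an unfortunate accident: installing a new elasticsearch may overwrite the installation directory, and if you are not careful you can wipe all your data. It is advisable to change the data, log and plugin paths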

path.data: /home/elk/data

path.logs: /home/elk/logs

path.plugins: /home/elk/plugins

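5) The danger of split brain:

Setting the minimum number of master nodes (Minimum Master Nodes) goes some way toward preventing split brain, where some nodes cannot be discovered and one cluster ends up with two masters. The setting is very important for cluster stability. It tells elasticsearch not to elect a master unless enough master-eligible nodes are available; an election only takes place once there are enough.

6) This setting should always be configured to a quorum of the master-eligible nodes, where quorum = (number of master-eligible nodes / 2) + 1, with the division rounded down.

With the value below, at least 2 nodes are needed before a master is elected (that is, before a cluster forms). The comments in the configuration file give the official recommendation as [number of nodes / 2 + 1]; I have 3 nodes, so I set it to 2.

The discovery ping timeout is set to 10s. By default a node considers the master dead if it gets no answer within 3 seconds; raising this value makes nodes wait longer for a response, which reduces false positives to some extent.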

discovery.zen.minimum_master_nodes: 2

discovery.zen.ping_timeout: 10s
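The quorum rule is integer arithmetic, which is why 3 master-eligible nodes give 2. A minimal sketch (the function name `minimum_master_nodes` is chosen for this example):

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, "master-eligible nodes -> quorum", minimum_master_nodes(n))
```

Note that with 2 master-eligible nodes the quorum is also 2, so losing either node leaves the cluster without a master; this is one reason 3 or more nodes are preferable.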

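7) Node types

When configuring the nodes of a cluster, each node's role should be stated explicitly. For example:

(1) node.master: true & node.data: true

Both a master and a data node: handles cluster coordination, data storage and so on

(2) node.master: true & node.data: false

A master node: handles cluster coordination and so on

(3) node.master: false & node.data: true

A data node: handles indexing data, searching and so on

(4) node.master: false & node.data: false

Neither a master nor a data node: handles search load balancing (fetches data from the data nodes and processes it)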
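The four combinations can be captured in a small lookup, handy when checking a set of node configurations. `node_role` and the role labels are names chosen for this sketch, not elasticsearch terminology.

```python
def node_role(master: bool, data: bool) -> str:
    """Describe a node from its node.master and node.data flags."""
    if master and data:
        return "master + data"   # coordinates the cluster and stores data
    if master:
        return "master only"     # coordinates the cluster
    if data:
        return "data only"       # indexes and searches data
    return "load balancer"       # gathers results from the data nodes


if __name__ == "__main__":
    # The two nodes configured in this post.
    print("elasticsearch-node01:", node_role(master=True, data=True))
    print("elasticsearch-node02:", node_role(master=False, data=True))
```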

8) Master node configuration

elasticsearch-node01 configuration: (after I set the node count, logstash and es would not start; I do not know why, so I left it out)

[root@ossec-server config]# egrep -v '^$|^#' /home/elk/elasticsearch-1.6.0/config/elasticsearch.yml

cluster.name: elasticsearch

node.name: "elasticsearch-node01"

node.master: true

node.data: true

bootstrap.mlockall: true

http.max_content_length: 2000mb

http.compression: true

index.cache.field.type: soft

index.cache.field.max_size: 50000

index.cache.field.expire: 10m
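The `egrep -v '^$|^#'` filter above drops empty lines and lines that start with #. The same filtering, plus a naive split of each remaining line into key and value, might look like the sketch below. `effective_settings` is a hypothetical helper; it only handles flat `key: value` lines like the ones shown and is not a YAML parser.

```python
def effective_settings(text: str) -> dict:
    """Return the active key/value pairs of a flat elasticsearch.yml."""
    settings = {}
    for line in text.splitlines():
        if line == "" or line.startswith("#"):
            continue  # same lines that egrep -v '^$|^#' removes
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip().strip('"')
    return settings


if __name__ == "__main__":
    sample = (
        "# cluster settings\n"
        "\n"
        "cluster.name: elasticsearch\n"
        'node.name: "elasticsearch-node01"\n'
        "#node.master: false\n"
    )
    print(effective_settings(sample))
```

Comparing the resulting dictionaries for two nodes is a quick way to confirm that cluster.name matches on both.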

9) Secondary node configuration

elasticsearch-node02 configuration:

[root@elasticsearch-node02 ~]# yum install java-1.8.0-openjdk

[root@elasticsearch-node02 ~]# java -version

openjdk version "1.8.0_91"

[root@elasticsearch-node02 ~]# egrep -v '^$|^#' /usr/local/elasticsearch-1.6.0/config/elasticsearch.yml

cluster.name: elasticsearch

node.name: "elasticsearch-node02"

node.master: false

node.data: true

path.data: /home/elk/data

path.work: /home/elk/work

path.logs: /home/elk/logs

path.plugins: /home/elk/plugins

bootstrap.mlockall: true

transport.tcp.port: 9301

http.port: 9201

http.max_content_length: 2000mb

http.compression: true

index.cache.field.type: soft

index.cache.field.max_size: 50000

index.cache.field.expire: 10m

Enabling bootstrap.mlockall requires running ulimit -l unlimited

bootstrap.mlockall: true

ulimit -l unlimited

[root@elasticsearch-node02 config]# ulimit -n 65535

[root@elasticsearch-node02 config]# ulimit -a

core file size (blocks, -c) 0

data seg size (kbytes, -d) unlimited

scheduling priority (-e) 0

file size (blocks, -f) unlimited

pending signals (-i) 62835

max locked memory (kbytes, -l) unlimited

max memory size (kbytes, -m) unlimited

open files (-n) 65535

pipe size (512 bytes, -p) 8

POSIX message queues (bytes, -q) 819200

real-time priority (-r) 0

stack size (kbytes, -s) 10240

cpu time (seconds, -t) unlimited

max user processes (-u) 62835

virtual memory (kbytes, -v) unlimited

file locks (-x) unlimited

[root@elasticsearch-node02 config]# cat /etc/security/limits.conf

root hard memlock unlimited
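The "max locked memory" line of `ulimit -a` can also be checked from a script. The sketch below uses Python's standard `resource` module, which is available on Unix-like systems only; `memlock_is_unlimited` is a hypothetical helper, and it looks at the soft limit, which is the value `ulimit -l` prints by default.

```python
import resource


def memlock_is_unlimited(limits=None) -> bool:
    """True when the soft max-locked-memory limit is unlimited."""
    if limits is None:
        # (soft, hard) limits of the current process
        limits = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    soft, _hard = limits
    return soft == resource.RLIM_INFINITY


if __name__ == "__main__":
    if memlock_is_unlimited():
        print("memlock is unlimited; bootstrap.mlockall can lock memory")
    else:
        print("memlock is limited; run ulimit -l unlimited first")
```

The limit is per process and inherited from the shell, so the check only says something about ES if it runs in the same session, or under the same limits.conf entry, that starts ES.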

10) With the cluster built, check the ES cluster status on ossec-server: it has turned green

[root@ossec-server config]# curl -XGET http://localhost:9200/_cluster/health\?pretty

{

"cluster_name" : "elasticsearch",

"status" : "green",

"timed_out" : false,

"number_of_nodes" : 3,

"number_of_data_nodes" : 2,

"active_primary_shards" : 216,

"active_shards" : 432,

"relocating_shards" : 0,

"initializing_shards" : 0,

"unassigned_shards" : 0,

"number_of_pending_tasks" : 0,

"number_of_in_flight_fetch" : 0

}

[root@elasticsearch-node02 ~]# curl -XGET http://localhost:9201/_cluster/health\?pretty

{

"cluster_name" : "elasticsearch",

"status" : "green",

"timed_out" : false,

"number_of_nodes" : 3,

"number_of_data_nodes" : 2,

"active_primary_shards" : 216,

"active_shards" : 432,

"relocating_shards" : 0,

"initializing_shards" : 0,

"unassigned_shards" : 0,

"number_of_pending_tasks" : 0,

"number_of_in_flight_fetch" : 0

}

Since I only set up 2 nodes, I will not demonstrate ES failover here; if you need it, see the articles below

Reference articles:

http://www.searchtech.pro/articles/2013/02/18/1361194291548.html

http://www.cnblogs.com/dennisit/p/4133131.html
