Elasticsearch Cluster Operations

1. Index Management

1) Create an index

PUT test-2019-03
{
  "settings": {
    "index": {
      "number_of_shards": 10,
      "number_of_replicas": 1,
      "routing": {
        "allocation": {
          "include": {
            "type": "hot"
          }
        }
      }
    }
  }
}

2) Delete an index

DELETE test-2019-03

DELETE test*

The wildcard * is supported.
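Since wildcard deletes target time-based indices, it can help to see exactly which monthly names a pattern covers. The following is an illustrative sketch (the `monthly_indices` helper and the `test` prefix are this example's own, not part of Elasticsearch):

```python
from datetime import date

def monthly_indices(prefix, start, end):
    """Generate monthly index names like 'test-2019-03' between two dates (inclusive)."""
    names = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        names.append(f"{prefix}-{y:04d}-{m:02d}")
        m += 1
        if m > 12:
            y, m = y + 1, 1
    return names

# Names a per-index DELETE (or a test-* wildcard) would cover:
print(monthly_indices("test", date(2018, 11, 1), date(2019, 2, 1)))
# ['test-2018-11', 'test-2018-12', 'test-2019-01', 'test-2019-02']
```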

3) Modify an index

Change the number of replicas:

PUT test-2019-03/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}

4) Rebuild an index with reindex

POST _reindex
{
  "source": {
    "index": ["test-2018-07-*"]
  },
  "dest": {
    "index": "test-2018-07"
  }
}

View running reindex tasks:

GET _tasks?detailed=true&actions=*reindex
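A long reindex is easier to monitor if you turn the `_tasks` response into a progress figure. A minimal sketch, assuming the 6.x task-status shape (`status.total` plus `created`/`updated`/`deleted` counters per task):

```python
def reindex_progress(tasks_response):
    """Summarize (done, total) per task from a GET _tasks?detailed=true response."""
    out = {}
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            s = task.get("status", {})
            done = s.get("created", 0) + s.get("updated", 0) + s.get("deleted", 0)
            out[task_id] = (done, s.get("total", 0))
    return out

# Trimmed-down sample response for illustration:
sample = {"nodes": {"n1": {"tasks": {
    "n1:123": {"action": "indices:data/write/reindex",
               "status": {"total": 1000, "created": 400, "updated": 100, "deleted": 0}}}}}}
print(reindex_progress(sample))  # {'n1:123': (500, 1000)}
```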

5) Delete documents with delete_by_query

POST indexApple-2019-02/_delete_by_query?conflicts=proceed
{
  "query": {
    "bool": {
      "must": {
        "term": { "appIndex": "apple" }
      },
      "filter": {
        "range": {
          "timestamp": {
            "gte": "2019-02-23 08:00:00",
            "lte": "2019-02-23 22:00:00",
            "time_zone": "+08:00"
          }
        }
      }
    }
  }
}

View running delete_by_query tasks:

GET _tasks?detailed=true&actions=*/delete/byquery
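When the same time-window cleanup runs regularly, it is less error-prone to build the query body programmatically than to edit JSON by hand. A hypothetical helper mirroring the example above (the `appIndex`/`timestamp` field names come from that example and are illustrative):

```python
import json

def time_window_delete_query(term_field, term_value, ts_from, ts_to, tz="+08:00"):
    """Build a delete_by_query body: a term match filtered to a timestamp range."""
    return {
        "query": {
            "bool": {
                "must": {"term": {term_field: term_value}},
                "filter": {
                    "range": {
                        "timestamp": {"gte": ts_from, "lte": ts_to, "time_zone": tz}
                    }
                },
            }
        }
    }

body = time_window_delete_query("appIndex", "apple",
                                "2019-02-23 08:00:00", "2019-02-23 22:00:00")
print(json.dumps(body, indent=2))
```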

2. Cluster Settings

Cluster-wide settings are applied via:

curl -XPUT http://<domain>:<port>/_cluster/settings

1) Shard Allocation Settings

{"persistent":{"cluster.routing.allocation.enable": "all"}}

Controls which kinds of shards the cluster is allowed to allocate; 4 options:

all - (default) Allows shard allocation for all kinds of shards.

primaries - Allows shard allocation only for primary shards.

new_primaries - Allows shard allocation only for primary shards for new indices.

none - No shard allocations of any kind are allowed for any indices.

{"persistent":{"cluster.routing.allocation.node_concurrent_recoveries": 8}}
Sets the number of concurrent shard recoveries allowed on a node (incoming and outgoing).

{"persistent":{"cluster.routing.allocation.node_initial_primaries_recoveries": 16}}
Sets how many unassigned primary shards a node may recover concurrently from local storage after a restart.

{"persistent":{"indices.recovery.max_bytes_per_sec": "500mb"}}
Limits index recovery throughput in bytes per second.
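These three recovery knobs are usually tuned together, so composing them into one `_cluster/settings` payload keeps them consistent. A sketch (the defaults passed in are the example values from the text, not Elasticsearch's defaults):

```python
import json

def recovery_settings(node_concurrent=8, initial_primaries=16, max_bps="500mb"):
    """Compose the recovery-related settings into one persistent payload."""
    return {"persistent": {
        "cluster.routing.allocation.node_concurrent_recoveries": node_concurrent,
        "cluster.routing.allocation.node_initial_primaries_recoveries": initial_primaries,
        "indices.recovery.max_bytes_per_sec": max_bps,
    }}

# Body to PUT to /_cluster/settings:
print(json.dumps(recovery_settings()))
```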

2) Shard Rebalancing Settings

{"persistent":{"cluster.routing.rebalance.enable": "all"}}

Controls which kinds of shards the cluster is allowed to rebalance; 4 options:

all - (default) Allows shard balancing for all kinds of shards.

primaries - Allows shard balancing only for primary shards.

replicas - Allows shard balancing only for replica shards.

none - No shard balancing of any kind is allowed for any indices.

{"persistent":{"cluster.routing.allocation.allow_rebalance": "indices_all_active"}}

Controls when rebalancing may start; 3 options:

always - Always allow rebalancing.

indices_primaries_active - Only when all primaries in the cluster are allocated.

indices_all_active - (default) Only when all shards (primaries and replicas) in the cluster are allocated.

{"transient":{"cluster.routing.allocation.cluster_concurrent_rebalance": 8}}
Sets the number of concurrent shard rebalances across the cluster. This only limits the rebalancing process; it has no effect on recovery concurrency or other allocation activity.

{"transient":{"cluster.routing.allocation.cluster_concurrent_rebalance": 0}}
Disables cluster rebalancing.

{"transient":{"cluster.routing.allocation.cluster_concurrent_rebalance": null}}
Re-enables cluster rebalancing (restores the default).

3) Disk-based Shard Allocation

# Set the low watermark for data nodes to 80%
{"transient":{"cluster.routing.allocation.disk.watermark.low":"80%"}}
# Set the high watermark for data nodes to 90%
{"transient":{"cluster.routing.allocation.disk.watermark.high":"90%"}}
# Clear the user-set values and restore the cluster defaults
{"transient":{"cluster.routing.allocation.disk.watermark.low": null}}
{"transient":{"cluster.routing.allocation.disk.watermark.high": null}}
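To make the watermark semantics concrete: below the low watermark a node accepts new shards; between low and high, no new shards are allocated to it; above high, Elasticsearch starts relocating shards away. A sketch classifying a node's disk usage (85%/90% are the Elasticsearch defaults):

```python
def watermark_state(used_percent, low=85.0, high=90.0):
    """Classify a data node's disk usage against the low/high watermarks."""
    if used_percent >= high:
        return "high"  # shards are relocated away from this node
    if used_percent >= low:
        return "low"   # no new shards are allocated to this node
    return "ok"

print(watermark_state(82))  # ok
print(watermark_state(87))  # low
print(watermark_state(93))  # high
```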

4) Allocation Filtering

Explicitly controls whether shards may be allocated to specific nodes, at both the index and the cluster level:

  • index.routing.allocation.require.{attribute}
  • index.routing.allocation.include.{attribute}
  • index.routing.allocation.exclude.{attribute}
  • cluster.routing.allocation.require.{attribute}
  • cluster.routing.allocation.include.{attribute}
  • cluster.routing.allocation.exclude.{attribute}

require means shards must be allocated to matching nodes; include means shards may be allocated to matching nodes; exclude means shards must not be allocated to matching nodes. Cluster-level settings override index-level settings: if an index includes a node but the cluster excludes it, the node ends up excluded.

# Exclude one node from the cluster by IP: 10.100.0.11
{"transient":{"cluster.routing.allocation.exclude._ip":"10.100.0.11"}}
# Exclude multiple nodes by IP: 10.100.0.11 and 10.100.0.12
{"transient":{"cluster.routing.allocation.exclude._ip":"10.100.0.11,10.100.0.12"}}
# Remove the exclusion
{"transient":{"cluster.routing.allocation.exclude._ip": null}}
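Since `exclude._ip` takes a single comma-separated string (and `null` clears it), a small helper keeps the list and the payload in sync. An illustrative sketch:

```python
def exclude_ips(ips):
    """Build the transient payload excluding the given node IPs;
    an empty list maps to None (JSON null), which clears the exclusion."""
    value = ",".join(ips) if ips else None
    return {"transient": {"cluster.routing.allocation.exclude._ip": value}}

print(exclude_ips(["10.100.0.11", "10.100.0.12"]))
print(exclude_ips([]))  # clears the exclusion
```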

Prevent an index from being allocated to certain IPs:

PUT test/_settings
{
  "index.routing.allocation.exclude._ip": "192.168.2.*"
}

Built-in node attributes:

_name        Match nodes by node name

_host_ip     Match nodes by host IP address (IP associated with hostname)

_publish_ip  Match nodes by publish IP address

_ip          Match either _host_ip or _publish_ip

_host        Match nodes by hostname

5) Shard Allocation Troubleshooting

1. Explain why shards are unassigned:
GET _cluster/allocation/explain?pretty

2. Check the recovery status of an index (user, for example):
GET user/_recovery?active_only=true

3. Use reroute to retry previously failed allocations; the cluster gives up on a shard after index.allocation.max_retries attempts (default 5):
POST /_cluster/reroute?retry_failed=true

4. List indices whose health is red:
GET _cat/indices?health=red
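The allocation/explain response is verbose; pulling out the handful of fields that matter makes triage faster. A sketch assuming the 6.x response shape (`index`, `shard`, `primary`, `unassigned_info.reason`, `can_allocate`):

```python
def unassigned_summary(explain):
    """Extract the key diagnostic fields from a _cluster/allocation/explain response."""
    return {
        "index": explain.get("index"),
        "shard": explain.get("shard"),
        "primary": explain.get("primary"),
        "reason": explain.get("unassigned_info", {}).get("reason"),
        "can_allocate": explain.get("can_allocate"),
    }

# Trimmed-down sample response for illustration:
sample = {"index": "user", "shard": 0, "primary": True,
          "current_state": "unassigned",
          "unassigned_info": {"reason": "NODE_LEFT"},
          "can_allocate": "no"}
print(unassigned_summary(sample))
```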

 

Rolling Cluster Restart

1) Preparation
## Keep the following endpoints open in advance. Some return metrics that must be watched during the restart (stop if problems appear); the rest are supporting checks:
## Explain why shards are UNASSIGNED
curl http://0.0.0.0:9200/_cluster/allocation/explain?pretty

### Cluster settings
curl http://0.0.0.0:9200/_cluster/settings?pretty

### Pending tasks
curl http://0.0.0.0:9200/_cluster/pending_tasks?pretty

### Cluster health
curl http://0.0.0.0:9200/_cluster/health?pretty
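Between node restarts, the `_cluster/health` output is the main go/no-go signal. A sketch of a gate check on the health response; the specific thresholds (green status, zero relocating/initializing/unassigned shards) are a conservative assumption, not an official rule:

```python
def safe_to_restart_next(health):
    """Decide from a _cluster/health response whether to proceed to the next node."""
    return (health.get("status") == "green"
            and health.get("relocating_shards", 0) == 0
            and health.get("initializing_shards", 0) == 0
            and health.get("unassigned_shards", 0) == 0)

print(safe_to_restart_next({"status": "green", "relocating_shards": 0,
                            "initializing_shards": 0, "unassigned_shards": 0}))  # True
print(safe_to_restart_next({"status": "yellow", "unassigned_shards": 12}))       # False
```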
2) Restart client nodes
#start
Step 1: Shut down one client node
Step 2: Restart the node
Step 3: Verify the node has rejoined the cluster
Step 4: Repeat steps 2-3 for the remaining client nodes
#end

3) Restart master nodes
#start
Step 1: Identify the current master node's IP
Step 2: Shut down one non-master node in the master group
Step 3: Restart the node
Step 4: Verify the node has rejoined the cluster (make sure it has joined)
Step 5: Repeat steps 2-4 for the other non-master nodes in the master group
Step 6: Shut down the master node
Step 7: Restart the master node
## While a master election is in progress, the cluster is unavailable (indexing, search, and other APIs block). A new master is not elected immediately: the default election window is 3s, and network issues often prolong it.
Step 8: Check the cluster state and verify the node has rejoined the cluster.
## Once a master is elected, the cluster returns to full operation.
#end

4) Restart data nodes
#start
Step 1: Disable shard allocation
curl -X PUT -H 'Content-Type: application/json' http://0.0.0.0:9200/_cluster/settings?pretty -d '{"transient": {"cluster.routing.allocation.enable": "new_primaries"}}'
## While allocation is restricted this way, replica shards of newly created indices cannot be assigned, but primary shards of new indices still can.
Step 2: Perform a synced flush
curl -XPOST "http://0.0.0.0:9200/_flush/synced?pretty"
## For indices no longer being written to, the sync marker lets the cluster confirm that primary and replica shards hold identical data, which speeds up shard recovery when the node rejoins. For shards still receiving writes, this step does not help recovery.
Step 3: Shut down one data node
Step 4: Restart the node
Step 5: Verify the node has rejoined the cluster
Step 6: Re-enable shard allocation
curl -X PUT -H 'Content-Type: application/json' http://0.0.0.0:9200/_cluster/settings?pretty -d '{"transient": {"cluster.routing.allocation.enable": "all"}}'
Step 7: Check that cluster health is green
## After allocation is re-enabled, the number of UNASSIGNED shards drops quickly, but not instantly to 0: in a large ES cluster every node holds shards of actively updated indices. Some initializing shards then appear and take a while to drop to 0 as the shards synchronize.
Step 8: Repeat steps 3-7 for the remaining data nodes
Step 9: After all nodes are restarted, check the cluster settings to make sure shard allocation is not left disabled
#end
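The API calls that bracket each data-node restart can be sketched as a fixed request sequence, which is handy as a checklist or for scripting. An illustrative sketch (the node shutdown/start itself happens out of band):

```python
def data_node_restart_requests():
    """The cluster API calls surrounding one data-node restart, as (method, path, body)."""
    disable = {"transient": {"cluster.routing.allocation.enable": "new_primaries"}}
    enable = {"transient": {"cluster.routing.allocation.enable": "all"}}
    return [
        ("PUT", "/_cluster/settings", disable),  # step 1: restrict allocation
        ("POST", "/_flush/synced", None),        # step 2: synced flush
        # ... shut down the node, restart it, wait for it to rejoin ...
        ("PUT", "/_cluster/settings", enable),   # step 6: re-enable allocation
        ("GET", "/_cluster/health", None),       # step 7: wait for green
    ]

for method, path, body in data_node_restart_requests():
    print(method, path)
```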
References:

Official Elasticsearch rolling restart guide: https://www.elastic.co/guide/en/elasticsearch/reference/1.4/cluster-nodes-shutdown.html#_rolling_restart_of_nodes_full_cluster_restart

 

 

Reference:

https://www.elastic.co/guide/en/elasticsearch/reference/6.2/index.html
