Notes on the Elasticsearch "maximum shards open" problem

Date: 2019-12-26

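Background

One day I opened the Jaeger UI and found that it showed no data at all, which was odd.

I immediately logged in to the server and checked the processes, ports, and network connectivity of jaeger-collector, jaeger-agent, jaeger-query, and Elasticsearch. Everything looked normal.

Next I traced the data flow. The jaeger-collector logs showed that data transfer from jaeger-agent to jaeger-collector was fine.

The transfer from jaeger-collector to ES, however, was failing with the following error: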

{"level":"error","ts":1576483292.2617185,"caller":"config/config.go:130","msg":"Elasticsearch part of bulk request failed","map-key":"index","response":{"_index":"jaeger-span-2019-12-16","_type":"_doc","status":400,"error":{"type":"validation_exception","reason":"Validation Failed: 1: this action would add [10] total shards, but this cluster currently has [992]/[1000] maximum shards open;"}},"stacktrace":"github.com/jaegertracing/jaeger/pkg/es/config.(*Configuration).NewClient.func2\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/pkg/es/config/config.go:130\ngithub.com/jaegertracing/jaeger/vendor/github.com/olivere/elastic.(*bulkWorker).commit\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/olivere/elastic/bulk_processor.go:588\ngithub.com/jaegertracing/jaeger/vendor/github.com/olivere/elastic.(*bulkWorker).work\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/olivere/elastic/bulk_processor.go:501"}

The key part of the error message:

this action would add [10] total shards, but this cluster currently has [992]/[1000] maximum shards open

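The error shows that the cluster already has 992 open shards and that the cluster-wide maximum is 1000 shards. The request would add 10 more shards, for a total of 1002, which exceeds the maximum, so the data cannot be written.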
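To make the numbers in that message easy to check, here is a minimal sketch that extracts them with a regular expression. The function name `parse_shard_limit_error` is hypothetical, and the pattern is written only against the message text quoted above, not against any documented error format.

```python
import re

# Pattern written against the validation error text quoted above.
SHARD_LIMIT_RE = re.compile(
    r"would add \[(\d+)\] total shards, "
    r"but this cluster currently has \[(\d+)\]/\[(\d+)\] maximum shards open"
)


def parse_shard_limit_error(reason):
    """Return (to_add, current, maximum), or None if the text does not match."""
    match = SHARD_LIMIT_RE.search(reason)
    if match is None:
        return None
    to_add, current, maximum = (int(group) for group in match.groups())
    return to_add, current, maximum


reason = (
    "Validation Failed: 1: this action would add [10] total shards, "
    "but this cluster currently has [992]/[1000] maximum shards open;"
)
to_add, current, maximum = parse_shard_limit_error(reason)
print(current + to_add, ">", maximum)  # 1002 > 1000
```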

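Where does the 1000-shard limit come from?

According to the official documentation, starting with Elasticsearch v7.0.0 each node in the cluster is limited to 1000 shards by default. A cluster with 3 data nodes can therefore hold at most 3000 shards. This is our development environment, which has a single ES node, so the limit is 1000.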
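The arithmetic behind the limit can be sketched as follows. The function names are hypothetical, and the "less than or equal" comparison is an assumption about how the check behaves at exactly the limit; the default of 1000 per node is the figure from the text.

```python
def cluster_shard_limit(data_nodes, max_shards_per_node=1000):
    """Cluster-wide limit: the per-node limit times the number of data nodes."""
    return data_nodes * max_shards_per_node


def can_add_shards(current, to_add, data_nodes, max_shards_per_node=1000):
    """True if adding `to_add` shards keeps the cluster within its limit."""
    limit = cluster_shard_limit(data_nodes, max_shards_per_node)
    return current + to_add <= limit


# The situation from the error message: one node, 992 open shards, 10 to add.
print(can_add_shards(current=992, to_add=10, data_nodes=1))  # False
# The same request would fit in a cluster with three data nodes.
print(can_add_shards(current=992, to_add=10, data_nodes=3))  # True
```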

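ES basic concepts

If you are new to Elasticsearch, it is important to learn the basic terminology and grasp the basic concepts.

(Figure: a simple diagram of an Elasticsearch cluster)

Cluster – An Elasticsearch cluster consists of one or more nodes and is identified by its cluster name.

Node – A single Elasticsearch instance. In most environments, each node runs on a separate box or virtual machine.

Index – In Elasticsearch, an index is a collection of documents.

Shard – Because Elasticsearch is a distributed search engine, an index is usually split into multiple elements called shards, which are distributed across multiple nodes. Elasticsearch manages the arrangement of these shards automatically and rebalances them as needed, so users do not need to worry about the details.

Replica – By default, Elasticsearch versions before 7.0 create five primary shards and one replica for each index (from 7.0 on, the default is one primary shard). This means each index contains five primary shards, and each primary shard has one replica.

Allocating multiple shards and replicas is the essence of the distributed search design; it provides high availability and fast access to the documents in an index. The main difference between a primary shard and a replica shard is that only the primary shard can accept indexing requests. Both replica and primary shards can serve query requests.

In the figure above, we have an Elasticsearch cluster made up of two nodes with the default shard configuration. Elasticsearch automatically arranges the five primary shards across the two nodes. Each primary shard has one replica shard, but the replica shards are arranged quite differently from the primary shards.

Remember that number_of_shards is a per-index value, not a cluster-wide one. It specifies the number of primary shards for each index (not the total number of primary shards in the cluster).

Replicas mainly serve to improve search performance, and users can add or remove them at any time. They give you extra capacity, higher throughput, and stronger failover. We always recommend that production clusters have 2 replicas for failover.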
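Since every replica is a full copy of each primary shard, the number of shards an index contributes to the cluster total follows directly from the two settings. A minimal sketch (the function name `total_shards` is hypothetical):

```python
def total_shards(number_of_shards, number_of_replicas):
    """Shards one index adds to the cluster: primaries plus all replica copies."""
    return number_of_shards * (1 + number_of_replicas)


# Five primaries with one replica each gives the "[10] total shards"
# seen in the error message.
print(total_shards(5, 1))  # 10
# With two replicas, as suggested for production, each index costs 15 shards.
print(total_shards(5, 2))  # 15
# How many 10-shard indices fit under a 1000-shard limit.
print(1000 // total_shards(5, 1))  # 100
```

With 10 shards per index, a single node with the default limit has room for only 100 such indices, which is why daily indices fill it up quickly.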

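Solution

Now that the cause is known, how do we fix it?

Fixing it requires answering two questions:

1. How many `Shard`s per `Index` are appropriate?

After setting up an Elasticsearch cluster, the shard count is hard to get right. Once an index's shard count is set it cannot simply be changed afterwards, so before creating an index we need to think carefully about how much data it will eventually hold. An unsuitable shard count will hurt performance significantly.

If you later find that you need to change the number of shards, you have to reindex all source documents. (Reindexing is a lengthy process, but it can be done without downtime.)

Primary shard configuration is much like disk partitioning: repartitioning raw disk space requires the user to back up, configure the new partitions, and rewrite the data onto them.

A little over-allocation is fine. Allocating 1000 shards to every index is not.

Remember that every shard you allocate comes at a cost:

Because a shard is essentially a Lucene index, it consumes file handles, memory, and CPU resources.
Every search request touches a copy of every shard in the index. This is not a problem when the shards are spread across multiple nodes, but when shards compete for the same hardware resources, contention arises and performance drops.

Our customers expect their businesses to grow and their datasets to expand accordingly, so there is always a need for contingency planning. Many users convince themselves that they will see explosive growth (although most never actually see unmanageable growth). Besides, all of us want to minimize downtime and avoid resharding.

If you are worried about rapid data growth, we suggest keeping one simple constraint in mind: the recommended maximum JVM heap size for Elasticsearch is about 30-32GB. This is a solid estimate of the absolute maximum shard size. For example, if you really think you could reach 200GB (but could not go beyond that without changing other infrastructure), we recommend allocating 7 shards, or at most 8.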
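The 200GB example can be reproduced with a one-line calculation. This is only a sketch of the rule of thumb above: the function name `shards_for` is hypothetical, and 30GB is taken as the per-shard ceiling from the 30-32GB range in the text.

```python
import math


def shards_for(expected_gb, max_shard_gb=30):
    """Smallest shard count that keeps each shard at or below max_shard_gb."""
    return math.ceil(expected_gb / max_shard_gb)


print(shards_for(200))      # 7
print(shards_for(200, 32))  # 7
```

Rounding up gives 7 shards for either end of the 30-32GB range; the "at most 8" in the text leaves one extra shard of headroom.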

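Do not allocate shards for an overly ambitious 10 TB target that you might reach three years from now.

If your shard count is already unsuitable and you do not know how to adjust it, a good approach is to create indices by time and query them with wildcards. If the daily data volume is large, create one index per day; if the volume only becomes large over a month, create one index per month. Resharding an existing index requires rebuilding the index.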
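A minimal sketch of time-based index naming, assuming the `prefix-YYYY-MM-DD` style that the Jaeger indices in this article use. The helper names `index_name` and `wildcard` are hypothetical.

```python
from datetime import date


def index_name(prefix, day, granularity="day"):
    """Build a per-day or per-month index name such as jaeger-span-2019-12-16."""
    fmt = "%Y-%m-%d" if granularity == "day" else "%Y-%m"
    return f"{prefix}-{day.strftime(fmt)}"


def wildcard(prefix):
    """Pattern that matches every time-based index sharing the prefix."""
    return f"{prefix}-*"


day = date(2019, 12, 16)
print(index_name("jaeger-span", day))           # jaeger-span-2019-12-16
print(index_name("jaeger-span", day, "month"))  # jaeger-span-2019-12
print(wildcard("jaeger-span"))                  # jaeger-span-*
```

Writes go to the index for the current day or month, while searches use the wildcard pattern so they cover every period.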

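Changing the default Elasticsearch shard count

It is true that changing the default value of index.number_of_shards in the configuration file involves changing the setting on every node and then, ideally, restarting the instances following the rolling restart guidelines.

However, if that is not an option, and explicitly specifying number_of_shards in the settings each time a new index is created is not ideal either, the workaround is to use index templates.

You can create an index_defaults template as follows: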

PUT /_template/index_defaults
{
  "index_patterns": ["*"],
  "settings": {
    "number_of_shards": 4
  }
}

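This applies the settings specified in the index_defaults template to all new indices.

Rebuilding an index

See: How to rebuild an index in Elasticsearch

2. How large should `maximum shards open` be for each node?

A common rule of thumb ties the number of shards to memory: 20-25 shards per 1GB of heap. A node with a 30GB heap should therefore hold at most 600 shards, and the further below that limit you stay, the better. A single shard should not exceed 50GB. Following these guidelines generally helps keep the cluster healthy.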
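The heap-based rule of thumb is simple multiplication. A minimal sketch, with the hypothetical function name `max_shards_for_heap` and the lower bound of 20 shards per GB as the default:

```python
def max_shards_for_heap(heap_gb, shards_per_gb=20):
    """Upper bound on shards per node under the shards-per-GB-of-heap rule."""
    return heap_gb * shards_per_gb


print(max_shards_for_heap(30))      # 600
print(max_shards_for_heap(30, 25))  # 750
```

By this measure, raising max_shards_per_node far above a few hundred only removes the guard rail; it does not give the node the memory to run that many shards well.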

3. Concrete measures

In my view, in development and test environments where the data is not that important, you can simply delete all indices:

DELETE /_all

Official documentation: Delete index

Then reset the defaults: lower number_of_shards and raise max_shards_per_node at the same time.

vim elasticsearch.yml

# Set the number of shards (splits) of an index (5 by default):
#
index.number_of_shards: 2
# Set the number of replicas (additional copies) of an index (1 by default):
#
index.number_of_replicas: 1

cluster.max_shards_per_node: 3000

For a production environment:

vim elasticsearch.yml

# Set the number of shards (splits) of an index (5 by default):
#
index.number_of_shards: 5
# Set the number of replicas (additional copies) of an index (1 by default):
#
index.number_of_replicas: 2

cluster.max_shards_per_node: 3000

Putting index-level settings directly in the ES configuration file causes an error. They can be set through Kibana instead.

Open the Kibana console:

DELETE /_all

PUT _template/default
{
    "index_patterns" : ["jaeger*"],
    "order" : 1,
    "settings": {
        "number_of_shards": "2",
        "number_of_replicas": "1"
    }
}

PUT _template/default1
{
    "index_patterns" : ["*"],
    "order" : 0,
    "settings": {
        "number_of_shards": "5",
        "number_of_replicas": "2"
    }
}

PUT /_cluster/settings
{
  "transient": {
    "cluster": {
      "max_shards_per_node":10000
    }
  }
}

order

(Optional, integer) Order in which Elasticsearch applies this template if index matches multiple templates.

Templates with lower order values are merged first.
Templates with higher order values are merged later, overriding templates with lower values.

Put index template API
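The merge rule can be illustrated with a small simulation. This is a sketch of the ordering rule quoted above, not Elasticsearch's actual implementation: the function `effective_settings` is hypothetical, `fnmatch` stands in for index pattern matching, and the index name `app-logs` is made up for the example.

```python
from fnmatch import fnmatch

# The two templates created in the console above.
templates = [
    {"index_patterns": ["jaeger*"], "order": 1,
     "settings": {"number_of_shards": "2", "number_of_replicas": "1"}},
    {"index_patterns": ["*"], "order": 0,
     "settings": {"number_of_shards": "5", "number_of_replicas": "2"}},
]


def effective_settings(index, templates):
    """Merge matching templates, lowest order first so higher orders win."""
    matching = [
        template for template in templates
        if any(fnmatch(index, pattern) for pattern in template["index_patterns"])
    ]
    merged = {}
    for template in sorted(matching, key=lambda template: template["order"]):
        merged.update(template["settings"])
    return merged


print(effective_settings("jaeger-span-2019-12-17", templates))
# {'number_of_shards': '2', 'number_of_replicas': '1'}
print(effective_settings("app-logs", templates))
# {'number_of_shards': '5', 'number_of_replicas': '2'}
```

A Jaeger index matches both templates, and the order 1 template overrides the catch-all, so it ends up with 2 primary shards and 1 replica.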

You can see that the settings have taken effect:

$ curl -X GET 'http://127.0.0.1:9200/_cat/indices?v'

health status index                  uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   jaeger-span-2019-12-17 2DHx3EaGTnKlVC4mefsUJw   2   1         27            0       22kb           22kb
yellow open   .kibana                SeD9KqnhR7aKpzc2_AcIDw   2   1          1            0      4.3kb          4.3kb

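Most importantly, delete stale data regularly. For Jaeger data, for example, you can delete everything older than one month at the start of each month, keeping only the most recent month of index data.

Keeping only the last 30 days of data breaks down into two tasks:

Use delete_by_query to select and delete the data older than 30 days;
Run forcemerge to release the disk space manually.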

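#!/bin/sh
# Delete documents older than 30 days from all jaeger-* indices.
# "pt" is the date field used in this example; replace it with the
# timestamp field of your own index mapping.
curl -XPOST "http://127.0.0.1:9200/jaeger-*/_delete_by_query?conflicts=proceed" -H 'Content-Type: application/json' -d'{
    "query": {
        "range": {
            "pt": {
                "lt": "now-30d",
                "format": "epoch_millis"
            }
        }
    }
}
'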

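force merge API

Three parameters are available:

max_num_segments: the number of segments to merge down to; 1 forces a merge into a single segment
only_expunge_deletes: only clean up segments that contain deleted documents, that is, slim the index down
flush: run a flush after the merge; defaults to true

The forcemerge script: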

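#!/bin/sh
# only_expunge_deletes and max_num_segments cannot be combined in one request,
# so this call only expunges segments that contain deleted documents.
curl -XPOST 'http://127.0.0.1:9200/_forcemerge?only_expunge_deletes=true'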

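Is there a simpler way?

Yes: curator, a tool from the official ES site.

About curator

Main purpose: planning and managing ES indices. It supports common operations such as create, delete, merge, reindex, and snapshot. See the curator official site.

When to use curator

Most importantly:

Taking deletion as an example, curator makes it very easy to delete indices older than x days. The prerequisite is that index names follow a specific naming pattern, such as indices named by day: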
```
jaeger-span-2019-09-20 
jaeger-span-2019-10-07 
jaeger-service-2019-10-11
```
The naming pattern must match the timestring under delete_indices in action.yml.
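To show why the naming pattern matters, here is a minimal sketch of age-based selection from index names. It is not curator's API or implementation: the function `indices_older_than` is hypothetical, it assumes names end in `YYYY-MM-DD`, and it only prints the names it would select instead of deleting anything.

```python
import re
from datetime import date, datetime, timedelta

# Assumes index names end with a YYYY-MM-DD date, as in the examples above.
DATE_SUFFIX = re.compile(r"(\d{4}-\d{2}-\d{2})$")


def indices_older_than(names, days, today):
    """Return the names whose date suffix is more than `days` before `today`."""
    cutoff = today - timedelta(days=days)
    selected = []
    for name in names:
        match = DATE_SUFFIX.search(name)
        if match is None:
            continue  # no date suffix, so the age cannot be determined
        index_day = datetime.strptime(match.group(1), "%Y-%m-%d").date()
        if index_day < cutoff:
            selected.append(name)
    return selected


names = [
    "jaeger-span-2019-09-20",
    "jaeger-span-2019-10-07",
    "jaeger-service-2019-10-11",
]
print(indices_older_than(names, 30, today=date(2019, 10, 25)))
# ['jaeger-span-2019-09-20']
```

An index whose name carries no parseable date is skipped, which is the reason the naming convention is a prerequisite for this kind of cleanup.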

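4. Temporarily raising the threshold

Change it through the ES API (note that a persistent setting survives a full cluster restart):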

curl -X PUT "dev-jaeger-es01.bj:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
    "persistent" : {
        "cluster.max_shards_per_node" : "5000"
    }
}
'

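You can also change the setting from the Dev Tools console in Kibana (a transient setting is lost after a full cluster restart):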

PUT /_cluster/settings
{
  "transient": {
    "cluster": {
      "max_shards_per_node":10000
    }
  }
}

References

Optimizing Elasticsearch: How Many Shards per Index?
How many shards should I have in my Elasticsearch cluster?
From 10 seconds to 2 seconds! ElasticSearch performance tuning
Elasticsearch tuning in practice
Index settings
