Elasticsearch usage notes (continuously updated)

Date: 2019-12-06

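This post collects some lessons learned from operating Elasticsearch (ES).

1. Node disk usage is too high, so the ES cluster cannot allocate shards. Is data lost?

Two settings control how disk usage affects shard allocation: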

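Setting	Default	Meaning
cluster.routing.allocation.disk.watermark.low	85%	Once a node's disk usage exceeds 85%, no more shards are allocated to that node
cluster.routing.allocation.disk.watermark.high	90%	Once a node's disk usage exceeds 90%, ES tries to relocate that node's shards to other nodes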

How to configure

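# The Content-Type header is required from ES 6.0 onward and harmless before that.
curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
    "transient": {
      "cluster.routing.allocation.disk.watermark.low": "90%"
    }
}'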
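
To make the two thresholds concrete, here is a minimal Python sketch that classifies a node's disk usage against them. The function name and the returned labels are made up for illustration; the defaults are the ones from the table above, and the real decision is made inside ES, not by code like this.

```python
def disk_watermark_state(used_percent, low=85.0, high=90.0):
    """Classify a node's disk usage against the low/high watermarks.

    Hypothetical helper for illustration only; ES applies these rules itself.
    """
    if not 0.0 <= used_percent <= 100.0:
        raise ValueError("used_percent must be between 0 and 100")
    if used_percent > high:
        # Above the high watermark: ES tries to move shards off this node.
        return "relocate_shards_away"
    if used_percent > low:
        # Above the low watermark: no further shards are allocated here.
        return "no_new_shards"
    return "ok"


if __name__ == "__main__":
    for usage in (50, 87, 93):
        print(usage, disk_watermark_state(usage))
```

Raising the low watermark, as the curl command above does, simply moves the first threshold closer to the second.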

Recommendation: keep a close eye on the performance metrics of the ES cluster nodes so that potential risks are noticed early.

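2. Template management

Index templates are very useful, especially when managing a large number of indices. Here is a demo template, with its main fields explained first.

order: 10. The template's priority; when several templates match, fields from a template with a higher order override those from a template with a lower order.

template: test*. This template matches every index whose name starts with test.

The remaining entries are index settings:

index.number_of_shards: 20

index.number_of_replicas: 1

index.refresh_interval: 5s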

{
    "aliases": {},
    "order": 10,
    "template": "test*",
    "settings": {
        "index": {
            "priority": "5",
            "merge": {
                "scheduler": {
                    "max_thread_count": "1"
                }
            },
            "search": {
                "slowlog": {
                    "threshold": {
                        "query": {
                            "warn": "10s",
                            "debug": "1s",
                            "info": "5s",
                            "trace": "500ms"
                        },
                        "fetch": {
                            "warn": "1s",
                            "debug": "500ms",
                            "info": "800ms",
                            "trace": "200ms"
                        }
                    }
                }
            },
            "unassigned": {
                "node_left": {
                    "delayed_timeout": "5m"
                }
            },
            "max_result_window": "10000",
            "number_of_shards": "20",
            "number_of_replicas": "1", 
            "translog": {
                "durability": "async"
            },
            "requests": {
                "cache": {
                    "enable": "true"
                }
            },
            "mapping": {
                "ignore_malformed": "true"
            },
            "refresh_interval": "5s"
        }
    }
}
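
The interaction between the template pattern and order can be sketched in a few lines of Python. This is a simplified model, not how ES implements it: settings are treated as flat keys, fnmatch stands in for ES wildcard matching, and the second template in the sample data is invented for the example.

```python
from fnmatch import fnmatchcase


def effective_settings(index_name, templates):
    """Merge the settings of all templates whose pattern matches index_name.

    Templates are applied in ascending order, so a higher order wins.
    """
    matching = [t for t in templates if fnmatchcase(index_name, t["template"])]
    merged = {}
    for template in sorted(matching, key=lambda t: t.get("order", 0)):
        merged.update(template.get("settings", {}))
    return merged


if __name__ == "__main__":
    # Hypothetical templates: a catch-all plus the test* template from the demo.
    templates = [
        {"template": "*", "order": 0,
         "settings": {"index.number_of_shards": 5, "index.number_of_replicas": 1}},
        {"template": "test*", "order": 10,
         "settings": {"index.number_of_shards": 20, "index.refresh_interval": "5s"}},
    ]
    print(effective_settings("test_logs", templates))
    print(effective_settings("prod_logs", templates))
```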

How to configure

curl -XPUT localhost:9200/_template/template_1 -d '
{
    "template" : "test*",
    "order" : 0,
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "type1" : {
            "_source" : { "enabled" : false }
        }
    }
}
'

Once the template is in place, create an index as follows:

# Create the index
curl -XPUT http://35.1.4.127:9200/index_name

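3. Things to watch when creating mappings

When creating the type mapping for an index, handle _all and _source carefully, otherwise indexing performance suffers.

_all: when enabled, all fields of a type are merged into one large field, which increases indexing time and index size.

_source: when enabled, responses include the original document in _source.

We usually disable _all and enable _source.

Date fields can be handled as shown below; listing several formats separated by || covers most of the awkward timestamp layouts.

How to configure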

curl -XPUT http://35.1.4.129:9200/index_name/RELATION/_mapping -d '{
    "RELATION": {
        "_all": {
            "enabled": "false"
        },
        "_source": {
            "enabled": "true"
        },
        "properties": {
            "FROM_SFZH": {
                "type": "keyword"
            },
            "TO_SFZH": {
                "type": "keyword"
            },
            "CREATE_TIME": {
                "type": "date",
                "format": "yyyy-MM-dd HH:mm:ss.SSS Z||yyyy-MM-dd HH:mm:ss.SSS||yyyy-MM-dd HH:mm:ss,SSS||yyyy/MM/dd HH:mm:ss||yyyy-MM-dd HH:mm:ss,SSS Z||yyyy/MM/dd HH:mm:ss,SSS Z||strict_date_optional_time||epoch_millis||yyyy-MM-dd HH:mm:ss"
            }
        }
    }
}'
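
The || separator in the date format means "try each format in turn until one parses". A rough Python analogue is sketched below. The strptime patterns are only approximate stand-ins for a few of the formats in the mapping (the zone-suffixed variants and strict_date_optional_time are left out), and the function name is hypothetical.

```python
from datetime import datetime, timezone

# Approximate strptime equivalents of some formats from the mapping above.
FORMATS = [
    "%Y-%m-%d %H:%M:%S.%f",   # yyyy-MM-dd HH:mm:ss.SSS
    "%Y-%m-%d %H:%M:%S,%f",   # yyyy-MM-dd HH:mm:ss,SSS
    "%Y/%m/%d %H:%M:%S",      # yyyy/MM/dd HH:mm:ss
    "%Y-%m-%d %H:%M:%S",      # yyyy-MM-dd HH:mm:ss
]


def parse_create_time(value):
    """Try each format in order; fall back to epoch milliseconds."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    if value.isdigit():
        # epoch_millis: returned as a naive datetime in UTC.
        utc = datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
        return utc.replace(tzinfo=None)
    raise ValueError(f"no configured format matches {value!r}")


if __name__ == "__main__":
    for sample in ("2019-12-06 10:20:30.123", "2019/12/06 10:20:30", "0"):
        print(sample, "->", parse_create_time(sample))
```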

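4. Disable replicas and refresh when bulk-loading data into ES

When importing data in bulk on a large scale, disable replicas and refresh. If an index has replicas, ES synchronises every indexed document to them, which adds load during the import.

Restore the replicas once indexing has finished.

How to configure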

# Disable replicas and refresh before the import
curl -XPUT http://35.1.4.129:9200/_settings -d '{
  "index": {
    "number_of_replicas": 0,
    "refresh_interval": -1
  }
}'

# Restore them after the import
curl -XPUT http://35.1.4.129:9200/_settings -d '{
  "index": {
    "number_of_replicas": 1,
    "refresh_interval": "5s"
  }
}'
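
One way to make sure the settings are always restored, even when the import fails halfway, is to wrap the two calls around the import. The sketch below assumes a caller-supplied put_settings function that sends a body to the _settings endpoint; that function and the context manager name are hypothetical, and no HTTP client is included.

```python
from contextlib import contextmanager


@contextmanager
def bulk_load_settings(put_settings, replicas=1, refresh_interval="5s"):
    """Disable replicas and refresh for the duration of a bulk import.

    put_settings is any callable that accepts the settings body as a dict.
    """
    put_settings({"index": {"number_of_replicas": 0, "refresh_interval": "-1"}})
    try:
        yield
    finally:
        # Runs on success and on failure, so the index is never left without replicas.
        put_settings({"index": {"number_of_replicas": replicas,
                                "refresh_interval": refresh_interval}})


if __name__ == "__main__":
    sent = []
    with bulk_load_settings(sent.append):
        pass  # the bulk import would run here
    for body in sent:
        print(body)
```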

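5. JVM-level monitoring and tuning

Elasticsearch is written in Java, so you can load-test it and watch how the JVM behaves, for example by connecting remotely with jconsole.

config/jvm.options holds the JVM settings; adjust them to suit the available hardware. JVM tuning itself is not covered here. The following options enable remote JMX access: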

-Djava.rmi.server.hostname=192.168.1.152
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=9110
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false

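6. Tuning the ES thread pools for highly concurrent queries

When query concurrency goes up, you will sometimes see the following exception:

EsRejectedExecutionException[rejected execution(queue capacity 50) on.......]

The reason is that in newer versions of Elasticsearch the thread pools are of type fixed, that is, pools with a fixed number of threads (by default 5 * the number of cores). When all threads are busy and the queue is full, ES rejects the request.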
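
The rejection behaviour of a fixed pool with a bounded queue can be modelled without any real threads. The following Python sketch is only a counting model with made-up class names; it is not ES code, but it shows why the error appears once both the threads and the queue are exhausted.

```python
class RejectedExecution(Exception):
    """Raised when neither a thread nor a queue slot is free."""


class FixedPoolModel:
    """Counting model of a fixed-size pool with a bounded queue."""

    def __init__(self, size, queue_size):
        self.size = size
        self.queue_size = queue_size
        self.active = 0
        self.queued = 0

    def submit(self):
        if self.active < self.size:
            self.active += 1
            return "running"
        if self.queued < self.queue_size:
            self.queued += 1
            return "queued"
        raise RejectedExecution(
            f"rejected execution (queue capacity {self.queue_size})")

    def finish(self):
        """One running task completes; a queued task, if any, takes its thread."""
        if self.queued > 0:
            self.queued -= 1
        elif self.active > 0:
            self.active -= 1


if __name__ == "__main__":
    pool = FixedPoolModel(size=2, queue_size=1)
    for i in range(4):
        try:
            print(i, pool.submit())
        except RejectedExecution as exc:
            print(i, exc)
```

Raising either size or queue_size delays the rejection, but a larger queue only buffers requests; it does not make them complete faster.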

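Different request types are served by different thread pools:

index: used for index and delete operations. Its type defaults to fixed, its size to the number of available processors, and its queue size to 200.
search: used for search and count requests. Its type defaults to fixed, its size to ((number of available processors * 3) / 2) + 1, and its queue size to 1000.
suggest: used for suggester requests. Its type defaults to fixed, its size to the number of available processors, and its queue size to 1000.
get: used for real-time GET requests. Its type defaults to fixed, its size to the number of available processors, and its queue size to 1000.
bulk: used for bulk operations. Its type defaults to fixed, its size to the number of available processors, and its queue size to 50.
percolate: used for percolator operations. Its type defaults to fixed, its size to the number of available processors, and its queue size to 1000.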
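
As a quick way to see what these defaults mean on a given machine, the list above can be turned into a small lookup. The numbers are taken from this post, not queried from ES, and integer division is assumed for the search pool formula; the function name is hypothetical.

```python
def default_pool_sizes(processors):
    """Default thread pool sizes and queue sizes as listed in this post."""
    return {
        "index": {"size": processors, "queue_size": 200},
        # Integer division is an assumption about how the formula rounds.
        "search": {"size": (processors * 3) // 2 + 1, "queue_size": 1000},
        "suggest": {"size": processors, "queue_size": 1000},
        "get": {"size": processors, "queue_size": 1000},
        "bulk": {"size": processors, "queue_size": 50},
        "percolate": {"size": processors, "queue_size": 1000},
    }


if __name__ == "__main__":
    for name, cfg in default_pool_sizes(8).items():
        print(f"{name:10s} size={cfg['size']:3d} queue_size={cfg['queue_size']}")
```

The bulk pool's queue size of 50 matches the "queue capacity 50" in the exception quoted earlier.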

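Taking index as an example, the thread pool can be changed in elasticsearch.yml (these are the pre-5.0 setting names; from ES 5.0 on the prefix is thread_pool rather than threadpool):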

threadpool.index.type: fixed
threadpool.index.size: 100
threadpool.index.queue_size: 500

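Or through the API (only in versions before 5.0; from 5.0 on, thread pool settings are node-level and can no longer be updated through the cluster settings API):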

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
    "transient": {
        "threadpool.index.type": "fixed",
        "threadpool.index.size": 100,
        "threadpool.index.queue_size": 500
    }
}'

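7. Some replica shards fail to be allocated and the cluster status is yellow

7.1 Check the cluster status first

curl -XGET 'http://10.96.78.164:9200/_cluster/health?pretty'

The result looks like this. If any shards are unassigned, unassigned_shards is non-zero and status is yellow (the sample below was taken from a healthy cluster, so it shows green and 0).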

{
"cluster_name": "elasticsearch",
"status": "green",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 575,
"active_shards": 575,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 100
}
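
A minimal Python sketch of checking this response automatically is shown below. It only reads the status and unassigned_shards fields that appear in the output above; the function name and the shape of the returned summary are assumptions made for the example.

```python
import json


def summarize_health(health_json):
    """Extract the fields relevant to shard allocation from a health response."""
    health = json.loads(health_json)
    unassigned = health["unassigned_shards"]
    return {
        "status": health["status"],
        "unassigned_shards": unassigned,
        "needs_attention": health["status"] != "green" or unassigned > 0,
    }


if __name__ == "__main__":
    sample = '{"status": "yellow", "unassigned_shards": 4}'
    print(summarize_health(sample))
```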

7.2 Find out which index the unassigned shards belong to and which machine they are supposed to be allocated to.

curl -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED

Result

xiankan_xk_qdhj                  3 r UNASSIGNED    0    261b 10.96.78.164 yfbf9D3
xiankan_xk_qdhj                  2 r UNASSIGNED    0    261b 10.96.78.164 yfbf9D3
xiankan_xk_qdhj                  1 r UNASSIGNED    0    261b 10.96.78.164 yfbf9D3
xiankan_xk_qdhj                  4 r UNASSIGNED    0    261b 10.96.78.164 yfbf9D3

r marks a replica shard and p a primary shard; the IP is that of the target machine.
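
When there are many indices, the same filtering can be done in Python instead of grep, which also makes the index name and shard number available for later steps. The sketch below assumes the whitespace-separated column order seen in the output above (index, shard, primary/replica flag, state) and ignores the remaining columns; the function name is hypothetical.

```python
def unassigned_shards(cat_shards_output):
    """Return the unassigned shards found in _cat/shards text output."""
    result = []
    for line in cat_shards_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or truncated line
        index, shard, prirep, state = fields[:4]
        if state == "UNASSIGNED":
            result.append({"index": index,
                           "shard": int(shard),
                           "replica": prirep == "r"})
    return result


if __name__ == "__main__":
    sample = (
        "xiankan_xk_qdhj 3 r UNASSIGNED 0 261b 10.96.78.164 yfbf9D3\n"
        "xiankan_xk_qdhj 3 p STARTED    0 261b 10.96.78.164 yfbf9D3\n"
    )
    print(unassigned_shards(sample))
```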

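7.3 Attempt 1: reallocate replicas at the index level

For the problematic index, set its replica count to 0 and then restore it, so that the replicas are allocated again.

Disable

curl -XPUT http://35.1.4.129:9200/xiankan_xk_zjhj/_settings -d '{
  "index": {
    "number_of_replicas": 0
  }
}'

Enable

curl -XPUT http://10.96.78.164:9200/xiankan_xk_zjhj/_settings -d '{
  "index": {
    "number_of_replicas": 1
  }
}'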

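7.4 Attempt 2: reallocate replicas at the node level

Restart the node on which shard allocation fails. If the affected shards sit on only a few nodes, restart the ES instance on each of those nodes, identified by IP.

Kill ES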

ps -ef | grep elasticsearch | grep -v grep | awk '{print $2}' | xargs kill -9

Start ES

./bin/elasticsearch -d

7.5 Attempt 3: reroute the shards of each index one by one

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "xiankan_xk_zjhj",
"shard" : 1,
"node" : "yfbf9D3",
"allow_primary" : true
}
}
] }'
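
When several shards of one index need the same treatment, the request body can be generated rather than typed by hand. The Python sketch below builds a body with the same allocate command used above; the function name is hypothetical. Two assumptions are worth stating: allow_primary defaults to false here because forcing a primary can lose data (the command above passes true), and from ES 5.0 on the allocate command was replaced by allocate_replica, allocate_stale_primary and allocate_empty_primary, so this shape applies to older versions only.

```python
import json


def reroute_body(index, shards, node, allow_primary=False):
    """Build a _cluster/reroute body with one allocate command per shard."""
    commands = [
        {"allocate": {"index": index,
                      "shard": shard,
                      "node": node,
                      "allow_primary": allow_primary}}
        for shard in shards
    ]
    return json.dumps({"commands": commands}, indent=2)


if __name__ == "__main__":
    # Index and node names are the ones used in the examples above.
    print(reroute_body("xiankan_xk_zjhj", [1, 2, 3, 4], "yfbf9D3"))
```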
