ELK Tips (Issue 3)

Date: 2019-11-06

Tags: elk, tips


ELK Tips collects small tricks picked up while working with the ELK stack; most of the content comes from the Elastic Chinese community.

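I. Logstash

1. Configuring multiple outputs in Filebeat

Before 6.0, Filebeat could be configured with several outputs, provided they were of different types. Starting with 6.0, multiple outputs are no longer allowed and only one output can be defined. To send events to several destinations, route them through an intermediate component such as Logstash and let it distribute the output.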

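II. Elasticsearch

1. The ES process uses more memory than the configured heap size

ES is a Java application whose underlying storage engine is built on Lucene. The heap size setting only limits the memory of the Java heap. Lucene builds its inverted index in memory first and then periodically flushes it to disk as segment files, so Lucene takes up some memory of its own as well.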

elasticsearch.cn/article/32

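2. Writing data through an alias

ES can write documents to an index through an alias, but only one index behind the alias can accept writes at a time, and that write index has to be designated manually. If the alias points to several indices and none is designated, the write fails with: no write index is defined for alias [xxxx]. The write index may be explicitly disabled using is_write_index=false or the alias points to multiple indices without one being designated as a write index. The following request marks test as the write index for alias1: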

POST /_aliases
{
    "actions" : [
        {
            "add" : {
                 "index" : "test",
                 "alias" : "alias1",
                 "is_write_index" : true
            }
        }
    ]
}
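When an alias covers several indices, exactly one of them should carry is_write_index set to true. The sketch below builds such an _aliases request body in Python; the function name and the second index name are hypothetical, and sending the body to the cluster is left out.

```python
import json


def build_write_index_actions(alias, indices, write_index):
    """Build an _aliases body that marks exactly one index as the write index.

    Hypothetical helper for illustration; it only builds the JSON body.
    """
    if write_index not in indices:
        raise ValueError("write_index must be one of the aliased indices")
    actions = []
    for index in indices:
        actions.append({
            "add": {
                "index": index,
                "alias": alias,
                # Only the chosen index accepts writes through the alias.
                "is_write_index": index == write_index,
            }
        })
    return {"actions": actions}


if __name__ == "__main__":
    # "test2" is an assumed second index behind the same alias.
    body = build_write_index_actions("alias1", ["test", "test2"], "test")
    print(json.dumps(body, indent=4))
```

The printed body can be sent with POST /_aliases in the same way as the request above.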

3. Enabling the G1 garbage collector for ES

Edit the jvm.options file and replace the following lines:

-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly

with:

-XX:+UseG1GC
-XX:MaxGCPauseMillis=50

That is all that is required.

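-XX:MaxGCPauseMillis sets the target maximum GC pause time; its default is 200 ms. If the production workload is very sensitive to GC pauses, it can be lowered somewhat. Setting it too low, however, can lead to noticeably higher CPU usage.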
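The manual edit can also be scripted. The following is a minimal sketch that rewrites the text of a jvm.options file, assuming the CMS flags appear as plain, unprefixed lines exactly as shown above; the function name is hypothetical, and reading or writing the actual file is left to the caller.

```python
CMS_FLAGS = (
    "-XX:+UseConcMarkSweepGC",
    "-XX:CMSInitiatingOccupancyFraction",
    "-XX:+UseCMSInitiatingOccupancyOnly",
)


def switch_to_g1(options_text, max_pause_ms=50):
    """Replace the CMS flags in jvm.options text with G1 flags.

    The G1 flags are inserted where the first CMS flag was found.
    If no CMS flag is present, the text is returned without G1 flags.
    """
    result = []
    inserted = False
    for line in options_text.splitlines():
        if line.strip().startswith(CMS_FLAGS):
            if not inserted:
                result.append("-XX:+UseG1GC")
                result.append(f"-XX:MaxGCPauseMillis={max_pause_ms}")
                inserted = True
            continue  # drop the CMS flag itself
        result.append(line)
    return "\n".join(result) + "\n"


if __name__ == "__main__":
    sample = (
        "-Xms1g\n"
        "-XX:+UseConcMarkSweepGC\n"
        "-XX:CMSInitiatingOccupancyFraction=75\n"
        "-XX:+UseCMSInitiatingOccupancyOnly\n"
        "-Xmx1g\n"
    )
    print(switch_to_g1(sample), end="")
```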

4. Setting credentials when integrating ES with Zipkin

java -DKAFKA_ZOOKEEPER=10.14.123.117:2181 \
  -DSTORAGE_TYPE=elasticsearch \
  -DES_HOSTS=http://10.14.125.5:9200 \
  -DES_USERNAME=xxx -DES_PASSWORD=xxx \
  -jar zipkin.jar

5. Errors when deploying an ES cluster

Problem 1. The error message is:

Received message from unsupported version:[2.0.0] minimal compatible version is:[5.6.0]

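Investigation showed that the cluster contained an ES instance running an older version; removing that instance resolves the error.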

Problem 2. The error message is:

with the same id but is a different node instance

Delete the node data under the data folder inside the affected node's elasticsearch folder.

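6. Hylanda Chinese word-segmentation plugin

Hylanda segmentation is a Chinese word-segmentation engine developed in-house by Tianjin Hylanda Information Technology Co., Ltd. (天津海量信息技術股份有限公司). In testing the segmentation quality was quite good, and it is worth a try.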

github.com/HylandaOpen…

7. Listing all type names in an index

The API below returns every type. In this example, doc is a type under the indexName index:

GET http://127.0.0.1:9200/indexName/_mappings
-----------------------------------------------
{
    "indexName": {
        "mappings": {
            "doc": {
                "_all": {...},
                "dynamic_date_formats": [...],
                "dynamic_templates": [...],
                "properties": {...}
            }
        }
    }
}
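Once the response is parsed, the type names are simply the keys under mappings. A minimal sketch, assuming the response layout shown above (index name, then mappings, then one key per type); the function name and the sample response are hypothetical.

```python
def type_names(mappings_response, index_name):
    """Return the sorted type names found in a parsed _mappings response."""
    return sorted(mappings_response[index_name]["mappings"].keys())


if __name__ == "__main__":
    # Hand-written sample in the same shape as the response above.
    response = {
        "indexName": {
            "mappings": {
                "doc": {"properties": {"title": {"type": "text"}}},
            }
        }
    }
    print(type_names(response, "indexName"))
```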

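8. Setting aliases based on a field value in an index template

When defining an index template, an alias can carry a query filter so that it only matches documents with a given field value.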

PUT _template/template_1
{
    "index_patterns" : ["te*"],
    "settings" : {
        "number_of_shards" : 1
    },
    "aliases" : {
        "alias2" : {
            "filter" : {
                "term" : {"user" : "kimchy" }
            },
            "routing" : "kimchy"
        },
        "{index}-alias" : {} 
    }
}

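9. Setting default date formats in an index template

By default ES does not recognize yyyy-MM-dd HH:mm:ss as a date. The following index template setting enables detection of several date formats:

"mappings": {
  "doc": {
    "dynamic_date_formats": ["yyyy-MM-dd HH:mm:ss||strict_date_optional_time||epoch_millis"]
  }
}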

10. Merge-related settings in ES

Merging is a very CPU-intensive operation. If the data is not on SSDs, setting index.merge.scheduler.max_thread_count to 1 is recommended. Otherwise ES starts Math.max(1, Math.min(4, Runtime.getRuntime().availableProcessors() / 2)) threads for merging, which is more disk I/O than most spinning disks can sustain and can lead to stalls.

"index": {
  "refresh_interval": "5s",
  "number_of_shards": "3",
  "max_result_window": 10000,
  "translog": {
    "flush_threshold_size": "500mb",
    "sync_interval": "30s",
    "durability": "async"
  },
  "merge": {
    "scheduler": {
      "max_merge_count": "100",
      "max_thread_count": "1"
    }
  }
}
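To get a feel for how many merge threads a node would use by default, the sketch below evaluates the default given in the reference documentation for recent Elasticsearch releases, max(1, min(4, processors / 2)). The function name is hypothetical, and integer division is assumed in order to mirror Java's int arithmetic.

```python
import os


def default_merge_threads(processors):
    """Default max_thread_count for a node with the given processor count."""
    return max(1, min(4, processors // 2))


if __name__ == "__main__":
    for cores in (1, 2, 4, 8, 32):
        print(cores, "cores ->", default_merge_threads(cores), "merge threads")
    # os.cpu_count() may return None, so fall back to 1.
    print("this machine ->", default_merge_threads(os.cpu_count() or 1))
```

On a machine with eight or more cores this yields four concurrent merge threads per shard, which is why lowering the value to 1 matters on spinning disks.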

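11. The enabled, store, and index mapping parameters

enabled: defaults to true and applies only to object fields in a mapping. When set to false, ES does not parse the field, and the field can neither be queried nor stored; it is only visible in _source. With enabled set to false the field type can be omitted, in which case it defaults to object.
store: defaults to false. store is somewhat similar to _source: all data is kept in _source by default, but individual fields can also be stored separately. When copy_to is used, the target field is not kept in _source, and this is where store becomes useful.
index: defaults to true. When set to false, the field cannot be queried, and querying it returns an error.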
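The three parameters can be seen side by side in a small mapping. The sketch below builds one as a Python dictionary; all field names are hypothetical, and the body is written without a type level, so on versions that still use mapping types it would sit under the type name.

```python
import json


def build_example_mapping():
    """Build a mapping body that uses enabled, store, and index."""
    return {
        "properties": {
            # enabled=false: kept in _source only, never parsed or queryable.
            "raw_payload": {"type": "object", "enabled": False},
            # copy_to targets are not part of _source, so store the target.
            "first_name": {"type": "text", "copy_to": "full_name"},
            "last_name": {"type": "text", "copy_to": "full_name"},
            "full_name": {"type": "text", "store": True},
            # index=false: visible in _source but cannot be queried.
            "internal_code": {"type": "keyword", "index": False},
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_example_mapping(), indent=2))
```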

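12. Image search with ES

This can be implemented with locality-sensitive hashing (LSH) or pHash: stackoverflow.com/questions/3…
There is also an open-source project on GitHub that combines several hash algorithms with ES to implement image search: github.com/usc-isi-i2/…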
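To illustrate the general idea behind hash-based image matching, the sketch below computes a simple average hash over a small grayscale pixel grid and compares hashes by Hamming distance. This is a deliberately simpler stand-in for pHash or LSH, not the method used by the linked projects; image decoding and resizing are assumed to happen elsewhere, and the function names are hypothetical.

```python
def average_hash(pixels):
    """Return a bit string: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count differing bits between two equal-length hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


if __name__ == "__main__":
    original = [[10, 200], [220, 30]]
    brighter = [[20, 210], [230, 40]]   # same picture, slightly brighter
    inverted = [[200, 10], [30, 220]]   # very different picture
    h = average_hash(original)
    print(h, hamming_distance(h, average_hash(brighter)))
    print(h, hamming_distance(h, average_hash(inverted)))
```

A small distance suggests similar images. The hash of each image could be indexed as a field in ES, with the comparison strategy chosen to suit the hash in use.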

13. Sorting a terms aggregation by a sub-aggregation result

GET /_search
{
    "aggs" : {
        "genres" : {
            "terms" : {
                "field" : "genre",
                "order" : { "playback_stats.max" : "desc" }
            },
            "aggs" : {
                "playback_stats" : { "stats" : { "field" : "play_count" } }
            }
        }
    }
}
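The effect of ordering by playback_stats.max can be reproduced locally on a handful of documents. The sketch below is a plain Python approximation of that ordering, not a call to ES; the sample documents and the function name are hypothetical.

```python
def genres_by_max_play_count(docs):
    """Group docs by genre and order genres by their maximum play_count."""
    max_by_genre = {}
    for doc in docs:
        genre = doc["genre"]
        count = doc["play_count"]
        if genre not in max_by_genre or count > max_by_genre[genre]:
            max_by_genre[genre] = count
    # Descending by the per-genre maximum, like "order": {"...max": "desc"}.
    return sorted(max_by_genre.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    docs = [
        {"genre": "rock", "play_count": 5},
        {"genre": "jazz", "play_count": 9},
        {"genre": "rock", "play_count": 7},
    ]
    print(genres_by_max_play_count(docs))
```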