Elasticsearch Configuration Guide

Date: 2019-12-14

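The config folder of Elasticsearch contains two configuration files: elasticsearch.yml and logging.yml. The first is the main ES configuration file and the second configures logging. ES logs through log4j, so the settings in logging.yml follow ordinary log4j configuration conventions.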

The rest of this article explains what can be configured in elasticsearch.yml.

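Sets the cluster name; the default is elasticsearch. ES automatically discovers other ES nodes on the same network segment, so if several clusters share one segment, use this setting to tell them apart.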

cluster.name: elasticsearch

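Sets the node name. By default a name is picked at random from a list kept in the name.txt file in the config folder of the ES jar, which contains many amusing names added by the authors.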

node.name: "Franz Kafka"

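Specifies whether this node is eligible to be elected master; the default is true. By default the first machine in the cluster becomes master, and if that machine goes down a new master is elected.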

node.master: true

Specifies whether this node stores index data; the default is true.

node.data: true

Sets the default number of shards per index; the default is 5.

index.number_of_shards: 5

Sets the default number of replicas per index; the default is 1.

index.number_of_replicas: 1
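
To make the effect of the two settings above concrete, here is a minimal Python sketch that counts how many shard copies one index produces. The function name `total_shard_copies` is made up for illustration, and the formula assumes every primary shard gets the full number of replicas.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Return primaries plus all replica copies for a single index.

    Assumption: each primary shard has exactly `number_of_replicas` replicas.
    """
    if number_of_shards < 1 or number_of_replicas < 0:
        raise ValueError("need at least 1 shard and a non-negative replica count")
    return number_of_shards * (1 + number_of_replicas)


# With the defaults described above (5 shards, 1 replica):
print(total_shard_copies(5, 1))  # 10
```

With the defaults, one index is spread over 10 shard copies, which is worth keeping in mind when sizing a small cluster.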

Sets the path where configuration files are stored; the default is the config folder under the ES root directory.

path.conf: /path/to/conf

Sets the path where index data is stored; the default is the data folder under the ES root directory.

path.data: /path/to/data

Multiple storage paths can be given, separated by commas, for example:

path.data: /path/to/data1,/path/to/data2

Sets the path where temporary files are stored; the default is the work folder under the ES root directory.

path.work: /path/to/work

Sets the path where log files are stored; the default is the logs folder under the ES root directory.

path.logs: /path/to/logs

Sets the path where plugins are stored; the default is the plugins folder under the ES root directory.

path.plugins: /path/to/plugins

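Locks all of the process memory so that swapping cannot hurt performance.
Set this to true to lock the memory. ES becomes less efficient once the JVM starts swapping, so make sure it never swaps: set the ES_MIN_MEM and ES_MAX_MEM environment variables to the same value and make sure the machine has enough memory to give to ES. The elasticsearch process must also be allowed to lock memory; on Linux this can be done with `ulimit -l unlimited`.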

bootstrap.mlockall: true
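
Before enabling the setting above, it can help to confirm that the memory-lock limit really is unlimited for the current user. The following is a minimal sketch using Python's standard `resource` module (Unix only); the helper name `memlock_is_unlimited` is an illustration, and the check only reflects the shell it is run from, not necessarily the environment the ES process starts in.

```python
import resource


def memlock_is_unlimited(soft_limit: int) -> bool:
    """True when the given RLIMIT_MEMLOCK soft limit means 'unlimited'."""
    return soft_limit == resource.RLIM_INFINITY


# Read the limit that `ulimit -l` reports for the current process.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print("memlock soft limit unlimited:", memlock_is_unlimited(soft))
```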

Sets the IP address to bind to, either IPv4 or IPv6; the default is 0.0.0.0.

network.bind_host: 192.168.0.1

Sets the IP address other nodes use to communicate with this node. If unset it is determined automatically; the value must be a real IP address.

network.publish_host: 192.168.0.1

Sets both of the parameters above, bind_host and publish_host, at once.

network.host: 192.168.0.1

Sets the TCP port for communication between nodes; the default is 9300.

transport.tcp.port: 9300

Sets whether data sent over TCP transport is compressed; the default is false (no compression).

transport.tcp.compress: true

Sets the HTTP port for external clients; the default is 9200.

http.port: 9200

Sets the maximum content size; the default is 100mb.

http.max_content_length: 100mb

Whether to serve external clients over HTTP; the default is true (enabled).

http.enabled: false

Network settings

#network.tcp.keep_alive : true
#network.tcp.send_buffer_size : 8192
#network.tcp.receive_buffer_size : 8192

Discovery settings

#discovery.zen.fd.connect_on_network_disconnect : true
#discovery.zen.initial_ping_timeout : 10s
#discovery.zen.fd.ping_interval : 2s
#discovery.zen.fd.ping_retries : 10

The gateway snapshot interval (only applies to shared gateways).

#index.gateway.snapshot_interval : 1s

Interval between asynchronous shard refreshes.

#index.refresh_interval : -1

Set to an actual value (like 0-all) or false to disable it.

index.auto_expand_replicas

Set to true to have the index read only. false to allow writes and metadata changes.

index.blocks.read_only

Set to true to disable read operations against the index.

index.blocks.read

Set to true to disable write operations against the index.

index.blocks.write

Set to true to disable metadata operations against the index.

index.blocks.metadata

Lucene index term interval; applies only to newly created docs.

index.term_index_interval

Lucene reader term index divisor

index.term_index_divisor

When to flush based on operations.

index.translog.flush_threshold_ops

When to flush based on translog (bytes) size.

index.translog.flush_threshold_size

When to flush based on a period of not flushing.

index.translog.flush_threshold_period

Disables flushing. Note, should be set for a short interval and then enabled.

index.translog.disable_flush
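
The four translog settings above can be read as one decision: flush when any threshold is reached, unless flushing is disabled. The sketch below models that reading in Python. It is a simplified illustration, not the ES implementation; the function name, the parameter names, and the use of "greater than or equal" comparisons are assumptions, and the threshold values in the usage line are arbitrary.

```python
def should_flush(ops: int, size_bytes: int, seconds_since_flush: float, *,
                 threshold_ops: int, threshold_size_bytes: int,
                 threshold_period_seconds: float,
                 disable_flush: bool = False) -> bool:
    """Decide whether a translog flush is due under the assumed rules."""
    if disable_flush:
        # Mirrors index.translog.disable_flush: nothing triggers a flush.
        return False
    return (ops >= threshold_ops                      # flush_threshold_ops
            or size_bytes >= threshold_size_bytes     # flush_threshold_size
            or seconds_since_flush >= threshold_period_seconds)  # flush_threshold_period


# Hypothetical thresholds chosen only for the example.
limits = dict(threshold_ops=5000, threshold_size_bytes=200 * 1024 * 1024,
              threshold_period_seconds=1800)
print(should_flush(6000, 1024, 10, **limits))
```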

The maximum size of filter cache (per segment in shard). Set to -1 to disable.

index.cache.filter.max_size

The expire after access time for filter cache. Set to -1 to disable.

index.cache.filter.expire

merge policy
All the settings for the merge policy currently configured. A different merge policy can’t be set.

A node matching any rule will be allowed to host shards from the index.

index.routing.allocation.include.*

A node matching any rule will NOT be allowed to host shards from the index.

index.routing.allocation.exclude.*

Only nodes matching all rules will be allowed to host shards from the index.

index.routing.allocation.require.*
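
The include, exclude and require rules above combine into a single yes/no decision per node. Here is a minimal Python sketch of that logic as the descriptions state it. The function `node_allowed`, the attribute names `tag` and `zone`, and the exact-match comparison are assumptions made for illustration; real rule values may support richer matching.

```python
from typing import Mapping, Optional, Set

Rules = Mapping[str, Set[str]]  # attribute name -> accepted values


def node_allowed(node_attrs: Mapping[str, str],
                 include: Optional[Rules] = None,
                 exclude: Optional[Rules] = None,
                 require: Optional[Rules] = None) -> bool:
    """Return True if a node may host shards of the index."""
    include, exclude, require = include or {}, exclude or {}, require or {}

    def matches(attr: str, accepted: Set[str]) -> bool:
        return node_attrs.get(attr) in accepted

    if any(matches(a, v) for a, v in exclude.items()):
        return False   # matching any exclude rule forbids the node
    if not all(matches(a, v) for a, v in require.items()):
        return False   # every require rule must match
    if include and not any(matches(a, v) for a, v in include.items()):
        return False   # at least one include rule must match
    return True


node = {"tag": "hot", "zone": "a"}
print(node_allowed(node, include={"tag": {"hot", "warm"}}, exclude={"zone": {"b"}}))
```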

Controls the total number of shards allowed to be allocated on a single node. Defaults to unbounded (-1).

index.routing.allocation.total_shards_per_node

When using local gateway a particular shard is recovered only if there can be allocated quorum shards in the cluster. It can be set to quorum (default), quorum-1 (or half), full and full-1. Number values are also supported, e.g. 1.

index.recovery.initial_shards

Disables temporarily the purge of expired docs.

index.ttl.disable_purge

Default index merge factor

#index.merge.policy.merge_factor : 100
#index.merge.policy.min_merge_docs : 1000
#index.merge.policy.use_compound_file : true
#indices.memory.index_buffer_size : 5%

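Gateway settings
When the cluster cannot reach its expected number of nodes, it stays blocked and cannot index or search normally. This indicates that some node failed to start properly, and blocking is exactly the desired behavior because it avoids data inconsistency.
Sets the gateway type. The default, local, is the local file system; the options are the local file system, a distributed file system, Hadoop HDFS, and Amazon S3. Configuring the other file systems is left for a later article.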

gateway.type: local

Starts data recovery once N nodes in the cluster have started; the default is 1.

gateway.recover_after_nodes: 1

Sets the timeout for starting the initial data recovery; the default is 5 minutes.

gateway.recover_after_time: 5m

Sets the number of nodes expected in the cluster; the default is 2. As soon as these N nodes are up, data recovery starts immediately.

gateway.expected_nodes: 2
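
The three gateway recovery settings above interact, which is easy to misread. The sketch below is a simplified Python model of one plausible reading: recovery starts at once when the expected node count is reached, and otherwise only after the minimum node count is met and the timeout has passed. The function and parameter names are made up for illustration, and the defaults mirror the values quoted in the text.

```python
def should_start_recovery(nodes_up: int, seconds_waited: float,
                          recover_after_nodes: int = 1,
                          recover_after_seconds: float = 300,
                          expected_nodes: int = 2) -> bool:
    """Simplified model of when gateway recovery may begin."""
    if nodes_up >= expected_nodes:
        return True    # all expected nodes are up: recover immediately
    if nodes_up < recover_after_nodes:
        return False   # too few nodes to recover at all
    # Enough nodes to recover, but not all expected: wait out the timeout.
    return seconds_waited >= recover_after_seconds


print(should_start_recovery(nodes_up=1, seconds_waited=30))
```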

Number of concurrent recovery threads during initial data recovery; the default is 4.

cluster.routing.allocation.node_initial_primaries_recoveries: 4

Number of concurrent recovery threads when nodes are added or removed or during rebalancing; the default is 2.

cluster.routing.allocation.node_concurrent_recoveries: 2

Limits the bandwidth used during data recovery, for example 100mb; the default is 0, meaning unlimited.

indices.recovery.max_size_per_sec: 0

Limits the maximum number of concurrent streams opened when recovering data from other shards; the default is 5.

indices.recovery.concurrent_streams: 5

Ensures that a node in the cluster can see N other master-eligible nodes. The default is 1; for larger clusters a higher value (2-4) can be used.

discovery.zen.minimum_master_nodes: 1
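
The text above only suggests a range of 2-4 for larger clusters. A commonly used rule of thumb, which is not stated in the original article, is to require a majority of the master-eligible nodes so that two halves of a partitioned cluster cannot both elect a master. A minimal Python sketch of that rule, with a hypothetical helper name:

```python
def majority_quorum(master_eligible_nodes: int) -> int:
    """Smallest node count that is a strict majority of the eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 3, 5):
    print(n, "master-eligible nodes -> minimum_master_nodes", majority_quorum(n))
```

For three master-eligible nodes this gives 2, which falls inside the range the article recommends.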

Sets the ping timeout used when automatically discovering other nodes; the default is 3 seconds. On poor networks a higher value helps prevent discovery errors.

discovery.zen.ping.timeout: 3s

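Sets whether multicast node discovery is enabled; the default is true.

discovery.zen.ping.multicast.enabled: false

When multicast is disabled, the IP addresses of the cluster's nodes can be listed manually. The following sets the initial list of master nodes in the cluster, through which nodes newly joining the cluster are discovered.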

discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]
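
The host list above mixes three forms: a bare host, host:port, and a host with a bracketed port range. To show what such a list stands for, here is a small Python sketch that expands each entry into explicit host:port pairs. It rests on assumptions: that the bracket form means an inclusive port range, and that a bare host uses the transport port 9300 mentioned earlier. The function name `expand_host` is hypothetical and IPv6 literals are not handled.

```python
import re

# host[lo-hi], e.g. "host3[9300-9302]"
_RANGE = re.compile(r"^(?P<host>[^\[\]:]+)\[(?P<lo>\d+)-(?P<hi>\d+)\]$")


def expand_host(spec: str, default_port: int = 9300) -> list:
    """Expand one unicast host entry into a list of 'host:port' strings."""
    m = _RANGE.match(spec)
    if m:
        lo, hi = int(m["lo"]), int(m["hi"])
        if lo > hi:
            raise ValueError("port range is reversed: " + spec)
        return ["{}:{}".format(m["host"], port) for port in range(lo, hi + 1)]
    if ":" in spec:
        return [spec]                       # already host:port
    return ["{}:{}".format(spec, default_port)]  # assumed default port


for entry in ["host1", "host2:9301", "host3[9300-9302]"]:
    print(entry, "->", expand_host(entry))
```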

Below are some slow log settings for queries.

index.search.slowlog.level: TRACE
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 500ms
index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.fetch.info: 800ms
index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.trace: 200ms
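
To illustrate how the threshold ladder above sorts queries into levels, the following Python sketch maps a query duration to the most severe level whose threshold it reaches. The helper names are made up, the table copies the query thresholds listed above, and treating a duration equal to the threshold as a match is an assumption.

```python
import re

_UNIT_MS = {"ms": 1, "s": 1000, "m": 60000}

# Most severe level first, values taken from the query thresholds above.
QUERY_THRESHOLDS = [("warn", "10s"), ("info", "5s"), ("debug", "2s"), ("trace", "500ms")]


def to_millis(text: str) -> int:
    """Parse values such as '500ms', '2s' or '10m' into milliseconds."""
    m = re.fullmatch(r"(\d+)(ms|s|m)", text.strip())
    if not m:
        raise ValueError("unsupported duration: " + text)
    return int(m.group(1)) * _UNIT_MS[m.group(2)]


def slowlog_level(took_ms: int, thresholds=QUERY_THRESHOLDS):
    """Return the slow log level for a query, or None if it is fast enough."""
    for level, limit in thresholds:
        if took_ms >= to_millis(limit):
            return level
    return None


print(slowlog_level(2500))  # debug
```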

1. Set the cache size and expiry time.

index.cache.field.max_size
index.cache.field.expire

For example, to cap the number of entries each segment in the index may hold:

index.cache.field.max_size: 50000

To expire entries after 10 minutes:

index.cache.field.expire: 10m

2. Change the cache type.

index.cache.field.type: soft

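The default type is resident: entries stay in memory and the cache keeps growing until memory is exhausted. With soft, entries already held are cleared when memory runs low and new ones are then loaded, so the cache size becomes relative to the available memory. Use resident only if memory is large enough.

3. Preprocess the data. The referenced article suggests shortening field values, for example converting uppercase to lowercase; in practice the data could be condensed further. Fields used for facets can also be converted to int values. For string-to-int conversion, see medcl's plugin: https://github.com/medcl/elasticsearch-analysis-string2int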
