Elasticsearch provides a REST endpoint for checking the health of a cluster:
curl -XGET http://localhost:9200/_cluster/health
|
Response:
{
"cluster_name": "format-es",
"status": "green",
...
}
|
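The status field takes one of three values: green (all primary shards and all replica shards are available), yellow (all primary shards are available, but not all replica shards are), and red (not all primary shards are available).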
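The three status values can be restated as a small decision function. This is only a sketch of the rule as described above; the function name `cluster_status` and its parameters are made up for illustration and are not part of any Elasticsearch API.

```python
def cluster_status(primaries_total, primaries_active, replicas_total, replicas_active):
    """Classify cluster health following the green/yellow/red rule in the text."""
    if primaries_active < primaries_total:
        return "red"      # at least one primary shard is unavailable
    if replicas_active < replicas_total:
        return "yellow"   # all primaries are up, some replicas are not
    return "green"        # every primary and every replica is available


print(cluster_status(3, 3, 3, 3))
print(cluster_status(3, 3, 3, 0))
print(cluster_status(3, 2, 3, 0))
```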
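An index in Elasticsearch is made up of shards.
For example, suppose our cluster has an index named users that consists of 3 shards; the documents in users are then distributed across those 3 shards.
The shard a document in the users index belongs to is determined by the following rule: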
shard = hash(routing) % number_of_primary_shards
// routing defaults to the document's _id; number_of_primary_shards is the number of primary shards in the index
|
The routing value defaults to the document's _id and can be customized (an example appears later in this article).
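The routing rule can be tried out locally with a few lines of Python. This is a minimal sketch: `pick_shard` is a hypothetical helper, and CRC32 is only a stand-in for the hash function Elasticsearch uses internally, so the shard numbers it prints will not match a real cluster.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """shard = hash(routing) % number_of_primary_shards, with CRC32 as a stand-in hash."""
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


# Default routing: the document's _id decides the shard.
doc_id = "AV0hs4LnkXxVJ5DURwXr"
print("by _id:", pick_shard(doc_id, 3))

# Custom routing: every document written with routing=1 maps to the same shard.
shards = {pick_shard("1", 3) for _ in range(10)}
print("routing=1 maps to shard(s):", shards)
```

Because the result depends on number_of_primary_shards, the same routing value could map to a different shard if the number of primary shards changed, which is why the formula is tied to the count fixed at index creation.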
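These 3 shards can be replicated for fault tolerance. With 1 replica, for example, 6 shards are needed in total (3 primary shards plus the 3 replica shards copied from them).
The command that creates the users index (3 primary shards, 1 replica):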
curl -XPUT http://localhost:9200/users -d '
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1
}
}
'
|
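After the users index is created, the shard layout of the es cluster (single node) is as follows:
Because the users index has 3 shards, es creates 3 shards, P0, P1, and P2 (the capital P stands for primary), all of which are primary shards. The users index asks for 1 replica, so each of these 3 primary shards needs one copy: the replica shards R0, R1, and R2 (the capital R stands for replica). At this point the cluster has only 1 node, node-1, so the replica shards are not in use. A replica on the same node as its primary would be pointless: replicas exist for fault tolerance, so that when one node goes down, the shards on another node can take over for the shards on the failed node.
Check the health status: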
curl -XGET http://localhost:9200/_cluster/health
|
Response:
{
"cluster_name": "format-es",
"status": "yellow",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 3,
"active_shards": 3,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 3,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 50
}
|
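As shown here, the cluster status has changed to yellow. The users index needs 1 replica of each shard, but there is no other machine to hold the replica shards. Other fields reflect the same situation: unassigned_shards is 3, which corresponds to the 3 unassigned replica shards R0, R1, and R2.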
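The numbers in the health response can be reproduced with some simple arithmetic. The sketch below is a simplification under stated assumptions: `shard_accounting` is a hypothetical helper, all nodes are healthy data nodes with enough capacity, and the only allocation rule modeled is that two copies of the same shard never share a node.

```python
def shard_accounting(primaries, replicas, data_nodes):
    """Estimate active/unassigned shard counts for one index."""
    total = primaries * (1 + replicas)
    # Copies of the same shard cannot live on the same node,
    # so each shard has at most `data_nodes` assigned copies.
    copies_per_shard = min(1 + replicas, data_nodes)
    active = primaries * copies_per_shard
    return {
        "active_shards": active,
        "unassigned_shards": total - active,
        "active_shards_percent_as_number": 100.0 * active / total,
    }


print(shard_accounting(primaries=3, replicas=1, data_nodes=1))
print(shard_accounting(primaries=3, replicas=1, data_nodes=2))
```

With one node this gives 3 active and 3 unassigned shards (50 percent), matching the yellow response above; with two nodes all 6 copies can be assigned.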
Add node node-2 to the cluster and check the health again (a pseudo-cluster is used here: node-1 is the process on port 9200 and node-2 is the process on port 9201):
curl -XGET http://localhost:9200/_cluster/health
|
Response:
{
"cluster_name": "format-es",
"status": "green",
"timed_out": false,
"number_of_nodes": 2,
"number_of_data_nodes": 2,
"active_primary_shards": 3,
"active_shards": 6,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 100
}
|
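Both the primary shards and the replica shards are available, so status is green.
At this point the shard layout of the es cluster is as follows:
The es cluster now consists of 2 nodes, node-1 and node-2, both of which hold primary and replica shards, so the cluster is fault tolerant.
Let's insert a document into the users index: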
curl -XPOST http://localhost:9200/users/normal -d '
{
"name" : "Format",
"age" : 111
}
'
|
Response:
{
"_index": "users",
"_type": "normal",
"_id": "AV0hs4LnkXxVJ5DURwXr",
"_version": 1,
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"created": true
}
|
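The response shows that the document was created successfully and that the write succeeded on both shard copies. The id was generated by es automatically; its value is AV0hs4LnkXxVJ5DURwXr.
Read the document with id AV0hs4LnkXxVJ5DURwXr: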
curl -XGET http://localhost:9200/users/normal/AV0hs4LnkXxVJ5DURwXr
# Result
{
"_index": "users",
"_type": "normal",
"_id": "AV0hs4LnkXxVJ5DURwXr",
"_version": 1,
"found": true,
"_source": {
"name": "Format",
"age": 111
}
}
|
Now suppose node-1 goes down, and we read the data through node-2:
curl -XGET http://localhost:9201/users/normal/AV0hs4LnkXxVJ5DURwXr
# Result
{
"_index": "users",
"_type": "normal",
"_id": "AV0hs4LnkXxVJ5DURwXr",
"_version": 1,
"found": true,
"_source": {
"name": "Format",
"age": 111
}
}
|
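Even with node-1 down, the previously inserted document can still be read. Every shard of the users index has 1 replica, so each document exists in 2 copies: although node-1 is down, the document's data is still on node-2, so it can be read.
With node-1 still down, insert another document: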
curl -XPOST http://localhost:9201/users/normal -d '
{
"name" : "Jim",
"age" : 66
}
'
|
Response:
{
"_index": "users",
"_type": "normal",
"_id": "AV0qMto5dJHprgu99sSN",
"_version": 1,
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": true
}
|
The response shows that the write succeeded on only 1 shard copy, because the node that hosts the other copy is down.
Then read the newly inserted document:
curl -XGET http://localhost:9201/users/normal/AV0qMto5dJHprgu99sSN
# Result
{
"_index": "users",
"_type": "normal",
"_id": "AV0qMto5dJHprgu99sSN",
"_version": 1,
"found": true,
"_source": {
"name": "Jim",
"age": 66
}
}
|
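Then node-1 recovers (after a node recovers, es automatically copies data from the shard copies that are complete to the copies that are behind, which preserves high availability), and we read the data through it: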
curl -XGET http://localhost:9200/users/normal/AV0qMto5dJHprgu99sSN
# Result
{
"_index": "users",
"_type": "normal",
"_id": "AV0qMto5dJHprgu99sSN",
"_version": 1,
"found": true,
"_source": {
"name": "Jim",
"age": 66
}
}
|
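In ES, creating, deleting, and modifying a document are all carried out on the primary shard first; replication happens only after the primary shard has completed the operation. For example, with 3 nodes node-1, node-2, and node-3 and an index blogs that has 2 primary shards and 2 replicas, the cluster layout is as follows:
Creating a document proceeds as follows:
Retrieving a document proceeds as follows:
Here the es cluster uses a round-robin strategy to read document data from shard copies on different nodes. For the query in the figure above, for example, the next query would read the document from the R0 shard on node-3.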
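The round-robin idea can be illustrated with a tiny rotation over the copies of one shard. This is a sketch only: `RoundRobinReader` is a hypothetical class, and the placement of the copies of shard 0 on the three nodes is assumed for illustration rather than taken from the cluster above.

```python
from itertools import cycle


class RoundRobinReader:
    """Hand out the copies of one shard in turn, one per read request."""

    def __init__(self, copies):
        self._copies = cycle(copies)

    def next_copy(self):
        return next(self._copies)


# Assumed placement of the three copies of shard 0 (one primary, two replicas).
reader = RoundRobinReader(["node-1/P0", "node-2/R0", "node-3/R0"])
picks = [reader.next_copy() for _ in range(4)]
print(picks)
```

Each read goes to the next copy in the rotation, so consecutive reads of the same document are spread across the nodes that hold it.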
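A partial update of a document proceeds as follows:
In a distributed cluster, ES nodes fall into 4 categories:
The master node can be looked up through the REST endpoint that es provides:
curl -XGET 'http://localhost:9200/_cat/master?v'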
id host ip node
9FINsHCpTKqcpFlnnA4Yww 10.1.251.164 10.1.251.164 node-1
|
View node information:
curl -XGET 'http://localhost:9200/_cat/nodes?v'
host ip heap.percent ram.percent load node.role master name
10.1.251.164 10.1.251.164 6 100 5.48 d * node-1
10.1.251.164 10.1.251.164 6 100 5.48 d m node-3
10.1.251.164 10.1.251.164 7 100 5.48 d m node-2
|
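Alternatively, use the head plugin to inspect the nodes. In the figure, the node marked with a star is the master. The users index here has 3 primary shards and 3 replica shards (a green box with a bold border is a primary shard; the others are replica shards):
If node-1 in our cluster becomes unavailable because its disk is full, the head plugin shows the following (3 replica shards are unassigned and the health status is yellow):
The shard information can also be viewed through the es REST endpoint:
curl -XGET 'http://localhost:9200/_cat/shards?v'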
index shard prirep state docs store ip node
users 1 p STARTED 1 3.3kb 10.1.251.164 node-2
users 1 r UNASSIGNED
users 2 p STARTED 0 159b 10.1.251.164 node-2
users 2 r UNASSIGNED
users 0 p STARTED 2 6.6kb 10.1.251.164 node-3
users 0 r UNASSIGNED
|
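The routing parameter determines which shard a document goes to (it can be used with index, get, delete, update, bulk, and other operations). Here we override the default policy of routing by _id:
# Run 10 times
curl -XPOST 'http://localhost:9200/users/normal?routing=1' -d '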
{
"name" : "Format345",
"age" : 456
}
'
# Run once
curl -XPOST http://localhost:9200/users/normal -d '
{
"name" : "Format345",
"age" : 456
}
'
# A document fetched with the routing parameter (note the extra _routing attribute)
{
"_index": "users",
"_type": "normal",
"_id": "AV07AubA6HDSJNRJle0i",
"_version": 1,
"_routing": "1",
"found": true,
"_source": {
"name": "Format345",
"age": 456
}
}
# Check how the documents are distributed (the first 10 went to shard P2, the last one went to shard P1)
curl -XGET 'http://localhost:9200/_cat/shards?v'
index shard prirep state docs store ip node
users 1 p STARTED 2 3.3kb 10.1.251.164 node-2
users 1 r UNASSIGNED
users 2 p STARTED 10 159b 10.1.251.164 node-2
users 2 r UNASSIGNED
users 0 p STARTED 2 6.6kb 10.1.251.164 node-3
users 0 r UNASSIGNED
|
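The official website has more documentation on the _cat API and the _cluster API.
Documents in es are manipulated through its REST endpoints, and a number of parameters can be passed to change the default behavior.
1. replication: controls whether replica shards are processed synchronously or asynchronously. The default is sync (the primary shard waits until all replica shards have finished processing). It can also be set to async (the primary shard does not wait for the replica shards, but it still forwards the request to them, asynchronously). This parameter has been deprecated since version 2.0.0: with asynchronous forwarding there is no way to know whether the replica shards succeeded, and a steady stream of asynchronous requests arriving before the replicas have finished the earlier ones can overload es. Using async is not recommended.
2. consistency: the write consistency parameter, which can be set to one, quorum, or all. These mean, respectively, that only the primary shard must be available, that a majority of shard copies must be available [formula: int( (primary + number_of_replicas) / 2 ) + 1], and that all shard copies must be available. For example, take a blogs index with 3 primary shards and 2 replicas: if 1 node in the cluster is down and all is used, an exception is thrown: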
curl -XPOST 'http://localhost:9200/blogs/normal?consistency=all' -d '
{
"name" : "POST-1"
}
'
# An exception is thrown after one minute
{
"error": {
"root_cause": [
{
"type": "unavailable_shards_exception",
"reason": "[blogs][0] Not enough active copies to meet write consistency of [ALL] (have 2, needed 3). Timeout: [1m], request: [index {[blogs][normal][AV1AF1FEl7qPpRBCQMV7], source[{\n \"name\" : \"POST-1\"\n}]}]"
}
],
"type": "unavailable_shards_exception",
"reason": "[blogs][0] Not enough active copies to meet write consistency of [ALL] (have 2, needed 3). Timeout: [1m], request: [index {[blogs][normal][AV1AF1FEl7qPpRBCQMV7], source[{\n \"name\" : \"POST-1\"\n}]}]"
},
"status": 503
}
|
Using the default quorum strategy:
curl -XPOST http://localhost:9200/blogs/normal -d '
{
"name" : "POST-1"
}
'
# One node in the cluster is down, so the write succeeds on only 2 shard copies
{
"_index": "blogs",
"_type": "normal",
"_id": "AV1AckLfl7qPpRBCQMV_",
"_version": 1,
"_shards": {
"total": 3,
"successful": 2,
"failed": 0
},
"created": true
}
|
The consistency parameter was deprecated in version 5.0.0.
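The consistency check can be worked through with the quorum formula quoted above. This is a sketch of the arithmetic only, not of how es implements it; `required_copies` and `write_allowed` are hypothetical names, and `primary` is taken to be 1 because the check applies to the copies of a single shard.

```python
def required_copies(consistency, number_of_replicas, primary=1):
    """Number of active shard copies a write needs under each consistency level."""
    if consistency == "one":
        return 1
    if consistency == "all":
        return primary + number_of_replicas
    if consistency == "quorum":
        return int((primary + number_of_replicas) / 2) + 1
    raise ValueError("unknown consistency level: %s" % consistency)


def write_allowed(consistency, number_of_replicas, active_copies):
    return active_copies >= required_copies(consistency, number_of_replicas)


# blogs index: 2 replicas per shard, one node down, so 2 of 3 copies are active.
print(required_copies("quorum", 2), write_allowed("quorum", 2, active_copies=2))
print(required_copies("all", 2), write_allowed("all", 2, active_copies=2))
```

With 2 replicas the quorum is 2, so the default write goes through with one node down, while all needs 3 copies and fails with "have 2, needed 3" as in the error shown earlier.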
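3. timeout: how long es waits when not enough shards are available (waiting for a node to restart and its shards to recover). The default is 1 minute. It can be changed, for example to 10 seconds:
curl -XPOST 'http://localhost:9200/blogs/normal?consistency=all&timeout=10s' -d '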
{
"name" : "POST-1"
}
'
# An exception is thrown after 10 seconds
{
"error": {
"root_cause": [
{
"type": "unavailable_shards_exception",
"reason": "[blogs][1] Not enough active copies to meet write consistency of [ALL] (have 2, needed 3). Timeout: [10s], request: [index {[blogs][normal][AV1AdXxsl7qPpRBCQMWB], source[{\n \"name\" : \"POST-1\"\n}]}]"
}
],
"type": "unavailable_shards_exception",
"reason": "[blogs][1] Not enough active copies to meet write consistency of [ALL] (have 2, needed 3). Timeout: [10s], request: [index {[blogs][normal][AV1AdXxsl7qPpRBCQMWB], source[{\n \"name\" : \"POST-1\"\n}]}]"
},
"status": 503
}
|
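4. version
Every document in es carries version information, and the version parameter can be used to implement optimistic locking under concurrency:
# Create a document
curl -XPUT http://localhost:9200/blogs/normal/format-001 -d '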
{
"name" : "format-post-001"
}
'
# Result
{
"_index": "blogs",
"_type": "normal",
"_id": "format-001",
"_version": 1,
"_shards": {
"total": 3,
"successful": 3,
"failed": 0
},
"created": true
}
# The document with id format-001 is currently at version 1; now update it
# Update with version=2
curl -XPUT 'http://localhost:9200/blogs/normal/format-001?version=2' -d '
{
"name" : "format-post-001-001"
}
'
# Error: version conflict
{
"error": {
"root_cause": [
{
"type": "version_conflict_engine_exception",
"reason": "[normal][format-001]: version conflict, current [1], provided [2]",
"shard": "0",
"index": "blogs"
}
],
"type": "version_conflict_engine_exception",
"reason": "[normal][format-001]: version conflict, current [1], provided [2]",
"shard": "0",
"index": "blogs"
},
"status": 409
}
# Update with version=1
curl -XPUT 'http://localhost:9200/blogs/normal/format-001?version=1' -d '
{
"name" : "format-post-001-001"
}
'
# The update succeeds and the document version becomes 2
{
"_index": "blogs",
"_type": "normal",
"_id": "format-001",
"_version": 2,
"_shards": {
"total": 3,
"successful": 3,
"failed": 0
},
"created": false
}
|
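The version check shown above is a compare-and-set on the document's version number. The in-memory model below is a minimal sketch of that behavior; `VersionedStore` and `VersionConflict` are hypothetical names and have nothing to do with how es stores documents.

```python
class VersionConflict(Exception):
    pass


class VersionedStore:
    """Toy document store that rejects writes carrying a stale version."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def put(self, doc_id, source, version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        # A write that names a version only succeeds if it matches the stored one.
        if version is not None and version != current:
            raise VersionConflict(
                "version conflict, current [%d], provided [%d]" % (current, version)
            )
        self._docs[doc_id] = (current + 1, source)
        return current + 1  # the new version


store = VersionedStore()
print(store.put("format-001", {"name": "format-post-001"}))
try:
    store.put("format-001", {"name": "format-post-001-001"}, version=2)
except VersionConflict as exc:
    print(exc)
print(store.put("format-001", {"name": "format-post-001-001"}, version=1))
```

Two writers that both read version 1 cannot both succeed: the first update moves the document to version 2, and the second one, still presenting version 1, is rejected and has to re-read before retrying.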
5. op_type: specifies the type of the operation, for example create.
# Create a document with id 1 and type normal in the blogs index
curl -XPUT 'http://localhost:9200/blogs/normal/1?op_type=create' -d '
{
"name" : "POST-2"
}
'
{
"_index": "blogs",
"_type": "normal",
"_id": "1",
"_version": 1,
"_shards": {
"total": 3,
"successful": 3,
"failed": 0
},
"created": true
}
# Run the same operation again
curl -XPUT 'http://localhost:9200/blogs/normal/1?op_type=create' -d '
{
"name" : "POST-2"
}
'
# Error: the document already exists
{
"error": {
"root_cause": [
{
"type": "document_already_exists_exception",
"reason": "[normal][1]: document already exists",
"shard": "0",
"index": "blogs"
}
],
"type": "document_already_exists_exception",
"reason": "[normal][1]: document already exists",
"shard": "0",
"index": "blogs"
},
"status": 409
}
|
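Instead of passing op_type, the operation can be specified in the URL path; the two forms are equivalent:
http://localhost:9200/blogs/normal/1/_create has the same effect as http://localhost:9200/blogs/normal/1?op_type=create.
The op_type values currently supported are create (creates documents only) and index (creates or updates documents).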
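The difference between the two op_type values can be modeled in a few lines. This is a sketch over a plain dictionary; `write` and `DocumentAlreadyExists` are hypothetical names, and the return value mimics the `created` flag seen in the responses above.

```python
class DocumentAlreadyExists(Exception):
    pass


def write(store, doc_id, source, op_type="index"):
    """Return True if the document was created, False if it was overwritten."""
    if op_type == "create":
        # create never overwrites: an existing id is an error
        if doc_id in store:
            raise DocumentAlreadyExists("[%s]: document already exists" % doc_id)
        store[doc_id] = source
        return True
    if op_type == "index":
        # index creates the document or replaces an existing one
        created = doc_id not in store
        store[doc_id] = source
        return created
    raise ValueError("unsupported op_type: %s" % op_type)


docs = {}
print(write(docs, "1", {"name": "POST-2"}, op_type="create"))
print(write(docs, "1", {"name": "POST-2 v2"}, op_type="index"))
try:
    write(docs, "1", {"name": "POST-2"}, op_type="create")
except DocumentAlreadyExists as exc:
    print(exc)
```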
6.wait_for_active_shards
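A parameter introduced in version 5.0.0 that specifies the number of active shard copies to wait for. It plays a role similar to consistency and can be set to all or to any positive integer.
Consider this scenario: the cluster has 3 nodes, node-1, node-2, and node-3, and the index asks for 3 replicas. The index then has 4 shard copies in total: 1 primary shard and 3 replica shards.
By default, an index operation only needs the primary shard to be available (wait_for_active_shards is 1).
If node-2 and node-3 go down, index operations are not affected (wait_for_active_shards defaults to 1). If wait_for_active_shards is set to 3, all 3 nodes must be alive. If wait_for_active_shards is set to 4 or all (there are 4 shard copies in total, so 4 and all have the same effect), index operations in this cluster will always fail, because the cluster has only 3 nodes and cannot host all 4 shard copies.
For example, setting it to all produces the following error: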
{
"error": {
"root_cause": [
{
"type": "unavailable_shards_exception",
"reason": "[blogs][2] Not enough active copies to meet shard count of [ALL] (have 3, needed 4). Timeout: [1m], request: [index {[blogs][normal][AV1QVDz3RpA5iuXn159C], source[{\n \"name\" : \"POST-1\"\n}]}]"
}
],
"type": "unavailable_shards_exception",
"reason": "[blogs][2] Not enough active copies to meet shard count of [ALL] (have 3, needed 4). Timeout: [1m], request: [index {[blogs][normal][AV1QVDz3RpA5iuXn159C], source[{\n \"name\" : \"POST-1\"\n}]}]"
},
"status": 503
}
|
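The scenario above can be checked with a short calculation. This is a sketch under the same simplification the text uses: each node holds at most one copy of a given shard, and `required_active` and `can_index` are hypothetical helper names.

```python
def required_active(wait_for_active_shards, number_of_replicas):
    """Translate the setting into a number of shard copies."""
    if wait_for_active_shards == "all":
        return 1 + number_of_replicas
    return int(wait_for_active_shards)


def can_index(wait_for_active_shards, number_of_replicas, live_nodes):
    # At most one copy of a shard per node, so live nodes cap the active copies.
    active = min(1 + number_of_replicas, live_nodes)
    return active >= required_active(wait_for_active_shards, number_of_replicas)


# Index with 3 replicas (4 copies per shard) on a 3-node cluster.
print(can_index(1, 3, live_nodes=1))      # default: the primary alone is enough
print(can_index(3, 3, live_nodes=3))      # needs all three nodes alive
print(can_index(3, 3, live_nodes=2))
print(can_index("all", 3, live_nodes=3))  # 4 copies can never fit on 3 nodes
```

The last case corresponds to the "have 3, needed 4" message in the error above.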
The default value of wait_for_active_shards can be set when the index is defined, and it can also be changed dynamically:
curl -XPUT http://localhost:9200/blogs/_settings -d '
{
"index.write.wait_for_active_shards": 3
}
'
|
7. Automatic id generation
When creating a document you can omit the id, and es will generate one for you. Note that this requires POST rather than PUT.
curl -XPOST http://localhost:9200/blogs/normal -d '
{
"name" : "my-post"
}
'
{
"_index": "blogs",
"_type": "normal",
"_id": "AV1Pj6MdAuPf3r3i0ysL", # the automatically generated id
"_version": 1,
"_shards": {
"total": 3,
"successful": 3,
"failed": 0
},
"created": true
}
|
8. Partial document updates
# Create a document
curl -XPUT http://localhost:9200/blogs/normal/format-doc-001 -d '
{
"title" : "springboot in action",
"author" : "Format"
}
'
# Perform a full update (the whole document is replaced)
curl -XPUT http://localhost:9200/blogs/normal/format-doc-001 -d '
{
"create_at": "2017-07-18"
}
'
# Fetch the document
curl -XGET http://localhost:9200/blogs/normal/format-doc-001
{
"_index": "blogs",
"_type": "normal",
"_id": "format-doc-001",
"_version": 2,
"found": true,
"_source": {
"create_at": "2017-07-18"
}
}
# Perform a partial update with a doc
curl -XPOST http://localhost:9200/blogs/normal/format-doc-001/_update -d '
{
"doc": {
"title" : "springboot in action",
"author" : "Format"
}
}
'
# Fetch the document
curl -XGET http://localhost:9200/blogs/normal/format-doc-001
{
"_index": "blogs",
"_type": "normal",
"_id": "format-doc-001",
"_version": 3,
"found": true,
"_source": {
"create_at": "2017-07-18",
"author": "Format",
"title": "springboot in action"
}
}
# Perform a partial update with a script
curl -XPOST http://localhost:9200/blogs/normal/format-doc-001/_update -d '
{
"script" : "ctx._source.views = 0; ctx._source.tags = [new_tag]",
"params": {
"new_tag": "java"
}
}
'
# Fetch the document
curl -XGET http://localhost:9200/blogs/normal/format-doc-001
{
"_index": "blogs",
"_type": "normal",
"_id": "format-doc-001",
"_version": 3,
"found": true,
"_source": {
"create_at": "2017-07-18",
"author": "Format",
"title": "springboot in action",
"tags": [
"java"
],
"views": 0
}
}
# Use a script to partially update a document that has not been created yet
curl -XPOST http://localhost:9200/blogs/normal/format-doc-002/_update -d '
{
"script" : "ctx._source.views+=1"
}
'
# Error: the document with id format-doc-002 does not exist
{
"error": {
"root_cause": [
{
"type": "document_missing_exception",
"reason": "[normal][format-doc-002]: document missing",
"shard": "0",
"index": "blogs"
}
],
"type": "document_missing_exception",
"reason": "[normal][format-doc-002]: document missing",
"shard": "0",
"index": "blogs"
},
"status": 404
}
# Add the upsert parameter (it sets the initial field values)
curl -XPOST http://localhost:9200/blogs/normal/format-doc-002/_update -d '
{
"script" : "ctx._source.views+=1",
"upsert": {
"views": 1
}
}
'
# Fetch the document
curl -XGET http://localhost:9200/blogs/normal/format-doc-002
{
"_index": "blogs",
"_type": "normal",
"_id": "format-doc-002",
"_version": 1,
"found": true,
"_source": {
"views": 1
}
}
|
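The update behavior walked through above (merge a doc, run a script, fall back to upsert) can be modeled as follows. This is a simplified sketch: `update`, `add_view`, and `DocumentMissing` are hypothetical names, the script is represented by a Python callable, and the merge is shallow, whereas a real partial update also has to deal with nested objects.

```python
class DocumentMissing(Exception):
    pass


def update(store, doc_id, doc=None, script=None, upsert=None):
    """Partial update of one document held in a plain dictionary."""
    if doc_id not in store:
        if upsert is None:
            raise DocumentMissing("[%s]: document missing" % doc_id)
        # The document does not exist: store the upsert body and skip the script.
        store[doc_id] = dict(upsert)
        return store[doc_id]
    if doc is not None:
        store[doc_id].update(doc)  # merge new fields into the existing source
    if script is not None:
        script(store[doc_id])      # let the script mutate the source in place
    return store[doc_id]


def add_view(source):
    source["views"] += 1


docs = {"format-doc-001": {"create_at": "2017-07-18"}}
print(update(docs, "format-doc-001", doc={"title": "springboot in action", "author": "Format"}))
print(update(docs, "format-doc-002", script=add_view, upsert={"views": 1}))
print(update(docs, "format-doc-002", script=add_view, upsert={"views": 1}))
```

The first upsert call stores views as 1 without running the script, which matches the document fetched above; only the second call, when the document exists, increments the counter.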
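9. Retrieving multiple documents (Multi Get API)
Multiple documents can be fetched in a single request.
# Run mget across all indices, naming the index in the request body
curl -XGET http://localhost:9200/_mget -d '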
{
"docs" : [
{
"_index" : "blogs",
"_type" : "normal",
"_id" : "format-doc-001"
},
{
"_index" : "blogs",
"_type" : "normal",
"_id" : "format-doc-002"
}
]
}
'
# Result
{
"docs": [
{
"_index": "blogs",
"_type": "normal",
"_id": "format-doc-001",
"_version": 3,
"found": true,
"_source": {
"create_at": "2017-07-18",
"author": "Format",
"title": "springboot in action",
"tags": [
"java"
],
"views": 0
}
},
{
"_index": "blogs",
"_type": "normal",
"_id": "format-doc-002",
"_version": 1,
"found": true,
"_source": {
"views": 1
}
}
]
}
# Run mget against a specific index
curl -XGET http://localhost:9200/blogs/_mget -d '
{
"docs" : [
{
"_type" : "normal",
"_id" : "format-doc-001"
},
{
"_type" : "normal",
"_id" : "format-doc-002"
}
]
}
'
# Run mget against a specific index and type
curl -XGET http://localhost:9200/blogs/normal/_mget -d '
{
"docs" : [
{
"_id" : "format-doc-001"
},
{
"_id" : "format-doc-002"
}
]
}
'
# Shorthand form of mget against a specific index and type
curl -XGET http://localhost:9200/blogs/normal/_mget -d '
{
"ids": ["format-doc-001", "format-doc-002"]
}
'
# Filter the fields returned in _source
curl -XGET http://localhost:9200/_mget -d '
{
"docs" : [
{
"_index": "blogs",
"_type": "normal",
"_id" : "format-doc-001",
"_source": ["title", "author"]
},
{
"_index": "blogs",
"_type": "normal",
"_id" : "format-doc-002",
"_source": false
},
{
"_index": "blogs",
"_type": "normal",
"_id" : "format-doc-003",
"_source": {
"include": ["title"],
"exclude": ["author"]
}
}
]
}
'
|
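The three `_source` forms used in the last request (a list of fields, false, and an include/exclude object) can be mimicked locally. This sketch only handles exact top-level field names; `filter_source` is a hypothetical helper, and the sample document is made up for illustration.

```python
def filter_source(source, spec=True):
    """Apply an mget-style _source spec to one document's source."""
    if spec is False:
        return None              # _source: false returns no source at all
    if spec is True:
        return dict(source)      # no filtering
    if isinstance(spec, list):
        include, exclude = spec, []
    else:
        include = spec.get("include", [])
        exclude = spec.get("exclude", [])
    return {
        key: value
        for key, value in source.items()
        if (not include or key in include) and key not in exclude
    }


sample = {"title": "springboot in action", "author": "Format", "views": 0}
print(filter_source(sample, ["title", "author"]))
print(filter_source(sample, False))
print(filter_source(sample, {"include": ["title"], "exclude": ["author"]}))
```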
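10. Bulk operations (bulk)
A bulk operation handles multiple documents in a single request. Pay attention to the format of the HTTP body of a bulk request: it is newline-delimited. When creating or updating a document, a newline must separate the action line from the document data. Different operations, such as creating one document and deleting another, must also be separated by newlines.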
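The newline-delimited body can be assembled programmatically. The sketch below builds such a body as a string; `bulk_body` is a hypothetical helper, and the document ids and field values are made up for illustration. Each action is one JSON line, index and update actions are followed by one more line of data, and delete has no data line.

```python
import json


def bulk_body(actions):
    """Build a newline-delimited bulk body from (action, metadata, payload) tuples."""
    lines = []
    for action, meta, payload in actions:
        lines.append(json.dumps({action: meta}))  # the action line
        if payload is not None:
            lines.append(json.dumps(payload))     # the data line, if the action has one
    # Every line, including the last one, ends with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ("index", {"_index": "blogs", "_type": "normal", "_id": "1"}, {"name": "POST-1"}),
    ("delete", {"_index": "blogs", "_type": "normal", "_id": "2"}, None),
    ("update", {"_index": "blogs", "_type": "normal", "_id": "3"}, {"doc": {"name": "POST-3"}}),
])
print(body)
```

The resulting string would be sent as the request body to the _bulk endpoint, for example with curl's --data-binary option so that the newlines are preserved.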