Index
Node
Documents are serialized to JSON and stored in Elasticsearch.
Every document has a unique ID.
The JSON document format is flexible; no schema has to be defined in advance.
    {
      "_index" : "movies",
      "_type" : "_doc",
      "_id" : "8609",
      "_score" : 1.0,
      "_source" : {
        "year" : 1923,
        "title" : "Our Hospitality",
        "@version" : "1",
        "id" : "8609",
        "genre" : [ "Comedy" ]
      }
    }
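- _index: the name of the index the document belongs to
- _type: the name of the type the document belongs to
- _id: the unique ID of the document
- _score: the relevance score of the document
- _source: the original JSON data of the document
- _version: the version information of the document

An index is a container for documents: a collection of documents of the same kind.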
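- Index: a logical-space concept. Each index has its own Mapping, which defines the field names and field types of the documents it contains.
- Shard: a physical-space concept. The data in an index is distributed across shards.

An index has a Mapping and a Setting:
- Mapping: defines the types of the document fields.
- Setting: defines how the data is distributed.

As a verb: the process of saving a document to Elasticsearch is also called indexing.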
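To make the Mapping/Setting split concrete, here is a minimal sketch of an index-creation request body, written as a Python dictionary. The field types and the shard and replica counts are assumptions chosen to match the movies document shown earlier; they are not taken from the original notes.

```python
import json

# Hypothetical body for `PUT movies`. The shard/replica counts and the
# field types below are illustrative assumptions.
create_index_body = {
    # Setting: how the data is distributed
    "settings": {"number_of_shards": 1, "number_of_replicas": 1},
    # Mapping: field names and field types
    "mappings": {
        "properties": {
            "year": {"type": "integer"},
            "title": {"type": "text"},
            "genre": {"type": "keyword"},
        }
    },
}

print(json.dumps(create_index_body, indent=2))
```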
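- Before 7.0, an index could have multiple Types.
- Starting with 6.0, Types are deprecated.
- Starting with 7.0, an index can have only one Type: _doc.

- GET /_cat/indices?v: lists the indices.
- GET /_cat/indices?v&health=green: lists the indices whose health is green.
- GET /_cat/indices?v&s=docs.count:desc: sorts the indices by document count.

High availability
Scalability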
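The default cluster name is elasticsearch; it can be set with -E cluster.name=demo.
A node is an Elasticsearch instance.
A node name can be specified with -E node.name=node1.
A node's ID is saved under the data directory.
After startup, every node is a Master-eligible node by default.
This can be disabled with node.master: false.
Every node stores the cluster state, but only the Master node can modify it.
The cluster state holds the essential information about a cluster, including the Mapping and Setting information.
Data Node
Coordinating Node
By default, every node acts as a Coordinating Node. In production, each node should be configured with a single role.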
Node type | Config parameter | Default |
---|---|---|
master eligible | node.master | true |
data | node.data | true |
ingest | node.ingest | true |
coordinating only | / | every node is a Coordinating Node by default |
machine learning | node.ml | true (requires X-Pack to be enabled) |
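The table can be read as "start from the defaults, then switch roles off". The sketch below models that with plain dictionaries; `effective_roles` and `is_coordinating_only` are hypothetical helper names, and the model covers only the four parameters listed in the table.

```python
# Defaults taken from the table above (node.ml also needs X-Pack enabled).
ROLE_DEFAULTS = {
    "node.master": True,
    "node.data": True,
    "node.ingest": True,
    "node.ml": True,
}


def effective_roles(overrides):
    """Return the role parameters still enabled after applying overrides."""
    settings = {**ROLE_DEFAULTS, **overrides}
    return sorted(name for name, enabled in settings.items() if enabled)


def is_coordinating_only(overrides):
    """A node with every listed role disabled only coordinates requests."""
    return not effective_roles(overrides)


# A dedicated data node: everything except node.data is switched off.
data_only = {"node.master": False, "node.ingest": False, "node.ml": False}
print(effective_roles({}))
print(effective_roles(data_only))
print(is_coordinating_only({name: False for name in ROLE_DEFAULTS}))
```

This mirrors the advice to give production nodes a single role: a dedicated node is simply the default set with every other role disabled.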
Primary Shard:
Replica Shard:
Shard settings in production require capacity planning in advance.
Too few shards
Too many shards
    GET /_cluster/health
    {
      "cluster_name" : "learn_es",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 3,
      "number_of_data_nodes" : 3,
      "active_primary_shards" : 7,
      "active_shards" : 14,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 0,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 0,
      "active_shards_percent_as_number" : 100.0
    }
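- green: all primary and replica shards are allocated.
- yellow: all primary shards are allocated, but at least one replica shard is not.
- red: at least one primary shard is not allocated.

Four basic operations: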
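- Index: add a document
  - POST <index>/_doc: the document ID is generated automatically.
  - PUT <index>/_doc/<_id>: adds the document if no document with this ID exists; otherwise replaces it and increments the version number (the _version field).
  - POST <index>/_create/<_id>: returns an error if a document with this ID already exists.
  - PUT <index>/_create/<_id>: returns an error if a document with this ID already exists.
- Get: read a document
  - GET <index>/_doc/<_id>: returns the document with this ID together with its metadata.
  - GET <index>/_source/<_id>: returns only the _source field of the document with this ID.
  - HEAD <index>/_doc/<_id>: checks whether the document exists; returns 200 if it does and 404 if it does not.
  - HEAD <index>/_source/<_id>: checks whether the _source field of the document exists; returns 200 if it does and 404 if it does not.
- Update: update a document
  - POST <index>/_update/<_id>: updates part of a document; the changes go in the doc field of the request body.
- Delete: delete a document
  - DELETE /<index>/_doc/<_id>: deletes the document with this ID; if the document does not exist, nothing is changed.

A document can be indexed with an auto-generated document ID or with a specified document ID.
Auto-generated document ID: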
POST <index>/_doc
demo:
    POST users/_doc
    {
      "user" : "Mike",
      "phone" : "15512345678"
    }
    -----------
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "RfXT_28B5V-KMglJX8bm",
      "_version" : 1,
      "result" : "created",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 3,
      "_primary_term" : 1
    }
Specified document ID:
PUT <index>/_doc/<_id>
or POST | PUT <index>/_create/<_id>
demo 1:PUT <index>/_doc/<_id>
    PUT users/_doc/1
    {
      "user" : "John",
      "phone" : "15812345678"
    }
    --------
    # No document with this ID exists: the document is created
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 4,
      "_primary_term" : 1
    }
    # A document with this ID exists: it is replaced
    # (the existing one is deleted, a new one is created, version +1)
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 23,
      "result" : "updated",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 26,
      "_primary_term" : 1
    }
demo 2: POST | PUT <index>/_create/<_id>
    POST users/_create/2
    {
      "user" : "Dave",
      "phone" : "15912345678"
    }
    ---------
    # No document with this ID exists: the document is created
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "result" : "created",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 27,
      "_primary_term" : 1
    }
    # A document with this ID exists: version conflict, an error is returned
    {
      "error": {
        "root_cause": [
          {
            "type": "version_conflict_engine_exception",
            "reason": "[2]: version conflict, document already exists (current version [1])",
            "index_uuid": "mjgjxIROT72xLMHnYNiUxw",
            "shard": "0",
            "index": "users"
          }
        ],
        "type": "version_conflict_engine_exception",
        "reason": "[2]: version conflict, document already exists (current version [1])",
        "index_uuid": "mjgjxIROT72xLMHnYNiUxw",
        "shard": "0",
        "index": "users"
      },
      "status": 409
    }
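The difference between the two demos (index replaces and bumps the version, create refuses to overwrite) can be sketched with a toy in-memory model. `ToyIndex` and `VersionConflict` are hypothetical names for illustration only; this is not the Elasticsearch client and it ignores shards, sequence numbers, and deletes.

```python
class VersionConflict(Exception):
    """Stands in for the 409 version_conflict_engine_exception response."""


class ToyIndex:
    """In-memory stand-in for a single index (illustration only)."""

    def __init__(self):
        self._docs = {}  # _id -> (version, source)

    def index(self, doc_id, source):
        """Mimic PUT <index>/_doc/<_id>: create, or replace and bump the version."""
        version = self._docs[doc_id][0] + 1 if doc_id in self._docs else 1
        self._docs[doc_id] = (version, dict(source))
        result = "created" if version == 1 else "updated"
        return {"_id": doc_id, "_version": version, "result": result}

    def create(self, doc_id, source):
        """Mimic POST|PUT <index>/_create/<_id>: never overwrite an existing ID."""
        if doc_id in self._docs:
            raise VersionConflict(
                f"[{doc_id}]: version conflict, document already exists")
        return self.index(doc_id, source)


users = ToyIndex()
print(users.index("1", {"user": "John"}))
print(users.index("1", {"user": "John", "phone": "15812345678"}))
print(users.create("2", {"user": "Dave"}))
try:
    users.create("2", {"user": "Dave"})
except VersionConflict as err:
    print("409:", err)
```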
Look up a document by ID:
GET <index>/_doc/<_id>
demo:
    GET users/_doc/2
    --------
    # A document with this ID exists: the document and its metadata are returned
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "_seq_no" : 27,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "user" : "Dave",
        "phone" : "15912345678"
      }
    }
    # No document with this ID exists: found is false
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "2",
      "found" : false
    }
Update the document with a given ID:
POST <index>/_update/<_id>
demo: partial document update
    POST users/_update/1
    {
      "doc": {
        "age": 28
      }
    }
    --------
    # If the document exists and a field value changes, the document is updated;
    # if the document exists and nothing changes, result is noop
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 27,
      "result" : "updated",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 34,
      "_primary_term" : 1
    }
demo2: update a document with a script
    # index the doc
    PUT users/_doc/2
    {
      "name" : "John",
      "counter" : 1
    }
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 6,
      "result" : "updated",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 53,
      "_primary_term" : 2
    }
    --------
    # update the doc
    POST users/_update/2
    {
      "script": {
        "source": "ctx._source.counter += params.count",
        "params": {
          "count": 2
        }
      }
    }
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 7,
      "result" : "updated",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 54,
      "_primary_term" : 2
    }
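The two update styles can be modelled roughly in plain Python: a `doc` update merges the given fields into the existing source and reports noop when nothing changes, while the script demo increments a counter. The function names are hypothetical, and the recursive merge is a simplified sketch of the behaviour, not the server's implementation.

```python
import copy


def merge(target, partial):
    """Merge `partial` into `target`: nested objects are merged, other values replaced."""
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge(target[key], value)
        else:
            target[key] = copy.deepcopy(value)


def partial_update(source, doc):
    """Return (new_source, result) for an update that uses the `doc` field."""
    updated = copy.deepcopy(source)
    merge(updated, doc)
    return updated, ("noop" if updated == source else "updated")


def scripted_update(source, count):
    """Python equivalent of the script `ctx._source.counter += params.count`."""
    updated = copy.deepcopy(source)
    updated["counter"] += count
    return updated


user = {"user": "John", "phone": "15812345678"}
user, result = partial_update(user, {"age": 28})
print(result, user)
print(partial_update(user, {"age": 28})[1])
print(scripted_update({"name": "John", "counter": 1}, 2))
```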
Delete a document by ID:
DELETE <index>/_doc/<_id>
demo:
    DELETE users/_doc/2
    --------
    # A document with this ID exists: it is deleted
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 2,
      "result" : "deleted",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 31,
      "_primary_term" : 1
    }
    # No document with this ID exists: nothing is deleted
    {
      "_index" : "users",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 3,
      "result" : "not_found",
      "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 },
      "_seq_no" : 32,
      "_primary_term" : 1
    }
The Bulk API supports four types of operation.
Index
Create
Update
Delete
Syntax: POST _bulk
or POST <index>/_bulk
The request body is newline-delimited JSON (NDJSON):
    action_and_meta_data\n
    optional_source\n
    action_and_meta_data\n
    optional_source\n
    ....
    action_and_meta_data\n
    optional_source\n
demo:
    POST _bulk
    # index, create: the next line must contain the source
    { "index" : { "_index" : "test", "_id" : "1" } }
    { "field1" : "value1" }
    { "create" : { "_index" : "test", "_id" : "2" } }
    { "field2" : "value2" }
    # update: the next line must contain doc or script
    { "update" : {"_id" : "1", "_index" : "test"} }
    { "doc" : {"field3" : "value3"} }
    # delete: same syntax as the standard delete API
    { "delete" : { "_index" : "test", "_id" : "2" } }
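Because the bulk body is line-oriented rather than a single JSON document, it is usually built programmatically. The sketch below serializes the four operations from the demo into NDJSON using only the standard library; `bulk_body` is a hypothetical helper, not part of any client.

```python
import json


def bulk_body(operations):
    """Serialize (action, metadata, payload) triples into an NDJSON bulk body.

    `payload` is None for delete, which has no second line.
    """
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}))
        if payload is not None:
            lines.append(json.dumps(payload))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


operations = [
    ("index", {"_index": "test", "_id": "1"}, {"field1": "value1"}),
    ("create", {"_index": "test", "_id": "2"}, {"field2": "value2"}),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field3": "value3"}}),
    ("delete", {"_index": "test", "_id": "2"}, None),
]
print(bulk_body(operations), end="")
```

Each JSON object is emitted on a single line on purpose: pretty-printed JSON would break the one-object-per-line structure.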
GET _mget
or GET <index>/_mget
demo:
    GET /_mget
    {
      "docs" : [
        { "_index" : "users", "_id" : "1" },
        { "_index" : "twitter", "_id" : "2" }
      ]
    }
    --------
    {
      "docs" : [
        {
          "_index" : "users",
          "_type" : "_doc",
          "_id" : "1",
          "_version" : 31,
          "_seq_no" : 38,
          "_primary_term" : 2,
          "found" : true,
          "_source" : {
            "user" : "abc",
            "class" : 8,
            "age" : 28,
            "gender" : "male",
            "field1" : "value1"
          }
        },
        {
          "_index" : "twitter",
          "_type" : null,
          "_id" : "2",
          "error" : {
            "root_cause" : [
              {
                "type" : "index_not_found_exception",
                "reason" : "no such index [twitter]",
                "resource.type" : "index_expression",
                "resource.id" : "twitter",
                "index_uuid" : "_na_",
                "index" : "twitter"
              }
            ],
            "type" : "index_not_found_exception",
            "reason" : "no such index [twitter]",
            "resource.type" : "index_expression",
            "resource.id" : "twitter",
            "index_uuid" : "_na_",
            "index" : "twitter"
          }
        }
      ]
    }
demo2:
    GET users/_mget
    {
      "docs": [
        { "_id" : "2" },
        { "_id" : "3" }
      ]
    }

    GET users/_mget
    {
      "ids" : ["2", "3"]
    }
    --------
    {
      "docs" : [
        {
          "_index" : "users",
          "_type" : "_doc",
          "_id" : "2",
          "_version" : 7,
          "_seq_no" : 54,
          "_primary_term" : 2,
          "found" : true,
          "_source" : {
            "name" : "John",
            "counter" : 3
          }
        },
        {
          "_index" : "users",
          "_type" : "_doc",
          "_id" : "3",
          "found" : false
        }
      ]
    }
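An mget response always returns one entry per requested document, so the caller has to check each entry individually. A minimal sketch of that check, assuming the response has already been parsed into a Python dictionary (`split_mget` is a hypothetical helper):

```python
def split_mget(response):
    """Separate the IDs that were found from those that were not."""
    found, missing = [], []
    for doc in response["docs"]:
        # Entries for a missing index carry "error" and no "found" key,
        # so they are treated as missing as well.
        (found if doc.get("found") else missing).append(doc["_id"])
    return found, missing


# Trimmed copy of the demo2 response above.
response = {
    "docs": [
        {"_index": "users", "_id": "2", "found": True,
         "_source": {"name": "John", "counter": 3}},
        {"_index": "users", "_id": "3", "found": False},
    ]
}
print(split_mget(response))
```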
An inverted index has two parts:
Term Dictionary:
Posting List:
Inverted index entry:
Searching for Elasticsearch in the following documents:
Document content
Doc ID | Document content |
---|---|
1 | Mastering Elasticsearch |
2 | Elasticsearch Server |
3 | Elasticsearch Essentials |
Posting List
Doc ID | Term frequency (TF) | Position | Offset |
---|---|---|---|
1 | 1 | 1 | <10,23> |
2 | 1 | 0 | <0,13> |
3 | 1 | 0 | <0,13> |
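The posting list above can be reproduced with a few lines of Python. This is a minimal sketch that tokenizes on word characters and lowercases terms; that tokenization rule is an assumption for illustration, not how Lucene actually builds its index.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to {doc_id: {"tf", "positions", "offsets"}}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, match in enumerate(re.finditer(r"\w+", text)):
            term = match.group().lower()
            entry = index[term].setdefault(
                doc_id, {"tf": 0, "positions": [], "offsets": []})
            entry["tf"] += 1
            entry["positions"].append(position)
            # Character offsets: start inclusive, end exclusive.
            entry["offsets"].append((match.start(), match.end()))
    return index


docs = {
    1: "Mastering Elasticsearch",
    2: "Elasticsearch Server",
    3: "Elasticsearch Essentials",
}
index = build_inverted_index(docs)
for doc_id, entry in sorted(index["elasticsearch"].items()):
    print(doc_id, entry)
```

Looking up `index["elasticsearch"]` returns one entry per document, with the same term frequency, position, and offsets as the table.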
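You can specify that certain fields are not indexed.
Analysis:
Analyzer:
An Analyzer consists of three parts: character filters, a tokenizer, and token filters.
The default analyzer (standard) splits text into words and lowercases them.
GET /_analyze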
POST /_analyze
GET /<index>/_analyze
POST /<index>/_analyze
demo:
    POST _analyze
    {
      "analyzer": "standard",
      "text": ["share your experience with NoSql & big data technologies"]
    }
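To get a feel for "split into words, then lowercase", the sketch below approximates that behaviour for plain ASCII text. It is only an illustration: the real standard analyzer follows Unicode text segmentation rules, and `simple_analyze` is a hypothetical name.

```python
import re


def simple_analyze(text):
    """Split on non-alphanumeric characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


print(simple_analyze("share your experience with NoSql & big data technologies"))
```

In this sketch the "&" is dropped because it contains no letters or digits, and "NoSql" becomes "nosql".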