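Querying massive data sets in a traditional relational DBMS is slow, and fuzzy queries are the worst case. We usually write them as LIKE '%value%', but the leading wildcard prevents the database from using an index, so the query falls back to an inefficient full table scan. Even an exact-match lookup on an indexed column is relatively slow at that scale. For this reason, most internet companies and large enterprises that need to query massive data sets turn to a search engine, and the mainstream choice today is Elasticsearch. This post is the first in my series of Elasticsearch essentials: CRUD on index documents. More posts will follow, so stay tuned.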
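CRUD on ES index documents (6.x and 7.x differ: an index created in 6.x holds a single type whose name you choose, and only indices carried over from 5.x can hold several types, whereas 7.x and later use one fixed type, _doc. API usage also differs slightly):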
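Create: create a document [POST index/type/id, where id is optional. If a document with the same index, type, and id already exists, it is reindexed (the old document is deleted and a new one is created, the same as PUT index/type/id). To prevent a document with a given id from being overwritten, use index/type/id?op_type=create or index/type/id/_create to declare the operation as a create; the request then fails if the document already exists.]
POST demo_users/_doc or demo_users/_doc/2vJKsm8BriJODA6s9GbQ/_create
Request Body: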
{ "userId":1, "username":"張三", "role":"administrator", "enabled":true, "createdDate":"2020-01-01T12:00:00" }
Response Body:
{ "_index": "demo_users", "_type": "_doc", "_id": "2vJKsm8BriJODA6s9GbQ", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 0, "_primary_term": 1 }
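The create call above can also be composed from code. The following is a minimal sketch using only the Python standard library; it builds the HTTP request without sending it. The helper name build_create_request and the base URL http://localhost:9200 are assumptions for illustration, not part of the original post.

```python
import json
import urllib.request
from urllib.parse import quote


def build_create_request(base_url, index, doc, doc_id=None):
    """Build (but do not send) an HTTP request that creates a document."""
    if doc_id is None:
        # No id given: Elasticsearch generates one.
        url = f"{base_url}/{index}/_doc"
    else:
        # op_type=create makes the request fail if the id already exists.
        url = f"{base_url}/{index}/_doc/{quote(doc_id, safe='')}?op_type=create"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and document, for illustration only.
req = build_create_request(
    "http://localhost:9200", "demo_users", {"userId": 1, "enabled": True}, doc_id="2"
)
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen would send it; a 409 response then signals that the id was already taken.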
Get: retrieve a document [GET index/type/id]
GET demo_users/_doc/123
Response Body:
{ "_index": "demo_users", "_type": "_doc", "_id": "123", "_version": 1, "found": true, "_source": { "userId": 1, "username": "張三", "role": "administrator", "enabled": true, "createdDate": "2020-01-01T12:00:00" } }
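Index (PUT): reindex a document [PUT index/type/id or index/type/id?op_type=index. The id is required. If no document with that id exists, the document is created; otherwise the existing document is deleted, a new one is created in its place, and the version is incremented by 1.]
PUT/POST demo_users/_doc/123 or demo_users/_doc/123?op_type=index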
Request Body:
{ "userId":1, "username":"張三", "role":"administrator", "enabled":true, "createdDate":"2020-01-01T12:00:00", "remark":"僅演示" }
Response Body:
{ "_index": "demo_users", "_type": "_doc", "_id": "123", "_version": 4, "result": "updated", "_shards": { "total": 2, "successful": 2, "failed": 0 }, "_seq_no": 10, "_primary_term": 1 }
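The difference between index and create semantics can be modeled in a few lines. This is a toy in-memory stand-in, not Elasticsearch: the class ToyIndex and the exception VersionConflict are hypothetical names, and the model only mimics the replace-and-bump-version behavior described above.

```python
class VersionConflict(Exception):
    """Raised when create is asked to overwrite an existing document."""


class ToyIndex:
    """In-memory model of index/create versioning. Not Elasticsearch."""

    def __init__(self):
        self._docs = {}  # id -> (version, source)

    def index(self, doc_id, source):
        # Index semantics: create if absent, otherwise replace the whole document.
        if doc_id in self._docs:
            version = self._docs[doc_id][0] + 1
            result = "updated"
        else:
            version, result = 1, "created"
        self._docs[doc_id] = (version, dict(source))
        return {"_id": doc_id, "_version": version, "result": result}

    def create(self, doc_id, source):
        # Create semantics (op_type=create): refuse to overwrite.
        if doc_id in self._docs:
            raise VersionConflict(f"document already exists: {doc_id}")
        return self.index(doc_id, source)

    def get(self, doc_id):
        return self._docs[doc_id][1]


store = ToyIndex()
print(store.index("123", {"username": "張三", "role": "administrator"}))
print(store.index("123", {"username": "張三"}))
# The second call replaced the whole document, so "role" is gone.
print(store.get("123"))
```

The point of the model is that indexing never merges: fields missing from the new body are dropped, which is what separates it from the partial update.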
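Update: partially update a document [POST index/type/id/_update (in 7.x the typeless form is POST index/_update/id). The request body must be {"doc":{the document JSON}}. Keys that already exist are updated and keys that do not exist are added; nested objects are supported. When the request is repeated, the version is incremented by 1 only if some field value actually changed; otherwise the response reports that nothing was updated ("result": "noop").]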
POST demo_users/_doc/123/_update
Request Body:
{ "doc": { "userId": 1, "username": "張三", "role": "administrator", "enabled": true, "createdDate": "2020-01-01T12:00:00", "remark": "僅演示POST更新5", "updatedDate": "2020-01-17T15:30:00" } }
Response Body:
{ "_index": "demo_users", "_type": "_doc", "_id": "123", "_version": 26, "result": "updated", "_shards": { "total": 2, "successful": 2, "failed": 0 }, "_seq_no": 35, "_primary_term": 1 }
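The merge behavior of a partial update can be sketched as a recursive dictionary merge. This is a toy model under the assumption that nested objects are merged key by key while all other values (including arrays) are replaced; the function name apply_partial_update is hypothetical and this is not Elasticsearch's implementation.

```python
import copy


def apply_partial_update(source, partial):
    """Return (merged_source, changed) for a partial-document update."""
    merged = copy.deepcopy(source)
    changed = False
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged recursively.
            merged[key], sub_changed = apply_partial_update(merged[key], value)
            changed = changed or sub_changed
        elif key not in merged or merged[key] != value:
            # New keys are added; differing values (including arrays) are replaced.
            merged[key] = copy.deepcopy(value)
            changed = True
    return merged, changed


doc = {"username": "張三", "role": "administrator", "profile": {"city": "shenzhen"}}
patch = {"remark": "demo", "profile": {"zip": "518000"}}

merged, changed = apply_partial_update(doc, patch)
print(merged, changed)

# Applying the same patch again changes nothing, which corresponds to a noop.
_, changed_again = apply_partial_update(merged, patch)
print(changed_again)
```

The changed flag corresponds to the case described above: only a real change bumps the version.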
Delete: delete a document [DELETE index/type/id]
DELETE demo_users/_doc/123
Response Body:
{ "_index": "demo_users", "_type": "_doc", "_id": "123", "_version": 2, "result": "deleted", "_shards": { "total": 2, "successful": 2, "failed": 0 }, "_seq_no": 39, "_primary_term": 1 }
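Bulk: batch operations on documents [POST _bulk, index/_bulk, or index/type/_bulk. A single request can carry several kinds of CRUD operations across multiple indices and types; if one operation fails, the others are not affected.]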
POST _bulk
Request Body: (note that the body must end with one extra newline. ES uses newline characters to tell the commands apart, so omitting the final newline causes an error. Also note that the body as a whole is not standard JSON: each line is a JSON object, so at most it can be viewed as a sequence of JSON objects separated by \n)
{ "index" : { "_index" : "demo_users_test", "_type" : "_doc", "_id" : "1" } }
{ "bulk_field1" : "測試建立index" }
{ "delete" : { "_index" : "demo_users", "_type" : "_doc", "_id" : "123" } }
{ "create" : { "_index" : "demo_users", "_type" : "_doc", "_id" : "2" } }
{ "bulk_field2" : "測試建立index2" }
{ "update" : { "_index" : "demo_users_test","_type" : "_doc","_id" : "1" } }
{ "doc": {"bulk_field1" : "測試建立index1","bulk_field2" : "測試建立index2"} }
Response Body:
{ "took": 162, "errors": true, "items": [ { "index": { "_index": "demo_users_test", "_type": "_doc", "_id": "1", "_version": 8, "result": "updated", "_shards": { "total": 2, "successful": 2, "failed": 0 }, "_seq_no": 7, "_primary_term": 1, "status": 200 } }, { "delete": { "_index": "demo_users", "_type": "_doc", "_id": "123", "_version": 2, "result": "not_found", "_shards": { "total": 2, "successful": 2, "failed": 0 }, "_seq_no": 44, "_primary_term": 1, "status": 404 } }, { "create": { "_index": "demo_users", "_type": "_doc", "_id": "2", "status": 409, "error": { "type": "version_conflict_engine_exception", "reason": "[_doc][2]: version conflict, document already exists (current version [1])", "index_uuid": "u7WE286CQnGqhHeuwW7oyw", "shard": "2", "index": "demo_users" } } }, { "update": { "_index": "demo_users_test", "_type": "_doc", "_id": "1", "_version": 9, "result": "updated", "_shards": { "total": 2, "successful": 2, "failed": 0 }, "_seq_no": 8, "_primary_term": 1, "status": 200 } } ] }
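Because the bulk body is newline-delimited JSON and the response reports success per item, both ends are easy to get wrong by hand. The sketch below shows one way to serialize operations and to pick the failed items out of a response. The helper names build_bulk_body and failed_items are hypothetical, and the sample data is made up for illustration.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into an NDJSON bulk body."""
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}, ensure_ascii=False))
        if action != "delete":
            # index, create, and update carry a second line; delete does not.
            lines.append(json.dumps(source, ensure_ascii=False))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(bulk_response):
    """Return (action, id, status) for every item that carries an error object."""
    failures = []
    if not bulk_response.get("errors"):
        return failures
    for item in bulk_response["items"]:
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result["_id"], result["status"]))
    return failures


body = build_bulk_body([
    ("index", {"_index": "demo_users_test", "_id": "1"}, {"bulk_field1": "v1"}),
    ("delete", {"_index": "demo_users", "_id": "123"}, None),
    ("update", {"_index": "demo_users_test", "_id": "1"}, {"doc": {"bulk_field1": "v2"}}),
])
print(body, end="")
```

Encode the body as UTF-8 before sending it. Running failed_items on the response shown above would report only the create with status 409, since the not_found delete carries no error object.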
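mGet: retrieve multiple documents [POST _mget, index/_mget, or index/type/_mget. If the index or type is given in the URL, it does not need to be repeated in the request body. Use _source to name the fields to include and the fields to exclude.]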
POST _mget
Request Body:
{ "docs": [ { "_index": "demo_users", "_type": "_doc", "_id": "12345" }, { "_index": "demo_users", "_type": "_doc", "_id": "1234567", "_source": [ "userId", "username", "role" ] }, { "_index": "demo_users", "_type": "_doc", "_id": "1234", "_source": { "include": [ "userId", "username" ], "exclude": [ "role" ] } } ] }
Response Body:
{ "docs":[ { "_index":"demo_users", "_type":"_doc", "_id":"12345", "_version":1, "found":true, "_source":{ "userId":1, "username":"張三", "role":"administrator", "enabled":true, "createdDate":"2020-01-01T12:00:00" } }, { "_index":"demo_users", "_type":"_doc", "_id":"1234567", "_version":7, "found":true, "_source":{ "role":"administrator", "userId":1, "username":"張三" } }, { "_index":"demo_users", "_type":"_doc", "_id":"1234", "_version":1, "found":true, "_source":{ "userId":1, "username":"張三" } } ] }
POST demo_users/_doc/_mget
Request Body:
{ "ids": [ "1234", "12345", "123457" ] }
Response Body:
{ "docs":[ { "_index":"demo_users", "_type":"_doc", "_id":"1234", "_version":1, "found":true, "_source":{ "userId":1, "username":"張三", "role":"administrator", "enabled":true, "createdDate":"2020-01-01T12:00:00", "remark":"僅演示" } }, { "_index":"demo_users", "_type":"_doc", "_id":"12345", "_version":1, "found":true, "_source":{ "userId":1, "username":"張三", "role":"administrator", "enabled":true, "createdDate":"2020-01-01T12:00:00" } }, { "_index":"demo_users", "_type":"_doc", "_id":"123457", "found":false } ] }
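An _mget response mixes found and missing documents in one docs array, so callers usually separate the two. A minimal sketch, assuming the response has already been parsed into a Python dict; the function name split_mget_docs is hypothetical.

```python
def split_mget_docs(mget_response):
    """Split an _mget response into ({id: source} for hits, [ids] for misses)."""
    found, missing = {}, []
    for doc in mget_response["docs"]:
        if doc.get("found"):
            found[doc["_id"]] = doc.get("_source")
        else:
            missing.append(doc["_id"])
    return found, missing


# Trimmed-down sample in the shape of the response above.
sample = {
    "docs": [
        {"_index": "demo_users", "_id": "1234", "found": True, "_source": {"userId": 1}},
        {"_index": "demo_users", "_id": "123457", "found": False},
    ]
}
found, missing = split_mget_docs(sample)
print(found)
print(missing)
```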
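_update_by_query: update the specified fields of every document that matches a query [POST index/_update_by_query. The request body holds the query and the fields to update; the update here is written as a Painless script for flexibility.]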
POST demo_users/_update_by_query
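Request Body: (finds documents where role=administrator, then sets role to poweruser and appends "採用_update_by_query更新" ("updated via _update_by_query") to remark. The query targets role.keyword because role is a text field: its value is analyzed, so a term query cannot match it exactly and the keyword sub-field role.keyword is used instead. A brief explanation follows later if this is unclear.)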
{ "script":{ "source":"ctx._source.role=params.role;ctx._source.remark=ctx._source.remark+params.remark", "lang":"painless", "params":{ "role":"poweruser", "remark":"採用_update_by_query更新" } }, "query":{ "term":{ "role.keyword":"administrator" } } }
For Painless syntax details, see: Painless syntax tutorial
Response Body:
{ "took": 114, "timed_out": false, "total": 6, "updated": 6, "deleted": 0, "batches": 1, "version_conflicts": 0, "noops": 0, "retries": { "bulk": 0, "search": 0 }, "throttled_millis": 0, "requests_per_second": -1, "throttled_until_millis": 0, "failures": [ ] }
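When the same update is issued from code with varying values, it helps to keep the script source constant and pass the values through params, as the request above does, so that Elasticsearch can reuse the compiled script. The sketch below builds such a body; the function name build_update_by_query and its arguments are assumptions for illustration.

```python
import json

# The script text never changes; only params vary between calls.
SCRIPT_SOURCE = (
    "ctx._source.role = params.role; "
    "ctx._source.remark = ctx._source.remark + params.remark"
)


def build_update_by_query(match_field, match_value, new_role, remark_suffix):
    """Build the JSON body for POST index/_update_by_query."""
    return {
        "script": {
            "lang": "painless",
            "source": SCRIPT_SOURCE,
            "params": {"role": new_role, "remark": remark_suffix},
        },
        # term matches the exact, unanalyzed value, hence the keyword sub-field.
        "query": {"term": {match_field: match_value}},
    }


body = build_update_by_query("role.keyword", "administrator", "poweruser", " (bulk-updated)")
print(json.dumps(body, ensure_ascii=False, indent=2))
```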
_delete_by_query: delete every document that matches a query [POST index/_delete_by_query. The request body holds the query.]
POST demo_users/_delete_by_query
Request Body: (finds documents where enabled=false)
{ "query": { "match": { "enabled": false } } }
Response Body:
{ "took":29, "timed_out":false, "total":3, "deleted":3, "batches":1, "version_conflicts":0, "noops":0, "retries":{ "bulk":0, "search":0 }, "throttled_millis":0, "requests_per_second":-1, "throttled_until_millis":0, "failures":[ ] }
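Search queries
URL GET query (GET index/_search?q=<query_string syntax>; note that the default analyzer splits Chinese text into one term per character)
A. Term query [matches against terms, that is, analyzed tokens; "contains" below means that one of the field's terms matches]
GET /demo_users/_search?q=role:poweruser //field query: the role field contains the term poweruser
GET /demo_users/_search?q=poweruser //query without a field: searches all fields for poweruser; a document is returned as soon as one field matches
B. Grouping and phrase queries
Operators: AND / OR / NOT, also written && / || / !
+ means must and - means must_not; for example, field:(+a -b) means field must contain a and must not contain b
GET /demo_users/_search?q=remark:(POST test) //grouping: remark contains POST or test (the default operator is OR)
GET /demo_users/_search?q=remark:(POST OR test) //same as above, with the operator written out
GET /demo_users/_search?q=remark:"POST test" //phrase: remark contains POST immediately followed by test
GET /demo_users/_search?q=remark:(test AND POST) //remark contains both test and POST
GET /demo_users/_search?q=remark:(test NOT POST) //remark contains test but not POST
C. Range queries
Intervals: [] is a closed (inclusive) bound and {} is an open (exclusive) bound, e.g. year:[2019 TO 2020], year:{2019 TO 2020}, year:{2019 TO 2020], year:[* TO 2020]
Comparison operators: year:>2019, year:(>2012 && <=2020), year:(+>=2012 +<=2020)
GET /demo_users/_search?q=userId:>123 //userId is greater than 123
D. Wildcard, regular expression, and approximate queries
? matches exactly one character and * matches zero or more characters, e.g. role:power*, role:use?
GET /demo_users/_search?q=role:power* //role starts with power, followed by zero or more other characters
Regular expressions are supported when wrapped in forward slashes, e.g. username:/張三[0-9]+/ (the expression is matched against individual terms)
Appending ~N loosens matching. After a single term, N is the fuzziness (maximum edit distance); after a quoted phrase, N is the slop (how many positions the terms may be apart).
GET /demo_users/_search?q=remark:tett~1 //the intended term is test but it was typed as tett; with an edit distance of 1 the query still finds documents containing test
GET /demo_users/_search?q=remark:"i like shenzhen"~2 //remark actually holds "i like hubei and shenzhen", which has the extra terms hubei and; a slop of 2 allows the two extra terms in between, so the document is still found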
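The query strings above contain characters such as +, spaces, quotes, and : that carry special meaning inside a URL; a raw + in a query string, for instance, is commonly decoded as a space. When the URL is composed from code rather than typed into a console, the q value should be percent-encoded. A minimal sketch with the Python standard library; the helper name build_search_url and the host are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base_url, index, query_string):
    """Return a _search URL whose q parameter is safely percent-encoded."""
    return f"{base_url}/{index}/_search?" + urlencode({"q": query_string})


queries = [
    "role:poweruser",
    "remark:(+test -POST)",
    'remark:"i like shenzhen"~2',
    "userId:>123",
]
for q in queries:
    url = build_search_url("http://localhost:9200", "demo_users", q)
    # Decoding the URL again must give back the original query string.
    decoded = parse_qs(urlsplit(url).query)["q"][0]
    print(url, decoded == q)
```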
DSL POST query (POST index/_search)
POST demo_users/_search
Request Body: (finds documents where enabled=true, role=poweruser, and username is 張三; sorts by createdDate in descending order; _source names the fields to return in includes and the fields to omit in excludes)
{ "query": { "bool": { "must": [ { "term": { "enabled": "true" } }, { "term": { "role.keyword": "poweruser" } }, { "query_string": { "default_field": "username.keyword", "query": "張三" } } ], "must_not": [ ], "should": [ ] } }, "from": 0, "size": 1000, "sort": [ { "createdDate": "desc" } ], "_source": { "includes": [ "role", "username", "userId", "remark" ], "excludes": [ ] } }
For detailed usage, see:
[Elasticsearch] The various usages of query_string
The differences between match, match_phrase, query_string, and term in Elasticsearch
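A DSL search body like the one above is plain JSON, so it can be assembled as a Python dict and parameterized. The sketch below is one possible shape, not a prescribed one: the function name build_user_search and its parameters are hypothetical. Keep in mind that from + size is capped by the index setting index.max_result_window, which defaults to 10000.

```python
import json


def build_user_search(role, username, enabled=True, page=0, page_size=1000):
    """Build a bool query body: all must clauses have to match (logical AND)."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"enabled": enabled}},
                    {"term": {"role.keyword": role}},
                    {
                        "query_string": {
                            "default_field": "username.keyword",
                            "query": username,
                        }
                    },
                ]
            }
        },
        # from is an offset in documents, not a page number.
        "from": page * page_size,
        "size": page_size,
        "sort": [{"createdDate": "desc"}],
        "_source": {"includes": ["role", "username", "userId", "remark"]},
    }


body = build_user_search("poweruser", "張三", page=2, page_size=50)
print(json.dumps(body, ensure_ascii=False, indent=2))
```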
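Indices APIs: create, delete, get, and check the existence of an index.
Document APIs: index (create), delete, and get documents.
Search APIs: search documents. The Document APIs look documents up by doc_id, whereas the Search APIs look them up by query conditions.
Aggregations: aggregate the documents of an index along various dimensions.
cat APIs: retrieve various kinds of information about indices.
Cluster APIs: retrieve various kinds of information about the cluster.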