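Elasticsearch Essentials, Part 1: CRUD on Index Documents

Querying massive data sets in a traditional relational DBMS is slow, and fuzzy matching is the worst case: the usual approach is LIKE '%value%', but a leading wildcard prevents the index from being used and forces an inefficient full table scan. Even exact-value lookups on an indexed column become relatively slow at very large volumes. For that reason, internet companies and other large organizations usually hand this kind of querying to a search engine, and the mainstream choice today is Elasticsearch. This post is the first in a series of Elasticsearch essentials and covers CRUD on index documents; more posts will follow.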

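  1. CRUD on ES index documents (6.x and 7.x differ: an index created in 6.x can hold only one type, although multi-type indices created in 5.x still work there, while 7.x and later use the single fixed type _doc, and the API usage differs slightly):

    1. Create: create an index document [POST index/type/id, where id is optional. If a document with that index, type, and id already exists, it is reindexed (the old document is deleted and a new one created, the same as PUT index/type/id). To prevent this automatic overwrite when an id is given, use index/type/id?op_type=create or index/type/id/_create to declare the operation as create-only; the request then fails if the document already exists]

      POST demo_users/_doc or demo_users/_doc/2vJKsm8BriJODA6s9GbQ/_create

      Request Body: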

      {
      "userId":1,
      "username":"張三",
      "role":"administrator",
      "enabled":true,
      "createdDate":"2020-01-01T12:00:00"
      }

      Response Body:

      {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "2vJKsm8BriJODA6s9GbQ",
      "_version": 1,
      "result": "created",
      "_shards": {
      "total": 2,
      "successful": 1,
      "failed": 0
      },
      "_seq_no": 0,
      "_primary_term": 1
      }
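From a client's point of view, the create-only variant can be sketched as follows. This is only an illustration built on Python's standard library: the base URL http://localhost:9200 and the helper name build_create_request are assumptions, and the request is constructed but not sent.

```python
import json
import urllib.request

# Assumed address of a local test node; adjust it to your cluster.
BASE_URL = "http://localhost:9200"


def build_create_request(index, doc_id, document, base_url=BASE_URL):
    """Build a create-only request; the server rejects it if the id already exists."""
    url = f"{base_url}/{index}/_doc/{doc_id}?op_type=create"
    body = json.dumps(document, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_create_request(
    "demo_users",
    "2vJKsm8BriJODA6s9GbQ",
    {"userId": 1, "username": "張三", "role": "administrator", "enabled": True},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request) (needs a running cluster).
```

Dropping ?op_type=create from the URL turns the same request into a plain index operation that overwrites an existing document.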
    2. Get: retrieve an index document [GET index/type/id]

      GET demo_users/_doc/123

      Response Body:

      {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "123",
      "_version": 1,
      "found": true,
      "_source": {
      "userId": 1,
      "username": "張三",
      "role": "administrator",
      "enabled": true,
      "createdDate": "2020-01-01T12:00:00"
      }
      }
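    3. Index (PUT): reindex a document [PUT index/type/id or index/type/id?op_type=index. The id is required. If no document with that id exists, one is created; otherwise the existing document is deleted and recreated, and _version is incremented by 1]

      PUT/POST demo_users/_doc/123 or demo_users/_doc/123?op_type=index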

      Request Body:

      {
      "userId":1,
      "username":"張三",
      "role":"administrator",
      "enabled":true,
      "createdDate":"2020-01-01T12:00:00",
      "remark":"僅演示"
      }

      Response Body:

      {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "123",
      "_version": 4,
      "result": "updated",
      "_shards": {
      "total": 2,
      "successful": 2,
      "failed": 0
      },
      "_seq_no": 10,
      "_primary_term": 1
      }
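    4. Update: update an index document [POST index/type/id/_update. The request body must be {"doc":{partial document JSON}}. Fields that already exist are updated and fields that do not exist are added; nested objects are merged level by level. When the request is repeated, _version is incremented by 1 only if some field value actually changed; otherwise the response reports result "noop" and nothing is written]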

      POST demo_users/_doc/123/_update

      Request Body:

      {
        "doc": {
          "userId": 1,
          "username": "張三",
          "role": "administrator",
          "enabled": true,
          "createdDate": "2020-01-01T12:00:00",
          "remark": "僅演示POST更新5",
          "updatedDate": "2020-01-17T15:30:00"
        }
      }

      Response Body:

      {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "123",
      "_version": 26,
      "result": "updated",
      "_shards": {
      "total": 2,
      "successful": 2,
      "failed": 0
      },
      "_seq_no": 35,
      "_primary_term": 1
      }
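The merge behaviour of a partial update can be modelled locally. The sketch below is a simplified model of the semantics described in item 4, not the server's implementation; merge_partial and apply_update are made-up names, and the stored document is a plain dict.

```python
def merge_partial(source, partial):
    """Merge `partial` into `source` in place; return True if anything changed."""
    changed = False
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(source.get(key), dict):
            # Nested objects are merged level by level.
            if merge_partial(source[key], value):
                changed = True
        elif key not in source or source[key] != value:
            # New keys are added, differing values are overwritten.
            source[key] = value
            changed = True
    return changed


def apply_update(stored, partial):
    """Return the result string an update would report for this partial doc."""
    if merge_partial(stored["_source"], partial):
        stored["_version"] += 1
        return "updated"
    return "noop"


stored = {"_version": 1, "_source": {"userId": 1, "username": "張三"}}
first = apply_update(stored, {"remark": "demo", "profile": {"city": "shenzhen"}})
second = apply_update(stored, {"remark": "demo"})
print(first, second, stored["_version"])
```

The second call changes nothing, so the model reports a no-op and leaves the version untouched.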
    5. Delete: delete an index document [DELETE index/type/id]

      DELETE demo_users/_doc/123

      Response Body:

      {
      "_index": "demo_users",
      "_type": "_doc",
      "_id": "123",
      "_version": 2,
      "result": "deleted",
      "_shards": {
      "total": 2,
      "successful": 2,
      "failed": 0
      },
      "_seq_no": 39,
      "_primary_term": 1
      }
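    6. Bulk: batch document operations [POST _bulk, index/_bulk, or index/type/_bulk. A single request can carry different CRUD operations against multiple indices and types. A failure in one operation does not affect the others]

      POST _bulk

      Request Body: (Note that the body must end with a newline. ES uses newline characters to separate the commands, and the request fails if the final newline is missing. The body as a whole is not standard JSON: each line is a JSON object, so it is best seen as a sequence of JSON objects separated by \n.)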

      { "index" : { "_index" : "demo_users_test", "_type" : "_doc", "_id" : "1" } }
      { "bulk_field1" : "測試建立index" }
      { "delete" : { "_index" : "demo_users", "_type" : "_doc", "_id" : "123" } }
      { "create" : { "_index" : "demo_users", "_type" : "_doc", "_id" : "2" } }
      { "bulk_field2" : "測試建立index2" }
      { "update" : { "_index" : "demo_users_test","_type" : "_doc","_id" : "1" } }
      { "doc": {"bulk_field1" : "測試建立index1","bulk_field2" : "測試建立index2"} }

      Response Body:

      {
          "took": 162,
          "errors": true,
          "items": [
              {
                  "index": {
                      "_index": "demo_users_test",
                      "_type": "_doc",
                      "_id": "1",
                      "_version": 8,
                      "result": "updated",
                      "_shards": {
                          "total": 2,
                          "successful": 2,
                          "failed": 0
                      },
                      "_seq_no": 7,
                      "_primary_term": 1,
                      "status": 200
                  }
              },
              {
                  "delete": {
                      "_index": "demo_users",
                      "_type": "_doc",
                      "_id": "123",
                      "_version": 2,
                      "result": "not_found",
                      "_shards": {
                          "total": 2,
                          "successful": 2,
                          "failed": 0
                      },
                      "_seq_no": 44,
                      "_primary_term": 1,
                      "status": 404
                  }
              },
              {
                  "create": {
                      "_index": "demo_users",
                      "_type": "_doc",
                      "_id": "2",
                      "status": 409,
                      "error": {
                          "type": "version_conflict_engine_exception",
                          "reason": "[_doc][2]: version conflict, document already exists (current version [1])",
                          "index_uuid": "u7WE286CQnGqhHeuwW7oyw",
                          "shard": "2",
                          "index": "demo_users"
                      }
                  }
              },
              {
                  "update": {
                      "_index": "demo_users_test",
                      "_type": "_doc",
                      "_id": "1",
                      "_version": 9,
                      "result": "updated",
                      "_shards": {
                          "total": 2,
                          "successful": 2,
                          "failed": 0
                      },
                      "_seq_no": 8,
                      "_primary_term": 1,
                      "status": 200
                  }
              }
          ]
      }
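Two details of the bulk API are easy to get wrong in client code: the newline-delimited body and the fact that a response can mix successes and failures. A minimal sketch of both, using only the standard library; the helper names are hypothetical, and the action metadata omits _type, which 7.x accepts.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, payload) tuples to newline-delimited JSON."""
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if payload is not None:  # a delete action has no payload line
            lines.append(json.dumps(payload, ensure_ascii=False))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(bulk_response):
    """Return (action, _id, status, error type) for each item that carries an error."""
    failures = []
    for item in bulk_response.get("items", []):
        for action, result in item.items():
            if "error" in result:
                failures.append(
                    (action, result["_id"], result["status"], result["error"]["type"])
                )
    return failures


body = build_bulk_body([
    ("index", {"_index": "demo_users_test", "_id": "1"}, {"bulk_field1": "a"}),
    ("delete", {"_index": "demo_users", "_id": "123"}, None),
    ("update", {"_index": "demo_users_test", "_id": "1"}, {"doc": {"bulk_field2": "b"}}),
])
print(body, end="")

# Trimmed-down response in the shape shown above.
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 200}},
        {"create": {"_id": "2", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}
print(failed_items(sample_response))
```

Because the top-level "errors" flag only says that at least one item failed, a client still has to walk "items" to find out which ones.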
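    7. mGet: retrieve multiple documents in one request [POST _mget, index/_mget, or index/type/_mget. If the index or type is given in the URL, the request body does not need to repeat it. Use _source to specify the fields to include and the fields to exclude]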

      POST _mget

      Request Body:

      {
        "docs": [
          {
            "_index": "demo_users",
            "_type": "_doc",
            "_id": "12345"
          },
          {
            "_index": "demo_users",
            "_type": "_doc",
            "_id": "1234567",
            "_source": [
              "userId",
              "username",
              "role"
            ]
          },
          {
            "_index": "demo_users",
            "_type": "_doc",
            "_id": "1234",
            "_source": {
              "include": [
                "userId",
                "username"
              ],
              "exclude": [
                "role"
              ]
            }
          }
        ]
      }

      Response Body:

      {
          "docs":[
              {
                  "_index":"demo_users",
                  "_type":"_doc",
                  "_id":"12345",
                  "_version":1,
                  "found":true,
                  "_source":{
                      "userId":1,
                      "username":"張三",
                      "role":"administrator",
                      "enabled":true,
                      "createdDate":"2020-01-01T12:00:00"
                  }
              },
              {
                  "_index":"demo_users",
                  "_type":"_doc",
                  "_id":"1234567",
                  "_version":7,
                  "found":true,
                  "_source":{
                      "role":"administrator",
                      "userId":1,
                      "username":"張三"
                  }
              },
              {
                  "_index":"demo_users",
                  "_type":"_doc",
                  "_id":"1234",
                  "_version":1,
                  "found":true,
                  "_source":{
                      "userId":1,
                      "username":"張三"
                  }
              }
          ]
      }

      POST demo_users/_doc/_mget

      Request Body:

      {
        "ids": [
          "1234",
          "12345",
          "123457"
        ]
      }

      Response Body:

      {
          "docs":[
              {
                  "_index":"demo_users",
                  "_type":"_doc",
                  "_id":"1234",
                  "_version":1,
                  "found":true,
                  "_source":{
                      "userId":1,
                      "username":"張三",
                      "role":"administrator",
                      "enabled":true,
                      "createdDate":"2020-01-01T12:00:00",
                      "remark":"僅演示"
                  }
              },
              {
                  "_index":"demo_users",
                  "_type":"_doc",
                  "_id":"12345",
                  "_version":1,
                  "found":true,
                  "_source":{
                      "userId":1,
                      "username":"張三",
                      "role":"administrator",
                      "enabled":true,
                      "createdDate":"2020-01-01T12:00:00"
                  }
              },
              {
                  "_index":"demo_users",
                  "_type":"_doc",
                  "_id":"123457",
                  "found":false
              }
          ]
      }
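Since _mget returns one entry per requested id whether or not it exists, client code usually separates hits from misses. A small sketch under that assumption; build_mget_body and split_mget_response are illustrative names, not part of any client library.

```python
def build_mget_body(ids, source_fields=None):
    """Build a body for POST <index>/_mget; source_fields optionally limits _source."""
    if source_fields is None:
        return {"ids": list(ids)}
    return {"docs": [{"_id": i, "_source": list(source_fields)} for i in ids]}


def split_mget_response(response):
    """Split an _mget response into {id: _source} for hits and a list of missing ids."""
    found, missing = {}, []
    for doc in response.get("docs", []):
        if doc.get("found"):
            found[doc["_id"]] = doc.get("_source", {})
        else:
            missing.append(doc["_id"])
    return found, missing


# Trimmed-down response in the shape shown above.
sample_response = {
    "docs": [
        {"_id": "1234", "found": True, "_source": {"userId": 1, "username": "張三"}},
        {"_id": "123457", "found": False},
    ]
}
found, missing = split_mget_response(sample_response)
print(sorted(found), missing)
```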
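    8. _update_by_query: update specified fields of the index documents matched by a query [POST index/_update_by_query. The request body holds the query and the fields to update; here a Painless script does the update for flexibility]

      POST demo_users/_update_by_query

      Request Body: (This finds the documents with role=administrator, then sets role to poweruser and remark to remark + "採用_update_by_query更新". The query uses role.keyword because role is a text field: its value is analyzed, so a term query cannot reliably match the original value and has to go through the keyword sub-field role.keyword. This is explained briefly later.)

      {
          "script":{
              "source":"ctx._source.role=params.role;ctx._source.remark=ctx._source.remark+params.remark",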
              "lang":"painless",
              "params":{
                  "role":"poweruser",
                  "remark":"採用_update_by_query更新"
              }
          },
          "query":{
              "term":{
                  "role.keyword":"administrator"
              }
          }
      }

      For details on Painless syntax, see: Painless syntax tutorial

      Response Body:

      {
      "took": 114,
      "timed_out": false,
      "total": 6,
      "updated": 6,
      "deleted": 0,
      "batches": 1,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
      "bulk": 0,
      "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0,
      "failures": [ ]
      }
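When the same update has to be issued for different values, the request body can be generated instead of hand-edited. The sketch below assumes the role field has a .keyword sub-field, as in the example above; build_role_update is a hypothetical helper.

```python
import json


def build_role_update(old_role, new_role, remark_suffix):
    """Body for POST <index>/_update_by_query: move matching users to new_role."""
    return {
        "script": {
            "lang": "painless",
            # Values travel in params, so the script source stays identical across calls.
            "source": (
                "ctx._source.role=params.role;"
                "ctx._source.remark=ctx._source.remark+params.remark"
            ),
            "params": {"role": new_role, "remark": remark_suffix},
        },
        # The .keyword sub-field holds the unanalyzed value, so term matches it exactly.
        "query": {"term": {"role.keyword": old_role}},
    }


body = build_role_update("administrator", "poweruser", "updated by _update_by_query")
print(json.dumps(body, ensure_ascii=False, indent=2))
```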
    9. _delete_by_query: delete the index documents matched by a query [POST index/_delete_by_query. The request body holds the query]

      POST demo_users/_delete_by_query

      Request Body: (deletes the documents where enabled=false)

      {
        "query": {
          "match": {
            "enabled": false
          }
        }
      }

      Response Body:

      {
          "took":29,
          "timed_out":false,
          "total":3,
          "deleted":3,
          "batches":1,
          "version_conflicts":0,
          "noops":0,
          "retries":{
              "bulk":0,
              "search":0
          },
          "throttled_millis":0,
          "requests_per_second":-1,
          "throttled_until_millis":0,
          "failures":[

          ]
      }
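Both _update_by_query and _delete_by_query return the same summary shape, so one check can cover them. A minimal sketch that flags the fields most likely to indicate trouble; check_by_query_result is a hypothetical helper and the choice of checks is an assumption about what a caller cares about.

```python
def check_by_query_result(response):
    """Return a list of human-readable problems found in a *_by_query response."""
    problems = []
    if response.get("timed_out"):
        problems.append("request timed out")
    if response.get("version_conflicts", 0) > 0:
        problems.append(f"{response['version_conflicts']} version conflicts")
    if response.get("failures"):
        problems.append(f"{len(response['failures'])} failures")
    return problems


# The delete response shown above, reduced to the fields that are checked.
clean = {"timed_out": False, "total": 3, "deleted": 3,
         "version_conflicts": 0, "failures": []}
conflicted = dict(clean, version_conflicts=2)

print(check_by_query_result(clean))
print(check_by_query_result(conflicted))
```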
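    10. search: querying

      1. URL GET search (GET index/_search?q=..., using query_string syntax. Note that the default analyzer splits Chinese text into one term per character)

        A. Term query [matches against analyzed terms; "contains" below means that one of the field's terms matches]
        GET /demo_users/_search?q=role:poweruser //field query: the role field contains the given value

        GET /demo_users/_search?q=poweruser //query without a field: searches all fields of each document for poweruser; a document is returned as soon as one field matches

        B. Group and phrase queries
        Operators: AND / OR / NOT, also written && / || / !
        + means must and - means must_not. For example, field:(+a -b) means field must contain a and must not contain b

        GET /demo_users/_search?q=remark:(POST test)
        GET /demo_users/_search?q=remark:(POST OR test)
        //group queries: both return documents whose remark contains POST or test (OR is the default operator)
        GET /demo_users/_search?q=remark:"POST test"
        //phrase query: remark must contain POST immediately followed by test

        GET /demo_users/_search?q=remark:(test AND POST) //remark contains both test and POST
        GET /demo_users/_search?q=remark:(test NOT POST) //remark contains test but not POST

        C. Range queries
        Interval notation: [] is inclusive, {} is exclusive
        e.g. year:[2019 TO 2020], {2019 TO 2020}, {2019 TO 2020], or [* TO 2020]
        Comparison operators
        year:>2019, (>2012 && <=2020), or (+>=2012 +<=2020)

        GET /demo_users/_search?q=userId:>123 //documents whose userId is greater than 123

        D. Wildcard queries
        ? matches exactly one character and * matches zero or more characters, e.g. role:power* , role:use?

        GET /demo_users/_search?q=role:power* //role starts with power, followed by zero or more other characters

        Regular expressions are supported when wrapped in forward slashes, e.g. username:/張三[0-9]+/

        Approximate matching with ~N broadens the results: after a single term, N is the fuzziness (maximum edit distance); after a quoted phrase, N is the slop (how many positions the terms may be apart)
        GET /demo_users/_search?q=remark:tett~1 //the intended term is test but it was typed as tett; ~1 allows an edit distance of 1, so documents containing test are still returned

        GET /demo_users/_search?q=remark:"i like shenzhen"~2 //searches for i like shenzhen while the actual remark value is i like hubei and shenzhen, which has the extra terms hubei and; ~2 allows the terms to be 2 positions apart (two words here), so the document still matches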
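The URL examples above contain spaces, quotes, parentheses, and + signs, all of which need percent-encoding once the request leaves a console and goes through an HTTP client. A minimal sketch with Python's standard library; the base URL and the search_url helper are assumptions.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Assumed address of a local test node; adjust it to your cluster.
BASE_URL = "http://localhost:9200"


def search_url(index, query, base_url=BASE_URL):
    """Build GET <index>/_search?q=... with the query string percent-encoded."""
    # quote_via=quote encodes spaces as %20 and a literal + as %2B.
    return f"{base_url}/{index}/_search?{urlencode({'q': query}, quote_via=quote)}"


phrase_url = search_url("demo_users", 'remark:"i like shenzhen"~2')
must_url = search_url("demo_users", "remark:(+test -POST)")
print(phrase_url)
print(must_url)
```

Encoding matters most for the + operator: sent unencoded in a URL, it is liable to be decoded as a space, which silently drops the must requirement.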
      2. DSL POST search (POST index/_search)

        POST demo_users/_search

        Request Body:

        {
            "query":{
                "bool":{
                    "must":[
                        {
                            "term":{
                                "enabled":"true"  #enabled=true
                            }
                        },
                        {
                            "term":{
                                "role.keyword":"poweruser" #and role=poweruser
                            }
                        },
                        {
                            "query_string":{
                                "default_field":"username.keyword",
                                "query":"張三" #and username.keyword matches 張三 (a keyword field is matched as a whole value)
                            }
                        }
                    ],
                    "must_not":[
        
                    ],
                    "should":[
        
                    ]
                }
            },
            "from":0,
            "size":1000,
            "sort":[
                {
                    "createdDate":"desc"  #sort by createdDate, descending
                }
            ],
            "_source":{ #fields to return: includes lists the fields to return, excludes the fields to omit
                "includes":[
                    "role",
                    "username",
                    "userId",
                    "remark"
                ],
                "excludes":[
        
                ]
            }
        }
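The same search can be assembled in code, which also yields strict JSON without the inline # comments used above. A sketch assuming the field names of the demo_users examples; build_user_search is a hypothetical helper.

```python
import json


def build_user_search(role, username, enabled=True, size=1000):
    """Body for POST <index>/_search: every clause in `must` has to match."""
    must = [
        {"term": {"enabled": enabled}},
        {"term": {"role.keyword": role}},
        {"query_string": {"default_field": "username.keyword", "query": username}},
    ]
    return {
        "query": {"bool": {"must": must}},
        "from": 0,
        "size": size,
        # Newest documents first.
        "sort": [{"createdDate": "desc"}],
        # Only these fields are returned in each hit's _source.
        "_source": {"includes": ["role", "username", "userId", "remark"]},
    }


body = build_user_search("poweruser", "張三")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Empty must_not and should arrays are left out here; they can be added to the bool object when a clause is needed.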

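For detailed usage, see:

[Elasticsearch] Various uses of query_string

The differences between match, match_phrase, query_string, and term in Elasticsearch

Elasticsearch Query DSL: a summary

Bool Query

Finally, a guide to the official ES API documentation:

Indices APIs: create, delete, get, and check the existence of an index.

Document APIs: index (create), delete, and get index documents.

Search APIs: search index documents. The Document APIs look documents up by doc_id, while the Search APIs query by condition.

Aggregations: aggregate the documents of an index along various dimensions.

cat APIs: query assorted information about indices.

Cluster APIs: query assorted information about the cluster.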
