Elasticsearch: A First Look at Queries

Date: 2019-12-12

Tags: elasticsearch, queries. Column: log analysis

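This section covers several common Elasticsearch queries. I hope that when I come back to this post while actually using them, their meaning will be quicker to grasp.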

Everything in this post was tried on Elasticsearch 2.3.3.

Let's start today's topic by looking up the user 小蒼蒼:

1. Method one (search across all fields)

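Since we already know that we want to query the name field, this approach is not recommended, and its results are not precise either.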

curl 'http://127.0.0.1:9200/synctest/article/_search?q=小蒼蒼'
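Because the target field is already known, the URI search can be narrowed with the `field:value` form of the `q` parameter. The following is only a sketch of how such a URL could be built in Python; the helper name `uri_search_url` is hypothetical, and the host, index, and type are the ones used in this post.

```python
from urllib.parse import urlencode

def uri_search_url(base, index, doc_type, field, value):
    """Build a URI-search URL whose q parameter targets a single field."""
    # urlencode percent-encodes the non-ASCII value as UTF-8
    query = urlencode({"q": f"{field}:{value}"})
    return f"{base}/{index}/{doc_type}/_search?{query}"

url = uri_search_url("http://127.0.0.1:9200", "synctest", "article", "name", "小蒼蒼")
print(url)
```

The resulting URL is plain ASCII, so it can be pasted into curl without worrying about the shell or terminal encoding.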

2. Method two (term: contains an exact value)

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' \
-d '{ "filter":{ "term":{ "name":"小蒼蒼" } } }'

The general rule: use query clauses for full-text search, or for any search where the relevance score should be affected. Use filters for everything else.

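The recommended pattern is query + filter: the filter part is cached, and scoring is then applied on top of the filtered documents. We will meet this combination below.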

Note that term here means "contains this exact value", so a document with "name":["小蒼蒼","小衣衣"] also satisfies the condition.
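The "contains" behaviour can be illustrated with a toy model in Python. This is not how Elasticsearch is implemented; it assumes the name field holds exact, unanalyzed values, and `term_matches` is a made-up helper for illustration.

```python
def term_matches(field_value, term):
    """Toy model of a term filter: true if the field contains the exact value."""
    # A field may hold a single value or a list of values
    values = field_value if isinstance(field_value, list) else [field_value]
    return term in values

docs = [
    {"name": "小蒼蒼"},
    {"name": ["小蒼蒼", "小衣衣"]},
    {"name": "小衣衣"},
]
hits = [d for d in docs if term_matches(d["name"], "小蒼蒼")]
print(len(hits))
```

Both the single-valued and the multi-valued document are returned, which is the behaviour described above.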

3. Method three (query term)

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' \
-d '{ "query":{ "term":{ "name":"小蒼蒼" } } }'

{
  "hits" : {
    "total" : 1,
    "max_score" : 0.30685282,
    "hits" : [ {
      "_index" : "synctest",
      "_type" : "article",
      "_id" : "1",
      "_score" : 0.30685282,
      "_source" : {
        "name" : "小蒼蒼"
      }
    } ]
  }
}

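By default a query term clause also computes a score. If you do not need scoring, you can turn it off, which gives better performance and cacheability.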

4. Method four (filtered filter, scoring disabled)

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' \
-d '{ "query":{ "filtered":{ "filter":{ "term":{ "name":"小蒼蒼" } } } } }'

{
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "synctest",
      "_type" : "article",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : {
        "name" : "小蒼蒼"
      }
    } ]
  }
}


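A filter does not compute a score, and its matching documents can be cached. So when you do not need scoring, using a filter to look up 小蒼蒼 improves search performance in most scenarios.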

You can also use constant_score to turn scoring off:

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' \
-d ' { "query":{ "constant_score":{ "filter":{ "term":{ "name":"小蒼蒼" } } } } } '
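When such request bodies are generated from code rather than typed by hand, a small builder keeps the nesting straight. A minimal sketch in Python, where `constant_score_term` is a hypothetical helper name and only the standard json module is used:

```python
import json

def constant_score_term(field, value):
    """Wrap a term filter in constant_score so no relevance score is computed."""
    return {"query": {"constant_score": {"filter": {"term": {field: value}}}}}

# ensure_ascii=False keeps the Chinese value readable in the output
body = json.dumps(constant_score_term("name", "小蒼蒼"), ensure_ascii=False)
print(body)
```

The printed string can be passed to curl with -d, exactly like the hand-written body above.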

Combining multiple conditions

1. select * from article where name in ("小蒼蒼","小衣衣");

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' \
 -d '{ "query":{ "constant_score":{ "filter":{ "terms":{ "name":[ "小蒼蒼", "小衣衣" ] } } } } }'

What if we want to fetch a particular user from 2002? In other words, how do we express different OR and AND conditions?

For that we need a more complex query, the compound (bool) filter:

{
   "bool" : {
      "must" :     [],  #AND
      "should" :   [],  #OR
      "must_not" : []   #NOT
   }
}

must: all of the clauses must match; equivalent to AND.
must_not: none of the clauses may match; equivalent to NOT.
should: at least one of the clauses must match; equivalent to OR.
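The three clause types can be pictured with a toy evaluator in Python. It is only a model of the boolean logic, not of Elasticsearch itself: each clause is a plain predicate function, the sample documents are invented, and it assumes that should is only mandatory when there is no must clause.

```python
def bool_matches(doc, must=(), should=(), must_not=()):
    """Toy model of a bool clause; every clause is a predicate on a document."""
    if not all(clause(doc) for clause in must):        # AND
        return False
    if any(clause(doc) for clause in must_not):        # NOT
        return False
    # Assumption: when there is no must clause, at least one should has to match
    if should and not must:
        return any(clause(doc) for clause in should)   # OR
    return True

docs = [
    {"year": 2002, "name": "小蒼蒼", "job": "student"},
    {"year": 2005, "name": "麒麟臂", "job": "teacher"},
    {"year": 2005, "name": "小衣衣", "job": "student"},
]
# (year = 2002 OR name = '麒麟臂') AND NOT job = 'teacher'
hits = [
    d for d in docs
    if bool_matches(
        d,
        should=[lambda d: d["year"] == 2002, lambda d: d["name"] == "麒麟臂"],
        must_not=[lambda d: d["job"] == "teacher"],
    )
]
print([d["name"] for d in hits])
```

Only the first document survives: the second is excluded by must_not, and the third satisfies none of the should clauses.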

select * from article where year=2002 and name like '%蒼天空%'

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' \
 -d '{ "query":{ "bool":{ "must":[ { "term":{ "year":2002 } }, { "match":{ "user_name":"蒼天空" } } ] } } }'

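Saying that match equals LIKE is not accurate: the result of this fuzzy lookup depends on the analyzer configured for the field. To disable scoring, query can be replaced with filter.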
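A toy comparison in Python shows why match and LIKE can disagree. It assumes an analyzer that emits one token per character for Chinese text and a match that succeeds when any query token is found; both are simplifying assumptions for illustration, and the real outcome depends on the analyzer configured for the field.

```python
def analyze(text):
    """Toy analyzer: one token per character (an assumption for CJK text)."""
    return [ch for ch in text if not ch.isspace()]

def match(field_text, query_text):
    """Toy match query: true if any query token occurs among the field tokens."""
    field_tokens = set(analyze(field_text))
    return any(token in field_tokens for token in analyze(query_text))

def sql_like_contains(field_text, pattern):
    """Counterpart of LIKE '%pattern%': a plain substring test."""
    return pattern in field_text

# The field value shares two characters with the query but not the whole string
print(match("天空之城", "蒼天空"), sql_like_contains("天空之城", "蒼天空"))
```

Under these assumptions the match succeeds while the LIKE-style substring test fails, so the two are not interchangeable.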

select * from article where (year=2002 or name='麒麟臂') and name not like '%蒼天空%'

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' -d ' { "query":{ "bool":{ "should":[ { "term":{ "year":2002 } }, { "term":{ "name":"麒麟臂" } } ], "must_not":{ "match":{ "user_name":"蒼天空" } } } } }'

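Notice that must_not is not written as an array here, because we only have a single condition. When there are several conditions, must_not (like must and should) can be written as an array.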

For example (pay attention to the syntax only):

{
    "query":{
        "bool":{
            "should":[
                {
                    "term":{
                        "year":2002
                    }
                },
                {
                    "term":{
                        "name":"麒麟臂"
                    }
                }
            ],
            "must_not":[
                {
                    "match":{
                        "user_name":"蒼天空"
                    }
                },
                {
                    "term":{
                        "job":"teacher"
                    }
                }
            ]
        }
    }
}

A more flexible should

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' -d ' { "query":{ "bool":{ "should":[ { "term":{ "id":1 } }, { "match":{ "user_name":"蒼天空" } }, { "match":{ "nick_name":"小蒼蒼" } } ], "minimum_should_match":2 } } } '

minimum_should_match = 2 means that at least two of the should clauses must match. If you do not need scoring, you can simply replace the outermost query with filter.
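The counting rule behind minimum_should_match can be sketched in a few lines of Python. The predicates below are simple stand-ins for the term and match clauses of the query above, and the two sample documents are invented for illustration.

```python
def should_satisfied(doc, should, minimum_should_match=1):
    """True if at least minimum_should_match of the should predicates hold."""
    matched = sum(1 for clause in should if clause(doc))
    return matched >= minimum_should_match

should = [
    lambda d: d["id"] == 1,
    lambda d: "蒼天空" in d["user_name"],   # stand-in for the match on user_name
    lambda d: "小蒼蒼" in d["nick_name"],   # stand-in for the match on nick_name
]
doc_a = {"id": 1, "user_name": "蒼天空", "nick_name": "other"}   # 2 of 3 match
doc_b = {"id": 7, "user_name": "蒼天空", "nick_name": "other"}   # 1 of 3 matches
print(should_satisfied(doc_a, should, 2), should_satisfied(doc_b, should, 2))
```

With a threshold of 2, doc_a qualifies and doc_b does not, even though both satisfy at least one clause.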

There is one more pattern that is very useful in practice: combining a query with a filter has a real advantage. Consider the following case:

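Sometimes we need an analyzed (full-text) match and an exact term lookup together. In that case we want the term part to be cached without taking part in scoring, while the match part ranks the results by how well they match.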

{
    "query":{
        "bool":{
            "must":[
                {
                    "match":{
                        "user_name":"小倉鼠"
                    }
                },
                {
                    "term":{
                        "id":1
                    }
                }
            ]
        }
    }
}

The query above is not the best solution, because the term clause takes part in scoring. Let's optimize it:

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' -d ' { "query":{ "bool":{ "must":[ { "match":{ "user_name":"小蒼蒼" } } ], "filter":{ "term":{ "id":1 } } } } } '

Looking at the max_score value, we can see that only the user_name match was scored. This is important, because ES can execute the filter first and cache it.
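The rewrite from the first form to the second is mechanical, so it can be sketched as a small Python function. `move_terms_to_filter` is a hypothetical helper, and it assumes the body has the query.bool.must shape used in this post.

```python
import copy

def move_terms_to_filter(body):
    """Return a copy of a bool query with term clauses moved from must to filter."""
    result = copy.deepcopy(body)
    bool_query = result["query"]["bool"]
    scored, unscored = [], []
    for clause in bool_query.get("must", []):
        # term clauses are exact yes/no checks, so they do not need a score
        (unscored if "term" in clause else scored).append(clause)
    existing = bool_query.get("filter", [])
    if isinstance(existing, dict):          # a single filter may be a bare object
        existing = [existing]
    bool_query["must"] = scored
    if existing or unscored:
        bool_query["filter"] = existing + unscored
    return result

original = {"query": {"bool": {"must": [
    {"match": {"user_name": "小蒼蒼"}},
    {"term": {"id": 1}},
]}}}
optimized = move_terms_to_filter(original)
print(optimized)
```

The function works on a deep copy, so the original body stays untouched and both versions can be sent for comparison.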

Range queries

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' -d ' { "query":{ "constant_score":{ "filter":{ "range":{ "id":{ "gte":1, "lte":4 } } } } } } '
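The bounds in the range filter are inclusive for gte/lte and exclusive for gt/lt. A toy check in Python makes the difference concrete; `in_range` is a made-up helper and the id list is sample data.

```python
def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Toy check mirroring the range bounds: gte/lte inclusive, gt/lt exclusive."""
    if gte is not None and value < gte:
        return False
    if gt is not None and value <= gt:
        return False
    if lte is not None and value > lte:
        return False
    if lt is not None and value >= lt:
        return False
    return True

ids = [0, 1, 2, 4, 5]
selected = [i for i in ids if in_range(i, gte=1, lte=4)]
print(selected)
```

With gte=1 and lte=4 both end points are kept; switching to gt/lt would drop them.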

Finish: pagination and returning specific fields

curl 'http://127.0.0.1:9200/synctest/article/_search?pretty' -d ' { "from":1, "size":1, "query":{ "terms":{ "id":[ 1, 2, 6, 9, 15 ] } }, "sort":{ "id":{ "order":"desc" } } } '

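We used from + size for pagination. Note that from + size pagination in ES has limitations, which we will cover later. We also used sort to order the results by id in descending order.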
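Translating a page number into from and size is simple arithmetic. The Python sketch below also guards against deep pages; the limit of 10000 is an assumption based on the default of the index.max_result_window setting, and `page_params` is a hypothetical helper.

```python
MAX_RESULT_WINDOW = 10000  # assumed default of index.max_result_window

def page_params(page, size):
    """Translate a 1-based page number into from/size for a search body."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    start = (page - 1) * size
    if start + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window")
    return {"from": start, "size": size}

print(page_params(2, 1))
```

Page 2 with a page size of 1 gives the same from and size values as the request above.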

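In database work we also often return only certain fields, so try to avoid select * here as well. The _source parameter lists the fields to return: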

{
    "from":1,
    "size":1,
    "_source":[
        "id",
        "name"
    ],
    "query":{
        "terms":{
            "id":[
                1,
                2,
                6,
                9,
                15
            ]
        }
    },
    "sort":{
        "id":{
            "order":"desc"
        }
    }
}
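The effect of the _source list on each hit can be pictured as a simple projection. The Python sketch below only models top-level field names; the sample document and the helper name `apply_source_filter` are made up for illustration.

```python
def apply_source_filter(source, fields):
    """Keep only the listed top-level fields of a document's _source."""
    return {key: value for key, value in source.items() if key in fields}

doc = {"id": 9, "name": "小蒼蒼", "year": 2002}   # sample document
print(apply_source_filter(doc, ["id", "name"]))
```

Fields that are not listed, such as year here, are simply left out of the returned _source.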