Elasticsearch, the Big Data Power Tool: Full-Text Queries, the match_bool_prefix Query

The Elasticsearch articles in this series are based on version 7.4.2.

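A match_bool_prefix query:

  1. analyzes the input text into tokens;
  2. then constructs a bool query from those tokens;
  3. uses a term query for every token except the last one;
  4. uses a prefix query for the last token.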
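
The steps above can be sketched locally in Python. This is only an illustration, not how Elasticsearch implements the query: `analyze` is a hypothetical stand-in for the field's analyzer (it just lowercases and splits on non-word characters), and `build_match_bool_prefix` is a made-up helper name.

```python
import re


def analyze(text):
    """Crude stand-in for an analyzer (assumption): lowercase, split on non-word characters."""
    return [tok for tok in re.split(r"[^\w']+", text.lower()) if tok]


def build_match_bool_prefix(field, text):
    """Build the bool query that the steps above describe."""
    tokens = analyze(text)
    if not tokens:
        return {"bool": {"should": []}}
    # Every token except the last becomes a term clause.
    should = [{"term": {field: tok}} for tok in tokens[:-1]]
    # The last token becomes a prefix clause.
    should.append({"prefix": {field: tokens[-1]}})
    return {"bool": {"should": should}}


query = build_match_bool_prefix("my_text", "food p")
print(query)
```

For the input "food p" this produces one term clause for food and one prefix clause for p, both under should.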

Here is an example of match_bool_prefix.
Test data:

POST /match_test/_doc/1
{
  "my_text": "my Favorite food is cold porridge"
}

POST /match_test/_doc/2
{
  "my_text": "when it's cold my favorite food is porridge"
}

Run the match_bool_prefix query:

POST /match_test/_search
{
  "query": {
    "match_bool_prefix":{
      "my_text": {
        "query": "food p"
      }
    }
  }
}

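Query analysis:

  1. After analysis, "food p" becomes two tokens: food and p;
  2. the token food is used in a term query, and the token p is used in a prefix query;
  3. after analysis, my_text in both doc1 and doc2 contains the token food and a token starting with p (porridge), so both doc1 and doc2 match.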
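
The reasoning above can be reproduced with a small local simulation. It is a sketch under assumptions: `analyze` is a simplified stand-in for the analyzer, `clause_hits` is a hypothetical helper, and no scoring is modelled; it only checks which should clauses each document satisfies.

```python
import re


def analyze(text):
    """Simplified stand-in for the analyzer (assumption): lowercase and split."""
    return [tok for tok in re.split(r"[^\w']+", text.lower()) if tok]


docs = {
    "1": "my Favorite food is cold porridge",
    "2": "when it's cold my favorite food is porridge",
}


def clause_hits(doc_text, query_text):
    """One boolean per clause: exact token match for all but the last token, prefix match for the last."""
    doc_tokens = analyze(doc_text)
    query_tokens = analyze(query_text)
    if not query_tokens:
        return []
    hits = [tok in doc_tokens for tok in query_tokens[:-1]]
    hits.append(any(t.startswith(query_tokens[-1]) for t in doc_tokens))
    return hits


# A document matches when at least one should clause is satisfied.
matched = [doc_id for doc_id, text in docs.items() if any(clause_hits(text, "food p"))]
print(matched)  # ['1', '2']
```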

It is therefore equivalent to the following bool query:

POST /match_test/_search
{
    "query": {
        "bool" : {
            "should": [
                { "term": { "my_text": "food" }},
                { "prefix": { "my_text": "p"}}
            ]
        }
    }
}

Returned data:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.3147935,
    "hits" : [
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.3147935,
        "_source" : {
          "my_text" : "my Favorite food is cold porridge"
        }
      },
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.2816185,
        "_source" : {
          "my_text" : "when it's cold my favorite food is porridge"
        }
      }
    ]
  }
}


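Other parameters:
match_bool_prefix supports the minimum_should_match and operator parameters; only documents that satisfy the minimum number of matching clauses are returned. You can also specify the analyzer to use at query time. By default, the analyzer of the queried field is used; if an analyzer is specified, that analyzer is used in the analysis step instead.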

POST /match_test/_search
{
  "query": {
    "match_bool_prefix":{
      "my_text": {
        "query": "favorite Food p",
        "minimum_should_match": 2,
        "analyzer": "standard"
      }
    }
  }
}

This is equivalent to the following bool query:

POST /match_test/_search
{
    "query": {
        "bool" : {
            "should": [
                { "term": { "my_text": "favorite" }},
                { "term": { "my_text": "food" }},
                { "prefix": { "my_text": "p"}}
            ],
            "minimum_should_match": 2
        }
    }
}
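
To make the effect of these parameters concrete, here is a rough local model in Python. It rests on several assumptions: `analyze` is a simplified stand-in for the standard analyzer, `matches` is a hypothetical helper, only integer values of minimum_should_match are modelled, and operator "and" is modelled as requiring every clause to match.

```python
import re


def analyze(text):
    """Simplified stand-in for the standard analyzer (assumption)."""
    return [tok for tok in re.split(r"[^\w']+", text.lower()) if tok]


def matches(doc_text, query_text, minimum_should_match=1, operator="or"):
    """Decide whether a document satisfies the expanded bool query."""
    doc_tokens = analyze(doc_text)
    query_tokens = analyze(query_text)
    if not query_tokens:
        return False
    # Term clauses for all tokens but the last, a prefix clause for the last.
    hits = [tok in doc_tokens for tok in query_tokens[:-1]]
    hits.append(any(t.startswith(query_tokens[-1]) for t in doc_tokens))
    if operator == "and":
        return all(hits)  # every clause is required
    return sum(hits) >= minimum_should_match


doc = "my Favorite food is cold porridge"
print(matches(doc, "favorite Food p", minimum_should_match=2))  # True
print(matches(doc, "favorite tea p", minimum_should_match=2))   # True (favorite and p match)
print(matches(doc, "favorite tea x", minimum_should_match=2))   # False (only favorite matches)
print(matches(doc, "favorite tea p", operator="and"))           # False (tea does not match)
```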

Returned data:

{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.9045854,
    "hits" : [
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.9045854,
        "_source" : {
          "my_text" : "my Favorite food is cold porridge"
        }
      },
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.8092544,
        "_source" : {
          "my_text" : "when it's cold my favorite food is porridge"
        }
      }
    ]
  }
}

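Summary:

  1. match_bool_prefix analyzes the input text with the field's analyzer or with the analyzer specified by the user;
  2. every token except the last one is used in a term query, the last token is used in a prefix query, and all of these sub-queries are then placed in the should list of a bool query.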