23 Useful Elasticsearch Example Queries

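Elasticsearch is a search server built on Lucene. It is written in Java, released as open source under the Apache License, and is currently a popular enterprise search engine. This article introduces several commonly used Elasticsearch query types, each with an example. (Note: this article is translated from Tim Ojo's "23 Useful Elasticsearch Example Queries"; corrections to the translation are welcome.)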

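To illustrate the different query types in Elasticsearch, we will search a collection of book documents with the following fields: title, authors, summary, publish_date (release date), and num_reviews (number of reviews).
First, let's create a new index and index some documents using the bulk API: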

PUT /bookdb_index
{ "settings": { "number_of_shards": 1 }}

POST /bookdb_index/book/_bulk
{ "index": { "_id": 1 }}
{ "title": "Elasticsearch: The Definitive Guide", "authors": ["clinton gormley", "zachary tong"], "summary" : "A distibuted real-time search and analytics engine", "publish_date" : "2015-02-07", "num_reviews": 20, "publisher": "oreilly" }
{ "index": { "_id": 2 }}
{ "title": "Taming Text: How to Find, Organize, and Manipulate It", "authors": ["grant ingersoll", "thomas morton", "drew farris"], "summary" : "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization", "publish_date" : "2013-01-24", "num_reviews": 12, "publisher": "manning" }
{ "index": { "_id": 3 }}
{ "title": "Elasticsearch in Action", "authors": ["radu gheorge", "matthew lee hinman", "roy russo"], "summary" : "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms", "publish_date" : "2015-12-03", "num_reviews": 18, "publisher": "manning" }
{ "index": { "_id": 4 }}
{ "title": "Solr in Action", "authors": ["trey grainger", "timothy potter"], "summary" : "Comprehensive guide to implementing a scalable search engine using Apache Solr", "publish_date" : "2014-04-05", "num_reviews": 23, "publisher": "manning" }

Examples

Basic Match Query

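There are two ways to execute a basic full-text (match) query: the Search Lite API, which expects all search parameters to be passed as part of the URL, or a full JSON request body, which lets you use the complete Elasticsearch DSL.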

The following basic match query searches for the string "guide" in all fields:

GET /bookdb_index/book/_search?q=guide

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.28168046,

        "_source": {

          "title": "Elasticsearch: The Definitive Guide",

          "authors": [

            "clinton gormley",

            "zachary tong"

          ],

          "summary": "A distibuted real-time search and analytics engine",

          "publish_date": "2015-02-07",

          "num_reviews": 20,

          "publisher": "oreilly"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.24144039,

        "_source": {

          "title": "Solr in Action",

          "authors": [

            "trey grainger",

            "timothy potter"

          ],

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "publish_date": "2014-04-05",

          "num_reviews": 23,

          "publisher": "manning"

        }

      }

    ]

The full-body version of this query is shown below and produces the same results as the search lite request above:

{

    "query": {

        "multi_match" : {

            "query" : "guide",

            "fields" : ["_all"]

        }

    }

}

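The multi_match keyword is used in place of the match keyword as a convenient shorthand for running the same query against multiple fields. The fields property specifies which fields to query; in this case, we want to query all fields in the document.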
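The two request styles described above can be compared side by side with a small client-side sketch. It only builds the URL and the JSON body and does not contact a cluster; the helper names `lite_search_url` and `full_body` are hypothetical and not part of any Elasticsearch client.

```python
import json
from urllib.parse import urlencode


def lite_search_url(index, doc_type, query):
    """Build a Search Lite URL; urlencode escapes spaces, ':' and similar characters."""
    return f"/{index}/{doc_type}/_search?" + urlencode({"q": query})


def full_body(query, fields):
    """Build the equivalent JSON request body using multi_match."""
    return json.dumps({"query": {"multi_match": {"query": query, "fields": fields}}})


url = lite_search_url("bookdb_index", "book", "guide")
body = full_body("guide", ["_all"])
```

One practical point this makes visible: a lite query containing a space, such as `title:in action`, has to be URL-encoded before it is sent, which is one reason the JSON body is usually easier to work with for anything non-trivial.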

Both APIs also allow you to specify which fields to search. For example, to search for books with the words "in Action" in the title field:

GET /bookdb_index/book/_search?q=title:in action

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.6259885,

        "_source": {

          "title": "Solr in Action",

          "authors": [

            "trey grainger",

            "timothy potter"

          ],

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "publish_date": "2014-04-05",

          "num_reviews": 23,

          "publisher": "manning"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 0.5975345,

        "_source": {

          "title": "Elasticsearch in Action",

          "authors": [

            "radu gheorge",

            "matthew lee hinman",

            "roy russo"

          ],

          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",

          "publish_date": "2015-12-03",

          "num_reviews": 18,

          "publisher": "manning"

        }

      }

    ]

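However, the full DSL gives you more flexibility in creating more complicated queries (as we will see later) and in specifying how the results should be returned. In the example below, we specify the number of results to return, the offset to start from (useful for pagination), the document fields to return, and term highlighting.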

POST /bookdb_index/book/_search

{

    "query": {

        "match" : {

            "title" : "in action"

        }

    },

    "size": 2,

    "from": 0,

    "_source": [ "title", "summary", "publish_date" ],

    "highlight": {

        "fields" : {

            "title" : {}

        }

    }

}

[Results]

"hits": {

    "total": 2,

    "max_score": 0.9105287,

    "hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 0.9105287,

        "_source": {

          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",

          "title": "Elasticsearch in Action",

          "publish_date": "2015-12-03"

        },

        "highlight": {

          "title": [

            "Elasticsearch <em>in</em> <em>Action</em>"

          ]

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.9105287,

        "_source": {

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        },

        "highlight": {

          "title": [

            "Solr <em>in</em> <em>Action</em>"

          ]

        }

      }

    ]

  }

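Note: For multi-word queries, the match query lets you specify whether to use the and operator instead of the default or operator. You can also specify the minimum_should_match option to tweak the relevance of the returned results. Details can be found in the Elasticsearch guide.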
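As a rough illustration of what the operator and minimum_should_match options change, here is a toy model that runs over the four sample titles in plain Python. It is a sketch of the matching rule only: the tokenizer is a simple assumption, no scoring is done, and the function name `match` is hypothetical.

```python
import re

TITLES = {
    1: "Elasticsearch: The Definitive Guide",
    2: "Taming Text: How to Find, Organize, and Manipulate It",
    3: "Elasticsearch in Action",
    4: "Solr in Action",
}


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for a real analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match(query, operator="or", minimum_should_match=1):
    """Return ids of titles that satisfy the query under the given operator."""
    query_terms = tokenize(query)
    hits = []
    for doc_id, title in TITLES.items():
        title_terms = set(tokenize(title))
        found = sum(1 for term in query_terms if term in title_terms)
        # "and" requires every term; "or" requires at least minimum_should_match terms.
        needed = len(query_terms) if operator == "and" else minimum_should_match
        if found >= needed:
            hits.append(doc_id)
    return hits
```

With the query "elasticsearch action", the default or operator accepts any title containing either word, while and (or a minimum_should_match of 2) narrows the result to the one title containing both.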

Multi-field Search

As we have already seen, to search more than one field in a single query (for example, the same string in both title and summary), use the multi_match query:

POST /bookdb_index/book/_search

{

    "query": {

        "multi_match" : {

            "query" : "elasticsearch guide",

            "fields": ["title", "summary"]

        }

    }

}

[Results]

"hits": {

    "total": 3,

    "max_score": 0.9448582,

    "hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.9448582,

        "_source": {

          "title": "Elasticsearch: The Definitive Guide",

          "authors": [

            "clinton gormley",

            "zachary tong"

          ],

          "summary": "A distibuted real-time search and analytics engine",

          "publish_date": "2015-02-07",

          "num_reviews": 20,

          "publisher": "oreilly"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 0.17312013,

        "_source": {

          "title": "Elasticsearch in Action",

          "authors": [

            "radu gheorge",

            "matthew lee hinman",

            "roy russo"

          ],

          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",

          "publish_date": "2015-12-03",

          "num_reviews": 18,

          "publisher": "manning"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.14965448,

        "_source": {

          "title": "Solr in Action",

          "authors": [

            "trey grainger",

            "timothy potter"

          ],

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "publish_date": "2014-04-05",

          "num_reviews": 23,

          "publisher": "manning"

        }

      }

    ]

  }

Note: The third hit (_id 4) matches because the word "guide" appears in its summary.

Boosting

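Since we are searching across multiple fields, we may want to boost the scores in a certain field. In the contrived example below, we boost scores from the summary field by a factor of 3 to increase its importance, which in turn raises the relevance of document _id 4.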

POST /bookdb_index/book/_search

{

    "query": {

        "multi_match" : {

            "query" : "elasticsearch guide",

            "fields": ["title", "summary^3"]

        }

    },

    "_source": ["title", "summary", "publish_date"]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.31495273,

        "_source": {

          "summary": "A distibuted real-time search and analytics engine",

          "title": "Elasticsearch: The Definitive Guide",

          "publish_date": "2015-02-07"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.14965448,

        "_source": {

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 0.13094766,

        "_source": {

          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",

          "title": "Elasticsearch in Action",

          "publish_date": "2015-12-03"

        }

      }

    ]

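Note: Boosting does not simply mean that the calculated score is multiplied by the boost factor. The boost value that is actually applied goes through normalization and some internal optimization. For more information on how boosting works, see the Elasticsearch guide.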

Bool Query

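The AND/OR/NOT operators can be used to fine-tune search queries for more relevant or more specific results. The search API implements this as the bool query, which accepts a must parameter (equivalent to AND), a must_not parameter (equivalent to NOT), and a should parameter (equivalent to OR). For example, to search for a book with the word "Elasticsearch" OR "Solr" in the title, AND authored by "clinton gormley" but NOT by "radu gheorge":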

POST /bookdb_index/book/_search

{

    "query": {

        "bool": {

            "must": [

                { "bool" : { "should": [

                      { "match": { "title": "Elasticsearch" }},

                      { "match": { "title": "Solr" }} ] } },

                { "match": { "authors": "clinton gormley" }}

            ],

            "must_not": { "match": {"authors": "radu gheorge" }}

        }

    }

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.3672021,

        "_source": {

          "title": "Elasticsearch: The Definitive Guide",

          "authors": [

            "clinton gormley",

            "zachary tong"

          ],

          "summary": "A distibuted real-time search and analytics engine",

          "publish_date": "2015-02-07",

          "num_reviews": 20,

          "publisher": "oreilly"

        }

      }

    ]

Note: As you can see, a bool query can wrap any other query type, including other bool queries, to build arbitrarily complex or deeply nested queries.
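The must / should / must_not combination above can be read as ordinary boolean logic. The following sketch evaluates the same condition locally over the sample books; it is an illustration of the logic only, with a simplified tokenizer and hypothetical helper names, and it ignores relevance scoring entirely.

```python
import re

BOOKS = {
    1: {"title": "Elasticsearch: The Definitive Guide",
        "authors": ["clinton gormley", "zachary tong"]},
    2: {"title": "Taming Text: How to Find, Organize, and Manipulate It",
        "authors": ["grant ingersoll", "thomas morton", "drew farris"]},
    3: {"title": "Elasticsearch in Action",
        "authors": ["radu gheorge", "matthew lee hinman", "roy russo"]},
    4: {"title": "Solr in Action",
        "authors": ["trey grainger", "timothy potter"]},
}


def has_term(value, word):
    """True if `word` is one of the tokens of a string or list of strings."""
    values = value if isinstance(value, list) else [value]
    return any(word in re.findall(r"[a-z0-9]+", v.lower()) for v in values)


def bool_query(doc):
    # Inner bool with two should clauses: title contains "elasticsearch" OR "solr".
    title_ok = has_term(doc["title"], "elasticsearch") or has_term(doc["title"], "solr")
    # A match query on "clinton gormley" uses OR between its terms by default.
    author_ok = has_term(doc["authors"], "clinton") or has_term(doc["authors"], "gormley")
    # must_not: exclude anything matching "radu gheorge".
    excluded = has_term(doc["authors"], "radu") or has_term(doc["authors"], "gheorge")
    return title_ok and author_ok and not excluded


matching_ids = [doc_id for doc_id, doc in BOOKS.items() if bool_query(doc)]
```

Only book 1 satisfies all three parts, which agrees with the single hit in the results above.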

Fuzzy Queries

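Fuzzy matching can be enabled on match and multi_match queries to catch spelling errors. The degree of fuzziness is specified as a Levenshtein distance from the original word.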

POST /bookdb_index/book/_search

{

    "query": {

        "multi_match" : {

            "query" : "comprihensiv guide",

            "fields": ["title", "summary"],

            "fuzziness": "AUTO"

        }

    },

    "_source": ["title", "summary", "publish_date"],

    "size": 1

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.5961596,

        "_source": {

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        }

      }

    ]

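Note: A fuzziness value of "AUTO" is equivalent to specifying a value of 2 when the term length is greater than 5 characters. However, a setting of 1 may be sufficient and can improve overall search performance, since 80% of human misspellings have an edit distance of 1. See the "Typos and Misspellings" chapter of the Elasticsearch guide for details.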
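To make the edit-distance idea concrete, here is a minimal sketch of the classic Levenshtein distance together with the length thresholds that AUTO fuzziness uses by default (0 edits for terms of 1 to 2 characters, 1 edit for 3 to 5, 2 edits beyond that). This is plain Levenshtein; by default Elasticsearch also counts a transposition of two adjacent characters as a single edit, which this sketch does not model. The function names are illustrative.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def auto_fuzziness(term):
    """Maximum edits allowed for a term under the default AUTO thresholds."""
    length = len(term)
    if length <= 2:
        return 0
    if length <= 5:
        return 1
    return 2
```

The misspelled query term "comprihensiv" is two edits away from "comprehensive" (one substitution, one insertion) and is long enough for AUTO to allow two edits, which is why the query above still finds "Solr in Action".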

Wildcard Query

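Wildcard queries let you specify a pattern to match instead of an entire term. ? matches any single character and * matches zero or more characters. For example, to find all records that have an author whose name begins with the letter 't':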

POST /bookdb_index/book/_search

{

    "query": {

        "wildcard" : {

            "authors" : "t*"

        }

    },

    "_source": ["title", "authors"],

    "highlight": {

        "fields" : {

            "authors" : {}

        }

    }

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 1,

        "_source": {

          "title": "Elasticsearch: The Definitive Guide",

          "authors": [

            "clinton gormley",

            "zachary tong"

          ]

        },

        "highlight": {

          "authors": [

            "zachary <em>tong</em>"

          ]

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "2",

        "_score": 1,

        "_source": {

          "title": "Taming Text: How to Find, Organize, and Manipulate It",

          "authors": [

            "grant ingersoll",

            "thomas morton",

            "drew farris"

          ]

        },

        "highlight": {

          "authors": [

            "<em>thomas</em> morton"

          ]

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 1,

        "_source": {

          "title": "Solr in Action",

          "authors": [

            "trey grainger",

            "timothy potter"

          ]

        },

        "highlight": {

          "authors": [

            "<em>trey</em> grainger",

            "<em>timothy</em> potter"

          ]

        }

      }

    ]
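The highlights above show that the pattern is tested against individual indexed terms (first and last names separately), not against the whole author string. A local sketch with Python's `fnmatch` makes this visible. Treat it as an approximation: `fnmatch` also understands `[...]` character sets, which the Elasticsearch wildcard query does not, and whitespace splitting stands in for the real analyzer. The name `wildcard_hits` is hypothetical.

```python
from fnmatch import fnmatchcase

AUTHORS = {
    1: ["clinton gormley", "zachary tong"],
    2: ["grant ingersoll", "thomas morton", "drew farris"],
    3: ["radu gheorge", "matthew lee hinman", "roy russo"],
    4: ["trey grainger", "timothy potter"],
}


def wildcard_hits(pattern):
    """Map each matching document id to the author terms that match the pattern."""
    hits = {}
    for doc_id, names in AUTHORS.items():
        terms = [term for name in names for term in name.split()]
        matched = [term for term in terms if fnmatchcase(term, pattern)]
        if matched:
            hits[doc_id] = matched
    return hits
```

Calling `wildcard_hits("t*")` yields the same terms that are highlighted in the results above: tong, thomas, trey, and timothy.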

Regexp Query

Regexp queries let you specify more complex patterns than wildcard queries.

POST /bookdb_index/book/_search

{

    "query": {

        "regexp" : {

            "authors" : "t[a-z]*y"

        }

    },

    "_source": ["title", "authors"],

    "highlight": {

        "fields" : {

            "authors" : {}

        }

    }

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 1,

        "_source": {

          "title": "Solr in Action",

          "authors": [

            "trey grainger",

            "timothy potter"

          ]

        },

        "highlight": {

          "authors": [

            "<em>trey</em> grainger",

            "<em>timothy</em> potter"

          ]

        }

      }

    ]
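One detail worth knowing is that a regexp query pattern must match an entire term, as if it were anchored at both ends. The sketch below imitates that with `re.fullmatch` on the sample author terms. Lucene's regular expression syntax is not identical to Python's `re`, so this only illustrates the anchoring behavior for a simple pattern; `regexp_hits` is a hypothetical helper.

```python
import re

AUTHORS = {
    1: ["clinton gormley", "zachary tong"],
    2: ["grant ingersoll", "thomas morton", "drew farris"],
    3: ["radu gheorge", "matthew lee hinman", "roy russo"],
    4: ["trey grainger", "timothy potter"],
}


def regexp_hits(pattern):
    """Map each matching document id to the terms fully matched by the pattern."""
    compiled = re.compile(pattern)
    hits = {}
    for doc_id, names in AUTHORS.items():
        terms = [term for name in names for term in name.split()]
        matched = [term for term in terms if compiled.fullmatch(term)]
        if matched:
            hits[doc_id] = matched
    return hits
```

With `t[a-z]*y`, only terms that start with "t" and end with "y" qualify, so "trey" and "timothy" match while "tong" and "thomas" do not.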

Match Phrase Query

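The match phrase query requires that all the terms in the query string be present in the document, in the order specified in the query string, and close to each other. By default the terms must be exactly adjacent, but you can specify a slop value that indicates how far apart the terms may be while the document is still considered a match.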

POST /bookdb_index/book/_search

{

    "query": {

        "multi_match" : {

            "query": "search engine",

            "fields": ["title", "summary"],

            "type": "phrase",

            "slop": 3

        }

    },

    "_source": [ "title", "summary", "publish_date" ]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.22327082,

        "_source": {

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.16113183,

        "_source": {

          "summary": "A distibuted real-time search and analytics engine",

          "title": "Elasticsearch: The Definitive Guide",

          "publish_date": "2015-02-07"

        }

      }

    ]

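Note: In the example above, for a non-phrase query, document _id 1 would normally score higher and appear ahead of document _id 4 because its field length is shorter. As a phrase query, however, the proximity of the terms is also taken into account, so document _id 4 scores better.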
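The effect of slop can be approximated with a simplified position model: shift each matched position back by the term's index in the phrase, and the spread of the shifted positions is the slop the match needs. This is a sketch under assumptions (a naive tokenizer, no repeated phrase terms) and not Lucene's actual sloppy phrase scorer; `min_slop` is a hypothetical helper.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for a real analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def min_slop(phrase, text):
    """Smallest slop at which `phrase` would match `text`, or None if a term is absent."""
    terms = tokenize(phrase)
    tokens = tokenize(text)
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    if any(not found for found in positions):
        return None
    best = None
    for combo in product(*positions):
        # For an exact phrase, every shifted position is identical (spread of 0).
        shifted = [pos - k for k, pos in enumerate(combo)]
        spread = max(shifted) - min(shifted)
        best = spread if best is None else min(best, spread)
    return best


SUMMARY_1 = "A distibuted real-time search and analytics engine"
SUMMARY_4 = "Comprehensive guide to implementing a scalable search engine using Apache Solr"
```

Under this model the summary of document 4 contains "search engine" as an exact phrase (slop 0), while document 1 needs a slop of 2 because two other words sit between "search" and "engine". Both fit within the slop of 3 used in the query, and the tighter match in document 4 is consistent with its higher score.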

Match Phrase Prefix

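Match phrase prefix queries provide search-as-you-type, or a poor man's version of autocomplete, at query time without any need to prepare your data. Like the match_phrase query, it accepts a slop parameter to make word order and relative positions less rigid. It also accepts the max_expansions parameter to limit the number of terms matched and so reduce resource intensity.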

POST /bookdb_index/book/_search

{

    "query": {

        "match_phrase_prefix" : {

            "summary": {

                "query": "search en",

                "slop": 3,

                "max_expansions": 10

            }

        }

    },

    "_source": [ "title", "summary", "publish_date" ]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.5161346,

        "_source": {

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.37248808,

        "_source": {

          "summary": "A distibuted real-time search and analytics engine",

          "title": "Elasticsearch: The Definitive Guide",

          "publish_date": "2015-02-07"

        }

      }

    ]

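Note: Using the match_phrase_prefix query for search-as-you-type has a performance cost at query time. A better option is index-time search-as-you-type; see the Completion Suggester API or the use of Edge-Ngram filters for more information.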

Query String

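The query_string query provides a means of executing multi_match queries, bool queries, boosting, fuzzy matching, wildcards, regexp, and range queries in a concise shorthand syntax. In the following example, we execute a fuzzy search for the terms "search algorithm" in which one of the book authors is "grant ingersoll" or "tom morton". We search all fields but apply a boost of 2 to the summary field.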

POST /bookdb_index/book/_search

{

    "query": {

        "query_string" : {

            "query": "(saerch~1 algorithm~1) AND (grant ingersoll)  OR (tom morton)",

            "fields": ["_all", "summary^2"]

        }

    },

    "_source": [ "title", "summary", "authors" ],

    "highlight": {

        "fields" : {

            "summary" : {}

        }

    }

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "2",

        "_score": 0.14558059,

        "_source": {

          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",

          "title": "Taming Text: How to Find, Organize, and Manipulate It",

          "authors": [

            "grant ingersoll",

            "thomas morton",

            "drew farris"

          ]

        },

        "highlight": {

          "summary": [

            "organize text using approaches such as full-text <em>search</em>, proper name recognition, clustering, tagging, information extraction, and summarization"

          ]

        }

      }

    ]

Simple Query String

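The simple_query_string query is a version of the query_string query that is better suited to a single search box exposed to users. It replaces AND/OR/NOT with +/|/- respectively, and it discards invalid parts of a query instead of throwing an exception when the user makes a mistake.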

POST /bookdb_index/book/_search

{

    "query": {

        "simple_query_string" : {

            "query": "(saerch~1 algorithm~1) + (grant ingersoll)  | (tom morton)",

            "fields": ["_all", "summary^2"]

        }

    },

    "_source": [ "title", "summary", "authors" ],

    "highlight": {

        "fields" : {

            "summary" : {}

        }

    }

}

Term/Terms Query

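The examples above are all full-text searches. Sometimes, though, we are more interested in a structured search that finds exact matches and returns the results. The term and terms queries help here. In the example below, we search for all books in our index published by Manning Publications.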

POST /bookdb_index/book/_search

{

    "query": {

        "term" : {

            "publisher": "manning"

        }

    },

    "_source" : ["title","publish_date","publisher"]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "2",

        "_score": 1.2231436,

        "_source": {

          "publisher": "manning",

          "title": "Taming Text: How to Find, Organize, and Manipulate It",

          "publish_date": "2013-01-24"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 1.2231436,

        "_source": {

          "publisher": "manning",

          "title": "Elasticsearch in Action",

          "publish_date": "2015-12-03"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 1.2231436,

        "_source": {

          "publisher": "manning",

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        }

      }

    ]

Multiple terms can be specified by using the terms keyword instead and passing in an array of search terms.

{

    "query": {

        "terms" : {

            "publisher": ["oreilly", "packt"]

        }

    }

}

Term Query - Sorted

Term query results (like any other query results) can easily be sorted. Multi-level sorting is also allowed:

POST /bookdb_index/book/_search

{

    "query": {

        "term" : {

            "publisher": "manning"

        }

    },

    "_source" : ["title","publish_date","publisher"],

    "sort": [

        { "publish_date": {"order":"desc"}},

        { "title": { "order": "desc" }}

    ]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": null,

        "_source": {

          "publisher": "manning",

          "title": "Elasticsearch in Action",

          "publish_date": "2015-12-03"

        },

        "sort": [

          1449100800000,

          "in"

        ]

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": null,

        "_source": {

          "publisher": "manning",

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        },

        "sort": [

          1396656000000,

          "solr"

        ]

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "2",

        "_score": null,

        "_source": {

          "publisher": "manning",

          "title": "Taming Text: How to Find, Organize, and Manipulate It",

          "publish_date": "2013-01-24"

        },

        "sort": [

          1358985600000,

          "to"

        ]

      }

    ]
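The values in each "sort" array above can look puzzling at first. The sketch below reproduces them locally under two assumptions: dates are stored as milliseconds since the Unix epoch in UTC, and a descending sort on an analyzed text field uses the largest of the field's terms as its sort key. The tokenizer and helper names are illustrative.

```python
import re
from datetime import datetime, timezone

BOOKS = {
    2: ("Taming Text: How to Find, Organize, and Manipulate It", "2013-01-24"),
    3: ("Elasticsearch in Action", "2015-12-03"),
    4: ("Solr in Action", "2014-04-05"),
}


def epoch_millis(date_str):
    """Milliseconds since 1970-01-01 UTC for a yyyy-MM-dd date."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def max_term(title):
    """Largest term of the title in lexicographic order."""
    return max(re.findall(r"[a-z0-9]+", title.lower()))


def sort_key(doc_id):
    title, publish_date = BOOKS[doc_id]
    return (epoch_millis(publish_date), max_term(title))


# Two-level descending sort: publish_date first, then title.
ordered_ids = sorted(BOOKS, key=sort_key, reverse=True)
```

This explains why "Elasticsearch in Action" carries the sort value "in" rather than its full title: the title field is analyzed into separate terms, so sorting on it rarely gives the alphabetical order a reader expects. Sorting by title usually calls for a separate non-analyzed copy of the field.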

Range Query

Another structured query example is the range query. In this example, we search for books published in 2015:

POST /bookdb_index/book/_search

{

    "query": {

        "range" : {

            "publish_date": {

                "gte": "2015-01-01",

                "lte": "2015-12-31"

            }

        }

    },

    "_source" : ["title","publish_date","publisher"]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 1,

        "_source": {

          "publisher": "oreilly",

          "title": "Elasticsearch: The Definitive Guide",

          "publish_date": "2015-02-07"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 1,

        "_source": {

          "publisher": "manning",

          "title": "Elasticsearch in Action",

          "publish_date": "2015-12-03"

        }

      }

    ]

Note: Range queries work on date, number, and string type fields.

Filtered Query

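Filtered queries let you filter the results of a query. In this example, we query for books with the term "Elasticsearch" in the title or summary, but filter the results to only those with 20 or more reviews.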

POST /bookdb_index/book/_search

{

    "query": {

        "filtered": {

            "query" : {

                "multi_match": {

                    "query": "elasticsearch",

                    "fields": ["title","summary"]

                }

            },

            "filter": {

                "range" : {

                    "num_reviews": {

                        "gte": 20

                    }

                }

            }

        }

    },

    "_source" : ["title","summary","publisher", "num_reviews"]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.5955761,

        "_source": {

          "summary": "A distibuted real-time search and analytics engine",

          "publisher": "oreilly",

          "num_reviews": 20,

          "title": "Elasticsearch: The Definitive Guide"

        }

      }

    ]

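Note: Filtered queries do not require a query to filter on. If no query is specified, a match_all query is run, which returns all documents in the index, and these are then filtered. In practice the filter is run first, which reduces the surface area that needs to be queried. In addition, the filter is cached after its first use, which makes it very efficient.

Update: The filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0 in favor of the bool query. The same example rewritten as a bool query is shown below; it returns the same results.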

POST /bookdb_index/book/_search

{

    "query": {

        "bool": {

            "must" : {

                "multi_match": {

                    "query": "elasticsearch",

                    "fields": ["title","summary"]

                }

            },

            "filter": {

                "range" : {

                    "num_reviews": {

                        "gte": 20

                    }

                }

            }

        }

    },

    "_source" : ["title","summary","publisher", "num_reviews"]

}

The same applies to the filtered queries in the examples that follow.

Multiple Filters

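Multiple filters can be combined through the bool filter. In the next example, the filter requires that the returned results have at least 20 reviews, were not published before 2015, and should be published by oreilly.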

POST /bookdb_index/book/_search

{

    "query": {

        "filtered": {

            "query" : {

                "multi_match": {

                    "query": "elasticsearch",

                    "fields": ["title","summary"]

                }

            },

            "filter": {

                "bool": {

                    "must": {

                        "range" : { "num_reviews": { "gte": 20 } }

                    },

                    "must_not": {

                        "range" : { "publish_date": { "lte": "2014-12-31" } }

                    },

                    "should": {

                        "term": { "publisher": "oreilly" }

                    }

                }

            }

        }

    },

    "_source" : ["title","summary","publisher", "num_reviews", "publish_date"]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.5955761,

        "_source": {

          "summary": "A distibuted real-time search and analytics engine",

          "publisher": "oreilly",

          "num_reviews": 20,

          "title": "Elasticsearch: The Definitive Guide",

          "publish_date": "2015-02-07"

        }

      }

    ]

Function Score: Field Value Factor

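There may be cases where you want to factor the value of a particular field in your document into the calculation of the relevance score. This is typical in scenarios where you want to boost the relevance of a document based on its popularity. In our example, we would like the more popular books (as judged by the number of reviews) to be boosted. This is possible using the field_value_factor function score: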

POST /bookdb_index/book/_search

{

    "query": {

        "function_score": {

            "query": {

                "multi_match" : {

                    "query" : "search engine",

                    "fields": ["title", "summary"]

                }

            },

            "field_value_factor": {

                "field" : "num_reviews",

                "modifier": "log1p",

                "factor" : 2

            }

        }

    },

    "_source": ["title", "summary", "publish_date", "num_reviews"]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.44831306,

        "_source": {

          "summary": "A distibuted real-time search and analytics engine",

          "num_reviews": 20,

          "title": "Elasticsearch: The Definitive Guide",

          "publish_date": "2015-02-07"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.3718407,

        "_source": {

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "num_reviews": 23,

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 0.046479136,

        "_source": {

          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",

          "num_reviews": 18,

          "title": "Elasticsearch in Action",

          "publish_date": "2015-12-03"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "2",

        "_score": 0.041432835,

        "_source": {

          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",

          "num_reviews": 12,

          "title": "Taming Text: How to Find, Organize, and Manipulate It",

          "publish_date": "2013-01-24"

        }

      }

    ]

Note 1: We could have just run a regular multi_match query and sorted by the num_reviews field, but then we would lose the benefits of relevance scoring.

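Note 2: A number of additional parameters tweak the extent of the boosting effect on the original relevance score, such as "modifier", "factor", and "boost_mode". These are explored in detail in the Elasticsearch guide.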
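For intuition about what "modifier" and "factor" do, the sketch below computes the function value the way the log1p modifier is documented: multiply the field value by the factor, add 1, and take the base-10 logarithm. By default the result is then multiplied into the query score. The function names are illustrative, and because query scores depend on the index, the sketch is not expected to reproduce the exact scores listed above.

```python
import math


def field_value_factor(value, factor=1.0, modifier="log1p"):
    """Function value for a numeric field under the given factor and modifier."""
    scaled = factor * value
    if modifier == "none":
        return scaled
    if modifier == "log1p":
        return math.log10(scaled + 1)
    raise ValueError(f"modifier not covered by this sketch: {modifier}")


def boosted_score(query_score, num_reviews):
    """Combine query score and function value with the default multiply mode."""
    return query_score * field_value_factor(num_reviews, factor=2)


# Function values for the review counts in the sample data.
factors = {n: field_value_factor(n, factor=2) for n in (12, 18, 20, 23)}
```

The logarithm is the point of the modifier: going from 12 to 23 reviews raises the multiplier only from about 1.40 to about 1.67, so popularity nudges the ranking without overwhelming text relevance.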

Function Score: Decay Functions

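Suppose that instead of boosting incrementally by the value of a field, you have an ideal value you want to target, and you want the boost factor to decay the further a document is from that value. This is typically useful for boosts based on lat/long, numeric fields such as price, or dates. In our contrived example, we search for books on "search engines" ideally published around June 2014.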

POST /bookdb_index/book/_search

{

    "query": {

        "function_score": {

            "query": {

                "multi_match" : {

                    "query" : "search engine",

                    "fields": ["title", "summary"]

                }

            },

            "functions": [

                {

                    "exp": {

                        "publish_date" : {

                            "origin": "2014-06-15",

                            "offset": "7d",

                            "scale" : "30d"

                        }

                    }

                }

            ],

            "boost_mode" : "replace"

        }

    },

    "_source": ["title", "summary", "publish_date", "num_reviews"]

}

[Results]

"hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.27420625,

        "_source": {

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "num_reviews": 23,

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.005920768,

        "_source": {

          "summary": "A distibuted real-time search and analytics engine",

          "num_reviews": 20,

          "title": "Elasticsearch: The Definitive Guide",

          "publish_date": "2015-02-07"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "2",

        "_score": 0.000011564,

        "_source": {

          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",

          "num_reviews": 12,

          "title": "Taming Text: How to Find, Organize, and Manipulate It",

          "publish_date": "2013-01-24"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 0.0000059171475,

        "_source": {

          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",

          "num_reviews": 18,

          "title": "Elasticsearch in Action",

          "publish_date": "2015-12-03"

        }

      }

    ]
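The exp decay function follows a documented formula: the score is exp(lambda * max(0, distance - offset)), with lambda = ln(decay) / scale and decay defaulting to 0.5. The sketch below applies it to the sample publish dates using whole days as the unit. The function name is illustrative, and the output is not guaranteed to match the scores listed above digit for digit, since those came from the author's own cluster.

```python
import math
from datetime import date


def exp_decay(value, origin, offset_days, scale_days, decay=0.5):
    """Exponential decay score for a date, measured in whole days from the origin."""
    distance = abs((value - origin).days)
    lam = math.log(decay) / scale_days
    return math.exp(lam * max(0, distance - offset_days))


ORIGIN = date(2014, 6, 15)
PUBLISH_DATES = {
    1: date(2015, 2, 7),
    2: date(2013, 1, 24),
    3: date(2015, 12, 3),
    4: date(2014, 4, 5),
}

scores = {doc_id: exp_decay(d, ORIGIN, offset_days=7, scale_days=30)
          for doc_id, d in PUBLISH_DATES.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Anything within the 7-day offset of the origin keeps a score of 1.0, and the score halves for every further 30 days. Because the query uses boost_mode replace, this decay value becomes the final score, so the ranking depends only on how close each publish date is to mid-June 2014.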

Function Score: Script Scoring

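When the built-in scoring functions do not meet your needs, you can specify a Groovy script to use for scoring. In our example, we want a script that takes the publish_date into consideration before deciding how much weight to give the number of reviews, because newer books may not have many reviews yet and should not be penalized for that.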

The scoring script looks like this:

publish_date = doc['publish_date'].value

num_reviews = doc['num_reviews'].value

if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) {

  my_score = Math.log(2.5 + num_reviews)

} else {

  my_score = Math.log(1 + num_reviews)

}

return my_score
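For readers who want to try the scoring rule outside Elasticsearch, here is a direct Python port of the Groovy script above. It is only a local sketch: the function name `script_score` and the default threshold (taken from the params used in the request that follows) are choices made for illustration. `math.log` is the natural logarithm, like Groovy's `Math.log`.

```python
import math
from datetime import date


def script_score(publish_date, num_reviews, threshold="2015-07-30"):
    """Give books published after the threshold a small head start on review count."""
    if date.fromisoformat(publish_date) > date.fromisoformat(threshold):
        return math.log(2.5 + num_reviews)
    return math.log(1 + num_reviews)


# (publish_date, num_reviews) for the four sample books.
SAMPLE = {
    1: ("2015-02-07", 20),
    2: ("2013-01-24", 12),
    3: ("2015-12-03", 18),
    4: ("2014-04-05", 23),
}

function_values = {doc_id: script_score(*fields) for doc_id, fields in SAMPLE.items()}
```

Only book 3 was published after the threshold, so it is the only one that receives the 2.5 bonus inside the logarithm. In a function_score query this value is combined with the text relevance score, which is why book 3 can still rank below books 1 and 4.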

To use a scoring script dynamically, we use the script_score parameter:

POST /bookdb_index/book/_search

{

    "query": {

        "function_score": {

            "query": {

                "multi_match" : {

                    "query" : "search engine",

                    "fields": ["title", "summary"]

                }

            },

            "functions": [

                {

                    "script_score": {

                        "params" : {

                            "threshold": "2015-07-30"

                        },

                        "script": "publish_date = doc['publish_date'].value; num_reviews = doc['num_reviews'].value; if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) { return log(2.5 + num_reviews) }; return log(1 + num_reviews);"

                    }

                }

            ]

        }

    },

    "_source": ["title", "summary", "publish_date", "num_reviews"]

}

[Results]

"hits": {

    "total": 4,

    "max_score": 0.8463001,

    "hits": [

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "1",

        "_score": 0.8463001,

        "_source": {

          "summary": "A distibuted real-time search and analytics engine",

          "num_reviews": 20,

          "title": "Elasticsearch: The Definitive Guide",

          "publish_date": "2015-02-07"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "4",

        "_score": 0.7067348,

        "_source": {

          "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",

          "num_reviews": 23,

          "title": "Solr in Action",

          "publish_date": "2014-04-05"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "3",

        "_score": 0.08952084,

        "_source": {

          "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand advanced data science algorithms",

          "num_reviews": 18,

          "title": "Elasticsearch in Action",

          "publish_date": "2015-12-03"

        }

      },

      {

        "_index": "bookdb_index",

        "_type": "book",

        "_id": "2",

        "_score": 0.07602123,

        "_source": {

          "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",

          "num_reviews": 12,

          "title": "Taming Text: How to Find, Organize, and Manipulate It",

          "publish_date": "2013-01-24"

        }

      }

    ]

  }

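Note 1: To use dynamic scripting, it must be enabled for your Elasticsearch instance in the config/elasticsearch.yml file. It is also possible to use scripts that have been stored on the Elasticsearch server. See the Elasticsearch reference documentation for more information.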

Note 2: JSON cannot include embedded newline characters, so semicolons are used to separate statements.
