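High-level full-text queries are typically used to run full-text searches on full-text fields, such as the body of an email. They understand how the queried field is analyzed, and they apply that field's analyzer (or search_analyzer) to the query string before executing.
A term query, by contrast, is an exact match: the search term is not analyzed before the search, so it must be exactly one of the tokens produced when the document field was analyzed.
For example, we can run a specific analyzer on the text "週五召開董事會會議 審議及批准更新後的一季報" to see how it is tokenized.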
GET telegraph/_analyze
{
  "analyzer": "ik_max_word",
  "text": "週五召開董事會會議 審議及批准更新後的一季報"
}
The result contains 16 tokens (positions 0 through 15):
{
  "tokens": [
    { "token": "週五", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 0 },
    { "token": "五", "start_offset": 1, "end_offset": 2, "type": "TYPE_CNUM", "position": 1 },
    { "token": "召開", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 2 },
    { "token": "董事會", "start_offset": 4, "end_offset": 7, "type": "CN_WORD", "position": 3 },
    { "token": "董事", "start_offset": 4, "end_offset": 6, "type": "CN_WORD", "position": 4 },
    { "token": "會會", "start_offset": 6, "end_offset": 8, "type": "CN_WORD", "position": 5 },
    { "token": "會議", "start_offset": 7, "end_offset": 9, "type": "CN_WORD", "position": 6 },
    { "token": "審議", "start_offset": 10, "end_offset": 12, "type": "CN_WORD", "position": 7 },
    { "token": "及", "start_offset": 12, "end_offset": 13, "type": "CN_CHAR", "position": 8 },
    { "token": "批准", "start_offset": 13, "end_offset": 15, "type": "CN_WORD", "position": 9 },
    { "token": "更新", "start_offset": 15, "end_offset": 17, "type": "CN_WORD", "position": 10 },
    { "token": "後", "start_offset": 17, "end_offset": 18, "type": "CN_CHAR", "position": 11 },
    { "token": "的", "start_offset": 18, "end_offset": 19, "type": "CN_CHAR", "position": 12 },
    { "token": "一季", "start_offset": 19, "end_offset": 21, "type": "CN_WORD", "position": 13 },
    { "token": "一", "start_offset": 19, "end_offset": 20, "type": "TYPE_CNUM", "position": 14 },
    { "token": "季報", "start_offset": 20, "end_offset": 22, "type": "CN_WORD", "position": 15 }
  ]
}
We use a term query to search for documents containing "會議":
GET telegraph/_search
{
  "query": {
    "term": {
      "title": {
        "value": "會議"
      }
    }
  }
}
Because the search term "會議" is one of the tokens in the token set, the document is returned:
{ "took": 9, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 0.2876821, "hits": [ { "_index": "telegraph", "_type": "msg", "_id": "AZetp2QBW8hrYY3zGJk7", "_score": 0.2876821, "_source": { "title": "週五召開董事會會議 審議及批准更新後的一季報", "content": "以審議及批准更新後的2018年第一季度報告", "author": "中興通信", "pubdate": "2018-07-17T12:33:11" } } ] } }
Now suppose we search for "董事會會議":
GET telegraph/_search
{
  "query": {
    "term": {
      "title": {
        "value": "董事會會議"
      }
    }
  }
}
Although "董事會會議" is a substring of the document text, it is not one of the tokens in the token set, so nothing is found:
{ "took": 3, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 0, "max_score": null, "hits": [] } }
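The two term results above can be pictured as a plain set-membership test. The following is a minimal Python sketch of that idea, not how Elasticsearch is implemented; `term_match` and `TITLE_TOKENS` are hypothetical names, and the token list is copied from the _analyze output shown earlier.

```python
# Tokens that ik_max_word produced for the title (from the _analyze output).
TITLE_TOKENS = [
    "週五", "五", "召開", "董事會", "董事", "會會", "會議", "審議",
    "及", "批准", "更新", "後", "的", "一季", "一", "季報",
]


def term_match(search_term, field_tokens):
    """Toy model of a term query.

    The search term is NOT analyzed, so it must be equal to
    one of the indexed tokens, character for character.
    """
    return search_term in set(field_tokens)


print(term_match("會議", TITLE_TOKENS))        # True: "會議" is a token
print(term_match("董事會會議", TITLE_TOKENS))  # False: only a substring of the text
```

The second call fails for the same reason the real query returns no hits: the lookup is against tokens, never against the raw text.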
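A match query first analyzes the search text and then matches each resulting token against the field. Where term is an exact search, match is an analyzed (tokenized) search.
When we search for "河北會議", the search text is first split into "河北" and "會議", and a document is returned if it contains either one. We can also use operator to set the logical relationship between the tokens: with operator set to and, a document is returned only if its token set contains both "河北" and "會議". The default operator is or, meaning a document matches if its token set contains any one of the tokens.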
GET telegraph/_search
{
  "query": {
    "match": {
      "title": {
        "query": "河北會議"
      }
    }
  }
}
Search results:
{ "took": 4, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 2, "max_score": 0.99277425, "hits": [ { "_index": "telegraph", "_type": "msg", "_id": "BJetp2QBW8hrYY3zGJk7", "_score": 0.99277425, "_source": { "title": "河北聚焦十大行業推動國際產能合做", "content": "河北省政府近日出臺積極參與「一帶一路」建設推動國際產能合做實施方案", "author": "財聯社", "pubdate": "2018-07-17T14:14:55" } }, { "_index": "telegraph", "_type": "msg", "_id": "AZetp2QBW8hrYY3zGJk7", "_score": 0.2876821, "_source": { "title": "週五召開董事會會議 審議及批准更新後的一季報", "content": "以審議及批准更新後的2018年第一季度報告", "author": "中興通信", "pubdate": "2018-07-17T12:33:11" } } ] } }
If we set operator to and:
GET telegraph/_search
{
  "query": {
    "match": {
      "title": {
        "query": "河北會議",
        "operator": "and"
      }
    }
  }
}
Because no document's token set contains both "河北" and "會議", the result is empty.
{ "took": 8, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 0, "max_score": null, "hits": [] } }
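The difference between the or and and operators can be sketched in a few lines of Python. This is a simplified model that ignores scoring; `match` is a hypothetical helper, the query tokens are the split described in the text, and `DOC_HEBEI` is an assumed, partial token list (the source does not show how that title is tokenized).

```python
# Title tokens of the board-meeting document (from the _analyze output).
DOC_MEETING = [
    "週五", "五", "召開", "董事會", "董事", "會會", "會議", "審議",
    "及", "批准", "更新", "後", "的", "一季", "一", "季報",
]
# ASSUMPTION: partial, hypothetical token list for the Hebei title.
DOC_HEBEI = ["河北", "聚焦"]

# The search text "河北會議" after analysis, as described in the text.
QUERY_TOKENS = ["河北", "會議"]


def match(query_tokens, field_tokens, operator="or"):
    """Toy model of a match query (no scoring).

    operator="or":  at least one query token must be in the field.
    operator="and": every query token must be in the field.
    """
    field = set(field_tokens)
    hits = [token in field for token in query_tokens]
    return all(hits) if operator == "and" else any(hits)


for name, doc in [("meeting", DOC_MEETING), ("hebei", DOC_HEBEI)]:
    print(name, match(QUERY_TOKENS, doc), match(QUERY_TOKENS, doc, "and"))
```

Under these assumptions both documents match with or, and neither matches with and, which mirrors the two responses above.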
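A match_phrase query also analyzes the query text (the analyzer can be customized). A document is returned only if it satisfies all three of the following conditions: every token from the query appears in the field, the tokens appear in the same order as in the query, and the tokens are adjacent to one another.
Using the same example, a search for "董事會會議" now finds the document. If the token order differed, or the tokens were not adjacent, the document would not be found.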
GET telegraph/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "董事會會議"
      }
    }
  }
}
{ "took": 3, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 1.1507283, "hits": [ { "_index": "telegraph", "_type": "msg", "_id": "AZetp2QBW8hrYY3zGJk7", "_score": 1.1507283, "_source": { "title": "週五召開董事會會議 審議及批准更新後的一季報", "content": "以審議及批准更新後的2018年第一季度報告", "author": "中興通信", "pubdate": "2018-07-17T12:33:11" } } ] } }
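Phrase matching can be thought of as looking for the query tokens at consecutive positions in the field. The Python sketch below is a toy model of that rule only; `phrase_match` is a hypothetical helper, list indexes stand in for the position values from the _analyze output, and it assumes the query "董事會會議" is analyzed into the same four tokens that appear at positions 3 to 6 of the title.

```python
# Title tokens, where the list index equals the token's position.
TITLE_TOKENS = [
    "週五", "五", "召開", "董事會", "董事", "會會", "會議", "審議",
    "及", "批准", "更新", "後", "的", "一季", "一", "季報",
]


def phrase_match(query_tokens, field_tokens):
    """Toy model of match_phrase: all query tokens must appear,
    in the same order, at consecutive positions in the field."""
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        field_tokens[i:i + n] == query_tokens
        for i in range(len(field_tokens) - n + 1)
    )


# ASSUMPTION: how the analyzer splits the query "董事會會議".
query = ["董事會", "董事", "會會", "會議"]

print(phrase_match(query, TITLE_TOKENS))             # True: positions 3..6
print(phrase_match(query[::-1], TITLE_TOKENS))       # False: wrong order
print(phrase_match(["召開", "會議"], TITLE_TOKENS))  # False: not adjacent
```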
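match_phrase_prefix is similar to match_phrase, except that the last token of the search text only needs to match as a prefix.
In the example above, the document's token set contains the adjacent tokens "召開" and "董事會". With match_phrase_prefix, the search text only needs to contain "召開" followed by a prefix of "董事會" in order to match.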
GET telegraph/_search
{
  "query": {
    "match_phrase_prefix": {
      "title": {
        "query": "召開董"
      }
    }
  }
}
{ "took": 10, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 0.8630463, "hits": [ { "_index": "telegraph", "_type": "msg", "_id": "AZetp2QBW8hrYY3zGJk7", "_score": 0.8630463, "_source": { "title": "週五召開董事會會議 審議及批准更新後的一季報", "content": "以審議及批准更新後的2018年第一季度報告", "author": "中興通信", "pubdate": "2018-07-17T12:33:11" } } ] } }
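The prefix rule can be sketched by relaxing only the last comparison of a phrase match. This Python example is a simplified model rather than Elasticsearch's actual prefix expansion; `phrase_prefix_match` is a hypothetical helper, and it assumes the query "召開董" is analyzed into "召開" and "董".

```python
# Title tokens, where the list index equals the token's position.
TITLE_TOKENS = [
    "週五", "五", "召開", "董事會", "董事", "會會", "會議", "審議",
    "及", "批准", "更新", "後", "的", "一季", "一", "季報",
]


def phrase_prefix_match(query_tokens, field_tokens):
    """Toy model of match_phrase_prefix.

    All tokens except the last must match exactly and consecutively;
    the last one only has to be a prefix of the token that follows.
    """
    n = len(query_tokens)
    if n == 0:
        return False
    head, last = query_tokens[:-1], query_tokens[-1]
    for i in range(len(field_tokens) - n + 1):
        window = field_tokens[i:i + n]
        if window[:-1] == head and window[-1].startswith(last):
            return True
    return False


# ASSUMPTION: how the analyzer splits the query "召開董".
print(phrase_prefix_match(["召開", "董"], TITLE_TOKENS))  # True
print(phrase_prefix_match(["召開", "會"], TITLE_TOKENS))  # False
```

The second call fails because the token right after "召開" is "董事會", which does not start with "會".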
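To match against several fields at once, returning a document when any one of those fields contains a token, use multi_match.
We search for "聚焦成交"; a document is returned as long as either the title or the content field contains one of the tokens.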
GET telegraph/_search
{
  "query": {
    "multi_match": {
      "query": "聚焦成交",
      "fields": ["title", "content"]
    }
  }
}
{ "took": 9, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 2, "max_score": 1.0806551, "hits": [ { "_index": "telegraph", "_type": "msg", "_id": "Apetp2QBW8hrYY3zGJk7", "_score": 1.0806551, "_source": { "title": "長生生物再次跌停 三機構拋售近1000萬元", "content": "長生生物再次一字跌停,報收19.89元,成交1432萬元", "author": "長生生物", "pubdate": "2018-07-17T10:03:11" } }, { "_index": "telegraph", "_type": "msg", "_id": "BJetp2QBW8hrYY3zGJk7", "_score": 0.99277425, "_source": { "title": "河北聚焦十大行業推動國際產能合做", "content": "河北省政府近日出臺積極參與「一帶一路」建設推動國際產能合做實施方案", "author": "財聯社", "pubdate": "2018-07-17T14:14:55" } } ] } }
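The "any field" behaviour of multi_match can be modelled as the same token test repeated per field. The Python sketch below ignores scoring entirely; `multi_match` and `match_any` are hypothetical helpers, and every token list, as well as the split of "聚焦成交" into "聚焦" and "成交", is an assumption made for illustration because the source does not show the analyzer output for these fields.

```python
# ASSUMPTION: partial, hypothetical token lists for two documents.
DOC_CHANGSHENG = {
    "title": ["長生", "生物", "跌停"],
    "content": ["跌停", "成交"],
}
DOC_HEBEI = {
    "title": ["河北", "聚焦"],
    "content": ["河北", "建設"],
}

# ASSUMPTION: how the analyzer splits the query "聚焦成交".
QUERY_TOKENS = ["聚焦", "成交"]


def match_any(query_tokens, field_tokens):
    """True if at least one query token is present in the field."""
    field = set(field_tokens)
    return any(token in field for token in query_tokens)


def multi_match(query_tokens, doc, fields):
    """Toy model of multi_match: the document matches if any
    of the listed fields matches on its own."""
    return any(match_any(query_tokens, doc.get(f, [])) for f in fields)


fields = ["title", "content"]
print(multi_match(QUERY_TOKENS, DOC_CHANGSHENG, fields))  # True, via content
print(multi_match(QUERY_TOKENS, DOC_HEBEI, fields))       # True, via title
print(multi_match(QUERY_TOKENS, DOC_HEBEI, ["content"]))  # False
```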