Elasticsearch: multiple ways to search

Overview

1. query string search
2. query DSL
3. query filter
4. full-text search
5. phrase search
6. highlight search

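1. query string search

Search all products: GET /ecommerce/product/_search

The name "query string search" comes from the fact that the search parameters are carried in the query string of the HTTP request.

Search for products whose name contains yagao, sorted by price in descending order: GET /ecommerce/product/_search?q=name:yagao&sort=price:desc

This style suits ad hoc use from the command line with a tool such as curl, when you want to fire off a quick request and retrieve some information.

However, a complex query is hard to build this way, so query string search is rarely used in production.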
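A minimal sketch of how such a request URL could be assembled in Python. The node address http://localhost:9200 and the helper name build_search_url are assumptions made for illustration, not part of the original. The code only builds the URL; it does not send the request.

```python
from urllib.parse import urlencode

# Assumed address of a local Elasticsearch node (hypothetical).
BASE_URL = "http://localhost:9200"


def build_search_url(index, doc_type, **params):
    """Return a query string search URL for the given index and type."""
    url = f"{BASE_URL}/{index}/{doc_type}/_search"
    if params:
        # Keep ':' readable; other reserved characters are percent-encoded.
        url += "?" + urlencode(params, safe=":")
    return url


print(build_search_url("ecommerce", "product", q="name:yagao", sort="price:desc"))
```

Every condition has to be squeezed into URL parameters like this, which is why the approach becomes awkward once a query grows beyond one or two conditions.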

 

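Fields in the search response:

took: how many milliseconds the search took
timed_out: whether the search timed out; here it did not
_shards: the data is split across 5 shards, so the search request is sent to every primary shard (or to one of its replica shards instead)
hits.total: the number of results, here 3 documents
hits.max_score: a score is the relevance score of a document for a given search; the more relevant the document, the better the match and the higher the score
hits.hits: the full data of the documents that matched the search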

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}
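As a rough illustration of reading those fields from client code, the sketch below parses a trimmed copy of the response above. The function name summarize is hypothetical, and it assumes hits.total is a plain number, as it is in the response shown.

```python
import json

# Trimmed copy of the response above (two hits, fewer fields).
raw = """
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {"_id": "2", "_score": 1, "_source": {"name": "jiajieshi yagao", "price": 25}},
      {"_id": "1", "_score": 1, "_source": {"name": "gaolujie yagao", "price": 30}}
    ]
  }
}
"""


def summarize(response):
    """Pull the fields described above out of a parsed search response."""
    shards = response["_shards"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "all_shards_ok": shards["successful"] == shards["total"],
        "total": response["hits"]["total"],
        "names": [hit["_source"]["name"] for hit in response["hits"]["hits"]],
    }


summary = summarize(json.loads(raw))
print(summary)
```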

 

GET /test_index/test_type/_search?q=test_field:test

GET /test_index/test_type/_search?q=+test_field:test

GET /test_index/test_type/_search?q=-test_field:test

 

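Two things to learn here: the q=field:search content syntax, and the meaning of + and -. A + prefix means the term must be present; a - prefix means it must not be present.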
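One practical wrinkle when building these URLs by hand: in a URL query string, many HTTP stacks decode a literal + as a space, so it is usually safer to send it as %2B. The sketch below shows one way to encode the q value with Python's standard library; the helper name encode_q is hypothetical.

```python
from urllib.parse import quote


def encode_q(query):
    """Percent-encode a q= value, keeping ':' readable."""
    return quote(query, safe=":")


print(encode_q("+test_field:test"))  # the leading + becomes %2B
print(encode_q("-test_field:test"))  # the leading - needs no encoding
```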

 

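How the _all metadata field works and what it is for

GET /test_index/test_type/_search?q=test

With no field name given, the search runs across all fields: a document is returned if any one of its fields contains the given keyword.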

 

2. query DSL

DSL: Domain Specific Language.

HTTP request body: the query is written as JSON in the request body. This is convenient and can express all kinds of complex syntax, so it is far more powerful than query string search.

 

Query all products

 

GET /ecommerce/product/_search

{

  "query": { "match_all": {} }

}

 

Query products whose name contains yagao, sorted by price in descending order

 

GET /ecommerce/product/_search

{

    "query" : {

        "match" : {

            "name" : "yagao"

        }

    },

    "sort": [

        { "price": "desc" }

    ]

}

 

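Paginated query

Paginate the products: there are 3 products in total. Assuming each page shows 1 product and we want page 2, the query returns the 2nd product. from is the offset of the first product to return (counting from 0), and size is the number of products per page.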

 

GET /ecommerce/product/_search

{

  "query": { "match_all": {} },

  "from": 1,  

  "size": 1

}
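The from value follows directly from the page number and page size. A small sketch, assuming 1-based page numbers (the helper name page_body is hypothetical):

```python
def page_body(page, size):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # offset of the first hit on this page
        "size": size,
    }


print(page_body(page=2, size=1))
```

With page=2 and size=1 this produces from 1 and size 1, the same values as the request above.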

 

Return only the name and price of each product

 

GET /ecommerce/product/_search

{

  "query": { "match_all": {} },

  "_source": ["name", "price"]

}

 

The query DSL is better suited to production use, since it can express complex queries.

 

 

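Scroll search

Fetching, say, 100,000 documents in a single request performs very poorly. The usual approach is a scroll query, which retrieves the data batch by batch until everything has been fetched and processed.

With scroll, you search one batch, then the next, and so on until all the data has been returned.

The first scroll request saves a snapshot of the data as it was at that moment, and all later batches are served from that old snapshot. Changes made to the data in the meantime are not visible to the user.

Sorting by _doc gives the best performance.

Every scroll request must also carry a scroll parameter that specifies a time window; each search request only has to complete within that window.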

 

Fetch 3 documents per batch

GET /test_index/test_type/_search?scroll=1m

{

  "query": {

    "match_all": {}

  },

  "sort": [ "_doc" ],

  "size": 3

}

 

 

The response contains a _scroll_id; the next scroll request must send it back as scroll_id

 

GET /_search/scroll

{

    "scroll": "1m",

    "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAACxeFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYBY0b25zVFlWWlRqR3ZJajlfc3BXejJ3AAAAAAAALF8WNG9uc1RZVlpUakd2SWo5X3NwV3oydwAAAAAAACxhFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYhY0b25zVFlWWlRqR3ZJajlfc3BXejJ3"

}

 

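Scroll looks a lot like pagination, but the use cases differ. Pagination is for searching page by page and showing results to a user; scroll is for retrieving data batch by batch so that a system can process it.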
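The batch-by-batch loop can be sketched as below. The two callables are supplied by the caller and stand in for the initial search request and the follow-up /_search/scroll request. The in-memory fake_batch backend is purely hypothetical and exists only so the loop can be run without a cluster; real scroll ids are opaque strings.

```python
def scroll_all(first_page, next_page):
    """Yield every hit, one batch at a time, until a batch comes back empty.

    first_page() sends the initial search with ?scroll=...;
    next_page(scroll_id) sends the follow-up request to /_search/scroll.
    Both return a parsed response dict.
    """
    response = first_page()
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        response = next_page(response["_scroll_id"])


# In-memory stand-in for a cluster (hypothetical): 7 docs served in batches of 3.
DOCS = [{"_id": str(i)} for i in range(7)]
BATCH = 3


def fake_batch(offset):
    # Here the scroll id is simply the next offset, encoded as a string.
    hits = DOCS[offset:offset + BATCH]
    return {"_scroll_id": str(offset + BATCH), "hits": {"hits": hits}}


ids = [
    hit["_id"]
    for hit in scroll_all(lambda: fake_batch(0), lambda sid: fake_batch(int(sid)))
]
print(ids)
```

The loop stops on the first empty batch, which is how the caller learns that all the data has been returned.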

 

 

Combining multiple search conditions

GET /website/article/_search

{

  "query": {

    "bool": {

      "must": [   // title must contain elasticsearch

        {

          "match": {

            "title": "elasticsearch"

          }

        }

      ],

      "should": [  // content may or may not contain elasticsearch

        {

          "match": {

            "content": "elasticsearch"

          }

        }

      ],

      "must_not": [ // author_id must not be 111

        {

          "match": {

            "author_id": 111

          }

        }

      ]

    }

  }

}
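The // annotations in the request above are explanatory notes; strict JSON has no comment syntax, so when the request is sent from code it is easier to build the body as data. A minimal sketch, with a hypothetical helper name bool_query:

```python
import json


def bool_query(must=(), should=(), must_not=()):
    """Assemble a bool query body, omitting clauses that are empty."""
    clauses = {
        "must": list(must),
        "should": list(should),
        "must_not": list(must_not),
    }
    return {"query": {"bool": {k: v for k, v in clauses.items() if v}}}


body = bool_query(
    must=[{"match": {"title": "elasticsearch"}}],      # title must match
    should=[{"match": {"content": "elasticsearch"}}],  # optional clause
    must_not=[{"match": {"author_id": 111}}],          # author_id must not be 111
)
print(json.dumps(body, indent=2))
```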

 

 

 

1) match all

 

GET /_search

{

    "query": {

        "match_all": {}

    }

}

 

2) match

 

GET /_search

{

    "query": { "match": { "title": "my elasticsearch article" }}

}

 

3) multi match

 

GET /test_index/test_type/_search

{

  "query": {

    "multi_match": {

      "query": "test",  // the text to search for

      "fields": ["test_field", "test_field1"]  // search across several fields

    }

  }

}

 

4) range query

 

GET /company/employee/_search

{

  "query": {

    "range": {

      "age": {

        "gte": 30

      }

    }

  }

}

 

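5) term query

// Treats the search text as an exact value. Precondition: when the mapping is created by hand, the field must be set to not_analyzed so that it is indexed without analysis; only then can a term query for "test hello" find the document.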

 

GET /test_index/test_type/_search

{

  "query": {

    "term": {

      "test_field": "test hello"

    }

  }

}
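Why the precondition matters can be shown with a toy model. The analyzer below (lowercase, then split on whitespace) is a deliberate simplification and not the real Elasticsearch analyzer; the function names are hypothetical.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def term_matches(indexed_terms, term):
    """A term query looks up the search value as one unanalyzed term."""
    return term in indexed_terms


value = "test hello"
analyzed_terms = analyze(value)  # what an analyzed field would index
exact_terms = [value]            # what a non-analyzed field would index

print(term_matches(analyzed_terms, "test hello"))  # no such single term
print(term_matches(exact_terms, "test hello"))     # the whole value is one term
print(term_matches(analyzed_terms, "test"))        # individual terms do exist
```

On the analyzed field the index holds "test" and "hello" separately, so the exact value "test hello" is never found there.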

 

6) terms query

 

GET /_search

{

    "query": { "terms": { "tag": [ "search", "full_text", "nosql" ] }}  // several search terms for the tag field

}

 

 

3. query filter

Search for products whose name contains yagao and whose price is greater than 25

 

GET /ecommerce/product/_search

{

    "query" : {

        "bool" : {

            "must" : {

                "match" : {

                    "name" : "yagao"

                }

            },

            "filter" : {

                "range" : {

                    "price" : { "gt" : 25 }

                }

            }

        }

    }

}

 

 

 

{

    "bool": {

        "must":     { "match": { "title": "how to make millions" }},

        "must_not": { "match": { "tag":   "spam" }},

        "should": [

            { "match": { "tag": "starred" }}

        ],

        "filter": {

          "range": { "date": { "gte": "2014-01-01" }}

        }

    }

}

 

 

 

{

    "bool": {

        "must":     { "match": { "title": "how to make millions" }},

        "must_not": { "match": { "tag":   "spam" }},

        "should": [

            { "match": { "tag": "starred" }}

        ],

        "filter": {

          "bool": {

              "must": [

                  { "range": { "date": { "gte": "2014-01-01" }}},

                  { "range": { "price": { "lte": 29.99 }}}

              ],

              "must_not": [

                  { "term": { "category": "ebooks" }}

              ]

          }

        }

    }

}

 

GET /company/employee/_search

{

  "query": {

    "constant_score": {  // constant_score is the wrapper required when a filter is used on its own

      "filter": {

        "range": {

          "age": {

            "gte": 30

          }

        }

      }

    }

  }

}

 

 

 

4. full-text search

GET /ecommerce/product/_search

{

    "query" : {

        "match" : {

            "producer" : "yagao producer"

        }

    }

}

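5. phrase search

Phrase search is the counterpart of full-text search. Full-text search breaks the input string into terms and looks each one up in the inverted index; a document is returned as long as it matches any one of those terms.

Phrase search requires the text of the given field to contain the input string exactly as written, with the same terms in the same order; only then does the document count as a match and get returned.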

GET /ecommerce/product/_search

{

    "query" : {

        "match_phrase" : {

            "producer" : "yagao producer"

        }

    }

}
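The difference between the two can be sketched with a toy model of the producer field. The analyzer (lowercase, then split on whitespace) and the helper names are simplifying assumptions, and scoring is ignored entirely.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_text, query):
    """Full-text match: any one query term present in the field is enough."""
    field_terms = set(analyze(field_text))
    return any(term in field_terms for term in analyze(query))


def match_phrase(field_text, query):
    """Phrase match: the query terms must appear consecutively, in order."""
    field_terms, query_terms = analyze(field_text), analyze(query)
    n = len(query_terms)
    return any(
        field_terms[i:i + n] == query_terms
        for i in range(len(field_terms) - n + 1)
    )


producers = ["jiajieshi producer", "gaolujie producer", "zhonghua producer"]
print([p for p in producers if match(p, "yagao producer")])
print([p for p in producers if match_phrase(p, "yagao producer")])
```

With the three sample producers, the full-text match returns all of them (each contains "producer"), while the phrase match returns none, because no producer value contains "yagao producer" as a phrase.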

6. highlight search

GET /ecommerce/product/_search

{

    "query" : {

        "match" : {

            "producer" : "producer"

        }

    },

    "highlight": {

        "fields" : {

            "producer" : {}

        }

    }

}
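In the response, each hit carries a highlight section in which the matched terms of the requested field are wrapped in tags (<em> and </em> by default). The sketch below imitates that effect on a single string; it is a toy illustration, not the actual highlighter, and the function name highlight is hypothetical.

```python
import re


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap each whole word of text that equals a query term in tags."""
    terms = {term.lower() for term in query.split()}

    def wrap(match):
        word = match.group(0)
        return f"{pre}{word}{post}" if word.lower() in terms else word

    return re.sub(r"\w+", wrap, text)


print(highlight("jiajieshi producer", "producer"))
```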

 

 

7. Validating a query

// Check whether a query is valid and, if it is not, where the problem lies

GET /test_index/test_type/_validate/query?explain

{

  "query": {

    "math": {

      "test_field": "test"

    }

  }

}

 

{

  "valid": false,

  "error": "org.elasticsearch.common.ParsingException: no [query] registered for [math]"

}

 

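8. Sorting

1) Default sort order

By default, results are sorted by _score in descending order.

In some cases, however, there is no meaningful _score, for example with a filter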

 

GET /_search

{

    "query" : {

        "bool" : {

            "filter" : {

                "term" : {

                    "author_id" : 1

                }

            }

        }

    }

}

 

A constant_score query can be used as well

 

GET /_search

{

    "query" : {

        "constant_score" : {

            "filter" : {

                "term" : {

                    "author_id" : 1

                }

            }

        }

    }

}

 

2) Custom sort order

 

GET /company/employee/_search

{

  "query": {

    "constant_score": {

      "filter": {

        "range": {

          "age": {

            "gte": 30

          }

        }

      }

    }

  },

  "sort": [

    {

      "join_date": {

        "order": "asc"

      }

    }

  ]

}

 

 

 

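Problem: sorting on a string field often gives inaccurate results, because after analysis the field holds several separate terms, and sorting on those is not what we want.

The usual solution is to index the string field twice: once analyzed, for searching, and once not analyzed, for sorting.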

 

PUT /website

{

  "mappings": {

    "article": {

      "properties": {

        "title": {

          "type": "text", // analyzed index

          "fields": {

            "raw": {     // non-analyzed index

              "type": "keyword"

            }

          },

          "fielddata": true  // forward index (fielddata)

        },

        "content": {

          "type": "text"

        },

        "post_date": {

          "type": "date"

        },

        "author_id": {

          "type": "long"

        }

      }

    }

  }

}

 

PUT /website/article/1

{

  "title": "second article",

  "content": "this is my second article",

  "post_date": "2017-01-01",

  "author_id": 110

}

 

 

 

{

  "took": 2,

  "timed_out": false,

  "_shards": {

    "total": 5,

    "successful": 5,

    "failed": 0

  },

  "hits": {

    "total": 3,

    "max_score": 1,

    "hits": [

      {

        "_index": "website",

        "_type": "article",

        "_id": "2",

        "_score": 1,

        "_source": {

          "title": "first article",

          "content": "this is my first article",

          "post_date": "2017-02-01",

          "author_id": 110

        }

      },

      {

        "_index": "website",

        "_type": "article",

        "_id": "1",

        "_score": 1,

        "_source": {

          "title": "second article",

          "content": "this is my second article",

          "post_date": "2017-01-01",

          "author_id": 110

        }

      },

      {

        "_index": "website",

        "_type": "article",

        "_id": "3",

        "_score": 1,

        "_source": {

          "title": "third article",

          "content": "this is my third article",

          "post_date": "2017-03-01",

          "author_id": 110

        }

      }

    ]

  }

}

 

 

GET /website/article/_search

{

  "query": {

    "match_all": {}

  },

  "sort": [

    {

      "title.raw": {  // sort on the non-analyzed sub-field created above

        "order": "desc"

      }

    }

  ]

}
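A toy model of why the raw sub-field sorts better. It assumes that an ascending sort on an analyzed field uses the smallest term of each document as its sort key, and it uses a simplified analyzer (lowercase, then split on whitespace); both are assumptions made for illustration.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


titles = ["second article", "first article", "third article"]

# Analyzed field: assume the ascending sort key is the smallest term.
analyzed_keys = {title: min(analyze(title)) for title in titles}

# Non-analyzed sub-field: the sort key is the whole original string.
raw_order = sorted(titles)

print(analyzed_keys)
print(raw_order)
```

Under that assumption every title gets the same key, "article", so the analyzed field cannot order the documents, while the raw values sort as expected.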
