elasticsearch-dsl Queries

Continuing from the previous post, this post uses Python's elasticsearch-dsl library to run queries against Elasticsearch.

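7. Queries

Elasticsearch is a very powerful search engine, and the whole point of using it is to find the data you need quickly.

Query categories:

  • Basic queries: use the query types built into Elasticsearch
  • Compound queries: combine several queries into one
  • Filters: while querying, use filter conditions to narrow down the data without affecting the score

7.1 Basic queries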

    • Before querying, create an index (the "table") with the following mapping
      PUT chaxun
      {
        "mappings": {
          "job": {
            "properties": {
              "title": {
                "store": true,
                "type": "text",
                "analyzer": "ik_max_word"
              },
              "company_name": {
                "store": true,
                "type": "keyword"
              },
              "desc": {
                "type": "text"
              },
              "comments": {
                "type": "integer"
              },
              "add_time": {
                "type": "date",
                "format": "yyyy-MM-dd"
              }
            }
          }
        }
      }


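    • match query
      GET chaxun/job/_search
      {
        "query": {
          "match": {
            "title": "python"
          }
        }
      }

      # Assumes a default connection has already been registered with elasticsearch-dsl
      from elasticsearch_dsl import Search, Q

      s = Search(index='chaxun').query('match', title='python')
      response = s.execute()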
    • term query

      A term query does not analyze (tokenize) the query string; the whole string is looked up as one exact term.

      GET chaxun/job/_search
      {
        "query": {
          "term": {
            "title": "python爬蟲"
          }
        }
      }

      s = Search(index='chaxun').query('term', title='python爬蟲')
      response = s.execute()
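
To make the difference between match and term concrete, here is a toy model in plain Python. It is only a sketch: `toy_analyze` is a hypothetical whitespace analyzer standing in for a real one such as ik_max_word, and the two sample titles are made up for illustration.

```python
# Toy inverted index illustrating analyzed (match) vs. non-analyzed (term) lookups.
# Everything here is hypothetical: a whitespace analyzer and two made-up titles.

def toy_analyze(text):
    """Lowercase and split on whitespace (stand-in for a real analyzer)."""
    return text.lower().split()

docs = {1: "Python crawler engineer", 2: "Django developer"}

# Index time: every title is analyzed and each token points back to its documents.
index = {}
for doc_id, title in docs.items():
    for token in toy_analyze(title):
        index.setdefault(token, set()).add(doc_id)

def match(query):
    """Analyze the query, then return documents containing any of its tokens."""
    hits = set()
    for token in toy_analyze(query):
        hits |= index.get(token, set())
    return hits

def term(query):
    """Look the query up as one exact term, with no analysis."""
    return set(index.get(query, set()))

print(match("Python crawler"))  # the query is split into two tokens, both in doc 1
print(term("python crawler"))   # no single indexed token equals the whole string
print(term("python"))           # an exact indexed token
```

This is why a term query against an analyzed text field can come back empty even though a match query with the same string finds documents.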
    • terms query
      GET chaxun/job/_search
      {
        "query": {
          "terms": {
            "title": ["工程師", "django", "系統"]
          }
        }
      }

      s = Search(index='chaxun').query('terms', title=['工程師', 'django', '系統'])
      response = s.execute()
    • Controlling how many results are returned
      GET chaxun/job/_search
      {
        "query": {
          "term": {
            "title": "python"
          }
        },
        "from": 1,
        "size": 2
      }

      # The slice [1:3] corresponds to "from": 1, "size": 2
      s = Search(index='chaxun').query('term', title='python')[1:3]
      response = s.execute()
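
Slicing a Search object is how elasticsearch-dsl expresses "from" and "size": the slice start becomes "from" and the slice length becomes "size". The helper below is a hypothetical, library-free sketch of that arithmetic (`slice_to_paging` is not part of elasticsearch-dsl), which can be handy for checking that a slice and a raw request agree.

```python
def slice_to_paging(window):
    """Translate a Python slice into Elasticsearch "from"/"size" values.

    Hypothetical helper for illustration; only non-negative bounds are handled.
    """
    start = window.start or 0
    if window.stop is None:
        raise ValueError("an open-ended slice has no fixed size")
    return {"from": start, "size": window.stop - start}

print(slice_to_paging(slice(1, 3)))  # {'from': 1, 'size': 2}
print(slice_to_paging(slice(0, 2)))  # {'from': 0, 'size': 2}
```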
    • match_all: match every document
      GET chaxun/job/_search
      {
        "query": {
          "match_all": {}
        }
      }

      s = Search(index='chaxun').query('match_all')
      response = s.execute()
    • match_phrase: phrase query
      GET chaxun/job/_search
      {
        "query": {
          "match_phrase": {
            "title": {
              "query": "python系統",
              "slop": 3
            }
          }
        }
      }

      s = Search(index='chaxun').query('match_phrase', title={"query": "python系統", "slop": 3})
      response = s.execute()

      Note: the query "python系統" is tokenized into ["python", "系統"], and a document must contain all of these tokens to match. "slop" sets the maximum distance allowed between the tokens; a document matches only if the distance does not exceed slop. For example, "python打造推薦引擎系統" cannot match if slop is less than 6.
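
The role of slop can be sketched with token positions. The function below is a simplified model, not Elasticsearch's implementation: it looks only at the first occurrence of each token, and the token list is a hypothetical whitespace tokenization. A real analyzer such as ik_max_word may emit more tokens, which shifts positions and therefore the slop that is actually needed.

```python
def required_slop(doc_tokens, phrase_tokens):
    """Smallest slop for which the phrase matches, or None if a token is missing.

    Simplified: only the first occurrence of each token is considered.
    """
    try:
        positions = [doc_tokens.index(tok) for tok in phrase_tokens]
    except ValueError:
        return None
    # Shift each position by the token's offset in the phrase. An exact phrase
    # gives identical shifted values; the spread is the slop that is needed.
    shifted = [pos - offset for offset, pos in enumerate(positions)]
    return max(shifted) - min(shifted)

# Hypothetical tokens for a title similar to the example in the note above.
tokens = ["python", "builds", "recommendation", "engine", "system"]

print(required_slop(tokens, ["python", "system"]))  # 3
print(required_slop(tokens, ["python", "builds"]))  # 0
print(required_slop(tokens, ["python", "django"]))  # None
```

With these made-up tokens, a query with "slop": 3 would match and one with "slop": 2 would not.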

    • multi_match query
      GET chaxun/job/_search
      {
        "query": {
          "multi_match": {
            "query": "python",
            "fields": ["title^3", "desc"]
          }
        }
      }

      q = Q('multi_match', query="python", fields=["title^3", "desc"])
      s = Search(index='chaxun').query(q)
      response = s.execute()

      Note: multi_match searches several fields at once; "^3" gives "title" three times the weight of "desc".
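
As a rough illustration of what the "^3" boost does, the sketch below multiplies made-up per-field scores by their boosts and keeps the best one, which is how the default best_fields mode behaves when no tie_breaker is set. Both helpers and the raw scores are hypothetical; real scores come from Elasticsearch's relevance model.

```python
def parse_field(spec):
    """Split "title^3" into ("title", 3.0); the boost defaults to 1.0."""
    name, _, boost = spec.partition("^")
    return name, float(boost) if boost else 1.0

def best_fields_score(field_scores, fields):
    """Return the highest boosted per-field score (best_fields, no tie_breaker)."""
    scores = []
    for spec in fields:
        name, boost = parse_field(spec)
        if name in field_scores:
            scores.append(field_scores[name] * boost)
    return max(scores, default=0.0)

# Made-up raw scores for one document.
raw = {"title": 0.5, "desc": 1.2}

print(best_fields_score(raw, ["title^3", "desc"]))  # 1.5, the boosted title wins
print(best_fields_score(raw, ["title", "desc"]))    # 1.2, desc wins without the boost
```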

    • Choosing which fields are returned
      GET chaxun/job/_search
      {
        "stored_fields": ["title", "company_name"],
        "query": {
          "match": {
            "title": "python"
          }
        }
      }

      # .source() limits the fields returned from _source; the request above reads stored fields instead
      s = Search(index='chaxun').query('match', title='python').source(['title', 'company_name'])
      response = s.execute()
    • Sorting results with sort
      GET chaxun/job/_search
      {
        "query": {
          "match_all": {}
        },
        "sort": [
          {
            "comments": {
              "order": "desc"
            }
          }
        ]
      }

      s = Search(index='chaxun').query('match_all').sort({"comments": {"order": "desc"}})
      response = s.execute()
    • range query
      GET chaxun/job/_search
      {
        "query": {
          "range": {
            "comments": {
              "gte": 10,
              "lte": 50,
              "boost": 2.0
            }
          }
        }
      }

      s = Search(index='chaxun').query('range', comments={"gte": 10, "lte": 50, "boost": 2.0})
      response = s.execute()

      Note: "boost" sets the weight of this query clause.
    • wildcard query
      GET chaxun/job/_search
      {
        "query": {
          "wildcard": {
            "title": {
              "value": "pyth*n",
              "boost": 2
            }
          }
        }
      }

      s = Search(index='chaxun').query('wildcard', title={"value": "pyth*n", "boost": 2})
      response = s.execute()
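
In a wildcard pattern, "*" stands for any sequence of characters (including none) and "?" for exactly one character. The sketch below borrows Python's `fnmatch` module to show that behaviour on a hypothetical list of indexed terms. It is only an approximation: `fnmatch` also understands "[...]" character sets, which are not part of the wildcard query syntax, and Elasticsearch applies the pattern to individual indexed terms rather than to the whole field text.

```python
from fnmatch import fnmatchcase

# Hypothetical indexed terms, made up for illustration.
terms = ["python", "pythoon", "pythn", "django"]

def wildcard(pattern, candidates):
    """Return the candidate terms that match a pattern using * and ?."""
    return [t for t in candidates if fnmatchcase(t, pattern)]

print(wildcard("pyth*n", terms))  # ['python', 'pythoon', 'pythn']
print(wildcard("pyth?n", terms))  # ['python']
```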

7.2 Compound queries

    • Create a new index for the queries in this section

    • bool query
      • The general form is as follows
        bool: {
          "filter": [],
          "must": [],
          "should": [],
          "must_not": []
        }
    • The simplest filter query
      select * from testdb where salary=20

      GET bool/testdb/_search
      {
        "query": {
          "bool": {
            "must": {
              "match_all": {}
            },
            "filter": {
              "term": {
                "salary": 20
              }
            }
          }
        }
      }

      s = Search(index='bool').query('bool', filter=[Q('term', salary=20)])
      response = s.execute()
    • Inspecting how an analyzer tokenizes text
      GET _analyze
      {
        "analyzer": "ik_max_word",
        "text": "成都電子科技大學"
      }

      Note: "ik_max_word" performs fine-grained tokenization; "ik_smart" performs coarse-grained tokenization.

    • Combining filters with bool
      select * from testdb where (salary=20 or title='python') and (salary != 30)

      GET bool/testdb/_search
      {
        "query": {
          "bool": {
            "should": [
              {"term": {"salary": 20}},
              {"term": {"title": "python"}}
            ],
            "must_not": [
              {"term": {"salary": 30}}
            ]
          }
        }
      }

      q = Q('bool', should=[Q('term', salary=20), Q('term', title='python')], must_not=[Q('term', salary=30)])
      s = Search(index='bool').query(q)
      response = s.execute()
    • Nested bool queries
      select * from testdb where title='python' or (title='django' and salary=30)

      GET bool/testdb/_search
      {
        "query": {
          "bool": {
            "should": [
              {"term": {"title": "python"}},
              {"bool": {
                "must": [
                  {"term": {"title": "django"}},
                  {"term": {"salary": 30}}
                ]
              }}
            ]
          }
        }
      }

      q = Q('bool', should=[Q('term', title='python'), Q('bool', must=[Q('term', title='django'), Q('term', salary=30)])])
      s = Search(index='bool').query(q)
      response = s.execute()
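
To check that the bool queries above really express the SQL conditions next to them, the sketch below evaluates a tiny subset of the query DSL (only term and bool, with every clause written as a list) against plain dictionaries. The `matches` function and the four sample documents are hypothetical; scoring and analysis are ignored, so this models matching only.

```python
def matches(query, doc):
    """Evaluate a term or bool query (dict form) against one document dict."""
    (kind, body), = query.items()
    if kind == "term":
        (field, value), = body.items()
        return doc.get(field) == value
    if kind == "bool":
        required = body.get("must", []) + body.get("filter", [])
        if not all(matches(q, doc) for q in required):
            return False
        if any(matches(q, doc) for q in body.get("must_not", [])):
            return False
        should = body.get("should", [])
        # Without must/filter clauses, at least one should clause has to match.
        if should and not required:
            return any(matches(q, doc) for q in should)
        return True
    raise ValueError("unsupported query type: " + kind)

# (salary=20 or title=python) and salary != 30
either_not_30 = {"bool": {
    "should": [{"term": {"salary": 20}}, {"term": {"title": "python"}}],
    "must_not": [{"term": {"salary": 30}}],
}}

# title=python or (title=django and salary=30)
nested = {"bool": {"should": [
    {"term": {"title": "python"}},
    {"bool": {"must": [{"term": {"title": "django"}}, {"term": {"salary": 30}}]}},
]}}

# Made-up documents for illustration.
sample = [
    {"title": "python", "salary": 10},
    {"title": "django", "salary": 30},
    {"title": "django", "salary": 20},
    {"title": "python", "salary": 30},
]

print([matches(either_not_30, d) for d in sample])  # [True, False, True, False]
print([matches(nested, d) for d in sample])         # [True, True, False, True]
```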
    • Filtering on null and non-null values
      • Create the test data
        POST null/testdb2/_bulk
        {"index":{"_id":1}}
        {"tags":["search"]}
        {"index":{"_id":2}}
        {"tags":["search", "python"]}
        {"index":{"_id":3}}
        {"other_field":["some data"]}
        {"index":{"_id":4}}
        {"tags":null}
        {"index":{"_id":5}}
        {"tags":["search", null]}
      • Handling null values
        select tags from testdb2 where tags is not NULL

        GET null/testdb2/_search
        {
          "query": {
            "bool": {
              "filter": {
                "exists": {
                  "field": "tags"
                }
              }
            }
          }
        }

        s = Search(index='null').query('bool', filter=[Q('exists', field='tags')])
        response = s.execute()
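
The exists filter treats a field as present when it holds at least one non-null value. The sketch below applies that rule in plain Python to the five test documents from the bulk request, to show which ones the query is expected to return. The `exists` function is a hypothetical stand-in, not a library call.

```python
# The five test documents from the bulk request, keyed by _id.
docs = {
    1: {"tags": ["search"]},
    2: {"tags": ["search", "python"]},
    3: {"other_field": ["some data"]},
    4: {"tags": None},
    5: {"tags": ["search", None]},
}

def exists(doc, field):
    """True when the field holds at least one non-null value."""
    value = doc.get(field)
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)

matching_ids = sorted(doc_id for doc_id, doc in docs.items() if exists(doc, "tags"))
print(matching_ids)  # [1, 2, 5]
```

Document 3 has no tags field and document 4 holds only null, so both are filtered out; document 5 still counts because one of its values is non-null.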

7.3 Aggregation queries

To be continued...
