Continuing from the previous post, this post uses Python's elasticsearch-dsl library to run queries against Elasticsearch.
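7. Queries
Elasticsearch is a very powerful search engine; the whole point of using it is to find the data you need quickly.
Query categories:
- Basic queries: query using Elasticsearch's built-in query conditions
- Compound queries: combine several queries into one composite query
- Filters: while querying, use filter conditions to narrow down the data without affecting scoring
7.1 Basic queries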
- Create an index (the "table") before querying
    PUT chaxun
    {
      "mappings": {
        "job": {
          "properties": {
            "title": {
              "store": true,
              "type": "text",
              "analyzer": "ik_max_word"
            },
            "company_name": {
              "store": true,
              "type": "keyword"
            },
            "desc": {
              "type": "text"
            },
            "comments": {
              "type": "integer"
            },
            "add_time": {
              "type": "date",
              "format": "yyyy-MM-dd"
            }
          }
        }
      }
    }
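- match query
    GET chaxun/job/_search
    {
      "query": {
        "match": {
          "title": "python"
        }
      }
    }

    from elasticsearch_dsl import Search
    # Assumes a default connection has already been registered (e.g. in the previous post).
    s = Search(index='chaxun').query('match', title='python')
    response = s.execute()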
- term query
A term query does not analyze (tokenize) the query input, so the value is compared against the indexed terms exactly as written.
    GET chaxun/job/_search
    {
      "query": {
        "term": {
          "title": "python爬蟲"
        }
      }
    }

    s = Search(index='chaxun').query('term', title='python爬蟲')
    response = s.execute()
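The difference between match and term can be sketched without a cluster. The snippet below is only an illustration: `toy_analyze` is a hypothetical stand-in for a real analyzer (it lowercases and splits on whitespace, which is not what ik_max_word does), and the sample title is made up.

```python
def toy_analyze(text):
    """Hypothetical analyzer stand-in: lowercase, then split on whitespace."""
    return text.lower().split()


def match_hits(indexed_tokens, query_text):
    # match: the query text is analyzed first; any resulting token may hit.
    return any(token in indexed_tokens for token in toy_analyze(query_text))


def term_hits(indexed_tokens, query_text):
    # term: the query text is looked up as-is, with no analysis.
    return query_text in indexed_tokens


# Tokens that would be stored for a made-up title.
indexed = toy_analyze("Python Crawler Engineer")

print(match_hits(indexed, "Python Crawler"))  # True
print(term_hits(indexed, "Python Crawler"))   # False
print(term_hits(indexed, "python"))           # True
```

This is why a term query against an analyzed text field only hits when the query value happens to equal one of the stored tokens.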
- terms query
    GET chaxun/job/_search
    {
      "query": {
        "terms": {
          "title": ["工程師", "django", "系統"]
        }
      }
    }

    s = Search(index='chaxun').query('terms', title=['django', u'工程師', u'系統'])
    response = s.execute()
- Controlling how many results are returned
    GET chaxun/job/_search
    {
      "query": {
        "term": {
          "title": "python"
        }
      },
      "from": 1,
      "size": 2
    }

    # The slice [1:3] is translated to from=1, size=2.
    s = Search(index='chaxun').query('term', title='python')[1:3]
    response = s.execute()
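Slicing a Search object in elasticsearch-dsl sets "from" and "size" on the request. The relationship between the two notations can be sketched in plain Python; `slice_to_paging` and `paging_to_slice` are hypothetical helper names used only for illustration, and `hits` is made-up data.

```python
def slice_to_paging(start, stop):
    """Translate a Python-style slice [start:stop] into from/size parameters."""
    if start < 0 or stop < start:
        raise ValueError("expected 0 <= start <= stop")
    return {"from": start, "size": stop - start}


def paging_to_slice(from_, size):
    """Translate from/size parameters back into a slice object."""
    return slice(from_, from_ + size)


# Made-up result list standing in for the full set of hits.
hits = ["doc0", "doc1", "doc2", "doc3", "doc4"]

print(slice_to_paging(1, 3))        # {'from': 1, 'size': 2}
print(hits[paging_to_slice(1, 2)])  # ['doc1', 'doc2']
```

So "from": 1 with "size": 2 skips the first hit and returns the next two, which corresponds to the slice [1:3].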
- match_all: match every document
    GET chaxun/job/_search
    {
      "query": {
        "match_all": {}
      }
    }

    s = Search(index='chaxun').query('match_all')
    response = s.execute()
- match_phrase: phrase query
    GET chaxun/job/_search
    {
      "query": {
        "match_phrase": {
          "title": {
            "query": "python系統",
            "slop": 3
          }
        }
      }
    }

    s = Search(index='chaxun').query('match_phrase', title={"query": u"python系統", "slop": 3})
    response = s.execute()
Note: the query text "python系統" is tokenized into ["python", "系統"], and a result must contain every token of the phrase. "slop" sets the maximum allowed distance between the tokens; a match must not exceed it. For example, for "python打造推薦引擎系統", a slop smaller than 6 fails to match.
- multi_match query
    GET chaxun/job/_search
    {
      "query": {
        "multi_match": {
          "query": "python",
          "fields": ["title^3", "desc"]
        }
      }
    }

    from elasticsearch_dsl import Q, Search

    q = Q('multi_match', query="python", fields=["title^3", "desc"])
    s = Search(index='chaxun').query(q)
    response = s.execute()
Note: searches several fields at once; "^3" gives "title" three times the weight of "desc".
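When the field weights live in configuration rather than in a hard-coded list, the "field^boost" strings can be generated. The following is a minimal sketch; `boosted_fields` and `multi_match_body` are hypothetical helper names, not part of elasticsearch-dsl, and the result is just the plain request body.

```python
import json


def boosted_fields(weights):
    """Turn {"title": 3, "desc": 1} into ["title^3", "desc"]."""
    fields = []
    for name, weight in weights.items():
        # A weight of 1 is the default, so the suffix can be left out.
        fields.append(name if weight == 1 else f"{name}^{weight}")
    return fields


def multi_match_body(query, weights):
    """Build the request body for a multi_match query with per-field boosts."""
    return {
        "query": {
            "multi_match": {
                "query": query,
                "fields": boosted_fields(weights),
            }
        }
    }


body = multi_match_body("python", {"title": 3, "desc": 1})
print(json.dumps(body, indent=2))
```

The resulting `fields` list can also be passed as the `fields` argument of `Q('multi_match', ...)`.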
- Choosing which fields are returned
    GET chaxun/job/_search
    {
      "stored_fields": ["title", "company_name"],
      "query": {
        "match": {
          "title": "python"
        }
      }
    }

    # source() filters the returned _source; it is the closest DSL shortcut to the request above.
    s = Search(index='chaxun').query('match', title='python').source(['title', 'company_name'])
    response = s.execute()
- Sorting results with sort
    GET chaxun/job/_search
    {
      "query": {
        "match_all": {}
      },
      "sort": [
        {
          "comments": {
            "order": "desc"
          }
        }
      ]
    }

    s = Search(index='chaxun').query('match_all').sort({"comments": {"order": "desc"}})
    response = s.execute()
- range query
    GET chaxun/job/_search
    {
      "query": {
        "range": {
          "comments": {
            "gte": 10,
            "lte": 50,
            "boost": 2.0
          }
        }
      }
    }

    s = Search(index='chaxun').query('range', comments={"gte": 10, "lte": 50, "boost": 2.0})
    response = s.execute()
Note: "boost" sets the weight of this query clause.
- wildcard query
    GET chaxun/job/_search
    {
      "query": {
        "wildcard": {
          "title": {
            "value": "pyth*n",
            "boost": 2
          }
        }
      }
    }

    s = Search(index='chaxun').query('wildcard', title={"value": "pyth*n", "boost": 2})
    response = s.execute()
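A wildcard pattern is matched against individual indexed terms, not against the whole field text. As a rough local approximation, Python's `fnmatch` shows which terms a pattern such as "pyth*n" would accept. This is only a sketch: the token list is made up, and `fnmatch` also understands `[...]` character sets, which the Elasticsearch wildcard query does not.

```python
from fnmatch import fnmatchcase

# Made-up terms standing in for what an analyzer might have indexed.
tokens = ["python", "pythagorean", "pyth", "django"]

# "*" matches any run of characters (including none), "?" matches exactly one.
matches = [token for token in tokens if fnmatchcase(token, "pyth*n")]

print(matches)  # ['python', 'pythagorean']
```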
7.2 Compound queries
- Create a new index for these queries
- bool query
- The format is as follows
    bool: {
      "filter": [],
      "must": [],
      "should": [],
      "must_not": []
    }
- The simplest filter query
    select * from testdb where salary=20

    GET bool/testdb/_search
    {
      "query": {
        "bool": {
          "must": {
            "match_all": {}
          },
          "filter": {
            "term": {
              "salary": 20
            }
          }
        }
      }
    }

    s = Search(index='bool').query('bool', filter=[Q('term', salary=20)])
    response = s.execute()
- Inspecting the analyzer's (tokenization) output
    GET _analyze
    {
      "analyzer": "ik_max_word",
      "text": "成都電子科技大學"
    }
Note: "ik_max_word" performs fine-grained segmentation; "ik_smart" performs coarse-grained segmentation.
- Combined bool filter query
    select * from testdb where (salary=20 or title=python) and (salary !=30)

    GET bool/testdb/_search
    {
      "query": {
        "bool": {
          "should": [
            {"term": {"salary": 20}},
            {"term": {"title": "python"}}
          ],
          "must_not": [
            {"term": {"salary": 30}}
          ]
        }
      }
    }

    q = Q('bool', should=[Q('term', salary=20), Q('term', title='python')], must_not=[Q('term', salary=30)])
    s = Search(index='bool').query(q)
    response = s.execute()
- Nested bool query
    select * from testdb where title=python or (title=django and salary=30)

    GET bool/testdb/_search
    {
      "query": {
        "bool": {
          "should": [
            {"term": {"title": "python"}},
            {"bool": {
              "must": [
                {"term": {"title": "django"}},
                {"term": {"salary": 30}}
              ]
            }}
          ]
        }
      }
    }

    q = Q('bool', should=[Q('term', title='python'), Q('bool', must=[Q('term', title='django'), Q('term', salary=30)])])
    s = Search(index='bool').query(q)
    response = s.execute()
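To check how bool clauses combine, the logic can be sketched as a tiny in-memory evaluator. This is only an illustration under simplifying assumptions: `matches` and `as_list` are hypothetical helpers, "term" is treated as plain equality on a dict value (no analysis, no scoring), and the sample documents are made up.

```python
def as_list(clause):
    """Normalize a clause that may be missing, a single query, or a list of queries."""
    if clause is None:
        return []
    return clause if isinstance(clause, list) else [clause]


def matches(query, doc):
    """Evaluate a small subset of the query DSL (match_all, term, bool) against a dict."""
    (kind, body), = query.items()
    if kind == "match_all":
        return True
    if kind == "term":
        (field, value), = body.items()
        return doc.get(field) == value
    if kind == "bool":
        required = as_list(body.get("must")) + as_list(body.get("filter"))
        should = as_list(body.get("should"))
        must_not = as_list(body.get("must_not"))
        if not all(matches(q, doc) for q in required):
            return False
        if any(matches(q, doc) for q in must_not):
            return False
        # With no must/filter clause, at least one should clause has to match.
        if should and not required:
            return any(matches(q, doc) for q in should)
        return True
    raise ValueError(f"unsupported query type: {kind}")


# title=python or (title=django and salary=30)
nested = {"bool": {"should": [
    {"term": {"title": "python"}},
    {"bool": {"must": [{"term": {"title": "django"}}, {"term": {"salary": 30}}]}},
]}}

# Made-up documents for illustration.
docs = [
    {"title": "python", "salary": 10},
    {"title": "django", "salary": 30},
    {"title": "django", "salary": 20},
    {"title": "java", "salary": 30},
]

print([i for i, doc in enumerate(docs) if matches(nested, doc)])  # [0, 1]
```

The same evaluator reproduces the earlier example: "should" on salary=20 or title=python combined with "must_not" on salary=30 rejects every document whose salary is 30.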
- Filtering null and non-null values
- Create test data
    POST null/testdb2/_bulk
    {"index":{"_id":1}}
    {"tags":["search"]}
    {"index":{"_id":2}}
    {"tags":["search", "python"]}
    {"index":{"_id":3}}
    {"other_field":["some data"]}
    {"index":{"_id":4}}
    {"tags":null}
    {"index":{"_id":5}}
    {"tags":["search", null]}
- Handling null values
    select tags from testdb2 where tags is not NULL

    GET null/testdb2/_search
    {
      "query": {
        "bool": {
          "filter": {
            "exists": {
              "field": "tags"
            }
          }
        }
      }
    }

    s = Search(index='null').query('bool', filter={"exists": {"field": "tags"}})
    response = s.execute()
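The exists filter keeps a document when the field holds at least one non-null value. Applied to the five test documents, that rule can be sketched locally; `exists` here is a hypothetical helper that mimics the rule in plain Python and does not talk to a cluster.

```python
# The five test documents from the bulk request, keyed by _id.
docs = {
    1: {"tags": ["search"]},
    2: {"tags": ["search", "python"]},
    3: {"other_field": ["some data"]},
    4: {"tags": None},
    5: {"tags": ["search", None]},
}


def exists(doc, field):
    """True when the field holds at least one non-null value."""
    value = doc.get(field)
    values = value if isinstance(value, list) else [value]
    return any(item is not None for item in values)


print([doc_id for doc_id, doc in docs.items() if exists(doc, "tags")])  # [1, 2, 5]
```

Document 5 is kept because one of its values is non-null, while documents 3 and 4 are dropped because "tags" is either missing or null.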
7.3 Aggregation queries
To be continued...