(1) What is es?
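es (Elasticsearch) is an open-source, distributed (full-text) search engine built on Apache Lucene. It provides a simple RESTful API that hides Lucene's complexity.
Besides being a full-text search engine, es can also be described as:
A distributed real-time document store in which every field is indexed and searchable
A distributed real-time analytics search engine
Able to scale to hundreds or thousands of servers and handle petabytes of structured or unstructured data.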
(2) Data organization
Physical: nodes and shards
Logical: indices, types, documents
(3) Basic operations
GET
PUT
DELETE
DELETE s18

PUT s18/doc/1
{ "name": "yangyazhou", "age": 81, "sex": "男", "tags": "悶騷", "b": "19900715" }

PUT s18/doc/2
{ "name": "yangtao", "age": 18, "sex": "男", "tags": "浪", "b": "19970521" }

PUT s18/doc/3
{ "name": "cancan", "age": 16, "sex": "女", "tags": "學習認真", "b": "19980101" }

PUT s18/doc/4
{ "name": "guchenxu", "age": 22, "sex": "男", "tags": "幽默", "b": "19930302" }

PUT s18/doc/5
{ "name": "yangwenyu", "age": 23, "sex": "男", "tags": "正人君子", "b": "19941201" }

Run the five PUT operations above.
GET s18/doc/1
GET s18/doc/_search
# Query string
GET s18/doc/_search?q=age:22

PUT s18/doc/5
{ "tags": "帥氣" }
GET s18/doc/5
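The result shows the data now stored: the PUT replaced the whole document, so only the tags field is left. Restore the data next.
# Restore the data
PUT s18/doc/5
{ "name": "yangwenyu", "age": 23, "sex": "男", "tags": "正人君子", "b": "19941201" }
# To modify only specific fields, use POST
POST s18/doc/5/_update
{ "doc": { "tags": "帥氣" } }
# Check the result
GET s18/doc/5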
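The difference between the two write styles above can be sketched with a toy in-memory model. This is not the Elasticsearch client; `store`, `put_document`, and `update_document` are hypothetical names, and the model assumes flat documents without nested objects.

```python
# Toy in-memory model of the two write styles (hypothetical helpers, not an ES API).
store = {}


def put_document(doc_id, body):
    """Model of PUT index/type/id: the stored document is replaced wholesale."""
    store[doc_id] = dict(body)


def update_document(doc_id, partial):
    """Model of POST .../_update with {"doc": ...}: given fields are merged in."""
    store[doc_id].update(partial)


original = {"name": "yangwenyu", "age": 23, "sex": "男",
            "tags": "正人君子", "b": "19941201"}

put_document(5, original)
put_document(5, {"tags": "帥氣"})
after_put = dict(store[5])       # only the tags field survives

put_document(5, original)        # restore the data
update_document(5, {"tags": "帥氣"})
after_update = dict(store[5])    # all fields kept, tags changed

print(after_put)
print(after_update)
```

In this model, as in the notes above, a second PUT drops every field that is not resent, while the update keeps them.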
Can documents be deleted by condition?
DELETE s18/doc/5
DELETE s18
The deletion method below is not recommended:
POST s18/doc/_delete_by_query?q=age:18
# Query string
GET s18/doc/_search?q=age:22
You only need to remember the simplest mapping:
PUT creates, GET retrieves, POST updates, DELETE deletes
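For readers who want to issue the same four verbs from a script, here is a minimal sketch using only Python's standard library. It builds the requests without sending them; `BASE_URL` is an assumption (a local, unsecured node on the default port) and `build_request` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node, no authentication


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the given verb and path."""
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body, ensure_ascii=False).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/{path}", data=data, headers=headers, method=method
    )


create = build_request("PUT", "s18/doc/5", {"name": "yangwenyu", "age": 23})
fetch = build_request("GET", "s18/doc/5")
update = build_request("POST", "s18/doc/5/_update", {"doc": {"tags": "帥氣"}})
remove = build_request("DELETE", "s18/doc/5")

for req in (create, fetch, update, remove):
    print(req.get_method(), req.full_url)

# To actually send one (requires a running node):
# urllib.request.urlopen(create)
```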
# Two ways to query
# Method 1: query string
GET s18/doc/_search?q=age:22
# Method 2: DSL
GET s18/doc/_search
{ "query": { "match": { "age": "18" } } }

GET s18/doc/_search
{ "query": { "match": { "age": 18 } } }
# The string "18" and the number 18 behave the same; the conversion is handled internally
match
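GET s18/doc/_search
{ "query": { "match": { "tags": "浪" } } }

# Error: the values cannot be given as a list
GET s18/doc/_search
{ "query": { "match": { "tags": ["浪", "悶騷"] } } }

# Separate the terms with a space
GET s18/doc/_search
{ "query": { "match": { "tags": "浪 悶騷" } } }

# Separate the terms with a comma
GET s18/doc/_search
{ "query": { "match": { "tags": "浪,悶騷" } } }
# A document is returned if it matches any one of the terms above; the two forms differ only in how they are written and are converted internally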
Using match_all
# The following two forms are equivalent
GET s18/doc/_search

GET s18/doc/_search
{ "query": { "match_all": {} } }
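desc sorts from largest to smallest (descending)
asc sorts from smallest to largest (ascending)
Note: not every field can be sorted on; choose fields where ordering is meaningful
# Sorting: sort
GET s18/doc/_search
{
  "query": { "match_all": {} },
  "sort": [ { "age": { "order": "desc" } } ]
}

GET s18/doc/_search
{
  "query": { "match_all": {} },
  "sort": [ { "age": { "order": "asc" } } ]
}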
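GET s18/doc/_search

GET s18/doc/_search
{ "query": { "match_all": {} }, "from": 0, "size": 2 }
# The query above returns the 1st and 2nd documents

GET s18/doc/_search
{ "query": { "match_all": {} }, "from": 2, "size": 2 }
# The query above returns the 3rd and 4th documents

GET s18/doc/_search
{ "query": { "match_all": {} }, "from": 4, "size": 10 }
# The query above asks for the 5th through 14th documents; if fewer exist, it returns as many as there are, so if only 1 remains, only 1 is returned
Pagination is simply a way to choose where the displayed results start and how many are shown.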
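The from/size arithmetic above can be sketched as follows. `page_to_from_size` and `paginate` are hypothetical helpers, and the slice assumes the hits arrive in insertion order, which Elasticsearch does not guarantee without an explicit sort.

```python
def page_to_from_size(page, per_page):
    """Translate a 1-based page number into the from/size pair used above."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"from": (page - 1) * per_page, "size": per_page}


def paginate(hits, from_, size):
    """Toy model of from/size: skip from_ hits, then take at most size."""
    return hits[from_:from_ + size]


# Names of the five sample documents, in insertion order (an assumption).
names = ["yangyazhou", "yangtao", "cancan", "guchenxu", "yangwenyu"]

page2 = page_to_from_size(2, 2)
print(page2)
print(paginate(names, page2["from"], page2["size"]))
print(paginate(names, 4, 10))  # fewer than 10 remain, so only the rest is returned
```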
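# Find documents whose name is yangwenyu or whose age is 18
GET s18/doc/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name": "yangwenyu" } },
        { "match": { "age": "18" } }
      ]
    }
  }
}
# The order of these results comes from the scoring mechanism, which is part of the internal algorithm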
# Find documents where sex is 男 and age is 81
GET s18/doc/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": 81 } },
        { "match": { "sex": "男" } }
      ]
    }
  }
}

# Find documents where sex is not 男 and age is not 18: must_not
GET s18/doc/_search
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "sex": "男" } },
        { "match": { "age": 18 } }
      ]
    }
  }
}

# Find male documents with age greater than 20: gt (greater than)
GET s18/doc/_search
{
  "query": {
    "bool": {
      "must": [ { "match": { "sex": "男" } } ],
      "filter": { "range": { "age": { "gt": 20 } } }
    }
  }
}

# gte (greater than or equal): find male documents with age >= 23
GET s18/doc/_search
{
  "query": {
    "bool": {
      "must": [ { "match": { "sex": "男" } } ],
      "filter": { "range": { "age": { "gte": 23 } } }
    }
  }
}

# lt (less than): find female documents with age < 20
GET s18/doc/_search
{
  "query": {
    "bool": {
      "must": [ { "match": { "sex": "女" } } ],
      "filter": { "range": { "age": { "lt": 20 } } }
    }
  }
}

# lte (less than or equal): find male documents with age <= 23
GET s18/doc/_search
{
  "query": {
    "bool": {
      "should": [ { "match": { "sex": "男" } } ],
      "filter": { "range": { "age": { "lte": 23 } } }
    }
  }
}

# Alongside filter, prefer must over should to avoid unwanted results: once a filter is present, should clauses become optional, so the query above can also return non-male documents
GET s18/doc/_search
{
  "query": {
    "bool": {
      "must": [ { "match": { "sex": "男" } } ],
      "filter": { "range": { "age": { "lte": 23 } } }
    }
  }
}

# Find non-male documents with age <= 23
GET s18/doc/_search
{
  "query": {
    "bool": {
      "must_not": [ { "match": { "sex": "男" } } ],
      "filter": { "range": { "age": { "lte": 23 } } }
    }
  }
}
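To see why should behaves differently from must next to a filter, here is a toy evaluator over the five sample documents. It models only exact field equality and numeric ranges, ignores scoring and text analysis, and assumes the default rule that should clauses are mandatory only when the bool query has no must or filter clause. All function names are hypothetical.

```python
# Toy model of bool query semantics; not the Elasticsearch implementation.
DOCS = [
    {"name": "yangyazhou", "age": 81, "sex": "男"},
    {"name": "yangtao", "age": 18, "sex": "男"},
    {"name": "cancan", "age": 16, "sex": "女"},
    {"name": "guchenxu", "age": 22, "sex": "男"},
    {"name": "yangwenyu", "age": 23, "sex": "男"},
]


def field_equals(field, value):
    """Stand-in for a match clause on an exact value."""
    return lambda doc: doc.get(field) == value


def in_range(field, gt=None, gte=None, lt=None, lte=None):
    """Stand-in for a range clause."""
    def check(doc):
        v = doc[field]
        return ((gt is None or v > gt) and (gte is None or v >= gte)
                and (lt is None or v < lt) and (lte is None or v <= lte))
    return check


def bool_query(docs, must=(), should=(), must_not=(), filters=()):
    required = list(must) + list(filters)
    names = []
    for doc in docs:
        if not all(clause(doc) for clause in required):
            continue
        if any(clause(doc) for clause in must_not):
            continue
        # should is only mandatory when there is no must or filter clause
        if should and not required and not any(c(doc) for c in should):
            continue
        names.append(doc["name"])
    return names


is_male = field_equals("sex", "男")
age_lte_23 = in_range("age", lte=23)

with_should = bool_query(DOCS, should=[is_male], filters=[age_lte_23])
with_must = bool_query(DOCS, must=[is_male], filters=[age_lte_23])
print(with_should)  # includes cancan, who is not male
print(with_must)
```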
Keyword highlighting shows which terms the search matched on.
# Highlight query
# Find documents whose name is cancan
GET s18/doc/_search
{
  "query": { "match": { "name": "cancan" } },
  "highlight": { "fields": { "name": {} } }
}

GET s18/doc/_search
{
  "query": { "match": { "name": "cancan" } },
  "highlight": {
    "pre_tags": "<b style='color:red;font-size:20px;' class='wangdi'>",
    "post_tags": "</b>",
    "fields": { "name": {} }
  }
}
# For now this is only a JSON result; the highlighting becomes visible only when rendered in a front end

PUT s18/doc/7
{ "name": "wangdi", "desc": "騷的打漂" }

GET s18/doc/_search
{
  "query": { "match": { "desc": "打漂" } },
  "highlight": {
    "pre_tags": "<b style='color:red;font-size:20px;' class='wangdi'>",
    "post_tags": "</b>",
    "fields": { "desc": {} }
  }
}
# The query above highlights only "打漂"
# Highlighting marks the important parts
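What the front end receives can be illustrated with a toy string wrapper. `highlight` is a hypothetical helper that wraps exact substring matches; real highlighting works on analyzed tokens, so the actual fragments returned by Elasticsearch may be tagged differently. The `<em>` defaults mirror the tags Elasticsearch uses when pre_tags and post_tags are not set.

```python
def highlight(text, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap every exact occurrence of term in the given tags (toy model)."""
    return text.replace(term, f"{pre_tag}{term}{post_tag}")


default_tags = highlight("cancan", "cancan")
custom_tags = highlight("騷的打漂", "打漂", pre_tag="<b>", post_tag="</b>")

print(default_tags)
print(custom_tags)
```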
# Result filtering
GET s18/doc/_search
{ "query": { "match": { "name": "yangtao" } }, "_source": "name" }

GET s18/doc/_search
{ "query": { "match": { "name": "yangtao" } }, "_source": ["name", "age", "sex"] }
Return only the fields we actually need, which reduces the load on the server.
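The effect of `_source` on each hit can be pictured as a projection over the stored document. This is a minimal sketch with a hypothetical `filter_source` helper that accepts either a single field name or a list, as in the two queries above; wildcard patterns are not modelled.

```python
def filter_source(source, includes):
    """Keep only the requested fields of a stored document (toy model)."""
    if isinstance(includes, str):
        includes = [includes]
    return {field: source[field] for field in includes if field in source}


doc = {"name": "yangtao", "age": 18, "sex": "男", "tags": "浪", "b": "19970521"}

print(filter_source(doc, "name"))
print(filter_source(doc, ["name", "age", "sex"]))
```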
# Aggregation queries
# sum: total age of all male documents
GET s18/doc/_search
{
  "query": { "match": { "sex": "男" } },
  "aggs": { "my_sum": { "sum": { "field": "age" } } }
}

# max: the oldest male
GET s18/doc/_search
{
  "query": { "match": { "sex": "男" } },
  "aggs": { "my_max": { "max": { "field": "age" } } }
}

# min: the youngest age
GET s18/doc/_search
{ "aggs": { "my_min": { "min": { "field": "age" } } } }

# avg: the average age
GET s18/doc/_search
{ "aggs": { "my_avg": { "avg": { "field": "age" } } } }

# Grouping by age into 10-20, 20-30, 30-100: how many people are in each age range?
GET s18/doc/_search
{
  "query": { "match": { "sex": "男" } },
  "aggs": {
    "my_group": {
      "range": {
        "field": "age",
        "ranges": [
          { "from": 10, "to": 20 },
          { "from": 20, "to": 30 },
          { "from": 30, "to": 100 }
        ]
      }
    }
  }
}

# Grouping by age into 10-20, 20-30, 30-100, then summing the ages in each group
GET s18/doc/_search
{
  "query": { "match": { "sex": "男" } },
  "aggs": {
    "group": {
      "range": {
        "field": "age",
        "ranges": [
          { "from": 10, "to": 20 },
          { "from": 20, "to": 30 },
          { "from": 30, "to": 100 }
        ]
      },
      "aggs": { "my_sum": { "sum": { "field": "age" } } }
    }
  }
}
Group first, then aggregate within each group.
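The "group first, then aggregate" step can be sketched in plain Python. The ages assume the five original sample documents are all present and only those with sex 男 are kept; `range_buckets` is a hypothetical helper, and the bucket keys are simplified compared with the real response. As in the range aggregation, each bucket includes its `from` value and excludes its `to` value.

```python
# Ages of the male sample documents (assumes the original five documents).
MALE_AGES = [81, 18, 22, 23]
RANGES = [(10, 20), (20, 30), (30, 100)]


def range_buckets(values, ranges):
    """Group values into [low, high) buckets, then sum each bucket."""
    buckets = []
    for low, high in ranges:
        members = [v for v in values if low <= v < high]
        buckets.append({
            "key": f"{low}-{high}",      # simplified key format
            "doc_count": len(members),   # how many documents fall in the range
            "my_sum": sum(members),      # sub-aggregation on the grouped values
        })
    return buckets


buckets = range_buckets(MALE_AGES, RANGES)
for bucket in buckets:
    print(bucket)
```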
homework:
(1) Write a Python script that starts es and kibana with one command
(2) Inverted index: draw out the table
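As a starting point for the second exercise, here is a minimal sketch of an inverted index built over three hypothetical English sentences (they are not from the notes). It tokenizes by lowercasing and splitting on whitespace, which is far simpler than a real analyzer, and maps each term to the ids of the documents that contain it.

```python
from collections import defaultdict

# Hypothetical sample texts, chosen only to illustrate the table.
SAMPLE_DOCS = {
    1: "quick brown fox",
    2: "quick red fox",
    3: "lazy brown dog",
}


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            postings[token].add(doc_id)
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


index = build_inverted_index(SAMPLE_DOCS)

# Print the index as a two-column table: term, then document ids.
for term, doc_ids in index.items():
    print(f"{term:<8}{doc_ids}")
```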