Elasticsearch由淺入深(一)簡介、、安裝、CRUD

什麼是Elasticsearch

什麼是搜索

百度:咱們好比說想找尋任何的信息的時候,就會上百度去搜索一下,好比說找一部本身喜歡的電影,或者說找一本喜歡的書,或者找一條感興趣的新聞(提到搜索的第一印象),百度 != 搜索java

垂直搜索(站內搜索)node

互聯網的搜索:電商網站,招聘網站,新聞網站,各類app算法

IT系統的搜索:OA軟件,辦公自動化軟件,會議管理,日程管理,項目管理,員工管理,搜索「張三」,「張三兒」,「張小三」;有個電商網站,賣家,後臺管理系統,搜索「牙膏」,訂單,「牙膏相關的訂單」數據庫

搜索,就是在任何場景下,找尋你想要的信息,這個時候,會輸入一段你要搜索的關鍵字,而後就指望找到這個關鍵字相關的有些信息json

若是用數據庫作搜索會怎麼樣

作軟件開發的話,或者對IT、計算機有必定的瞭解的話,都知道,數據都是存儲在數據庫裏面的,好比說電商網站的商品信息,招聘網站的職位信息,新聞網站的新聞信息,等等吧。因此說,很天然的一點,若是說從技術的角度去考慮,如何實現如說,電商網站內部的搜索功能的話,就能夠考慮,去使用數據庫去進行搜索。windows

  1. 比方說,每條記錄的指定字段的文本,可能會很長,好比說「商品描述」字段的長度,有長達數千個,甚至數萬個字符,這個時候,每次都要對每條記錄的全部文本進行掃描,懶判斷說,你包不包含我指定的這個關鍵詞(好比說「牙膏」)
  2. 還不能將搜索詞拆分開來,儘量去搜索更多的符合你的指望的結果,好比輸入「生化機」,就搜索不出來「生化危機」

用數據庫來實現搜索,是不太靠譜的。一般來講,性能會不好的。api

什麼是全文檢索和Lucene

全文檢索:倒排索引服務器

lucene:就是一個jar包,裏面包含了封裝好的各類創建倒排索引,以及進行搜索的代碼,包括各類算法。咱們就用java開發的時候,引入lucene jar,而後基於lucene的api進行去進行開發就能夠了。用lucene,咱們就能夠去將已有的數據創建索引,lucene會在本地磁盤上面,給咱們組織索引的數據結構。另外的話,咱們也能夠用lucene提供的一些功能和api來針對磁盤上額restful

什麼是Elasticsearch

Elasticsearch 是一個分佈式、RESTful 風格的搜索和數據分析引擎,可以解決不斷涌現出的各類用例。 做爲 Elastic Stack 的核心,它集中存儲您的數據,幫助您發現意料之中以及意料以外的狀況。網絡

Elasticsearch的功能、適用場景以及特色

Elasticsearch的功能

  • 分佈式的搜索引擎和數據分析引擎

    搜索:百度,網站的站內搜索,IT系統的檢索
    數據分析:電商網站,最近7天牙膏這種商品銷量排名前10的商家有哪些;新聞網站,最近1個月訪問量排名前3的新聞版塊是哪些
    分佈式,搜索,數據分析

  • 全文檢索,結構化檢索,數據分析

    全文檢索:我想搜索商品名稱包含牙膏的商品,select * from products where product_name like "%牙膏%"
    結構化檢索:我想搜索商品分類爲日化用品的商品都有哪些,select * from products where category_id='日化用品'
    部分匹配、自動完成、搜索糾錯、搜索推薦
    數據分析:咱們分析每個商品分類下有多少個商品,select category_id,count(*) from products group by category_id

  • 對海量數據進行近實時的處理

    分佈式:ES自動能夠將海量數據分散到多臺服務器上去存儲和檢索
    海聯數據的處理:分佈式之後,就能夠採用大量的服務器去存儲和檢索數據,天然而然就能夠實現海量數據的處理了
    近實時:檢索個數據要花費1小時(這就不要近實時,離線批處理,batch-processing);在秒級別對數據進行搜索和分析
    跟分佈式/海量數據相反的:lucene,單機應用,只能在單臺服務器上使用,最多隻能處理單臺服務器能夠處理的數據量

Elasticsearch的適用場景

  • 國外
    (1)維基百科,相似百度百科,牙膏,牙膏的維基百科,全文檢索,高亮,搜索推薦
    (2)The Guardian(國外新聞網站),相似搜狐新聞,用戶行爲日誌(點擊,瀏覽,收藏,評論)+社交網絡數據(對某某新聞的相關見解),數據分析,給到每篇新聞文章的做者,讓他知道他的文章的公衆反饋(好,壞,熱門,垃圾,鄙視,崇拜)
    (3)Stack Overflow(國外的程序異常討論論壇),IT問題,程序的報錯,提交上去,有人會跟你討論和回答,全文檢索,搜索相關問題和答案,程序報錯了,就會將報錯信息粘貼到裏面去,搜索有沒有對應的答案
    (4)GitHub(開源代碼管理),搜索上千億行代碼
    (5)電商網站,檢索商品
    (6)日誌數據分析,logstash採集日誌,ES進行復雜的數據分析(ELK技術,elasticsearch+logstash+kibana)
    (7)商品價格監控網站,用戶設定某商品的價格閾值,當低於該閾值的時候,發送通知消息給用戶,好比說訂閱牙膏的監控,若是高露潔牙膏的家庭套裝低於50塊錢,就通知我,我就去買
    (8)BI系統,商業智能,Business ntelligence。好比說有個大型商場集團,BI,分析一下某某區域最近3年的用戶消費金額的趨勢以及用戶羣體的組成構成,產出相關的數張報表,**區,最近3年,每一年消費金額呈現100%的增加,並且用戶羣體85%是高級白領,開一個新商場。ES執行數據分析和挖掘,Kibana進行數據可視化
  • 國內
    站內搜索(電商,招聘,門戶,等等),IT系統搜索(OA,CRM,ERP,等等),數據分析(ES熱門的一個使用場景)

Elasticsearch的特色

 

(1)能夠做爲一個大型分佈式集羣(數百臺服務器)技術,處理PB級數據,服務大公司;也能夠運行在單機上,服務小公司
(2)Elasticsearch不是什麼新技術,主要是將全文檢索、數據分析以及分佈式技術,合併在了一塊兒,才造成了獨一無二的ES;lucene(全文檢索),商用的數據分析軟件(也是有的),分佈式數據庫(mycat)
(3)對用戶而言,是開箱即用的,很是簡單,做爲中小型的應用,直接3分鐘部署一下ES,就能夠做爲生產環境的系統來使用了,數據量不大,操做不是太複雜
(4)數據庫的功能面對不少領域是不夠用的(事務,還有各類聯機事務型的操做);特殊的功能,好比全文檢索,同義詞處理,相關度排名,複雜數據分析,海量數據的近實時處理;Elasticsearch做爲傳統數據庫的一個補充,提供了數據庫所不不能提供的不少功能

lucene和elasticsearch的前世此生

  • lucene,最早進、功能最強大的搜索庫,直接基於lucene開發,很是複雜,api複雜(實現一些簡單的功能,寫大量的java代碼),須要深刻理解原理(各類索引結構)
  • elasticsearch,基於lucene,隱藏複雜性,提供簡單易用的restful api接口、java api接口(還有其餘語言的api接口)

    (1)分佈式的文檔存儲引擎
    (2)分佈式的搜索引擎和分析引擎
    (3)分佈式,支持PB級數據

elasticsearch的核心概念

  • Near Realtime(NRT):近實時,兩個意思,從寫入數據到數據能夠被搜索到有一個小延遲(大概1秒);基於es執行搜索和分析能夠達到秒級
  • Cluster:集羣,包含多個節點,每一個節點屬於哪一個集羣是經過一個配置(集羣名稱,默認是elasticsearch)來決定的,對於中小型應用來講,剛開始一個集羣就一個節點很正常
  • Node:節點,集羣中的一個節點,節點也有一個名稱(默認是隨機分配的),節點名稱很重要(在執行運維管理操做的時候),默認節點會去加入一個名稱爲「elasticsearch」的集羣,若是直接啓動一堆節點,那麼它們會自動組成一個elasticsearch集羣,固然一個節點也能夠組成一個elasticsearch集羣
  • Document&field:文檔,es中的最小數據單元,一個document能夠是一條客戶數據,一條商品分類數據,一條訂單數據,一般用JSON數據結構表示,每一個index下的type中,均可以去存儲多個document。一個document裏面有多個field,每一個field就是一個數據字段。
  • Index:索引,包含一堆有類似結構的文檔數據,好比能夠有一個客戶索引,商品分類索引,訂單索引,索引有一個名稱。一個index包含不少document,一個index就表明了一類相似的或者相同的document。好比說創建一個product index,商品索引,裏面可能就存放了全部的商品數據,全部的商品document。
  • Type:類型,每一個索引裏均可以有一個或多個type,type是index中的一個邏輯數據分類,一個type下的document,都有相同的field,好比博客系統,有一個索引,能夠定義用戶數據type,博客數據type,評論數據type。
  • shard:單臺機器沒法存儲大量數據,es能夠將一個索引中的數據切分爲多個shard,分佈在多臺服務器上存儲。有了shard就能夠橫向擴展,存儲更多數據,讓搜索和分析等操做分佈到多臺服務器上去執行,提高吞吐量和性能。每一個shard都是一個lucene index。
  • replica:任何一個服務器隨時可能故障或宕機,此時shard可能就會丟失,所以能夠爲每一個shard建立多個replica副本。replica能夠在shard故障時提供備用服務,保證數據不丟失,多個replica還能夠提高搜索操做的吞吐量和性能。primary shard(創建索引時一次設置,不能修改,默認5個),replica shard(隨時修改數量,默認1個),默認每一個索引10個shard,5個primary shard,5個replica shard,最小的高可用配置,是2臺服務器。

在windows上安裝和啓動Elasticseach

  1. 安裝JDK,至少1.8.0_73以上版本,java -version
  2. 下載和解壓縮Elasticsearch安裝包,目錄結構,下載地址:https://www.elastic.co/cn/downloads/elasticsearch,下載版本:5.2.0
  3. 啓動Elasticsearch:bin\elasticsearch.bat,es自己特色之一就是開箱即用,若是是中小型應用,數據量少,操做不是很複雜,直接啓動就能夠用了
  4. 檢查ES是否啓動成功:http://localhost:9200/?pretty
    // name: node名稱
    // cluster_name: 集羣名稱(默認的集羣名稱就是elasticsearch)
    // version.number: 5.2.0,es版本號
    {
        name: "1LdqLFq",
        cluster_name: "elasticsearch",
        cluster_uuid: "5pqT0Q_XQky6GKjSiFgilA",
        version: {
            number: "5.2.0",
            build_hash: "24e05b9",
            build_date: "2017-01-24T19:52:35.800Z",
            build_snapshot: false,
            lucene_version: "6.4.0"
        },
        tagline: "You Know, for Search"
    }
    View Code
  5. 修改集羣名稱:elasticsearch.yml
  6. 下載和解壓縮Kibana安裝包,使用裏面的開發界面,去操做elasticsearch,做爲咱們學習es知識點的一個主要的界面入口
  7. 啓動Kibana:bin\kibana.bat。地址:http://localhost:5601
  8. 進入Dev Tools界面
  9. GET _cluster/health

快速入門案例實戰之電商網站商品管理:集羣健康檢查,文檔CRUD

document數據格式

  1. 應用系統的數據結構都是面向對象的,複雜的

  2. 對象數據存儲到數據庫中,只能拆解開來,變爲扁平的多張表,每次查詢的時候還得還原回對象格式,至關麻煩
  3. ES是面向文檔的,文檔中存儲的數據結構,與面向對象的數據結構是同樣的,基於這種文檔數據結構,es能夠提供複雜的索引,全文檢索,分析聚合等功能
  4. es的document用json數據格式來表達

電商網站商品管理案例背景介紹

有一個電商網站,須要爲其基於ES構建一個後臺系統,提供如下功能:

  1. 對商品信息進行CRUD(增刪改查)操做
  2. 執行簡單的結構化查詢
  3. 能夠執行簡單的全文檢索,以及複雜的phrase(短語)檢索
  4. 對於全文檢索的結果,能夠進行高亮顯示
  5. 對數據進行簡單的聚合分析

簡單的集羣管理

快速檢查集羣的健康情況

  1. es提供了一套api,叫作cat api,能夠查看es中各類各樣的數據
    GET /_cat/health?v
    epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
    1566094709 10:18:29  elasticsearch yellow          1         1      1   1    0    0        1             0                  -                 50.0%

    如何快速瞭解集羣的健康情況?green、yellow、red?

    green:每一個索引的primary shard和replica shard都是active狀態的
    yellow:每一個索引的primary shard都是active狀態的,可是部分replica shard不是active狀態,處於不可用的狀態
    red:不是全部索引的primary shard都是active狀態的,部分索引有數據丟失了

    爲何如今會處於一個yellow狀態?

    咱們如今就一個筆記本電腦,就啓動了一個es進程,至關於就只有一個node。
    如今es中有一個index,就是kibana本身內置創建的index。
    因爲默認的配置是給每一個index分配5個primary shard和5個replica shard,並且primary shard和replica shard不能在同一臺機器上(爲了容錯)。
    如今kibana本身創建的index是1個primary shard和1個replica shard。
    當前就一個node,因此只有1個primary shard被分配了和啓動了,可是一個replica shard沒有第二臺機器去啓動。

    作一個小實驗:此時只要啓動第二個es進程,就會在es集羣中有2個node,而後那1個replica shard就會自動分配過去,而後cluster status就會變成green狀態。

快速查看集羣中有哪些索引

GET _cat/indices?v
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana xpiNHK4UQb2569AzgiveSw   1   1          1            0      3.1kb          3.1kb

簡單的索引操做

  • 建立索引
    PUT /test_index?pretty
  • 刪除索引
    DELETE /test_index?pretty

商品的CRUD操做

新增商品:新增文檔,創建索引

語法:

PUT /index/type/id
{
  "json數據"
}

示例:

PUT /ecommerce/product/1
{
    "name" : "gaolujie yagao",
    "desc" :  "gaoxiao meibai",
    "price" :  30,
    "producer" :      "gaolujie producer",
    "tags": [ "meibai", "fangzhu" ]
}

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

PUT /ecommerce/product/2
{
    "name" : "jiajieshi yagao",
    "desc" :  "youxiao fangzhu",
    "price" :  25,
    "producer" :      "jiajieshi producer",
    "tags": [ "fangzhu" ]
}

PUT /ecommerce/product/3
{
    "name" : "zhonghua yagao",
    "desc" :  "caoben zhiwu",
    "price" :  40,
    "producer" :      "zhonghua producer",
    "tags": [ "qingxin" ]
}
View Code

es會自動創建index和type,不須要提早建立,並且es默認會對document每一個field都創建倒排索引,讓其能夠被搜索

查詢商品:檢索文檔

語法:

GET /index/type/id

示例:

GET /ecommerce/product/1

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}
View Code

修改商品:替換文檔

語法:

PUT /index/type/id
{
  "json數據"
}

示例:

PUT /ecommerce/product/1
{
    "name" : "jiaqiangban gaolujie yagao",
    "desc" :  "gaoxiao meibai",
    "price" :  30,
    "producer" :      "gaolujie producer",
    "tags": [ "meibai", "fangzhu" ]
}

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": false
}
View Code

替換方式有一個很差,即便必須帶上全部的field,才能去進行信息的修改(意思是會所有覆蓋)

修改商品:更新文檔

語法:

POST /index/type/id/_update
{
  "json數據"
}

示例:

POST /ecommerce/product/1/_update
{
  "doc": {
    "name": "jiaqiangban gaolujie yagao"
  }
}

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 8,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
View Code

刪除商品:刪除文檔

語法:

DELETE /index/type/id

示例:

DELETE /ecommerce/product/1

{
  "found": true,
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 9,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
View Code

快速入門案例實戰之電商網站商品管理:多種搜索方式

query string search

搜索所有商品:

GET /ecommerce/product/_search
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}
View Code
took:耗費了幾毫秒
timed_out:是否超時,這裏是沒有
_shards:數據拆成了5個分片,因此對於搜索請求,會打到全部的primary shard(或者是它的某個replica shard也能夠)
hits.total:查詢結果的數量,3個document
hits.max_score:score的含義,就是document對於一個search的相關度的匹配分數,越相關,就越匹配,分數也高
hits.hits:包含了匹配搜索的document的詳細數據

搜索商品名稱中包含yagao的商品,並且按照售價降序排序

GET /ecommerce/product/_search?q=name:yagao&sort=price:desc

query string search的由來,由於search參數都是以http請求的query string來附帶的

適用於臨時的在命令行使用一些工具,好比curl,快速的發出請求,來檢索想要的信息;可是若是查詢請求很複雜,是很難去構建的
在生產環境中,幾乎不多使用query string search

query DSL

DSL:Domain Specified Language,特定領域的語言

優勢:更加適合生產環境的使用,能夠構建複雜的查詢

http request body:請求體,能夠用json的格式來構建查詢語法,比較方便,能夠構建各類複雜的語法,比query string search確定強大多了

  • 查詢全部的商品:
    GET /ecommerce/product/_search
    {
      "query": {
        "match_all": {}
      }
    }
  • 查詢名稱包含yagao的商品,同時按照價格降序排序
    GET /ecommerce/product/_search
    {
      "query": {
        "match": {
          "name": "yagao"
        }
      },
      "sort": [
        {
          "price": {
            "order": "desc"
          }
        }
      ]
    }
  • 分頁查詢商品,總共3條商品,假設每頁就顯示1條商品,如今顯示第2頁,因此就查出來第2個商品

    GET /ecommerce/product/_search
    {
      "query": {
        "match_all": {}
      },
      "from": 1,
      "size": 1
    }
  • 指定要查詢出來商品的名稱和價格就能夠
    GET /ecommerce/product/_search
    {
      "query": {
        "match_all": {}
      },
      "_source": ["name","price"]
    }

     

query filter

搜索商品名稱包含yagao,並且售價大於25元的商品

GET /ecommerce/product/_search
{
    "query": {
        "bool": {
            "must": {
                "match": {
                    "name": "yagao"
                }
            },
            "filter": {
                "range": {
                    "price": {
                        "gt": 25
                    }
                }
            }
        }
    }
}

full-text search(全文檢索)

新增測試數據

PUT /ecommerce/product/4
{
    "name":"special yagao",
    "desc":"special meibai",
    "price":50,
    "producer":"special yagao producer",
    "tags":["meibai"]
}
View Code

全文模糊檢索

GET /ecommerce/product/_search
{
    "query" : {
        "match" : {
            "producer" : "yagao producer"
        }
    }
}
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.70293105,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 0.70293105,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 0.1805489,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      }
    ]
  }
}
View Code

phrase search(短語搜索)

跟全文檢索相對應,相反,全文檢索會將輸入的搜索串拆解開來,去倒排索引裏面去一一匹配,只要能匹配上任意一個拆解後的單詞,就能夠做爲結果返回
phrase search,要求輸入的搜索串,必須在指定的字段文本中,徹底包含如出一轍的,才能夠算匹配,才能做爲結果返回

GET /ecommerce/product/_search
{
    "query" : {
        "match_phrase" : {
            "producer" : "yagao producer"
        }
    }
}
{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.70293105,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 0.70293105,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        }
      }
    ]
  }
}
View Code

highlight search(高亮搜索結果)

GET /ecommerce/product/_search
{
    "query" : {
        "match" : {
            "producer" : "producer"
        }
    },
    "highlight": {
        "fields" : {
            "producer" : {}
        }
    }
}
{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.25811607,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        },
        "highlight": {
          "producer": [
            "gaolujie <em>producer</em>"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        },
        "highlight": {
          "producer": [
            "zhonghua <em>producer</em>"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 0.1805489,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        },
        "highlight": {
          "producer": [
            "jiajieshi <em>producer</em>"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 0.14638957,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        },
        "highlight": {
          "producer": [
            "special yagao <em>producer</em>"
          ]
        }
      }
    ]
  }
}
View Code

快速入門案例實戰之電商網站商品管理:嵌套聚合,下鑽分析,聚合分析

計算每一個tag下的商品數量

//將文本field的fielddata屬性設置爲true

PUT /ecommerce/_mapping/product
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}

// 聚合計算
GET /ecommerce/product/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
{
  "took": 20,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 1,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2
        },
        {
          "key": "meibai",
          "doc_count": 2
        },
        {
          "key": "qingxin",
          "doc_count": 1
        }
      ]
    }
  }
}
View Code

不返回hit信息

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": { "field": "tags" }
    }
  }
}
{
  "took": 20,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2
        },
        {
          "key": "meibai",
          "doc_count": 2
        },
        {
          "key": "qingxin",
          "doc_count": 1
        }
      ]
    }
  }
}
View Code

對名稱中包含yagao的商品,計算每一個tag下的商品數量

GET /ecommerce/product/_search
{
  "query": {
    "match": {
      "name": "yagao"
    }
  }, 
  "size": 0, 
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2
        },
        {
          "key": "meibai",
          "doc_count": 2
        },
        {
          "key": "qingxin",
          "doc_count": 1
        }
      ]
    }
  }
}
View Code

先分組,再算每組的平均值,計算每一個tag下的商品的平均價格

GET /ecommerce/product/_search
{
  "size": 0, 
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2,
          "avg_price": {
            "value": 27.5
          }
        },
        {
          "key": "meibai",
          "doc_count": 2,
          "avg_price": {
            "value": 40
          }
        },
        {
          "key": "qingxin",
          "doc_count": 1,
          "avg_price": {
            "value": 40
          }
        }
      ]
    }
  }
}
View Code

計算每一個tag下的商品的平均價格,而且按照平均價格降序排序

GET /ecommerce/product/_search
{
  "size": 0, 
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags",
        "order": {
          "avg_price": "desc"
        }
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "meibai",
          "doc_count": 2,
          "avg_price": {
            "value": 40
          }
        },
        {
          "key": "qingxin",
          "doc_count": 1,
          "avg_price": {
            "value": 40
          }
        },
        {
          "key": "fangzhu",
          "doc_count": 2,
          "avg_price": {
            "value": 27.5
          }
        }
      ]
    }
  }
}
View Code

按照指定的價格範圍區間進行分組,而後在每組內再按照tag進行分組,最後再計算每組的平均價格

GET /ecommerce/product/_search
{
  "size": 0, 
  "aggs":{
    "group_by_price":{
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 20
          },{
            "from": 20,
            "to": 40
          },{
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": {
            "field": "tags"
          },
          "aggs": {
            "average_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_price": {
      "buckets": [
        {
          "key": "0.0-20.0",
          "from": 0,
          "to": 20,
          "doc_count": 0,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": []
          }
        },
        {
          "key": "20.0-40.0",
          "from": 20,
          "to": 40,
          "doc_count": 2,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "fangzhu",
                "doc_count": 2,
                "average_price": {
                  "value": 27.5
                }
              },
              {
                "key": "meibai",
                "doc_count": 1,
                "average_price": {
                  "value": 30
                }
              }
            ]
          }
        },
        {
          "key": "40.0-50.0",
          "from": 40,
          "to": 50,
          "doc_count": 1,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "qingxin",
                "doc_count": 1,
                "average_price": {
                  "value": 40
                }
              }
            ]
          }
        }
      ]
    }
  }
}
View Code
相關文章
相關標籤/搜索