search(14) - elastic4s - aggregation scope: global, filter, and post-filter buckets

Aggregations normally operate within the scope of the query. An aggregation request that carries no query is in fact computed within the scope of a match_all {} query:

GET /cartxns/_search
{
  "aggs": {
    "all_colors": {
      "terms": {"field" : "color.keyword"}
    }
  }
}

GET /cartxns/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "all_colors": {
      "terms": {"field" : "color.keyword"}
    }
  }
}

The two requests above return the same result:

"aggregations" : {
  "all_colors" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      { "key" : "red", "doc_count" : 4 },
      { "key" : "blue", "doc_count" : 2 },
      { "key" : "green", "doc_count" : 2 }
    ]
  }
}

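Most of the time we want statistics computed within the query scope, but sometimes we also need totals that are free of any query condition. For example, while computing the average selling price of one make, we may also need the average selling price across all makes. The all-makes average is a global bucket aggregation:

GET /cartxns/_search
{
  "query" : {
    "match" : {"make.keyword": "ford"}
  },
  "aggs": {
    "avg_ford": {
      "avg": { "field": "price" }
    },
    "avg_all" : {
      "global": {},
      "aggs": {
        "avg_price": {
          "avg": {"field": "price"}
        }
      }
    }
  }
}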

The search hits and the aggregation results are as follows:

"hits" : {
  "total" : { "value" : 2, "relation" : "eq" },
  "max_score" : 1.2809337,
  "hits" : [
    {
      "_index" : "cartxns",
      "_type" : "_doc",
      "_id" : "NGVXAnIBSDa1Wo5UqLc3",
      "_score" : 1.2809337,
      "_source" : { "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
    },
    {
      "_index" : "cartxns",
      "_type" : "_doc",
      "_id" : "OWVYAnIBSDa1Wo5UTrf8",
      "_score" : 1.2809337,
      "_source" : { "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
    }
  ]
},
"aggregations" : {
  "avg_all" : {
    "doc_count" : 8,
    "avg_price" : { "value" : 26500.0 }
  },
  "avg_ford" : { "value" : 27500.0 }
}

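Expressed in elastic4s:

// Assumes `import com.sksamuel.elastic4s.ElasticDsl._` and an ElasticClient
// named `client` are already in scope.
val aggGlob = search("cartxns").query(
  matchQuery("make.keyword","ford")
).aggregations(
  // computed over the query results only
  avgAggregation("single_avg").field("price"),
  // the global bucket ignores the query and covers the whole index
  globalAggregation("all_avg").subaggs(
    avgAggregation("avg_price").field("price")
  )
)
println(aggGlob.show)
val globResult = client.execute(aggGlob).await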

if (globResult.isSuccess) {
  val gavg = globResult.result.aggregations.global("all_avg").avg("avg_price")
  val savg = globResult.result.aggregations.avg("single_avg")
  println(s"${savg.value},${gavg.value}")
  globResult.result.hits.hits.foreach(h => println(s"${h.sourceAsMap}"))
} else println(s"error: ${globResult.error.causedBy.getOrElse("unknown")}")

...

POST:/cartxns/_search?
StringEntity({"query":{"match":{"make.keyword":{"query":"ford"}}},"aggs":{"single_avg":{"avg":{"field":"price"}},"all_avg":{"global":{},"aggs":{"avg_price":{"avg":{"field":"price"}}}}}},Some(application/json))
27500.0,26500.0
Map(price -> 30000, color -> green, make -> ford, sold -> 2014-05-18)
Map(price -> 25000, color -> blue, make -> ford, sold -> 2014-02-12)
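To make the difference between the two scopes concrete, here is a minimal in-memory sketch in plain Scala, with no Elasticsearch or elastic4s involved. It models the query as a filter over a list and the global bucket as an aggregation over the full list. The `CarTxn` case class and the object name are hypothetical, and the toyota and bmw rows are assumed: they do not appear in the outputs above, but they are consistent with the 8-document count, the color buckets, and the 26500 overall average.

```scala
// In-memory model of query scope vs. global scope (illustration only).
object GlobalScopeSketch {
  final case class CarTxn(price: Int, color: String, make: String, sold: String)

  // Five rows are taken from the outputs above; rows marked "assumed" are not.
  val txns: List[CarTxn] = List(
    CarTxn(10000, "red",   "honda",  "2014-10-28"),
    CarTxn(20000, "red",   "honda",  "2014-11-05"),
    CarTxn(30000, "green", "ford",   "2014-05-18"),
    CarTxn(15000, "blue",  "toyota", "2014-07-02"), // assumed
    CarTxn(12000, "green", "toyota", "2014-08-19"), // assumed
    CarTxn(20000, "red",   "honda",  "2014-11-05"),
    CarTxn(80000, "red",   "bmw",    "2014-01-01"), // assumed
    CarTxn(25000, "blue",  "ford",   "2014-02-12")
  )

  def avgPrice(docs: List[CarTxn]): Double =
    docs.map(_.price.toDouble).sum / docs.size

  // Query scope: only the documents matching make == "ford".
  val hits: List[CarTxn] = txns.filter(_.make == "ford")
  val avgFord: Double = avgPrice(hits)

  // Global scope: ignores the query and sees every document.
  val avgAll: Double = avgPrice(txns)

  def main(args: Array[String]): Unit =
    println(s"$avgFord,$avgAll")
}
```

The point of the sketch is only that `avgFord` is computed from `hits` while `avgAll` is computed from `txns`; the real global bucket does the same thing on the server side.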

The filter bucket narrows the query results further before aggregating. For example: query all honda transactions, but total only the honda sales of one particular month:

GET /cartxns/_search
{
  "query": {
    "match": { "make.keyword": "honda" }
  },
  "aggs": {
    "sales_this_month": {
      "filter": {
        "range" : {
          "sold" : { "from" : "2014-10-01", "to" : "2014-11-01" }
        }
      },
      "aggs": {
        "month_total": {
          "sum": {"field": "price"}
        }
      }
    }
  }
}

First, the query hits should be unaffected. On top of that, we also get the sales total for one month among the matched transactions:

"hits" : {
  "total" : { "value" : 3, "relation" : "eq" },
  "max_score" : 0.9444616,
  "hits" : [
    {
      "_index" : "cartxns",
      "_type" : "_doc",
      "_id" : "MmVXAnIBSDa1Wo5UqLc3",
      "_score" : 0.9444616,
      "_source" : { "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
    },
    {
      "_index" : "cartxns",
      "_type" : "_doc",
      "_id" : "M2VXAnIBSDa1Wo5UqLc3",
      "_score" : 0.9444616,
      "_source" : { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
    },
    {
      "_index" : "cartxns",
      "_type" : "_doc",
      "_id" : "N2VXAnIBSDa1Wo5UqLc3",
      "_score" : 0.9444616,
      "_source" : { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
    }
  ]
},
"aggregations" : {
  "sales_this_month" : {
    "doc_count" : 1,
    "month_total" : { "value" : 10000.0 }
  }
}

The elastic4s version:

val aggfilter = search("cartxns").query(
  matchQuery("make.keyword","honda")
).aggregations(
  filterAgg("sales_the_month",rangeQuery("sold").gte("2014-10-01").lte("2014-11-01"))
    .subaggs(sumAggregation("monthly_sales").field("price"))
)
println(aggfilter.show)
val filterResult = client.execute(aggfilter).await

if (filterResult.isSuccess) {
  val ms = filterResult.result.aggregations.filter("sales_the_month")
    .sum("monthly_sales").value
  println(s"${ms}")
  filterResult.result.hits.hits.foreach(h => println(s"${h.sourceAsMap}"))
} else println(s"error: ${filterResult.error.causedBy.getOrElse("unknown")}")

...

POST:/cartxns/_search?
StringEntity({"query":{"match":{"make.keyword":{"query":"honda"}}},"aggs":{"sales_the_month":{"filter":{"range":{"sold":{"gte":"2014-10-01","lte":"2014-11-01"}}},"aggs":{"monthly_sales":{"sum":{"field":"price"}}}}}},Some(application/json))
10000.0
Map(price -> 10000, color -> red, make -> honda, sold -> 2014-10-28)
Map(price -> 20000, color -> red, make -> honda, sold -> 2014-11-05)
Map(price -> 20000, color -> red, make -> honda, sold -> 2014-11-05)
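The filter bucket can be pictured the same way with a small in-memory sketch in plain Scala (no Elasticsearch involved). The query selects the hits, and the filter is applied to those hits only for the purpose of the aggregation. The `CarTxn` case class, the `txn` helper, and the object name are hypothetical; the toyota and bmw rows are assumed, since they are not shown in the outputs above.

```scala
import java.time.LocalDate

// In-memory model of a filter bucket nested under a query (illustration only).
object FilterBucketSketch {
  final case class CarTxn(price: Int, color: String, make: String, sold: LocalDate)

  private def txn(price: Int, color: String, make: String, sold: String): CarTxn =
    CarTxn(price, color, make, LocalDate.parse(sold))

  val txns: List[CarTxn] = List(
    txn(10000, "red",   "honda",  "2014-10-28"),
    txn(20000, "red",   "honda",  "2014-11-05"),
    txn(30000, "green", "ford",   "2014-05-18"),
    txn(15000, "blue",  "toyota", "2014-07-02"), // assumed
    txn(12000, "green", "toyota", "2014-08-19"), // assumed
    txn(20000, "red",   "honda",  "2014-11-05"),
    txn(80000, "red",   "bmw",    "2014-01-01"), // assumed
    txn(25000, "blue",  "ford",   "2014-02-12")
  )

  // Query: make == "honda". These are the returned hits.
  val hits: List[CarTxn] = txns.filter(_.make == "honda")

  // Filter bucket: an inclusive date range applied on top of the query results.
  private val from = LocalDate.parse("2014-10-01")
  private val to   = LocalDate.parse("2014-11-01")
  val salesThisMonth: List[CarTxn] =
    hits.filter(t => !t.sold.isBefore(from) && !t.sold.isAfter(to))

  // Sub-aggregation: sum of price inside the bucket.
  val monthTotal: Double = salesThisMonth.map(_.price.toDouble).sum

  def main(args: Array[String]): Unit =
    println(s"hits=${hits.size}, bucket=${salesThisMonth.size}, total=$monthTotal")
}
```

Note that `hits` keeps all three honda transactions; only the bucket feeding the sum is narrowed.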

The last one is post-filter. A post-filter also filters the query results, but it is applied only after the whole query has completed. In other words, if the request also contains aggregations, those aggregations are not affected by the filter:

GET /cartxns/_search
{
  "query": {
    "match": { "make.keyword": "ford" }
  },
  "post_filter": {
    "match" : { "color.keyword" : "blue" }
  },
  "aggs": {
    "colors": {
      "terms": { "field": "color.keyword", "size": 10 }
    }
  }
}

The query hits and the aggregation results are as follows:

"hits" : {
  "total" : { "value" : 1, "relation" : "eq" },
  "max_score" : 1.2809337,
  "hits" : [
    {
      "_index" : "cartxns",
      "_type" : "_doc",
      "_id" : "OWVYAnIBSDa1Wo5UTrf8",
      "_score" : 1.2809337,
      "_source" : { "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
    }
  ]
},
"aggregations" : {
  "colors" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      { "key" : "blue", "doc_count" : 1 },
      { "key" : "green", "doc_count" : 1 }
    ]
  }
}

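As we can see, the hits contain only the documents that passed the post-filter, while the aggregation still covers both ford transactions and was not affected by the filter.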

elastic4s sample code:

val aggPost = search("cartxns").query(
  matchQuery("make.keyword","ford")
).postFilter(matchQuery("color.keyword","blue"))
  .aggregations(
    termsAgg("colors","color.keyword")
  )
println(aggPost.show)
val postResult = client.execute(aggPost).await

if (postResult.isSuccess) {
  postResult.result.hits.hits.foreach(h => println(s"${h.sourceAsMap}"))
  postResult.result.aggregations.terms("colors").buckets
    .foreach(b => println(s"${b.key},${b.docCount}"))
} else println(s"error: ${postResult.error.causedBy.getOrElse("unknown")}")

...

POST:/cartxns/_search?
StringEntity({"query":{"match":{"make.keyword":{"query":"ford"}}},"post_filter":{"match":{"color.keyword":{"query":"blue"}}},"aggs":{"colors":{"terms":{"field":"color.keyword"}}}},Some(application/json))
Map(price -> 25000, color -> blue, make -> ford, sold -> 2014-02-12)
blue,1
green,1
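The order of operations behind post-filter can also be sketched in memory with plain Scala (no Elasticsearch involved): the aggregation is computed from the query results first, and the post-filter then trims only the list of returned hits. The `CarTxn` case class and the object name are hypothetical, and the toyota and bmw rows are assumed because they are not shown in the outputs above.

```scala
// In-memory model of post_filter: aggregate first, trim the hits afterwards.
object PostFilterSketch {
  final case class CarTxn(price: Int, color: String, make: String, sold: String)

  val txns: List[CarTxn] = List(
    CarTxn(10000, "red",   "honda",  "2014-10-28"),
    CarTxn(20000, "red",   "honda",  "2014-11-05"),
    CarTxn(30000, "green", "ford",   "2014-05-18"),
    CarTxn(15000, "blue",  "toyota", "2014-07-02"), // assumed
    CarTxn(12000, "green", "toyota", "2014-08-19"), // assumed
    CarTxn(20000, "red",   "honda",  "2014-11-05"),
    CarTxn(80000, "red",   "bmw",    "2014-01-01"), // assumed
    CarTxn(25000, "blue",  "ford",   "2014-02-12")
  )

  // Query: make == "ford".
  val matched: List[CarTxn] = txns.filter(_.make == "ford")

  // Terms aggregation over the query results, unaffected by the post-filter.
  val colorCounts: Map[String, Int] =
    matched.groupBy(_.color).map { case (color, docs) => color -> docs.size }

  // Post-filter: applied last, and only to the hits that are returned.
  val hits: List[CarTxn] = matched.filter(_.color == "blue")

  def main(args: Array[String]): Unit = {
    hits.foreach(println)
    colorCounts.foreach { case (color, n) => println(s"$color,$n") }
  }
}
```

Swapping the last two steps, so that the color filter ran before `groupBy`, would model a filter placed inside the query instead, and the green bucket would disappear.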