Elasticsearch supports running complex statistics over the stored documents; these are called aggregations.
Aggregations in ES fall into two broad categories:
1. Metrics: a metric is a simple computation such as avg or max over the filtered document set, producing a single numeric value.
2. Bucket: a bucket aggregation splits the filtered document set into several smaller sets according to some criterion; metrics can then be applied separately to each of those sub-sets.
Aggregations are a very important concept in the ELK stack. We don't need to dig deeply into how ES implements them, but a basic understanding of the ES query process helps us get started with Kibana quickly, since Kibana builds these aggregations through mouse clicks.
1. First download and import the sample data:
git clone git@github.com:xiaoluoge11/longguo-devops.git
Run the import script:
[root@controller longguo-devops]# ./car.sh
# Note: we will build aggregations that might be useful to a car dealer. The data describes car transactions: car model, manufacturer, sale price, sale date, and some other related fields.
Bucket:
1. Group by time (date_histogram, which Kibana renders as a bar chart over time intervals):
[root@controller .ssh]# curl -XGET '192.168.63.235:9200/cars/transactions/_search?pretty' -d '
{
"aggs" : {
"articles_over_time" : {
"date_histogram" : {
"field" : "sold",
"interval": "month" ##區間能夠爲:data.hour,munite,year等
}
}
}
}'
Response:
"aggregations" : {
"articles_over_time" : {
"buckets" : [
{
"key_as_string" : "2014-01-01T00:00:00.000Z",
"key" : 1388534400000,
"doc_count" : 1
},
{
"key_as_string" : "2014-02-01T00:00:00.000Z",
"key" : 1391212800000,
"doc_count" : 1
},
##### The parameters can also be specified like this:
"field" : "sold",
"interval" : "month",
"format" : "yyyy-MM-dd", ### output format for the bucket keys
"offset": "+6h" ### shifts each bucket boundary by this offset
#### Or query by explicit date ranges (date_range aggregation):
"aggs": {
"range": {
"date_range": {
"field": "date",
"time_zone": "CET",
"ranges": [
{ "to": "2016-02-15/d" },
{ "from": "2016-02-15/d", "to" : "now/d" },
{ "from": "now/d" }
]
}
}
}
2. Price distribution as a histogram (Histogram Aggregation):
[root@controller .ssh]# curl -XGET '192.168.63.235:9200/cars/transactions/_search?pretty' -d '
{
"aggs" : {
"prices" : {
"histogram" : {
"field" : "price",
"interval" : 5000
}
}
}
}'
### The histogram aggregation divides the price axis into equal-width buckets of 5000 and counts which bucket each price value falls into:
Response:
"aggregations" : {
"prices" : {
"buckets" : [
{
"key" : 10000.0,
"doc_count" : 2
},
{
"key" : 15000.0,
"doc_count" : 1
},
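The bucket keys above (10000.0, 15000.0, …) come from rounding each value down to a multiple of the interval. A minimal sketch of that key computation, assuming integer prices:

```python
def histogram_key(value, interval):
    # Elasticsearch assigns each document to the bucket whose key is the
    # value rounded down to the nearest multiple of the interval.
    return (value // interval) * interval

print(histogram_key(12000, 5000))  # 10000
print(histogram_key(17500, 5000))  # 15000
```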
3. Sales count per color:
[root@controller .ssh]# curl -XGET '192.168.63.235:9200/cars/transactions/_search?pretty' -d '
{
"aggs" : {
"genres" : {
"terms" : { "field" : "color" }
}
}
}'
### Note: this query may fail with the following error:
"reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [color] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
This tells us that text fields do not support fielddata by default, so we update the mapping to enable it on the color field:
[root@controller .ssh]# curl -XPUT '192.168.63.235:9200/cars/_mapping/transactions' -d '
> {
> "properties": {
> "color": {
> "type": "text",
> "fielddata": true
> }
> }
> }'
{"acknowledged":true}
Query again and you will see the distribution:
"buckets" : [
{
"key" : "red",
"doc_count" : 4
},
{
"key" : "blue",
"doc_count" : 2
},
{
"key" : "green",
"doc_count" : 2
}
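The terms aggregation behaves like counting distinct values and sorting by count. A sketch with hypothetical colors chosen to match the doc_counts shown above:

```python
from collections import Counter

# Hypothetical color values matching the bucket counts in the response.
colors = ["red", "red", "red", "red", "blue", "blue", "green", "green"]

# A terms aggregation creates one bucket per distinct field value and
# counts the documents in each, ordered by doc_count descending.
buckets = Counter(colors).most_common()
print(buckets)  # [('red', 4), ('blue', 2), ('green', 2)]
```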
4. Add a metric (Metric) inside each bucket:
[root@controller .ssh]# curl -XGET '192.168.63.235:9200/cars/transactions/_search?pretty' -d '
{
"aggs" : {
"genres" : {
"terms" : { "field" : "color" }
,
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}'
#### avg can be replaced with max, min, sum, etc.; stats returns all of them at once.
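The nested structure means the avg metric runs once inside every terms bucket. A sketch of that semantics with hypothetical (color, price) documents:

```python
from collections import defaultdict

# Hypothetical (color, price) pairs standing in for the car documents.
cars = [("red", 10000), ("red", 20000), ("blue", 15000), ("blue", 25000)]

# First, the terms aggregation groups documents by color...
groups = defaultdict(list)
for color, price in cars:
    groups[color].append(price)

# ...then the nested "avg" sub-aggregation runs inside each bucket.
avg_price = {color: sum(p) / len(p) for color, p in groups.items()}
print(avg_price)  # {'red': 15000.0, 'blue': 20000.0}
```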
5. Use stats to get all the metric values at once:
curl -XGET '192.168.63.235:9200/cars/transactions/_search?pretty' -d '
{
"aggs" : {
"genres" : {
"terms" : { "field" : "color" }
,
"aggs": {
"avg_price": {
"stats": {
"field": "price"
}
}
}
}
}
}'
#### Response:
"buckets" : [
{
"key" : "red",
"doc_count" : 4,
"avg_price" : {
"count" : 4,
"min" : 10000.0,
"max" : 80000.0,
"avg" : 32500.0,
"sum" : 130000.0
}
}
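The stats object in the response is just the five basic metrics computed over one bucket's values. A sketch with hypothetical prices chosen to be consistent with the "red" bucket above:

```python
# Hypothetical prices consistent with the stats shown for the "red" bucket.
prices = [10000, 20000, 20000, 80000]

# "stats" returns count, min, max, avg and sum in a single pass.
stats = {
    "count": len(prices),
    "min": min(prices),
    "max": max(prices),
    "avg": sum(prices) / len(prices),
    "sum": sum(prices),
}
print(stats)  # {'count': 4, 'min': 10000, 'max': 80000, 'avg': 32500.0, 'sum': 130000}
```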
This content is based on study notes from the course "Log Analysis with the ELK Stack in Practice".