MongoDB 聚合Group(二)

時間 2019-11-10

標籤 mongodb 聚合 group 欄目 MongoDB 简体版

原文原文鏈接

轉自：https://blog.csdn.net/congcong68/article/details/51620040

1、簡介：

上一篇咱們對 db.collection.aggregate(pipeline, options)介紹，咱們接下來介紹pipeline 參數和options參數的基礎認識sql

【pipeline 參數】mongodb

pipeline 類型是Array 語法：db.collection.aggregate( [ { <stage> }, ... ] )shell

$project：能夠對輸入文檔進行添加新字段或刪除現有的字段，能夠自定哪些字段顯示與不顯示。express

$match ：根據條件用於過濾數據，只輸出符合條件的文檔，若是放在pipeline前面，根據條件過濾數據，傳輸到下一個階段管道，能夠提升後續的數據處理效率。還能夠放在out以前，對結果進行再一次過濾。數組

$redact ：字段所處的document結構的級別.性能

$limit ：用來限制MongoDB聚合管道返回的文檔數spa

$skip :在聚合管道中跳過指定數量的文檔，並返回餘下的文檔。.net

$unwind :將文檔中的某一個數組類型字段拆分紅多條，每條包含數組中的一個值。指針

$sample :隨機選擇從其輸入指定數量的文檔。若是是大於或等於5%的collection的文檔，$sample進行收集掃描，進行排序，而後選擇頂部的文件。所以，$sample 在收集階段是受排序的內存限制。blog

語法： { $sample: { size: <positive integer> } }

$sort :將輸入文檔排序後輸出。

$geoNear:用於地理位置數據分析。

$out :必須爲pipeline最後一個階段管道，由於是將最後計算結果寫入到指定的collection中。

$indexStats :返回數據集合的每一個索引的使用狀況。

語法：{ $indexStats: { } }

$group : 將集合中的文檔分組，可用於統計結果，$group首先將數據根據key進行分組。

咱們能夠經過Aggregation pipeline一些操做跟sql用法同樣，咱們能很清晰的怎麼去使用

pipeline sql

$project select

$match where

$match(先$group階段管道而後在$match對統計的結果進行再一次的過濾) having

$sort order by

$limit limit

2、基礎的操做

【pipeline簡單例子】

數據：

[sql] view plain copy

db.items.insert( [
{
"quantity" : 2,
"price" : 5.0,
"pnumber" : "p003",
},{
"quantity" : 2,
"price" : 8.0,
"pnumber" : "p002"
},{
"quantity" : 1,
"price" : 4.0,
"pnumber" : "p002"
},{
"quantity" : 2,
"price" : 4.0,
"pnumber" : "p001"
},{
"quantity" : 4,
"price" : 10.0,
"pnumber" : "p003"
},{
"quantity" : 10,
"price" : 20.0,
"pnumber" : "p001"
},{
"quantity" : 10,
"price" : 20.0,
"pnumber" : "p003"
},{
"quantity" : 5,
"price" : 10.0,
"pnumber" : "p002"
}
])

【 $project】

一、咱們對上面的統計結果，只顯示count,不想_id ,能夠經過$project來操做，至關SQL的select 顯示咱們想要的字段：

[sql] view plain copy

> db.items.aggregate([{$group:{_id:null,count:{$sum:1}}},{$project:{"_id":0,"count":1}}])
{ "count" : 8 }

【 $match】

一、咱們想經過濾訂單中，想知道賣出的數量大於8的產品有哪些產品，至關於SQL:select sum(quantity) as total from items group by pnumber having total>8

[sql] view plain copy

> db.items.aggregate([{$group:{_id:"$pnumber",total:{$sum:"$quantity"}}},{$match:{total:{$gt:8}}}])
{ "_id" : "p001", "total" : 12 }
{ "_id" : "p003", "total" : 16 }

二、若是是放在$group以前就是當作where來使用，咱們只統計pnumber =p001 產品賣出了多少個 select sum(quantity) as total from items where pnumber='p001'

[sql] view plain copy

> db.items.aggregate([{$match:{"pnumber":"p001"}},{$group:{_id:null,total:{$sum:"$quantity"}}}])
{ "_id" : null, "total" : 12 }

【$skip、 $limit、$sort】

db.items.aggregate([{ $skip: 2 },{ $limit: 4 }]) 與db.items.aggregate([{ $limit: 4 },{ $skip: 2 }]) 這樣結果是不同的

[sql] view plain copy

> db.items.aggregate([{ $skip: 2 },{ $limit: 4 }])
{ "_id" : ObjectId("574d937cfafef57ee4427ac4"), "quantity" : 1, "price" : 4, "pnumber" : "p002" }
{ "_id" : ObjectId("574d937cfafef57ee4427ac5"), "quantity" : 2, "price" : 4, "pnumber" : "p001" }
{ "_id" : ObjectId("574d937cfafef57ee4427ac6"), "quantity" : 4, "price" : 10, "pnumber" : "p003" }
{ "_id" : ObjectId("574d937cfafef57ee4427ac7"), "quantity" : 10, "price" : 20, "pnumber" : "p001" }
> db.items.aggregate([{ $limit: 4 },{ $skip: 2 }])
{ "_id" : ObjectId("574d937cfafef57ee4427ac4"), "quantity" : 1, "price" : 4, "pnumber" : "p002" }
{ "_id" : ObjectId("574d937cfafef57ee4427ac5"), "quantity" : 2, "price" : 4, "pnumber" : "p001" }

$limit、$skip、$sort、$match可使用在階段管道，若是使用在$group以前能夠過濾掉一些數據，提升性能。

【$unwind】

將文檔中的某一個數組類型字段拆分紅多條，每條包含數組中的一個值。

[sql] view plain copy

> db.items.aggregate([{$group:{_id:"$pnumber",quantitys:{$push:"$quantity"}}}])
{ "_id" : "p001", "quantitys" : [ 2, 10 ] }
{ "_id" : "p002", "quantitys" : [ 2, 1, 5 ] }
{ "_id" : "p003", "quantitys" : [ 2, 4, 10 ] }
> db.items.aggregate([{$group:{_id:"$pnumber",quantitys:{$push:"$quantity"}}},{$unwind:"$quantitys"}])
{ "_id" : "p001", "quantitys" : 2 }
{ "_id" : "p001", "quantitys" : 10 }
{ "_id" : "p002", "quantitys" : 2 }
{ "_id" : "p002", "quantitys" : 1 }
{ "_id" : "p002", "quantitys" : 5 }
{ "_id" : "p003", "quantitys" : 2 }
{ "_id" : "p003", "quantitys" : 4 }
{ "_id" : "p003", "quantitys" : 10 }

【$out】

必須爲pipeline最後一個階段管道，由於是將最後計算結果寫入到指定的collection中。

[sql] view plain copy

{ "_id" : "p001", "quantitys" : 2 }
{ "_id" : "p001", "quantitys" : 10 }
{ "_id" : "p002", "quantitys" : 2 }
{ "_id" : "p002", "quantitys" : 1 }
{ "_id" : "p002", "quantitys" : 5 }
{ "_id" : "p003", "quantitys" : 2 }
{ "_id" : "p003", "quantitys" : 4 }
{ "_id" : "p003", "quantitys" : 10 }

把結果放在指定的集合中result

[sql] view plain copy

> db.items.aggregate([{$group:{_id:"$pnumber",quantitys:{$push:"$quantity"}}},{$unwind:"$quantitys"},{$project:{"_id":0,"quantitys":1}},{$out:"result"}])
> db.result.find()
{ "_id" : ObjectId("57529143746e15e8aa207a29"), "quantitys" : 2 }
{ "_id" : ObjectId("57529143746e15e8aa207a2a"), "quantitys" : 10 }
{ "_id" : ObjectId("57529143746e15e8aa207a2b"), "quantitys" : 2 }
{ "_id" : ObjectId("57529143746e15e8aa207a2c"), "quantitys" : 1 }
{ "_id" : ObjectId("57529143746e15e8aa207a2d"), "quantitys" : 5 }
{ "_id" : ObjectId("57529143746e15e8aa207a2e"), "quantitys" : 2 }
{ "_id" : ObjectId("57529143746e15e8aa207a2f"), "quantitys" : 4 }
{ "_id" : ObjectId("57529143746e15e8aa207a30"), "quantitys" : 10 }

【$redact】

語法：{ $redact: <expression> }

$redact 跟$cond結合使用，並在$cond裏面使用了if 、then、else表達式，if-else缺一不可，$redact還有三個重要的參數：

1）$$DESCEND：返回包含當前document級別的全部字段，而且會繼續判字段包含內嵌文檔，內嵌文檔的字段也會去判斷是否符合條件。

2）$$PRUNE：返回不包含當前文檔或者內嵌文檔級別的全部字段，不會繼續檢測此級別的其餘字段，即便這些字段的內嵌文檔持有相同的訪問級別。

3）$$KEEP：返回包含當前文檔或內嵌文檔級別的全部字段，再也不繼續檢測此級別的其餘字段，即便這些字段的內嵌文檔中持有不一樣的訪問級別。

一、 level=1則值爲爲$$DESCEND，不然爲$$PRUNE，從頂部開始掃描下去，執行$$DESCEND包含當前document級別的全部fields。當前級別字段的內嵌文檔將會被繼續檢測。

[sql] view plain copy

db.redact.insert(
{
_id: 1,
level: 1,
status: "A",
acct_id: "xyz123",
cc: [{
level: 1,
type: "yy",
num: 000000000000,
exp_date: ISODate("2015-11-01T00:00:00.000Z"),
billing_addr: {
level: 5,
addr1: "123 ABC Street",
city: "Some City"
}
},{
level: 3,
type: "yy",
num: 000000000000,
exp_date: ISODate("2015-11-01T00:00:00.000Z"),
billing_addr: {
level: 1,
addr1: "123 ABC Street",
city: "Some City"
}
}]
})
db.redact.aggregate(
[
{ $match: { status: "A" } },
{
$redact: {
$cond: {
if: { $eq: [ "$level", 1] },
then: "$$DESCEND",
else: "$$PRUNE"
}
}
}
]
);
{
"_id" : 1,
"level" : 1,
"status" : "A",
"acct_id" : "xyz123",
"cc" : [
{ "level" : 1,
"type" : "yy",
"num" : 0,
"exp_date" : ISODate("2015-11-01T00:00:00Z")
}
]
}

二、$$PRUNE：不包含當前文檔或者內嵌文檔級別的全部字段，不會繼續檢測此級別的其餘字段，即便這些字段的內嵌文檔持有相同的訪問級別。連等級的字段都不顯示，也不會去掃描等級字段包含下級。

[sql] view plain copy

db.redact.insert(
{
_id: 1,
level: 1,
status: "A",
acct_id: "xyz123",
cc: {
level: 3,
type: "yy",
num: 000000000000,
exp_date: ISODate("2015-11-01T00:00:00.000Z"),
billing_addr: {
level: 1,
addr1: "123 ABC Street",
city: "Some City"
}
}
}
)
db.redact.aggregate(
[
{ $match: { status: "A" } },
{
$redact: {
$cond: {
if: { $eq: [ "$level", 3] },
then: "$$PRUNE",
else: "$$DESCEND"
}
}
}
]
);
{ "_id" : 1, "level" : 1, "status" : "A", "acct_id" : "xyz123" }

三、$$KEEP：返回包含當前文檔或內嵌文檔級別的全部字段，再也不繼續檢測此級別的其餘字段，即便這些字段的內嵌文檔中持有不一樣的訪問級別。

[sql] view plain copy

db.redact.insert(
{
_id: 1,
level: 1,
status: "A",
acct_id: "xyz123",
cc: {
level: 2,
type: "yy",
num: 000000000000,
exp_date: ISODate("2015-11-01T00:00:00.000Z"),
billing_addr: {
level:3,
addr1: "123 ABC Street",
city: "Some City"
}
}
}
)
db.redact.aggregate(
[
{ $match: { status: "A" } },
{
$redact: {
$cond: {
if: { $eq: [ "$level", 1] },
then: "$$KEEP",
else: "$$PRUNE"
}
}
}
]
);
{ "_id" : 1, "level" : 1, "status" : "A", "acct_id" : "xyz123", "cc" : { "level" : 2, "type" : "yy", "num" : 0, "exp_date" : ISODate("2015-11-01T00:00:00Z"), "billing_addr" : { "level" : 3, "addr1" : "123 ABC Street", "city" : "Some City" } } }

【 options參數】

explain:返回指定aggregate各個階段管道的執行計劃信息。

他操做返回一個遊標，包含aggregate執行計劃詳細信息。

[sql] view plain copy

db.items.aggregate([{$group:{_id:"$pnumber",total:{$sum:"$quantity"}}},{$group:{_id:null,max:{$max:"$total"}}}],{explain:true})
"stages" : [
{
"$cursor" : {
"query" : {
},
"fields" : {
"pnumber" : 1,
"quantity" : 1,
"_id" : 0
},
"plan" : {
"cursor" : "BasicCursor",
"isMultiKey" : false,
"scanAndOrder" : false,
"allPlans" : [
{
"cursor" : "BasicCursor"
"isMultiKey" : false,
"scanAndOrder" : false
}
]
}
}
},
{
"$group" : {
"_id" : "$pnumber",
"total" : {
"$sum" : "$quantity"
}
}
},
{
"$group" : {
"_id" : {
"$const" : null
},
"max" : {
"$max" : "$total"
}
}
}
],
"ok" : 1