聚合

在講解聚合管道(Aggregation Pipeline)以前,咱們先介紹一下 MongoDB 的聚合功能,聚合操做主要用於對數據的批量處理,每每將記錄按條件分組之後,而後再進行一系列操做,例如,求最大值、最小值、平均值,求和等操做。聚合操做還可以對記錄進行復雜的操做,主要用於數理統計和數據挖掘。在 MongoDB 中,聚合操做的輸入是集合中的文檔,輸出能夠是一個文檔,也能夠是多條文檔。sql

MongoDB 提供了很是強大的聚合操做,有三種方式:mongodb

  • 聚合管道(Aggregation Pipeline)
  • 單目的聚合操做(Single Purpose Aggregation Operation)
  • MapReduce 編程模型

在本篇中,重點講解聚合管道和單目的聚合操做,MapReduce 編程模型會在後續的文章中講解。express

一、聚合管道

聚合管道是 MongoDB 2.2版本引入的新功能。它由階段(Stage)組成,文檔在一個階段處理完畢後,聚合管道會把處理結果傳到下一個階段。編程

聚合管道功能:數組

  • 對文檔進行過濾,查詢出符合條件的文檔
  • 對文檔進行變換,改變文檔的輸出形式

每一個階段用階段操做符(Stage Operators)定義,在每一個階段操做符中能夠用表達式操做符(Expression Operators)計算總和、平均值、拼接分割字符串等相關操做,直到每一個階段進行完成,最終返回結果,返回的結果能夠直接輸出,也能夠存儲到集合中。app

MongoDB 中使用 db.COLLECTION_NAME.aggregate([{<stage>},...]) 方法來構建和使用聚合管道。先看下官網給的實例,感覺一下聚合管道的用法。less

實例中,$match 用於獲取 status = "A" 的記錄,而後將符合條件的記錄送到下一階段 $group 中進行分組求和計算,最後返回 Results。其中,$match、$group 都是階段操做符,而階段 $group 中用到的 $sum 是表達式操做符。ide

在下面,咱們經過範例分別對階段操做符和表達式操做符進行詳解。大數據

1.一、階段操做符

使用階段操做符以前,咱們先看一下 article 集合中的文檔列表,也就是範例中用到的數據。優化

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
>db.article.find().pretty()
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7570" )
    "title" "MongoDB Aggregate" ,
    "author" "liruihuan" ,
    "tags" : [ 'Mongodb' 'Database' 'Query' ],
    "pages" : 5,
    "time"  : ISODate( "2017-04-09T11:42:39.736Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7571" )
    "title" "MongoDB Index" ,
    "author" "liruihuan" ,
    "tags" : [ 'Mongodb' 'Index' 'Query' ],
    "pages" : 3,
    "time"  : ISODate( "2017-04-09T11:43:39.236Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7572" )
    "title" "MongoDB Query" ,
    "author" "eryueyang" ,
    "tags" : [ 'Mongodb' 'Query' ],
    "pages" : 8,
    "time"  : ISODate( "2017-04-09T11:44:56.276Z" )
}

1.1.一、$project 

做用

修改文檔的結構,能夠用來重命名、增長或刪除文檔中的字段。

範例1

只返回文檔中 title 和 author 字段

1
2
3
4
>db.article.aggregate([{$project:{_id:0, title:1, author:1 }}])
"title" "MongoDB Aggregate" ,   "author" "liruihuan"  },
"title" "MongoDB Index" ,   "author" "liruihuan"  },
"title" "MongoDB Query" ,   "author" "eryueyang"  }

由於字段 _id 是默認顯示的,這裏必須用 _id:0 把字段_id過濾掉。

範例2

把文檔中 pages 字段的值都增長10。並重命名成 newPages 字段。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
>db.article.aggregate(
    [
       {
           $project:{
                _id:0,
                title:1,
                author:1,
                newPages: {$ add :[ "$Pages" ,10]}
          }
       }
    ]
)
"title" "MongoDB Aggregate" ,   "author" "liruihuan" "newPages" : 15 },
"title" "MongoDB Index" ,   "author" "liruihuan" "newPages" : 13  },
"title" "MongoDB Query" ,   "author" "eryueyang" "newPages" : 18  }

其中,$add 是 加 的意思,是算術類型表達式操做符,具體表達式操做符,下面會講到。

1.1.二、$match

做用

用於過濾文檔。用法相似於 find() 方法中的參數。

範例

查詢出文檔中 pages 字段的值大於等於5的數據。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
>db.article.aggregate(
     [
          {
               $match: { "pages" : {$gte: 5}}
          }
     ]
    ).pretty()
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7570" )
    "title" "MongoDB Aggregate" ,
    "author" "liruihuan" ,
    "tags" : [ 'Mongodb' 'Database' 'Query' ],
    "pages" : 5,
    "time"  : ISODate( "2017-04-09T11:42:39.736Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7572" )
    "title" "MongoDB Query" ,
    "author" "eryueyang" ,
    "tags" : [ 'Mongodb' 'Query' ],
    "pages" : 8,
    "time"  : ISODate( "2017-04-09T11:44:56.276Z" )
}

注:

  • 在 $match 中不能使用 $where 表達式操做符
  • 若是 $match 位於管道的第一個階段,能夠利用索引來提升查詢效率
  • $match 中使用 $text 操做符的話,只能位於管道的第一階段
  • $match 儘可能出如今管道的最前面,過濾出須要的數據,在後續的階段中能夠提升效率。

1.1.三、$group

做用

將集合中的文檔進行分組,可用於統計結果。

範例

從 article 中獲得每一個 author 的文章數,並輸入 author 和對應的文章數。

1
2
3
4
5
6
7
8
9
>db.article.aggregate(
     [
          {
               $ group : {_id:  "$author" , total: {$ sum : 1}}
          }
     ]
    )
{ "_id"  "eryueyang" "total"  : 1}
{ "_id"  "liruihuan" "total"  : 2}

1.1.四、$sort

做用

將集合中的文檔進行排序。

範例

讓集合 article 以 pages 升序排列

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
>db.article.aggregate([{$sort: { "pages" : 1}}]).pretty()
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7571" )
    "title" "MongoDB Index" ,
    "author" "liruihuan" ,
    "tags" : [ 'Mongodb' 'Index' 'Query' ],
    "pages" : 3,
    "time"  : ISODate( "2017-04-09T11:43:39.236Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7570" )
    "title" "MongoDB Aggregate" ,
    "author" "liruihuan" ,
    "tags" : [ 'Mongodb' 'Database' 'Query' ],
    "pages" : 5,
    "time"  : ISODate( "2017-04-09T11:42:39.736Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7572" )
    "title" "MongoDB Query" ,
    "author" "eryueyang" ,
    "tags" : [ 'Mongodb' 'Query' ],
    "pages" : 8,
    "time"  : ISODate( "2017-04-09T11:44:56.276Z" )
}

  若是以降序排列,則設置成 "pages": -1

1.1.五、$limit

做用

限制返回的文檔數量

範例

返回集合 article 中前兩條文檔

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
>db.article.aggregate([{$limit: 2}]).pretty()
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7570" )
    "title" "MongoDB Aggregate" ,
    "author" "liruihuan" ,
    "tags" : [ 'Mongodb' 'Database' 'Query' ],
    "pages" : 5,
    "time"  : ISODate( "2017-04-09T11:42:39.736Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7571" )
    "title" "MongoDB Index" ,
    "author" "liruihuan" ,
    "tags" : [ 'Mongodb' 'Index' 'Query' ],
    "pages" : 3,
    "time"  : ISODate( "2017-04-09T11:43:39.236Z" )
}

1.1.六、$skip

做用

跳過指定數量的文檔,並返回餘下的文檔。

範例

跳過集合 article 中一條文檔,輸出剩下的文檔

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
>db.article.aggregate([{$skip: 1}]).pretty()
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7571" )
    "title" "MongoDB Index" ,
    "author" "liruihuan" ,
    "tags" : [ 'Mongodb' 'Index' 'Query' ],
    "pages" : 3,
    "time"  : ISODate( "2017-04-09T11:43:39.236Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7572" )
    "title" "MongoDB Query" ,
    "author" "eryueyang" ,
    "tags" : [ 'Mongodb' 'Query' ],
    "pages" : 8,
    "time"  : ISODate( "2017-04-09T11:44:56.276Z" )
}

1.1.七、$unwind

做用

將文檔中數組類型的字段拆分紅多條,每條文檔包含數組中的一個值。

範例

把集合 article 中 title="MongoDB Aggregate" 的 tags 字段拆分

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
>db.article.aggregate(
     [
          {
               $match: { "title" "MongoDB Aggregate" }
          },
          {
               $unwind:  "$tags"
          }
     ]
    ).pretty()
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7570" )
    "title" "MongoDB Aggregate" ,
    "author" "liruihuan" ,
    "tags" "Mongodb" ,
    "pages" : 5,
    "time"  : ISODate( "2017-04-09T11:42:39.736Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7570" )
    "title" "MongoDB Aggregate" ,
    "author" "liruihuan" ,
    "tags" "Database" ,
    "pages" : 5,
    "time"  : ISODate( "2017-04-09T11:42:39.736Z" )
},
{
    "_id" : ObjectId( "58e1d2f0bb1bbc3245fa7570" )
    "title" "MongoDB Aggregate" ,
    "author" "liruihuan" ,
    "tags" "Query" ,
    "pages" : 5,
    "time"  : ISODate( "2017-04-09T11:42:39.736Z" )
}

注:

  • $unwind 參數數組字段爲空或不存在時,待處理的文檔將會被忽略,該文檔將不會有任何輸出
  • $unwind 參數不是一個數組類型時,將會拋出異常
  • $unwind 所做的修改,只用於輸出,不能改變原文檔

1.二、表達式操做符

 表達式操做符有不少操做類型,其中最經常使用的有布爾管道聚合操做、集合操做、比較聚合操做、算術聚合操做、字符串聚合操做、數組聚合操做、日期聚合操做、條件聚合操做、數據類型聚合操做等。每種類型都有不少用法,這裏就不一一舉例了。

1.2.一、布爾管道聚合操做(Boolean Aggregation Operators)

名稱 說明
$and Returns true only when all its expressions evaluate to true. Accepts any number of argument expressions.
$or Returns true when any of its expressions evaluates to true. Accepts any number of argument expressions.
$not Returns the boolean value that is the opposite of its argument expression. Accepts a single argument expression.

範例

假若有一個集合 mycol

1
2
3
4
5
"_id"  : 1,  "item"  "abc1" , description:  "product 1" , qty: 300 }
"_id"  : 2,  "item"  "abc2" , description:  "product 2" , qty: 200 }
"_id"  : 3,  "item"  "xyz1" , description:  "product 3" , qty: 250 }
"_id"  : 4,  "item"  "VWZ1" , description:  "product 4" , qty: 300 }
"_id"  : 5,  "item"  "VWZ2" , description:  "product 5" , qty: 180 }

肯定 qty 是否大於250或者小於200

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
db.mycol.aggregate(
    [
      {
        $project:
           {
             item: 1,
             result: { $ or : [ { $gt: [  "$qty" , 250 ] }, { $lt: [  "$qty" , 200 ] } ] }
           }
      }
    ]
)
"_id"  : 1,  "item"  "abc1" "result"  true  }
"_id"  : 2,  "item"  "abc2" "result"  false  }
"_id"  : 3,  "item"  "xyz1" "result"  false  }
"_id"  : 4,  "item"  "VWZ1" "result"  true  }
"_id"  : 5,  "item"  "VWZ2" "result"  true  }

1.2.二、集合操做(Set Operators)

用於集合操做,求集合的並集、交集、差集運算。

名稱 說明
$setEquals Returns true if the input sets have the same distinct elements. Accepts two or more argument expressions.
$setIntersection Returns a set with elements that appear in all of the input sets. Accepts any number of argument expressions.
$setUnion Returns a set with elements that appear in any of the input sets. Accepts any number of argument expressions.
$setDifference Returns a set with elements that appear in the first set but not in the second set; i.e. performs a relative complement of the second set relative to the first. Accepts exactly two argument expressions.
$setIsSubset Returns true if all elements of the first set appear in the second set, including when the first set equals the second set; i.e. not a strict subset. Accepts exactly two argument expressions.
$anyElementTrue Returns true if any elements of a set evaluate to true; otherwise, returns false. Accepts a single argument expression.
$allElementsTrue Returns true if no element of a set evaluates to false, otherwise, returns false. Accepts a single argument expression.

範例

假若有一個集合 mycol

1
2
3
4
5
6
7
8
9
"_id"  : 1,  "A"  : [  "red" "blue"  ],  "B"  : [  "red" "blue"  ] }
"_id"  : 2,  "A"  : [  "red" "blue"  ],  "B"  : [  "blue" "red" "blue"  ] }
"_id"  : 3,  "A"  : [  "red" "blue"  ],  "B"  : [  "red" "blue" "green"  ] }
"_id"  : 4,  "A"  : [  "red" "blue"  ],  "B"  : [  "green" "red"  ] }
"_id"  : 5,  "A"  : [  "red" "blue"  ],  "B"  : [ ] }
"_id"  : 6,  "A"  : [  "red" "blue"  ],  "B"  : [ [  "red"  ], [  "blue"  ] ] }
"_id"  : 7,  "A"  : [  "red" "blue"  ],  "B"  : [ [  "red" "blue"  ] ] }
"_id"  : 8,  "A"  : [ ],  "B"  : [ ] }
"_id"  : 9,  "A"  : [ ],  "B"  : [  "red"  ] }

求出集合 mycol 中 A 和 B 的交集

1
2
3
4
5
6
7
8
9
10
11
12
13
14
db.mycol.aggregate(
    [
      { $project: { A:1, B: 1, allValues: { $setUnion: [  "$A" "$B"  ] }, _id: 0 } }
    ]
)
"A" : [  "red" "blue"  ],  "B" : [  "red" "blue"  ],  "allValues" : [  "blue" "red"  ] }
"A" : [  "red" "blue"  ],  "B" : [  "blue" "red" "blue"  ],  "allValues" : [  "blue" "red"  ] }
"A" : [  "red" "blue"  ],  "B" : [  "red" "blue" "green"  ],  "allValues" : [  "blue" "red" "green"  ] }
"A" : [  "red" "blue"  ],  "B" : [  "green" "red"  ],  "allValues" : [  "blue" "red" "green"  ] }
"A" : [  "red" "blue"  ],  "B" : [ ],  "allValues" : [  "blue" "red"  ] }
"A" : [  "red" "blue"  ],  "B" : [ [  "red"  ], [  "blue"  ] ],  "allValues" : [  "blue" "red" , [  "red"  ], [  "blue"  ] ] }
"A" : [  "red" "blue"  ],  "B" : [ [  "red" "blue"  ] ],  "allValues" : [  "blue" "red" , [  "red" "blue"  ] ] }
"A" : [ ],  "B" : [ ],  "allValues" : [ ] }
"A" : [ ],  "B" : [  "red"  ],  "allValues" : [  "red"  ] }

1.2.三、比較聚合操做(Comparison Aggregation Operators)

名稱 說明
$cmp Returns: 0 if the two values are equivalent, 1 if the first value is greater than the second, and -1 if the first value is less than the second.
$eq Returns true if the values are equivalent.
$gt Returns true if the first value is greater than the second.
$gte Returns true if the first value is greater than or equal to the second.
$lt Returns true if the first value is less than the second.
$lte Returns true if the first value is less than or equal to the second.
$ne Returns true if the values are not equivalent.

這裏就不舉例了,以前的例子有用到過。

1.2.四、算術聚合操做(Arithmetic Aggregation Operators)

名稱 說明
$abs Returns the absolute value of a number.
$add Adds numbers to return the sum, or adds numbers and a date to return a new date. If adding numbers and a date, treats the numbers as milliseconds. Accepts any number of argument expressions, but at most, one expression can resolve to a date.
$ceil Returns the smallest integer greater than or equal to the specified number.
$divide Returns the result of dividing the first number by the second. Accepts two argument expressions.
$exp Raises e to the specified exponent.
$floor Returns the largest integer less than or equal to the specified number.
$ln Calculates the natural log of a number.
$log Calculates the log of a number in the specified base.
$log10 Calculates the log base 10 of a number.
$mod Returns the remainder of the first number divided by the second. Accepts two argument expressions.
$multiply Multiplies numbers to return the product. Accepts any number of argument expressions.
$pow Raises a number to the specified exponent.
$sqrt Calculates the square root.
$subtract Returns the result of subtracting the second value from the first. If the two values are numbers, return the difference. If the two values are dates, return the difference in milliseconds. If the two values are a date and a number in milliseconds, return the resulting date. Accepts two argument expressions. If the two values are a date and a number, specify the date argument first as it is not meaningful to subtract a date from a number.
$trunc Truncates a number to its integer.

範例

假若有一個集合 mycol

1
2
3
4
{ _id: 1, start: 5,  end : 8 }
{ _id: 2, start: 4,  end : 4 }
{ _id: 3, start: 9,  end : 7 }
{ _id: 4, start: 6,  end : 7 }

求集合 mycol 中 start 減去 end 的絕對值

1
2
3
4
5
6
7
8
9
db.mycol.aggregate([
    {
      $project: { delta: { $ abs : { $subtract: [  "$start" "$end"  ] } } }
    }
])
"_id"  : 1,  "delta"  : 3 }
"_id"  : 2,  "delta"  : 0 }
"_id"  : 3,  "delta"  : 2 }
"_id"  : 4,  "delta"  : 1 }

1.2.五、字符串聚合操做(String Aggregation Operators)

名稱 說明
$concat Concatenates any number of strings.
$indexOfBytes Searches a string for an occurence of a substring and returns the UTF-8 byte index of the first occurence. If the substring is not found, returns -1.
$indexOfCP Searches a string for an occurence of a substring and returns the UTF-8 code point index of the first occurence. If the substring is not found, returns -1.
$split Splits a string into substrings based on a delimiter. Returns an array of substrings. If the delimiter is not found within the string, returns an array containing the original string.
$strLenBytes Returns the number of UTF-8 encoded bytes in a string.
$strLenCP Returns the number of UTF-8 code points in a string.
$strcasecmp Performs case-insensitive string comparison and returns: 0 if two strings are equivalent, 1 if the first string is greater than the second, and -1 if the first string is less than the second.
$substr Deprecated. Use $substrBytes or $substrCP.
$substrBytes Returns the substring of a string. Starts with the character at the specified UTF-8 byte index (zero-based) in the string and continues for the specified number of bytes.
$substrCP Returns the substring of a string. Starts with the character at the specified UTF-8 code point (CP) index (zero-based) in the string and continues for the number of code points specified.
$toLower Converts a string to lowercase. Accepts a single argument expression.
$toUpper Converts a string to uppercase. Accepts a single argument expression.

範例

假若有一個集合 mycol

1
2
3
4
5
6
7
"_id"  : 1,  "city"  "Berkeley, CA" "qty"  : 648 }
"_id"  : 2,  "city"  "Bend, OR" "qty"  : 491 }
"_id"  : 3,  "city"  "Kensington, CA" "qty"  : 233 }
"_id"  : 4,  "city"  "Eugene, OR" "qty"  : 842 }
"_id"  : 5,  "city"  "Reno, NV" "qty"  : 655 }
"_id"  : 6,  "city"  "Portland, OR" "qty"  : 408 }
"_id"  : 7,  "city"  "Sacramento, CA" "qty"  : 574 }

以 ',' 分割集合 mycol 中字符串city的值,用 $unwind 拆分紅多個文檔,匹配出城市名稱只有兩個字母的城市,並求和各個城市中 qty 的值,最後以降序排序。

1
2
3
4
5
6
7
8
9
10
db.mycol.aggregate([
   { $project : { city_state : { $split: [ "$city" ", " ] }, qty : 1 } },
   { $unwind :  "$city_state"  },
   { $match : { city_state : /[A-Z]{2}/ } },
   { $ group  : { _id: {  "state"  "$city_state"  }, total_qty : {  "$sum"  "$qty"  } } },
   { $sort : { total_qty : -1 } }
])
"_id"  : {  "state"  "OR"  },  "total_qty"  : 1741 }
"_id"  : {  "state"  "CA"  },  "total_qty"  : 1455 }
"_id"  : {  "state"  "NV"  },  "total_qty"  : 655 }

1.2.六、數組聚合操做(Array Aggregation Operators)

 

名稱 說明
$arrayElemAt Returns the element at the specified array index.
$concatArrays Concatenates arrays to return the concatenated array.
$filter Selects a subset of the array to return an array with only the elements that match the filter condition.
$indexOfArray Searches an array for an occurence of a specified value and returns the array index of the first occurence. If the substring is not found, returns -1.
$isArray Determines if the operand is an array. Returns a boolean.
$range Outputs an array containing a sequence of integers according to user-defined inputs.
$reverseArray Returns an array with the elements in reverse order.
$reduce Applies an expression to each element in an array and combines them into a single value.
$size Returns the number of elements in the array. Accepts a single expression as argument.
$slice Returns a subset of an array.
$zip Merge two lists together.
$in Returns a boolean indicating whether a specified value is in an array.

範例

假若有一個集合 mycol

1
2
3
4
"_id"  : 1,  "name"  "dave123" , favorites: [  "chocolate" "cake" "butter" "apples"  ] }
"_id"  : 2,  "name"  "li" , favorites: [  "apples" "pudding" "pie"  ] }
"_id"  : 3,  "name"  "ahn" , favorites: [  "pears" "pecans" "chocolate" "cherries"  ] }
"_id"  : 4,  "name"  "ty" , favorites: [  "ice cream"  ] }

求出集合 mycol 中 favorites 的第一項和最後一項

1
2
3
4
5
6
7
8
9
10
11
12
13
14
db.mycol.aggregate([
    {
      $project:
       {
          name : 1,
          first : { $arrayElemAt: [  "$favorites" , 0 ] },
          last : { $arrayElemAt: [  "$favorites" , -1 ] }
       }
    }
])
"_id"  : 1,  "name"  "dave123" "first"  "chocolate" "last"  "apples"  }
"_id"  : 2,  "name"  "li" "first"  "apples" "last"  "pie"  }
"_id"  : 3,  "name"  "ahn" "first"  "pears" "last"  "cherries"  }
"_id"  : 4,  "name"  "ty" "first"  "ice cream" "last"  "ice cream"  }

1.2.七、日期聚合操做(Date Aggregation Operators)

名稱 說明
$dayOfYear Returns the day of the year for a date as a number between 1 and 366 (leap year).
$dayOfMonth Returns the day of the month for a date as a number between 1 and 31.
$dayOfWeek Returns the day of the week for a date as a number between 1 (Sunday) and 7 (Saturday).
$year Returns the year for a date as a number (e.g. 2014).
$month Returns the month for a date as a number between 1 (January) and 12 (December).
$week Returns the week number for a date as a number between 0 (the partial week that precedes the first Sunday of the year) and 53 (leap year).
$hour Returns the hour for a date as a number between 0 and 23.
$minute Returns the minute for a date as a number between 0 and 59.
$second Returns the seconds for a date as a number between 0 and 60 (leap seconds).
$millisecond Returns the milliseconds of a date as a number between 0 and 999.
$dateToString Returns the date as a formatted string.
$isoDayOfWeek Returns the weekday number in ISO 8601 format, ranging from 1 (for Monday) to 7 (for Sunday).
$isoWeek Returns the week number in ISO 8601 format, ranging from 1 to 53. Week numbers start at 1 with the week (Monday through Sunday) that contains the year’s first Thursday.
$isoWeekYear Returns the year number in ISO 8601 format. The year starts with the Monday of week 1 (ISO 8601) and ends with the Sunday of the last week (ISO 8601).

範例

假若有一個集合 mycol

1
"_id"  : 1,  "item"  "abc" "price"  : 10,  "quantity"  : 2,  "date"  : ISODate( "2017-01-01T08:15:39.736Z" ) }

獲得集合 mycol 中 date 字段的相關日期值

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
db.mycol.aggregate(
    [
      {
        $project:
          {
            year : { $ year "$date"  },
            month : { $ month "$date"  },
            day : { $dayOfMonth:  "$date"  },
            hour : { $ hour "$date"  },
            minutes: { $ minute "$date"  },
            seconds: { $ second "$date"  },
            milliseconds: { $millisecond:  "$date"  },
            dayOfYear: { $dayOfYear:  "$date"  },
            dayOfWeek: { $dayOfWeek:  "$date"  },
            week: { $week:  "$date"  }
          }
      }
    ]
)
{
   "_id"  : 1,
   "year"  : 2017,
   "month"  : 1,
   "day"  : 1,
   "hour"  : 8,
   "minutes"  : 15,
   "seconds"  : 39,
   "milliseconds"  : 736,
   "dayOfYear"  : 1,
   "dayOfWeek"  : 1,
   "week"  : 0
}

1.2.八、條件聚合操做(Conditional Aggregation Operators)

名稱 說明
$cond A ternary operator that evaluates one expression, and depending on the result, returns the value of one of the other two expressions. Accepts either three expressions in an ordered list or three named parameters.
$ifNull Returns either the non-null result of the first expression or the result of the second expression if the first expression results in a null result. Null result encompasses instances of undefined values or missing fields. Accepts two expressions as arguments. The result of the second expression can be null.
$switch Evaluates a series of case expressions. When it finds an expression which evaluates to true$switch executes a specified expression and breaks out of the control flow.

範例

假若有一個集合 mycol

1
2
3
"_id"  : 1,  "item"  "abc1" , qty: 300 }
"_id"  : 2,  "item"  "abc2" , qty: 200 }
"_id"  : 3,  "item"  "xyz1" , qty: 250 }

若是集合 mycol 中 qty 字段值大於等於250,則返回30,不然返回20

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
db.mycol.aggregate(
    [
       {
          $project:
            {
              item: 1,
              discount:
                {
                  $cond: { if: { $gte: [  "$qty" , 250 ] },  then : 30,  else : 20 }
                }
            }
       }
    ]
)
"_id"  : 1,  "item"  "abc1" "discount"  : 30 }
"_id"  : 2,  "item"  "abc2" "discount"  : 20 }
"_id"  : 3,  "item"  "xyz1" "discount"  : 30 }

  

1.2.九、數據類型聚合操做(Data Type Aggregation Operators)

名稱 說明
$type Return the BSON data type of the field.

範例

假若有一個集合 mycol

1
2
3
4
5
6
{ _id: 0, a : 8 }
{ _id: 1, a : [ 41.63, 88.19 ] }
{ _id: 2, a : { a :  "apple" , b :  "banana" , c:  "carrot"  } }
{ _id: 3, a :   "caribou"  }
{ _id: 4, a : NumberLong(71) }
{ _id: 5 }

獲取文檔中 a 字段的數據類型

1
2
3
4
5
6
7
8
9
10
11
db.mycol.aggregate([{
     $project: {
        a : { $type:  "$a"  }
     }
}])
{ _id: 0,  "a"  "double"  }
{ _id: 1,  "a"  "array"  }
{ _id: 2,  "a"  "object"  }
{ _id: 3,  "a"  "string"  }
{ _id: 4,  "a"  "long"  }
{ _id: 5,  "a"  "missing"  }

1.三、聚合管道優化

默認狀況下,整個集合做爲聚合管道的輸入,爲了提升處理數據的效率,可使用一下策略:

  • 將 $match 和 $sort 放到管道的前面,能夠給集合創建索引,來提升處理數據的效率。
  • 能夠用 $match、$limit、$skip 對文檔進行提早過濾,以減小後續處理文檔的數量。

當聚合管道執行命令時,MongoDB 也會對各個階段自動進行優化,主要包括如下幾個狀況:

  1. $sort + $match 順序優化

若是 $match 出如今 $sort 以後,優化器會自動把 $match 放到 $sort 前面

2. $skip + $limit 順序優化

若是 $skip 在 $limit 以後,優化器會把 $limit 移動到 $skip 的前面,移動後 $limit的值等於原來的值加上 $skip 的值。

例如:移動前:{$skip: 10, $limit: 5},移動後:{$limit: 15, $skip: 10}

1.四、聚合管道使用限制

對聚合管道的限制主要是對 返回結果大小 和 內存 的限制。

返回結果大小

聚合結果返回的是一個文檔,不能超過 16M,從 MongoDB 2.6版本之後,返回的結果能夠是一個遊標或者存儲到集合中,返回的結果不受 16M 的限制。

內存

聚合管道的每一個階段最多隻能用 100M 的內存,若是超過100M,會報錯,若是須要處理大數據,可使用 allowDiskUse 選項,存儲到磁盤上。

二、單目的聚合操做

單目的聚合命令,經常使用的:count()、distinct(),與聚合管道相比,單目的聚合操做更簡單,使用很是頻繁。先經過 distinct() 看一下工做流程

distinct() 的做用是去重。而 count() 是求文檔的個數。

下面用 count() 方法舉例說明一下

範例

求出集合 article 中 time 值大於 2017-04-09 的文檔個數

1
2
>db.article. count ( {  time : { $gt: new  Date ( '04/09/2017' ) } } )
3

這個語句等價於

1
db.article.find( {  time : { $gt: new  Date ( '04/09/2017' ) } } ). count ()
相關文章
相關標籤/搜索