MongoDB: aggregate and aggregateCursor

Environment
mongos 3.0.14

aggregate

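aggregate supports fairly complex data aggregation, for example counting (count), counting distinct values (distinct count), and grouped statistics (group ... having).

aggregate returns its result as an array, so the result must not exceed 16 MB (the BSON document size limit).

For example: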

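$pipeline = [
    ['$match' => $tmpCondition],
    // Stage 1: one row per distinct user_id.
    ['$group' => [
        '_id' => ['user_id' => '$user_id']
    ]],
    // Stage 2: a constant _id puts every row into a single group,
    // so 'number' ends up as the count of distinct user_id values.
    ['$group' => [
        '_id'    => null,
        'number' => ['$sum' => 1]
    ]]
];
$options = [
    'allowDiskUse' => true,               // stages may write temporary data to disk
    'cursor'       => ['batchSize' => 1]  // initial batch size of the reply
];
$data = MongoSvc::get('user')->user_info->aggregate($pipeline, $options);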
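
How the rows are read out of $data depends on the options that were passed. The sketch below assumes that MongoSvc hands back the server's raw command reply unchanged (an assumption, since the wrapper is not shown here): a plain call carries the documents under 'result', while a call with the 'cursor' option carries them under 'cursor.firstBatch'. The helper name aggregateRows is hypothetical.

```php
<?php
// Hypothetical helper: pull the result documents out of a raw aggregate reply.
// Assumed reply shapes (raw server response, not altered by any wrapper):
//   ['result' => [...], 'ok' => 1]
//   ['cursor' => ['id' => ..., 'firstBatch' => [...]], 'ok' => 1]
function aggregateRows(array $reply): array
{
    if (empty($reply['ok'])) {
        $msg = isset($reply['errmsg']) ? $reply['errmsg'] : 'unknown error';
        throw new RuntimeException('aggregate failed: ' . $msg);
    }
    if (isset($reply['cursor']['firstBatch'])) {
        // Cursor-style reply: only the first batch is contained in it.
        return $reply['cursor']['firstBatch'];
    }
    return isset($reply['result']) ? $reply['result'] : [];
}
```

A batchSize of 1 is enough for the distinct count above, because the final $group emits at most one row. For pipelines that return many rows, a cursor-style reply only holds the first batch.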

aggregateCursor

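For aggregations that return a large number of results, use aggregateCursor to get a cursor back instead; this avoids exceeding the size limit.

aggregateCursor returns a cursor, which you loop over to fetch the rows.

For example: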

$pipeline = [
    ['$match' => $matchArr],
    ['$project' => ['id' => 1, '_id' => 0]],
    ['$group' => [
        '_id'   => '$id',
        'count' => ['$sum' => 1]
    ]],
    // Keep only ids that occur more than once.
    ['$match' => [
        'count' => ['$gt' => 1]
    ]]
];
// Use aggregateCursor here and fetch the rows by looping over the cursor.
$data = MongoSvc::get('user')->user_info->aggregateCursor($pipeline);
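
The loop over the cursor might look like the sketch below. It assumes rows shaped like the output of the pipeline above ('_id' holding the duplicated id, 'count' holding how often it occurs). The function name collectDuplicates and the sample rows are made up for illustration, and an ArrayIterator stands in for the real cursor so the snippet runs without a server.

```php
<?php
// Hypothetical helper: walk any Traversable of rows (a command cursor is
// iterated the same way) and build an id => count map.
function collectDuplicates(Traversable $cursor): array
{
    $duplicates = [];
    foreach ($cursor as $row) {
        // Each row looks like ['_id' => <id value>, 'count' => <occurrences>].
        $duplicates[(string) $row['_id']] = (int) $row['count'];
    }
    return $duplicates;
}

// Stand-in for the cursor returned by aggregateCursor($pipeline).
$fakeCursor = new ArrayIterator([
    ['_id' => 'a1', 'count' => 2],
    ['_id' => 'b7', 'count' => 5],
]);
$duplicates = collectDuplicates($fakeCursor);
```

If the number of duplicated ids is itself very large, processing each row inside the loop instead of collecting everything into one array keeps memory use flat.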

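pipeline parameters

  • $match
    Filters documents by a condition.
  • $addFields
    Adds new fields (MongoDB 3.4 and later).
  • $count
    Returns the number of documents that reach this stage (MongoDB 3.4 and later).
  • $group
    Groups documents.
  • $limit
    Limits the number of documents.
  • $skip
    Skips a number of documents.
  • $sort
    Sorts documents.
  • $out
    Writes the result to a collection.
  • $project
    Selects which fields are passed on.

https://docs.mongodb.com/manu...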
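
Several of these stages are usually chained. As a minimal sketch, the builder below groups by one field, orders the groups by size, and returns a single page of them. The function name buildPagedGroupPipeline and the 'status' field in the usage line are hypothetical; the code only assembles the array and does not talk to a server.

```php
<?php
// Hypothetical builder: group by a field, order by group size, return one page.
// Order of stages: sort first, then skip, then limit, so pages stay stable.
function buildPagedGroupPipeline(array $match, string $field, int $page, int $pageSize): array
{
    if ($page < 1 || $pageSize < 1) {
        throw new InvalidArgumentException('page and pageSize must be >= 1');
    }
    return [
        ['$match' => $match],
        ['$group' => ['_id' => '$' . $field, 'count' => ['$sum' => 1]]],
        // Secondary sort on _id makes the order deterministic for equal counts.
        ['$sort'  => ['count' => -1, '_id' => 1]],
        ['$skip'  => ($page - 1) * $pageSize],
        ['$limit' => $pageSize],
    ];
}

// Page 3 with 20 groups per page skips the first 40 groups.
$pipeline = buildPagedGroupPipeline(['status' => 1], 'user_id', 3, 20);
```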

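options parameters

  • explain boolean
    Returns information about how the pipeline is processed.
  • allowDiskUse boolean
    When true, stages may write temporary data to disk.
  • cursor
    cursor: { batchSize: <int> }
    Sets the initial batch size of the returned result.
  • hint string or document
    Forces a specific index (MongoDB 3.6 and later).

https://docs.mongodb.com/manu...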
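
One way to keep the options array tidy is to build it in a single place. The helper below is only a sketch: the name buildAggregateOptions is hypothetical, it covers just the keys listed above (without hint), and it assumes PHP 7.1 or later for the nullable type.

```php
<?php
// Hypothetical helper: assemble the options array for aggregate().
// Options that are not requested are left out of the array entirely.
function buildAggregateOptions(bool $allowDiskUse = false, ?int $batchSize = null, bool $explain = false): array
{
    $options = [];
    if ($allowDiskUse) {
        $options['allowDiskUse'] = true; // stages may write temporary data to disk
    }
    if ($batchSize !== null) {
        if ($batchSize < 0) {
            throw new InvalidArgumentException('batchSize must not be negative');
        }
        $options['cursor'] = ['batchSize' => $batchSize];
    }
    if ($explain) {
        $options['explain'] = true; // ask for processing details instead of rows
    }
    return $options;
}

$options = buildAggregateOptions(true, 100);
```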

Query examples

Total of one field (here 'money') over all matching documents:

$pipeline = [
    ['$match' => $tmpCondition],
    ['$group' => [
        '_id'       => null, // constant _id: all matching documents form one group
        'sum_value' => ['$sum' => '$money']
    ]]
];

Distinct values of a column:

$pipeline = [
    ['$match' => $tmpCondition],
    ['$group' => [
        '_id' => ['user_id' => '$user_id']
    ]]
];
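
Each row produced by this pipeline wraps the value in '_id', so a small unwrapping step is usually needed. The sketch below assumes rows of the form ['_id' => ['user_id' => ...]]; the helper name distinctValues and the sample rows are hypothetical, and PHP 7.1 or later is assumed for the iterable type.

```php
<?php
// Hypothetical helper: turn rows like ['_id' => ['user_id' => 42]]
// into a flat list of values such as [42, ...].
function distinctValues(iterable $rows, string $field): array
{
    $values = [];
    foreach ($rows as $row) {
        $values[] = $row['_id'][$field];
    }
    return $values;
}

// Sample rows in the shape the $group stage above emits.
$rows = [
    ['_id' => ['user_id' => 42]],
    ['_id' => ['user_id' => 7]],
];
$userIds = distinctValues($rows, 'user_id');
```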

Count of distinct values of a column (here 'user_id'):

$pipeline = [
    ['$match' => $tmpCondition],
    ['$group' => [
        '_id' => ['user_id' => '$user_id']
    ]],
    // Constant _id: all distinct rows fall into one group and are counted.
    ['$group' => [
        '_id'    => null,
        'number' => ['$sum' => 1]
    ]]
];

A related two-step grouping: take the largest 'days' per 'qid', then count how many 'qid' values share each maximum:

$pipeline = [
    ['$match' => $tmpCondition],
    ['$group' => [
        '_id'        => ['qid' => '$qid'],
        'max_number' => ['$max' => '$days']
    ]],
    ['$group' => [
        '_id'   => ['number' => '$max_number'],
        'total' => ['$sum' => 1]
    ]]
];
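
To make the effect of the two $group stages concrete, here is a plain-PHP imitation that works on a small in-memory array. It is not how MongoDB evaluates the pipeline, only an illustration of the result it produces; the function name maxDaysHistogram and the sample documents are hypothetical.

```php
<?php
// Plain-PHP imitation of the two $group stages, for small in-memory arrays only.
function maxDaysHistogram(array $docs): array
{
    // Stage 1: largest 'days' value per 'qid'.
    $maxPerQid = [];
    foreach ($docs as $doc) {
        $qid = $doc['qid'];
        if (!isset($maxPerQid[$qid]) || $doc['days'] > $maxPerQid[$qid]) {
            $maxPerQid[$qid] = $doc['days'];
        }
    }
    // Stage 2: count how many qids share each maximum.
    $histogram = [];
    foreach ($maxPerQid as $max) {
        $histogram[$max] = isset($histogram[$max]) ? $histogram[$max] + 1 : 1;
    }
    ksort($histogram); // ascending by maximum, for readable output
    return $histogram;
}

$docs = [
    ['qid' => 'q1', 'days' => 3],
    ['qid' => 'q1', 'days' => 5],
    ['qid' => 'q2', 'days' => 5],
    ['qid' => 'q3', 'days' => 2],
];
// q1 and q2 both peak at 5 days, q3 peaks at 2 days.
$histogram = maxDaysHistogram($docs);
```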

Per-group total of a column after grouping:

$pipeline = [
    ['$match' => $tmpCondition],
    ['$group' => [
        '_id'       => ['type' => '$type'],
        'sum_value' => ['$sum' => '$number']
    ]]
];