MongoDB aggregation

May 17, 2021 MongoDB

Aggregation in MongoDB is used mainly to process data (for example, to compute averages or sums) and to return the computed results. It is similar to aggregate functions such as count(*) in a SQL statement.

The aggregate() method

Aggregation in MongoDB is performed with the aggregate() method.

Syntax

The basic syntax format of the aggregate() method is as follows:

>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)

Example

The data in the collection is as follows:

{
   _id: ObjectId("7df78ad8902c"),
   title: 'MongoDB Overview',
   description: 'MongoDB is no sql database',
   by_user: 'w3cschool.cn',
   url: 'http://www.w3cschool.cn',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId("7df78ad8902d"),
   title: 'NoSQL Overview',
   description: 'No sql database is very fast',
   by_user: 'w3cschool.cn',
   url: 'http://www.w3cschool.cn',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId("7df78ad8902e"),
   title: 'Neo4j Overview',
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
}

Now let's use aggregate() to count the number of articles written by each author in the collection above:

> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
   "result" : [
      {
         "_id" : "w3cschool.cn",
         "num_tutorial" : 2
      },
      {
         "_id" : "Neo4j",
         "num_tutorial" : 1
      }
   ],
   "ok" : 1
}
>

The example above is similar to the SQL statement: select by_user, count(*) from mycol group by by_user

In the example above, we group the documents by the by_user field and count how many documents share each by_user value.
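To make the grouping step concrete, here is a minimal sketch in plain JavaScript that imitates the $group stage above on an in-memory array. It only illustrates the idea and is not how MongoDB implements the stage; the helper name countByUser and the trimmed-down sample documents are assumptions made for this illustration.

```javascript
// In-memory imitation of:
//   {$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}
// Sample documents reduced to the fields that matter here.
const docs = [
  { title: "MongoDB Overview", by_user: "w3cschool.cn", likes: 100 },
  { title: "NoSQL Overview", by_user: "w3cschool.cn", likes: 10 },
  { title: "Neo4j Overview", by_user: "Neo4j", likes: 750 },
];

// countByUser is a hypothetical helper name, not a MongoDB API.
function countByUser(documents) {
  const counts = new Map(); // a Map keeps group keys in first-seen order
  for (const doc of documents) {
    // {$sum : 1} adds 1 for every document that falls into the group
    counts.set(doc.by_user, (counts.get(doc.by_user) || 0) + 1);
  }
  // Shape each group like a $group output document
  return Array.from(counts, ([key, n]) => ({ _id: key, num_tutorial: n }));
}

const grouped = countByUser(docs);
console.log(grouped);
```

As in the shell output, the author w3cschool.cn ends up with a count of 2 and Neo4j with a count of 1.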

The following table shows some aggregation expressions:

Expression	Description	Example
$sum	Calculates the sum of the values in each group.	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}])
$avg	Calculates the average of the values in each group.	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}])
$min	Returns the minimum of the values in each group.	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}])
$max	Returns the maximum of the values in each group.	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}])
$push	Appends each value to an array in the result document.	db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push: "$url"}}}])
$addToSet	Adds each value to an array in the result document, without adding duplicates.	db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}])
$first	Returns the value from the first document in each group, according to the order of the input documents.	db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}])
$last	Returns the value from the last document in each group, according to the order of the input documents.	db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}])
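The difference between these expressions is easier to see side by side. The following plain JavaScript sketch computes a rough equivalent of each one per by_user group on an in-memory copy of the sample data. The helper name summarizeByUser and the output field names (sum, avg, and so on) are hypothetical and chosen only for this illustration; the code imitates the expressions and is not MongoDB's implementation.

```javascript
// Sample documents reduced to the fields used by the expressions in the table.
const articles = [
  { by_user: "w3cschool.cn", url: "http://www.w3cschool.cn", likes: 100 },
  { by_user: "w3cschool.cn", url: "http://www.w3cschool.cn", likes: 10 },
  { by_user: "Neo4j", url: "http://www.neo4j.com", likes: 750 },
];

// summarizeByUser is a hypothetical helper, not a MongoDB API.
function summarizeByUser(documents) {
  // Step 1: collect the documents of each group (like _id : "$by_user")
  const groups = new Map();
  for (const doc of documents) {
    if (!groups.has(doc.by_user)) groups.set(doc.by_user, []);
    groups.get(doc.by_user).push(doc);
  }
  // Step 2: apply one accumulator per output field
  return Array.from(groups, ([key, members]) => {
    const likes = members.map((d) => d.likes);
    const urls = members.map((d) => d.url);
    const total = likes.reduce((a, b) => a + b, 0);
    return {
      _id: key,
      sum: total,                           // like {$sum : "$likes"}
      avg: total / likes.length,            // like {$avg : "$likes"}
      min: Math.min(...likes),              // like {$min : "$likes"}
      max: Math.max(...likes),              // like {$max : "$likes"}
      push: urls,                           // like {$push : "$url"}, duplicates kept
      addToSet: Array.from(new Set(urls)),  // like {$addToSet : "$url"}, duplicates dropped
      first: urls[0],                       // like {$first : "$url"}
      last: urls[urls.length - 1],          // like {$last : "$url"}
    };
  });
}

const summary = summarizeByUser(articles);
console.log(JSON.stringify(summary, null, 2));
```

Note how push keeps both copies of the w3cschool.cn URL while addToSet keeps only one.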

The concept of pipelines

In Unix and Linux, a pipeline is typically used to pass the output of one command as the input of the next command.

MongoDB's aggregation pipeline works the same way: after one stage has processed the documents, its output is passed to the next stage. Stages can be repeated within a pipeline.

Expression: processes an input document and produces an output. Expressions are stateless: they can only evaluate the current document in the aggregation pipeline and cannot refer to other documents.
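The pipeline idea can be sketched in plain JavaScript by treating every stage as a function that takes an array of documents and returns a new array. The names matchStage, skipStage, limitStage, and runPipeline are hypothetical and exist only in this illustration; MongoDB itself exposes stages as $match, $skip, $limit, and so on.

```javascript
// Each stage is modelled as: (array of documents) -> (new array of documents).
// All names below are hypothetical helpers, not MongoDB APIs.
const matchStage = (predicate) => (docs) => docs.filter(predicate);
const skipStage = (n) => (docs) => docs.slice(n);
const limitStage = (n) => (docs) => docs.slice(0, n);

// Feed the output of each stage into the next one, in order.
function runPipeline(documents, stages) {
  return stages.reduce((current, stage) => stage(current), documents);
}

const input = [{ likes: 100 }, { likes: 10 }, { likes: 750 }, { likes: 40 }];

const output = runPipeline(input, [
  matchStage((d) => d.likes >= 40), // keeps the documents with likes 100, 750, 40
  skipStage(1),                     // drops the first of those (likes 100)
  limitStage(1),                    // keeps one document (likes 750)
]);

console.log(output);
```

Because every stage has the same shape, stages can be reordered or repeated freely, which is the property the pipeline description relies on.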

Here are some of the common stages in the aggregation framework:

$project: Reshapes the input documents. It can be used to rename, add, or remove fields, and to create computed values and nested documents.
$match: Filters the data and outputs only the documents that satisfy the condition. $match uses MongoDB's standard query operators.
$limit: Limits the number of documents returned by the aggregation pipeline.
$skip: Skips a specified number of documents in the aggregation pipeline and returns the remaining documents.
$unwind: Splits an array field in a document into multiple documents, each containing one value from the array.
$group: Groups the documents in a collection; it can be used to compute statistics.
$sort: Sorts the input documents and outputs them in order.
$geoNear: Outputs documents ordered by their proximity to a geographic location.
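Of the stages listed above, $unwind is often the least intuitive. The following plain JavaScript sketch imitates $unwind followed by an ascending $sort on an in-memory array. The helpers unwind and sortBy are hypothetical names used only here, and the string comparison is a simple code-unit comparison, which is an assumption of this sketch.

```javascript
// Two of the sample documents, reduced to title and tags.
const posts = [
  { title: "MongoDB Overview", tags: ["mongodb", "database", "NoSQL"] },
  { title: "Neo4j Overview", tags: ["neo4j", "database", "NoSQL"] },
];

// Imitates {$unwind : "$tags"}: one output document per array element,
// with the array field replaced by that single element.
function unwind(documents, field) {
  return documents.flatMap((doc) =>
    doc[field].map((value) => ({ ...doc, [field]: value }))
  );
}

// Imitates {$sort : {field : direction}} with direction 1 or -1.
// Works on a copy so the input array is left untouched.
function sortBy(documents, field, direction) {
  return [...documents].sort((a, b) => {
    if (a[field] < b[field]) return -direction;
    if (a[field] > b[field]) return direction;
    return 0;
  });
}

const unwound = unwind(posts, "tags");       // 2 documents become 6
const sorted = sortBy(unwound, "tags", 1);   // ordered by the single tag value

console.log(unwound.length);
console.log(sorted.map((d) => d.tags));
```

After the unwind step every document carries exactly one tag, so a later $group on that field could count how many articles use each tag.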

Examples of pipeline operators

1. $project example

db.article.aggregate([
    { $project : {
        title : 1 ,
        author : 1
    }}
 ]);

In this case, the result contains only the _id, title, and author fields; _id is included by default. To exclude _id, write the following:

db.article.aggregate([
    { $project : {
        _id : 0 ,
        title : 1 ,
        author : 1
    }}
 ]);
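The effect of the two $project examples can be imitated in plain JavaScript as follows. This sketch handles only the simple inclusion form shown above (fields set to 1, optionally with _id set to 0); the helper name project and the sample document are assumptions made for this illustration, not part of MongoDB.

```javascript
// A hypothetical document with one field that the projection drops.
const articleDocs = [
  { _id: 1, title: "MongoDB Overview", author: "w3cschool.cn", likes: 100 },
];

// project is a hypothetical helper that supports inclusion specs only.
function project(documents, spec) {
  const includeId = spec._id !== 0; // _id is kept unless explicitly set to 0
  const fields = Object.keys(spec).filter((k) => k !== "_id" && spec[k] === 1);
  return documents.map((doc) => {
    const out = {};
    if (includeId && "_id" in doc) out._id = doc._id;
    for (const f of fields) {
      if (f in doc) out[f] = doc[f]; // fields missing from the document are not added
    }
    return out;
  });
}

const withId = project(articleDocs, { title: 1, author: 1 });
const withoutId = project(articleDocs, { _id: 0, title: 1, author: 1 });

console.log(withId);
console.log(withoutId);
```

The first call keeps _id alongside title and author; the second returns only title and author.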

2. $match example

db.articles.aggregate( [
                        { $match : { score : { $gt : 70, $lte : 90 } } },
                        { $group: { _id: null, count: { $sum: 1 } } }
                       ] );

$match selects the records whose score is greater than 70 and less than or equal to 90, and then passes the matching records to the next stage, the $group pipeline operator, for processing.
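As a rough illustration of this two-stage pipeline, the sketch below applies the same score condition and count to an in-memory array in plain JavaScript. The sample scores are made up for the illustration, and the variable names are hypothetical.

```javascript
// Hypothetical sample data; the scores are invented for this sketch.
const scored = [
  { score: 65 }, { score: 70 }, { score: 71 },
  { score: 85 }, { score: 90 }, { score: 95 },
];

// Stage 1, like { $match : { score : { $gt : 70, $lte : 90 } } }:
// 70 is excluded ($gt), 90 is included ($lte).
const matched = scored.filter((d) => d.score > 70 && d.score <= 90);

// Stage 2, like { $group : { _id : null, count : { $sum : 1 } } }:
// a single group for all matched documents. When nothing matches,
// $group has no input and therefore produces no output document.
const counted = matched.length > 0
  ? [{ _id: null, count: matched.length }]
  : [];

console.log(counted);
```

With the sample scores above, the documents with scores 71, 85, and 90 pass the filter, so the count is 3.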

3. $skip example

db.article.aggregate([
    { $skip : 5 }
]);

After processing by the $skip pipeline operator, the first five documents are skipped ("filtered" out) and only the remaining documents are passed on.
