May 17, 2021 MongoDB
Aggregation in MongoDB is primarily used to process data (for example, computing averages or sums) and to return the computed results. It is a bit like using count(*) with group by in an SQL statement.
Aggregation in MongoDB is performed with the aggregate() method.
The basic syntax format of the aggregate() method is as follows:
>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
The data in the collection is as follows:
{
   _id: ObjectId(7df78ad8902c),
   title: 'MongoDB Overview',
   description: 'MongoDB is no sql database',
   by_user: 'w3cschool.cn',
   url: 'http://www.w3cschool.cn',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId(7df78ad8902d),
   title: 'NoSQL Overview',
   description: 'No sql database is very fast',
   by_user: 'w3cschool.cn',
   url: 'http://www.w3cschool.cn',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId(7df78ad8902e),
   title: 'Neo4j Overview',
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
}
Now let's count the number of articles written by each author in the collection above, using aggregate():
> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
   "result" : [
      { "_id" : "w3cschool.cn", "num_tutorial" : 2 },
      { "_id" : "Neo4j", "num_tutorial" : 1 }
   ],
   "ok" : 1
}
>
The above example is similar to the SQL statement: select by_user, count(*) from mycol group by by_user
In the example above, we group the documents by the by_user field and, for each distinct by_user value, count how many documents share that value.
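To make the grouping step concrete, here is a minimal sketch in plain JavaScript that mimics what the $group stage with {$sum : 1} computes. It runs on an in-memory array instead of a real collection, and the helper name countByUser is hypothetical; it is not how MongoDB implements the stage.

```javascript
// In-memory copy of the three sample documents (only the fields used here).
const docs = [
  { title: "MongoDB Overview", by_user: "w3cschool.cn", likes: 100 },
  { title: "NoSQL Overview", by_user: "w3cschool.cn", likes: 10 },
  { title: "Neo4j Overview", by_user: "Neo4j", likes: 750 },
];

// Hypothetical helper: mimics {$group: {_id: "$by_user", num_tutorial: {$sum: 1}}}.
function countByUser(documents) {
  const counts = new Map(); // a Map keeps keys in first-seen order
  for (const doc of documents) {
    // Adding 1 per document is what {$sum: 1} does inside a group.
    counts.set(doc.by_user, (counts.get(doc.by_user) || 0) + 1);
  }
  return Array.from(counts, ([_id, num_tutorial]) => ({ _id, num_tutorial }));
}

const grouped = countByUser(docs);
console.log(grouped);
```

The sketch returns groups in first-seen order only because a Map is used; a real $group stage should not be relied on for any particular output order.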
The following table shows some aggregation expressions that can be used inside $group:
Expression | Description | Example |
---|---|---|
$sum | Calculates the sum of the values in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}]) |
$avg | Calculates the average of the values in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}]) |
$min | Gets the minimum of the values across the documents in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}]) |
$max | Gets the maximum of the values across the documents in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}]) |
$push | Appends the value to an array in the result document. | db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push: "$url"}}}]) |
$addToSet | Adds the value to an array in the result document, but does not create duplicates. | db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}]) |
$first | Gets the value from the first document in each group, according to the order of the source documents (typically used after a sort). | db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}]) |
$last | Gets the value from the last document in each group, according to the order of the source documents (typically used after a sort). | db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}]) |
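As a rough illustration of what each expression in the table produces for one group, the sketch below applies plain-JavaScript counterparts to the sample data held in memory. The helpers groupBy and summarize are hypothetical names chosen for this example, and the code only approximates the behaviour for simple numeric and string values.

```javascript
// Sample data reduced to the fields the table's examples read.
const docs = [
  { by_user: "w3cschool.cn", url: "http://www.w3cschool.cn", likes: 100 },
  { by_user: "w3cschool.cn", url: "http://www.w3cschool.cn", likes: 10 },
  { by_user: "Neo4j", url: "http://www.neo4j.com", likes: 750 },
];

// Hypothetical helper: collects the documents of each group, keyed by one field.
function groupBy(documents, field) {
  const groups = new Map();
  for (const doc of documents) {
    if (!groups.has(doc[field])) groups.set(doc[field], []);
    groups.get(doc[field]).push(doc);
  }
  return groups;
}

// Hypothetical helper: a plain-JavaScript counterpart of each accumulator,
// applied to the documents of a single (non-empty) group.
function summarize(group) {
  const likes = group.map((d) => d.likes);
  const urls = group.map((d) => d.url);
  const total = likes.reduce((a, b) => a + b, 0);
  return {
    sum: total,                   // like $sum : "$likes"
    avg: total / likes.length,    // like $avg : "$likes"
    min: Math.min(...likes),      // like $min : "$likes"
    max: Math.max(...likes),      // like $max : "$likes"
    push: urls,                   // like $push : "$url" (keeps duplicates)
    addToSet: [...new Set(urls)], // like $addToSet : "$url" (no duplicates)
    first: urls[0],               // like $first : "$url"
    last: urls[urls.length - 1],  // like $last : "$url"
  };
}

const w3c = summarize(groupBy(docs, "by_user").get("w3cschool.cn"));
console.log(w3c);
```

For the w3cschool.cn group, the push array contains the URL twice while the addToSet array contains it once, which is the difference between the two expressions.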
In Unix and Linux, a pipeline typically passes the output of one command as the input of the next command.
MongoDB's aggregation pipeline works the same way: documents are processed by one stage, and the result is passed on to the next stage. The same kind of stage can appear more than once in a pipeline.
Expression: takes an input document and produces an output value. Expressions are stateless: they are evaluated against the current document in the pipeline and cannot refer to other documents. The accumulator expressions used in $group are the exception, since they combine values from several documents.
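The idea of passing documents from stage to stage can be sketched in plain JavaScript by treating each stage as a function from an array of documents to a new array. This is only an illustration under that assumption; runPipeline and the two example stages are hypothetical and work on in-memory data.

```javascript
// Hypothetical helper: runs the stages in order, feeding each stage's
// output to the next stage as its input.
function runPipeline(documents, stages) {
  return stages.reduce((current, stage) => stage(current), documents);
}

const docs = [
  { title: "MongoDB Overview", likes: 100 },
  { title: "NoSQL Overview", likes: 10 },
  { title: "Neo4j Overview", likes: 750 },
];

const stages = [
  // Roughly like {$match: {likes: {$gt: 50}}}: keep only matching documents.
  (ds) => ds.filter((d) => d.likes > 50),
  // Roughly like {$project: {_id: 0, title: 1}}: keep only the title field.
  (ds) => ds.map((d) => ({ title: d.title })),
];

const output = runPipeline(docs, stages);
console.log(output);
```

Each stage function looks at one document at a time and keeps no state between documents, which mirrors the stateless nature of expressions described above.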
Here's a look at a few of the common operations in the aggregation framework:
1. $project example
db.article.aggregate( [ { $project : { title : 1 , author : 1 } } ] );
In this case, the result contains only the _id, title and author fields; _id is included by default. If you do not want _id in the result, exclude it as follows:
db.article.aggregate( [ { $project : { _id : 0 , title : 1 , author : 1 } } ] );
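The default handling of _id can be illustrated with a small in-memory sketch. The project helper below is hypothetical and supports only simple field inclusion (value 1) plus _id : 0; the sample document and its body field are made up for the example, since the contents of the article collection are not shown here.

```javascript
// Hypothetical helper: mimics an inclusion-only $project specification.
function project(documents, spec) {
  const includeId = spec._id !== 0; // _id is kept unless explicitly set to 0
  return documents.map((doc) => {
    const out = {};
    if (includeId && "_id" in doc) out._id = doc._id;
    for (const [field, flag] of Object.entries(spec)) {
      if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
    }
    return out;
  });
}

// Made-up document standing in for the article collection.
const articles = [
  { _id: 1, title: "Aggregation", author: "alice", body: "long text" },
];

const withId = project(articles, { title: 1, author: 1 });
const withoutId = project(articles, { _id: 0, title: 1, author: 1 });
console.log(withId, withoutId);
```

In both calls the body field is dropped because it is not listed in the specification; only the second call also drops _id.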
2. $match example
db.articles.aggregate( [ { $match : { score : { $gt : 70, $lte : 90 } } }, { $group: { _id: null, count: { $sum: 1 } } } ] );
$match selects the records whose score is greater than 70 and less than or equal to 90, and then passes the matching records to the next stage, the $group pipeline operator, which counts them.
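The boundary behaviour of $gt and $lte is easy to get wrong, so here is a small in-memory sketch of the same two steps. The scores are made-up sample values and matchThenCount is a hypothetical helper, not part of MongoDB.

```javascript
// Made-up scores, chosen to include both boundary values (70 and 90).
const articles = [
  { score: 65 },
  { score: 70 },
  { score: 80 },
  { score: 90 },
  { score: 95 },
];

// Hypothetical helper: mimics $match {score: {$gt: 70, $lte: 90}}
// followed by $group {_id: null, count: {$sum: 1}}.
function matchThenCount(documents) {
  const matched = documents.filter((d) => d.score > 70 && d.score <= 90);
  // With no matching documents there is no group, so no output document.
  return matched.length > 0 ? [{ _id: null, count: matched.length }] : [];
}

const counted = matchThenCount(articles);
console.log(counted);
```

Here 70 is excluded because $gt is strict, while 90 is included because $lte allows equality, so only the documents with scores 80 and 90 are counted.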
3. $skip example
db.article.aggregate( [ { $skip : 5 } ] );
The $skip pipeline operator skips the first five documents and passes the remaining documents on to the next stage.
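In terms of plain arrays, skipping behaves like dropping a prefix. The sketch below assumes eight made-up documents held in memory; the skip helper is hypothetical.

```javascript
// Eight made-up documents with _id values 1 through 8.
const docs = Array.from({ length: 8 }, (_, i) => ({ _id: i + 1 }));

// Hypothetical helper: mimics {$skip: n} by dropping the first n documents.
function skip(documents, n) {
  return documents.slice(n);
}

const remaining = skip(docs, 5);
console.log(remaining.map((d) => d._id));
```

Which documents count as "the first five" depends on the order in which they reach the stage, so in practice $skip is usually placed after a stage that sorts the documents.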