May 17, 2021 MongoDB
Map-Reduce is a computational model that breaks a large job (a large amount of data) into smaller pieces (MAP) and then merges the partial results into a final result (REDUCE).
MongoDB's Map-Reduce is flexible and well suited to large-scale data analysis.
Here's mapReduce's basic syntax:
>db.collection.mapReduce(
   function() { emit(key, value); },                 // map function
   function(key, values) { return reduceFunction },  // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)
MapReduce requires two functions. MongoDB runs the map function on every document in the collection that matches the query; the map function calls emit(key, value), and the emitted values are grouped by key and passed to the reduce function for processing.
The map function must call emit(key, value) to return key-value pairs.
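Description of the parameters:
map: the mapping function; it generates the sequence of key-value pairs that becomes the input of the reduce function.
reduce: the reducing function; it combines all the values emitted for one key into a single value.
out: the collection in which the results are stored.
query: a filter; only documents that match it are passed to the map function.
sort: a sort specification applied to the input documents before they are passed to the map function; it can be combined with limit.
limit: the maximum number of documents passed to the map function.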
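To make the flow concrete, here is a minimal in-memory sketch of how the pieces fit together: select the input documents, run map on each one, group the emitted values by key, then reduce each group. It is not MongoDB's implementation. The helper name simulateMapReduce, the predicate and comparator forms of query and sort, the sample documents, and the "disabled" status value are all assumptions made for illustration, and emit is passed to the map function as an argument here, whereas MongoDB provides it as a global.

```javascript
// In-memory sketch of the map-reduce flow; NOT the MongoDB implementation.
// simulateMapReduce and its option handling are hypothetical, for illustration only.
function simulateMapReduce(docs, mapFn, reduceFn, options = {}) {
  const { query = () => true, sort, limit } = options;

  // 1. Select the input documents: filter, then sort, then limit.
  let input = docs.filter((doc) => query(doc));
  if (sort) input = [...input].sort(sort);
  if (limit !== undefined) input = input.slice(0, limit);

  // 2. Map phase: each document is bound to `this`; emit() collects values per key.
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of input) mapFn.call(doc, emit);

  // 3. Reduce phase: a key with a single value is passed through unchanged.
  const out = [];
  for (const [key, values] of groups) {
    out.push({ _id: key, value: values.length > 1 ? reduceFn(key, values) : values[0] });
  }
  return out;
}

// Made-up sample documents using the user_name and status fields.
const posts = [
  { user_name: "mark", status: "active" },
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "disabled" },
  { user_name: "tom", status: "active" },
];

const totals = simulateMapReduce(
  posts,
  function (emit) { emit(this.user_name, 1); },
  (key, values) => values.reduce((a, b) => a + b, 0),
  { query: (doc) => doc.status === "active" }
);

console.log(totals);
// [ { _id: 'mark', value: 1 }, { _id: 'tom', value: 2 } ]
```

Only the three active posts reach the map function, so mark ends up with one article and tom with two.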
Consider the following document structure, which stores a user's article together with the user_name and status fields:
{
   "post_text": "w3cschool.cn W3Cschool tutorials, the most complete technical documentation.",
   "user_name": "mark",
   "status": "active"
}
Now we'll use the mapReduce function on the posts collection to select the published articles (status: "active"), group them by user_name, and count the number of articles per user:
>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values) },
   {
      query: { status: "active" },
      out: "post_total"
   }
)
The mapReduce output above is:
{
   "result" : "post_total",
   "timeMillis" : 9,
   "counts" : {
      "input" : 4,
      "emit" : 4,
      "reduce" : 2,
      "output" : 2
   },
   "ok" : 1
}
The results show that four documents match the query criteria (status: "active"), that the map function generated four key-value pairs, and that the reduce function combined the pairs with the same key into two groups.
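The counts can be reproduced with a small in-memory sketch. The sample documents below are made up (four active posts, two per user), and the model is simplified: it calls reduce exactly once per key, whereas MongoDB may call reduce several times for one key or skip it for a key with a single value.

```javascript
// Hypothetical sample: four active posts, two per user.
const activePosts = [
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
];

const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
const grouped = new Map();

// Map phase: one document in, one emitted (user_name, 1) pair out.
for (const post of activePosts) {
  counts.input += 1;
  const key = post.user_name; // what emit(this.user_name, 1) would send
  if (!grouped.has(key)) grouped.set(key, []);
  grouped.get(key).push(1);
  counts.emit += 1;
}

// Reduce phase: one call per key in this simplified model.
const results = [];
for (const [key, values] of grouped) {
  counts.reduce += 1;
  results.push({ _id: key, value: values.reduce((a, b) => a + b, 0) });
  counts.output += 1;
}

console.log(counts);  // { input: 4, emit: 4, reduce: 2, output: 2 }
console.log(results); // [ { _id: 'tom', value: 2 }, { _id: 'mark', value: 2 } ]
```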
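Description of the specific parameters:
result: the name of the collection that stores the results.
timeMillis: the execution time, in milliseconds.
input: the number of documents that matched the query and were sent to the map function.
emit: the number of times emit was called in the map function, that is, the total number of key-value pairs generated.
reduce: the number of times the reduce function was called.
output: the number of documents in the result collection.
ok: whether the operation succeeded; 1 means success.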
Call find() on the value returned by mapReduce to view the results:
>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values) },
   {
      query: { status: "active" },
      out: "post_total"
   }
).find()
The above query returns the following results, showing that the two users, tom and mark, each have two published articles:
{ "_id" : "tom", "value" : 2 }
{ "_id" : "mark", "value" : 2 }
In a similar way, MapReduce can be used to build large, complex aggregate queries.
The map and reduce functions are written in JavaScript, which makes MapReduce very flexible and powerful.
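One detail worth knowing when writing these functions: Array.sum, used in the reduce function of the examples, is a helper supplied by MongoDB's JavaScript environment and is not part of standard JavaScript. The sketch below shows a plain-JavaScript equivalent; the names sumValues and reducePostCount are hypothetical. It also checks that reducing already-reduced partial totals gives the same answer, which matters because MongoDB may call reduce more than once for the same key.

```javascript
// Plain-JavaScript stand-in for the Array.sum helper (sumValues is a hypothetical name).
function sumValues(values) {
  return values.reduce((total, v) => total + v, 0);
}

// The counting reduce function from the examples, rewritten without Array.sum.
function reducePostCount(key, values) {
  return sumValues(values);
}

// Reducing all values at once ...
const direct = reducePostCount("tom", [1, 1, 1, 1]);

// ... gives the same total as reducing a partial result together with the remaining values.
const partial = reducePostCount("tom", [reducePostCount("tom", [1, 1]), 1, 1]);

console.log(direct, partial); // 4 4
```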