MongoDB, MapReduce and sorting


I might be in a bit over my head on this one, as I'm still learning the ins and outs of MongoDB, but here goes.

Right now I'm working on a tool to search/filter through a dataset, sort it by an arbitrary datapoint (e.g. popularity), and then group it by an id. The only way I see I can do this is through Mongo's MapReduce functionality.

I can't use .group() because I'm working with more than 10,000 keys, and I also need to be able to sort the dataset.

My MapReduce code is working just fine, except for one thing: sorting. Sorting just doesn't want to work at all.

db.runCommand({
  'mapreduce': 'products',
  'map': function() {
    // The emitted key is a document: product_id first, then popularity.
    emit({
      product_id: this.product_id,
      popularity: this.popularity
    }, 1);
  },
  'reduce': function(key, values) {
    // Sum the 1s emitted for each key.
    var sum = 0;
    values.forEach(function(v) {
      sum += v;
    });

    return sum;
  },
  'query': {category_id: 20},
  'out': {inline: 1},
  'sort': {popularity: -1}
});

I already have a descending index on the popularity datapoint, so it's definitely not failing for lack of one:

{ "v" : 1, "key" : { "popularity" : -1 }, "ns" : "app.products", "name" : "popularity_-1" }

I just cannot figure out why it doesn't want to sort.

I can't output the result set to another collection instead of inlining it and then run a .find().sort({popularity: -1}) on that, because of the way this feature is going to work.

First of all, Mongo map/reduce is not designed to be used as a query tool (as it is in CouchDB); it is designed to run background tasks. I use it at work to analyze traffic data.

What you are doing wrong, however, is that you're applying the sort() to your input, which is useless because once the map() stage is done the intermediate documents are sorted by their keys. Because your key is a document, it is being sorted by product_id, then popularity.
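To make the ordering concrete, here is a small plain-JavaScript sketch (no MongoDB needed) that mimics how document keys end up ordered. The compareKeys helper is hypothetical and only handles numeric fields that appear in the same order in both keys; real BSON comparison covers many more cases, but the field-by-field idea is the same.

```javascript
// Hypothetical helper: compare two key documents field by field, in field order.
// Simplification: both keys are assumed to hold the same numeric fields.
function compareKeys(a, b) {
  for (const field of Object.keys(a)) {
    if (a[field] !== b[field]) {
      return a[field] < b[field] ? -1 : 1;
    }
  }
  return 0;
}

// Keys shaped like the ones emitted in the question (made-up values).
const emittedKeys = [
  { product_id: 3, popularity: 10 },
  { product_id: 1, popularity: 5 },
  { product_id: 2, popularity: 40 }
];

// The first field decides the order; popularity only breaks ties.
const byKey = emittedKeys.slice().sort(compareKeys);
console.log(byKey.map(function (k) { return k.product_id; }));
```

The result comes out ordered by product_id, so the popularity values appear in no particular order, whatever order the input documents were read in.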

This is how I generated my dataset:

function generate_dummy_data() {
    for (var i = 2; i < 1000000; i++) {
        db.foobar.save({
            _id: i,
            // Random category in 0..29 and popularity in 0..49.
            category_id: parseInt(Math.random() * 30),
            popularity: parseInt(Math.random() * 50)
        });
    }
}

And this is my map/reduce task:

var data = db.runCommand({
  'mapreduce': 'foobar',
  'map': function() {
    emit({
      // Negated popularity goes first so that ascending key order
      // yields descending popularity.
      sorting: this.popularity * -1,
      product_id: this._id,
      popularity: this.popularity
    }, 1);
  },
  'reduce': function(key, values) {
    var sum = 0;
    values.forEach(function(v) {
      sum += v;
    });

    return sum;
  },
  'query': {category_id: 20},
  'out': {inline: 1}
});

And this is the end result (too long to paste here):

http://cesarodas.com/results.txt

This works because now we're sorting by sorting, then product_id, then popularity. You can play with the sorting however you like; just remember that the final ordering is by key, regardless of how your input is sorted.
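The effect of the negated sorting field can be checked without a database. The sketch below builds keys of the same shape as the map() function above from a few made-up documents and orders them with a hypothetical compareKeys helper, a numeric-only stand-in for the real key comparison.

```javascript
// Hypothetical stand-in for key ordering: field by field, numeric values only.
function compareKeys(a, b) {
  for (const field of Object.keys(a)) {
    if (a[field] !== b[field]) {
      return a[field] < b[field] ? -1 : 1;
    }
  }
  return 0;
}

// Toy documents standing in for db.foobar (values are made up).
const docs = [
  { _id: 2, popularity: 7 },
  { _id: 3, popularity: 49 },
  { _id: 4, popularity: 7 },
  { _id: 5, popularity: 20 }
];

// Same key shape as the map/reduce task: negated popularity comes first.
const keys = docs.map(function (d) {
  return { sorting: d.popularity * -1, product_id: d._id, popularity: d.popularity };
});

// Ascending order on the negated value is descending order on popularity;
// product_id breaks ties between equally popular documents.
const sorted = keys.slice().sort(compareKeys);
console.log(sorted.map(function (k) { return [k.product_id, k.popularity]; }));
```

Moving product_id ahead of sorting in the key would bring back the original behaviour, since the first field always dominates.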

Anyway, as I said before, you should avoid doing queries with map/reduce; it was designed for background processing. If I were you, I would design my data in such a way that I could access it with simple queries. There is always a trade-off: in this case, more complex inserts/updates in exchange for simple queries (that's how I see MongoDB).

