Wednesday, 19 October 2011

MapReduce with Mongoose and CoffeeScript

After searching the InterWeb for a decent MapReduce example coded in CoffeeScript, I came up blank and decided to write my own. This one uses Mongoose too - well, why use anything else?

I haven't written a whole lot of explanation in the text, but I have commented the code quite heavily, so you can copy and paste it and still have the explanation with you.


Load Mongoose and connect to your database. Describe your schema and model and then load up some dummy data.
# import mongoose
mongoose = require('mongoose')
# connect
db = mongoose.connect('mongodb://localhost/test')
# define the schema
basketSchema = new mongoose.Schema
  fruit: String
  count: Number
# define model from the schema
Basket = mongoose.model "basket", basketSchema
# define some dummy data
baskets = [{
  fruit: 'apple'
  count: 27
},{
  fruit: 'apple'
  count: 32
},{
  fruit: 'pear'
  count: 26
},{
  fruit: 'orange'
  count: 16
}]
# load dummy data into mongodb
for basket in baskets
  b = new Basket basket
  b.save (err) ->
    if err
      console.log err

Now for the MapReduce stuff. If you still don't get MapReduce then you will once you read this Star Trek based article.

Define a Map function.
# define the map function
# this function will be called for every basket and
# is used to emit whatever information we want
# about the contents of each basket
# the data must be in the format (key, value), although
# value can be an object if necessary
mapFunc = ->
  key = @fruit
  values = {
    count: @count,
    baskets: 1
  }
  emit(key, values)

Define a Reduce function.
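# define the reduce function
# this function is called with a key and an array of
# the values emitted for that key by the map function
# above (mongodb skips the call when a key only has
# one value)
# in this example we'll count all the fruit by
# type and also keep a tally for the baskets
reduceFunc = (key, values) ->
  count = 0
  # not called 'baskets': in the same file that name
  # would reuse the dummy data array defined above
  basketTally = 0
  values.forEach (value) ->
    count += value.count
    basketTally += value.baskets
  return {
    fruit: key,
    count: count,
    baskets: basketTally
  }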
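
If you want to see how the two functions fit together before pointing them at a real database, here is a rough sketch that runs the same map and reduce logic in plain CoffeeScript. It is only an approximation of what MongoDB does: runMapReduce, sampleBaskets, mapBasket and reduceBaskets are made-up names for this illustration, and emit is passed in as an argument because the global emit only exists inside MongoDB.

```coffeescript
# local simulation of the map and reduce steps, no mongodb needed
# runMapReduce is a made-up helper, not a mongoose or mongodb call

# same shape as the dummy data loaded earlier
sampleBaskets = [
  { fruit: 'apple', count: 27 }
  { fruit: 'apple', count: 32 }
  { fruit: 'pear', count: 26 }
  { fruit: 'orange', count: 16 }
]

# map: emit is passed in here, whereas mongodb provides it for you
mapBasket = (emit) ->
  emit @fruit, { count: @count, baskets: 1 }

# reduce: add up the values emitted for one key
reduceBaskets = (key, values) ->
  total = { fruit: key, count: 0, baskets: 0 }
  for value in values
    total.count += value.count
    total.baskets += value.baskets
  total

runMapReduce = (docs, mapFn, reduceFn) ->
  # map phase: collect the emitted values under their key
  groups = {}
  emit = (key, value) ->
    (groups[key] ?= []).push value
  for doc in docs
    mapFn.call doc, emit
  # reduce phase: like mongodb, only call reduce when a key
  # has more than one value, otherwise keep the emitted value
  for key, values of groups
    value = if values.length > 1 then reduceFn(key, values) else values[0]
    { _id: key, value: value }

results = runMapReduce sampleBaskets, mapBasket, reduceBaskets
console.log JSON.stringify(result) for result in results
```

The apple entry comes out as a count of 59 across 2 baskets. The pear and orange entries have no fruit field in their value, because each was emitted only once and so never went through the reduce function. That is worth remembering when you read the real output.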

Get some output.
# now let's feed all the baskets into the MapReduce
# notice that you must convert the mapFunc and
# reduceFunc to strings or else they will not work
# the 'out' bit is described here http://www.mongodb.org/display/DOCS/MapReduce#MapReduce-Overview
Basket.collection.mapReduce mapFunc.toString(), reduceFunc.toString(), { out : "results" }, (err, val) ->
  if err
    err # do something with the error

The results will now be in a collection called "results" which you can look at with JMongoBrowser or whatever.

I hope this is of some use to you. If there are any corrections or improvements then please post something below.
