Wednesday, 5 December 2012

Aggregating Everything - Map/Reduce and Camel?

If you are used to Map/Reduce you will be used to the idea of breaking tasks down into little chunks and then collecting the partial results together into some final format.

So recently, when I was parsing zillions of rows of data, aggregating related data into partial CSV files and then aggregating those partial files into reports, I thought - Aha! MapReduce.

For a whole bunch of good design reasons I was using Apache Camel - a neat pipelining tool which, with a bit of help from ActiveMQ, provides the sort of long-running stability that I needed. Camel, however, does not do Map/Reduce, but it does have the Aggregator integration pattern, which you can use to do a similar thing.

Image courtesy of Carlos Oliveira
Imagine you empty your jar of loose change on a table. You stack the nickels in stacks of ten coins, the dimes in stacks of ten coins and the quarters in stacks of ten coins. You add up all the 50 cents, $1s and $2.50s and you know how much you have. That's Map/Reduce.

Now, imagine you empty your jar of loose change into one of those coin counting machines in the mall. Internally all the coins are sorted by falling through a hole which is either nickel, dime or quarter shaped, and as they emerge from the other side they are counted*. That's aggregation Camel style.
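
To make the coin analogy concrete, here is a rough Python sketch of the two styles. It is not Camel code; the jar contents and both function names are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical jar of coins, values in cents.
JAR = [5, 10, 25, 5, 5, 25, 10, 10, 25, 5]


def map_reduce_total(coins):
    """Batch style: sort everything into stacks first, then add up the stacks."""
    stacks = defaultdict(list)
    for coin in coins:                  # "map": put each coin on its stack
        stacks[coin].append(coin)
    # "reduce": one subtotal per stack, then a grand total
    subtotals = {value: sum(stack) for value, stack in stacks.items()}
    return sum(subtotals.values())


def streaming_total(coins):
    """Aggregator style: update a running count as each coin drops through."""
    counts = Counter()
    for coin in coins:
        counts[coin] += 1               # the slot a coin falls through acts as its key
    return sum(value * n for value, n in counts.items())
```

Both functions give the same total for the same jar. The difference is that the streaming version never needs the whole pile laid out on the table first, which is also why it needs to be told when the last coin has gone through.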

I did hit a bit of a snag. I couldn't work out how to tell the Aggregator integration pattern that there were no more files to come... Stop... Whoa there... Desist!

It turns out that hidden away (in the middle of the docs) the File endpoint rather usefully sets a flag in the headers called CamelBatchComplete which is just what I was looking for:

<route id="report_month_to_date">
  <from uri="file:partials" />
  <unmarshal><csv/></unmarshal>
  <to uri="bean:myCsvHandler?method=doHandleCsvData" />
  <aggregate strategyRef="serviceStrategy">
    <correlationExpression>
      <simple>${header.month} == ${date:now:yyyyMM}</simple>
    </correlationExpression>
    <completionPredicate>
      <simple>${header.CamelBatchComplete}</simple>
    </completionPredicate>
    <to uri="file:reports/?fileName=${header.month}.csv" />
  </aggregate>
</route>
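
If the completion predicate feels a bit magic, this plain Python simulation shows roughly what the aggregator is doing with that flag. It is only a sketch under assumptions: messages are modelled as dicts, the group key is simply the month header (the route above correlates on a comparison instead), and the aggregation strategy is "append the rows". Only the header name CamelBatchComplete comes from Camel.

```python
# Hypothetical messages, one per partial file picked up by the file endpoint.
MESSAGES = [
    {"headers": {"month": "201212", "CamelBatchComplete": False}, "body": [["a", 1]]},
    {"headers": {"month": "201212", "CamelBatchComplete": False}, "body": [["b", 2]]},
    {"headers": {"month": "201212", "CamelBatchComplete": True}, "body": [["c", 3]]},
]


def aggregate(messages):
    """Collect rows per month; release a group when the batch flag arrives."""
    pending = {}     # correlation key -> rows collected so far
    released = []    # (key, rows) for groups that have completed
    for msg in messages:
        key = msg["headers"]["month"]
        pending.setdefault(key, []).extend(msg["body"])
        # Completion check, evaluated against the message that just arrived
        if msg["headers"].get("CamelBatchComplete"):
            released.append((key, pending.pop(key)))
    return released, pending
```

Without the flag on the last message, everything would just sit in pending forever, which is exactly the snag I hit.
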
Good luck fellow travelers.

* I have no idea how a coin counting machine works.

Friday, 23 November 2012

Simple Camel Configuration of a Twitter Endpoint

I was asked just now how I got the Twitter Stream working in my new Camel based project and how I managed the credentials.

The Twitter endpoint works like a dream and this is essentially what my code looks like. All you need is a secrets.properties file alongside your Java file.
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ResourceBundle;

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.twitter.TwitterComponent;
import twitter4j.Status;

/**
 * A Camel Java DSL Router
 */
public class MyRouteBuilder extends RouteBuilder {

    // Loads myproject/secrets.properties from the classpath
    private static final ResourceBundle SECRETS = ResourceBundle
            .getBundle("myproject.secrets");

    /**
     * Let's configure the Camel routing rules using Java code...
     *
     * @throws UnsupportedEncodingException
     */
    @Override
    public void configure() throws UnsupportedEncodingException {
        configureAccess();
        String twitter = "twitter://streaming/filter?type=event&keywords="
                + URLEncoder.encode("london", "utf8");
        from(twitter)
            .filter(body().isInstanceOf(Status.class))
            // add all sorts of stuff here; the log endpoint is only a placeholder
            .to("log:tweets");
    }

    private void configureAccess() {
        // setup Twitter component
        TwitterComponent tc = getContext().getComponent("twitter",
                TwitterComponent.class);
        tc.setAccessToken(SECRETS.getString("ACCESS_TOKEN"));
        tc.setAccessTokenSecret(SECRETS.getString("ACCESS_TOKEN_SECRET"));
        tc.setConsumerKey(SECRETS.getString("CONSUMER_KEY"));
        tc.setConsumerSecret(SECRETS.getString("CONSUMER_SECRET"));
    }
}

Saturday, 17 November 2012

Apache Camel - Connection Beans Without Spring

While writing some tests for an Apache Camel project, I just spent rather longer than I'd have liked trying to work out how to configure a connection bean for Mongo without using Spring.

Since I hide my embarrassments in public I thought I'd best share with anyone else with brain freeze.

import com.mongodb.Mongo;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.JndiRegistry;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class SomeTest extends CamelTestSupport {

    @Override
    protected RouteBuilder createRouteBuilder() throws Exception {
        return new RouteBuilder() {
            @Override
            public void configure() throws Exception {
                from("direct:findAll")
                    .to("mongodb:myConnectionBean?database=flights&collection=tickets&operation=findAll")
                    .to("mock:resultFindAll");
            }
        };
    }

    @Override
    protected JndiRegistry createRegistry() throws Exception {
        // Bind the Mongo connection under the name used in the endpoint URI
        JndiRegistry reg = new JndiRegistry(createJndiContext());
        Mongo connectionBean = new Mongo("localhost", 27017);
        reg.bind("myConnectionBean", connectionBean);
        return reg;
    }

    @Test
    public void testSomething() throws InterruptedException {
        // test something here
    }
}

Don't forget you need the camel-mongodb artifact in your pom.xml file. Good luck fellow travellers.

Monday, 5 November 2012

Python Web Microframeworks - Take Your Pick

You may have read my post "Top Python Web Frameworks - Today" in which I took a fresh look at which Python Web Frameworks were still around and still maintained.

In this post I give a quick overview of about half of those, which I have loosely designated as "Microframeworks" - regardless of what the authors have called them. Wikipedia doesn't have a definition for microframework - I just looked - so what I really mean here is anything which lets you get started without having to learn a whole bunch of syntax and convention. Right on sister!

Let's get going:

mkvirtualenv micro
cdvirtualenv
pip install flask bottle web2py web.py kiss.py wheezy.web
blah...blah...
Successfully installed flask bottle web2py web.py kiss.py wheezy.web Werkzeug Jinja2 gevent compressinja beaker putils jsmin pyScss sqlalchemy elixir jsonpickle pev requests wheezy.core wheezy.caching wheezy.html wheezy.http wheezy.routing wheezy.security wheezy.validation greenlet
Cleaning up...
Now that's done, a little code from each:

Bottle (v0.11.13)


#!/usr/bin/env python
from bottle import route, run


@route('/')
def index():
    return "Hello World!"


if __name__ == '__main__':
    run(host='localhost', port=8080)

Bottle doesn't rely on any other packages at all, which means it's a great framework to use if you want to see all the working parts as they're all in the one file. That being said it can offer client-side sessions and compression and even WebSockets straight out of the box so it's not just a toy by any means.

Flask (v0.9)

#!/usr/bin/env python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    return "Hello World!"


if __name__ == '__main__':
    app.run(host='localhost', port=8080)

Flask is dependent on Werkzeug for all the WSGI stuff and upon Jinja2 as a template library. It comes with client-side sessions, a built-in debugger and is totally Unicode friendly. I love Flask and use it often, as my other posts will testify.
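
If you're wondering what "the WSGI stuff" actually is, here is a bare WSGI "Hello World" using only the standard library. It's a sketch for comparison, not something taken from any of these frameworks; hello_app and call_app are names I've made up.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A bare WSGI application: the kind of callable a microframework builds for you."""
    body = b"Hello World!"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, without starting a server."""
    environ = {}
    setup_testing_defaults(environ)     # fills in a minimal, valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Calling call_app(hello_app) hands back the status line and the body, and the same hello_app can be served for real with wsgiref.simple_server.make_server. Routing, sessions, templates and the rest are what the frameworks layer on top of this.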

kiss.py (v0.4.9)

#!/usr/bin/env python
from kiss.core.application import Application
from kiss.views.templates import Response


class Controller(object):
    def get(self, request):
        return Response("Hello World!")


options = {
    "application": {
        "address": "127.0.0.1",
        "port": 8080
    },
    "urls": {
        "": Controller,
    },
}

# Pass the options defined above rather than repeating them inline
app = Application(options)

if __name__ == '__main__':
    app.start()

The first of the new boys, kiss.py is certainly not package independent! It requires Werkzeug for WSGI, requests for HTTP, Beaker for sessions, Elixir and SQLAlchemy for an ORM (PostgreSQL, MySQL and SQLite), Jinja2 for templates, gevent, pev and greenlet for events, as well as compressinja, jsmin, jsonpickle, putils and pyScss which add various other niceties. Almost all are well known and trusted libraries.

web.py (v0.37)

#!/usr/bin/env python
import web

urls = (
    '/', 'index'
)


class index:
    def GET(self):
        return "Hello world!"


if __name__ == '__main__':
    app = web.application(urls, globals())
    # you can change the port number on the command line - default is 8080
    app.run()

Again, web.py doesn't rely on any other packages at all, but to me it's not as useful as Flask or kiss.py and not as simple to study as Bottle, so I can't see the point, although according to the site it's well used by others.

wheezy.web (v0.1.307)

#!/usr/bin/env python
from wsgiref.simple_server import make_server

from wheezy.http import HTTPResponse
from wheezy.http import WSGIApplication
from wheezy.routing import url
from wheezy.web.handlers import BaseHandler
from wheezy.web.middleware import bootstrap_defaults
from wheezy.web.middleware import path_routing_middleware_factory


class WelcomeHandler(BaseHandler):
    def get(self):
        response = HTTPResponse()
        response.write('Hello World!')
        return response


def welcome(request):
    response = HTTPResponse()
    response.write('Hello World!')
    return response


all_urls = [
    url('', WelcomeHandler, name='default'),
    url('welcome', welcome, name='welcome')
]

app = WSGIApplication(
    middleware=[
        bootstrap_defaults(url_mapping=all_urls),
        path_routing_middleware_factory
    ],
    options={}
)

if __name__ == '__main__':
    make_server('', 8080, app).serve_forever()

pip installs wheezy.web, wheezy.core, wheezy.caching, wheezy.html, wheezy.http, wheezy.routing, wheezy.security and wheezy.validation, which to me looks like the developers have taken a sensible approach to the development cycle by splitting everything up into independent code units.

According to the site, functionality includes routing, model update/validation, authentication/authorization, content caching with dependency, xsrf/resubmission protection, AJAX+JSON, i18n (gettext), middlewares, and more.

In Summary

No I haven't tested them to death and no I haven't even tried out kiss.py and wheezy.web in a real world app, although I will do. I certainly have not done either load or concurrency testing on them. You can do that and I'll read your blog.

Good luck fellow traveller.

Friday, 2 November 2012

CoffeeScript Love: Backbone.js Tutorials in CoffeeScript

Ahhhh enjoy the aroma, inhale the caffeine. If you're looking for a nice list of Backbone tutorials that use CoffeeScript then look no more - CoffeeScript Love: Backbone.js Tutorials in CoffeeScript.

Thursday, 1 November 2012

Top Python Web Frameworks - Today

Every once in a while it's nice to rediscover old friends. Over the last few years I've had a play with quite a few Python Web Frameworks, but one does tend to get a favourite and stick with it. Same as other areas I suppose - cars, dogs, beers, partners... Still, it's good to have a look at what's new and fresh once in a while (my wife doesn't care, but don't tell my dog.)

example of Python language (Photo credit: Wikipedia)
If you look around the internet, there are probably thirty or so Python Web Frameworks around, but not all of them are actively supported. This means that they are likely to be missing functionality in some of the latest areas of interest - OAuth2, HTML5 or mobile support for example. Or maybe they just don't make it easy to do stuff that you have become used to over the last few years - AJAX or web services maybe.

In my previous post "My Top Ten Python Web Frameworks" from about 18 months ago, I gave you my opinion of what was hot, or not, at that time. Some of those seem to have stalled now - Tipfy, GAE framework, Weblayer; while others have appeared or matured - kiss.py, web.py, wheezy.web.

Below is an alphabetic list of the active (updated this year) frameworks which I know about. There may be others out there, so please let me know in the comments at the bottom.
In my next few posts I'll be giving you a run down of what state each of them is in and try to give you some idea of how they fit your requirements.

Good luck fellow traveller.


Tuesday, 17 July 2012

A First Look at Flask-Evolution

While poking around in Flask-Ahoy this morning I came across Flask-Evolution, which I hadn't previously noticed. It offers the Django idea of migrations to the Flask community and so I thought I'd give it a go and see how it shapes up.


So let's set up a virtual environment and get the basics in place.
mkvirtualenv evo;
cdvirtualenv
# this installs flask, flask-script and everything you need
pip install Flask-Evolution
touch myapp.py
touch manage.py

Now the contents of the two files you touched. First the "hello world" Flask app goes in the myapp.py file:


Next we take the tutorial Flask-Script manage.py file and adapt it very slightly:

Now we are ready to create a simple migration for a Post model for a blog:

# "python ./manage.py migrate init" does not create the "migrations" directory for me!
mkdir migrations
python ./manage.py migrate create
Name for the migration: Post
Created new migration file: /mypath/migrations/0001_post.py


and you get the template file which looks like this:



In the file put your Post model and modify "up" and "down":


Our First Migration

python ./manage.py migrate run

Our Second Migration
Now if we want to modify that Post model we can again do:

python ./manage.py migrate create
Name for the migration: Post
Created new migration file: /mypath/migrations/0002_post.py

Modify your model and modify "up" and "down". I've left some comments so you can see what I did:


and run it again.

python ./manage.py migrate run

You should also play around with the "redo" and "undo" commands. That's about it really.
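
If the whole up/down idea is new to you, here is a toy version of it using nothing but sqlite3 from the standard library. To be clear, this is not Flask-Evolution's API or file format: the MIGRATIONS list, the migration_log table and the run/undo/redo functions are all my own invention, and "redo" here is just my guess at the semantics (undo the latest migration, then run again).

```python
import sqlite3

# Hypothetical migrations as (name, up SQL, down SQL).
MIGRATIONS = [
    ("0001_post",
     "CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);",
     "DROP TABLE post;"),
    ("0002_post",
     "ALTER TABLE post ADD COLUMN body TEXT;",
     # Rebuild the table without the column, which works on older SQLite too
     """CREATE TABLE post_old (id INTEGER PRIMARY KEY, title TEXT);
        INSERT INTO post_old (id, title) SELECT id, title FROM post;
        DROP TABLE post;
        ALTER TABLE post_old RENAME TO post;"""),
]


def applied(conn):
    """Return the names of the migrations already applied, oldest first."""
    conn.execute("CREATE TABLE IF NOT EXISTS migration_log (name TEXT PRIMARY KEY)")
    return [row[0] for row in conn.execute("SELECT name FROM migration_log ORDER BY name")]


def run(conn):
    """Apply every migration that has not been applied yet, in order."""
    done = set(applied(conn))
    for name, up, _down in MIGRATIONS:
        if name not in done:
            conn.executescript(up)
            conn.execute("INSERT INTO migration_log (name) VALUES (?)", (name,))
    conn.commit()


def undo(conn):
    """Roll back the most recently applied migration, if there is one."""
    done = applied(conn)
    if not done:
        return None
    last = done[-1]
    down = next(d for n, _u, d in MIGRATIONS if n == last)
    conn.executescript(down)
    conn.execute("DELETE FROM migration_log WHERE name = ?", (last,))
    conn.commit()
    return last


def redo(conn):
    """Undo the latest migration, then apply it again."""
    undo(conn)
    run(conn)


def columns(conn, table):
    """List the column names of a table."""
    return [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]
```

Run it against sqlite3.connect(":memory:") and check columns(conn, "post") after each of run, undo and redo to watch the body column come and go.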

Conclusion
This is only version 0.6 of Flask-Evolution and it has a way to go. I can see some use for it, but it's far from seamless and a long way from South.

Documentation is woeful and I'm not convinced the developer Adam Patterson has the time to do the job he'd like to.

You could always offer to give him a hand!

If you see anything wrong with what I've written here or if you spot spelling or other mistakes then please let me know.

Good luck fellow traveller.