Data Day Texas 2014
On Saturday (01/11) I attended Data Day Texas (#ddtx14). As the title suggests, it was a single day of concentrated awesomeness, with four speaker tracks focused squarely on Big Data for developers. Twenty Rackers attended, one of us presented, and yes, Rackspace sponsored the conference this year.
While I am sure everybody's takeaway from ddtx14 was different, I was most impressed by two technologies that seem to be gaining steam: Summingbird and Mesos. If you work on big data architecture, you might find them useful.
If you are not familiar with Nathan Marz's (of Storm fame) Lambda architecture, here it is simplified to a few sentences (though I strongly recommend reading the Big Data book):
1) Batch-process immutable old data (example: with Hadoop)
2) Stream-process new real-time data (example: with Storm)
3) Combine 1) and 2) for a best-of-both-worlds solution!
Now the problem with this approach has been that the logic in 1) and 2) is very similar: it is mostly aggregation logic such as counts, filters, CMS (Count-Min Sketch), HLL (HyperLogLog), etc., and you end up writing it twice. It's not DRY! It seems Summingbird allows you to write code once that will execute in both Storm and Hadoop, thus streamlining the Lambda architecture process. One code to rule them all!
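To make the "write it once" idea concrete, here is a minimal sketch in plain Python. It does not use Summingbird, Storm, or Hadoop; the names `extract`, `run_batch`, `StreamRunner`, and `query` are hypothetical stand-ins I made up for illustration. The point is only that the aggregation logic lives in one function and two different runners reuse it.

```python
from collections import Counter
from typing import Iterable


def extract(event: str) -> Counter:
    """Shared logic: turn one event into a partial word count."""
    return Counter(event.lower().split())


def run_batch(events: Iterable[str]) -> Counter:
    """Stand-in for the batch layer: process the whole old data set at once."""
    total: Counter = Counter()
    for event in events:
        total += extract(event)
    return total


class StreamRunner:
    """Stand-in for the streaming layer: update counts one event at a time."""

    def __init__(self) -> None:
        self.view: Counter = Counter()

    def on_event(self, event: str) -> None:
        self.view += extract(event)


def query(batch_view: Counter, realtime_view: Counter, word: str) -> int:
    """Step 3: combine both views to answer a query."""
    return (batch_view + realtime_view)[word]


old_events = ["storm hadoop", "hadoop hadoop"]  # already processed in batch
new_events = ["storm mesos"]                    # arriving in real time

batch_view = run_batch(old_events)
stream = StreamRunner()
for e in new_events:
    stream.on_event(e)

print(query(batch_view, stream.view, "storm"))   # 2
print(query(batch_view, stream.view, "hadoop"))  # 3
```

Merging the two views works here because counts add up no matter how the data is split between the two layers. My understanding is that this kind of mergeable aggregation is what makes the single-code approach possible, but treat that as my reading rather than a description of Summingbird's internals.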
The second technology I wanted to highlight is Mesos. This Apache project is a bit outside my direct area of expertise, but it seems that by applying voodoo kernel magic it consolidates multiple computing resources into one giant data-center-sized machine to run your tasks... and provides isolation without virtualization. Of course, it is more involved than that, and some APIs and various tools and services have to be used. It also seems that most of the major frameworks are already supported, which is very neat: Hadoop, Storm, Spark, MPI, etc.
Other interesting topics that were discussed included genetic programming and a lot of security and Hadoop talks. Of course, I would be remiss if I didn't mention the nice talk by our very own Gary Dusbabek (@gdusbabek) on metrics and architecture.
I have included some more low-quality images from #ddtx14 below: