I have covered D.E. Shaw in the past here at bbgm. I have always been fascinated by a research organization funded by someone who has money and is interested in science, without necessarily feeling the need to commercialize it, and who is by all accounts very intelligent. That D.E. Shaw Research was represented at Supercomputing 08 was not surprising, but the topic of the talk was not what I expected. That it was one of the more fun talks at the conference made it that much better.
Tiankai Tu (at least I think it was him) gave a talk about HiMach, a framework for the analysis of very long (millisecond scale) trajectories from molecular dynamics simulations. As most of you probably know, D.E. Shaw Research has been building a special-purpose machine, Anton, for millisecond scale molecular dynamics simulations. The trouble is that the trajectories generated by those simulations are so large that traditional, sequential trajectory analysis methods become essentially untenable. That’s the problem HiMach seeks to address. It is not designed to be run on special-purpose machines, but rather on a commodity cluster, and it takes advantage of that increasingly ubiquitous distributed computing method of our times, MapReduce. HiMach allows users to write trajectory analysis programs sequentially, and then carries out the parallel execution of those programs automatically.

Under this model, the map phase corresponds to per-frame data acquisition, and the reduce phase to cross-frame data analysis. In addition to providing a MapReduce-style API and a parallel runtime that allow users to write trajectory analysis programs in Python, HiMach also extends the MapReduce paradigm, providing an MD trajectory definition and chained reduce capabilities (similar to Cascading as far as I can tell). The framework takes care of key-value data management, storing intermediate key-value pairs in a local file system on each compute node. Prudently, they also support automatic loading of frames into VMD. The following figure is an overview of the HiMach runtime on a single processor.
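
To make the map/reduce split described above a little more concrete, here is a toy sketch in plain Python. It is my own illustration, not HiMach's actual API: the function names, the two-atom frames, and the sequential driver are all made up for the example. A real runtime would distribute the map calls across nodes and spill the intermediate key-value pairs to local disk.

```python
import math
from collections import defaultdict


def make_trajectory(n_frames):
    """Build a hypothetical toy trajectory: (frame_index, {atom: (x, y, z)})."""
    frames = []
    for i in range(n_frames):
        coords = {"A": (0.0, 0.0, 0.0), "B": (3.0 + i, 4.0, 0.0)}
        frames.append((i, coords))
    return frames


def map_frame(frame):
    """Map phase: per-frame data acquisition, emitting (key, value) pairs."""
    index, coords = frame
    distance = math.dist(coords["A"], coords["B"])
    yield ("A-B", (index, distance))


def reduce_pair(key, values):
    """Reduce phase: cross-frame analysis of everything emitted for one key."""
    distances = [d for _, d in sorted(values)]  # order by frame index
    summary = {
        "frames": len(distances),
        "mean": sum(distances) / len(distances),
        "max": max(distances),
    }
    return key, summary


def run_mapreduce(frames, mapper, reducer):
    """Sequential stand-in for a parallel runtime: map, group by key, reduce."""
    grouped = defaultdict(list)
    for frame in frames:
        for key, value in mapper(frame):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


result = run_mapreduce(make_trajectory(3), map_frame, reduce_pair)
print(result)
```

The point of the split is that `map_frame` only ever sees one frame, so those calls are independent and can run anywhere, while `reduce_pair` is the only place that needs data from more than one frame.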

The cool architecture wouldn’t mean much if the performance were not up to par. Using HiMach, they were able to analyze a 1 TB trajectory in 15 minutes on a 512-node cluster. The paper (accessible from the SC08 abstract page) goes into a lot more detail. This is fascinating stuff and shows how modern distributed programming paradigms can be applied to some classic scientific problems at scale. I am looking forward to these key-value paradigms being extended to data storage and retrieval, e.g. using something like CouchDB as a way to analyze structural and trajectory information (a database of trajectory data would be cool).