Tu, Tiankai; Rendleman, Charles A.; Borhani, David W.; Dror, Ron O.; Gullingsrud, Justin; Jensen, Morten O.; Klepeis, John L.; Maragakis, Paul; Miller, Patrick; Stafford, Kate A.; Shaw, David E.
A Scalable Parallel Framework for Analyzing Terascale Molecular Dynamics Simulation Trajectories
INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 432-443, 2008

As parallel algorithms and architectures drive the longest molecular dynamics (MD) simulations towards the millisecond scale, traditional sequential post-simulation data analysis methods are becoming increasingly untenable. Inspired by the programming interface of Google's MapReduce, we have built a new parallel analysis framework called HiMach, which allows users to write trajectory analysis programs sequentially and carries out the parallel execution of the programs automatically. We introduce (1) a new MD trajectory data analysis model that is amenable to parallel processing, (2) a new interface for defining trajectories to be analyzed, (3) a novel method to make use of an existing sequential analysis tool called VMD, and (4) an extension to the original MapReduce model to support multiple rounds of analysis. Performance evaluations on up to 512 cores demonstrate the efficiency and scalability of the HiMach framework on a Linux cluster.
