Luckow, Andre; Jha, Shantenu; Kim, Joohyun; Merzky, Andre; Schnor, Bettina
Adaptive distributed replica-exchange simulations
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 367:2595-2606, JUN 28 2009

Owing to the loose coupling between replicas, the replica-exchange (RE) class of algorithms should be able to benefit greatly from using as many resources as are available. However, effectively using multiple distributed resources to reduce the time to completion remains a challenge at many levels. Additionally, no implementation of a pleasingly distributed algorithm such as replica-exchange exists that is independent of infrastructural details. This paper proposes an extensible and scalable framework based on the Simple API for Grid Applications (SAGA) that provides a general-purpose, opportunistic mechanism to effectively use multiple resources in an infrastructure-independent way. By analysing the requirements of the RE algorithm and the challenges of implementing it on real production systems, we propose a new abstraction (BIGJOB), which forms the basis of the adaptive redistribution and effective scheduling of replicas.

DOI:10.1098/rsta.2009.0051
