Jung, Jaewoon; Kobayashi, Chigusa; Imamura, Toshiyuki; Sugita, Yuji
Parallel implementation of 3D FFT with volumetric decomposition schemes for efficient molecular dynamics simulations
COMPUTER PHYSICS COMMUNICATIONS, 200:57-65, MAR 2016

Three-dimensional Fast Fourier Transform (3D FFT) plays an important role in a wide variety of computer simulations and data analyses, including molecular dynamics (MD) simulations. In this study, we develop hybrid (MPI+OpenMP) parallelization schemes of 3D FFT based on two new volumetric decompositions, mainly for the particle mesh Ewald (PME) calculation in MD simulations. In one scheme, (1d_Alltoall), five all-to-all communications in one dimension are carried out, and in the other, (2d_Alltoall), one two-dimensional all-to-all communication is combined with two all-to-all communications in one dimension. 2d_Alltoall is similar to the conventional volumetric decomposition scheme. We performed benchmark tests of 3D FFT for the systems with different grid sizes using a large number of processors on the K computer in RIKEN AICS. The two schemes show comparable performances, and are better than existing 3D FFTs. The performances of 1d_Alltoall and 2d_Alltoall depend on the supercomputer network system and number of processors in each dimension. There is enough leeway for users to optimize performance for their conditions. In the PME method, short-range real-space interactions as well as long-range reciprocal-space interactions are calculated. Our volumetric decomposition schemes are particularly useful when used in conjunction with the recently developed midpoint cell method for short-range interactions, due to the same decompositions of real and reciprocal spaces. The 1d_Allitoall scheme of 3D FFT takes 4.7 ms to simulate one MD cycle for a virus system containing more than 1 million atoms using 32,768 cores on the K computer. (C) 2015 Elsevier B.V. All rights reserved.

DOI:10.1016/j.cpc.2015.10.024

Find full text with Google Scholar.