Pestryaev, E.M.
Parallel-vectorial algorithm of molecular dynamics
Mathematical Models and Computer Simulations, 19:62-70, 2007

A method for increasing the performance of molecular dynamics programs is discussed. The essence of the method is to make full use of the hardware resources of modern processors, illustrated with the Pentium 4 with Hyper-Threading and the Athlon 64 X2. The operating system sees the former as two virtual processors, while the latter has two real processors on one chip. In both cases the pair of processors, virtual or real, shares common memory and provides a hardware extension in the form of multimedia SSE registers, or vector registers. As a result, any computation can first be parallelized across the two processors and then vectorized into four further streams in the SSE registers of each processor. The number of steps of the original algorithm performed simultaneously thus becomes eight, which until recently required a computer cluster with special software. The C++ text of the parallel-vectorial algorithm is described, and its relative performance is investigated as a function of the number of streams for both kinds of processors.
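
The two-level scheme in the abstract (two processors, each working on four SSE lanes) can be illustrated with a minimal C++ sketch. It is not the author's code: the function names `squared_distances_sse` and `squared_distances_parallel`, the choice of squared interparticle distances as the kernel, the use of `std::thread`, and the simple split of the range into two halves are all assumptions made for illustration. On targets without SSE the vector loop is skipped and the scalar loop does all the work.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE intrinsics: four packed floats per register
#define HAVE_SSE 1
#endif

// Hypothetical kernel: r2[i] = dx[i]^2 + dy[i]^2 + dz[i]^2 for i in [begin, end).
// With SSE available, four elements are processed per iteration;
// the scalar loop handles the remainder (or everything without SSE).
void squared_distances_sse(const float* dx, const float* dy, const float* dz,
                           float* r2, std::size_t begin, std::size_t end) {
    std::size_t i = begin;
#ifdef HAVE_SSE
    for (; i + 4 <= end; i += 4) {
        __m128 x = _mm_loadu_ps(dx + i);  // unaligned loads of 4 floats
        __m128 y = _mm_loadu_ps(dy + i);
        __m128 z = _mm_loadu_ps(dz + i);
        __m128 s = _mm_add_ps(_mm_add_ps(_mm_mul_ps(x, x), _mm_mul_ps(y, y)),
                              _mm_mul_ps(z, z));
        _mm_storeu_ps(r2 + i, s);
    }
#endif
    for (; i < end; ++i) {
        r2[i] = dx[i] * dx[i] + dy[i] * dy[i] + dz[i] * dz[i];
    }
}

// Hypothetical driver: splits the range into two halves, one per thread.
// The two threads write to disjoint parts of r2, so no locking is needed.
std::vector<float> squared_distances_parallel(const std::vector<float>& dx,
                                              const std::vector<float>& dy,
                                              const std::vector<float>& dz) {
    assert(dx.size() == dy.size() && dy.size() == dz.size());
    const std::size_t n = dx.size();
    const std::size_t mid = n / 2;
    std::vector<float> r2(n);

    // First half runs on a second thread, second half on the calling thread.
    std::thread worker(squared_distances_sse, dx.data(), dy.data(), dz.data(),
                       r2.data(), std::size_t{0}, mid);
    squared_distances_sse(dx.data(), dy.data(), dz.data(), r2.data(), mid, n);
    worker.join();
    return r2;
}
```

Calling `squared_distances_parallel(dx, dy, dz)` on three equally sized arrays of displacement components returns the array of squared distances. With two threads and four floats per SSE register, up to eight elements are in flight at once, which matches the count of eight simultaneous steps given in the abstract. Building with `std::thread` may require a threading flag such as `-pthread`, depending on the toolchain.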
