Re: SHAKE Tolerance and Performance

From: Aron Broom (broomsday_at_gmail.com)
Date: Fri Mar 02 2012 - 01:33:21 CST

Hi Norman,

I agree completely with you on all points. I'm often forced to decide
between the extra speed of pmemd and the extra functionality of NAMD. It
seems like even with SHAKE and 2fs, 2fs, 6fs multistepping, NAMD runs at
just over 50% of pmemd's speed. But yes, things like the very robust
collective variables module, and numerous other functions in NAMD, are
enough to make one overlook the speed disadvantage most of the time.

I realize now that much of my earlier failure to get a larger speedup from
SHAKE in NAMD was simply because in AMBER you actually go from 1fs to 2fs
for everything, but in NAMD, if you are already using multistepping, it
more or less reduces to the jump in the electrostatic step from 4fs to 6fs,
since the 1fs to 2fs bonded jump is a minor computational change.
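
To make that arithmetic concrete, here is a rough Python sketch that counts
force evaluations per femtosecond for each scheme. The per-evaluation cost
weights (bonded 10%, short-range nonbonded 60%, full electrostatics 30%) are
made-up illustrative numbers, not measurements, and the function names are
hypothetical; only the timestep combinations come from the discussion above.

```python
# Rough cost model for multiple-timestep schemes.
# ASSUMPTION: the weights below are illustrative guesses, not benchmarks.
WEIGHTS = {"bonded": 0.10, "nonbonded": 0.60, "electrostatics": 0.30}


def cost_per_fs(intervals_fs, weights=WEIGHTS):
    """Weighted force-evaluation cost per fs of simulated time.

    intervals_fs maps each force class to the interval (in fs) between
    its evaluations, so 1 / interval is the evaluation rate.
    """
    return sum(weights[name] / dt for name, dt in intervals_fs.items())


# AMBER-style: every force class moves from 1fs to 2fs when SHAKE is on.
amber_no_shake = {"bonded": 1.0, "nonbonded": 1.0, "electrostatics": 1.0}
amber_shake = {"bonded": 2.0, "nonbonded": 2.0, "electrostatics": 2.0}

# NAMD with multistepping: 1fs/2fs/4fs without SHAKE, 2fs/2fs/6fs with it.
namd_no_shake = {"bonded": 1.0, "nonbonded": 2.0, "electrostatics": 4.0}
namd_shake = {"bonded": 2.0, "nonbonded": 2.0, "electrostatics": 6.0}

amber_speedup = cost_per_fs(amber_no_shake) / cost_per_fs(amber_shake)
namd_speedup = cost_per_fs(namd_no_shake) / cost_per_fs(namd_shake)

print(f"AMBER-style speedup from SHAKE: {amber_speedup:.2f}x")
print(f"NAMD multistep speedup from SHAKE: {namd_speedup:.2f}x")
```

Whatever weights one picks, the AMBER-style ratio is a flat factor of two,
while the NAMD ratio depends only on how much of the cost sits in the bonded
and full-electrostatics terms, since the short-range nonbonded rate is
unchanged.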

When I first heard of the two different methods being employed, all-GPU
versus nonbonded on the GPU and bonded on the CPU, I thought that the second
method would be superior, but I now see that was naive, as the at most 10%
computational cost of the bonded interactions is nothing compared with the
time it takes to shuttle data back and forth every step. Still, one might be
able to get improvements in NAMD with a GPU by scaling back the core clocks
to reduce heat and then scaling up the memory clocks. I haven't tested this
yet, but I do note that when running an AMBER simulation my GPU temps run
about 10-15 degrees hotter than the same system with NAMD, which suggests to
me that the cores are not all being used fully with NAMD (which of course
makes sense if memory is the bottleneck). Sadly though, nVidia has gimped
our ability to control clocking in anything other than Windows, and I'm not
thrilled by the idea of flashing my card's BIOS with different clock speeds.

Thanks for the reply,

~Aron

On Fri, Mar 2, 2012 at 2:14 AM, Norman Geist <norman.geist_at_uni-greifswald.de
> wrote:

> Hi Aron,
>
> I would expect the amber11 pmemd to be faster because more parts of the
> computation are done on the GPU (likely all), while in NAMD only the
> non-bonded interactions are computed there. So NAMD has to move data around
> more often and needs to return to the CPU to do the rest. One can improve
> that by doing the PME only every 4fs and setting outputenergies to a higher
> number, because the energies need to be computed on the CPU to be printed
> to screen. But that harms energy conservation.
>
> I have not yet tested the amber11 pmemd but acemd which is also very fast

This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:21:43 CST