Maxime Boissonneault
Date: Thu Aug 21 2014 - 07:32:10 CDT

> Which combination brings improvement depends on system size. The smaller the
> system, the better the benefit.
> Keep the order x y z until benefit is gone. This needs to be tested for
> different system sizes.
> Also check if you still get speedup while increasing the number of GPUs. You
> might already scale out much earlier, as this timing still doesn't represent
> 10 GPUs IMHO, guess 2 would do the same.
What GPUs is your experience based on? With 2 K20m GPUs, I got
0.021 s/step at the fastest when using 20 CPU cores.
With 8 K20m GPUs and 20 cores, I got 0.0075 s/step, about 2.8 times faster.
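
To make the comparison explicit, here is a small Python sketch of the ratio between those two timings. The function names are made up for illustration, and taking the 2-GPU run as the baseline for efficiency is my assumption, not something NAMD reports.

```python
# Timings quoted above (seconds per step), both with 20 CPU cores.
T_2GPU = 0.021   # 2 x K20m
T_8GPU = 0.0075  # 8 x K20m


def speedup(t_base: float, t_new: float) -> float:
    """How many times faster the new run is than the baseline."""
    return t_base / t_new


def scaling_efficiency(t_base: float, n_base: int,
                       t_new: float, n_new: int) -> float:
    """Measured speedup divided by the ideal speedup (n_new / n_base).

    Assumes ideal scaling is linear in the number of GPUs.
    """
    return speedup(t_base, t_new) / (n_new / n_base)


if __name__ == "__main__":
    s = speedup(T_2GPU, T_8GPU)
    e = scaling_efficiency(T_2GPU, 2, T_8GPU, 8)
    print(f"speedup 2 -> 8 GPUs: {s:.2f}x")
    print(f"efficiency relative to 2 GPUs: {e:.0%}")
```

So going from 2 to 8 GPUs gives roughly 2.8x out of an ideal 4x, or about 70% efficiency relative to the 2-GPU run.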
> What value do you use for fullelectfrequency ?
I use the default ApoA1 benchmark, downloaded from the NAMD website.
The only change is that I added the twoawayx parameter you mentioned.


