Re: AW: Consistent temperature increase in CUDA runs

From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Tue Feb 28 2012 - 06:14:18 CST

On Feb 28, 2012, at 1:21 AM, "Norman Geist" <norman.geist_at_uni-greifswald.de>
wrote:

 Hi Aron,

thank you for investigating. What you are seeing might, unfortunately, be a problem too.

The problem I see also arises in CPU runs, I now think. It was harder for me to
observe because we haven't done long simulations on CPU with NAMD. In
temperature-controlled runs, everything is fine with the temperature. But when
we then run a free simulation without any controls afterwards, the temperature
keeps rising consistently. Strange! I will have to test a little.
Maybe increasing pairlistdist and switchdist helps, but I already have

It is no surprise that GPU runs, particularly of very large systems, don't
conserve energy as well as CPU runs. On the GPU you compute forces in
single precision and thus have more "noise" in your system.
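If you want to put a number on that drift, a rough sketch like the following
would do (untested; it just fits a least-squares slope to the TOTAL energy
column of the log, and it assumes the standard ETITLE:/ENERGY: column layout,
so check the ETITLE: line of your own log first):

# Sketch: estimate total-energy drift (kcal/mol per ns) from a NAMD log.
# Assumes the standard ENERGY: line layout where field 1 is the timestep (TS)
# and field 11 (0-based) is TOTAL; verify against your log's ETITLE: line.
import sys

def total_energy_series(logfile, timestep_fs=1.0):
    """Return (times in ns, TOTAL energies in kcal/mol) from a NAMD log."""
    times, totals = [], []
    with open(logfile) as fh:
        for line in fh:
            if line.startswith("ENERGY:"):
                fields = line.split()
                times.append(int(fields[1]) * timestep_fs * 1e-6)  # fs -> ns
                totals.append(float(fields[11]))   # TOTAL column (assumed)
    return times, totals

def drift_per_ns(times, totals):
    """Least-squares slope of TOTAL energy vs. time, in kcal/mol per ns."""
    n = len(times)
    tbar, ebar = sum(times) / n, sum(totals) / n
    num = sum((t - tbar) * (e - ebar) for t, e in zip(times, totals))
    den = sum((t - tbar) ** 2 for t in times)
    return num / den

if __name__ == "__main__":
    t, e = total_energy_series(sys.argv[1], timestep_fs=1.0)
    print("drift: %.3f kcal/mol per ns over %.2f ns"
          % (drift_per_ns(t, e), t[-1] - t[0]))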

Axel

cutoff 10
switchdist 7
pairlistdist 14

What else might affect energy conservation?

Thank you

Norman Geist.

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On behalf of Aron Broom
Sent: Monday, February 27, 2012 15:32
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: Consistent temperature increase in CUDA runs

Hi Norman,

Yeah, I had the temperature controlled for the whole run.

In doing a more thorough examination of the kinetic energies though, I've
come across something very odd. In my case I have ~2000 atoms from a
medium sized protein, and the rest (98,000) is solvent. I have rigid water
on, but have left the protein hydrogens non-rigid, and am using a 1 fs
timestep with langevinHydrogen coupling off. What I see is that the
kinetic energy distribution of the non-rigid atoms matches my desired
temperature, but when I break it down into sub-systems, I find that the
temperature of all the protein atoms comes out ~20-25 K higher than the
solvent!!
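To spell out what I mean by breaking it down into sub-systems, here is a
minimal sketch of the kind of per-group check I mean (not my exact script);
it assumes you already have per-atom velocities in A/ps and masses in amu,
e.g. parsed from the restart files and the prmtop, and it simply counts
3N degrees of freedom per group, ignoring constraints:

# Sketch: compare the instantaneous "temperature" of two atom groups
# (e.g. protein vs. solvent) from per-atom velocities and masses.
# Assumes velocities in A/ps and masses in amu; degrees of freedom are
# approximated as 3N per group (constraints are ignored here).
import numpy as np

KB = 0.0019872041                   # Boltzmann constant, kcal/(mol K)
AMU_A2_PS2_TO_KCAL = 1.0 / 418.4    # 1 amu*A^2/ps^2 = 1/418.4 kcal/mol

def group_temperature(velocities, masses, indices):
    """Instantaneous temperature (K) of the atoms listed in `indices`."""
    v = velocities[indices]                      # (n, 3) array in A/ps
    m = masses[indices]                          # (n,) array in amu
    ke = 0.5 * np.sum(m[:, None] * v**2) * AMU_A2_PS2_TO_KCAL   # kcal/mol
    ndof = 3 * len(indices)                      # crude: no constraint correction
    return 2.0 * ke / (ndof * KB)

# usage sketch: protein_idx / solvent_idx would come from your PSF/prmtop
# T_prot = group_temperature(vel, mass, protein_idx)
# T_solv = group_temperature(vel, mass, solvent_idx)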

That is a pretty extreme difference. I figured maybe it was because I
decoupled the bath from the hydrogens, so I ran an extra 2 ns with it
coupled but saw no change in the crazy distribution of temperature. I'm
doing some more testing, and although it doesn't seem to relate entirely to
what you are seeing, it seems like a big problem. I'm also not certain if
it is GPU specific at this point.

~Aron

On Mon, Feb 27, 2012 at 1:28 AM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

Hi Aron,

thank you for your interest. My system sizes are mostly about a million
atoms, but the problem also occurs in smaller systems. Did you measure the
data you gave from a temperature-controlled run, or was the temperature
free? I also heated up my system with the Langevin thermostat and let it
run uncontrolled afterwards; while Langevin is active, the temperature is
fine, but when it is turned off, a slow increase in temperature occurs. The
system is also at 1 atm pressure.

Norman Geist.

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On behalf of Aron Broom
Sent: Saturday, February 25, 2012 02:53
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: Consistent temperature increase in CUDA runs

Having looked around a little more, I'm wondering if hydrogen coupling
might have anything to do with it. I've got langevinHydrogen set to off,
which I thought was the correct thing to do when using rigid waters (like
TIP3P) even if the other bonds to hydrogen in the system are not rigid.
Maybe I'm way off base here.

On Fri, Feb 24, 2012 at 3:49 PM, Aron Broom <broomsday_at_gmail.com> wrote:

Hi Norman,

I've been running simulations in NAMD using AMBER FF03 and GLYCAM06 on GPUs
(M2070 mostly). I dragged out some old files from a 60 ns run and compared
the temperature at 10 ns with that at 60 ns by computing it directly from
the velocity files that were written at the end of each 10 ns segment. I was
using Langevin dynamics set to 300 K; I get 297 K at 10 ns and 296 K at
60 ns. So I'm not seeing what you see.
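In case it is useful, a minimal sketch of that computation (not the exact
script I used) could look like the following; it assumes the NAMD binary
restart layout of a 4-byte atom count followed by N*3 doubles in NAMD
internal velocity units (PDBVELFACTOR = 20.45482706 to convert to A/ps),
and it ignores constraint corrections to the degrees of freedom:

# Sketch: temperature from a NAMD binary .vel restart file plus per-atom masses.
# Assumptions: the file is an int32 atom count followed by N*3 float64 values
# in NAMD internal velocity units (multiply by PDBVELFACTOR for A/ps); degrees
# of freedom are taken as 3N, i.e. constraints (SETTLE waters etc.) are ignored.
import numpy as np

KB = 0.0019872041                 # kcal/(mol K)
PDBVELFACTOR = 20.45482706        # NAMD internal velocity units -> A/ps
AMU_A2_PS2_TO_KCAL = 1.0 / 418.4  # amu*A^2/ps^2 -> kcal/mol

def read_namd_vel(filename):
    """Read a NAMD binary velocity restart file; return an (N, 3) array in A/ps."""
    with open(filename, "rb") as fh:
        natoms = np.fromfile(fh, dtype=np.int32, count=1)[0]
        vel = np.fromfile(fh, dtype=np.float64, count=3 * natoms)
    return vel.reshape(natoms, 3) * PDBVELFACTOR

def temperature(velocities, masses):
    """Instantaneous temperature in K (no constraint correction)."""
    ke = 0.5 * np.sum(masses[:, None] * velocities**2) * AMU_A2_PS2_TO_KCAL
    return 2.0 * ke / (3 * len(masses) * KB)

# usage sketch: masses (amu) would come from the prmtop/PSF, e.g. via ParmEd
# vel = read_namd_vel("run10ns.restart.vel")
# print(temperature(vel, masses))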

For reference, I was using 1 fs / 2 fs / 4 fs multi-timestepping with only
the waters being rigid (SETTLE). The system size was ~101,000 atoms. I was
also using pressure control at 1 atm.

Now, this being said, every 10 ns the system was restarted, but the
velocities were not rescaled: they were taken from the restart velocity
file along with the coordinates and extended-system information. I think
the random seed was even the same at all restarts (which was probably
stupid, but not relevant to this discussion), so I imagine this should have
been the same as just running one long 60 ns simulation.

Is your system considerably smaller than mine? Perhaps the error creeps up
more slowly with more particles.

~Aron

On Fri, Feb 24, 2012 at 1:51 AM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

Hi experts,

we have a little issue here. We use NAMD on CPU and GPU with the Amber FF.
The systems run fine on CPU, but show a consistent increase in temperature
when running on GPUs. Is that a known problem? What can we do about it? The
rise is ca. 25 K over 40 ns, so long simulations cannot be done without
many rescales.

Any ideas?

PS: The system does not contain fixed atoms.

  --
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo

