Re: Re: Inconsistency between NAMD 2.11 and NAMD 2.12

From: Bryan Roessler (roessler_at_uab.edu)
Date: Fri Feb 03 2017 - 14:47:27 CST

Spoke too soon: the system temperature started to rise again and the atoms
became unstable. I'll stick with "useCUDA2 no" until I hear of a resolution
to this problem. It's a shame, since the 3x performance boost sounds great
in theory.

My main worry is that researchers starting simulations on systems that they
are unfamiliar with might not recognize the problem and might proceed with
inaccurate results. I would recommend that the new kernel be offered as an
opt-in option rather than turned on by default.
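
One way to catch this early is to watch the TEMP column in the NAMD log
rather than wait for the run to fail. Below is a minimal Python sketch, not
anything shipped with NAMD. It assumes the log contains the usual ETITLE: and
ENERGY: lines; the function names, the 310 K target, and the 15 K tolerance
are placeholders of mine that would need tuning per system.

```python
def temperature_series(log_lines):
    """Yield (timestep, temperature) pairs from NAMD log lines.

    Column positions are read from the ETITLE: header instead of being
    hard-coded, so ENERGY: lines seen before any header are skipped.
    """
    ts_col = temp_col = None
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ETITLE:":
            ts_col = fields.index("TS")
            temp_col = fields.index("TEMP")
        elif fields[0] == "ENERGY:" and temp_col is not None:
            yield int(fields[ts_col]), float(fields[temp_col])


def first_drift(log_lines, target_k, tolerance_k):
    """Return the first (timestep, temperature) outside target +/- tolerance.

    Returns None if every reported temperature stays within the band.
    """
    for step, temp in temperature_series(log_lines):
        if abs(temp - target_k) > tolerance_k:
            return step, temp
    return None


# Synthetic, heavily trimmed log lines for illustration only
# (a real NAMD log has many more columns per ENERGY: line).
sample = [
    "ETITLE: TS BOND TEMP",
    "ENERGY: 0 1.0 310.2",
    "ENERGY: 100 1.0 312.0",
    "ENERGY: 200 1.0 336.5",
]
drift = first_drift(sample, target_k=310.0, tolerance_k=15.0)
```

On a real run the same function can be given an open file handle, for
example first_drift(open("run.log"), 310.0, 15.0), where run.log is a
hypothetical log file name. A fixed tolerance is a crude check, since normal
temperature fluctuations depend on system size, so treat a hit as a prompt
to look at the trajectory rather than as proof of a problem.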

I'm using three GTX 980s, in case the hardware makes a difference in
tracking down the problem.

*Bryan Roessler | Graduate Research Assistant*
UAB | The University of Alabama at Birmingham
*uab.edu/cmdb <http://uab.edu/cmdb>*
Knowledge that will change your world

On Fri, Feb 3, 2017 at 2:17 PM, Bryan Roessler <roessler_at_uab.edu> wrote:

> FWIW, it appears that this issue has been resolved in the nightlies
> (2017-02-03).
>
> On Tue, Jan 31, 2017 at 10:12 AM, Bryan Roessler <roessler_at_uab.edu> wrote:
>
>> OK, it appears the issue is with the new nonbonded CUDA kernel in 2.12.
>>
>> Passing the "useCUDA2 no" option in the configuration file has fixed the
>> issue.
>>
>> Thanks,
>> Bryan
>>
>> On Tue, Jan 31, 2017 at 12:52 AM, Bryan Roessler <roessler_at_uab.edu>
>> wrote:
>>
>>> Hello,
>>>
>>> I've been having problems with my free MD simulations in 2.12 and 2.12b1
>>> that do not happen in 2.11. According to my logs, the last known good
>>> build that I used was the 2015-10-29 nightly.
>>>
>>> I am using the CUDA-enabled Linux builds.
>>>
>>> The problem is that as soon as I begin (or restart) a simulation using
>>> one of the affected builds, a large depression forms in the solvation
>>> box (it's not a PBC artifact; there is plenty of padding) and my system
>>> temperature rises from ~310 K to ~335 K (and higher). Eventually the
>>> simulation fails with RATTLE constraint failures on some of the exterior
>>> hydrogens of my protein.
>>>
>>> I have tried restarting stable ~20 ns simulations from the 2015-10-29
>>> build, and they will usually fail within 100-1000 timesteps on 2.12 or
>>> 2.12b1. If I restart the simulations in 2.11, they proceed perfectly. I
>>> thought that there might be some compatibility issue between the binary
>>> files, so I've also exported the restart files as a PDB in VMD and
>>> reinitialized my temperatures, but this hasn't helped.
>>>
>>> I thought that perhaps CUDA versioning was giving me problems, so I made
>>> sure to set LD_LIBRARY_PATH so that the cuda.so included with NAMD is
>>> the one that gets linked.
>>>
>>> I have also tried building NAMD myself, but I will need to fire up a VM,
>>> since there are compatibility issues between GCC 5.3 and CUDA 8.0 and I
>>> don't want to muck with my environment too much. I was hoping that maybe
>>> someone could shed some light on this problem before I go that route.
>>>
>>> The KISS in me says to just keep using 2.11 for the time being, but I'd
>>> very much like to utilize the GPU performance optimizations and I'm curious
>>> why this is happening. 2015-10-29 has decent performance too, without the
>>> associated problem, so I will likely continue to use that version going
>>> forward. I'd also be curious to find an FTP or some other site where I can
>>> download older nightly builds, so that I could narrow down when this
>>> problem was introduced.
>>>
>>> Thanks,
>>> Bryan
