Re: Unpredictably Crashes upon Restarting

From: Matthew Guberman-Pfeffer (matthew.guberman-pfeffer_at_yale.edu)
Date: Fri Mar 27 2020 - 18:00:32 CDT

Thanks for the information. I presume there are options for changing the
thermostat or setting the random seed. But what about the fact that, more
often than not, the simulation crashes somewhere between 14 and 19 ns
because of atom velocities that are too fast or some problem with the
margin? Why might this happen, and what can I do about it?

Best,
Matthew

On Fri, Mar 27, 2020 at 6:32 PM Victor Kwan <vkwan8_at_uwo.ca> wrote:

> Dear Matthew,
>
> If you use a stochastic thermostat, the trajectory will diverge very
> quickly unless you start each run with the same random seed.
>
>
>
> On Fri, Mar 27, 2020 at 5:57 PM Matthew Guberman-Pfeffer <
> matthew.guberman-pfeffer_at_yale.edu> wrote:
>
>> Dear NAMD community,
>>
>> I have restarted my simulation from the same point (at 13.8 ns) and end
>> up with a different outcome each time. Most times the simulation crashes
>> with an error message, but the messages are always slightly different. I
>> detail the messages and what I've tried below.
>>
>> 1) First restart: ran from 13.8 to 14.4 ns:
>>
>> ERROR: Atom 119 velocity is 63537.5 -53607.4 -29502 (limit is 12000,
>> atom 197 of 651 on patch 16 pe 9)
>> ERROR: Atom 127 velocity is -63953.9 53411.6 29213.9 (limit is 12000,
>> atom 200 of 651 on patch 16 pe 9)
>> ERROR: Atoms moving too fast; simulation has become unstable (2 atoms on
>> patch 16 pe 9).
>>
>> 2) Restarted from 13.8 ns, saving every 1 fs to the DCD to visualize the
>> issue. However, the simulation did not crash, and I was forced to
>> terminate the job at 18.9 ns because the DCD was consuming nearly a TB of
>> space.
>>
>> 3) Restarted from 13.8 ns, saving less frequently, hoping to repeat the
>> previous crash-free run while using less disk space. But at 16.3 ns I got
>> the error below:
>>
>> ERROR: Atom 125 velocity is -113436 -53195.6 -112022 (limit is 12000,
>> atom 448 of 632 on patch 14 pe 19)
>> ERROR: Atoms moving too fast; simulation has become unstable (1 atoms on
>> patch 14 pe 19).
>>
>> 4) Restarted from 13.8 ns again. Now it crashed at 15.1 ns with:
>>
>> ERROR: Atom 120 velocity is 20172.4 -16771.5 -221899 (limit is 12000,
>> atom 83 of 609 on patch 16 pe 9)
>> ERROR: Atom 127 velocity is -20158.6 16781.3 221594 (limit is 12000, atom
>> 93 of 609 on patch 16 pe 9)
>> ERROR: Atoms moving too fast; simulation has become unstable (2 atoms on
>> patch 16 pe 9).
>>
>> 5) Restarted from 13.8 ns. The simulation now crashed at 18.8 ns with:
>>
>> ERROR: Margin is too small for 1 atoms during timestep 18841762.
>> ERROR: Incorrect nonbonded forces and energies may be calculated!
>> ERROR: Atom 284 velocity is -17748.9 688.258 -64061.4 (limit is 12000,
>> atom 124 of 616 on patch 17 pe 22)
>> ERROR: Atoms moving too fast; simulation has become unstable (1 atoms on
>> patch 17 pe 22).
>>
>> I get the point that the simulation is unstable. But why does it become
>> unstable after 13+ ns? Why do the crash time and the error message vary
>> from one restart to the next? More importantly, what can I try in order
>> to resolve whatever problem is preventing this simulation from
>> continuing?
>>
>> Best,
>> Matthew
>>
>
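
One way to compare the crashes across restarts is to tally which atom IDs
show up in the velocity errors. The following is only a minimal Python
sketch, assuming the log lines have the "Atom N velocity is vx vy vz
(limit is ...)" form quoted above; the function names
(parse_velocity_errors, count_offenders) are hypothetical and not part of
NAMD. In the messages quoted in this thread, atom 127 appears in restarts
1 and 4, and the other flagged IDs (119, 120, 125) are numerically close
to it.

```python
import math
import re
from collections import Counter

# Matches "Atom <id> velocity is <vx> <vy> <vz>" as quoted in this thread.
# Assumption: plain decimal numbers (no exponent notation) in the log.
VELOCITY_RE = re.compile(
    r"Atom\s+(\d+)\s+velocity\s+is\s+"
    r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)"
)


def parse_velocity_errors(log_text):
    """Return [(atom_id, speed), ...] for each velocity error in log_text."""
    # Collapse whitespace so messages wrapped across lines still match.
    flat = " ".join(log_text.split())
    results = []
    for match in VELOCITY_RE.finditer(flat):
        atom_id = int(match.group(1))
        vx, vy, vz = (float(match.group(i)) for i in (2, 3, 4))
        speed = math.sqrt(vx * vx + vy * vy + vz * vz)
        results.append((atom_id, speed))
    return results


def count_offenders(log_texts):
    """Count, per atom ID, how many velocity errors appear across all logs."""
    counts = Counter()
    for text in log_texts:
        for atom_id, _speed in parse_velocity_errors(text):
            counts[atom_id] += 1
    return counts


if __name__ == "__main__":
    # Error text copied from restarts 1 and 4 in this thread.
    restart_1 = (
        "ERROR: Atom 119 velocity is 63537.5 -53607.4 -29502 (limit is 12000, "
        "atom 197 of 651 on patch 16 pe 9)\n"
        "ERROR: Atom 127 velocity is -63953.9 53411.6 29213.9 (limit is 12000, "
        "atom 200 of 651 on patch 16 pe 9)"
    )
    restart_4 = (
        "ERROR: Atom 120 velocity is 20172.4 -16771.5 -221899 (limit is 12000, "
        "atom 83 of 609 on patch 16 pe 9)\n"
        "ERROR: Atom 127 velocity is -20158.6 16781.3 221594 (limit is 12000, "
        "atom 93 of 609 on patch 16 pe 9)"
    )
    for atom_id, n in count_offenders([restart_1, restart_4]).most_common():
        print(atom_id, n)
```

If the same few atoms keep recurring, that region of the structure is a
natural place to look first when inspecting the trajectory.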
