AW: AW: AW: Martini RBCG simulation frequently crashing with "atoms moving too fast"

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Tue Nov 05 2013 - 03:04:53 CST

I just think that one of your GPUs might be broken. The fact that the problem occurs so randomly, could be attributed to little errors that unload with huge impact due the large timestep. Therefore it works better for smaller timesteps and regular all-atom md. If nobody jumps in here to tell this is a known problem with the CG forcefield or whatever, the most likely reason IMHO is a broken GPU. Try to exclude specific GPUs to find out if one is doing the problems or try cudamemcheck.

Norman Geist.

> -----Ursprüngliche Nachricht-----
> Von: eprates_at_iqm.unicamp.br [mailto:eprates_at_iqm.unicamp.br]
> Gesendet: Montag, 4. November 2013 16:59
> An: Norman Geist
> Cc: Namd Mailing List
> Betreff: Re: AW: AW: namd-l: Martini RBCG simulation frequently
> crashing with "atoms moving too fast"
>
> Norman, yes, I am using GPUs. Does this interfere?
>
> Erica
>
> Quoting Norman Geist <norman.geist_at_uni-greifswald.de>:
>
> > Are you using GPUs?
> >
> > Norman Geist.
> >
> >
> >> -----Ursprüngliche Nachricht-----
> >> Von: eprates_at_iqm.unicamp.br [mailto:eprates_at_iqm.unicamp.br]
> >> Gesendet: Montag, 4. November 2013 15:43
> >> An: Norman Geist
> >> Cc: Namd Mailing List
> >> Betreff: Re: AW: namd-l: Martini RBCG simulation frequently crashing
> >> with "atoms moving too fast"
> >>
> >> Hi,
> >>
> >> Norman, I have already checked the beads and could not find any
> >> problem. Actually, the identity of fast atoms vary each time the
> >> simulation crashes.
> >> The temperature I am using is 310 K. The simulation using ts = 5 fs
> >> also crashed after 2319000 steps.
> >>
> >> Xavier, I will consider your suggestion, but I want to insist more
> in
> >> NAMD, to which I am more used. Thanks.
> >>
> >>
> >> Regards
> >> Erica
> >>
> >>
> >>
> >>
> >> Quoting Norman Geist <norman.geist_at_uni-greifswald.de>:
> >>
> >> > Did you try to find the mentioned beads visually, to check if they
> >> > could be the problem? Make sure you didn't mix CG and non CG in
> your
> >> > simulation. What temperature are you using?
> >> >
> >> > Norman Geist.
> >> >
> >> >
> >> >> -----Ursprüngliche Nachricht-----
> >> >> Von: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
> Im
> >> >> Auftrag von eprates_at_iqm.unicamp.br
> >> >> Gesendet: Sonntag, 3. November 2013 18:57
> >> >> An: namd-l_at_ks.uiuc.edu
> >> >> Betreff: namd-l: Martini RBCG simulation frequently crashing with
> >> >> "atoms moving too fast"
> >> >>
> >> >>
> >> >> Dear all,
> >> >>
> >> >> I am running a Martini RBCG molecular dynamics simulation of a
> >> >> protein of about 550 residues in water. I have, first, minmized
> >> the
> >> >> system during 1000 steps. Then, the system have undergo a two
> >> phases
> >> >> relaxation period, in which, in the first part, the whole
> protein
> >> is
> >> >> frozen to don't move, while all the other atoms are free. In
> the
> >> >> second part of the relaxation, only the backbone beads are
> frozen.
> >> >> Subsequently, I have run the NPT simulation with 5 fs timestep,
> >> >> during 200,000 steps, with all the atoms free to move. After
> that,
> >> I
> >> >> have increased the timestep to 10 fs.
> >> >>
> >> >> The problem I am facing is that the simulation is frequently
> >> getting
> >> >> crashed
> >> >> with errors "atoms moving to fast", as the following:
> >> >>
> >> >>
> >> >>
> >> >> WRITING EXTENDED SYSTEM TO RESTART FILE AT STEP 382000
> >> >> WRITING COORDINATES TO DCD FILE AT STEP 382000
> >> >> WRITING COORDINATES TO RESTART FILE AT STEP 382000
> >> >> FINISHED WRITING RESTART COORDINATES
> >> >> WRITING VELOCITIES TO RESTART FILE AT STEP 382000
> >> >> FINISHED WRITING RESTART VELOCITIES
> >> >> LDB: ============= START OF LOAD BALANCING ==============
> 6755.22
> >> >> LDB: ============== END OF LOAD BALANCING ===============
> 6755.22
> >> >> Info: useSync: 1 useProxySync: 0
> >> >> LDB: =============== DONE WITH MIGRATION ================
> 6755.22
> >> >> LDB: ============= START OF LOAD BALANCING ==============
> 6756.1
> >> >> LDB: Largest compute 4049 load 0.000477 is 0.5% of average load
> >> >> 0.101946
> >> >> LDB: Average compute 0.000058 is 0.1% of average load 0.101946
> >> >> LDB: TIME 6756.1 LOAD: AVG 0.101946 MAX 0.131312 PROXIES:
> TOTAL
> >> 245
> >> >> MAXPE 35 MAXPATCH 1 None MEM: 99.7734 MB
> >> >> LDB: TIME 6756.1 LOAD: AVG 0.101946 MAX 0.131312 PROXIES:
> TOTAL
> >> 267
> >> >> MAXPE 46 MAXPATCH 2 RefineTorusLB MEM: 99.7734 MB
> >> >> LDB: TIME 6756.12 LOAD: AVG 0.101946 MAX 0.103445 PROXIES:
> TOTAL
> >> 272
> >> >> MAXPE 48 MAXPATCH 2 RefineTorusLB MEM: 99.7734 MB
> >> >> LDB: ============== END OF LOAD BALANCING ===============
> 6756.12
> >> >> Info: useSync: 1 useProxySync: 0
> >> >> LDB: =============== DONE WITH MIGRATION ================
> 6756.12
> >> >> ERROR: Atom 9646 velocity is -759.411 -118.145 -2502.89 (limit
> is
> >> >> 1200, atom 13 of 68 on patch 73 pe 0)
> >> >> ERROR: Atoms moving too fast; simulation has become unstable (1
> >> atoms
> >> >> on patch 73 pe 0).
> >> >> ERROR: Exiting prematurely; see error messages above.
> >> >> ====================================================
> >> >>
> >> >> WallClock: 6768.302246 CPUTime: 6768.302246 Memory: 99.773438
> MB
> >> >> Program finished.
> >> >>
> >> >>
> >> >>
> >> >> With ts=10 fs, the simulation stands not longer than 382000
> steps.
> >> I
> >> >> have also tried to simulate using ts=8 fs, but the simulation
> also
> >> >> crashed, after 1230000 steps. Now I set again ts=5 fs to see if
> I
> >> can
> >> >> run during some hundreds of nanoseconds.
> >> >> Anyway, I think this is a too short time step for a standard CG
> MD
> >> >> simulation, isn't it?
> >> >>
> >> >> Has anyone faced the same problem and could give me a hint
> about
> >> why
> >> >> it is happening? I send the configuration file described
> bellow.
> >> >>
> >> >> I appreciate any help.
> >> >> Best regards,
> >> >>
> >> >> Erica
> >> >>
> >> >>
> >> >>
> >> >> > #############################################################
> >> >> > ## JOB DESCRIPTION ##
> >> >> > #############################################################
> >> >> > #NPT lipoprotien system
> >> >> > #############################################################
> >> >> > ## ADJUSTABLE PARAMETERS ##
> >> >> > #############################################################
> >> >> > set inputname system-eq3
> >> >> > set outputname system-din
> >> >> > set restart 1
> >> >> >
> >> >> > set pvmode "p"
> >> >> > set temode "t"
> >> >> >
> >> >> > proc get_first_ts { xscfile } {
> >> >> > set fd [open $xscfile r]
> >> >> > gets $fd
> >> >> > gets $fd
> >> >> > gets $fd line
> >> >> > set ts [lindex $line 0]
> >> >> > close $fd
> >> >> > return $ts
> >> >> > }
> >> >> >
> >> >> > set temperature 310
> >> >> > cosAngles on
> >> >> >
> >> >> > structure cg-fixed.psf
> >> >> > bincoordinates $inputname.coor
> >> >> > coordinates cg-fixed.pdb
> >> >> > extendedSystem $inputname.xsc
> >> >> > binvelocities $inputname.vel
> >> >> >
> >> >> > # Nao entendi isso..
> >> >> > #if {$restart == 1} {
> >> >> > # bincoordinates $inputname.coor
> >> >> > # binvelocities $inputname.vel
> >> >> > # extendedSystem $inputname.xsc
> >> >> > # svim urrenttimestep [get_first_ts $inputname.xsc]
> >> >> > ## COMMotion yes
> >> >> > #} else {
> >> >> > # temperature $temperature
> >> >> > # set currenttimestep 0
> >> >> > #}
> >> >> >
> >> >> > # Do meu jeito
> >> >> >
> >> >> > #extendedsystem $inputname.xsc
> >> >> > restartname $outputname
> >> >> > firsttimestep 0
> >> >> > #temperature $temperature
> >> >> >
> >> >> >
> >> >> > #############################################################
> >> >> > ## SIMULATION PARAMETERS ##
> >> >> > #############################################################
> >> >> >
> >> >> > # Input
> >> >> > paraTypeCharmm on
> >> >> > parameters /home/erica/toppar/martini/martini-par/martini-
> protein-
> >> >> bonds.par
> >> >> > parameters
> >> >> > /home/erica/toppar/martini/martini-par/martini-protein-angles-
> >> cos.par
> >> >> > parameters
> >> >> > /home/erica/toppar/martini/martini-par/martini-protein-
> >> dihedrals.par
> >> >> > parameters /home/erica/toppar/martini/martini-par/martini-all-
> >> >> nonb.par
> >> >> > parameters
> >> >> > /home/erica/toppar/martini/martini-par/martini-lipids-bonds-
> >> angles-
> >> >> dihedrals.par
> >> >> >
> >> >> > # Force-Field Parameters
> >> >> > exclude 1-2
> >> >> > 1-4scaling 1.0
> >> >> > cutoff 12.0
> >> >> > martiniSwitching on
> >> >> > PME off
> >> >> > switching on
> >> >> > switchdist 9.0
> >> >> > pairlistdist 14
> >> >> > dielectric 15.0
> >> >> >
> >> >> >
> >> >> > # Integrator Parameters
> >> >> > timestep 10.0 #was 25
> >> >> > nonbondedFreq 1
> >> >> > stepspercycle 10
> >> >> >
> >> >> > #Constraints and restraints
> >> >> >
> >> >> > #if {0} {
> >> >> > #constraints on
> >> >> > #consref .pdb
> >> >> > #conskfile .ref
> >> >> > #conskcol B
> >> >> > #}
> >> >> >
> >> >> > #if {0} {
> >> >> > #fixedAtoms on
> >> >> > #fixedAtomsFile file
> >> >> > #fixedAtomsCol O
> >> >> > #}
> >> >> >
> >> >> > #fixedAtoms on
> >> >> > #fixedAtomsFile cg-fixed-fixedatoms-eq2.pdb
> >> >> > #fixedAtomsCol B
> >> >> >
> >> >> > # Constant Temperature Control
> >> >> > if {$temode == "t"} {
> >> >> > langevin on ;# do langevin dynamics
> >> >> > langevinDamping 1 ;# damping coefficient(gamma)5/ps
> >> >> > langevinTemp $temperature
> >> >> > langevinHydrogen off ;# don't couple langevin bath to
> >> hydrogens
> >> >> > }
> >> >> >
> >> >> >
> >> >> > # Periodic Boundary Conditions
> >> >> >
> >> >> > #if {1} {
> >> >> > #cellBasisVector1 106.0 0 0
> >> >> > #cellBasisVector2 0 106.0 0
> >> >> > #cellBasisVector3 0 0 106.0
> >> >> > #cellOrigin 52.5 52.5 52.5
> >> >> > #}
> >> >> > wrapAll on
> >> >> >
> >> >> > #
> >> >> > #PME yes
> >> >> > #PMEGridSizeX 256
> >> >> > #PMEGridSizeY 256
> >> >> > #PMEGridSizeZ 256
> >> >> >
> >> >> >
> >> >> > #margin 5.0
> >> >> >
> >> >> > # Constant Pressure Control (variable volume)
> >> >> > useGroupPressure no # no hydrogens in CG hence set this
> to
> >> no
> >> >> > inspite of 1 fs step
> >> >> > #useFlexibleCell yes
> >> >> > #useConstantArea no
> >> >> > #useConstantRatio yes
> >> >> >
> >> >> > if {$pvmode == "p"} {
> >> >> > langevinPiston yes
> >> >> > langevinPistonTarget 1.01325 ;# in bar -> 1 atm
> >> >> > #one may need to bump up the pressure constants at first
> >> >> > langevinPistonPeriod 2000. #usually 2000
> >> >> > langevinPistonDecay 1000. #usually 1000
> >> >> > langevinPistonTemp $temperature
> >> >> > }
> >> >> >
> >> >> >
> >> >> > # Output
> >> >> > outputName $outputname
> >> >> >
> >> >> > restartfreq 1000
> >> >> > dcdfreq 1000
> >> >> > xstFreq 1000
> >> >> > outputEnergies 1000
> >> >> > outputPressure 1000
> >> >> >
> >> >> >
> >> >> > #############################################################
> >> >> > ## EXTRA PARAMETERS ##
> >> >> > #############################################################
> >> >> >
> >> >> >
> >> >> > #############################################################
> >> >> > ## EXECUTION SCRIPT ##
> >> >> > #############################################################
> >> >> >
> >> >> >
> >> >> >
> >> >> > # Minimization
> >> >> > #if {$restart == 0} {
> >> >> > #minimize 20000
> >> >> > #reinitvels $temperature
> >> >> > #}
> >> >> >
> >> >> > run 5000000
> >> >> > #%set totsimtime 5000000
> >> >> > #%run [expr $totsimtime - $currenttimestep]
> >> >> > #
> >> >>
> >> >
> >> >
> >> >
> >> > Scanned and tagged with DSPAM 3.10.2 by Instituto de Quimica -
> >> Unicamp
> >> >
> >> >
> >>
> >
> >
> >
> > Scanned and tagged with DSPAM 3.10.2 by Instituto de Quimica -
> Unicamp
> >
> > !DSPAM:105,5277b6f171151605916098!
>

This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:21:52 CST