Re: ABF Simulation

From: Lionel Perrin (lionel.perrin_at_cea.fr)
Date: Tue Jun 27 2006 - 03:47:16 CDT

Hi Tyler,

I still have the same problem, but the interesting point is that we are
also using the CentOS distribution on bi-Xeon stations/cluster. I tried
to reinstall Tcl but it does not change anything.

So far I have no solution to solution or trick to give you,

All the best,

Lionel

Le mar 27/06/2006 à 02:45, Tyler Luchko a écrit :
> Hello,
>
> Has the issue with the unknown floating-point error been resolved? I
> have been running into the same problem using TclForces rather than
> of ABF. I have, however, been using Jerome's vectors.tcl. I also
> added the {}s to this file as suggested previously in this thread.
>
> The error is completely unpredictable (occurs anywhere from the first
> few time steps until 10000+) and happens where ever there is an
> [expr ...] (not just vectors.tcl). A sample error looks like:
>
> Info: Pairlistdist is too small for 74 computes during timestep 8589.
> Info: 74 pairlist warnings in past 1 steps.
> TCL: unknown floating-point error, errno = 4
> FATAL ERROR: unknown floating-point error, errno = 4
> while executing
> "expr {sqrt($retval)}"
> (procedure "veclength" line 6)
> invoked from within
> "veclength $pos2D"
> (procedure "calcforces" line 68)
> invoked from within
> "calcforces"
>
> I have also used a catch statement to catch the error in one part of
> the code and print the offending variables. These, however, look
> normal.
>
> The final interesting fact is that the problem is platform
> dependent. On our AMD64 cluster running Scientific Linux everything
> is fine. On another cluster running CentOS with Xeon 3.0 GHz CPUs I
> see the problems stated above.
>
> Any suggestions would be greatly appreciated.
>
> Thank you,
>
> Tyler
>
> ________________________________________________________________
> Tyler Luchko
> Ph.D. Candidate
> Department of Physics, University of Alberta
> Theory and Modeling, National Institute for Nanotechnology
> Edmonton, Alberta, Canada
> tluchko_at_ualberta.ca
> 780-492-5519 <- NEW
>
>
> On 2-Nov-05, at 7:28 AM, Lionel Perrin wrote:
>
> > Hi Jerome,
> >
> > thank's lot once more, the simulation is now running with your
> > modified
> > vectors.tcl.
> > I will tell you the outcome of this try.
> >
> > all the best,
> >
> > Lionel
> >
> > Le ven 28/10/2005 à 11:05, Jérôme Hénin a écrit :
> >> Lionel, Jim,
> >>
> >> I discussed the issue with a few core Tcl developpers. This seems
> >> to be
> >> related to corrupted data ending up in the variables used by ABF.
> >> As one of
> >> these people put it: "something out-of-thread is fouling your
> >> state". Jim, do
> >> you have any idea where this could come from?
> >>
> >> You'll notice that these errors appear in the vector routines
> >> defined in
> >> vector.tcl, which is borrowed to VMD. In that file, expr
> >> statements are not
> >> protected by curly braces. The Tcl guys recommend that the
> >> arguments to expr
> >> always be enclosed in braces. Doing this might prevent crashes,
> >> but if the
> >> data is really corrupted, I suppose some erroneous behavior is to
> >> be expected
> >> anyway.
> >>
> >> Lionel, I attach a version of vectors.tcl with curly braces
> >> inserted where
> >> applicable, so that you can try and see if it solves anything.
> >>
> >> Jerome
> >>
> >>
> >> On Wednesday 26 October 2005 17:52, Lionel Perrin wrote:
> >>> Jerome,
> >>>
> >>> I have changed the indexes and restarted the simulation,
> >>> after 1.6ns of propagation, I got a floating-point error :
> >>>
> >>> WRITING EXTENDED SYSTEM TO RESTART FILE AT STEP 1618000
> >>> WRITING COORDINATES TO DCD FILE AT STEP 1618000
> >>> WRITING COORDINATES TO RESTART FILE AT STEP 1618000
> >>> FINISHED WRITING RESTART COORDINATES
> >>> WRITING VELOCITIES TO RESTART FILE AT STEP 1618000
> >>> FINISHED WRITING RESTART VELOCITIES
> >>> LDB: LOAD: AVG 12.143 MAX 12.3366 MSGS: TOTAL 78 MAXC 12 MAXP
> >>> 4 None
> >>> LDB: LOAD: AVG 12.143 MAX 12.3366 MSGS: TOTAL 78 MAXC 12 MAXP 4
> >>> Refine
> >>> TCL: unknown floating-point error, errno = 4
> >>> FATAL ERROR: unknown floating-point error, errno = 4
> >>> while executing
> >>> "expr -$i"
> >>> (procedure "vecinvert" line 4)
> >>> invoked from within
> >>> "vecinvert $F2"
> >>> (in namespace eval "::ABF::ABFcoord" script line 12)
> >>> invoked from within
> >>> "namespace eval ABFcoord {
> >>>
> >>> set dr [vecsub $coords($abf2) $coords($abf1)]
> >>> set r [veclength $dr]
> >>> set nv [vecnorm $dr] ;# unity vector abf1 -> abf2
> >>>
> >>> ..."
> >>> (procedure "ABFapply" line 5)
> >>> invoked from within
> >>> "ABFapply $F"
> >>> (in namespace eval "::ABF" script line 84)
> >>> invoked from within
> >>> "namespace eval ::ABF {
> >>>
> >>> # First timestep : we don't have forces
> >>> if { $timestep == 0 } {
> >>>
> >>> # must not be equal to $timestep - 1
> >>> set timeStored -2
> >>> ..."
> >>> (procedure "calcforces" line 2)
> >>> invoked from within
> >>> "calcforces"
> >>>
> >>> here is my ABF input section (the rest of my input has remained
> >>> unchanged with respect to my previous trials) :
> >>>
> >>> source /usr/local/NAMD_2.6b1_Linux-i686/lib/abf/abf.tcl
> >>> abf coordinate distance
> >>> abf abf1 2472
> >>> abf abf2 3233
> >>> abf ximin 3.0
> >>> abf ximax 13.0
> >>> abf dxi 0.1
> >>> abf dsmooth 0.2
> >>> abf forceconst 10.0
> >>> abf fullsamples 500
> >>> abf outfile 1J8A_abf.abf
> >>> abf historyfile 1J8A_abf.his
> >>> abf outputfreq 5000
> >>> abf writexifreq 1000
> >>> abf distfile 1J8A_abf.dis
> >>>
> >>> sincerly,
> >>>
> >>> Lionel
> >>>
> >>> Le ven 21/10/2005 à 11:03, Lionel Perrin a écrit :
> >>>> Jerome,
> >>>>
> >>>> Thank you for the answer, I knew about the index issue between
> >>>> NAMD and
> >>>> VMD but I did not pay attention that time ! :-S
> >>>> I have restarted the simulation with the "good" indexes and will
> >>>> tell
> >>>> you the outcome of the ABF simulation.
> >>>>
> >>>> thank's a lot,
> >>>>
> >>>> Lionel
> >>>>
> >>>> Le mer 19/10/2005 à 11:11, Jérôme Hénin a écrit :
> >>>>> Lionel,
> >>>>>
> >>>>> I believe that part of the problem lies in the atom indexes you
> >>>>> pass to
> >>>>> ABF when defining the reaction coordinate. Your files indicate
> >>>>> that you
> >>>>> are using atom indexes given by VMD, which start at 0. However,
> >>>>> NAMD
> >>>>> uses the PDB convention for atom indexes, which starts at 1, so
> >>>>> the
> >>>>> atoms you pass to ABF are not the ones you intended. This is a
> >>>>> very
> >>>>> common pitfall that probably deserves to be more publicized in the
> >>>>> community of NAMD+VMD users.
> >>>>>
> >>>>> I don't have a clear idea of why you get that Tcl floating-
> >>>>> point error,
> >>>>> though. Since one of the atoms is a hydrogen, it might be more
> >>>>> prone to
> >>>>> numerical instability than a heavy atom, but this is not really a
> >>>>> satisfactory explanation to me. And since you are not constraining
> >>>>> protein hydrogens, issues involving constraints can be ruled out.
> >>>>>
> >>>>> If anyone else experiences similar crashes, please tell me
> >>>>> about it!
> >>>>>
> >>>>> Jerome
> >>>>>
> >>>>> On Thursday 06 October 2005 16:13, Lionel Perrin wrote:
> >>>>>> Chris,
> >>>>>>
> >>>>>> The two atoms defining the reaction coordination belong to two
> >>>>>> distinct molecules, X1 corresponds to the Cgamma of an
> >>>>>> aspartate and
> >>>>>> X2 corresponds to the C atom of the amidine function of a ligand.
> >>>>>> Hence, these two centers are not "chemically" bonded.
> >>>>>>
> >>>>>> In this case, should I apply a force constant at the border of
> >>>>>> the
> >>>>>> reaction coordinate and/or and external restraint ?
> >>>>>>
> >>>>>> Lionel
> >>>>>>
> >>>>>> Le mer 05/10/2005 à 14:58, Chris Chipot a écrit :
> >>>>>>> Lionel,
> >>>>>>>
> >>>>>>> could you tell us what atoms are involved in your reaction
> >>>>>>> coordinate ? Are these atoms chemically bonded to constrained
> >>>>>>> degrees of freedom ?
> >>>>>>>
> >>>>>>>
> >>>>>>> Chris Chipot
> > --
> > !------------------------------------------!
> > Lionel PERRIN
> > Chargé de Recherche au CNRS
> > DSV/DBJC/SBFM, URA 2096 du CNRS
> > CEA-Saclay, build. 528, office #215
> > tel : (+33) (-0)1 69 08 96 81
> > fax : (+33) (-0)1 69 08 40 07
> > 91191 Gif-sur-Yvette Cedex France
> > !------------------------------------------!
> >

-- 
!------------------------------------------!
 Lionel PERRIN
 Chargé de Recherche au CNRS
 DSV/DBJC/SBFM, URA 2096 du CNRS
 CEA-Saclay, build. 528, office #215
 tel : (+33) (-0)1 69 08 96 81
 fax : (+33) (-0)1 69 08 40 07
 91191 Gif-sur-Yvette Cedex France
!------------------------------------------!

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:42:16 CST