Problem with TclBC when run in more than one processor

From: fett_at_vtr.net
Date: Wed Jun 04 2008 - 12:08:57 CDT

Dear NAMD developers:

I am trying to implement a time-varying electric field with the NAMD TclBC
module (I am using TclBC because I am applying a force to all atoms of my
system). I made a test run with a small water box (401 molecules, 1203
atoms). The simulation performed very well on a single processor and gave
the desired results, so I thought the TclBC script was correct. But when I
run the same system on a single multiprocessor machine (8 cores), I obtain
the following error:

FATAL ERROR: unknown floating-point error, errno = 4
     while executing
"expr $E_t*$charge"
     (procedure "calcforces" line 17)
     invoked from within
"calcforces 71526 1 "
------------- Processor 2 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: unknown floating-point error, errno = 4
     while executing
"expr $E_t*$charge"
     (procedure "calcforces" line 17)
     invoked from within
"calcforces 71526 1 "

Stack Traceback:
   [0] _ZN12ComputeTclBC6doWorkEv+0xa0 [0x820fd84]
   [1] _ZN11WorkDistrib11enqueueWorkEP12LocalWorkMsg+0x19 [0x82de09d]
   [2] _ZN19CkIndex_WorkDistrib30_call_enqueueWork_LocalWorkMsgEPvP11WorkDistrib+0x11 [0x82de07d]
   [3] CkDeliverMessageFree+0x29 [0x8322f7d]
   [4] _Z15_processHandlerPvP11CkCoreState+0x417 [0x83225bf]
   [5] CmiHandleMessage+0x1d [0x837c885]
   [6] _ZN7BackEnd4initEiPPc+0x295 [0x80e2f89]
   [7] main+0x30 [0x80deff4]
   [8] __libc_start_main+0xd3 [0xf7ea2de3]
   [9] _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_c+0x59 [0x80dba71]
Fatal error on PE 2> FATAL ERROR: unknown floating-point error, errno = 4
     while executing
"expr $E_t*$charge"
     (procedure "calcforces" line 17)
     invoked from within
"calcforces 71526 1

Evidently this error is related to the use of multiple processors. Maybe my
test system is too small, but I would like to know whether there is an error
in my TclBC script or whether there is some kind of bug in NAMD when TclBC is
used on more than one processor.
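
On Linux, errno 4 is EINTR (interrupted system call), which hints that the
failing expr in the log above may have picked up a stale errno rather than hit
a genuine arithmetic fault. Under that assumption, one possible workaround is
to catch the error and retry the multiplication. The helper below is only a
sketch: safe_product and max_tries are hypothetical names, not part of NAMD or
TclBC, and this is not a confirmed fix.

```tcl
# Hypothetical helper (not a NAMD command): multiply two numbers,
# retrying a few times if expr raises an error.
proc safe_product {a b {max_tries 5}} {
    for {set i 0} {$i < $max_tries} {incr i} {
        # catch returns 0 on success and stores the value in result
        if {![catch {expr {$a * $b}} result]} {
            return $result
        }
    }
    # Every attempt failed; re-raise with the last error message
    error "expr failed $max_tries times: $result"
}
```

Inside calcforces, the failing line could then read
set force_ef [safe_product $E_t $charge].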

The details of my system are:

Info: STRUCTURE SUMMARY:
Info: 1203 ATOMS
Info: 802 BONDS
Info: 401 ANGLES
Info: 0 DIHEDRALS
Info: 0 IMPROPERS
Info: 0 CROSSTERMS

And the TclBC script is:

proc calcforces {step unique} {

    global Eo w dt
    set t [expr ($step*$dt)*0.001]
    set wt [expr $w*$t]
    set E_t [expr $Eo*cos($wt)]

    if { $step % 50 == 0 } {
        if { $unique } {
            print "step $step eField intensity = $E_t"
        }
    }

    while {[nextatom]} {
        set charge [getcharge]
        if { $charge == 0 } {
            dropatom
            continue
        }
        set force_ef [expr $E_t*$charge]
        addforce "0.0 0.0 $force_ef"

    }

}

where Eo, w, and dt are defined in the .conf file for the simulation. I have
also attached the .conf file for this simulation.
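
The attached .conf file is not reproduced here, so the fragment below is only
a guess at how the globals might be set. It assumes they are defined inside the
tclBCScript block, so that they exist in the interpreter where calcforces runs.
All numeric values and the script file name are placeholders, not the author's
settings.

```tcl
# Hypothetical fragment of a NAMD configuration file.
# Every value and the sourced file name are placeholders.
tclBC on
tclBCScript {
    # Globals read by calcforces through "global Eo w dt"
    set Eo 1.0    ;# field amplitude (assumed value and units)
    set w  0.5    ;# angular frequency (assumed value)
    set dt 2.0    ;# time step, assumed to match "timestep" in this config
    source eF_calcforces.tcl   ;# hypothetical file defining calcforces
}
```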

I hope somebody can give me a hint, because I want to simulate a much bigger
system (20000 atoms) with this time-varying electric field, and running it on
only one processor would be terribly slow. By the way, I don't know whether
the [getcharge] command can be used with TclForces instead of TclBC; if it
can, I could try TclForces rather than TclBC.

Regards

Jose Antonio

P.S.

This is the command that I'm using to run the simulations:

nohup /usr/local/NAMD_2.6_Linux-i686/charmrun \
    /usr/local/NAMD_2.6_Linux-i686/namd2 ++local +p8 eF_water_test.conf \
    > ef_nvt_1.out &

