NAMD job terminates after ~10000+ iterations

From: Sangamesh B (forum.san_at_gmail.com)
Date: Tue Dec 23 2008 - 00:29:25 CST

Hello NAMD users,

     I'm benchmarking NAMD 2.6, built with the Intel 10 compilers on a
Rocks 4.3 Linux cluster (Intel Xeon, 3.6 GHz), using an input file
provided by one of our customers.
I've tried running NAMD with both builds of Charm++, i.e. with MPI
and without MPI.

In each case the job fails with the following error:

WRITING COORDINATES TO DCD FILE AT STEP 14000
LDB: LOAD: AVG 2.19086 MAX 2.72057 MSGS: TOTAL 161 MAXC 10 MAXP 7 None
LDB: LOAD: AVG 2.19086 MAX 2.23459 MSGS: TOTAL 161 MAXC 10 MAXP 7 Refine
ENERGY: 15000 109.2174 351.9875 167.0691
17.0719 -96871.8411 9005.0313 0.0000
0.0000 16199.7877 -71021.6761 300.6049
-70992.4608 -70992.0088 300.8596 -20865.8855
102.9266 265795.4607 -2.6098 -3.2279

WRITING COORDINATES TO DCD FILE AT STEP 15000
Stack Traceback:
  [0] CmiAbort+0x51 [0x7a956c]
  [1] __cmi_assert+0x47 [0x7b51f3]
  [2] /opt/apps/namd26_intel/Linux-x86_64-netIB-gnu/namd2 [0x7ae963]
  [3] infi_CmiAlloc+0x16 [0x7ae823]
  [4] CmiAlloc+0x16 [0x7b4663]
  [5] CqsPrioqGetDeq+0xff [0x7b6f4b]
  [6] CqsEnqueueGeneral+0xae [0x7b7525]
  [7] CldHandler+0xa7 [0x4c06d5]
  [8] CmiHandleMessage+0x76 [0x7b2512]
  [9] CsdScheduleForever+0x5f [0x7b276b]
  [10] CsdScheduler+0x16 [0x7b26e4]
  [11] _ZN7BackEnd4initEiPPc+0x1a0 [0x4c6ea0]
  [12] main+0x19 [0x4c3ce9]
  [13] __libc_start_main+0xdb [0x3021d1c3fb]
  [14] __gxx_personality_v0+0x12a [0x4c058a]

The detailed output is pasted below:

$ cat out.60.Namd-Net-gnu24
Charmrun> IBVERBS version of charmrun
Charm++: scheduler running in netpoll mode.
Info: NAMD 2.6 for Linux-x86_64-netIB-gnu
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: and send feedback or bug reports to namd_at_ks.uiuc.edu
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 50914 for net-linux-x86_64-ibverbs
Info: Built Mon Dec 22 17:45:48 IST 2008 by root on rapideye.igib.res.in
Info: 1 NAMD 2.6 Linux-x86_64-netIB-gnu 20 compute-0-5.local locuz
Info: Running on 20 processors.
Info: 49816 kB of memory in use.
Info: Memory usage based on mallinfo
Info: Configuration file is npt03.inp
TCL: Suspending until startup complete.
Info: EXTENDED SYSTEM FILE npt02.xsc
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 1
Info: NUMBER OF STEPS 3000000
Info: STEPS PER CYCLE 8
Info: PERIODIC CELL BASIS 1 63.1779 0 0
Info: PERIODIC CELL BASIS 2 0 59.2372 0
Info: PERIODIC CELL BASIS 3 0 0 71.0325
Info: PERIODIC CELL CENTER 0 0 0
Info: LOAD BALANCE STRATEGY Other
Info: LDB PERIOD 1600 steps
Info: FIRST LDB TIMESTEP 40
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MAX SELF PARTITIONS 50
Info: MAX PAIR PARTITIONS 20
Info: SELF PARTITION ATOMS 125
Info: PAIR PARTITION ATOMS 200
Info: PAIR2 PARTITION ATOMS 400
Info: MIN ATOMS PER PATCH 100
Info: VELOCITY FILE npt02.vel
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC 1
Info: EXCLUDE SCALED ONE-FOUR
Info: 1-4 SCALE FACTOR 1
Info: DCD FILENAME npt03.dcd
Info: DCD FREQUENCY 1000
Info: DCD FIRST STEP 1000
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME npt03.xst
Info: XST FREQUENCY 10000
Info: NO VELOCITY DCD OUTPUT
Info: OUTPUT FILENAME npt03
Info: BINARY OUTPUT FILES WILL BE USED
Info: RESTART FILENAME npt03
Info: RESTART FREQUENCY 10000
Info: BINARY RESTART FILES WILL BE USED
Info: SWITCHING ACTIVE
Info: SWITCHING ON 12
Info: SWITCHING OFF 13.5
Info: PAIRLIST DISTANCE 15
Info: PAIRLIST SHRINK RATE 0.01
Info: PAIRLIST GROW RATE 0.01
Info: PAIRLIST TRIGGER 0.3
Info: PAIRLISTS PER CYCLE 2
Info: PAIRLISTS ENABLED
Info: MARGIN 0.525
Info: HYDROGEN GROUP CUTOFF 2.5
Info: PATCH DIMENSION 18.025
Info: ENERGY OUTPUT STEPS 1000
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: TIMING OUTPUT STEPS 10000
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE 300
Info: LANGEVIN DAMPING COEFFICIENT IS 1 INVERSE PS
Info: LANGEVIN DYNAMICS APPLIED TO HYDROGENS
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info: TARGET PRESSURE IS 1 BAR
Info: OSCILLATION PERIOD IS 200 FS
Info: DECAY TIME IS 500 FS
Info: PISTON TEMPERATURE IS 300 K
Info: PRESSURE CONTROL IS GROUP-BASED
Info: INITIAL STRAIN RATE IS -1.51206e-05 -1.51206e-05 -1.51206e-05
Info: CELL FLUCTUATION IS ISOTROPIC
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE 1e-06
Info: PME EWALD COEFFICIENT 0.227942
Info: PME INTERPOLATION ORDER 4
Info: PME GRID DIMENSIONS 80 80 80
Info: PME MAXIMUM GRID SPACING 1.5
Info: Attempting to read FFTW data from FFTW_NAMD_2.6_Linux-x86_64-netIB-gnu.txt
Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
Info: Writing FFTW data to FFTW_NAMD_2.6_Linux-x86_64-netIB-gnu.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 4
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RIGID BONDS TO HYDROGEN : ALL
Info: ERROR TOLERANCE : 1e-08
Info: MAX ITERATIONS : 100
Info: RIGID WATER USING SETTLE ALGORITHM
Info: NONBONDED FORCES EVALUATED EVERY 2 STEPS
Info: RANDOM NUMBER SEED 1230007678
Info: USE HYDROGEN BONDS? NO
Info: COORDINATE PDB abp_8776wat.pdb
Info: STRUCTURE FILE abp_8776wat.psf
Info: PARAMETER file: CHARMM format!
Info: PARAMETERS par_all22_prot.inp
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Info: BINARY COORDINATES npt02.coor
Info: SUMMARY OF PARAMETERS:
Info: 139 BONDS
Info: 345 ANGLES
Info: 452 DIHEDRAL
Info: 43 IMPROPER
Info: 0 CROSSTERM
Info: 95 VDW
Info: 0 VDW_PAIRS
Warning: Ignored 8776 bonds with zero force constants.
Warning: Will get H-H distance in rigid H2O from H-O-H angle.
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 26958 ATOMS
Info: 18185 BONDS
Info: 9914 ANGLES
Info: 1647 DIHEDRALS
Info: 116 IMPROPERS
Info: 0 CROSSTERMS
Info: 0 EXCLUSIONS
Info: 26636 RIGID BONDS
Info: 54238 DEGREES OF FREEDOM
Info: 9098 HYDROGEN GROUPS
Info: TOTAL MASS = 162683 amu
Info: TOTAL CHARGE = 5.02914e-07 e
Info: *****************************
Info: Entering startup phase 0 with 52292 kB of memory in use.
Info: Entering startup phase 1 with 52292 kB of memory in use.
Info: Entering startup phase 2 with 54412 kB of memory in use.
Info: Entering startup phase 3 with 54624 kB of memory in use.
Info: PATCH GRID IS 3 (PERIODIC) BY 3 (PERIODIC) BY 3 (PERIODIC)
Info: REMOVING COM VELOCITY -0.02348 -0.0695335 -0.0196877
Info: LARGEST PATCH (6) HAS 1041 ATOMS
Info: CREATING 4376 COMPUTE OBJECTS
Info: Entering startup phase 4 with 56344 kB of memory in use.
Info: PME using 20 and 20 processors for FFT and reciprocal sum.
Info: PME GRID LOCATIONS: 0 1 2 3 4 5 6 7 8 9 ...
Info: PME TRANS LOCATIONS: 0 1 2 3 4 5 6 7 8 9 ...
Info: Optimizing 4 FFT steps. 1... 2... 3... 4... Done.
Info: Entering startup phase 5 with 56344 kB of memory in use.
Info: Entering startup phase 6 with 56344 kB of memory in use.
Measuring processor speeds... Done.
Info: Entering startup phase 7 with 56344 kB of memory in use.
Info: CREATING 4376 COMPUTE OBJECTS
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: NONBONDED TABLE SIZE: 769 POINTS
Info: ABSOLUTE IMPRECISION IN FAST TABLE FORCE: 1.05879e-22 AT 13.4141
Info: RELATIVE IMPRECISION IN FAST TABLE FORCE: 2.02133e-16 AT 13.4141
Info: ABSOLUTE IMPRECISION IN SCOR TABLE FORCE: 1.05879e-22 AT 13.4141
Info: RELATIVE IMPRECISION IN SCOR TABLE FORCE: 2.27424e-16 AT 13.4141
Info: ABSOLUTE IMPRECISION IN VDWA TABLE ENERGY: 1.97215e-31 AT 13.6359
Info: RELATIVE IMPRECISION IN VDWA TABLE ENERGY: 3.64771e-15 AT 13.4141
Info: ABSOLUTE IMPRECISION IN VDWA TABLE FORCE: 1.2326e-32 AT 13.4141
Info: RELATIVE IMPRECISION IN VDWA TABLE FORCE: 3.49897e-16 AT 13.4141
Info: ABSOLUTE IMPRECISION IN VDWB TABLE ENERGY: 2.06795e-25 AT 13.3393
Info: RELATIVE IMPRECISION IN VDWB TABLE ENERGY: 3.90841e-16 AT 13.4141
Info: ABSOLUTE IMPRECISION IN VDWB TABLE FORCE: 5.16988e-26 AT 13.4141
Info: RELATIVE IMPRECISION IN VDWB TABLE FORCE: 2.44315e-16 AT 13.4141
Info: Entering startup phase 8 with 56344 kB of memory in use.
Info: Finished startup with 56344 kB of memory in use.
ETITLE: TS BOND ANGLE DIHED
IMPRP ELECT VDW BOUNDARY MISC
       KINETIC TOTAL TEMP TOTAL2
  TOTAL3 TEMPAVG PRESSURE GPRESSURE
VOLUME PRESSAVG GPRESSAVG

ENERGY: 0 121.1806 333.7596 170.1716
22.8872 -96614.8429 8841.2352 0.0000
0.0000 16138.1887 -70987.4200 299.4619
-70956.3288 -70956.3288 299.4619 -21073.5083
-161.4623 265837.8454 -21073.5083 -161.4623

OPENING EXTENDED SYSTEM TRAJECTORY FILE
Info: Initial time: 20 CPUs 0.107109 s/step 1.23969 days/ns 56344 kB memory
LDB: LOAD: AVG 2.34175 MAX 4.32288 MSGS: TOTAL 161 MAXC 10 MAXP 7 None
Info: Adjusted background load on 11 nodes.
LDB: LOAD: AVG 2.4291 MAX 2.43265 MSGS: TOTAL 161 MAXC 10 MAXP 7 Alg7
LDB: LOAD: AVG 2.4291 MAX 2.43265 MSGS: TOTAL 161 MAXC 10 MAXP 7 Alg7
Info: Initial time: 20 CPUs 0.106527 s/step 1.23295 days/ns 57212 kB memory
LDB: LOAD: AVG 2.3116 MAX 3.00669 MSGS: TOTAL 161 MAXC 10 MAXP 7 None
LDB: LOAD: AVG 2.3116 MAX 2.35658 MSGS: TOTAL 161 MAXC 10 MAXP 7 Refine
Info: Initial time: 20 CPUs 0.0767132 s/step 0.887884 days/ns 57212 kB memory
LDB: LOAD: AVG 2.25647 MAX 2.80525 MSGS: TOTAL 161 MAXC 10 MAXP 7 None
LDB: LOAD: AVG 2.25647 MAX 2.30153 MSGS: TOTAL 161 MAXC 10 MAXP 7 Refine
Info: Benchmark time: 20 CPUs 0.0717251 s/step 0.830152 days/ns 57212 kB memory
Info: Benchmark time: 20 CPUs 0.0703394 s/step 0.814113 days/ns 57212 kB memory
Info: Benchmark time: 20 CPUs 0.0701201 s/step 0.811575 days/ns 57212 kB memory
OPENING COORDINATE DCD FILE
WRITING COORDINATES TO DCD FILE AT STEP 1000
ENERGY: 1000 127.8373 351.8293 177.3696
18.2626 -96272.9784 8691.6459 0.0000
0.0000 16181.0105 -70725.0232 300.2565
-70694.1834 -70697.6964 302.5369 -21064.6199
-222.6916 265616.0510 6.1947 6.3844

LDB: LOAD: AVG 2.21514 MAX 2.69827 MSGS: TOTAL 161 MAXC 10 MAXP 7 None
LDB: LOAD: AVG 2.21514 MAX 2.25925 MSGS: TOTAL 161 MAXC 10 MAXP 7 Refine
ENERGY: 2000 115.0802 333.6503 174.2795
23.1469 -96405.3595 8737.9491 0.0000
0.0000 16256.0032 -70765.2503 301.6480
-70735.7045 -70735.8084 302.4557 -20885.7694
-23.9166 265749.0243 -3.7921 -2.7705

WRITING COORDINATES TO DCD FILE AT STEP 2000
WRITING COORDINATES TO DCD FILE AT STEP 3000
ENERGY: 3000 123.9842 354.4409 164.2527
19.0929 -96814.6635 8930.0369 0.0000
0.0000 16079.0666 -71143.7892 298.3648
-71113.2322 -71115.0260 301.6379 -20846.0039
153.8439 265187.2389 -2.1004 -2.0211

....
........

WRITING COORDINATES TO DCD FILE AT STEP 14000
LDB: LOAD: AVG 2.19086 MAX 2.72057 MSGS: TOTAL 161 MAXC 10 MAXP 7 None
LDB: LOAD: AVG 2.19086 MAX 2.23459 MSGS: TOTAL 161 MAXC 10 MAXP 7 Refine
ENERGY: 15000 109.2174 351.9875 167.0691
17.0719 -96871.8411 9005.0313 0.0000
0.0000 16199.7877 -71021.6761 300.6049
-70992.4608 -70992.0088 300.8596 -20865.8855
102.9266 265795.4607 -2.6098 -3.2279

WRITING COORDINATES TO DCD FILE AT STEP 15000
Stack Traceback:
  [0] CmiAbort+0x51 [0x7a956c]
  [1] __cmi_assert+0x47 [0x7b51f3]
  [2] /opt/apps/namd26_intel/Linux-x86_64-netIB-gnu/namd2 [0x7ae963]
  [3] infi_CmiAlloc+0x16 [0x7ae823]
  [4] CmiAlloc+0x16 [0x7b4663]
  [5] CqsPrioqGetDeq+0xff [0x7b6f4b]
  [6] CqsEnqueueGeneral+0xae [0x7b7525]
  [7] CldHandler+0xa7 [0x4c06d5]
  [8] CmiHandleMessage+0x76 [0x7b2512]
  [9] CsdScheduleForever+0x5f [0x7b276b]
  [10] CsdScheduler+0x16 [0x7b26e4]
  [11] _ZN7BackEnd4initEiPPc+0x1a0 [0x4c6ea0]
  [12] main+0x19 [0x4c3ce9]
  [13] __libc_start_main+0xdb [0x3021d1c3fb]
  [14] __gxx_personality_v0+0x12a [0x4c058a]
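
For anyone comparing logs from the different builds, here is a minimal
Python sketch that pulls the last reported ENERGY step and the timing
lines out of a log like the one above. It also recomputes days/ns from
s/step as a sanity check, assuming the 1 fs timestep shown in the log.
The function name `summarize` and the dictionary keys are my own choices
for illustration; nothing here is part of NAMD itself.

```python
import re

# Matches timing lines such as:
# "Info: Benchmark time: 20 CPUs 0.0701201 s/step 0.811575 days/ns 57212 kB memory"
TIMING = re.compile(
    r"Info: (?:Initial|Benchmark) time: (\d+) CPUs "
    r"([\d.eE+-]+) s/step ([\d.eE+-]+) days/ns (\d+) kB memory"
)
# Matches the first line of an energy record, e.g. "ENERGY: 15000 109.2174 ..."
ENERGY = re.compile(r"^ENERGY:\s+(\d+)\s")


def summarize(log_text, timestep_fs=1.0):
    """Return (last ENERGY step seen, list of timing records) for a log."""
    last_step = None
    timings = []
    for line in log_text.splitlines():
        m = ENERGY.match(line)
        if m:
            last_step = int(m.group(1))
            continue
        m = TIMING.search(line)
        if m:
            sec_per_step = float(m.group(2))
            # 1 ns = 1e6 fs, and one day = 86400 s
            days_per_ns = sec_per_step * (1e6 / timestep_fs) / 86400.0
            timings.append({
                "cpus": int(m.group(1)),
                "s_per_step": sec_per_step,
                "days_per_ns_reported": float(m.group(3)),
                "days_per_ns_computed": days_per_ns,
                "memory_kb": int(m.group(4)),
            })
    return last_step, timings
```

Calling `summarize(open("out.60.Namd-Net-gnu24").read())` on the log
above should report 15000 as the last ENERGY step, which is the point
just before the traceback.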

Has anyone faced such a problem? Do any optimization flags need to be set?

The executable is linked with libraries as follows:

MPI version of NAMD with Voltaire MPI:

# ldd /opt/apps/namd26_intel/Linux-x86_64-VltMPI/namd2
        libdl.so.2 => /lib64/libdl.so.2 (0x00000034b1300000)
        libtcl8.4.so => /usr/lib64/libtcl8.4.so (0x00000034b1700000)
        libsrfftw.so.2 => /opt/libraries/fftw_intel/2.1.5/lib/libsrfftw.so.2 (0x0000002a9557a000)
        libsfftw.so.2 => /opt/libraries/fftw_intel/2.1.5/lib/libsfftw.so.2 (0x0000002a956ab000)
        libm.so.6 => /lib64/tls/libm.so.6 (0x00000034b1100000)
        libpmpich++.so.1.0 => /opt/vltmpi/OPENIB/mpi.icc.rsh/lib/shared/libpmpich++.so.1.0 (0x0000002a957e6000)
        libmpich.so.1.0 => /opt/vltmpi/OPENIB/mpi.icc.rsh/lib/shared/libmpich.so.1.0 (0x0000002a95909000)
        libmpichfstub.so.1.0 => /opt/vltmpi/OPENIB/mpi.icc.rsh/lib/shared/libmpichfstub.so.1.0 (0x0000002a95b07000)
        libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x0000002a95c08000)
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x00000034b1500000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00000034b4600000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x00000034b0e00000)
        /lib64/ld-linux-x86-64.so.2 (0x00000034b0a00000)
        libimf.so => /opt/intel/cce/10.1.018/lib/libimf.so (0x0000002a95d14000)
        libsvml.so => /opt/intel/cce/10.1.018/lib/libsvml.so (0x0000002a96079000)
        libintlc.so.5 => /opt/intel/cce/10.1.018/lib/libintlc.so.5 (0x0000002a96204000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000034b3900000)

Charm++ version of NAMD:

# ldd /opt/apps/namd26_intel/Linux-x86_64-netIB/namd2
        libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x0000002a95579000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00000034b1300000)
        libtcl8.4.so => /opt/libraries/tcl84_intel/lib/libtcl8.4.so (0x0000002a95685000)
        libsrfftw.so.2 => /opt/libraries/fftw_intel/2.1.5/lib/libsrfftw.so.2 (0x0000002a958b4000)
        libsfftw.so.2 => /opt/libraries/fftw_intel/2.1.5/lib/libsfftw.so.2 (0x0000002a959e6000)
        libm.so.6 => /lib64/tls/libm.so.6 (0x00000034b1100000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00000034b4600000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x00000034b0e00000)
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x00000034b1500000)
        /lib64/ld-linux-x86-64.so.2 (0x00000034b0a00000)
        libmpich.so.1.0 => /opt/vltmpi/OPENIB/mpi.icc.rsh/lib/shared/libmpich.so.1.0 (0x0000002a95b22000)
        libmpichfstub.so.1.0 => /opt/vltmpi/OPENIB/mpi.icc.rsh/lib/shared/libmpichfstub.so.1.0 (0x0000002a95d1f000)
        libimf.so => /opt/intel/cce/10.1.018/lib/libimf.so (0x0000002a95e20000)
        libsvml.so => /opt/intel/cce/10.1.018/lib/libsvml.so (0x0000002a96184000)
        libintlc.so.5 => /opt/intel/cce/10.1.018/lib/libintlc.so.5 (0x0000002a96310000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000034b3900000)

Thanks,
Sangamesh
