NAMDv2.7 stops producing output in the log-file--any help?

From: harish vashisth (harish.vashisth_at_gmail.com)
Date: Tue Nov 09 2010 - 08:24:22 CST

Dear All,
I have been running an MPI build of NAMD 2.7 installed from source. NAMD starts
normally, producing the lines pasted below at the beginning of the log file,
but at some point during dynamics it stops writing anything to the log file
and no more DCD frames are generated. There is no error message or warning
anywhere in the log file. I checked the individual nodes where the job is
running and I can see 8 namd2 processes per node, which I think is normal for
a dual quad-core node. Running "top" on individual nodes shows ~100% user CPU
and hardly any system usage. This has happened to three identical jobs, each
of which stopped producing output at a different time step in the log file.
I have also run these jobs on some supercomputing machines at
NCSA/LONESTAR/RANGER@TACC in the past and never experienced such issues.
Any help is greatly appreciated.
Please let me know if any other info is needed.
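
In case it helps anyone reproduce the symptom, here is a minimal sketch of how
such a stall could be flagged automatically: it reports a log file as stalled
when it has not been modified for longer than a chosen idle time. It only
detects the silence described above and says nothing about the cause. The
function names, the log-file name and the threshold are illustrative
assumptions, not part of NAMD.

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Return how many seconds ago `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def is_stalled(path, max_idle_seconds, now=None):
    """Return True if `path` has not been written to for longer than
    `max_idle_seconds`. The threshold is a user-chosen assumption and
    should comfortably exceed the normal interval between log outputs."""
    return seconds_since_modified(path, now) > max_idle_seconds
```

For example, is_stalled("tamd.log", 30 * 60) would return True once a log
file named tamd.log (a hypothetical name) has been silent for more than 30
minutes, even though the namd2 processes still show ~100% CPU in "top".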

Some other specs are the following:
------------------------------------
The Infiniband is Qlogic
The MPI is OFED Intel openMPI 1.4.1
The OS is openSuSE 11.1
The kernel is 2.6.27.37-0.1-default
NAMD-version 2.7

The cluster nodes are dual quad core nodes with one of the following
processors:

Intel(R) Xeon(R) CPU E5440 @ 2.83GHz
Intel(R) Xeon(R) CPU E5520 @ 2.27GHz

----------------------------------------------------------

Charm++> Running on 8 unique compute nodes (8-way SMP).
Charm++> Cpu topology info:
PE to node map: 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3
3 3 3 3 3 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 7 7 7 7 7 7
7 7
Node to PE map:
Chip #0: 0 1 2 3 4 5 6 7
Chip #1: 8 9 10 11 12 13 14 15
Chip #2: 16 17 18 19 20 21 22 23
Chip #3: 24 25 26 27 28 29 30 31
Chip #4: 32 33 34 35 36 37 38 39
Chip #5: 40 41 42 43 44 45 46 47
Chip #6: 48 49 50 51 52 53 54 55
Chip #7: 56 57 58 59 60 61 62 63
Charm++> cpu topology info is gathered in 0.572 seconds.
Info: NAMD 2.7 for Linux-x86_64
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: and send feedback or bug reports to namd_at_ks.uiuc.edu
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60202 for mpi-linux-x86_64
Info: Built Sat Oct 30 14:47:02 EDT 2010 by root on Satyr
Info: 1 NAMD 2.7 Linux-x86_64 64 gollum328 harishv
Info: Running on 64 processors.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 4.88826 s
Info: 227.605 MB of memory in use based on /proc/self/stat
Info: Configuration file is tamd.conf
TCL: Suspending until startup complete.
Info: EXTENDED SYSTEM FILE nvt.restart.xsc
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 2
Info: NUMBER OF STEPS 0
Info: STEPS PER CYCLE 10
Info: PERIODIC CELL BASIS 1 112.868 0 0
Info: PERIODIC CELL BASIS 2 0 105.423 0
Info: PERIODIC CELL BASIS 3 0 0 93.25
Info: PERIODIC CELL CENTER -28.9124 98.0206 24.5061

Regards,
-Harish
