From: Rene Salmon (rsalmon_at_tulane.edu)
Date: Wed Jun 08 2005 - 11:55:00 CDT
Hi List,
We are having a strange problem with NAMD: it starts running on our 
cluster and then all of a sudden it stops and goes to sleep, or waits 
for something.
This is on an AMD64 Bproc cluster.  Here is what the sleeping NAMD job 
looks like:
0 S parag    10950 10937  0  75   0 -  1969 -      11:13 ?        00:00:00 charmrun ++skipmaster ++verbose ++debug ++nodelist nodelist.txt ++p 8 namd2 run_0025.conf
0 S parag    10951 10950  0  75   0 -  8943 ghost_ 11:13 ?        00:00:04 [namd2]
0 S parag    10952 10950  0  75   0 -  7733 ghost_ 11:13 ?        00:00:05 [namd2]
0 S parag    10953 10950  0  75   0 -  7299 ghost_ 11:13 ?        00:00:04 [namd2]
0 S parag    10954 10950  0  75   0 -  7357 ghost_ 11:13 ?        00:00:04 [namd2]
0 S parag    10955 10950  0  75   0 -  9449 ghost_ 11:13 ?        00:00:06 [namd2]
0 S parag    10956 10950  0  75   0 -  8986 ghost_ 11:13 ?        00:00:06 [namd2]
0 S parag    10957 10950  0  75   0 -  8001 ghost_ 11:13 ?        00:00:05 [namd2]
0 S parag    10958 10950  0  75   0 -  7921 ghost_ 11:13 ?        00:00:04 [namd2]
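In case it helps anyone compare against their own cluster, here is a small sketch that tallies the process state and wait channel from a listing like the one above. It assumes the columns follow the usual `ps -lf` layout (F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD); the function names are made up for illustration.

```python
# Column names assumed from the standard `ps -lf` long/full layout.
FIELDS = ["F", "S", "UID", "PID", "PPID", "C", "PRI", "NI",
          "ADDR", "SZ", "WCHAN", "STIME", "TTY", "TIME", "CMD"]


def parse_ps_line(line):
    """Split one ps line into a dict; CMD keeps its embedded spaces."""
    parts = line.split(None, len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        raise ValueError("unexpected ps line: %r" % line)
    return dict(zip(FIELDS, parts))


def summarize(lines):
    """Count processes per (state, wait channel) pair."""
    counts = {}
    for line in lines:
        rec = parse_ps_line(line)
        key = (rec["S"], rec["WCHAN"])
        counts[key] = counts.get(key, 0) + 1
    return counts


# Three of the lines from the listing above, whitespace collapsed.
sample = [
    "0 S parag 10950 10937 0 75 0 - 1969 - 11:13 ? 00:00:00 "
    "charmrun ++skipmaster ++verbose ++debug ++nodelist nodelist.txt "
    "++p 8 namd2 run_0025.conf",
    "0 S parag 10951 10950 0 75 0 - 8943 ghost_ 11:13 ? 00:00:04 [namd2]",
    "0 S parag 10952 10950 0 75 0 - 7733 ghost_ 11:13 ? 00:00:05 [namd2]",
]

if __name__ == "__main__":
    for (state, wchan), n in sorted(summarize(sample).items()):
        print(state, wchan, n)
```

On the full listing this would show every namd2 process in state S with wait channel `ghost_`, and charmrun in state S with no wait channel reported.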
It just hangs here forever.  The full stdout/stderr log is attached, 
but here are its last few lines:
ENERGY:       0       497.4372      2945.9299       697.7718        26.8610         -65990.2474       457.9247         0.0000         0.0000      8213.8267         -53150.4962       319.2297    -53065.4816    -53065.4816       319.2297          -4443.1718      -156.8539    365061.0449     -4443.1718      -156.8539
ENERGY:       0       497.4372      2945.9299       697.7718        26.8610         -65990.2474       457.9247         0.0000         0.0000      8213.8267         -53150.4962       319.2297    -53065.4816    -53065.4816       319.2297          -4443.1718      -156.8539    365061.0449     -4443.1718      -156.8539
OPENING EXTENDED SYSTEM TRAJECTORY FILE
OPENING EXTENDED SYSTEM TRAJECTORY FILE
Info: Initial time: 8 CPUs 0.156147 s/step 0.903626 days/ns 40608 kB memory
LDB:  LOAD: AVG 1.68147 MAX 2.42618  MSGS: TOTAL 152 MAXC 28 MAXP 7  None
LDB:  LOAD: AVG 1.68147 MAX 2.42618  MSGS: TOTAL 152 MAXC 28 MAXP 7  None
LDB:  LOAD: AVG 1.68147 MAX 1.80888  MSGS: TOTAL 152 MAXC 28 MAXP 7  Alg7
LDB:  LOAD: AVG 1.68147 MAX 1.80888  MSGS: TOTAL 152 MAXC 28 MAXP 7  Alg7
LDB:  LOAD: AVG 1.68147 MAX 1.71406  MSGS: TOTAL 152 MAXC 28 MAXP 7  Alg7
LDB:  LOAD: AVG 1.68147 MAX 1.71406  MSGS: TOTAL 152 MAXC 28 MAXP 7  Alg7
----------------
Any ideas?  Thank you in advance for any help on this.
Rene
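For reference, a rough sanity check on the "Initial time" line: with the 2 fs timestep from the config, 0.156147 s/step works out to the reported days/ns, and the next ENERGY line (every 100 steps) should have shown up after roughly 16 seconds if the run were still progressing at that rate. This is only a sketch; the function names are mine, and it assumes the initial timing estimate would hold for later steps.

```python
SECONDS_PER_DAY = 86400.0


def days_per_ns(seconds_per_step, timestep_fs):
    """Wall-clock days needed to simulate 1 ns at the given step cost."""
    steps_per_ns = 1.0e6 / timestep_fs  # 1 ns = 1e6 fs
    return seconds_per_step * steps_per_ns / SECONDS_PER_DAY


def seconds_until_output(seconds_per_step, output_every_steps):
    """Expected wall-clock gap between two periodic outputs."""
    return seconds_per_step * output_every_steps


if __name__ == "__main__":
    # Values taken from the log: 0.156147 s/step, TIMESTEP 2,
    # ENERGY OUTPUT STEPS 100.
    print("days/ns: %.4f" % days_per_ns(0.156147, 2.0))
    print("s per energy output: %.1f" % seconds_until_output(0.156147, 100))
```

So a job that sits at step 0 for minutes is well past the point where the step-100 energies would be expected.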
Charmrun> charmrun started...
Charmrun> using nodelist.txt as nodesfile
Charmrun> node programs all started
Charmrun> node programs all connected
Charmrun> adding client 0: "1", IP:10.0.0.101
Charmrun> adding client 1: "2", IP:10.0.0.102
Charmrun> adding client 2: "3", IP:10.0.0.103
Charmrun> adding client 3: "4", IP:10.0.0.104
Charmrun> adding client 4: "1", IP:10.0.0.101
Charmrun> adding client 5: "2", IP:10.0.0.102
Charmrun> adding client 6: "3", IPInfo: Based on Charm++/Converse 0143163 for net-linux-amd64-clustermatic
Info: Built Wed Mar 30 14:30:04 CST 2005 by root on ares
Info: Sending usage information to NAMD developers via UDP.  Sent data is:
Info: 1 NAMD  2.5  Linux-amd64-Clustermatic  8    n1  25581
Info: Running on 8 processors.
Info: 8375 kB of memory in use.
Measuring processor speeds... Done.
Info: Based on Charm++/Converse 0143163 for net-linux-amd64-clustermatic
Info: Built Wed Mar 30 14:30:04 CST 2005 by root on ares
Info: Configuration file is run_0025.conf
Info: Sending usage information to NAMD developers via UDP.  Sent data is:
Info: 1 NAMD  2.5  Linux-amd64-Clustermatic  8    n1  25581
Info: Running on 8 processors.
omplete.
Info: Changed directory to /scratch-cluster/parag/ares/work5
Info: EXTENDED SYSTEM FILE   restrt_0017.xsc
Info: SIMULATION PARAMETERS:
Info: TIMESTEP               2
Info: NUMBER OF STEPS        1000000
Info: STEPS PER CYCLE        4
Info: PERIODIC CELL BASIS 1  45.379 0 0
Info: PERIODIC CELL BASIS 2  0 40.5114 0
Info: PERIODIC CELL BASIS 3  0 0 198.579
Info: PERIODIC CELL CENTER   0 0 0
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCE STRATEGY  Other
Info: LDB PERIOD             800 steps
Info: FIRST LDB TIMESTEP     20
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MAX SELF PARTITIONS    50
Info: MAX PAIR PARTITIONS    20
Info: SELF PARTITION ATOMS   125
Info: PAIR PARTITION ATOMS   200
Info: PAIR2 PARTITION ATOMS  400
Info: INITIAL TEMPERATURE    323
Info: CENTER OF MASS MOVING? NO
Info: DIELECTRIC             1
Info: EXCLUDE                SCALED ONE-FOUR
Info: 1-4 SCALE FACTOR       1
Info: DCD FILENAME           dcd_0025
Info: DCD FREQUENCY          10000
Warning: INITIAL COORDINATES WILL NOT BE WRITTEN TO DCD FILE
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME           cell_0025
Info: XST FREQUENCY          100
Info: NO VELOCITY DCD OUTPUT
Info: OUTPUT FILENAME        minim_0025
Info: RESTART FILENAME       restrt_0025
Info: RESTART FREQUENCY      100000
Info: BINARY RESTART FILES WILL BE USED
Info: SWITCHING ACTIVE
Info: SWITCHING ON           9
Info: SWITCHING OFF          12
Info: PAIRLIST DISTANCE      16
Info: PAIRLIST SHRINK RATE   0.01
Info: PAIRLIST GROW RATE     0.01
Info: PAIRLIST TRIGGER       0.3
Info: PAIRLISTS PER CYCLE    2
Info: PAIRLISTS ENABLED
Info: MARGIN                 1.11
Info: HYDROGEN GROUP CUTOFF  2.5
Info: PATCH DIMENSION        19.61
Info: ENERGY OUTPUT STEPS    100
Info: TIMING OUTPUT STEPS    500000
Info: PRESSURE OUTPUT STEPS  100
Info: FIXED ATOMS ACTIVE
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE   323
Info: LANGEVIN DAMPING COEFFICIENT IS 5 INVERSE PS
Info: LANGEVIN DYNAMICS APPLIED TO HYDROGENS
Info: EXCLUDE FROM PRESSURE ACTIVE
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info:        TARGET PRESSURE IS 1.01325 BAR
Info:     OSCILLATION PERIOD IS 500 FS
Info:             DECAY TIME IS 300 FS
Info:     PISTON TEMPERATURE IS 323 K
Info:       PRESSURE CONTROL IS GROUP-BASED
Info:    INITIAL STRAIN RATE IS 5.30331e-06 -3.44143e-05 -8.25525e-06
Info:       CELL FLUCTUATION IS ANISOTROPIC
Info: SURFACE TENSION CONTROL ACTIVE
Info:       TARGET SURFACE TENSION IS 55 DYN/CM
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE               1e-09
Info: PME EWALD COEFFICIENT       0.33586
Info: PME INTERPOLATION ORDER     6
Info: PME GRID DIMENSIONS         60 60 200
Info: Attempting to read FFTW data from FFTW_NAMD_2.5_Linux-amd64-Clustermatic.txt
Info: Optimizing 6 FFT steps.  1... 2... 3... 4... 5... 6...   Done.
Info: Writing FFTW data to FFTW_NAMD_2.5_Linux-amd64-Clustermatic.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY      2
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RIGID BONDS TO HYDROGEN : ALL
Info:         ERROR TOLERANCE : 1e-08
Info:          MAX ITERATIONS : 100
Info: RIGID WATER USING SETTLE ALGORITHMInfo: 8375 kB of memory in use.
Measuring processor speeds...
Info: NONBONDED FORCES EVALUATED EVERY 2 STEPS
Info: RANDOM NUMBER SEED     12345
Info: USE HYDROGEN BONDS?    NO
Info: COORDINATE PDB         run_0017.pdb
Info: STRUCTURE FILE         dppcwat_monosysf2.psf
Info: PARAMETER file: CHARMM format! 
Info: PARAMETERS             par_all22_prot_lipmod.inp
Info: SUMMARY OF PARAMETERS:
Info: 165 BONDS
Info: 412 ANGLES
Info: 491 DIHEDRAL
Info: 43 IMPROPER
Info: 73 VDW
Info: 5 VDW_PAIRS
Warning: Ignored 2984 bonds with zero force constants.
 Done.
Info: Configuration file is run_0025.conf
TCL: Suspending until startup complete.
Info: Changed directory to /scratch-cluster/parag/ares/work5
Info: EXTENDED SYSTEM FILE   restrt_0017.xsc
Info: SIMULATION PARAMETERS:
Info: TIMESTEP               2
Info: NUMBER OF STEPS        1000000
Info: STEPS PER CYCLE        4
Info: PERIODIC CELL BASIS 1  45.379 0 0
Info: PERIODIC CELL BASIS 2  0 40.5114 0
Info: PERIODIC CELL BASIS 3  0 0 198.579
Info: PERIODIC CELL CENTER   0 0 0
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCE STRATEGY  Other
Info: LDB PERIOD             800 steps
Info: FIRST LDB TIMESTEP     20
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MAX SELF PARTITIONS    50
Info: MAX PAIR PARTITIONS    20
Info: SELF PARTITION ATOMS   125
Info: PAIR PARTITION ATOMS   200
Info: PAIR2 PARTITION ATOMS  400
Info: INITIAL TEMPERATURE    323
Info: CENTER OF MASS MOVING? NO
Info: DIELECTRIC             1
Info: EXCLUDE                SCALED ONE-FOUR
Info: 1-4 SCALE FACTOR       1
Info: DCD FILENAME           dcd_0025
Info: DCD FREQUENCY          10000
Warning: INITIAL COORDINATES WILL NOT BE WRITTEN TO DCD FILE
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME           cell_0025
Info: XST FREQUENCY          100
Info: NO VELOCITY DCD OUTPUT
Info: OUTPUT FILENAME        minim_0025
Info: RESTART FILENAME       restrt_0025
Info: RESTART FREQUENCY      100000
Info: BINARY RESTART FILES WILL BE USED
Info: SWITCHING ACTIVE
Info: SWITCHING ON           9
Info: SWITCHING OFF          12
Info: PAIRLIST DISTANCE      16
Info: PAIRLIST SHRINK RATE   0.01
Info: PAIRLIST GROW RATE     0.01
Info: PAIRLIST TRIGGER       0.3
Info: PAIRLISTS PER CYCLE    2
Info: PAIRLISTS ENABLED
Info: MARGIN                 1.11
Info: HYDROGEN GROUP CUTOFF  2.5
Info: PATCH DIMENSION        19.61
Info: ENERGY OUTPUT STEPS    100
Info: TIMING OUTPUT STEPS    500000
Info: PRESSURE OUTPUT STEPS  100
Info: FIXED ATOMS ACTIVE
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE   323
Info: LANGEVIN DAMPING COEFFICIENT IS 5 INVERSE PS
Info: LANGEVIN DYNAMICS APPLIED TO HYDROGENS
Info: EXCLUDE FROM PRESSURE ACTIVE
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info:        TARGET PRESSURE IS 1.01325 BAR
Info:     OSCILLATION PERIOD IS 500 FS
Info:             DECAY TIME IS 300 FS
Info:     PISTON TEMPERATURE IS 323 K
Info:       PRESSURE CONTROL IS GROUP-BASED
Info:    INITIAL STRAIN RATE IS 5.30331e-06 -3.44143e-05 -8.25525e-06
Info:       CELL FLUCTUATION IS ANISOTROPIC
Info: SURFACE TENSION CONTROL ACTIVE
Info:       TARGET SURFACE TENSION IS 55 DYN/CM
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE               1e-09
Info: PME EWALD COEFFICIENT       0.33586
Info: PME INTERPOLATION ORDER     6
Info: PME GRID DIMENSIONS         60 60 200
Info: Attempting to read FFTW data from FFTW_NAMD_2.5_Linux-amd64-Clustermatic.txt
Info: Optimizing 6 FFT steps.  1... 2... 3... 4... 5... 6...   Done.
Info: Writing FFTW data to FFTW_NAMD_2.5_Linux-amd64-Clustermatic.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY      2
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RIGID BONDS TO HYDROGEN : ALL
Info:         ERROR TOLERANCE : 1e-08
Info:          MAX ITERATIONS : 100
Info: RIGID WATER USING SETTLE ALGORITHM
Info: NONBONDED FORCES EVALUATED EVERY 2 STEPS
Info: RANDOM NUMBER SEED     12345
Info: USE HYDROGEN BONDS?    NO
Info: COORDINATE PDB         run_0017.pdb
Info: STRUCTURE FILE         dppcwat_monosysf2.psf
Info: PARAMETER file: CHARMM format! 
Info: PARAMETERS             par_all22_prot_lipmod.inp
Info: Got 1608 excluded pressure atoms.BONDS
Info: 412 ANGLES
Info: 491 DIHEDRAL
Info: 43 IMPROPER
Info: 73 VDW
Info: 5 VDW_PAIRS
Warning: Ignored 2984 bonds with zero force constants.
Warning: Will get H-H distance in rigid H2O from H-O-H angle.
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 13648 ATOMS
Info: 10612 BONDS
Info: 11984 ANGLES
Info: 12564 DIHEDRALS
Info: 72 IMPROPERS
Info: 0 EXCLUSIONS
Info: 1608 FIXED ATOMS
Info: 11832 RIGID BONDS
Info: 1608 RIGID BONDS BETWEEN FIXED ATOMS
Info: 25896 DEGREES OF FREEDOM
Info: 4800 HYDROGEN GROUPS
Info: 536 HYDROGEN GROUPS WITH ALL ATOMS FIXED
Info: TOTAL MASS = 80651.5 amu
Info: TOTAL CHARGE = 4.02331e-07 e
Info: *****************************
Info: Got 1608 excluded pressure atoms.Info: Entering startup phase 0 with 12190 kB of memory in use.
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 13648 ATOMS
Info: 10612 BONDS
Info: 11984 ANGLES
Info: 12564 DIHEDRALS
Info: 72 IMPROPERS
Info: 0 EXCLUSIONS
Info: 1608 FIXED ATOMS
Info: 11832 RIGID BONDS
Info: 1608 RIGID BONDS BETWEEN FIXED ATOMS
Info: 25896 DEGREES OF FREEDOM
Info: 4800 HYDROGEN GROUPS
Info: 536 HYDROGEN GROUPS WITH ALL ATOMS FIXED
Info: TOTAL MASS = 80651.5 amu
Info: TOTAL CHARGE = 4.02331e-07 e
Info: *****************************
Info: Entering startup phase 0 with 12190 kB of memory in use.
Info: Entering startup phase 1 with 12181 kB of memory in use.
Info: Entering startup phase 1 with 12181 kB of memory in use.
Info: Entering startup phase 2 with 17417 kB of memory in use.
Info: Entering startup phase 2 with 17417 kB of memory in use.
Info: Entering startup phase 3 with 17524 kB of memory in use.
Info: Entering startup phase 3 with 17524 kB of memory in use.
Info: PATCH GRID IS 2 (PERIODIC) BY 2 (PERIODIC) BY 10 (PERIODIC)
Info: PATCH GRID IS 2 (PERIODIC) BY 2 (PERIODIC) BY 10 (PERIODIC)
Info: REMOVING COM VELOCITY 0.0588433 -0.0529249 -0.0782718
Info: REMOVING COM VELOCITY 0.0588433 -0.0529249 -0.0782718
Info: LARGEST PATCH (10) HAS 999 ATOMS
Info: LARGEST PATCH (10) HAS 999 ATOMS
Info: Entering startup phase 4 with 21492 kB of memory in use.
Info: Entering startup phase 4 with 21492 kB of memory in use.
Info: PME using 8 and 8 processors for FFT and reciprocal sum.
Info: PME using 8 and 8 processors for FFT and reciprocal sum.
Info: PME GRID LOCATIONS: 0 1 2 3 4 5 6 7
Info: PME TRANS LOCATIONS: 0 1 2 3 4 5 6 7
Info: PME GRID LOCATIONS: 0 1 2 3 4 5 6 7
Info: PME TRANS LOCATIONS: 0 1 2 3 4 5 6 7
Info: Optimizing 4 FFT steps.  1...Info: Optimizing 4 FFT steps.  1... 2... 3... 4...   Done.
 2... 3... 4...   Done.
Info: Entering startup phase 5 with 22425 kB of memory in use.
Info: Entering startup phase 5 with 22425 kB of memory in use.
Info: Entering startup phase 6 with 20978 kB of memory in use.
Info: Entering startup phase 6 with 20978 kB of memory in use.
Info: Entering startup phase 7 with 20999 kB of memory in use.
Info: Entering startup phase 7 with 20999 kB of memory in use.
Info: COULOMB TABLE R-SQUARED SPACING: 0.0625
Info: COULOMB TABLE R-SQUARED SPACING: 0.0625
Info: COULOMB TABLE SIZE: 769 POINTS
Info: COULOMB TABLE SIZE: 769 POINTS
Info: NONZERO IMPRECISION IN COULOMB TABLE: 3.9443e-31 (709) 1.57772e-29 (682)
Info: NONZERO IMPRECISION IN COULOMB TABLE: 3.9443e-31 (709) 1.57772e-29 (682)
Info: NONZERO IMPRECISION IN COULOMB TABLE: 2.11758e-22 (759) 2.91168e-22 (759)
Info: NONZERO IMPRECISION IN COULOMB TABLE: 2.11758e-22 (759) 2.91168e-22 (759)
Info: Entering startup phase 8 with 22099 kB of memory in use.
Info: Entering startup phase 8 with 22099 kB of memory in use.
Info: Finished startup with 22484 kB of memory in use.
Info: Finished startup with 22484 kB of memory in use.
PRESSURE: 0 -4483 25.3851 -1039.88 66.2416 -4581.91 738.936 -199.198 482.616 -4264.61
GPRESSURE: 0 -35.367 106.524 -1115.35 173.727 -280.875 652.35 -187.755 289.251 -154.32
ETITLE:      TS           BOND          ANGLE          DIHED          IMPRP               ELECT            VDW       BOUNDARY           MISC        KINETIC               TOTAL           TEMP         TOTAL2         TOTAL3        TEMPAVG            PRESSURE      GPRESSURE         VOLUME       PRESSAVG      GPRESSAVG
PRESSURE: 0 -4483 25.3851 -1039.88 66.2416 -4581.91 738.936 -199.198 482.616 -4264.61
GPRESSURE: 0 -35.367 106.524 -1115.35 173.727 -280.875 652.35 -187.755 289.251 -154.32
ETITLE:      TS           BOND          ANGLE          DIHED          IMPRP               ELECT            VDW       BOUNDARY           MISC        KINETIC               TOTAL           TEMP         TOTAL2         TOTAL3        TEMPAVG            PRESSURE      GPRESSURE         VOLUME       PRESSAVG      GPRESSAVG
ENERGY:       0       497.4372      2945.9299       697.7718        26.8610         -65990.2474       457.9247         0.0000         0.0000      8213.8267         -53150.4962       319.2297    -53065.4816    -53065.4816       319.2297          -4443.1718      -156.8539    365061.0449     -4443.1718      -156.8539
ENERGY:       0       497.4372      2945.9299       697.7718        26.8610         -65990.2474       457.9247         0.0000         0.0000      8213.8267         -53150.4962       319.2297    -53065.4816    -53065.4816       319.2297          -4443.1718      -156.8539    365061.0449     -4443.1718      -156.8539
OPENING EXTENDED SYSTEM TRAJECTORY FILE
OPENING EXTENDED SYSTEM TRAJECTORY FILE
Info: Initial time: 8 CPUs 0.156147 s/step 0.903626 days/ns 40608 kB memory
LDB:  LOAD: AVG 1.68147 MAX 2.42618  MSGS: TOTAL 152 MAXC 28 MAXP 7  None
LDB:  LOAD: AVG 1.68147 MAX 2.42618  MSGS: TOTAL 152 MAXC 28 MAXP 7  None
LDB:  LOAD: AVG 1.68147 MAX 1.80888  MSGS: TOTAL 152 MAXC 28 MAXP 7  Alg7
LDB:  LOAD: AVG 1.68147 MAX 1.80888  MSGS: TOTAL 152 MAXC 28 MAXP 7  Alg7
LDB:  LOAD: AVG 1.68147 MAX 1.71406  MSGS: TOTAL 152 MAXC 28 MAXP 7  Alg7
LDB:  LOAD: AVG 1.68147 MAX 1.71406  MSGS: TOTAL 152 MAXC 28 MAXP 7  Alg7
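One thing that stands out in the log above is that most lines appear as adjacent identical pairs. The sketch below is one way to measure how much of a saved log is doubled like that; the helper name is hypothetical and it says nothing about why the output is doubled.

```python
def paired_fraction(lines):
    """Fraction of non-empty lines that sit in an adjacent identical pair."""
    cleaned = [ln.rstrip() for ln in lines if ln.strip()]
    if not cleaned:
        return 0.0
    paired = 0
    i = 0
    while i < len(cleaned):
        if i + 1 < len(cleaned) and cleaned[i] == cleaned[i + 1]:
            paired += 2  # count both members of the pair
            i += 2
        else:
            i += 1
    return paired / len(cleaned)


if __name__ == "__main__":
    # A few lines copied from the tail of the log above.
    tail = [
        "OPENING EXTENDED SYSTEM TRAJECTORY FILE",
        "OPENING EXTENDED SYSTEM TRAJECTORY FILE",
        "Info: Initial time: 8 CPUs 0.156147 s/step 0.903626 days/ns 40608 kB memory",
        "Info: Finished startup with 22484 kB of memory in use.",
        "Info: Finished startup with 22484 kB of memory in use.",
    ]
    print(paired_fraction(tail))
```

To run it on a whole file, pass `open("namd.log").readlines()` (the file name is just an example).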