Re: bug report and stray pme

From: Brian Bennion (brian_at_youkai.llnl.gov)
Date: Tue Oct 19 2004 - 15:31:43 CDT

Hi Cheri

Answers to the first error have certainly been posted on the mailing list.
Yes, this has been figured out. You need to increase your margin, at least
during equilibration. Your system is shrinking past the boundaries that NAMD
can safely use in its parallelization.

The second error (stray PME grid charges) is most likely a result of the first.
Do as the warning suggests and increase your margin size, at least at the
beginning of the simulation, until you see the system equilibrate.
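
To illustrate why there is so little room for the cell to shrink, here is a
rough Python sketch using the numbers from the log quoted below. It assumes
that the patch dimension is the pairlist distance plus the hydrogen group
cutoff plus the margin (which matches 10 + 2.5 + 0 = 12.5 in the log) and that
the patch count along each axis is the cell length divided by that dimension,
rounded down (which matches the 6 by 6 by 6 patch grid). The function name and
the margin of 2.0 are only illustrative, not a recommended setting.

```python
import math

# Values copied from the quoted NAMD 2.5 log (lengths in Angstroms).
CELL = {"x": 78.94, "y": 78.393, "z": 75.347}
PAIRLIST_DIST = 10.0
HGROUP_CUTOFF = 2.5


def patch_headroom(cell, margin):
    """Return {axis: (patch_count, headroom)} for a given margin.

    Assumption (not taken from NAMD source): the patch grid is chosen at
    startup as floor(cell_length / (pairlist + hgroup cutoff + margin)).
    Headroom is how far each patch can shrink before it becomes smaller
    than pairlist distance + hydrogen group cutoff.
    """
    required = PAIRLIST_DIST + HGROUP_CUTOFF
    patch_dim = required + margin
    result = {}
    for axis, length in cell.items():
        count = math.floor(length / patch_dim)
        headroom = length / count - required
        result[axis] = (count, headroom)
    return result


if __name__ == "__main__":
    for margin in (0.0, 2.0):  # 2.0 is a hypothetical trial value
        print(f"margin {margin}")
        for axis, (count, room) in patch_headroom(CELL, margin).items():
            print(f"  {axis}: {count} patches, {room:.3f} A of headroom")
```

Under these assumptions, a margin of 0 leaves under 0.06 A of headroom along
the third cell axis, so even a very small contraction of the cell during
heating would be enough to trigger the margin error.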

Regards
Brian

On Mon, 18 Oct 2004, Cheri M Turman wrote:

> Hi,
> I am having some issues running SA on a gigabit beowulf cluster. I got
> this error and bug report. My minimization for 8000 steps ran fine but
> during the heating process, the problems started. I have found some
> people have written questions here about the ERROR: Margin is too small
> for 1 atoms during timestep 52258. ERROR: Incorrect nonbonded forces
> and energies may be calculated! However, I could not find any answers
> to these questions about this error. Did anyone ever figure this out?
>
> My next error was:
> WRITING EXTENDED SYSTEM TO RESTART FILE AT STEP 54000
> WRITING COORDINATES TO DCD FILE AT STEP 54000
> WRITING COORDINATES TO RESTART FILE AT STEP 54000
> FINISHED WRITING RESTART COORDINATES
> WRITING VELOCITIES TO RESTART FILE AT STEP 54000
> FINISHED WRITING RESTART VELOCITIES
> LDB: LOAD: AVG 8.70782 MAX 10.2825 MSGS: TOTAL 387 MAXC 41 MAXP 5 None
> LDB: LOAD: AVG 8.70782 MAX 8.88166 MSGS: TOTAL 387 MAXC 41 MAXP 5 Refine
> ERROR: Stray PME grid charges detected: 7 sending to 8 for planes 56
> ERROR: Stray PME grid charges detected: 7 sending to 8 for planes 56
> BUG ALERT: Stray PME grid charges detected!
>
> BUG ALERT: NAMD has detected a bug. Please notify namd_at_ks.uiuc.edu.
> Stack Traceback:
> [0] _ZN10Controller16compareChecksumsEii+0x685 [0x81a649d]
> [1] _ZN10Controller21printDynamicsEnergiesEi+0x89 [0x81af4a5]
> [2] _ZN10Controller9integrateEv+0x10c [0x81a916c]
> [3] _ZN10Controller9algorithmEv+0x718 [0x81a4680]
> [4] _ZN10Controller9threadRunEPS_+0xc [0x81b0d64]
> [5] /apps/namd/./namd2 [0x82758a4]
> [6] Charm++ Runtime: Converse thread (qt_args+0x66 [0x82c40aa])
>
>
> Any ideas about this one? Do they both point to problems with running
> on multiple computers/passing off jobs? I only think this because I
> have been having problems with these issues and have been playing with
> my PBS batch scripts. Please let me know if you have any clue about
> this. Following is the beginning of my logfile to let you know the
> parameters used.
>
> /tmp/pbs.6281.Trinity/6281.Trinity.nodelist
> 12 nodes
> Info: NAMD 2.5 for Linux-i686
> Info:
> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
> Info: and send feedback or bug reports to namd_at_ks.uiuc.edu
> Info:
> Info: Please cite Kale et al., J. Comp. Phys. 151:283-312 (1999)
> Info: in all publications reporting results obtained with NAMD.
> Info:
> Info: Based on Charm++/Converse 050612 for net-linux-icc
> Info: Built Fri Sep 26 17:33:59 CDT 2003 by jim on lisboa.ks.uiuc.edu
> Info: Sending usage information to NAMD developers via UDP. Sent data is:
> Info: 1 NAMD 2.5 Linux-i686 12 Morpheus10 cmturman
> Info: Running on 12 processors.
> Info: 1477 kB of memory in use.
> Measuring processor speeds... Done.
> Info: Changed directory to /home/cmturman/NAMD
> Info: Configuration file is Simulated_annealing_2DV1_92704
> TCL: Suspending until startup complete.
> Info: SIMULATION PARAMETERS:
> Info: TIMESTEP 1
> Info: NUMBER OF STEPS 0
> Info: STEPS PER CYCLE 10
> Info: PERIODIC CELL BASIS 1 78.94 0 0
> Info: PERIODIC CELL BASIS 2 0 78.393 0
> Info: PERIODIC CELL BASIS 3 0 0 75.347
> Info: PERIODIC CELL CENTER 5.54633 32.9508 19.3833
> Info: LOAD BALANCE STRATEGY Other
> Info: LDB PERIOD 2000 steps
> Info: FIRST LDB TIMESTEP 50
> Info: LDB BACKGROUND SCALING 1
> Info: HOM BACKGROUND SCALING 1
> Info: PME BACKGROUND SCALING 1
> Info: MAX SELF PARTITIONS 50
> Info: MAX PAIR PARTITIONS 20
> Info: SELF PARTITION ATOMS 125
> Info: PAIR PARTITION ATOMS 200
> Info: PAIR2 PARTITION ATOMS 400
> Info: INITIAL TEMPERATURE 310
> Info: CENTER OF MASS MOVING? NO
> Info: DIELECTRIC 1
> Info: EXCLUDE SCALED ONE-FOUR
> Info: 1-4 SCALE FACTOR 1
> Info: DCD FILENAME /home/cmturman/NAMD/2DV1_supcom/2DV1_annealing_92704.dcd
> Info: DCD FREQUENCY 1000
> Warning: INITIAL COORDINATES WILL NOT BE WRITTEN TO DCD FILE
> Info: XST FILENAME /home/cmturman/NAMD/2DV1_supcom/2DV1_annealing_92704.xst
> Info: XST FREQUENCY 1000
> Info: NO VELOCITY DCD OUTPUT
> Info: OUTPUT FILENAME /home/cmturman/NAMD/2DV1_supcom/2DV1_annealing_92704
> Info: RESTART FILENAME /home/cmturman/NAMD/2DV1_supcom/2DV1_92704_restart
> Info: RESTART FREQUENCY 2000
> Info: RESTART FILES WILL NOT BE OVERWRITTEN
> Info: SWITCHING ACTIVE
> Info: SWITCHING ON 8.5
> Info: SWITCHING OFF 10
> Info: PAIRLIST DISTANCE 10
> Info: PAIRLIST SHRINK RATE 0.01
> Info: PAIRLIST GROW RATE 0.01
> Info: PAIRLIST TRIGGER 0.3
> Info: PAIRLISTS PER CYCLE 2
> Info: PAIRLISTS ENABLED
> Info: MARGIN 0
> Info: HYDROGEN GROUP CUTOFF 2.5
> Info: PATCH DIMENSION 12.5
> Info: ENERGY OUTPUT STEPS 1000
> Info: VELOCITY REASSIGNMENT FREQ 500
> Info: VELOCITY REASSIGNMENT TEMP 300
> Info: PARTICLE MESH EWALD (PME) ACTIVE
> Info: PME TOLERANCE 1e-06
> Info: PME EWALD COEFFICIENT 0.312341
> Info: PME INTERPOLATION ORDER 4
> Info: PME GRID DIMENSIONS 82 80 80
> Info: Attempting to read FFTW data from FFTW_NAMD_2.5_Linux-i686.txt
> Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
> Info: Writing FFTW data to FFTW_NAMD_2.5_Linux-i686.txt
> Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 1
> Info: USING VERLET I (r-RESPA) MTS SCHEME.
> Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
> Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
> Info: RIGID BONDS TO HYDROGEN : WATER
> Info: ERROR TOLERANCE : 1e-08
> Info: MAX ITERATIONS : 100
> Info: RIGID WATER USING SETTLE ALGORITHM
> Info: RANDOM NUMBER SEED 1098114157
> Info: USE HYDROGEN BONDS? NO
> Info: COORDINATE PDB /home/cmturman/NAMD/output/2DV1_wb.pdb
> Info: STRUCTURE FILE /home/cmturman/NAMD/output/2DV1_wb.psf
> Info: PARAMETER file: CHARMM format!
> Info: PARAMETERS /home/cmturman/NAMD/par_all22_prot_2.inp
> Info: SUMMARY OF PARAMETERS:
> Info: 139 BONDS
> Info: 343 ANGLES
> Info: 443 DIHEDRAL
> Info: 43 IMPROPER
> Info: 60 VDW
> Info: 0 VDW_PAIRS
> Info: ****************************
> Info: STRUCTURE SUMMARY:
> Info: 42793 ATOMS
> Info: 31257 BONDS
> Info: 26076 ANGLES
> Info: 21240 DIHEDRALS
> Info: 1347 IMPROPERS
> Info: 0 EXCLUSIONS
> Info: 34935 RIGID BONDS
> Info: 93441 DEGREES OF FREEDOM
> Info: 15570 HYDROGEN GROUPS
> Info: TOTAL MASS = 265405 amu
> Info: TOTAL CHARGE = 4.00001 e
> Info: *****************************
> Info: Entering startup phase 0 with 9282 kB of memory in use.
> Info: Entering startup phase 1 with 9282 kB of memory in use.
> Info: Entering startup phase 2 with 13578 kB of memory in use.
> Info: Entering startup phase 3 with 13914 kB of memory in use.
> Info: PATCH GRID IS 6 (PERIODIC) BY 6 (PERIODIC) BY 6 (PERIODIC)
> Info: REMOVING COM VELOCITY 0.0360994 -0.0131584 0.0112262
> Info: LARGEST PATCH (58) HAS 269 ATOMS
> Info: Entering startup phase 4 with 20028 kB of memory in use.
> Info: PME using 12 and 12 processors for FFT and reciprocal sum.
> Creating Strategy 4
> Creating Strategy 4
> Info: PME GRID LOCATIONS: 0 1 2 3 4 5 6 7 8 9 ...
> Info: PME TRANS LOCATIONS: 0 1 2 3 4 5 6 7 8 9 ...
> Info: Optimizing 4 FFT steps. 1... 2... 3... 4... Done.
> Info: Entering startup phase 5 with 20408 kB of memory in use.
> Info: Entering startup phase 6 with 15335 kB of memory in use.
> Info: Entering startup phase 7 with 15335 kB of memory in use.
> Info: COULOMB TABLE R-SQUARED SPACING: 0.0625
> Info: COULOMB TABLE SIZE: 705 POINTS
> Info: NONZERO IMPRECISION IN COULOMB TABLE: 4.23516e-22 (675) 8.47033e-22 (675)
> Info: NONZERO IMPRECISION IN COULOMB TABLE: 5.04871e-29 (699) 1.26218e-28 (699)
> Info: NONZERO IMPRECISION IN COULOMB TABLE: 1.05879e-22 (698) 2.64698e-22 (698)
> Info: Entering startup phase 8 with 16015 kB of memory in use.
> Info: Finished startup with 18523 kB of memory in use.
> TCL: Minimizing for 8000 steps
> Thanks,
> Cheri
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Cheri M. Turman
> Graduate Student
> University of Texas-Houston Medical School
> 6431 Fannin
> Houston, TX 77030 USA
>
> e-mail: cheri.m.turman_at_uth.tmc.edu
> Ph.: 713-500-6126
> Fax: 713-500-0652
>

*****************************************************************
**Brian Bennion, Ph.D. **
**Computational and Systems Biology Division **
**Biology and Biotechnology Research Program **
**Lawrence Livermore National Laboratory **
**P.O. Box 808, L-448 bennion1_at_llnl.gov **
**7000 East Avenue phone: (925) 422-5722 **
**Livermore, CA 94550 fax: (925) 424-6605 **
*****************************************************************