Re: Segmentation fault

From: Sonibare, Kolawole (kasonibare42_at_students.tntech.edu)
Date: Wed May 23 2018 - 16:49:04 CDT

Thank you for your response. I thought as much. This was due to the Tcl code I used to move the molecules when I generated them. I have previously run a simulation where this was not an issue, since the box eventually compressed to a reasonable size, although the molecules were different. The problem did not go away when run on one core.
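
In case it helps anyone who finds this thread later, here is a minimal VMD Tcl sketch (placeholder displacement and padding values, not my actual generation script) of the kind of check that catches this: after moving the molecules, measure the real extent of the coordinates and print cell basis vectors that match it, rather than an oversized box.

mol new 343-2units-lignin.psf type psf waitfor all
mol addfile 343-2units-lignin.pdb type pdb waitfor all

set all [atomselect top all]
# (the per-molecule moves from the generation script would go here, e.g.)
# $all moveby {10.0 0.0 0.0}

# bounding box of the coordinates actually present
foreach {min max} [measure minmax $all] break
set size [vecsub $max $min]
set pad 2.0   ;# placeholder cushion so nothing sits exactly on the boundary

puts "cellBasisVector1 [expr {[lindex $size 0] + $pad}] 0.0 0.0"
puts "cellBasisVector2 0.0 [expr {[lindex $size 1] + $pad}] 0.0"
puts "cellBasisVector3 0.0 0.0 [expr {[lindex $size 2] + $pad}]"
puts "cellOrigin [measure center $all]"
$all delete

The printed lines can then be pasted straight into the cellBasisVector/cellOrigin entries of the NAMD configuration file.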

________________________________
From: Vermaas, Joshua <Joshua.Vermaas_at_nrel.gov>
Sent: Wednesday, May 23, 2018 10:34:45 PM
To: Sonibare, Kolawole
Subject: RE: Segmentation fault

Your system seems strange. You've got a relatively small number of atoms and loads of empty space. This somehow led to a patch grid decomposition that looks kind of odd, with one huge slab along the z-axis. Does this go away if you run on one core?
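
For what it's worth, the cell in the log below also implies a gigantic PME grid: 1 A grid spacing over a roughly 1720 x 3430 x 1720 A box gives the 1728 x 3456 x 1728 grid NAMD reports, and a back-of-the-envelope estimate (just a sanity check, not NAMD's actual bookkeeping) already lands near the ~80 GB the log shows during PME startup:

# ~1.0e10 PME grid points at roughly 8 bytes each
set points [expr {1728.0 * 3456 * 1728}]
puts [expr {$points * 8 / 1e9}]   ;# ~83 GB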

-Josh

On 2018-05-23 12:46:35-06:00 owner-namd-l_at_ks.uiuc.edu wrote:

Dear NAMD users,

I am running a minimization of a system of organic molecules on an HPC cluster.

When I check my Slurm output file, I get this error:

cm/shared/apps/namd/namd_functions: line 49: 103445 Segmentation fault namd2 ${NAMD_ARGS} ${INPUT} &>${OUTPUT}

This is what my output file looks like:

Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 10 threads
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.7.1-0-gbdf6a1b-namd-charm-6.7.1-build-2016-Nov-07-136676
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (28-way SMP).
Charm++> cpu topology info is gathered in 0.004 seconds.
Info: NAMD 2.12 for Linux-x86_64-multicore
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: for updates, documentation, and support information.
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60701 for multicore-linux64-iccstatic
Info: Built Wed Dec 21 11:36:52 CST 2016 by jim on harare.ks.uiuc.edu
Info: 1 NAMD 2.12 Linux-x86_64-multicore 10 node002 kasonibare42
Info: Running on 10 processors, 1 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.021831 s
CkLoopLib is used in SMP with a simple dynamic scheduling (converse-level notification) but not using node-level queue
Info: 649.32 MB of memory in use based on /proc/self/stat
Info: Configuration file is pre.inp
Info: Working in the current directory /home/tntech.edu/kasonibare42/lignin/collabo/NAMD/2units-lignin
TCL: Suspending until startup complete.
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 2
Info: NUMBER OF STEPS 0
Info: STEPS PER CYCLE 10
Info: PERIODIC CELL BASIS 1 1720.71 0 0
Info: PERIODIC CELL BASIS 2 0 3429.66 0
Info: PERIODIC CELL BASIS 3 0 0 1719.39
Info: PERIODIC CELL CENTER 854.97 1679.34 791.83
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCER Centralized
Info: LOAD BALANCING STRATEGY New Load Balancers -- DEFAULT
Info: LDB PERIOD 2000 steps
Info: FIRST LDB TIMESTEP 50
Info: LAST LDB TIMESTEP -1
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MIN ATOMS PER PATCH 40
Info: INITIAL TEMPERATURE 50
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC 1
Info: EXCLUDE SCALED ONE-FOUR
Info: 1-4 ELECTROSTATICS SCALED BY 1
Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
Info: DCD FILENAME pre.dcd
Info: DCD FREQUENCY 100
Info: DCD FIRST STEP 100
Info: DCD FREQUENCY 100
Info: DCD FIRST STEP 100
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME pre.xst
Info: XST FREQUENCY 100
Info: NO VELOCITY DCD OUTPUT
Info: NO FORCE DCD OUTPUT
Info: OUTPUT FILENAME pre
Info: BINARY OUTPUT FILES WILL BE USED
Info: RESTART FILENAME pre.restart
Info: RESTART FREQUENCY 100
Info: BINARY RESTART FILES WILL BE USED
Info: SWITCHING ACTIVE
Info: SWITCHING ON 10
Info: SWITCHING OFF 12
Info: PAIRLIST DISTANCE 14
Info: PAIRLIST SHRINK RATE 0.01
Info: PAIRLIST GROW RATE 0.01
Info: PAIRLIST TRIGGER 0.3
Info: PAIRLISTS PER CYCLE 2
Info: PAIRLISTS ENABLED
Info: MARGIN 0.495
Info: HYDROGEN GROUP CUTOFF 2.5
Info: PATCH DIMENSION 16.995
Info: ENERGY OUTPUT STEPS 100
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: TIMING OUTPUT STEPS 1000
Info: PRESSURE OUTPUT STEPS 100
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info: TARGET PRESSURE IS 1.01325 BAR
Info: OSCILLATION PERIOD IS 100 FS
Info: DECAY TIME IS 50 FS
Info: PISTON TEMPERATURE IS 50 K
Info: PRESSURE CONTROL IS GROUP-BASED
Info: INITIAL STRAIN RATE IS 0 0 0
Info: CELL FLUCTUATION IS ISOTROPIC
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE 1e-05
Info: PME EWALD COEFFICIENT 0.226635
Info: PME INTERPOLATION ORDER 6
Info: PME GRID DIMENSIONS 1728 3456 1728
Info: PME MAXIMUM GRID SPACING 1
Info: Attempting to read FFTW data from FFTW_NAMD_2.12_Linux-x86_64-multicore.txt
Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
Info: Writing FFTW data to FFTW_NAMD_2.12_Linux-x86_64-multicore.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 1
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RIGID BONDS TO HYDROGEN : ALL
Info: ERROR TOLERANCE : 1e-08
Info: MAX ITERATIONS : 100
Info: RIGID WATER USING SETTLE ALGORITHM
Info: MAX ITERATIONS : 100
Info: RIGID WATER USING SETTLE ALGORITHM
Info: RANDOM NUMBER SEED 1095112514
Info: USE HYDROGEN BONDS? NO
Info: COORDINATE PDB 343-2units-lignin.pdb
Info: STRUCTURE FILE 343-2units-lignin.psf
Info: PARAMETER file: CHARMM format!
Info: PARAMETERS lignin.prm
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Info: SUMMARY OF PARAMETERS:
Info: 33 BONDS
Info: 66 ANGLES
Info: 117 DIHEDRAL
Info: 2 IMPROPER
Info: 0 CROSSTERM
Info: 23 VDW
Info: 0 VDW_PAIRS
Info: 0 NBTHOLE_PAIRS
Info: TIME FOR READING PSF FILE: 0.182832
Info: Reading pdb file 343-2units-lignin.pdb
Info: TIME FOR READING PDB FILE: 0.0271032
Info:
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 20580 ATOMS
Info: 20923 BONDS
Info: 34643 ANGLES
Info: 46305 DIHEDRALS
Info: 0 IMPROPERS
Info: 0 CROSSTERMS
Info: 0 EXCLUSIONS
Info: 9604 RIGID BONDS
Info: 52133 DEGREES OF FREEDOM
Info: 10976 HYDROGEN GROUPS
Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
Info: 10976 MIGRATION GROUPS
Info: 4 ATOMS IN LARGEST MIGRATION GROUP
Info: TOTAL MASS = 156561 amu
Info: TOTAL CHARGE = -2.17222e-05 e
Info: MASS DENSITY = 2.56218e-05 g/cm^3
Info: ATOM DENSITY = 2.02821e-06 atoms/A^3
Info: *****************************
Info:
Info: Entering startup at 63.4152 s, 759.59 MB of memory in use
Info: Startup phase 0 took 0.000112057 s, 759.59 MB of memory in use
Info: ADDED 99813 IMPLICIT EXCLUSIONS
Info: Startup phase 1 took 0.0162029 s, 767 MB of memory in use
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: ABSOLUTE IMPRECISION IN VDWA TABLE ENERGY: 4.59334e-32 AT 11.9974
Info: RELATIVE IMPRECISION IN VDWA TABLE ENERGY: 7.4108e-17 AT 11.9974
Info: INCONSISTENCY IN VDWA TABLE ENERGY VS FORCE: 0.0040507 AT 0.251946
Info: ABSOLUTE IMPRECISION IN VDWB TABLE ENERGY: 1.53481e-26 AT 11.9974
Info: RELATIVE IMPRECISION IN VDWB TABLE ENERGY: 7.96691e-18 AT 11.9974
Info: INCONSISTENCY IN VDWB TABLE ENERGY VS FORCE: 0.00150189 AT 0.251946
Info: Startup phase 2 took 0.000453949 s, 767 MB of memory in use
Info: Startup phase 3 took 6.50883e-05 s, 767 MB of memory in use
Info: Startup phase 4 took 0.000301838 s, 767 MB of memory in use
Info: Startup phase 5 took 7.10487e-05 s, 767 MB of memory in use
Info: PATCH GRID IS 57 (PERIODIC) BY 9 (PERIODIC) BY 1 (PERIODIC)
Info: PATCH GRID IS 1-AWAY BY 1-AWAY BY 1-AWAY
Info: REMOVING COM VELOCITY 0.00632699 -0.0107726 0.00522905
Info: LARGEST PATCH (510) HAS 370 ATOMS
Info: TORUS A SIZE 10 USING 0
Info: TORUS B SIZE 1 USING 0
Info: TORUS C SIZE 1 USING 0
Info: TORUS MINIMAL MESH SIZE IS 1 BY 1 BY 1
Info: Placed 100% of base nodes on same physical node as patch
Info: Startup phase 6 took 0.010524 s, 773.664 MB of memory in use
Info: PME using 10 and 10 processors for FFT and reciprocal sum.
Info: PME GRID LOCATIONS: 0 1 2 3 4 5 6 7 8 9
Info: PME TRANS LOCATIONS: 0 1 2 3 4 5 6 7 8 9
Info: PME USING 1 GRID NODES AND 1 TRANS NODES
Info: Optimizing 4 FFT steps. 1... 2... 3... 4... Done.
Info: Startup phase 7 took 4.72412 s, 79688 MB of memory in use
Info: Startup phase 8 took 0.00241709 s, 79688 MB of memory in use
LDB: Central LB being created...
Info: Startup phase 9 took 0.000478029 s, 79688 MB of memory in use
Info: CREATING 10310 COMPUTE OBJECTS

Ran /cm/shared/apps/namd/multicore/2.12/namd2 +p10 pre.inp

I have searched the mailing list archive but can't find a solution that applies to this. I have also checked the PDB file and it looks fine. Please kindly help out.
