memory problem

From: Adam Moser (adam.t.moser_at_gmail.com)
Date: Fri Jun 20 2014 - 15:01:30 CDT

NAMD Gurus,

I'm running NAMD_2.9_Linux-x86_64-multicore (no CUDA) on a single IBM
server with four 16-core processors (AMD Opteron(tm) Processor 6282 SE) and
251 GB of memory. The OS is 64-bit Fedora Core 18 (kernel 3.11.10).

I minimized, heated, and equilibrated my system, and the energy and
temperature have been reasonable. Viewing the trajectory doesn't show any
obvious problems. I'm running a ~82k-atom system (namd2 +p32 config.in >
output.out) in 100000-step chunks. All of a sudden I'm getting
segmentation faults. For example:

>*** glibc detected *** namd2: free(): corrupted unsorted chunks: 0x00002ac7807c0e90 ***
>*** glibc detected *** namd2: free(): corrupted unsorted chunks: 0x00002ac78065b800 ***
>Segmentation fault (core dumped)

It doesn't always happen at the same step, but it always happens before the
100000 steps finish. A single-processor run seems to work, but any
multicore job dies at a seemingly random place. I deleted everything and
repeated the minimization/heating/equilibration process over an even longer
time, but there was still no change.

I re-downloaded the binary (and also tried the nightly build) with no success.

I've read through the mailing list archives, but haven't found anything that
has worked. I'm not sure what else to try. Any thoughtful comments are
appreciated.

~Adam

The input file is as follows:

-------------------------------------------
# user parameters
set outputname dyn_6
set finaltemp 300.0
# how often to replace rst file, save coords and vels
set restfreq 10000
set savefreqc 100
set savefreqv 100
set firststep 500000
set stepsize 1
set steps 600000
# top and struct
structure ../minimize/mppe_12mer_xplor.psf
coordinates ./rst/dyn_5.coor
velocities ./rst/dyn_5.vel
# parameters
paratypecharmm on
paratypexplor off
parameters ../../../toppar/par_mpe_chlor2.par
# Non-bonds
exclude scaled1-4
1-4scaling 1.0
switching on
switchdist 10.0
cutoff 12.0
pairlistdist 14.0
stepspercycle 20
# Constraints (shake)
rigidBonds all
# Ewald
PME on
PMEGridSizeX 120
PMEGridSizeY 120
PMEGridSizeZ 120
# Constant Pressure Control (variable Volume)
useGroupPressure yes #must be yes if rigidBonds are set
useFlexibleCell no
useConstantArea no
langevinPiston on
langevinPistonTarget 1.01325
langevinPistonPeriod 200
langevinPistonDecay 100
langevinPistonTemp $finaltemp
extendedSystem ./rst/dyn_5.xsc
cellOrigin 0.0 0.0 0.0
wrapAll on
wrapNearest on
# Output options
outputenergies 100
outputtiming 1000
binaryoutput no
restartfreq $restfreq
binaryrestart no
dcdfreq $savefreqc
# name for files
outputname ./rst/$outputname
DCDfile ./dcd/$outputname.dcd
# integrator
timestep $stepsize # step and evaluate bonded
nonbondedFreq $stepsize # evaluate short-range nonbonded
fullElectFrequency $stepsize # evaluate long-range nonbonded
seed 4242
numsteps $steps
firsttimestep $firststep
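
As an aside, the per-chunk bookkeeping in the user parameters above could be derived from a single chunk index instead of being edited by hand for each run. The following is only a sketch under stated assumptions: the variables chunk, chunksize, and prev are hypothetical and do not appear in the original input, and it assumes every chunk is 100000 steps long and that restart files follow the dyn_N naming used above.

```tcl
# Hypothetical helper variables (not part of the original input file)
set chunk 6            ;# index of the chunk being run
set chunksize 100000   ;# steps per chunk, as described in the message
set prev [expr {$chunk - 1}]

# Values the original file sets by hand
set outputname dyn_$chunk
set firststep [expr {$prev * $chunksize}]   ;# 500000 when chunk is 6
set steps [expr {$chunk * $chunksize}]      ;# 600000 when chunk is 6

# Restart inputs are taken from the previous chunk
coordinates ./rst/dyn_$prev.coor
velocities ./rst/dyn_$prev.vel
extendedSystem ./rst/dyn_$prev.xsc
```

With chunk set to 6 this reproduces the values in the file above (firststep 500000, steps 600000, inputs from dyn_5), so only one number would need to change between runs.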
