NAMD3 segfault while writing restart files

From: David Sept (dsept_at_umich.edu)
Date: Sat Dec 05 2020 - 11:09:55 CST

I'm playing with NAMD3 and having some issues. The start-up phase and
initial run seem to be perfectly fine and consistent with other benchmarks
(my system is ~72k atoms and I get 128 ns/day with 4 CPU cores and 1 V100
GPU in NpT), but the simulation always dies at the first checkpoint:

...

WRITING EXTENDED SYSTEM TO RESTART FILE AT STEP 50000
OPENING COORDINATE DCD FILE
WRITING COORDINATES TO DCD FILE cp1-1.dcd AT STEP 50000
WRITING COORDINATES TO RESTART FILE AT STEP 50000
/var/spool/slurmd.spool/job15885396/slurm_script: line 32: 21081 Segmentation fault (core dumped) $NAMD +p4 prod-run.conf

As indicated above, the .dcd and .xsc files are written out, but the run
dies while writing the restart coordinate file for some reason. A core file
is dumped, but I haven't tried to get any information out of it yet.

Has anyone else seen something like this? Are there command line flags I
can add to give me more information?

Thanks!
Dave
