From: Karteek Bejagam (karteek4_at_vt.edu)
Date: Wed May 31 2017 - 11:33:48 CDT
Hello NAMD users,
I have a system of 100,000 atoms.
It runs fine on a single node with 24 cores.
However, when I run it across multiple nodes, it fails with the following error:
##################
LDB: ============= START OF LOAD BALANCING ============== 6.80246
LDB: Largest compute 1637 load 0.031749 is 2.9% of average load 1.083487
LDB: Average compute 0.001368 is 0.1% of average load 1.083487
LDB: TIME 6.81261 LOAD: AVG 1.08349 MAX 1.4512 PROXIES: TOTAL 2496 MAXPE 40 MAXPATCH 5 None MEM: 406.734 MB
LDB: TIME 6.83671 LOAD: AVG 1.08349 MAX 1.24366 PROXIES: TOTAL 2496 MAXPE 40 MAXPATCH 5 TorusLB MEM: 406.934 MB
--------------------------------------------------------------------------
mpirun noticed that process rank 8 with PID 142379 on node nr060 exited on signal 11 (Segmentation fault).
###################
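As a quick check on the load balancer lines above, the reported percentages and the MAX/AVG ratios can be recomputed from the logged values. This is only a minimal Python sketch with the numbers copied from the log; the function and variable names are made up for illustration and are not part of NAMD.

```python
# Values copied from the LDB lines in the log above.
AVG_LOAD = 1.083487          # "average load" in the first two LDB lines
LARGEST_COMPUTE = 0.031749   # load of compute 1637
AVERAGE_COMPUTE = 0.001368   # average compute load
MAX_BEFORE = 1.4512          # MAX load on the "None" line
MAX_AFTER = 1.24366          # MAX load on the "TorusLB" line
AVG_REPORTED = 1.08349       # AVG load on both "LDB: TIME" lines


def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def imbalance(max_load, avg_load):
    """Return the ratio of the most loaded PE to the average load."""
    return max_load / avg_load


if __name__ == "__main__":
    print(f"largest compute: {percent_of(LARGEST_COMPUTE, AVG_LOAD):.1f}% of average load")
    print(f"average compute: {percent_of(AVERAGE_COMPUTE, AVG_LOAD):.1f}% of average load")
    print(f"MAX/AVG before TorusLB: {imbalance(MAX_BEFORE, AVG_REPORTED):.3f}")
    print(f"MAX/AVG after TorusLB:  {imbalance(MAX_AFTER, AVG_REPORTED):.3f}")
```

The first two values round to the 2.9% and 0.1% printed in the log, and the MAX/AVG ratio drops after the TorusLB step, so the load balancing output looks self-consistent up to the point of the crash.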
Here is the relevant part of my job script:
module load gcc/4.7.2 openmpi/1.8.5 namd/2.10
charmrun namd2 +p$PBS_NP equi.namd > output.log
Thanks in advance,
Karteek