From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Fri Oct 11 2013 - 06:23:14 CDT
Hi,
I'd like to report the following behavior of the mentioned NAMD version.
NAMD REMD simulations seem to segfault if the minimum of two processes
required per replica are split across different nodes. This happens, for
example, if a queuing system generates machine files that list each node
name only once, since MPI will usually start the processes round robin
until it reaches the requested process count.
Example machinefile:
C1
C2
C3
Example distribution:
#P node replica
1 C01 0
2 C02 0
3 C03 1
4 C01 1
5 C02 2
6 C03 2
The above case always segfaults for me, whereas the following works
perfectly:
Example machinefile:
C1
C1
C2
C2
C3
C3
Example distribution:
#P node replica
1 C01 0
2 C01 0
3 C02 1
4 C02 1
5 C03 2
6 C03 2
I guess segfaulting isn't the expected behavior, so if the current
implementation requires all processes of a replica to run on the same
node, it might be worth prechecking the distribution or mentioning the
requirement in the manual.
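A minimal sketch of such a precheck, written as a standalone Python script
rather than as NAMD code. It assumes that MPI places ranks round robin over
the machinefile entries and that consecutive ranks belong to the same
replica, as in the two distributions listed above. The function names
place_round_robin and split_replicas are hypothetical and not part of NAMD
or any MPI implementation.

```python
from collections import defaultdict


def place_round_robin(machinefile, nprocs):
    """Assign ranks 0..nprocs-1 to machinefile entries in round-robin order.

    Assumption: this mimics the usual MPI launcher behavior described in
    the report; a real launcher may place ranks differently.
    """
    return [machinefile[rank % len(machinefile)] for rank in range(nprocs)]


def split_replicas(placement, procs_per_replica):
    """Return {replica: sorted node names} for replicas spanning several nodes.

    Assumption: consecutive ranks belong to the same replica.
    """
    nodes = defaultdict(set)
    for rank, node in enumerate(placement):
        nodes[rank // procs_per_replica].add(node)
    return {rep: sorted(names) for rep, names in nodes.items() if len(names) > 1}


# Machinefile listing each node once (the layout that segfaults for me).
failing = place_round_robin(["C1", "C2", "C3"], 6)
# Machinefile listing each node twice (the layout that works).
working = place_round_robin(["C1", "C1", "C2", "C2", "C3", "C3"], 6)

print(split_replicas(failing, 2))  # every replica spans two nodes
print(split_replicas(working, 2))  # empty: no replica is split
```

A job script could run a check like this on the generated machinefile and
refuse to start the simulation if the returned dictionary is not empty.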
Also, if one starts an REMD run with the number of processes equal to the
number of replicas, the simulation just does nothing but doesn't abort,
which is also inconvenient.
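The process count could be validated up front in a similar way. The sketch
below is a hypothetical helper, not NAMD code. The minimum of two processes
per replica is taken from the behavior described in this report; the
requirement that the process count divide evenly among the replicas is an
additional assumption of mine.

```python
def check_process_count(nprocs, nreplicas, min_per_replica=2):
    """Return processes per replica, or raise ValueError if the count is unusable.

    Assumptions: each replica needs at least min_per_replica processes
    (two, per this report) and all replicas get the same number.
    """
    if nreplicas < 1:
        raise ValueError("need at least one replica")
    if nprocs % nreplicas != 0:
        raise ValueError(
            f"{nprocs} processes cannot be divided evenly among {nreplicas} replicas"
        )
    per_replica = nprocs // nreplicas
    if per_replica < min_per_replica:
        raise ValueError(
            f"{per_replica} process(es) per replica, need at least {min_per_replica}"
        )
    return per_replica


print(check_process_count(6, 3))  # 6 processes over 3 replicas is accepted

try:
    check_process_count(3, 3)  # one process per replica: the silent no-op case
except ValueError as err:
    print("refusing to start:", err)
```

Failing with a clear message in this case would be preferable to a run that
silently does nothing.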
Regards
Norman Geist