Re: Replica-exchange MD on a single node / SMP workstation

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Mon Nov 12 2012 - 00:33:23 CST

Hi,

 

If you really get the same error even when running multiple processes, this
could simply be an oversight in the check.

Usually, a parallel environment like MPI doesn't care about nodes, only
about processes. Charm++ is different here: it knows which CPUs belong to a
single node. Maybe the programmer just wanted to make sure you're not
running the replica simulations in serial and implemented this as a quick
check, without noticing the difference between nodes and processes when
using Charm++. You could change this behavior by modifying that part of the
code so that it is not the node count that must be greater than 1, but the
process count. That should be easy to figure out. Maybe provide a diff patch
afterwards.
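
To make the distinction concrete, here is a minimal sketch of the two kinds of check. It is not the actual patched Charm++/NAMD code, which I have not quoted here: the struct `LaunchLayout` and the functions `enough_nodes` and `enough_processes` are hypothetical names made up for illustration, and I am assuming the check compares the count against the number of replicas, as Zack describes. Whether the real comparison is strict is not clear from the thread, so the sketch uses `>=`.

```cpp
// Hypothetical description of a parallel launch.
// This is NOT a Charm++ or NAMD type, just an illustration.
struct LaunchLayout {
    int nodes;      // nodes as seen by the runtime (1 on an SMP workstation)
    int processes;  // processes started by the launcher
};

// Node-based check, as described in the thread: one node per replica.
// On a single SMP node this fails no matter how many processes run.
inline bool enough_nodes(const LaunchLayout& layout, int replicas) {
    return layout.nodes >= replicas;
}

// Process-based check, the suggested change: one process per replica.
inline bool enough_processes(const LaunchLayout& layout, int replicas) {
    return layout.processes >= replicas;
}
```

For the workstation in question (1 node, say 64 processes, 8 replicas), `enough_nodes` returns false while `enough_processes` returns true, which is exactly the case the current check rejects.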

 

Norman Geist.

 

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Zachary Ulissi
Sent: Saturday, 10 November 2012 00:57
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: Replica-exchange MD on a single node / SMP workstation

 

I've been trying to get the MPI-based replica exchange example to work on a
workstation for testing (4 processors, 64 cores, SMP). I have MPICH2 set up,
have compiled the MPI version of NAMD 2.9, and NAMD/Charm++ sees the machine
as a single 64-way SMP node.

 

When I run the replica exchange example for alanine, the patched Charm++
code does a hard check to make sure the number of nodes is greater than
the number of replicas and then stops (since 1 node < 8 replicas), even if I
specify multiple processes for mpirun.

 

Is there a reason why each replica simulation needs a complete node? If
not, is it possible to run multiple replicas on a single node with multiple
processes? Am I missing something obvious?

 

Thanks!

Zack
