From: Alessandro Cembran (cembran_at_chem.umn.edu)
Date: Mon Sep 18 2006 - 13:18:52 CDT
Hi,
I've been experiencing a problem running NAMD (with versions 2.6b1, 
2.6b2, and 2.6) on a 256-processor node of an Altix 3700 BX2 machine 
(http://www.msi.umn.edu/altix/intro/).
What happens is that, with systems of different sizes (either ~55,000 or 
~190,000 atoms) and with different numbers of processors (8 or 40), the 
performance of my calculations is not reproducible at all. In 
particular, a job might run extremely fast (i.e., with almost linear 
scaling) for hours or days, and then all of a sudden its performance 
drops to 10% or even ~2% of the peak and never recovers.
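To pin down when such a drop begins, one option is to scan the NAMD log for the point where the wall time per step jumps. The following is only a minimal sketch: it assumes the log contains "TIMING:" lines of the shape shown in the sample, and the function name find_slowdown and the factor of 10 are illustrative choices, not part of NAMD.

```python
import re

# Assumed shape of a NAMD "TIMING:" log line (see the sample below);
# adjust the pattern if your log looks different.
TIMING_RE = re.compile(
    r"^TIMING:\s+(\d+)\s+CPU:.*?Wall:\s*[\d.eE+-]+,\s*([\d.eE+-]+)/step"
)

def find_slowdown(lines, factor=10.0):
    """Return (step, wall_per_step, best_so_far) for the first TIMING line
    whose wall time per step exceeds `factor` times the best value seen
    earlier in the log, or None if there is no such line."""
    best = None
    for line in lines:
        m = TIMING_RE.match(line)
        if not m:
            continue  # not a timing line
        step, per_step = int(m.group(1)), float(m.group(2))
        if best is not None and per_step > factor * best:
            return step, per_step, best
        if best is None or per_step < best:
            best = per_step
    return None

# Made-up sample lines for illustration only (not real measurements).
sample = [
    "TIMING: 1000  CPU: 18.35, 0.0183/step  Wall: 18.40, 0.0184/step, 0.05 hours remaining, 14244 kB of memory in use.",
    "TIMING: 2000  CPU: 36.10, 0.0178/step  Wall: 36.40, 0.0180/step, 0.04 hours remaining, 14244 kB of memory in use.",
    "TIMING: 3000  CPU: 60.00, 0.0239/step  Wall: 286.40, 0.2500/step, 0.49 hours remaining, 14244 kB of memory in use.",
]

print(find_slowdown(sample))
```

Comparing the reported step against the start times of other jobs on the node would show whether the slowdown coincides with a new job starting.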
I talked with the system manager here, and he said that this is related 
to the architecture of the machine, because many jobs are competing for 
the network resources. In fact, I could track down that on some 
occasions the slowdown arose when another "massively parallel" NAMD job 
started on the same node, and both of them then ran very slowly.
So, I was wondering whether anything can be done to make better use of 
the Altix architecture. In particular, I was wondering whether there is 
a way to reduce or tune the message passing among the processors.
Note: I always set the variable MPI_DSM_DISTRIBUTE.
I also set MPI_MEMMAP_OFF=1 because my jobs used to crash after running 
for a while when they ran out of memory. The following is a quote 
from the system manager:
> Another NAMD user ran into a problem with respect to the amount of 
> virtual memory that was being allocated to NAMD by the operating 
> system on the 256-processor Altix node.  It turns out that the Altix 
> MPI is designed to put huge memory maps into memory that speed up 
> performance when running MPI jobs that share memory between separate 
> Altix partitions (a feature we do not use).  When this other NAMD user 
> would attempt to run large NAMD jobs, they would segfault.  If he set 
> the MPI_MEMMAP_OFF environment variable, his jobs no longer segfaulted.
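As a rough sketch of the two settings mentioned above, the job environment could be prepared as follows before launching. Several things here are assumptions rather than facts from this post: the value "1" for MPI_DSM_DISTRIBUTE (the post only says the variable is set), the mpirun -np launch form, and the names namd2 and run.namd, which are placeholders. The helpers altix_mpi_env and launch_command are hypothetical.

```python
import os

def altix_mpi_env(base_env=None):
    """Return a copy of the environment with the two MPI settings
    mentioned in the post added to it."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: "1" is used as the value; the post only says it is set.
    env["MPI_DSM_DISTRIBUTE"] = "1"
    # From the post: turns off the large MPI memory maps.
    env["MPI_MEMMAP_OFF"] = "1"
    return env

def launch_command(nprocs, namd_binary="namd2", config="run.namd"):
    """Build a launch command. The binary and config names are
    placeholders, and the launcher syntax may differ at your site."""
    return ["mpirun", "-np", str(nprocs), namd_binary, config]

env = altix_mpi_env({})
cmd = launch_command(8)
print(env)
print(cmd)

# On the actual machine, something like the following would start the job:
#   import subprocess
#   subprocess.run(cmd, env=altix_mpi_env(), check=True)
```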
Thanks in advance,
Alessandro
-- 
Alessandro Cembran, PhD
Post Doctoral Associate
Mailing Address: Univ. of Minnesota, Dept. of Chemistry G2, 139 Smith Hall 207 Pleasant St SE Minneapolis, MN 55455-0431
Office: Univ. of Minnesota, Walter Library 117 Pleasant St SE, Room 473
Phone: +1 612-624-4617
E-mail: cembran_at_chem.umn.edu