Re: Background load problem

From: Axel Kohlmeyer (akohlmey_at_cmm.chem.upenn.edu)
Date: Thu Feb 26 2009 - 10:13:06 CST

On Thu, 26 Feb 2009, Anirban wrote:

AG> Hi Peter,
AG>
AG> Actually that is a shape-based system, that's why the size is so small.
AG> I also tried to run an RBCG model of 5622 CG beads and that is also not
AG> scaling beyond 16 processors. And I am getting the same warning "High
AG> background load" for this system also. What should I do?

expecting significant parallel scaling for such minuscule systems
is just plain ridiculous. i assume you are running on the
large machine with 2x quad-core nodes and infiniband in pune?

your performance is vastly dominated by the print statements
from the rank 0 process and by communication overhead.
having 8 mpi tasks serialized through a single infiniband
host adapter adds a lot of latency. so if you want to get
any scaling, you will have to reduce the number of mpi tasks
per node (try 4 or 2 or even 1).
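
a rough sketch of the arithmetic, using the 100-1000 atoms per
processor rule of thumb from peter's message quoted below. the
function names are made up for illustration and the thresholds are
only that rule of thumb, not anything namd computes itself:

```python
def particles_per_rank(n_particles, n_ranks):
    """Average number of particles each MPI task has to work on."""
    return n_particles / n_ranks


def max_useful_ranks(n_particles, min_per_rank=100):
    """Largest task count that still leaves min_per_rank particles per task.

    min_per_rank is an assumed threshold taken from the quoted
    100-1000 atoms per processor guideline.
    """
    return max(1, n_particles // min_per_rank)


if __name__ == "__main__":
    for n in (45, 5622):  # the two system sizes mentioned in this thread
        low = max_useful_ranks(n, min_per_rank=1000)
        high = max_useful_ranks(n, min_per_rank=100)
        print(f"{n} beads: roughly {low} to {high} tasks before work runs out")
        for ranks in (1, 2, 4, 8, 16):
            print(f"  {ranks:2d} tasks -> {particles_per_rank(n, ranks):8.1f} beads per task")
```

this only counts work per task. it says nothing about the latency
of pushing several tasks through one host adapter, which has to be
benchmarked on the actual machine.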

..and i find it hard to imagine any scientifically
meaningful application for such a simulation anyway.

cheers,
   axel.

AG>
AG> Regards,
AG>
AG>
AG> On Thu, 2009-02-26 at 07:50 -0600, Peter Freddolino wrote:
AG> > Hi Anirban,
AG> > namd is generally able to scale efficiently in parallel up to 100-1000
AG> > atoms per processor, depending on your exact system. Trying to run a 45
AG> > particle system on more than one processor is unlikely to give
AG> > significant returns, and certainly running on more than one node (where
AG> > network latency comes into play) is right out. Have you tried
AG> > benchmarking with different numbers of processors?
AG> >
AG> > How is your system so small? Do you not have cg water?
AG> >
AG> > Best,
AG> > Peter
AG> >
AG> > Anirban Ghosh wrote:
AG> > > Hi ALL,
AG> > >
AG> > > I am running a CGMD simulation of a system comprising 45 beads using
AG> > > NAMD. I am using 16 processors (4 nodes) to run the job. Although no other
AG> > > jobs are running on these nodes and CPU usage is only 10-11%, I am still
AG> > > getting the following warning messages related to background load.
AG> > > Because of this the log files are becoming too large and the run-times are
AG> > > increasing exponentially.
AG> > > -----------------------------------------------------------------------------
AG> > > WRITING EXTENDED SYSTEM TO RESTART FILE AT STEP 1000100000
AG> > > OPENING COORDINATE DCD FILE
AG> > > WRITING COORDINATES TO DCD FILE AT STEP 1000100000
AG> > > WRITING COORDINATES TO RESTART FILE AT STEP 1000100000
AG> > > FINISHED WRITING RESTART COORDINATES
AG> > > WRITING VELOCITIES TO RESTART FILE AT STEP 1000100000
AG> > > FINISHED WRITING RESTART VELOCITIES
AG> > > LDB: LOAD: AVG 0.00873701 MAX 0.0428593 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > None
AG> > > Warning: 1 processors are overloaded due to high background load.
AG> > > LDB: LOAD: AVG 0.00873701 MAX 0.0428593 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > Refine
AG> > > LDB: LOAD: AVG 0.00786597 MAX 0.0348988 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > None
AG> > > Warning: 2 processors are overloaded due to high background load.
AG> > > LDB: LOAD: AVG 0.00786597 MAX 0.031147 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > Refine
AG> > > LDB: LOAD: AVG 0.00806749 MAX 0.034482 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > None
AG> > > Warning: 2 processors are overloaded due to high background load.
AG> > > LDB: LOAD: AVG 0.00806749 MAX 0.0327935 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > Refine
AG> > > LDB: LOAD: AVG 0.00822078 MAX 0.0371771 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > None
AG> > > Warning: 1 processors are overloaded due to high background load.
AG> > > LDB: LOAD: AVG 0.00822078 MAX 0.0339828 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > Refine
AG> > > LDB: LOAD: AVG 0.00878559 MAX 0.0362875 MSGS: TOTAL 15 MAXC 1 MAXP 15
AG> > > None
AG> > > -----------------------------------------------------------------------------
AG> > >
AG> > > How can I solve this problem? Any suggestion is appreciated.
AG> > >
AG> > >
AG> > > Regards,
AG> > >
AG> > >
AG>
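
to see how uneven the load in the quoted log is, one could pull the
AVG and MAX numbers out of the "LDB: LOAD:" lines and look at their
ratio. a minimal sketch, assuming the line format is exactly as
pasted above and that AVG and MAX mean what their labels suggest
(the function name is hypothetical):

```python
import re

# Matches the load-balancer summary lines as they appear in the quoted log.
LDB_RE = re.compile(r"LDB: LOAD: AVG (\S+) MAX (\S+)")


def load_imbalance(log_text):
    """Return one (avg, max, max/avg) tuple per LDB summary line found."""
    rows = []
    for match in LDB_RE.finditer(log_text):
        avg, peak = float(match.group(1)), float(match.group(2))
        rows.append((avg, peak, peak / avg))
    return rows


# Two lines copied from the log quoted in this thread.
sample = """\
LDB: LOAD: AVG 0.00873701 MAX 0.0428593 MSGS: TOTAL 15 MAXC 1 MAXP 15
LDB: LOAD: AVG 0.00786597 MAX 0.0348988 MSGS: TOTAL 15 MAXC 1 MAXP 15
"""

if __name__ == "__main__":
    for avg, peak, ratio in load_imbalance(sample):
        print(f"avg={avg:.5f} max={peak:.5f} max/avg={ratio:.1f}")
```

the pattern uses a search rather than a match at line start, so it
also works on lines that still carry the "AG> > >" quote prefix.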

-- 
=======================================================================
Axel Kohlmeyer   akohlmey_at_cmm.chem.upenn.edu   http://www.cmm.upenn.edu
   Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:52:24 CST