Re: namd scale-up

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Thu Sep 05 2013 - 07:58:23 CDT

Give us the output of "cat /proc/cpuinfo" from one of the compute nodes. As
you are using openmpi, you already have the idlepoll behavior enabled by
default.
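
For a quick first look yourself, a minimal sketch (it just pulls the
relevant fields out of /proc/cpuinfo) would be:

grep -E "^(processor|physical id|core id)" /proc/cpuinfo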

Norman Geist.

> -----Original Message-----
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> Behalf Of Revthi Sanker
> Sent: Thursday, September 5, 2013 10:47
> To: Norman Geist
> Cc: Axel Kohlmeyer; Namd Mailing List
> Subject: Re: namd-l: namd scale-up
>
> Dear Sir,
> This is the script I am using to run NAMD on my cluster:
>
> #!/bin/bash
> #@ output = test.out
> #@ error = test.err
> #@ job_type = MPICH
> #@ class = Medium128
> #@ node = 8
> #@ tasks_per_node = 16
> #@ environment = COPY_ALL
> #@ queue
> # extract the numeric job id from the LoadLeveler step id
> Jobid=`echo $LOADL_STEP_ID | cut -f 6 -d .`
> # stage the job into a per-job scratch directory
> tmpdir=$HOME/scratch/job$Jobid
> mkdir -p $tmpdir; cd $tmpdir
> cat $LOADL_HOSTFILE > xx
> cp -R $LOADL_STEP_INITDIR/* $tmpdir
> cat $LOADL_HOSTFILE > ./host.list
> export LD_LIBRARY_PATH=/sware/openmpi1.6/lib:$LD_LIBRARY_PATH
> # run NAMD with 128 MPI ranks over InfiniBand (openib BTL)
> /sware/openmpi1.6/bin/mpirun --mca btl openib,self -np 128 \
>     -hostfile $LOADL_HOSTFILE \
>     /sware/NAMD_2.9_Source/Linux-x86_64-g++/namd2 md9.namd \
>     > cetp_ana_md9.log
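> # (aside, an untested suggestion: OpenMPI 1.6's mpirun also accepts
> # "--bind-to-core", which pins each rank to a physical core, the same
> # idea as the taskset advice quoted below)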
> mv ../job$Jobid $LOADL_STEP_INITDIR
>
> Am I failing to include something? Kindly provide your valuable
> suggestions in this regard.
> Thanks in advance.
>
> Revathi.S
> M.S. Research Scholar
> Indian Institute Of Technology, Madras
> India
> _________________________________
>
>
> On Thu, Sep 5, 2013 at 12:02 PM, Norman Geist <
> norman.geist_at_uni-greifswald.de> wrote:
>
> > Hi Revthi,
> >
> > you should also have mentioned whether you use a NAMD compiled against
> > charm++ or against MPI. If charm++, try adding "+idlepoll" to the
> > namd2 command; it should additionally improve scaling, sometimes
> > two-fold. Furthermore, if you have Hyper-Threading or AMD Magny-Cours
> > CPUs, try using only half of the cores claimed per node and bind the
> > processes to real physical cores only. You can use "/proc/cpuinfo" to
> > determine that: "processor" entries with the same "physical id" and
> > "core id" usually map to the same physical core, and these duplicates
> > should not be used, as they are bottlenecked by shared memory or FPU
> > resources. Using "taskset" on the namd2 command, you can easily
> > control which cores are allowed.
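> >
> > As a rough sketch (assuming the usual /proc/cpuinfo field names and
> > their usual ordering), this prints one "processor" number per physical
> > core, which you can then hand to taskset -c:
> >
> > awk -F: '/^processor/{n=$2} /^physical id/{p=$2} /^core id/{if (!seen[p","$2]++) print n}' /proc/cpuinfo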
> >
> > Example:
> >
> > charmrun +p 64 ++nodelist nodelist taskset -c 0,2,4,6 namd2 +idlepoll my.in
> >
> > If you do not have virtual cores, forget about the above for now, but
> > keep it in mind for the future, as it can have a large impact.
> >
> > Additionally, it is easy to judge how well a run scales if you compare
> > the speedup to the ideal linear case: simply divide the time/step on
> > one node by the time/step on n nodes. This number will usually be
> > <= n; the closer it is to n, the better. Do some benchmarks with an
> > increasing number of nodes, and keep in mind that there can be a point
> > beyond which scaling breaks down and the time/step starts rising
> > again. But you do not seem to have hit that case yet.
> >
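> > As a quick worked example (with made-up numbers): if one node needs
> > 0.08 s/step and 8 nodes need 0.0125 s/step, the speedup is
> > 0.08 / 0.0125 = 6.4 out of an ideal 8, i.e. 80% parallel efficiency.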
> >
> > So far, I think there is a little more to squeeze out of a 300k-atom
> > system doing about 2.5 ns/day.
> >
> > Good luck
> >
> > Norman Geist.
> >
> > From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> > Behalf Of Axel Kohlmeyer
> > Sent: Wednesday, September 4, 2013 10:01
> > To: Revthi Sanker
> > Cc: namd-l_at_ks.uiuc.edu
> > Subject: Re: namd-l: namd scale-up
> >
> > On Wed, Sep 4, 2013 at 9:43 AM, Revthi Sanker
> > <revthi.sanker1990_at_gmail.com> wrote:
> >
> > Dear all,
> >
> > I am running NAMD on the super cluster at my institute. My system
> > consists of 3 L atoms roughly.
> >
> > please keep in mind that most people on this mailing list (and in the
> > world in general) do not know what a lakh is; better to talk about
> > 300,000 atoms instead. what would you think if somebody talked to you
> > about a system with 2000 gross atoms?
> >
> > I am aware that the scale-up depends on the configuration of the
> > cluster I am currently using. But the people at the computer center
> > would like a rough estimate of the benchmark (ns/day) for a system of
> > my size. If anybody knows the expected performance for this system
> > size, please let me know, as I am not sure whether what I am currently
> > getting (2.5 ns/day on 8 nodes x 16 processors = 128 cores) is optimal
> > or whether it can be tweaked further.
> >
> > the only way to find out the optimum is by doing a (strong) scaling
> > benchmark, i.e. use different numbers of nodes and plot the resulting
> > speedup. the performance depends not only on the hardware (CPU type,
> > generation, and clock rate; memory bandwidth; interconnect; BIOS
> > configuration, e.g. hyper-threading and turbo boost), but also on the
> > software (kernel, NAMD version, compiler, configuration (SMP, MPI,
> > ibverbs)) and on your system and input. so there is no way to tell
> > from the number of atoms in the system and the number of nodes/cores
> > alone whether you have good performance or bad performance.
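> >
> > as a rough sketch of such a sweep (assuming one NAMD log file per node
> > count, with hypothetical names like run_1nodes.log, and the usual
> > "Info: Benchmark time:" lines that NAMD prints early in a run), you
> > could collect the numbers like this:
> >
> > for n in 1 2 4 8; do
> >   grep "Benchmark time" run_${n}nodes.log | tail -1
> > done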
> >
> > you can compare your numbers (absolute per-CPU-core performance and
> > speedup) to other published data from other machines (even if much
> > older).
