Re: namd scale-up

From: Revthi Sanker (revthi.sanker1990_at_gmail.com)
Date: Thu Sep 05 2013 - 03:47:19 CDT

Dear Sir,
This is the script I am using to run NAMD in my cluster:

#!/bin/bash
#@ output = test.out
#@ error = test.err
#@ job_type = MPICH
#@ class = Medium128
#@ node = 8
#@ tasks_per_node = 16
#@ environment = COPY_ALL
#@ queue
Jobid=`echo $LOADL_STEP_ID | cut -f 6 -d .`
tmpdir=$HOME/scratch/job$Jobid
mkdir -p $tmpdir; cd $tmpdir
cat $LOADL_HOSTFILE >xx
cp -R $LOADL_STEP_INITDIR/* $tmpdir
cat $LOADL_HOSTFILE > ./host.list
export LD_LIBRARY_PATH=/sware/openmpi1.6/lib:$LD_LIBRARY_PATH
/sware/openmpi1.6/bin/mpirun --mca btl openib,self -np 128 \
    -hostfile $LOADL_HOSTFILE \
    /sware/NAMD_2.9_Source/Linux-x86_64-g++/namd2 md9.namd \
    > cetp_ana_md9.log
mv ../job$Jobid $LOADL_STEP_INITDIR

Am I failing to include something? Kindly provide your valuable suggestions
in this regard.
Thanks in advance.

Revathi.S
M.S. Research Scholar
Indian Institute Of Technology, Madras
India
_________________________________

On Thu, Sep 5, 2013 at 12:02 PM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

> Hi Revthi,
>
> you should also have mentioned whether you use a NAMD build compiled against
> charm++ or MPI. If charm++, try adding “+idlepoll” to the namd2 command; it
> should additionally improve scaling, sometimes twofold. Furthermore, if you
> have hyperthreading or Magny-Cours processors, try to use half of the cores
> claimed per node and bind the processes to real physical cores only. You can
> use “/proc/cpuinfo” to determine that. “processor” entries with the same
> “physical id” and “core id” are usually the same physical core; these should
> not both be used, as they are bottlenecked by memory or the FPU. Using
> “taskset” on the namd2 command, you can easily control which cores are
> allowed.
>
> Example:
>
> charmrun +p 64 ++nodelist nodelist taskset -c 0,2,4,6 namd2 +idlepoll my.in
>
> If you do not have virtual cores, forget about the above for now, but keep
> it in mind for the future, as it has a large impact.
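
The /proc/cpuinfo check described above can be sketched in shell. This is only an illustration, assuming an x86 Linux node whose /proc/cpuinfo lists "processor", "physical id" and "core id" in that order for each logical CPU; the function name physical_core_cpus is hypothetical and not part of NAMD or charm++.

```shell
#!/bin/bash
# Print one logical CPU per physical core as a comma-separated list that can
# be passed to "taskset -c". Reads cpuinfo-style text from the file in $1
# (default: /proc/cpuinfo). Hypothetical helper, not part of NAMD.
physical_core_cpus() {
    awk -F: '
        {
            # Strip surrounding whitespace from the key and the value.
            gsub(/^[ \t]+|[ \t]+$/, "", $1)
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
        }
        $1 == "processor"   { cpu = $2 }
        $1 == "physical id" { phys = $2 }
        $1 == "core id" {
            key = phys ":" $2
            if (!(key in seen)) {      # first logical CPU seen on this core
                seen[key] = 1
                list = (list == "" ? cpu : list "," cpu)
            }
        }
        END { print list }
    ' "${1:-/proc/cpuinfo}"
}

# Possible use on a compute node, following the example in the message:
#   taskset -c "$(physical_core_cpus)" namd2 +idlepoll my.in
```

If the list contains as many entries as there are "processor" lines, the node exposes no sibling (virtual) cores and the binding step can be skipped.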
>
> Additionally, it is easy to tell how good the scaling is if you just compare
> the speedup to the ideal linear case. Simply divide the time/step on 1 node
> by the time/step on n nodes. This number will usually be <= n. The nearer it
> is to n, the better. Do some benchmarks while increasing the number of
> nodes, and keep in mind that there can be a point of outscaling, where the
> time/step starts rising again. But you do not seem to have hit that point
> yet.
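
The speedup comparison described above is simple arithmetic and can be sketched as two shell helpers. The function names and the timings in the comments are hypothetical; real values would come from the time/step that NAMD reports in its log for each node count.

```shell
#!/bin/bash
# speedup T1 TN
#   T1 = time/step on 1 node, TN = time/step on n nodes (same units).
speedup() {
    awk -v t1="$1" -v tn="$2" 'BEGIN { printf "%.2f\n", t1 / tn }'
}

# efficiency T1 TN N
#   Speedup divided by the node count N; 1.00 is the ideal linear case.
efficiency() {
    awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN { printf "%.2f\n", t1 / tn / n }'
}

# Hypothetical timings: 0.8 s/step on 1 node, 0.125 s/step on 8 nodes.
#   speedup 0.8 0.125        # prints 6.40
#   efficiency 0.8 0.125 8   # prints 0.80
```

Repeating this for 1, 2, 4, 8, ... nodes shows where the efficiency starts to drop and whether adding nodes still pays off.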
>
> So far I think there’s a little more to squeeze out for a 300K-atom system
> doing about 2.5 ns/day.
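
Since the thread mixes ns/day and time/step, a small conversion sketch may help when comparing numbers. The integration timestep is not stated anywhere in this thread; the 2 fs used in the comments is purely an assumption, and the function names are hypothetical.

```shell
#!/bin/bash
# ns_per_day S_PER_STEP TIMESTEP_FS
#   Convert wall-clock seconds per MD step into simulated ns per day.
ns_per_day() {
    awk -v s="$1" -v dt="$2" 'BEGIN { printf "%.2f\n", 86400 / s * dt * 1e-6 }'
}

# s_per_step NS_PER_DAY TIMESTEP_FS
#   Inverse conversion: simulated ns per day into seconds per MD step.
s_per_step() {
    awk -v nsd="$1" -v dt="$2" 'BEGIN { printf "%.4f\n", 86400 * dt * 1e-6 / nsd }'
}

# With an assumed 2 fs timestep (not given in the thread):
#   s_per_step 2.5 2      # prints 0.0691
#   ns_per_day 0.0864 2   # prints 2.00
```

Under that assumption, 2.5 ns/day on 128 cores corresponds to roughly 0.069 s/step, which is the figure to compare against runs on fewer nodes.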
>
> Good luck
>
> Norman Geist.
>
> *From:* owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] *On
> Behalf Of* Axel Kohlmeyer
> *Sent:* Wednesday, 4 September 2013 10:01
> *To:* Revthi Sanker
> *Cc:* namd-l_at_ks.uiuc.edu
> *Subject:* Re: namd-l: namd scale-up
>
> On Wed, Sep 4, 2013 at 9:43 AM, Revthi Sanker <revthi.sanker1990_at_gmail.com>
> wrote:
>
> Dear all,
>
> I am running NAMD on the supercomputing cluster at my institute. My system
> consists of roughly 3 L atoms.
>
> please keep in mind that most people on this mailing list (and in the
> world in general) do not know what a lakh is; it is better to talk about
> 300,000 atoms instead. what would you think if somebody talked to you about
> a system with 2000 gross atoms?
>
> I am aware that the scale-up depends on the configuration of the cluster
> I am currently using. But the people at the computer center would like to
> get a rough estimate of the benchmark (ns/day) for a system of my size. If
> anybody knows the typical yield for this system size, please let me know,
> as I am not sure whether what I am getting currently (*2.5 ns/day* for
> 8 nodes * 16 processors = 128) is optimal or whether it can be tweaked
> further.
>
> the only way to find the optimum is to do a (strong) scaling benchmark,
> i.e. run on different numbers of nodes and plot the resulting speedup. the
> performance depends not only on the hardware (CPU (type, generation, clock
> rate), memory bandwidth, interconnect, BIOS configuration (e.g.
> hyper-threading, turbo boost)), but also on the software (kernel, NAMD
> version, compiler, configuration (SMP, MPI, ibverbs)) and on your system
> and input. so there is no way to tell from the number of atoms in the
> system and the number of nodes/cores alone whether your performance is good
> or bad.
>
> you can compare your numbers (absolute per-CPU-core performance and
> speedup) to other published data from other machines (even if much older)
