Re: NAMD SGE Job Script Scale Up issues!

From: Aditya Ranganathan (aditya.sia_at_gmail.com)
Date: Fri Oct 19 2012 - 02:29:07 CDT

Hi,

The system I'm simulating contains 68,000 atoms. Also, I was looking at
the timings in the NAMD output when I said the simulation runs slower if
I use anything beyond 32 processes (running on 16 physical processor
cores). Please tell me whether the attached script is flexible enough to
work with however many processors are available.

Regards

On Fri, Oct 19, 2012 at 11:11 AM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

> Hi,
>
> As with every other parallel program, NAMD doesn't care about physical
> versus virtual cores. If you tell your SGE that your nodes have 16 slots
> and you choose to run a job with 64 slots, SGE will generate a nodelist
> representing this number of slots, i.e. 4 nodes. That means NAMD is told
> to use these nodes and to spawn 64 processes; if it does, that's the
> expected behavior.
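>
> For illustration (hostnames here are hypothetical): for a 64-slot job on
> four 16-slot nodes, the nodelist built from the SGE host file would look
> like this sketch:
>
> group main
> host compute-0-0
> host compute-0-0
> ... (16 "host compute-0-0" lines in total, then 16 lines each for
> compute-0-1, compute-0-2 and compute-0-3)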
>
> Another thing is that you basically can't expect a speedup from HT. HT
> is a hardware strategy to improve multitasking by introducing a second
> instruction pipeline per physical core. This makes it possible to fill
> the waiting times of some processes with other running processes. But as
> NAMD is well-performing code that doesn't leave much idle time on the
> cores, in most cases you won't see a gain from it.
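>
> A quick way to check logical vs. physical core counts on a node
> (standard Linux, just a sketch):
>
> grep -c '^processor' /proc/cpuinfo    # logical cores (including HT)
> grep -m1 'cpu cores' /proc/cpuinfo    # physical cores per socket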
>
> Also keep in mind that when running across multiple nodes, every process
> needs to communicate with all other processes. So if you start twice the
> number of processes per node, on top of the more or less negative impact
> on performance from oversubscribing memory and the additional parallel
> overhead, the communication requirement is also more than doubled:
>
> Example:
>
> 1. 2 nodes, 32 processes -> every process needs to talk to 31 other
> processes. With 16 processes per node, each opens 31 active connections,
> 16 of which go to the other host. So your network is loaded with
> 32 * 16 / 2 = 256 active inter-node connections.
>
> 2. 4 nodes, 64 processes -> every process needs to talk to 63 other
> processes. With 16 processes per node, each opens 63 active connections,
> 48 of which go to the other hosts. So your network is loaded with
> 64 * 48 / 2 = 1536 active inter-node connections.
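>
> In general, N nodes with P processes each give N*P processes that each
> hold (N-1)*P off-node connections; counting each pair once, this shell
> arithmetic reproduces the two cases:
>
> N=2 P=16; echo $(( N*P * (N-1)*P / 2 ))    # 2 nodes:  256 connections
> N=4 P=16; echo $(( N*P * (N-1)*P / 2 ))    # 4 nodes: 1536 connections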
>
> So first of all, try not to use the HT cores (turn HT off in the BIOS or
> halve the slots of the machines in the SGE), then look at parallel
> scaling again.
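>
> A sketch of halving the slots from the head node (this assumes the
> Rocks/SGE default queue name all.q; check yours with qconf -sql):
>
> qconf -mattr queue slots 8 all.q    # 8 slots = physical cores per node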
>
> Usually NAMD scales quite well even over Gigabit Ethernet. So if this
> doesn't make a difference, we should check your network setup.
>
> To give reliable advice, we should also know what molecular system you
> benchmarked (number of atoms, for instance) and what kind of network you
> have. Also, for benchmarking, don't look at the CPU utilization; watch
> the timings in the NAMD output, as that's what counts.
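>
> For instance, with the log name from your script, the relevant lines can
> be pulled out like this:
>
> grep "Benchmark time" p53onGATI.log    # s/step and days/ns estimates
> grep "TIMING:" p53onGATI.log           # periodic per-step wall times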
>
> Good luck
>
> Norman Geist.
>
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> behalf of Aditya Ranganathan
> Sent: Thursday, October 18, 2012 15:05
> To: NAMD list; namd_at_ks.uiuc.edu
> Subject: namd-l: NAMD SGE Job Script Scale Up issues!
>
> Hello,
>
> We installed a Linux Xeon processor based cluster. Its specifications
> are as follows: a master node plus 8 compute nodes, where each node has
> 8 physical processor cores (16 hyperthreaded cores). So in total there
> are 64 physical processor cores and 128 virtual cores available for
> computing.
>
> I tried submitting a NAMD job via SGE using the following script (ours
> is a Rocks 5.4 based cluster):
>
> #$ -cwd
> #$ -j y
> #$ -S /bin/bash
>
> # Build a charmrun nodelist from the SGE host file: one "host <name>"
> # line per slot that SGE allocated on each node.
> nodefile=$TMPDIR/namd2.nodelist
> echo group main > $nodefile
> awk '{ for (i=0;i<$2;++i) {print "host",$1} }' $PE_HOSTFILE >> $nodefile
>
> # Launch namd2 over ssh on all $NSLOTS granted slots.
> dir=/home/moldyn/sim/namd
> $dir/charmrun ++remote-shell ssh ++nodelist $nodefile +p$NSLOTS \
>     $dir/namd2 $dir/p53_equi.conf > $dir/p53onGATI.log
>
>
> Now, when I submit the job with the command: qsub -pe mpich 32 namd.sh
>
> The job generates 32 processes on 2 nodes with CPU utilization around
> 70%.

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:22:10 CST