Re: Scaling on 64 core nodes?

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Fri Feb 15 2013 - 01:43:58 CST

Next message: Nicholas M Glykos: "Re: confines volume effects"
Previous message: Robert Elder: "Re: Replica-exchange metadynamics scripts"
In reply to: Rebecca Swett: "Scaling on 64 core nodes?"

Hi Rebecca,

what prevents you from benchmarking your cluster yourself? I don't think
anybody can estimate the scaling of a cluster without any information about
the underlying hardware. NAMD usually scales very well, almost linearly.
Depending on your interconnect and the size of the molecular system, scaling
will break down at a higher or lower core count. The best way to find out
how well a given system size scales is to just try it. Since scaling usually
breaks down because of memory bandwidth (within one node), network bandwidth
and, most importantly, network latency, it is possible to adapt NAMD's
parallelization at runtime to better fit the hardware in use. Testing
different ratios of threads (SMP) to processes (Charm++/MPI), in order to
reduce the number of communicating endpoints, gives you some additional
options for dealing with a large-scale job. Additionally, NAMD has tuning
parameters such as twoAwayX that can help if the system is too small to
scale well.
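
To make the thread/process ratios concrete, here is a minimal Python sketch
that lists the ways one node's cores can be split into processes and threads.
The function name smp_layouts and the reserve_comm_thread flag are
hypothetical helpers for illustration, not part of NAMD or Charm++. The flag
models builds that dedicate one thread per process to communication, which is
an assumption you should check against your own build.

```python
def smp_layouts(cores_per_node, reserve_comm_thread=False):
    """Return (processes, worker_threads) pairs that fill one node.

    Only even splits are listed: processes * threads == cores_per_node.
    If reserve_comm_thread is True, one thread per process is set aside
    for communication (an assumption about the build, not a NAMD rule).
    """
    layouts = []
    for procs in range(1, cores_per_node + 1):
        if cores_per_node % procs:
            continue  # skip splits that would leave cores unused
        threads = cores_per_node // procs
        if reserve_comm_thread:
            threads -= 1
        if threads >= 1:
            layouts.append((procs, threads))
    return layouts


if __name__ == "__main__":
    # Candidate layouts to benchmark on a 64-core node.
    for procs, threads in smp_layouts(64):
        print(f"{procs:2d} processes x {threads:2d} threads")
```

Each pair is one candidate to benchmark. Fewer processes with more threads
means fewer communicating endpoints per node.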

The bottom line so far: run some benchmarks for several system sizes and
keep doubling the number of processors until you get no additional speedup.
Then try to improve scaling further with the NAMD tuning parameters and
various SMP/MPI configurations. Usually, one process per physical processor,
with threads to utilize all of its cores, is a good way to coalesce
communication. But this differs in almost every environment.
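
The doubling procedure above can be evaluated with a few lines of Python.
This is only a sketch: the function names, the min_gain threshold and the
timings in EXAMPLE_TIMINGS are made up for illustration and are not
measurements from any real cluster. Replace them with the seconds-per-step
values from your own benchmark runs.

```python
def scaling_table(timings):
    """Return (cores, speedup, efficiency) rows from {cores: seconds_per_step}.

    Speedup and efficiency are relative to the smallest core count measured.
    """
    cores = sorted(timings)
    base_cores, base_time = cores[0], timings[cores[0]]
    rows = []
    for n in cores:
        speedup = base_time / timings[n]
        efficiency = speedup / (n / base_cores)
        rows.append((n, speedup, efficiency))
    return rows


def last_useful_core_count(timings, min_gain=1.1):
    """Return the largest core count reached before a step gains < min_gain.

    min_gain=1.1 means "the next step must be at least 10% faster";
    the threshold is an arbitrary assumption, pick your own.
    """
    cores = sorted(timings)
    best = cores[0]
    for prev, cur in zip(cores, cores[1:]):
        if timings[prev] / timings[cur] < min_gain:
            break  # more cores no longer pay off
        best = cur
    return best


# Hypothetical placeholder numbers (seconds per step), NOT real measurements.
EXAMPLE_TIMINGS = {64: 0.080, 128: 0.042, 256: 0.024, 512: 0.023, 1024: 0.025}

if __name__ == "__main__":
    for n, speedup, eff in scaling_table(EXAMPLE_TIMINGS):
        print(f"{n:5d} cores  speedup {speedup:5.2f}  efficiency {eff:4.2f}")
    print("stop at", last_useful_core_count(EXAMPLE_TIMINGS), "cores")
```

Run this once per system size, since a small system will stop scaling
earlier than a large one.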

As an example, our old CPU cluster (32x Fujitsu Primergy RX220, each with
2x Opteron 270 dual-core 1.95 GHz = 128 cores) scales very satisfactorily
with NAMD over only 1 Gbit/s Ethernet. But these CPUs are old and slow.
Remember, the more computing power per node, the greater the need for a
low-latency/high-bandwidth network (or, within one node, for memory
bandwidth), because the nodes solve their parts of the problem so quickly
that most of the time is spent communicating results and fetching new tasks.
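
The point about fast nodes needing a fast network can be illustrated with a
deliberately simple toy model: per-step time is compute time plus a fixed
communication time. This is an assumption made for illustration, not how
NAMD actually behaves, and the name comm_fraction and all numbers are
hypothetical.

```python
def comm_fraction(work, nodes, node_speed, comm_time):
    """Fraction of each step spent communicating in a toy model.

    Toy model (an assumption): compute time = work / (nodes * node_speed),
    and comm_time is a fixed per-step cost set by the network.
    """
    compute_time = work / (nodes * node_speed)
    return comm_time / (compute_time + comm_time)


if __name__ == "__main__":
    # Same work, same network; only the per-node speed changes.
    for speed in (1.0, 2.0, 5.0, 10.0):
        frac = comm_fraction(work=100.0, nodes=10, node_speed=speed, comm_time=1.0)
        print(f"node speed {speed:4.1f}x -> {frac:.0%} of step time in communication")
```

With the network held fixed, the communication share grows as the nodes get
faster, which is why slow CPUs can get by with plain Ethernet.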

Hope that helps

Norman Geist.

> -----Original Message-----
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> Behalf Of Rebecca Swett
> Sent: Thursday, 14 February 2013 15:38
> To: namd-l_at_ks.uiuc.edu
> Subject: namd-l: Scaling on 64 core nodes?
>
> Hi all, I'm working on setting up some guidelines for a local cluster
> and would like any advice you have for running on 64 CPU nodes? I have a
> cluster of 80 available but most of the old scaling documentation isn't
> really designed for anything higher than an 8 core node. Any suggestions
> on how to best utilize these resources?
>
> --
> Rebecca Swett
> Wayne State University
> 357 Chemistry
> Detroit, MI 48201
>
> Lab Phone 313-577-0552
> Cell Phone 906-235-0768

