Re: namd2 erratic performance over 104 procs

From: Axel Kohlmeyer (
Date: Sat Dec 06 2008 - 03:18:29 CST

On 12/5/08, Zailo Leite <> wrote:
> We're running it on a 64-node cluster, 8 cores per node, infiniband
> interconnect.
> Using the ApoA1 benchmark, things go pretty well until we go over 104
> cores, then not only does it degrade significantly but the scaling
> becomes very chaotic. Any suggestions? something relating to the
> matrix decomposition perhaps?

more likely you have reached the scaling limit when using 8 cores/node.

please try with 7 cores/node or even 6. you have to deal with two problems:
- communication congestion on the infiniband interface (all communication
  is serialized)
- OS jitter (or OS noise): the kernel has to serve other processes, and this
  adds latency (and at the limit of scaling for a classical MD code, latency
  is all that matters). if you keep one processor core "empty", the OS
  interferes less with the ongoing communication.

with more bandwidth-intensive applications, you may have to go down to
4 cores/node to get the maximum performance out of a machine. particularly
with current intel quad-core cpus, you are frequently better off
considering them dual-core cpus with twice the cache...
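for example, one way to launch 7 ranks per node instead of 8 is shown below. this is a sketch assuming an MPI build of namd2 started with Open MPI; the exact flag for process placement varies by MPI implementation and version (e.g. -npernode vs. --map-by ppr:N:node), and apoa1.namd stands in for your benchmark input:

```shell
# sketch: run NAMD on 64 nodes with 7 ranks per node (64 x 7 = 448),
# leaving one core per node free for the OS and the infiniband driver.
# flag names are Open MPI style and may differ on your installation.
mpirun -np 448 -npernode 7 namd2 apoa1.namd > apoa1_7ppn.log
```

compare the ns/day reported in the log against the 8 cores/node run to see whether the freed core helps at your scale.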


> Z
> --
> Zailo Leite
> Sysadmin, IMSS -
> Academic Unix Solutions /
> High Performance Computing Group
> California Institute of Technology
> Phone (626)395-3507 - Cell (626)394-6989

Axel Kohlmeyer
  Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
If you make something idiot-proof, the universe creates a better idiot.

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:50:12 CST