Re: NAMD benchmark results for dual-Nehalem, was Re: questions regarding new single node box for NAMD: re components, CUDA, Fermi

From: Biff Forbush
Date: Thu Dec 10 2009 - 19:57:02 CST

Hi Dow,
    Thanks for your data. I think it is very helpful for the community
to have benchmark updates with newer processors.
    As I discussed earlier, my objective was to see what could be
done with a single box, with no InfiniBand etc., but this is interesting.

Dow Hurst wrote:
> Biff,
> We recently put together a cluster designed to run NAMD primarily on
> CPUs but with an eye to upgrading to GPUs if the performance panned
> out. The cluster has dual quad core 2.5GHz Xeon 5420 cpus connected
> with QLogic Infinipath QLE7280 DDR cards and a 96 port QLogic 9080
> Silverstorm switch. The nodes have a PCI-Express gen2 16x slot for
> the Infinipath card to maximize bandwidth and lower latency. Leaving
> one core free to manage the interconnect really helps, and moving
> the PME management onto a separate core gave a further
> improvement. The sweet spot for this simulation is 350 compute cores
> using 7 cores per node, not eight, and one extra core to manage the
> PME. I'm using the "twoAwayX yes" option to bump up the number of
> patches, and the "ldbUnloadZero yes" option to offload the PME in our
> NAMD config file. We've tested the CPU and IB interconnects with NAMD
> 2.6 and have achieved 15.5 ns/day on 351 cores on a 95,874 atom
> simulation. (I apologize for not having apoa1 numbers!) NAMD reported
> a benchmark of 0.00548014 (seconds)/step or 1.96 cpu s/step for this
> run. What we've found is that slower CPUs with lots of onboard cache
> combined with a fast interconnect perform very well when scaling up
> the number of nodes in the calculation.
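
For anyone comparing these figures against their own runs, here is a rough sketch of how a wall-clock s/step benchmark converts to ns/day. The integration timestep is not stated in the message above, so the 1 fs used below is purely an assumption, and the function name is made up for illustration. Under that assumption the quoted 0.00548014 s/step works out to roughly 15.8 ns/day, which is in the neighborhood of the 15.5 ns/day reported.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000  # 1 ns = 1e6 fs


def ns_per_day(wall_s_per_step, timestep_fs):
    """Convert a wall-clock benchmark (seconds per MD step) to ns/day.

    timestep_fs is the integration timestep in femtoseconds. It is an
    input you must supply; it is NOT given in the quoted benchmark.
    """
    steps_per_day = SECONDS_PER_DAY / wall_s_per_step
    return steps_per_day * timestep_fs / FS_PER_NS


if __name__ == "__main__":
    # 0.00548014 s/step is the figure quoted above; 1.0 fs is an assumption.
    print(f"{ns_per_day(0.00548014, 1.0):.2f} ns/day at an assumed 1 fs timestep")
```

With a 2 fs timestep the same s/step figure would give twice the ns/day, so the timestep needs to be known before comparing throughput between systems.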
