Re: no. of CPUs for optimal GTX-690 performance

From: Aron Broom
Date: Tue Nov 06 2012 - 13:44:06 CST

Normally I'd point out the whole business about GPU bandwidth being a major
culprit in poor performance with NAMD since not everything is done on the
GPU, but that is one hell of a motherboard you're looking at! I'd really
love to see what the benchmarks look like on that thing with your GTX-690.

With dual M2070s (equivalent of two GTX-570s, so roughly a GTX-590) and two
6-core Xeons, here is how my performance scaled with the number of CPU
cores. This is with a 100k atom system, explicit solvent, 1, 2, 4 fs
multiple timestepping (I didn't really know what I was doing initially, so
that part is a bit odd), Langevin thermostat and barostat, PME 1.0 grid:

2 CPU: 0.79 days/ns
4 CPU: 0.47 days/ns
6 CPU: 0.39 days/ns
8 CPU: 0.39 days/ns
10 CPU: 0.40 days/ns
12 CPU: 0.39 days/ns

So as you can see, performance stopped improving beyond 6 cores, which
suggests I was limited by GPU memory bandwidth at that point (or maybe by
some kind of communication between the two CPUs?). I guess the only
conclusion you can draw from this is: for sure go with more than 6 cores,
but you were already doing that.
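
To make the plateau easier to see, here is a minimal Python sketch that
turns the table above into ns/day, speedup, and per-core efficiency relative
to the 2 CPU run. The benchmark numbers are copied from the table; the
function names and the 5% tolerance used to decide where the plateau starts
are assumptions made for illustration only, not anything NAMD reports.

```python
# Benchmark numbers copied from the table above (days/ns, lower is better).
days_per_ns = {2: 0.79, 4: 0.47, 6: 0.39, 8: 0.39, 10: 0.40, 12: 0.39}


def scaling_table(results):
    """Return (cores, ns_per_day, speedup, efficiency) rows.

    Speedup and efficiency are measured relative to the smallest core count.
    """
    base_cores = min(results)
    base_time = results[base_cores]
    rows = []
    for cores in sorted(results):
        speedup = base_time / results[cores]
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, 1.0 / results[cores], speedup, efficiency))
    return rows


def plateau_cores(results, tolerance=0.05):
    """Smallest core count whose time is within `tolerance` of the best time.

    The 5% default is an arbitrary, hypothetical threshold.
    """
    best = min(results.values())
    for cores in sorted(results):
        if results[cores] <= best * (1.0 + tolerance):
            return cores


for cores, ns_day, speedup, eff in scaling_table(days_per_ns):
    print(f"{cores:2d} CPU: {ns_day:5.2f} ns/day  "
          f"speedup {speedup:4.2f}x  efficiency {eff:4.2f}")
print("plateau starts at", plateau_cores(days_per_ns), "cores")
```

With these numbers the plateau comes out at 6 cores, which matches the
reading above. Running the same comparison on your own benchmark lines would
tell you where extra cores stop paying off with the GTX-690.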

One other thing I can add, which maybe doesn't matter because your
motherboard may lock you into certain CPUs: the hyperthreading on the Intel
cores is valueless for NAMD calculations. From my limited testing between
a few different systems, AMD Thuban 6-core CPUs gave equivalent performance
to Xeon 6-core, but at a fraction of the cost.


I think you should investigate heavily the memory bandwidth issues
associated with this. There are other posts on the mailing list, but
essentially you are probably being limited not by your CPU performance, but
by the time it takes the GPU to communicate back to the CPU.

A quick test would be to compare the workload on the CPUs when running a
simulation with the GPU vs. one where only the CPUs are running (I'm not
sure what the best way to examine the workload would actually be, possibly

Anyway, my point is, while putting in some dual 6-core CPUs will certainly
give some kind of boost, you want to make sure you've got a good

On Tue, Nov 6, 2012 at 2:01 PM, Michael Purdy <> wrote:

> Hello, I am running NAMD simulations (multicore-CUDA) on a ThinkPad with
> dual Core i7-2760QM CPUs and a Quadro 2000M running Debian. For a 150k atom
> system I get performance like this:
> Benchmark time: 4 CPUs 0.287062 s/step 1.66124 days/ns 387.641 MB memory
> Benchmark time: 7 CPUs 0.289229 s/step 1.67378 days/ns 428.574 MB memory
> Things are going well so we purchased a GTX-690 which we installed in a
> workstation with two dual core Opterons, which is evidently far short of
> the CPU cores we need to get the most out of the 2 GPUs and 3072 CUDA cores.
> Performance was just slightly better than the ThinkPad:
> Benchmark time: 4 CPUs ~0.2 s/step ~1.4 days/ns
> We would like to build a new workstation to get the most out of the
> GTX-690 and I'd like to know how many CPU cores we need. I'm considering
> two Core i7-3930k (6-core/12-thread) or two Xeon E5-2650
> (8-core/16-thread). Will either of these be a good match for the GTX-690 or
> will I still be running short on CPUs? The current plan is to build
> this on an Asus Z9PE-D8 WS board.
> Michael
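
For comparing the quoted benchmark lines with the days/ns table further up,
the s/step and days/ns figures are related by a simple conversion. The
sketch below assumes a 2 fs timestep, which the quoted message does not
state; it is inferred only because it reproduces the two ThinkPad figures.
The function name is made up for illustration.

```python
SECONDS_PER_DAY = 86400.0


def to_days_per_ns(seconds_per_step, timestep_fs=2.0):
    """Convert wall-clock seconds per step into days per simulated ns.

    timestep_fs defaults to 2 fs, which is an assumption rather than
    something stated in the quoted message.
    """
    steps_per_ns = 1.0e6 / timestep_fs  # 1 ns = 1e6 fs
    return seconds_per_step * steps_per_ns / SECONDS_PER_DAY


# The two ThinkPad lines from the quoted message.
for s_per_step in (0.287062, 0.289229):
    print(f"{s_per_step} s/step -> {to_days_per_ns(s_per_step):.5f} days/ns")
```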

Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
