Re: Suggestions while building a GPU-machine (CUDA) for NAMD use!

From: Axel Kohlmeyer
Date: Mon May 27 2013 - 03:57:15 CDT

On Mon, May 27, 2013 at 10:14 AM, Aditya Ranganathan wrote:
> Hello All,
> We are considering investing in a GPU-based machine for running NAMD
> simulations (all-atom). Currently, we are stuck on the choice of card
> for CUDA computing. We already have a GTX 680, which gives us about
> 3 ns/day for a 100000-atom system using a single GPU card and 8 CPU
> cores.
> Now, we are planning to build a GPU machine with 4 GPU cards (either Tesla
> C-2075C, 6GB GDDR5 or the NVIDIA GTX 680). The base system would consist of
> a 6-core Intel Xeon E5 2620 processor, 64GB DDR3 RAM and a 2TB hard drive.
> Has anyone in the community used the Tesla series of cards with NAMD and
> compared its benchmarks (scalability etc.) with an entry-level card like
> the GTX 680? The cost of the Tesla is almost 3 times that of the GTX 680.
> Does its performance justify its price?

GTX 680 is not exactly "entry" level (more upper mid level) and you
can't compare GPUs like that. you basically have different "chip
families" and different "chip generations". GTX 680 is based on the
"Kepler" generation, as are the Tesla K10 and Tesla K20; the C2075,
however, is based on the previous generation, called "Fermi". Now
GeForce cards are usually spec'd rather aggressively, for use in
video games and not for reliability in computing (which doesn't mean
they are unreliable, only that the vendors accept a higher risk in
exchange for lower production costs and higher game performance).

Also, on GeForce cards certain functionality is not available (for
example ECC memory configuration) or only in a very limited way (for
example double precision floating point math). Also, support through
the nvidia-smi utility is limited. On the other hand, Tesla GPUs do
have all of these benefits and also use "certified" and tested
hardware components, often have more RAM, and come with better
warranty deals. All of this, and the fact that they are produced and
sold in smaller quantities, results in higher costs.

So whether the Tesla GPUs are worth the price or not depends on what
you are looking for in a GPU. Classical MD can function very well with
just limited double precision performance, since most of the force
calculation can be done in single precision with only a small loss of
accuracy (and would otherwise similarly be offloaded to SSE and AVX
vector instructions). Also the performance of classical MD is often as
much dominated by memory bandwidth (looking up pairs of particles
through the neighbor lists) as it is by compute performance. the
fastest GeForce type GPUs often outperform the fastest Tesla cards of
the same generation in classical MD due to their higher clocks and
higher memory bandwidth. However, if you would also run applications
that depend on double precision floating point, or prefer lower
risk and better management and are willing to pay the extra price for
that, then the Tesla would be it.
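
to get a feeling for how small the loss of accuracy from single
precision is for a single pair force, here is a minimal sketch in
python. it is only an illustration under stated assumptions: it uses a
plain Lennard-Jones force in reduced units (epsilon = sigma = 1), it
emulates single precision by rounding after every operation, and the
function names are made up for this example. it is not how NAMD
implements its kernels.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lj_force_f64(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones force at distance r, evaluated in double precision."""
    sr2 = (sigma * sigma) / (r * r)
    sr6 = sr2 * sr2 * sr2
    sr12 = sr6 * sr6
    return 24.0 * epsilon * (2.0 * sr12 - sr6) / r


def lj_force_f32(r, epsilon=1.0, sigma=1.0):
    """Same formula, but rounded to single precision after every operation."""
    r, epsilon, sigma = f32(r), f32(epsilon), f32(sigma)
    sr2 = f32(f32(sigma * sigma) / f32(r * r))
    sr6 = f32(f32(sr2 * sr2) * sr2)
    sr12 = f32(sr6 * sr6)
    inner = f32(f32(2.0 * sr12) - sr6)
    return f32(f32(f32(24.0 * epsilon) * inner) / r)


def max_relative_error(distances):
    """Largest relative deviation of the single precision force from the double one."""
    worst = 0.0
    for r in distances:
        exact = lj_force_f64(r)
        approx = lj_force_f32(r)
        worst = max(worst, abs(approx - exact) / abs(exact))
    return worst
```

calling max_relative_error([0.9, 1.0, 1.3, 1.5, 2.0, 2.5]) gives the
worst case over a few sample distances. distances close to the zero
of the force (r = 2^(1/6) sigma) are left out on purpose, since the
relative error is not a meaningful measure there.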

mind you, the Tesla K10 is a special beast in this zoo, since it is
effectively a pimped up GeForce GTX690.

> Any suggestions from the community would be greatly appreciated.

multi-gpu machines are tricky business. you have to pay great
attention to the chipset and how many full width PCI-e slots are
supported. for a 4-GPU machine, you usually need two CPUs and two
southbridges (two GPUs per socket). some boards have only one
southbridge and then support more full width PCI-e slots via PCIe
bridge chips. those add a little latency and - when you use all GPUs
at the same time - the GPUs behind a bridge chip have to share its
bandwidth. since the host to GPU bandwidth affects NAMD performance,
you have to test whether in that case a single 4 GPU machine or two
machines with 2 GPUs each are the better option (probably the latter).
also you should make sure that the CPU memory bandwidth is not
crippled (they come in different speeds).
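
the bandwidth sharing argument can be put into a small back-of-the-
envelope model. this is a hedged sketch only: the topology layout, the
function name, and the 16.0 GB/s figure are placeholders chosen for
illustration (roughly a nominal x16 link), and it assumes that GPUs
behind the same uplink split its bandwidth evenly while they are busy
at the same time. real numbers have to be measured on the actual
board.

```python
def per_gpu_bandwidth(topology, busy):
    """Estimate the host-to-GPU bandwidth available to each busy GPU.

    topology maps an uplink name to a tuple of
    (uplink bandwidth in GB/s, list of GPU ids behind that uplink).
    busy is the set of GPU ids transferring data at the same time.
    """
    result = {}
    for bandwidth, gpus in topology.values():
        active = [g for g in gpus if g in busy]
        for g in active:
            # GPUs behind the same uplink share it evenly (model assumption)
            result[g] = bandwidth / len(active)
    return result


# placeholder figure for one full width link, not a measured value
X16_GBS = 16.0

# four GPUs, each with its own full width link to the host
dedicated = {f"slot{i}": (X16_GBS, [i]) for i in range(4)}

# four GPUs, two behind each PCIe bridge chip
bridged = {
    "bridge0": (X16_GBS, [0, 1]),
    "bridge1": (X16_GBS, [2, 3]),
}
```

in this model, per_gpu_bandwidth(bridged, {0, 1, 2, 3}) assigns each
GPU half of the uplink, while per_gpu_bandwidth(dedicated, {0, 1, 2, 3})
gives each GPU the full link. the model ignores the added latency of
the bridge chips, so it is at best a first estimate.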

in short, there is no clear-cut answer. many things depend on what
*else* you want to do with the machine and there are many personal
opinions that people do not 100% agree upon. if you simply ask whether
the performance of a tesla is worth 3x the price (or more in the case
of a K20), my personal opinion is "not at all", but i might still buy
one, in case i come across an application and workflow that benefits
from it.
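
the price question can be turned into a simple break-even number. the
sketch below only uses the figures quoted in this thread (about
3 ns/day on the GTX 680, and a price ratio of about 3); the function
names are made up for illustration, and it deliberately ignores the
cost of the rest of the machine, power, and warranty.

```python
def break_even_throughput(baseline_ns_per_day, price_ratio):
    """Throughput the pricier card needs to match the baseline's ns/day per unit price."""
    return baseline_ns_per_day * price_ratio


def throughput_per_price(ns_per_day, relative_price):
    """ns/day per unit of relative price (baseline card has relative price 1.0)."""
    return ns_per_day / relative_price
```

with these numbers, break_even_throughput(3.0, 3.0) is 9.0, i.e. a card
at 3x the price would have to reach about 9 ns/day on the same system
to be equally cost effective on throughput alone.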

> Regards
> Srivastav Ranganathan
> Research Scholar
> IIT Bombay,
> Mumbai, India

Dr. Axel Kohlmeyer
International Centre for Theoretical Physics, Trieste, Italy.
