Re: Suggestions while building a GPU-machine (CUDA) for NAMD use!

From: Aron Broom (broomsday_at_gmail.com)
Date: Mon May 27 2013 - 08:09:44 CDT

As Axel suggested, in terms of raw performance the C-2075 will be about
the same as a GTX 480 in most cases. So from a pure performance standpoint
the GTX 680 will generally be better.

I'm not sure how much a C-2075 costs currently, but if, as you say, you are
getting a PCIe 3.0 board, why not buy a Titan? You'll have even better
performance than the 680 and huge memory (6 GB). Of course the memory
quality issues compared to a K20X that Axel brought up still exist, but if
performance is your only concern...
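
If you want to compare cards yourself, a quick check (just a sketch; the
core count, device index and file paths are assumptions for your setup) is
to run the standard ApoA1 benchmark on each GPU with a CUDA build of NAMD:

  # CUDA (multicore) build of NAMD; ApoA1 benchmark inputs from the NAMD site
  ./namd2 +p8 +idlepoll +devices 0 apoa1/apoa1.namd > gtx680.log
  grep "Benchmark time" gtx680.log

The "Benchmark time:" lines report s/step and days/ns, which makes the
per-card comparison straightforward.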

~Aron

On Mon, May 27, 2013 at 5:58 AM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:

> On Mon, May 27, 2013 at 11:35 AM, Aditya Ranganathan
> <aditya.sia_at_gmail.com> wrote:
> > @Francesco, we are planning to buy a PCI-Express 3.0 supported board.
> > @Axel: Thanks, Axel, for the comprehensive walkthrough on this issue. We
> > aim to build this machine solely for performing classical MD simulations
> > using NAMD. Reliability and scaling issues of a GeForce card like the
> > GTX 680 were what our computer vendor cited as possible disadvantages
> > while he was suggesting the Tesla as an option.
>
> > I haven't been able to find any clear benchmarks for the Tesla C-2075
> > as of yet. Most of the benchmarks seem to revolve around the Kepler
> > series of cards. If anyone is aware of any, please point me to NAMD
> > benchmarks on the Tesla C-2075.
>
> look for benchmarks of the C-2050. the C-2075 is a tad faster. they are
> two different revisions of the fermi chip. the difference is similar to
> what a GTX 480 is to a GTX 580 (which are the corresponding consumer
> models). mind you, with the fermi generation, the consumer cards were
> more similar to the tesla cards than they are now. only the GeForce
> TITAN has a similar (or better?) relationship to the tesla K20. even
> the recently released GTX 780 has been deliberately "crippled" to
> massively reduce double precision floating point performance.
>
> the problem with the C-2075 is that it uses an already outdated
> architecture (with GPUs, architectures change fast), which is about as
> different from a kepler chip as an intel pentium 4 is from a current
> (ivy bridge) intel i7 cpu.
>
> axel.
>
> >
> >
> >
> > On Mon, May 27, 2013 at 2:27 PM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:
> >>
> >> On Mon, May 27, 2013 at 10:14 AM, Aditya Ranganathan
> >> <aditya.sia_at_gmail.com> wrote:
> >> > Hello All,
> >> >
> >> > We are pondering investing in a GPU-based machine for running NAMD
> >> > simulations (all-atom). Currently, we are stuck with a dilemma over
> >> > the choice of card for CUDA computing. We already have a GTX 680,
> >> > which gives us about 3 ns/day for a 100,000-atom system using a
> >> > single GPU card and 8 CPU cores.
> >> >
> >> > Now, we are planning to build a GPU machine with 4 GPU cards (either
> >> > the Tesla C-2075, 6 GB GDDR5, or the NVIDIA GTX 680). The base
> >> > system would consist of a 6-core Intel Xeon E5-2620 processor, 64 GB
> >> > DDR3 RAM and a 2 TB hard drive.
> >> >
> >> > Has anyone in the community used the Tesla series of cards with NAMD
> >> > and compared its benchmarks (scalability etc.) with an entry-level
> >> > card like the GTX 680? The cost of the Tesla is almost 3 times that
> >> > of the GTX 680. Does its performance justify its price?
> >>
> >> GTX 680 is not exactly "entry" level (more upper mid-level), and you
> >> can't compare GPUs like that. You basically have different "chip
> >> families" and different "chip generations". The GTX 680 is based on
> >> the "Kepler" generation, as are the Tesla K10 and Tesla K20; the
> >> C-2075, however, is based on the previous generation called "Fermi".
> >> Now GeForce cards are usually spec'd rather aggressively for use in
> >> video games and not for reliability in computing (which doesn't mean
> >> they are unreliable, only that the vendors take a higher risk to lower
> >> production costs and raise game performance).
> >>
> >> Also, on GeForce cards certain functionality is not available (for
> >> example, ECC memory configuration) or only available in a very limited
> >> way (for example, double precision floating point math). Also, support
> >> through the nvidia-smi utility is limited. On the other hand, Tesla
> >> GPUs do have all of these benefits, use "certified" and tested
> >> hardware components, often have more RAM, and come with better
> >> warranty deals. All of this, and the fact that they are produced and
> >> sold in smaller quantities, results in higher costs.
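> >>
> >> As a concrete illustration (a sketch; the device index and root
> >> privileges are assumptions for your setup), on a Tesla you can query
> >> and toggle ECC through nvidia-smi, while a GeForce card will simply
> >> report it as unsupported:
> >>
> >>   nvidia-smi -q -d ECC   # show current/pending ECC mode and error counts
> >>   nvidia-smi -i 0 -e 1   # enable ECC on GPU 0 (needs root and a reboot)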
> >>
> >> So whether the Tesla GPUs are worth the price or not depends on what
> >> you are looking for in a GPU. Classical MD can function very well with
> >> just limited double precision performance, since most of the force
> >> calculation can be done in single precision with only a small loss of
> >> accuracy (and would otherwise similarly be offloaded to SSE and AVX
> >> vector instructions). Also, the performance of classical MD is often
> >> as much dominated by memory bandwidth (looking up pairs of particles
> >> through the neighbor lists) as by compute performance. The fastest
> >> GeForce-type GPUs often outperform the fastest Tesla cards of the same
> >> generation in classical MD due to their higher clocks and higher
> >> memory bandwidth. However, if you would also run applications that
> >> depend on double precision floating point, or prefer lower risk and
> >> better management and are willing to pay the extra price, then the
> >> Tesla would be the choice.
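> >>
> >> To make the mixed-precision point concrete, here is a toy sketch (an
> >> illustration only, not NAMD's actual kernel; the names and reduced-unit
> >> LJ potential are assumptions): pair energies are evaluated in single
> >> precision on the GPU, and only the final accumulation is done in
> >> double precision on the host.
> >>
> >>   // toy_mixed.cu -- illustration only, not NAMD code; build with nvcc
> >>   #include <cstdio>
> >>   #include <cstdlib>
> >>   #include <vector>
> >>   #include <cuda_runtime.h>
> >>
> >>   __global__ void lj_pair_energy(const float3 *pos, int n, float *epair)
> >>   {
> >>       int i = blockIdx.x * blockDim.x + threadIdx.x;
> >>       if (i >= n) return;
> >>       float e = 0.0f;                    // inner loop stays in float
> >>       for (int j = 0; j < n; ++j) {
> >>           if (j == i) continue;
> >>           float dx = pos[i].x - pos[j].x;
> >>           float dy = pos[i].y - pos[j].y;
> >>           float dz = pos[i].z - pos[j].z;
> >>           float r2 = dx*dx + dy*dy + dz*dz;
> >>           float ir6 = 1.0f / (r2 * r2 * r2);
> >>           e += 4.0f * (ir6 * ir6 - ir6); // LJ 12-6 in reduced units
> >>       }
> >>       epair[i] = 0.5f * e;               // each pair was counted twice
> >>   }
> >>
> >>   int main()
> >>   {
> >>       const int n = 4096;
> >>       std::vector<float3> h_pos(n);      // random particles in a box
> >>       for (int i = 0; i < n; ++i)
> >>           h_pos[i] = make_float3(10.0f * rand() / RAND_MAX,
> >>                                  10.0f * rand() / RAND_MAX,
> >>                                  10.0f * rand() / RAND_MAX);
> >>       float3 *d_pos; float *d_e;
> >>       cudaMalloc(&d_pos, n * sizeof(float3));
> >>       cudaMalloc(&d_e, n * sizeof(float));
> >>       cudaMemcpy(d_pos, h_pos.data(), n * sizeof(float3),
> >>                  cudaMemcpyHostToDevice);
> >>       lj_pair_energy<<<(n + 127) / 128, 128>>>(d_pos, n, d_e);
> >>       std::vector<float> h_e(n);
> >>       cudaMemcpy(h_e.data(), d_e, n * sizeof(float),
> >>                  cudaMemcpyDeviceToHost);
> >>       double etot = 0.0;                 // accumulate in double on host
> >>       for (int i = 0; i < n; ++i) etot += (double)h_e[i];
> >>       printf("total LJ energy = %f\n", etot);
> >>       cudaFree(d_pos); cudaFree(d_e);
> >>       return 0;
> >>   }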
> >>
> >> Mind you, the Tesla K10 is a special beast in this zoo, since it is
> >> effectively a pimped-up GeForce GTX 690.
> >>
> >> > Any suggestions from the community would be greatly appreciated.
> >>
> >> Multi-GPU machines are tricky business. You have to pay great
> >> attention to the chipset and how many full-width PCIe slots are
> >> supported. For a 4-GPU machine, you usually need two CPUs and two
> >> southbridges (two GPUs per socket). Some boards have only one
> >> southbridge and then support more full-width PCIe slots via PCIe
> >> bridge chips. Those add a little latency and - when you use all GPUs
> >> at the same time - two GPUs have to share the bandwidth. Since the
> >> host-to-GPU bandwidth affects NAMD performance, you have to test
> >> whether in that case a single 4-GPU machine or two machines with 2
> >> GPUs each is the better option (probably the latter); see the sketch
> >> below. Also, you should make sure that the CPU memory bandwidth is not
> >> crippled (CPUs come with different supported memory speeds).
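> >>
> >> A quick way to test this (a sketch; the sample names, core counts and
> >> device indices are assumptions for your setup) is to measure the
> >> host-to-device bandwidth per GPU with the bandwidthTest sample from
> >> the CUDA SDK, and then benchmark NAMD on 2 vs. 4 GPUs in one box:
> >>
> >>   ./bandwidthTest --device=0 --memory=pinned
> >>   ./namd2 +p12 +idlepoll +devices 0,1 apoa1/apoa1.namd
> >>   ./namd2 +p12 +idlepoll +devices 0,1,2,3 apoa1/apoa1.namd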
> >>
> >> In short, there is no clear-cut answer. Many things depend on what
> >> *else* you want to do with the machine, and there are many personal
> >> opinions that people do not 100% agree upon. If you ask simply whether
> >> the performance of a Tesla is worth 3x the price (or more in the case
> >> of a K20), my personal opinion is "not at all", but I might still buy
> >> one, in case I come across an application and workflow that benefits
> >> from it.
> >>
> >> axel.
> >> >
> >> >
> >> > Regards
> >> >
> >> > Srivastav Ranganathan
> >> > Research Scholar
> >> > IIT Bombay,
> >> > Mumbai, India
> >>
> >>
> >>
> >> --
> >> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
> >> International Centre for Theoretical Physics, Trieste. Italy.
> >
> >
>
>
>
> --
> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
> International Centre for Theoretical Physics, Trieste. Italy.
>
>

-- 
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
