Re: Suggestions while building a GPU-machine (CUDA) for NAMD use!

From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Mon May 27 2013 - 04:58:05 CDT

Next message: Andrew Emerson: "Re: NAMD/VMD installation on Linux Clusters with InfiniBand"
Previous message: Aditya Ranganathan: "Re: Suggestions while building a GPU-machine (CUDA) for NAMD use!"
In reply to: Aditya Ranganathan: "Re: Suggestions while building a GPU-machine (CUDA) for NAMD use!"
Next in thread: Aron Broom: "Re: Suggestions while building a GPU-machine (CUDA) for NAMD use!"
Reply: Aron Broom: "Re: Suggestions while building a GPU-machine (CUDA) for NAMD use!"

On Mon, May 27, 2013 at 11:35 AM, Aditya Ranganathan
<aditya.sia_at_gmail.com> wrote:
> @Francesco, we are planning to buy a board that supports PCI Express 3.0.
> @Axel: Thanks, Axel, for the comprehensive walkthrough on this issue. We aim
> to build this machine solely for performing classical MD simulations using
> NAMD. Our computer vendor cited reliability and scaling issues with a GeForce
> card like the GTX 680 as a possible disadvantage when he suggested the Tesla
> as an option.

> I haven't been able to find any clear benchmarks for the TESLA C-2075 as
> yet. Most of the benchmarks seem to revolve around the Kepler series of
> cards. If anyone is aware of any, please point me to NAMD benchmarks on
> the TESLA C-2075.

look for benchmarks of the C-2050. the C-2075 is a tad faster. they are two
different revisions of the fermi chip. the difference is similar to
what a GTX 480 is to a GTX 580 (which are the corresponding consumer
models). mind you, with the fermi generation, the consumer cards were
more similar to the tesla cards than they are now. only the GeForce
TITAN has a similar (or better?) relationship to the tesla K20. even
the recently released GTX 780 has been deliberately "crippled" to
massively reduce double precision floating point performance.
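
to make the "compare within a generation" point concrete, here is a minimal python sketch. the table only lists cards named in this thread, and the helper names (GENERATION, same_generation, comparable_benchmarks) are made up for illustration. it says nothing about actual performance, only which benchmarks are a sensible stand-in.

```python
# Chip generation of the cards mentioned in this thread.
# Hypothetical helper table for illustration; extend as needed.
GENERATION = {
    "Tesla C2050": "Fermi",
    "Tesla C2075": "Fermi",
    "GeForce GTX 480": "Fermi",
    "GeForce GTX 580": "Fermi",
    "GeForce GTX 680": "Kepler",
    "GeForce GTX 780": "Kepler",
    "GeForce GTX TITAN": "Kepler",
    "Tesla K10": "Kepler",
    "Tesla K20": "Kepler",
}


def same_generation(card_a, card_b):
    """True if both cards are built on the same chip generation."""
    return GENERATION[card_a] == GENERATION[card_b]


def comparable_benchmarks(card):
    """Other cards of the same generation, sorted by name.

    Benchmarks for these are a more meaningful proxy than
    benchmarks for cards of a different generation.
    """
    gen = GENERATION[card]
    return sorted(c for c, g in GENERATION.items() if g == gen and c != card)


print(comparable_benchmarks("Tesla C2075"))
print(same_generation("Tesla C2075", "GeForce GTX 680"))
```

so for the C-2075 you would look at C-2050, GTX 480 and GTX 580 numbers, not at GTX 680 or K20 numbers.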

the problem with the C-2075 is that it uses an already outdated
architecture (with GPUs, architectures change fast), which is as
different from a kepler chip as perhaps an intel pentium 4 is from a
current (ivy bridge) based intel i7 cpu.
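
to put the price question from the original post in numbers, here is a rough python sketch. the only inputs are the two figures quoted in the thread (about 3 ns/day on the GTX 680 and a tesla price of "almost 3 times" the GTX 680). the function names are made up, and this deliberately ignores ECC, warranty and reliability, so it only answers the pure throughput-per-money question.

```python
def ns_per_day_per_cost(ns_per_day, relative_cost):
    """Throughput per unit of money, with the GTX 680 price as cost 1.0."""
    return ns_per_day / relative_cost


def break_even_ns_per_day(baseline_ns_per_day, price_ratio):
    """ns/day a pricier card must reach to match the baseline's
    throughput per unit of money."""
    return baseline_ns_per_day * price_ratio


# Figures quoted in the original post (100000 atoms, 1 GPU, 8 CPU cores).
GTX680_NS_PER_DAY = 3.0
# "almost 3 times" the GTX 680 price, rounded to 3 here (an assumption).
TESLA_PRICE_RATIO = 3.0

needed = break_even_ns_per_day(GTX680_NS_PER_DAY, TESLA_PRICE_RATIO)
print(f"tesla needs about {needed:.1f} ns/day to break even on price")
```

in other words, on throughput alone the tesla would have to be about three times faster than the GTX 680 on the same system to justify its price.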

axel.

>
>
>
> On Mon, May 27, 2013 at 2:27 PM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:
>>
>> On Mon, May 27, 2013 at 10:14 AM, Aditya Ranganathan
>> <aditya.sia_at_gmail.com> wrote:
>> > Hello All,
>> >
>> > We are pondering investing in a GPU-based machine for running NAMD
>> > simulations (all-atom). Currently, we are stuck with a dilemma over the
>> > choice of card for CUDA computing. We already have a GTX 680, which gives
>> > us about 3 ns/day for a 100000 atom system using a single GPU card and
>> > 8 CPU cores.
>> >
>> > Now, we are planning to build a GPU machine with 4 GPU cards (either
>> > Tesla C-2075C, 6GB GDDR5 or the NVIDIA GTX 680). The base system would
>> > consist of a 6-core Intel Xeon E5 2620 processor, 64GB DDR3 RAM and a
>> > 2TB hard drive.
>> >
>> > Has anyone in the community used the Tesla series of cards with NAMD and
>> > compared its benchmarks (scalability etc.) with an entry-level card like
>> > the GTX 680? The cost of the Tesla is almost 3 times that of the GTX 680.
>> > Does its performance justify its price?
>>
>> GTX 680 is not exactly "entry" level (more upper mid level) and you
>> can't compare GPUs like that. you basically have different "chip
>> families" and different "chip generations". GTX 680 is based on the
>> "Kepler" generation, as are the Tesla K10 and Tesla K20. the C2075,
>> however, is based on the previous generation called "Fermi". Now
>> GeForce cards are usually spec'd rather aggressively and for use in
>> video games and not for reliability in computing (which doesn't mean,
>> they are unreliable, only that the vendors take a higher risk for
>> lowering production costs and raising game performance).
>>
>> Also on GeForce cards certain functionality is not available (for
>> example ECC memory configuration) or only in a very limited way (for
>> example double precision floating point math). Also, support through
>> the nvidia-smi utility is limited. On the other hand, Tesla GPUs do
>> have all of these benefits and also use "certified" and tested
>> hardware components, often more RAM and have better warranty deals.
>> All of this and the fact that they are produced and sold in smaller
>> quantities result in higher costs.
>>
>> So whether the Tesla GPUs are worth the price or not depends on what
>> you are looking for in a GPU. Classical MD can function very well with
>> just limited double precision performance, since most of the force
>> calculation can be done in single precision with only a small loss of
>> accuracy (and would otherwise similarly be offloaded to SSE and AVX
>> vector instructions). Also the performance of classical MD is often as
>> much dominated by memory bandwidth (looking up pairs of particles
>> through the neighbor lists) as it is by compute performance. the
>> fastest GeForce type GPUs often outperform the fastest Tesla cards of
>> the same generation in classical MD due to their higher clocks and
>> higher memory bandwidth. However, if you would also run applications
>> that are dependent on double precision floating point, or prefer a low
>> risk and better management and are willing to pay the extra price for
>> that, then the Tesla would be it.
>>
>> mind you, the Tesla K10 is a special beast in this zoo, since it is
>> effectively a pimped up GeForce GTX690.
>>
>> > Any suggestions from the community would be greatly appreciated.
>>
>> multi-gpu machines are tricky business. you have to pay great
>> attention to the chipset and how many full width PCI-e slots are
>> supported. for a 4-GPU machine, you usually need two CPUs and two
>> southbridges (two GPUs per socket). some boards have only one
>> southbridge and then support more full width PCI-e slots via PCIe
>> bridge chips. those add a little latency and - when you use all GPUs
>> at the same time - two GPUs have to share the bandwidth. since the
>> host to GPU bandwidth affects NAMD performance, you have to test
>> whether in that case a single 4 GPU machine or two machines with 2
>> GPUs each is the better option (probably the latter). also you should
>> make sure that the CPU memory bandwidth is not crippled (they come in
>> different speeds).
>>
>> in short, there is no clear cut answer. many things depend on what
>> *else* you want to do with the machine and there are many personal
>> opinions on which people do not 100% agree. if you ask simply, is
>> the performance of a tesla worth 3x the price (or more in the case of
>> a K20), my personal opinion is "not at all", but i might still buy
>> one, in case i come across an application and workflow that benefits
>> from it.
>>
>> axel.
>> >
>> >
>> > Regards
>> >
>> > Srivastav Ranganathan
>> > Research Scholar
>> > IIT Bombay,
>> > Mumbai, India
>>
>>
>>
>> --
>> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
>> International Centre for Theoretical Physics, Trieste. Italy.
>
>
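
on the PCIe point in the quoted message: a small python sketch of the theoretical host-to-GPU bandwidth when two GPUs sit behind one bridge chip. the per-lane transfer rates and encodings are the PCIe 2.0 and 3.0 figures. the even split between GPUs is a worst-case assumption (both transferring at once), bridge latency and protocol overhead are ignored, and the function names are made up for illustration. real numbers have to be measured on the actual board.

```python
# generation -> (transfer rate per lane in GT/s, encoding efficiency)
PCIE_GEN = {
    2: (5.0, 8 / 10),     # PCIe 2.0 uses 8b/10b encoding
    3: (8.0, 128 / 130),  # PCIe 3.0 uses 128b/130b encoding
}


def link_bandwidth_gb_s(gen, lanes=16):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    rate_gt_s, efficiency = PCIE_GEN[gen]
    return rate_gt_s * efficiency * lanes / 8  # bits -> bytes


def per_gpu_bandwidth_gb_s(gen, gpus_sharing=1, lanes=16):
    """Bandwidth per GPU if several GPUs share one upstream link
    and all transfer at the same time (assumed even split)."""
    return link_bandwidth_gb_s(gen, lanes) / gpus_sharing


for gen in (2, 3):
    alone = per_gpu_bandwidth_gb_s(gen)
    shared = per_gpu_bandwidth_gb_s(gen, gpus_sharing=2)
    print(f"PCIe {gen}.0 x16: {alone:.2f} GB/s alone, {shared:.2f} GB/s when shared by 2 GPUs")
```

this only shows the upper bound. whether NAMD actually loses performance from the sharing is something you can only find out by benchmarking.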

--
Dr. Axel Kohlmeyer  akohlmey_at_gmail.com  http://goo.gl/1wk0
International Centre for Theoretical Physics, Trieste. Italy.


This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:23:15 CST