From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Thu Jun 21 2012 - 05:43:56 CDT
> -----Original Message-----
> From: chiendarret_at_gmail.com [mailto:chiendarret_at_gmail.com] On behalf
> of Francesco Pietra
> Sent: Thursday, June 21, 2012 08:22
> To: Norman Geist
> Subject: Re: namd-l: GPU cluster
> Would it be economically sound to run NAMD across CPU-GPU shared-mem
> game boxes, i.e., multiples of the GA-X79-UD3 / i7-3930k / 2 x GTX-680
> I have now assembled? How expensive would a suitable Infiniband setup
> be for this purpose (keeping in mind that the system might be
> upgraded from the present PCIe 2.0 to PCIe 3.0)? In other words, I would
> try to avoid replacing the Infiniband if performance of the boxes is
The power of the GPUs is the problem for the node interconnect. The more computing power per node for a given problem size, the more the run depends on bandwidth and latency, because the problem parts are solved so quickly that more communication is needed. Also, one would need a machine that can provide enough PCIe bandwidth for two GPUs and an Infiniband HCA.
I guess a QDR switch with only 8 ports is about 1500€, the HCAs about 200-500€ each, and the cables about 100€ each. Yes, Infiniband is expensive, but I don't know of any alternative for a high-speed interconnect, which GPU nodes simply need. I guess you would need at least DDR.
> francesco pietra
> On Thu, Jun 21, 2012 at 7:45 AM, Norman Geist
> <norman.geist_at_uni-greifswald.de> wrote:
> > Hi,
> > To run across multiple nodes you will also need a high-speed network. Our
> > cluster (3 nodes, 36 cores, 6 Tesla C2050) barely scales OK with SDR
> > Infiniband (10 Gbit/s).
> > Norman Geist.
> >> -----Original Message-----
> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> >> behalf of Axel Kohlmeyer
> >> Sent: Tuesday, June 19, 2012 21:41
> >> To: Matthew B. Roark
> >> Cc: namd-l_at_ks.uiuc.edu
> >> Subject: Re: namd-l: GPU cluster
> >> On Tue, Jun 19, 2012 at 3:07 PM, Matthew B. Roark
> >> wrote:
> >> >
> >> > I currently use CHARMM on several Rocks clusters and I am looking to
> >> try out GPGPU. I am looking into buying either a small GPU cluster or
> >> a few stand-alone workstations to use CUDA-enabled NAMD. I wanted to
> >> see what people are using and what people suggest.
> >> >
> >> > (1) My main concern is that I need to have something that will
> >> positively work with NAMD. Are there any hardware or vendors I should
> >> stay away from? Are there vendors with "out-of-the-box" solutions?
> >> there is no simple answer to this. for as long as you get recent and
> >> capable enough nvidia hardware, there is compatibility, but how well
> >> it will work for your specific application can differ wildly.
> >> most vendors go with what nvidia recommends, and particularly the sales
> >> people that you get to deal with don't know much, if anything at all.
> >> what nvidia and/or vendors recommend is not always the best choice,
> >> particularly when you are on a budget. but at the same time, what is the
> >> best choice depends on how much effort you want to invest yourself in
> >> figuring out what is best for your purpose and how well you are able to
> >> push vendors to offer you something that doesn't make them as much money
> >> or is going against (unofficial?) agreements they have with nvidia.
> >> the biggest question is whether you want to go with consumer grade
> >> hardware or "professional" tesla GPUs. classical MD won't benefit
> >> as much from the features of the tesla hardware as other applications,
> >> and GeForce GTX 580 cards provide an incredible price/performance
> >> ratio compared to Tesla C2075 GPUs. OTOH, you don't have ECC, the
> >> more thoroughly tested hardware, and the better warranty.
> >> from what has percolated through this mailing list over the last few
> >> years, it seems that consumer grade hardware is best used in a workstation
> >> type environment, and tesla type hardware makes most sense in a
> >> cluster environment (particularly with the passively cooled M series).
> >> > (2) How does scaling and efficiency work across multiple GPUs in the
> >> same server? That is, how many GPUs can a server really make good use
> >> of? I plan on testing with an 80k atom simulation.
> >> that depends on the mainboard chipset. most can handle two GPUs well;
> >> a dual intel tylersburg chipset mainboard (supermicro has one) can
> >> handle up to 4 GPUs very well. those mainboards with two 6-core CPUs are
> >> probably the best choice for a single/multiple workstation setup, and
> >> that is probably also the limit of how far you can scale NAMD well for
> >> your input. because memory and PCI-bus bandwidth is very important, you
> >> should stay away from mainboards with integrated "PCI-e bridges" (with
> >> those one can have 8 16-lane PCIe slots, but two cards have to share the
> >> bandwidth and latency is increased). the same goes for dual GPU cards.
> >> > (3) How much CPU power do I need to make use of multiple GPUs? Would
> >> 8 or 16 cores suffice?
> >> NAMD can overlap computational work that is not GPU accelerated
> >> with GPU kernels and run them concurrently. NAMD also can attach
> >> multiple threads to the same GPU and thus increase GPU utilization.
> >> however, how well this works and how many threads per GPU depends
> >> on memory bandwidth, computational complexity of the model and
> >> size of the data set. in many cases, the optimum seems to be around
> >> two to three CPU cores per GPU. it sometimes may be best to leave
> >> CPU cores idle to run the GPUs more efficiently. using more threads
> >> per GPU can increase utilization, but also increases overhead. so
> >> there is an optimum. clock rate of the CPU is usually less important
> >> than a good overall i/o bandwidth.
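[Editor's note: the two-to-three-cores-per-GPU rule of thumb above can be sketched as a launch line. The flags +p, +devices, and +idlepoll are standard in multicore CUDA builds of NAMD 2.x, but the core counts, device IDs, and file names here are hypothetical — benchmark on your own system.]

```shell
# Hypothetical node: 2 GPUs, 12 CPU cores.
# Following the ~2-3 CPU cores per GPU rule of thumb, run the
# multicore CUDA build of NAMD with 6 worker threads spread
# across both GPUs, leaving the remaining cores idle:
namd2 +p6 +devices 0,1 +idlepoll mysim.namd > mysim.log

# For comparison, pin 3 threads to a single GPU:
namd2 +p3 +devices 0 +idlepoll mysim.namd > mysim_1gpu.log
```

Comparing the ns/day reported in the two logs for your own 80k-atom input is the only reliable way to find the optimal ratio.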
> >> a final comment. try to resist the urge of purchasing the very latest
> >> (kepler) hardware. vendors will push it, but applications have not
> >> caught up (it can take a few years sometimes), so you won't benefit.
> >> if you want something that definitely works, it is always a good idea
> >> to stick with tried and tested hardware that is closer to its end-of-life
> >> than to its introduction.
> >> HTH,
> >> axel.
> >> >
> >> >
> >> --
> >> Dr. Axel Kohlmeyer
> >> akohlmey_at_gmail.com http://goo.gl/1wk0
> >> College of Science and Technology
> >> Temple University, Philadelphia PA, USA.
This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:22:10 CST