Re: Advice on buying GPUs

From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Thu Jun 02 2011 - 09:06:08 CDT

Sorry for the TOFU-style reply. I am working on our largest cluster right
now, hoping to salvage a busted RAID.

I would go for all 32-core Opteron nodes and get time on GPU clusters externally.
GPU machines are currently much less in demand than conventional ones, so it
is much easier to get time on them (and more of it). At most I would buy one
GPU node for learning and testing. Unless you want to start a career as a
sysadmin or HPC specialist, you are better off buying something that is less
work to run well so you can focus on your research.

In re: benchmarks. They rarely use a representative data set and only use
some "kernel" of NAMD and other codes. This is what vendors like, since it
gives you a single number to compare against. In real life you need a
balanced system, and that is more than just the fastest CPU.
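If you want numbers you can trust, run your own production system on the
candidate hardware and compare the days/ns that NAMD itself reports. A rough
sketch of how that could be pulled out of the logs (assuming the usual
"Benchmark time: ... days/ns" lines; this little script is just an
illustration, not something that ships with NAMD):

import re
import sys

def benchmark_days_per_ns(logfile):
    # Collect every "... X days/ns ..." figure from NAMD's benchmark lines.
    pattern = re.compile(r"Benchmark time:.*?([\d.]+) days/ns")
    values = []
    with open(logfile) as f:
        for line in f:
            match = pattern.search(line)
            if match:
                values.append(float(match.group(1)))
    return values

if __name__ == "__main__":
    for log in sys.argv[1:]:
        values = benchmark_days_per_ns(log)
        if values:
            # NAMD prints several benchmark estimates; the later ones are
            # usually the most settled, so report the last one.
            print(f"{log}: {values[-1]:.3f} days/ns")
        else:
            print(f"{log}: no benchmark lines found")

Comparing that figure across machines, with your own system and run
parameters, tells you far more than any single kernel score.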

Axel

 --
Axel Kohlmeyer
akohlmey_at_gmail.com
http://goo.gl/1wk0

On Jun 2, 2011, at 9:27, "Ajasja Ljubetič" <ajasja.ljubetic_at_gmail.com>
wrote:

> this is not that easy to answer and 20k EUR is a pretty small sum.
> question is, how much are you interested in compute capacity (more nodes)
> vs. capability (ability to run jobs faster)? do you already have an
> infiniband
> (or similar) switch that those nodes could be added to? also, what kind
> of facilities do you have (cooling, power, racks, storage)?
>

Oh, what we have is what I built two years ago. No sales reps were involved
in the process :) It will probably make you laugh, but still, a laugh a day
is healthy, so here it goes:
We have the 8 nodes already mentioned:
8x Tyan motherboard <http://www.tyan.com/product_board_detail.aspx?pid=453>
16x AMD Opteron quad-core 2376 <http://products.amd.com/en-us/OpteronCPUDetail.aspx?id=489>
8x 750 W PSU
connected with Gigabit Ethernet and cooled by air. External air is blown in
at the bottom and sucked out at the top. Without the cover the "cluster" looks
like this <http://www.ijs.si/ijs/dept/epr/ajasja/mails/cluster.png>.

Currently I'm doing some ABF work on small/medium systems. Since I can run
multiple independent simulations on different sections of the colvar
space, parallelization is not a problem.
I'd also like to run some ABF MD on larger membrane systems in the near
future, but if the system turns out to be too large I'd rather apply for CPU
time somewhere.
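To give an idea of what I mean by independent sections: something along these
lines could generate the windows and launch them separately (the range, window
count, file names and placeholder keywords below are made up for illustration,
not our actual setup):

import os

COLVAR_MIN, COLVAR_MAX = 0.0, 30.0   # full reaction-coordinate range (made-up numbers)
N_WINDOWS = 6                        # one independent simulation per section

# Placeholder per-window input; the keyword names here are just markers for
# where the boundaries of each section would go, not real colvars syntax.
TEMPLATE = """\
abf_lower_boundary {lo}
abf_upper_boundary {hi}
"""

def make_windows():
    width = (COLVAR_MAX - COLVAR_MIN) / N_WINDOWS
    for i in range(N_WINDOWS):
        lo = COLVAR_MIN + i * width
        hi = lo + width
        workdir = f"window_{i:02d}"
        os.makedirs(workdir, exist_ok=True)
        with open(os.path.join(workdir, "abf_window.in"), "w") as handle:
            handle.write(TEMPLATE.format(lo=lo, hi=hi))
        # Each window is a completely independent NAMD job (e.g. started with
        # "charmrun +p8 namd2 run_abf.namd" in its directory), so the windows
        # can be spread over nodes without any fast interconnect between them.
        print(f"{workdir}: {lo:.1f} .. {hi:.1f}")

if __name__ == "__main__":
    make_windows()

Since the windows never talk to each other, gigabit ethernet between the nodes
is perfectly fine for this kind of run.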

> and how much time are you or your sysadmin(s) willing to spend
> on maintaining the machines and how much expertise is available
> to pick and choose the right hardware and not get talked into buying
> useless crap (as it happens far too often these days).

Again, it's mostly just me :)

> the answer to each of these topics can have an impact as to which is
> better for you. buying machines with a "gamer" mainboard and 3 GeForce
> GPUs per machine is likely to give you the most bang for the buck, but
> if you don't have much technical experience then you may not be able to
> find the little details that are important to make them work well.

Yes, that might be the case as I don't (yet) have much experience with GPU
computing.

I'll probably go for two 32-core nodes and fill the rest with gaming boards
+ GPUs...

> SPEC benchmarks are useless, they are a tool to sell people
> what sales reps make a good commission on.

Why are they useless? I was so happy when I found out that SPEC CPU2006
has a NAMD module.

Thank you for the answers,
Ajasja
