Re: 2CPU+1GPU vs 1CPU+2GPU

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Wed Feb 15 2012 - 03:46:09 CST

> Hi Norman,

Hi Nicholas

>
>
> <snip>
> > And for a small number of nodes, Gigabit is very sufficient, whereas GPU
> > nodes would already need expensive Infiniband or 10Gbit/s-Ethernet.
> </snip>
>
>
> Even with old CPUs (quad-core Q6600), loaded with consumer GPUs (GTX 460,
> one card per node) and with simple gigabit, you can have useful speedups
> for a small (tiny) number of nodes and large-ish systems (say, 100K atoms).
> To put this in numbers using the ApoA1 benchmark (measurements in ns/day):
>
>
> Nodes/Cores/GPUs   With CUDA   Without CUDA   Times faster with CUDA
> ----------------   ---------   ------------   ----------------------
> 1 / 4 / 1          1.11        0.22           5.04
> 2 / 8 / 2          1.79        0.42           4.25
> 4 / 16 / 4         2.27        0.77           2.94
>
>
> Although the parallel scaling with CUDA (at ~68% with four nodes) is

That's not true.

The speedup from 1 to 4 nodes is only about 2.05 (2.27/1.11) instead of the
ideal 4, so that's about 51% efficiency with CUDA, whereas without CUDA the
speedup is 3.5 (0.77/0.22), i.e. about 88% efficiency. So one can really say
that CUDA over gigabit scales poorly. My aim was to share this fact with
others so they are not disappointed by poor parallel scaling on gigabit with
GPU nodes.
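
For anyone who wants to check the arithmetic, here is a minimal Python sketch
that recomputes speedup and parallel efficiency from the ns/day values in the
quoted table. The dictionaries and the function name scaling() are only
illustrative names and are not part of NAMD; efficiency is assumed to mean
speedup divided by the increase in node count.

```python
# ApoA1 throughput in ns/day, copied from the quoted table.
# Keys are node counts (each node: 4 cores, 1 GPU).
with_cuda = {1: 1.11, 2: 1.79, 4: 2.27}
without_cuda = {1: 0.22, 2: 0.42, 4: 0.77}


def scaling(ns_per_day, nodes, base_nodes=1):
    """Return (speedup, efficiency) of `nodes` relative to `base_nodes`.

    speedup    = throughput(nodes) / throughput(base_nodes)
    efficiency = speedup / (nodes / base_nodes), where 1.0 is linear scaling
    """
    speedup = ns_per_day[nodes] / ns_per_day[base_nodes]
    efficiency = speedup / (nodes / base_nodes)
    return speedup, efficiency


if __name__ == "__main__":
    for label, data in (("with CUDA", with_cuda),
                        ("without CUDA", without_cuda)):
        for n in sorted(data):
            s, e = scaling(data, n)
            print(f"{label:>12}  {n} node(s): "
                  f"speedup {s:.2f}, efficiency {e:.1%}")
```

Changing base_nodes lets you compare against a different starting point, for
example scaling(with_cuda, 4, base_nodes=2) for the step from 2 to 4 nodes.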

Cheers

> far from ideal, for small (tiny) research groups which can relax the big
> cluster rules, this may still be a useful toy-cluster. Having said that,
> you can substitute the whole lot discussed above (4 nodes + LAN) with a
> single (cheap) node based on an AMD 8-core FX-8150 plus one GTX 570 card
> which for the same benchmark delivers 2.33 ns/day.
>
>
> My two cents,
> Nicholas
>
>
> --
>
>
> Nicholas M. Glykos, Department of Molecular Biology
> and Genetics, Democritus University of Thrace, University Campus,
> Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620,
> Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/
