Re: AW: AW: AW: "Beefier" benchmark

From: Maxime Boissonneault (maxime.boissonneault_at_calculquebec.ca)
Date: Thu Aug 21 2014 - 09:04:42 CDT

Well, the scaling from 2GPU+5CPU to 8GPU+20CPU is almost perfect:
2GPU+5CPU = 0.026s/step
4GPU+10CPU = 0.013s/step
8GPU+20CPU = 0.0075s/step

I also tried using 2 nodes, i.e. 16GPU+40CPU, and I got 0.006s/step.

So I would say the benchmark stops scaling once I get outside of one node.
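To put numbers on that, here is a minimal sketch that turns the timings above into speedup and parallel efficiency relative to the 2GPU+5CPU run. The names (TIMINGS, scaling_table) are only illustrative, and it assumes the s/step figures are directly comparable across runs.

```python
# Timings quoted in this message, in seconds per step, keyed by GPU count.
# Each run uses 2.5 CPU cores per GPU (5, 10, 20 and 40 cores).
TIMINGS = {2: 0.026, 4: 0.013, 8: 0.0075, 16: 0.006}


def scaling_table(timings, base_gpus):
    """Return (gpus, speedup, efficiency) tuples relative to base_gpus."""
    base_time = timings[base_gpus]
    rows = []
    for gpus in sorted(timings):
        speedup = base_time / timings[gpus]   # how much faster than the base run
        ideal = gpus / base_gpus              # speedup expected from perfect scaling
        rows.append((gpus, speedup, speedup / ideal))
    return rows


if __name__ == "__main__":
    for gpus, speedup, eff in scaling_table(TIMINGS, base_gpus=2):
        print(f"{gpus:2d} GPUs: speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

With these numbers the efficiency is about 100% at 4 GPUs, 87% at 8 GPUs and 54% at 16 GPUs.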

However, the scaling within a node was measured with the multicore binaries,
while the number for two nodes is from a version compiled with MPI+SMP
support, which does not offer quite the same performance as multicore on a
single node (I get 0.0085s/step for 8GPU+20CPU with the MPI+SMP version).
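For a like-for-like comparison, the rough sketch below uses only the three 8GPU/16GPU timings quoted in this message. The helper names are made up for illustration, and it assumes nothing else differs between the runs besides the binary and the node count.

```python
# Timings from this message, in seconds per step.
MULTICORE_ONE_NODE = 0.0075  # 8GPU+20CPU, multicore binary
MPI_SMP_ONE_NODE = 0.0085    # 8GPU+20CPU, MPI+SMP binary
MPI_SMP_TWO_NODES = 0.006    # 16GPU+40CPU, MPI+SMP binary


def relative_slowdown(t, t_ref):
    """Fraction by which time t is slower than the reference time t_ref."""
    return t / t_ref - 1.0


def parallel_efficiency(t_small, t_large, resource_ratio):
    """Measured speedup divided by the ideal speedup (the resource ratio)."""
    return (t_small / t_large) / resource_ratio


if __name__ == "__main__":
    overhead = relative_slowdown(MPI_SMP_ONE_NODE, MULTICORE_ONE_NODE)
    eff = parallel_efficiency(MPI_SMP_ONE_NODE, MPI_SMP_TWO_NODES, 2)
    print(f"MPI+SMP vs multicore on one node: {overhead:.0%} slower")
    print(f"MPI+SMP, one node to two nodes: {eff:.0%} efficiency")
```

On these figures the MPI+SMP build is roughly 13% slower than multicore on one node, and going from one node to two with the same MPI+SMP build gives about 71% efficiency.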

Maxime

On 2014-08-21 09:43, Norman Geist wrote:
> OK, forget what I said. On two Tesla C2050 along with 12 Xeon cores I get
> ~0.034 s/step, so your timings really look reasonable then, although
> unfortunately the scaling from 2 to 10 GPUs is only ~50%, which should be due
> to the small system size.
>
> ;)
>
>> -----Original Message-----
>> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
>> Behalf Of Maxime Boissonneault
>> Sent: Thursday, 21 August 2014 14:32
>> To: Norman Geist
>> Cc: Namd Mailing List
>> Subject: Re: AW: AW: namd-l: "Beefier" benchmark
>>
>>
>>> Which combination brings improvement depends on system size. The smaller
>>> the system, the better the benefit.
>>> Keep the order x y z until benefit is gone. This needs to be tested for
>>> different system sizes.
>>>
>>> Also check if you still get speedup while increasing the number of GPUs.
>>> You might already scale out much earlier, as this timing still doesn't
>>> represent 10 GPUs IMHO, guess 2 would do the same.
>> Which GPUs is your experience based on? With 2 K20m GPUs, I got
>> 0.021s/step at the fastest when using 20 CPU cores.
>> With 8 K20m GPUs and 20 cores, I got 0.0075s/step, about 3 times
>> faster.
>>> What value do you use for fullelectfrequency?
>> I use the default ApoA1 benchmark, downloaded from the NAMD website,
>> except that I added the twoawayx parameter you mentioned.
>>
>> Maxime

-- 
---------------------------------
Maxime Boissonneault
Computing Analyst - Calcul Québec, Université Laval
Ph.D. in Physics
