Re: NAMD PERFORMANCE ON NVIDIA K20 GPU

From: Neeraj Agrawal (neer1980_at_gmail.com)
Date: Tue Sep 17 2013 - 18:13:34 CDT

Hello Norman,

I do not have a cluster; I am running NAMD on a dual 8-core workstation. I
used the +devices flag but not the +ignoresharing flag.
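
For reference, here is a minimal sketch of how such a launch line can be assembled for the multicore-CUDA build. The binary name `namd2`, the config file name `benchmark.namd`, and the helper `namd_command` are assumptions for illustration; only the `+p` and `+devices` options come from the discussion itself.

```python
import shlex


def namd_command(cores, device, config, binary="namd2"):
    """Build the argv list for one multicore-CUDA run bound to a single GPU.

    `binary` and `config` are placeholders; adjust them to the local install.
    """
    # +pN sets the number of worker threads, +devices selects the CUDA device ID.
    return [binary, f"+p{cores}", "+devices", str(device), config]


# Hypothetical example: 8 cores driving GPU 0.
cmd = namd_command(8, 0, "benchmark.namd")
print(shlex.join(cmd))
```

With a second GPU installed, the same helper could be called with `device=1` to build the launch line for a second, independent run.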

On Mon, Sep 16, 2013 at 3:05 AM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

> Did you notice the poor scaling across nodes? I guess you are only using
> gigabit Ethernet, right? Also, the 8:1 ratio that you call the biggest
> advantage in fact has the lower speed-up. The improvement in run time there
> comes from the additional processor power, not the GPU, so the best test
> case for measuring the benefit of GPUs against CPU only is the 1:1 ratio,
> and 5.7 is quite nice; the rest looks reasonable too. Did you use the
> +devices or +ignoresharing flag? What setting did you use for
> fullelectfrequency?
>
> Norman Geist.
>
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> Behalf Of Neeraj Agrawal
> Sent: Sunday, 15 September 2013 01:59
> To: namd-l_at_ks.uiuc.edu
> Subject: namd-l: NAMD PERFORMANCE ON NVIDIA K20 GPU
>
> Hello,
>
> I recently performed a few benchmark NAMD runs on a workstation (dual 8-core
> Xeon E5-2687W, 3.1 GHz, with one Nvidia Tesla K20c GPU). Below are the
> results:
>
> ------------------------------------------------------------------------
>
> System Size: 85,000 atoms
>
> Number of     CPU only     CPU + K20c     Speed-up
> processors    (days/ns)    (days/ns)      from GPU
> ----------    ---------    ----------     --------
>  4            1.19         0.21           5.7
>  8            0.62         0.18           3.4
> 16            0.33         0.21           1.6
> 32            0.29         0.23           1.3
>
> ------------------------------------------------------------------------
>
> System Size: 6300 atoms
>
> Number of     CPU only     CPU + K20c     Speed-up
> processors    (days/ns)    (days/ns)      from GPU
> ----------    ---------    ----------     --------
>  4            0.086        0.087          1.0
>  8            0.05         0.02           2.5
> 16            0.029        0.02           1.5
> 32            0.032        0.017          1.9
>
> ------------------------------------------------------------------------
>
> In all these simulations, outputEnergy is written every 100th frame and
> the cutoff is set to 12 Å. The CPU-only results were obtained with the
> NAMD 2.9 Linux-x86_64 build, and the CPU+GPU results with the NAMD 2.9
> Linux-x86_64-multicore-CUDA build.
>
> Since I will be simulating solvated proteins with around 50K-70K atoms (in
> total) in the future, would it be reasonable to conclude the following from
> the above benchmark results?
>
> 1. The biggest advantage of the GPU is seen when one GPU is used per 8 cores.
>
> 2. It might be advantageous to add one more GPU to this workstation so
> that I can run two NAMD simulations (each on 8 procs + 1 GPU)
> simultaneously.
>
> 3. For a system with <80k atoms, hyper-threading can degrade performance.
>
> Thank you,
>
> Neeraj
>
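
To make the comparison in the quoted tables easier to check, here is a small sketch that recomputes the speed-up column for the 85,000-atom system as the ratio of CPU-only to CPU+GPU days/ns, and also converts the GPU timings to ns/day. The numbers are copied from the table; the names `runs_85k`, `speedup`, and `ns_per_day` are my own and purely illustrative.

```python
# (processors, CPU-only days/ns, CPU + K20c days/ns), from the 85,000-atom table
runs_85k = [
    (4, 1.19, 0.21),
    (8, 0.62, 0.18),
    (16, 0.33, 0.21),
    (32, 0.29, 0.23),
]


def speedup(cpu_days_per_ns, gpu_days_per_ns):
    """Speed-up from the GPU: how many times faster the CPU+GPU run is."""
    return cpu_days_per_ns / gpu_days_per_ns


def ns_per_day(days_per_ns):
    """Convert days/ns (lower is better) to ns/day (higher is better)."""
    return 1.0 / days_per_ns


for procs, cpu, gpu in runs_85k:
    print(f"{procs:>2} procs: speed-up {speedup(cpu, gpu):.1f}, "
          f"{ns_per_day(gpu):.1f} ns/day with GPU")

# Processor count with the best relative gain vs. the best absolute GPU time.
best_ratio = max(runs_85k, key=lambda r: speedup(r[1], r[2]))[0]
fastest_gpu = min(runs_85k, key=lambda r: r[2])[0]
print(f"largest speed-up at {best_ratio} procs, fastest GPU run at {fastest_gpu} procs")
```

This separates the two questions raised in the thread: the largest speed-up from the GPU is at 4 processors, while the shortest absolute CPU+GPU time for this system is at 8 processors.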

This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:23:44 CST