Antw: Re: NAMD performance

From: Gerald Keller (gerald.keller_at_uni-wuerzburg.de)
Date: Thu Mar 19 2020 - 11:21:26 CDT

Next message: Stefano Guglielmo: "Re: Re: NAMD performance"
Previous message: Josh Vermaas: "Re: NAMD performance"
In reply to: Stefano Guglielmo: "Re: NAMD performance"
Next in thread: Stefano Guglielmo: "Re: Re: NAMD performance"
Reply: Stefano Guglielmo: "Re: Re: NAMD performance"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Hi Stefano,

I played around with the same issue.
You will only be able to run two NAMD instances on the same machine without performance loss if the cores they use are on different sockets.

In your case, cores 0-15 should be on socket 1 and cores 32-47 on socket 2, respectively. If you have only one socket, I wouldn't recommend running more than one NAMD instance per node.
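
To check which logical cores actually share a socket before choosing the +pemap ranges, something like the following could help. It is only a sketch: it assumes a Linux machine where lscpu -p=CPU,SOCKET is available, and the function name cpus_by_group and the sample text are made up for illustration. On a single-socket, multi-die CPU, grouping by NUMA node (lscpu -p=CPU,NODE) may be the more informative view.

```python
from collections import defaultdict


def cpus_by_group(lscpu_text):
    """Group logical CPU ids by the second column of `lscpu -p=CPU,<COL>` output.

    <COL> is expected to be SOCKET or NODE. Comment lines start with '#'.
    """
    groups = defaultdict(list)
    for line in lscpu_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the lscpu header comments
        cpu, group = line.split(",")[:2]
        # Some systems leave the field empty; treat that as group 0 (assumption).
        groups[int(group or 0)].append(int(cpu))
    return {g: sorted(cpus) for g, cpus in groups.items()}


# Hypothetical sample, NOT taken from the machine discussed in this thread.
SAMPLE = """# CPU,SOCKET
0,0
1,0
2,1
3,1
"""

print(cpus_by_group(SAMPLE))
```

On a real machine the text could come from subprocess.run(["lscpu", "-p=CPU,SOCKET"], capture_output=True, text=True).stdout. If 0-15 and 32-47 end up in the same group, the two instances are sharing one socket.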

Best,
Gerald

>>> Stefano Guglielmo <stefano.guglielmo_at_unito.it> 19.03.20 17:00 >>>
Dear all,
thanks for your advice and sorry for my late reply. I finally managed to optimize performance for a single simulation.

Now I am trying to run two simulations in parallel using the NAMD 2.13 multicore-CUDA build. I used the following options for the two simulations:

+p16 +idlepoll +setcpuaffinity +devices 0 +pemap 0-15

and

+p16 +idlepoll +setcpuaffinity +devices 1 +pemap 32-47
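
For reference, the two option sets above can be written out as full command lines. This is a minimal sketch: the flags are the ones quoted in this message, while the binary name namd2, the config file names sim_a.namd and sim_b.namd, and the helper namd_command are assumptions for illustration.

```python
def namd_command(config, gpu, first_core, n_cores=16, binary="namd2"):
    """Build the argument list for one NAMD instance pinned to a core range.

    `binary` and `config` are placeholders; the flags mirror those in the thread.
    """
    last_core = first_core + n_cores - 1
    return [
        binary,
        f"+p{n_cores}",            # number of worker threads
        "+idlepoll",
        "+setcpuaffinity",
        "+devices", str(gpu),      # GPU index used by this instance
        "+pemap", f"{first_core}-{last_core}",  # cores the threads are pinned to
        config,
    ]


jobs = [
    namd_command("sim_a.namd", gpu=0, first_core=0),
    namd_command("sim_b.namd", gpu=1, first_core=32),
]

for argv in jobs:
    print(" ".join(argv))
```

Each list could be passed to subprocess.Popen to start the runs side by side; the sketch only prints the commands so that nothing is launched by accident.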

For two systems of comparable size, I observed a sizeable performance loss when starting the second simulation (from 0.017 s/step to 0.028 s/step). In your opinion, is this reasonable, or should I tune some options differently or use a different version of NAMD?
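
To put the two timings in perspective, here is a small back-of-the-envelope sketch. The 0.017 and 0.028 s/step figures are the ones above; the 2 fs timestep used for the ns/day conversion is an assumption, since the timestep is not stated in this thread, and the function names are illustrative.

```python
def slowdown(single, paired):
    """How much slower each run gets when the second one is started."""
    return paired / single


def aggregate_gain(single, paired, n_jobs=2):
    """Combined steps/s of n concurrent runs relative to one run alone."""
    return n_jobs * single / paired


def ns_per_day(s_per_step, timestep_fs=2.0):
    """Convert s/step to ns/day; timestep_fs=2.0 is an assumed value."""
    steps_per_day = 86400.0 / s_per_step
    return steps_per_day * timestep_fs * 1e-6  # 1 fs = 1e-6 ns


single, paired = 0.017, 0.028  # s/step, from the message above

print(f"per-run slowdown:     {slowdown(single, paired):.2f}x")
print(f"combined throughput:  {aggregate_gain(single, paired):.2f}x")
print(f"ns/day alone:         {ns_per_day(single):.2f}")
print(f"ns/day each, paired:  {ns_per_day(paired):.2f}")
```

With these numbers each run is roughly 1.65 times slower, while the two runs together still deliver roughly 1.2 times the throughput of a single run, so the loss is per simulation rather than overall.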

Thanks in advance for sharing advice,
all the best
Stefano

On Thu, Mar 5, 2020 at 22:03 Josh Vermaas <joshua.vermaas_at_gmail.com> wrote:

Don't forget to compare against multicore builds. On a single node with shared memory, those builds often give the best single-GPU throughput. Since you have two GPUs on the same node, an SMP build without communication threads may win.

Josh

On Thu, Mar 5, 2020, 10:23 AM Victor Kwan <vkwan8_at_uwo.ca> wrote:

   Hi Stefano,

Since you already have a system in mind, you can compare the time it takes to run a 10 ps simulation with different setups.

> one or both gpu, number of cores
* NAMD 2.13 brings a large improvement in dual-GPU/single-node performance, and we observe almost linear scaling when going from 1 to 2 GPUs.
* 16 cores/GPU is sufficient; in our experience, 6-8 cores/GPU is the lower limit.

> hyperthreading or not
* For GPU runs, hyperthreading should not affect performance.

> pemap/commap options
* Check the topology matrix printed by nvidia-smi topo -m; leaving the CPU/GPU affinity at its default should be fine.
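
The 10 ps comparison suggested above could be scripted along these lines. This is a hedged sketch, not a tested tool: it assumes the NAMD log contains lines of the form "Info: Benchmark time: N CPUs X s/step ...", and the sample log text, the setup labels and the function name are hypothetical.

```python
import re

# Assumed shape of a NAMD benchmark line; adjust if your logs differ.
BENCH_RE = re.compile(r"Benchmark time:\s+(\d+)\s+CPUs\s+([0-9.eE+-]+)\s+s/step")


def mean_s_per_step(log_text):
    """Average the s/step values of all benchmark lines in one log."""
    values = [float(m.group(2)) for m in BENCH_RE.finditer(log_text)]
    if not values:
        raise ValueError("no benchmark lines found")
    return sum(values) / len(values)


# Hypothetical log excerpts, one per setup being compared.
logs = {
    "1 GPU, 16 cores": (
        "Info: Benchmark time: 16 CPUs 0.0290 s/step 0.336 days/ns 2100.0 MB memory\n"
        "Info: Benchmark time: 16 CPUs 0.0294 s/step 0.340 days/ns 2100.0 MB memory\n"
    ),
    "2 GPUs, 16 cores": (
        "Info: Benchmark time: 16 CPUs 0.0170 s/step 0.197 days/ns 2100.0 MB memory\n"
        "Info: Benchmark time: 16 CPUs 0.0172 s/step 0.199 days/ns 2100.0 MB memory\n"
    ),
}

# Print the setups from fastest to slowest.
for name, text in sorted(logs.items(), key=lambda kv: mean_s_per_step(kv[1])):
    print(f"{name}: {mean_s_per_step(text):.4f} s/step")
```

In practice each entry would be read from the log file of a short run with a different +p, +devices or +pemap setting.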

  On Thu, Mar 5, 2020 at 10:12 AM Stefano Guglielmo <stefano.guglielmo_at_unito.it> wrote:

   Dear NAMD users,
I am using a workstation with an AMD Ryzen Threadripper 2990WX 32-Core Processor, 128 GB RAM and two RTX 2080 Ti cards with NVLink. I would like to ask for suggestions on the "best" options for running a single simulation of a 200K-atom system with NAMD 2.13 (one or both gpu, number of cores, hyperthreading or not, pemap/commap options...).

Thanks in advance for your time
Stefano

--
         Stefano GUGLIELMO PhD

Assistant Professor of Medicinal Chemistry

Department of Drug Science and Technology

Via P. Giuria 9

10125 Turin, ITALY

ph. +39 (0)11 6707178



This archive was generated by hypermail 2.1.6 : Thu Dec 31 2020 - 23:17:13 CST