From: Stefano Guglielmo (stefano.guglielmo_at_unito.it)
Date: Thu Mar 19 2020 - 11:47:04 CDT
Thanks to all of you for sharing your opinions.
Josh: I also tried the pemap option you suggested, but it did not bring any
improvement; as for system size, it is a 200k-atom box.
Gerald: I am using just one CPU on a single socket; thank you for
sharing your experience.
On Thu, 19 Mar 2020 at 17:21, Gerald Keller <
gerald.keller_at_uni-wuerzburg.de> wrote:
> Hi Stefano,
> I played around with the same issue.
> You will only be able to run 2 namd instances on the same machine with no
> performance loss if the cores you use are on different sockets.
> In your case cores 0-15 should be on socket 1 and cores 32-47 on socket 2,
> respectively. If you only have 1 socket I wouldn't recommend running more
> than 1 namd instance per node.
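
One way to check which logical cores sit on which socket before choosing +pemap ranges is to group the output of `lscpu -p=CPU,SOCKET`. The following is only a minimal sketch: the function name `cpus_by_socket` and the sample text are hypothetical, and the sample is not taken from the machine discussed in this thread.

```python
from collections import defaultdict


def cpus_by_socket(lscpu_text):
    """Group logical CPU ids by socket id from `lscpu -p=CPU,SOCKET` text."""
    groups = defaultdict(list)
    for line in lscpu_text.splitlines():
        line = line.strip()
        # lscpu -p prefixes its header lines with '#'; skip those and blanks.
        if not line or line.startswith("#"):
            continue
        cpu, socket = line.split(",")[:2]
        groups[int(socket)].append(int(cpu))
    return dict(groups)


# Hypothetical sample in the parsable lscpu format (illustration only).
sample = """# CPU,Socket
0,0
1,0
2,1
3,1
"""

if __name__ == "__main__":
    for socket, cpus in sorted(cpus_by_socket(sample).items()):
        print(f"socket {socket}: cpus {cpus}")
```

On a real machine the text could be obtained with `subprocess.run(["lscpu", "-p=CPU,SOCKET"], capture_output=True, text=True).stdout`. If every core reports the same socket id, the two +pemap ranges share one socket.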
> >>> Stefano Guglielmo <stefano.guglielmo_at_unito.it> 19.03.20 17:00 >>>
> Dear all,
> thanks for your advice and sorry for my late reply. I finally managed to
> optimize performance for a single simulation.
> Now I am trying to run two simulations in parallel using NAMD 2.13
> multicore-CUDA version. I used the following options to run the two
> simulations:
> +p16 +idlepoll +setcpuaffinity +devices 0 +pemap 0-15
> +p16 +idlepoll +setcpuaffinity +devices 1 +pemap 32-47.
> For two systems of comparable size I observed a sizeable performance
> loss when starting the second simulation (from 0.017 s/step to 0.028
> s/step). In your opinion is this reasonable or shall I tune some options
> differently/use a different version of NAMD?
> Thanks in advance for sharing advice,
> all the best
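
To put the reported timings in perspective, the per-run slowdown can be compared with the combined throughput of the two concurrent runs. This is a rough sketch using only the two s/step figures quoted above; the 2 fs timestep in `ns_per_day` is an assumption, since the thread does not state the timestep, and all names are illustrative.

```python
SECONDS_PER_DAY = 86400.0


def steps_per_second(seconds_per_step):
    """Convert a NAMD-style s/step timing into steps per second."""
    return 1.0 / seconds_per_step


def ns_per_day(seconds_per_step, timestep_fs=2.0):
    """Simulated ns per day; timestep_fs=2.0 is an assumed value."""
    steps_per_day = SECONDS_PER_DAY / seconds_per_step
    return steps_per_day * timestep_fs * 1e-6  # 1 fs = 1e-6 ns


alone = 0.017      # s/step with one simulation running (from the thread)
together = 0.028   # s/step for each run when two run at once (from the thread)

slowdown = together / alone  # how much slower each individual run becomes
aggregate_gain = 2 * steps_per_second(together) / steps_per_second(alone)

if __name__ == "__main__":
    print(f"per-run slowdown:  {slowdown:.2f}x")
    print(f"aggregate speedup: {aggregate_gain:.2f}x")
    print(f"ns/day alone (assumed 2 fs):    {ns_per_day(alone):.2f}")
    print(f"ns/day each of two (assumed 2 fs): {ns_per_day(together):.2f}")
```

With these numbers each run is about 1.65 times slower, while the two runs together still deliver about 1.21 times the steps per second of a single run, so the loss is real but the node is not doing less total work.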
> On Thu, 5 Mar 2020 at 22:03, Josh Vermaas <
> joshua.vermaas_at_gmail.com> wrote:
>> Don't forget to compare against multicore builds. On one node with shared
>> memory, those builds often win for maximum single-GPU throughput. Since you
>> have 2 GPUs on the same node, an smp build without communication threads
>> may win.
>> On Thu, Mar 5, 2020, 10:23 AM Victor Kwan <vkwan8_at_uwo.ca> wrote:
>>> Hi Stefano,
>>> Since you already have a system in mind, you can compare the time it
>>> takes to perform a 10ps simulation with different setups.
>>> > one or both gpu, number of cores
>>> * With NAMD 2.13 comes a large improvement in dual gpu/single node
>>> performance and we observe almost linear scaling when going from 1 to 2
>>> GPUs.
>>> * 16core/GPU is sufficient, from our experience 6-8core/GPU is the lower
>>> limit.
>>> * For GPU runs, hyperthreading should not affect performance.
>>> > pemap/commap options
>>> * check the topology matrix printed by nvidia-smi topo -m - leaving
>>> cpu/gpu affinity as default should be fine.
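
The comparison of setups suggested above could be organised by generating one command line per core count and timing each short run. The sketch below only builds the argument lists; the flags are the ones quoted elsewhere in this thread, while the binary name `namd2`, the config file `bench.namd`, and the helper `namd_command` are assumptions made for illustration.

```python
def namd_command(cores, device, first_core=0,
                 config="bench.namd", binary="namd2"):
    """Build an argument list for one multicore-CUDA benchmark run.

    `config` and `binary` are placeholder names, not taken from the thread.
    """
    last_core = first_core + cores - 1
    return [
        binary,
        f"+p{cores}",                # number of worker threads
        "+idlepoll",
        "+setcpuaffinity",
        "+devices", str(device),     # GPU index to use
        "+pemap", f"{first_core}-{last_core}",  # cores to pin threads to
        config,
    ]


if __name__ == "__main__":
    # Core counts spanning the 6-8 and 16 cores/GPU figures mentioned above.
    for cores in (6, 8, 16):
        print(" ".join(namd_command(cores, device=0)))
```

Each list can be passed to `subprocess.run` and the s/step figure read from the NAMD log, so that all setups are compared on the same short simulation.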
>>> On Thu, Mar 5, 2020 at 10:12 AM Stefano Guglielmo <
>>> stefano.guglielmo_at_unito.it> wrote:
>>>> Dear NAMD users,
>>>> I am using a workstation with an AMD Ryzen Threadripper 2990WX 32-Core
>>>> Processor, 128 GB RAM and two RTX 2080 Ti cards with NVlink; I am here to
>>>> ask for suggestions on what could be the "best" options to run a single
>>>> simulation on a 200K atom system with NAMD 2.13 (one or both gpu, number of
>>>> cores, hyperthreading or not, pemap/commap options...)
>>>> Thanks in advance for your time
>>>> Stefano GUGLIELMO PhD
>>>> Assistant Professor of Medicinal Chemistry
>>>> Department of Drug Science and Technology
>>>> Via P. Giuria 9
>>>> 10125 Turin, ITALY
>>>> ph. +39 (0)11 6707178
This archive was generated by hypermail 2.1.6 : Thu Dec 31 2020 - 23:17:13 CST