Re: Slow performance over multi-core processor and CUDA build

From: Roshan Shrestha (roshanpra_at_gmail.com)
Date: Tue Jul 28 2020 - 11:36:36 CDT

Prof. Giacomo,
So, if I use the newest nightly build of NAMD with NVIDIA CUDA
acceleration, do I need to specify anything in my command arguments, such
as the number of processors with *+p8* and *+idlepoll*, or will the normal
*namd2 file.conf | tee output.log* work? Which command is best for using
all the CUDA cores and the CPU cores? With Gromacs, I had to build the
source myself so that I could use CUDA, whereas NAMD seems to automate
things, so I am unable to grasp how I can maximize its performance. For
now, my system is pretty simple, with 50K+ atoms, and the simulation
parameters are pretty standard for a normal equilibration and a
production run. Thanks.

With best regards

On Tue, Jul 28, 2020 at 6:52 PM Giacomo Fiorin <giacomo.fiorin_at_gmail.com>
wrote:

> Not sure why hyperthreading is mentioned; the processor in question does
> not support it:
>
> https://ark.intel.com/content/www/us/en/ark/products/186604/intel-core-i7-9700k-processor-12m-cache-up-to-4-90-ghz.html
>
> Roshan, what are the system size and simulation parameters? It is
> possible that the system is not suitable for a CPU-GPU hybrid scheme
> (possibly made worse by using too many CPU cores). The Gromacs benchmark
> (which was probably run in single precision and on the CPU) seems to
> suggest a rather small system. Have you tried running a non-GPU build? Or
> the GPU-optimized 3.0 alpha build?
>
> For typical biological systems (of the order of 100,000 atoms) running on
> CPUs, Gromacs would be faster on a few nodes but would scale less well
> than NAMD over multiple nodes. The tipping point depends on the system and to
> a lesser extent on the hardware makeup. I suggest you benchmark your
> system thoroughly with both codes, and then decide.
>
> Giacomo
>
> On Tue, Jul 28, 2020 at 8:37 AM Norman Geist <
> norman.geist_at_uni-greifswald.de> wrote:
>
>> I’d say don’t use hyperthreading in HPC in general, nothing special
>> about GPUs. You can assign your tasks/threads to physical cores only, e.g

This archive was generated by hypermail 2.1.6 : Fri Dec 31 2021 - 23:17:09 CST