Re: Running on GPU multiple nodes

From: Renfro, Michael (Renfro_at_tntech.edu)
Date: Sun Mar 29 2020 - 08:15:16 CDT

Your log also shows:

  Charm++> Running on 1 unique compute nodes (24-way SMP).

and I suspect you’ll find one of your two nodes sits completely idle if viewed through ‘top’ or a similar utility.

At a minimum, I think you’ll need a ++nodelist parameter added to charmrun. https://www.ks.uiuc.edu/Research/namd/2.11/ug/node83.html has one example for this.
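For reference, a minimal nodelist file might look something like the following (the host names are placeholders; substitute your cluster's actual node names):

  group main
  host node1
  host node2

and the run command would then point charmrun at that file, something like:

  charmrun +p48 ++nodelist nodelist namd2 +idlepoll +devices 0,1 abeta_eq.conf > Solution_eq.log

The user guide page linked above has the exact syntax and additional options for your setup.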

--
Mike Renfro, PhD / HPC Systems Administrator, Information Technology Services
931 372-3601 / Tennessee Technological University

On Mar 29, 2020, at 6:37 AM, Adupa Vasista <adupavasista_at_gmail.com> wrote:

Dear NAMD users

I hope everyone is safe amid the pandemic.

When I try to run a simulation on multiple GPU nodes, I see a drop in performance compared to running on a single node. I am using charmrun to run on multiple nodes, but the first line of the log file says Charm++: standalone mode (not using charmrun).

Here are the benchmark results:
Info: Benchmark time: 24 CPUs 0.0420026 s/step 0.486141 days/ns 1518.28 MB memory
Info: Benchmark time: 48 CPUs 0.0556737 s/step 0.644372 days/ns 1799.73 MB memory

I am running on 2 nodes with 24 cores each and 2 GPUs, using the following command:
charmrun +p48 namd2 +idlepoll +devices 0,1 abeta_eq.conf > Solution_eq.log

Please find the log file attached.

Please let me know if I need to change anything in the command.

Thank you.
<Solution_eq.log>
