Re: Running on GPU multiple nodes

From: Adupa Vasista (adupavasista_at_gmail.com)
Date: Sun Mar 29 2020 - 14:22:05 CDT

Thanks for the reply.
I tried what you suggested, but I ran into a new error:
*FATAL ERROR: Unknown command-line option ++nodelist*

Any idea what is wrong here?
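
For reference, here is a minimal sketch of the setup I understand the suggestion to mean. It writes a Charm++ nodelist file and prints the launch command rather than running it. The host names node01 and node02 and the file name ./nodelist are placeholders of mine, not values from my cluster. It also assumes a NAMD build whose charmrun can launch across nodes; a single-node (multicore) build cannot.

```shell
# Hypothetical host names; replace with the nodes actually allocated to the job.
HOSTS="node01 node02"
# Hypothetical file name for the nodelist; any writable path works.
NODELIST="${NODELIST:-./nodelist}"

# Charm++ nodelist format: a "group main" header, then one "host" line per node.
echo "group main" > "$NODELIST"
for h in $HOSTS; do
    echo "host $h" >> "$NODELIST"
done

# Build and print the launch command (not executed here).
# Assumes charmrun and namd2 come from the same multi-node-capable build.
CMD="charmrun ++nodelist $NODELIST +p48 namd2 +idlepoll +devices 0,1 abeta_eq.conf"
echo "$CMD"
```

If the printed command still fails with the same error when run, the charmrun found on the PATH is probably not the one that belongs to a multi-node build, though I have not verified that.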

Thank you.

On Sun, Mar 29, 2020 at 6:45 PM Renfro, Michael <Renfro_at_tntech.edu> wrote:

> Your log also shows:
>
> Charm++> Running on 1 unique compute nodes (24-way SMP).
>
> and I suspect you’ll find one of your two nodes sits completely idle if
> viewed through ‘top’ or a similar utility.
>
> At a minimum, I think you’ll need a ++nodelist parameter added to
> charmrun. https://www.ks.uiuc.edu/Research/namd/2.11/ug/node83.html has
> one example for this.
>
> --
> Mike Renfro, PhD / HPC Systems Administrator, Information Technology
> Services
> 931 372-3601 / Tennessee Technological University
>
> On Mar 29, 2020, at 6:37 AM, Adupa Vasista <adupavasista_at_gmail.com> wrote:
>
> Dear NAMD users
>
> I hope everyone is safe amid the pandemic.
>
> When I run the simulation on multiple GPU nodes, performance drops
> compared to running on a single node. I am using charmrun to run on
> multiple nodes, but the first line of the log file says
> *Charm++: standalone mode (not using charmrun).*
>
> Here are the benchmark results:
> Info: Benchmark time: 24 CPUs 0.0420026 s/step 0.486141 days/ns 1518.28 MB
> memory
> Info: Benchmark time: 48 CPUs 0.0556737 s/step 0.644372 days/ns 1799.73 MB
> memory
>
> I am running on 2 nodes (24 cores each) with 2 GPUs, using the
> following command:
> charmrun +p48 namd2 +idlepoll +devices 0,1 abeta_eq.conf > Solution_eq.log
>
> I have attached the log file.
>
> Please let me know if I need to change anything in the command.
>
> Thank you.
> <Solution_eq.log>
>
>
