Re: Unable to run NAMD3.0 on 2 GPUs simultaneously

From: Hrishikesh Dhondge (hbdhondge_at_gmail.com)
Date: Mon May 02 2022 - 04:54:59 CDT

Hello Sruthi,

What is the command you are using to run NAMD3.0?

You need to add +devicesperreplica 1 to your command. For more
information, see https://www.ks.uiuc.edu/Research/namd/alpha/3.0alpha/

Also, make sure you have the correct executables for multi-GPU usage.
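For a single-node run that uses both GPUs on that node, a minimal launch sketch could look like the following (the config file name and core count are placeholders, not taken from this thread):

```shell
# Minimal sketch: NAMD 3.0 multicore-CUDA build on one node, both GPUs.
# "config.namd" and "+p8" are placeholders -- match +p to the cores you
# actually allocated, and +devices to the GPU IDs visible on the node.
namd3 +p8 +devices 0,1 config.namd > output.log
```

Note that the multicore-CUDA build runs within a single node only; spanning two nodes requires a network-enabled (e.g. MPI or verbs) build, which is why the executable choice matters here.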

On Mon, May 2, 2022 at 11:35 AM Sruthi Sundaresan <bo20resch11002_at_iith.ac.in>
wrote:

> Dear Users,
> I would like to run my job on 2 GPUs. Although the queue shows that I am
> submitting a job to run on 2 GPU nodes:
>
>
>
>
> #!/bin/bash
> #SBATCH --partition=gpu
> #SBATCH --gres=gpu:2
> #SBATCH -N 2                  # Number of nodes
> #SBATCH --ntasks-per-node=2   # Number of cores per node
>
> I happened to notice that only one node is being utilized:
>
> [user_at_gpu002 ~]$ nvidia-smi
> Mon May  2 14:53:11 2022
> +-----------------------------------------------------------------------------+
> | NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
> |-------------------------------+----------------------+----------------------+
> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
> |                               |                      |               MIG M. |
> |===============================+======================+======================|
> |   0  Tesla V100-SXM2...  Off  | 00000000:61:00.0 Off |                    0 |
> | N/A   58C    P0   269W / 300W |   2261MiB / 16160MiB |     99%      Default |
> |                               |                      |                  N/A |
> +-------------------------------+----------------------+----------------------+
> |   1  Tesla V100-SXM2...  Off  | 00000000:89:00.0 Off |                    0 |
> | N/A   42C    P0    26W / 300W |      2MiB / 16160MiB |      0%      Default |
> |                               |                      |                  N/A |
> +-------------------------------+----------------------+----------------------+
>
> +-----------------------------------------------------------------------------+
> | Processes:                                                                  |
> |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
> |        ID   ID                                                   Usage      |
> |=============================================================================|
> |    0   N/A  N/A     25714      C   ...6_64-multicore-CUDA/namd3     2259MiB |
> +-----------------------------------------------------------------------------+
>
>
> And there are no jobs submitted to the second node:
>
> [user_at_gpu003 ~]$ nvidia-smi
> Mon May  2 14:54:38 2022
> +-----------------------------------------------------------------------------+
> | NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
> |-------------------------------+----------------------+----------------------+
> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
> |                               |                      |               MIG M. |
> |===============================+======================+======================|
> |   0  Tesla V100-SXM2...  Off  | 00000000:61:00.0 Off |                    0 |
> | N/A   42C    P0    41W / 300W |      0MiB / 16160MiB |      0%      Default |
> |                               |                      |                  N/A |
> +-------------------------------+----------------------+----------------------+
> |   1  Tesla V100-SXM2...  Off  | 00000000:89:00.0 Off |                    0 |
> | N/A   41C    P0    40W / 300W |      0MiB / 16160MiB |      0%      Default |
> |                               |                      |                  N/A |
> +-------------------------------+----------------------+----------------------+
>
> +-----------------------------------------------------------------------------+
> | Processes:                                                                  |
> |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
> |        ID   ID                                                   Usage      |
> |=============================================================================|
> |  No running processes found                                                 |
> +-----------------------------------------------------------------------------+
>
> Is there any way I can make my NAMD_3.0 job run by utilizing both GPUs?
> The queue shows that the job is submitted to 2 GPUs, but it runs entirely
> on only one.
>
> I've tried using mpirun -np in my script, but it's still running on
> only one GPU.
>
>
> Thanks and Regards,
>
>
>
> Sruthi Sundaresan
>
> Ph.D. Research Scholar
>
> C/o Dr. Thenmalarchelvi Rathinavelan
>
> Molecular Biophysics Lab, Department of Biotechnology
>
>
>
> Disclaimer:- This footer text is to convey that this email is sent by one
> of the users of IITH. So, do not mark it as SPAM.
>

-- 
With regards
Hrishikesh Dhondge
PhD student,
LORIA - INRIA Nancy

This archive was generated by hypermail 2.1.6 : Tue Dec 13 2022 - 14:32:44 CST