From: Vermaas, Joshua (Joshua.Vermaas_at_nrel.gov)
Date: Thu Feb 21 2019 - 12:24:15 CST
And check the logfile. If the GPU is actually being detected and used, the top will have output like this (the CUDA-related lines are the important parts):
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 36 threads
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.8.2-0-g26d4bd8-namd-charm-6.8.2-build-2018-Jan-11-30463
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (36-way SMP).
Charm++> cpu topology info is gathered in 0.002 seconds.
Info: Built with CUDA version 9010
Did not find +devices i,j,k,... argument, using all
Pe 0 physical rank 0 will use CUDA device of pe 16
Pe 8 physical rank 8 will use CUDA device of pe 16
Pe 24 physical rank 24 will use CUDA device of pe 32
Pe 30 physical rank 30 will use CUDA device of pe 32
Pe 23 physical rank 23 will use CUDA device of pe 32
Pe 25 physical rank 25 will use CUDA device of pe 32
Pe 7 physical rank 7 will use CUDA device of pe 16
Pe 14 physical rank 14 will use CUDA device of pe 16
Pe 9 physical rank 9 will use CUDA device of pe 16
Pe 27 physical rank 27 will use CUDA device of pe 32
Pe 1 physical rank 1 will use CUDA device of pe 16
Pe 13 physical rank 13 will use CUDA device of pe 16
Pe 20 physical rank 20 will use CUDA device of pe 32
Pe 19 physical rank 19 will use CUDA device of pe 32
Pe 15 physical rank 15 will use CUDA device of pe 16
Pe 17 physical rank 17 will use CUDA device of pe 16
Pe 4 physical rank 4 will use CUDA device of pe 16
Pe 26 physical rank 26 will use CUDA device of pe 32
Pe 5 physical rank 5 will use CUDA device of pe 16
Pe 31 physical rank 31 will use CUDA device of pe 32
Pe 21 physical rank 21 will use CUDA device of pe 32
Pe 28 physical rank 28 will use CUDA device of pe 32
Pe 22 physical rank 22 will use CUDA device of pe 32
Pe 3 physical rank 3 will use CUDA device of pe 16
Pe 29 physical rank 29 will use CUDA device of pe 32
Pe 34 physical rank 34 will use CUDA device of pe 32
Pe 6 physical rank 6 will use CUDA device of pe 16
Pe 18 physical rank 18 will use CUDA device of pe 32
Pe 2 physical rank 2 will use CUDA device of pe 16
Pe 10 physical rank 10 will use CUDA device of pe 16
Pe 33 physical rank 33 will use CUDA device of pe 32
Pe 35 physical rank 35 will use CUDA device of pe 32
Pe 11 physical rank 11 will use CUDA device of pe 16
Pe 12 physical rank 12 will use CUDA device of pe 16
Pe 16 physical rank 16 binding to CUDA device 0 on r103u01: 'Tesla V100-PCIE-16GB' Mem: 16130MB Rev: 7.0 PCI: 0:37:0
Pe 32 physical rank 32 binding to CUDA device 1 on r103u01: 'Tesla V100-PCIE-16GB' Mem: 16130MB Rev: 7.0 PCI: 0:86:0
Info: NAMD 2.13 for Linux-x86_64-multicore-CUDA
If these sorts of messages aren't showing up, you'll need to do some debugging to figure out where things are going wrong.
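As a quick check, you can grep the logfile for those CUDA lines. A minimal sketch (the filename is made up, and the sample log is built here from lines copied out of the output above, so the snippet is self-contained; point the greps at your own log instead):

```shell
# Build a small sample log from lines quoted above (stand-in for your real log).
cat > namd_sample.log <<'EOF'
Info: Built with CUDA version 9010
Pe 16 physical rank 16 binding to CUDA device 0 on r103u01: 'Tesla V100-PCIE-16GB' Mem: 16130MB Rev: 7.0 PCI: 0:37:0
Info: NAMD 2.13 for Linux-x86_64-multicore-CUDA
EOF

# A CUDA-enabled run should match both patterns at least once;
# zero matches means NAMD never bound a GPU.
grep -c 'Built with CUDA version' namd_sample.log   # 1
grep -c 'binding to CUDA device' namd_sample.log    # 1
```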
-Josh
On 2019-02-21 08:38:14-07:00 owner-namd-l_at_ks.uiuc.edu wrote:
Do you have CUDA installed with the requisite NVIDIA driver, and are you running the NAMD CUDA version? If so, what is the output of `nvidia-smi`?
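A sketch of that sanity check (the query fields are one common choice; any `nvidia-smi` invocation that lists the GPU will do). If `nvidia-smi` is missing or errors out, the CUDA build of NAMD will not find a device either:

```shell
# Confirm the NVIDIA driver sees the GPU before blaming NAMD.
if command -v nvidia-smi >/dev/null 2>&1; then
    # Report the device name, driver version, and current GPU utilization.
    nvidia-smi --query-gpu=name,driver_version,utilization.gpu --format=csv
else
    echo "nvidia-smi not found: install or repair the NVIDIA driver first"
fi
```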
________________________________
From: owner-namd-l_at_ks.uiuc.edu <owner-namd-l_at_ks.uiuc.edu> on behalf of Denish Poudyal <qrystal45_at_gmail.com>
Sent: Thursday, February 21, 2019 8:59:29 AM
To: NAMD list
Subject: namd-l: Utilising the GPU in NAMD (NVIDIA CUDA acceleration) in Windows
I have a system with NVIDIA's Quadro K420 GPU and 12 CPU cores, and while using a command like
namd2 +idlepoll +p10 <.conf file>
I am still seeing GPU usage around 1% and CPU usage around 90%. How can I employ the GPU in this simulation? Obviously, we don't have GTXs, so we are trying to use what we've got. What in the command is missing to force the Quadro to get involved in this NAMD simulation?
Denish Poudyal
CDPTU, Nepal
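Following the `+devices i,j,k,...` hint in the log output above, one way to rule out device-selection problems is to hand NAMD the GPU index explicitly. A hedged sketch (the `sim.conf` filename is hypothetical, and the guard only checks that `namd2` is on the PATH):

```shell
# Sketch: run the CUDA build of NAMD with an explicit GPU assignment.
# "+devices 0" tells NAMD to use CUDA device 0 rather than auto-detecting.
if command -v namd2 >/dev/null 2>&1; then
    namd2 +idlepoll +p10 +devices 0 sim.conf > sim.log
else
    echo "namd2 not on PATH: check that the CUDA build of NAMD is installed"
fi
```

If the resulting log still lacks the "binding to CUDA device" lines, the binary is likely a CPU-only build rather than the CUDA version.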
This archive was generated by hypermail 2.1.6 : Tue Dec 31 2019 - 23:20:31 CST