Re: Utilising the GPU in NAMD (NVIDIA CUDA acceleration) in Windows

From: Darin Lory (darin.lory_at_gmail.com)
Date: Thu Feb 21 2019 - 16:52:52 CST

Team,

Check out nvtop for GPU stats. I built it on my AWS EC2 P3 RHEL instances for
GROMACS/NAMD/VMD and RELION 3.

My notes are below. You will need CMake version 3.x. I used RHEL 7, but
this is easy to do on Ubuntu too.

Best regards,

-Darin

"The most exciting phrase to hear in science, the one that heralds new
discoveries, is not 'Eureka!' (I found it!) but 'That's funny ...'"
  -Isaac Asimov

Darin S. Lory
https://about.me/darin.lory

Compiling nvtop - https://github.com/Syllo/nvtop

git clone https://github.com/Syllo/nvtop.git
mkdir -p nvtop/build && cd nvtop/build
cmake3 ..
make
make install
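
Since the build needs CMake 3.x, here is a minimal sketch for picking the right binary before running the steps above. It assumes the binary is named either cmake3 (as on my RHEL 7 box) or cmake (which I would expect on Ubuntu); the function names cmake_major and find_cmake3 are just illustrative.

```shell
#!/bin/sh
# Print the major version from "cmake --version"-style output read on stdin.
# Accepts both "cmake version 3.4.3" and "cmake3 version 3.x.y".
cmake_major() {
    sed -n 's/^cmake[0-9]* version \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Print the name of the first candidate binary that reports major version >= 3.
find_cmake3() {
    for candidate in cmake3 cmake; do
        if command -v "$candidate" >/dev/null 2>&1; then
            major=$("$candidate" --version | cmake_major)
            if [ "${major:-0}" -ge 3 ]; then
                echo "$candidate"
                return 0
            fi
        fi
    done
    return 1
}

if CMAKE=$(find_cmake3); then
    echo "Using $CMAKE"
else
    echo "No CMake 3.x found; install it before building nvtop" >&2
fi
```

If a binary is found, "$CMAKE" .. can stand in for the cmake3 .. step.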

++++++++++++
++++++++++++
Output from git clone, cmake3, make, and make install
++++++++++++
++++++++++++

[root_at_ip-10-246-148-209 packages]# git clone https://github.com/Syllo/nvtop.git
Cloning into 'nvtop'...
remote: Enumerating objects: 93, done.
remote: Counting objects: 100% (93/93), done.
remote: Compressing objects: 100% (64/64), done.
remote: Total 496 (delta 51), reused 64 (delta 26), pack-reused 403
Receiving objects: 100% (496/496), 242.52 KiB | 0 bytes/s, done.
Resolving deltas: 100% (295/295), done.
[root_at_ip-10-246-148-209 packages]# mkdir -p nvtop/build && cd nvtop/build
[root_at_ip-10-246-148-209 build]# cmake3 ..
-- The C compiler identification is GNU 4.8.5
-- Check for working C compiler: /usr/lib64/ccache/cc
-- Check for working C compiler: /usr/lib64/ccache/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Setting build type to 'Release' as none was specified.
-- Found NVML: /usr/local/cuda-10.0/include (found version "10")
-- Looking for cbreak in /usr/lib64/libncursesw.so
-- Looking for cbreak in /usr/lib64/libncursesw.so - found
-- Found Curses: /usr/lib64/libncursesw.so
-- Performing Test compiler_has-Wall
-- Performing Test compiler_has-Wall - Success
-- Performing Test compiler_has-Wpedantic
-- Performing Test compiler_has-Wpedantic - Success
-- Performing Test compiler_has-Wextra
-- Performing Test compiler_has-Wextra - Success
-- Performing Test compiler_has-Waddress
-- Performing Test compiler_has-Waddress - Success
-- Performing Test compiler_has-Waggressive-loop-optimizations
-- Performing Test compiler_has-Waggressive-loop-optimizations - Success
-- Performing Test compiler_has-Wcast-qual
-- Performing Test compiler_has-Wcast-qual - Success
-- Performing Test compiler_has-Wcast-align
-- Performing Test compiler_has-Wcast-align - Success
-- Performing Test compiler_has-Wbad-function-cast
-- Performing Test compiler_has-Wbad-function-cast - Success
-- Performing Test compiler_has-Wmissing-declarations
-- Performing Test compiler_has-Wmissing-declarations - Success
-- Performing Test compiler_has-Wmissing-parameter-type
-- Performing Test compiler_has-Wmissing-parameter-type - Success
-- Performing Test compiler_has-Wmissing-prototypes
-- Performing Test compiler_has-Wmissing-prototypes - Success
-- Performing Test compiler_has-Wnested-externs
-- Performing Test compiler_has-Wnested-externs - Success
-- Performing Test compiler_has-Wold-style-declaration
-- Performing Test compiler_has-Wold-style-declaration - Success
-- Performing Test compiler_has-Wold-style-definition
-- Performing Test compiler_has-Wold-style-definition - Success
-- Performing Test compiler_has-Wstrict-prototypes
-- Performing Test compiler_has-Wstrict-prototypes - Success
-- Performing Test compiler_has-Wpointer-sign
-- Performing Test compiler_has-Wpointer-sign - Success
-- Performing Test compiler_has-Wdouble-promotion
-- Performing Test compiler_has-Wdouble-promotion - Success
-- Performing Test compiler_has-Wuninitialized
-- Performing Test compiler_has-Wuninitialized - Success
-- Performing Test compiler_has-Winit-self
-- Performing Test compiler_has-Winit-self - Success
-- Performing Test compiler_has-Wstrict-aliasing
-- Performing Test compiler_has-Wstrict-aliasing - Success
-- Performing Test compiler_has-Wsuggest-attribute-const
-- Performing Test compiler_has-Wsuggest-attribute-const - Success
-- Performing Test compiler_has-Wtrampolines
-- Performing Test compiler_has-Wtrampolines - Success
-- Performing Test compiler_has-Wfloat-equal
-- Performing Test compiler_has-Wfloat-equal - Success
-- Performing Test compiler_has-Wshadow
-- Performing Test compiler_has-Wshadow - Success
-- Performing Test compiler_has-Wunsafe-loop-optimizations
-- Performing Test compiler_has-Wunsafe-loop-optimizations - Success
-- Performing Test compiler_has-Wfloat-conversion
-- Performing Test compiler_has-Wfloat-conversion - Failed
-- Performing Test compiler_has-Wlogical-op
-- Performing Test compiler_has-Wlogical-op - Success
-- Performing Test compiler_has-Wnormalized
-- Performing Test compiler_has-Wnormalized - Failed
-- Performing Test compiler_has-Wdisabled-optimization
-- Performing Test compiler_has-Wdisabled-optimization - Success
-- Performing Test compiler_has-Whsa
-- Performing Test compiler_has-Whsa - Failed
-- Performing Test compiler_has-Wconversion
-- Performing Test compiler_has-Wconversion - Success
-- Performing Test compiler_has-Wunused-result
-- Performing Test compiler_has-Wunused-result - Success
-- Performing Test compiler_has-Werror-implicit-function-declaration
-- Performing Test compiler_has-Werror-implicit-function-declaration - Success
-- Performing Test linker_has-Wl_-z_relro
-- Performing Test linker_has-Wl_-z_relro - Success
-- Performing Test sanitizer-address-available
-- Performing Test sanitizer-address-available - Failed
-- Performing Test sanitizer-undefined-available
-- Performing Test sanitizer-undefined-available - Failed
-- Configuring done
-- Generating done
-- Build files have been written to: /apps/packages/nvtop/build
[root_at_ip-10-246-148-209 build]# pwd
/apps/packages/nvtop/build
[root_at_ip-10-246-148-209 build]# make
Scanning dependencies of target nvtop
[ 12%] Building C object src/CMakeFiles/nvtop.dir/nvtop.c.o
[ 25%] Building C object src/CMakeFiles/nvtop.dir/interface.c.o
[ 37%] Building C object src/CMakeFiles/nvtop.dir/interface_layout_selection.c.o
[ 50%] Building C object src/CMakeFiles/nvtop.dir/get_process_info_linux.c.o
[ 62%] Building C object src/CMakeFiles/nvtop.dir/extract_gpuinfo.c.o
[ 75%] Building C object src/CMakeFiles/nvtop.dir/time.c.o
[ 87%] Building C object src/CMakeFiles/nvtop.dir/plot.c.o
[100%] Linking C executable nvtop
[100%] Built target nvtop
[root_at_ip-10-246-148-209 build]# make install
[100%] Built target nvtop
Install the project...
-- Install configuration: "Release"
-- Installing: /usr/local/share/man/man1/nvtop.1
-- Installing: /usr/local/bin/nvtop
-- Set runtime path of "/usr/local/bin/nvtop" to "/usr/local/lib"
[root_at_ip-10-246-148-209 build]# which nvtop
/usr/local/bin/nvtop

++++++++++++
++++++++++++
Notes:
++++++++++++
++++++++++++

NVIDIA commands:

nvidia-smi -q -g 0 -d UTILIZATION -l
nvidia-smi -a -q
nvidia-smi -i 3 -l -q -d
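
If the full -q dump is more than you want, nvidia-smi can also print selected fields as CSV. A rough sketch, assuming a driver recent enough to support --query-gpu; the summarise_gpus helper is a name made up for this example.

```shell
#!/bin/sh
# Turn nvidia-smi CSV query lines into a one-line summary per GPU.
# Expected input: "index, utilization.gpu, memory.used", e.g. "0, 97, 1234".
summarise_gpus() {
    awk -F', *' '{ printf "GPU %s: %s%% busy, %s MiB in use\n", $1, $2, $3 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # noheader/nounits keep the output to bare numbers, which is easy to parse.
    nvidia-smi --query-gpu=index,utilization.gpu,memory.used \
               --format=csv,noheader,nounits | summarise_gpus
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?" >&2
fi
```

Adding -l 5 to the nvidia-smi call repeats the query every five seconds, which is handy while a NAMD job is running.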

https://github.com/Syllo/nvtop

yum install ncurses-devel
cd /apps
mkdir nvtop
cd nvtop
git clone https://github.com/Syllo/nvtop.git
mkdir -p nvtop/build && cd nvtop/build
/data/apps/build/cmake/cmake-3.4.3-Linux-x86_64/bin/cmake ..
make
make install

cmake3 .. -DNVML_RETRIEVE_HEADER_ONLINE=True

cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local \
      -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON \
      -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules -D BUILD_EXAMPLES=ON

On Thu, Feb 21, 2019 at 1:25 PM Vermaas, Joshua <Joshua.Vermaas_at_nrel.gov>
wrote:

> And the logfile. The top usually has stuff like this (important parts in
> bold) if the GPU is actually being detected and is being used:
>
> Charm++: standalone mode (not using charmrun)
> Charm++> Running in Multicore mode: 36 threads
> Charm++> Using recursive bisection (scheme 3) for topology aware partitions
> Converse/Charm++ Commit ID: v6.8.2-0-g26d4bd8-namd-charm-6.8.2-build-2018-Jan-11-30463
> CharmLB> Load balancer assumes all CPUs are same.
> Charm++> Running on 1 unique compute nodes (36-way SMP).
> Charm++> cpu topology info is gathered in 0.002 seconds.
> *Info: Built with CUDA version 9010*
> Did not find +devices i,j,k,... argument, using all
> Pe 0 physical rank 0 will use CUDA device of pe 16
> Pe 8 physical rank 8 will use CUDA device of pe 16
> Pe 24 physical rank 24 will use CUDA device of pe 32
> Pe 30 physical rank 30 will use CUDA device of pe 32
> Pe 23 physical rank 23 will use CUDA device of pe 32
> Pe 25 physical rank 25 will use CUDA device of pe 32
> Pe 7 physical rank 7 will use CUDA device of pe 16
> Pe 14 physical rank 14 will use CUDA device of pe 16
> Pe 9 physical rank 9 will use CUDA device of pe 16
> Pe 27 physical rank 27 will use CUDA device of pe 32
> Pe 1 physical rank 1 will use CUDA device of pe 16
> Pe 13 physical rank 13 will use CUDA device of pe 16
> Pe 20 physical rank 20 will use CUDA device of pe 32
> Pe 19 physical rank 19 will use CUDA device of pe 32
> Pe 15 physical rank 15 will use CUDA device of pe 16
> Pe 17 physical rank 17 will use CUDA device of pe 16
> Pe 4 physical rank 4 will use CUDA device of pe 16
> Pe 26 physical rank 26 will use CUDA device of pe 32
> Pe 5 physical rank 5 will use CUDA device of pe 16
> Pe 31 physical rank 31 will use CUDA device of pe 32
> Pe 21 physical rank 21 will use CUDA device of pe 32
> Pe 28 physical rank 28 will use CUDA device of pe 32
> Pe 22 physical rank 22 will use CUDA device of pe 32
> Pe 3 physical rank 3 will use CUDA device of pe 16
> Pe 29 physical rank 29 will use CUDA device of pe 32
> Pe 34 physical rank 34 will use CUDA device of pe 32
> Pe 6 physical rank 6 will use CUDA device of pe 16
> Pe 18 physical rank 18 will use CUDA device of pe 32
> Pe 2 physical rank 2 will use CUDA device of pe 16
> Pe 10 physical rank 10 will use CUDA device of pe 16
> Pe 33 physical rank 33 will use CUDA device of pe 32
> Pe 35 physical rank 35 will use CUDA device of pe 32
> Pe 11 physical rank 11 will use CUDA device of pe 16
> Pe 12 physical rank 12 will use CUDA device of pe 16
> *Pe 16 physical rank 16 binding to CUDA device 0 on r103u01: 'Tesla V100-PCIE-16GB' Mem: 16130MB Rev: 7.0 PCI: 0:37:0*
> *Pe 32 physical rank 32 binding to CUDA device 1 on r103u01: 'Tesla V100-PCIE-16GB' Mem: 16130MB Rev: 7.0 PCI: 0:86:0*
> *Info: NAMD 2.13 for Linux-x86_64-multicore-CUDA*
>
> If these sorts of messages aren't coming up, you'll need to do some
> debugging to actually figure out what's what.
>
> -Josh
>
>
> On 2019-02-21 08:38:14-07:00 owner-namd-l_at_ks.uiuc.edu wrote:
>
> Do you have CUDA installed with a requisite NVIDIA driver and are running
> the NAMD CUDA version? If so, what are the outputs of `nvidia-smi`?
> ------------------------------
> *From:* owner-namd-l_at_ks.uiuc.edu <owner-namd-l_at_ks.uiuc.edu> on behalf of
> Denish Poudyal <qrystal45_at_gmail.com>
> *Sent:* Thursday, February 21, 2019 8:59:29 AM
> *To:* NAMD list
> *Subject:* namd-l: Utilising the GPU in NAMD((NVIDIA CUDA acceleration)
> in windows
> I have a system with NVIDIA's Quadro K420 GPU and 12 CPU cores. When I run
> a command like
> namd2 +idlepoll +p10 <.conf file>
> I still see GPU usage around 1% and CPU usage around 90%. How can I
> employ the GPU in this simulation? We don't have GTX cards, so we are trying
> to use what we've got. What is missing in the code to make the Quadro take
> part in this NAMD simulation?
>
> *Denish Poudyal*
> *CDPTU, Nepal*
>
>
>
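
The log check Josh describes above can be scripted. This is only a sketch: it looks for the two message patterns from his excerpt ("Built with CUDA version" and "binding to CUDA device"), and the function name check_namd_log is hypothetical.

```shell
#!/bin/sh
# Report whether a NAMD log shows a CUDA build that bound to at least one GPU.
# $1 is the path to the log file written by namd2.
check_namd_log() {
    log=$1
    if ! grep -q 'Built with CUDA version' "$log"; then
        echo "no CUDA build line found"
        return 1
    fi
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    n=$(grep -c 'binding to CUDA device' "$log" || true)
    if [ "$n" -eq 0 ]; then
        echo "CUDA build, but no device binding lines found"
        return 1
    fi
    echo "CUDA build, $n device binding(s) found"
}

# Run the check only when a log file is given on the command line.
if [ -n "${1:-}" ]; then
    check_namd_log "$1"
fi
```

If the first message comes back, the namd2 on the PATH is probably not a CUDA build, which would be consistent with the 1% GPU usage reported in the original question.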

This archive was generated by hypermail 2.1.6 : Tue Dec 31 2019 - 23:20:31 CST