Re: FATAL ERROR: CUDA error in cudaGetDeviceCount on Pe 0 (thomasASUS): CUDA driver version is insufficient for CUDA runtime version

From: Thomas Evangelidis (tevang3_at_gmail.com)
Date: Sat Oct 20 2012 - 11:46:53 CDT

Hi again,

I managed to install the latest NVIDIA drivers (NVIDIA-Linux-x86_64-304.51)
and the latest production CUDA-5.0 release on my Asus N56V with an i7-3610QM
and a GeForce GT 650M. The trick for NAMD to find my GPU was to pass
"+devices 0" explicitly on the command line. The whole command line looked
like this:

${NAMD_HOME}/charmrun ++local +p8 ${NAMD_HOME}/namd2 +idlepoll +devices 0
production_default.amberff.octahedron.namd

I used the precompiled binaries NAMD_CVS-2012-09-22_Linux-x86_64-multicore
and NAMD_CVS-2012-09-22_Linux-x86_64-multicore-CUDA to measure the
performance on my system, which is a truncated octahedron containing a
protein of 2796 atoms (the ff I use is Amber99SB-NMR1-ILDN), 131788 TIP4P-Ew
water atoms (32947 waters; each TIP4P-Ew counts as 4 atoms in the Amber
.prmtop), 93 Na and 113 Cl ions, namely 2796+131788+93+113=134790 atoms in
total. Surprisingly, the performance without the GPU is better, as you can
see below.

With the GPU:
Info: Benchmark time: 8 CPUs 0.238132 s/step 1.37808 days/ns 359.961 MB
memory

Without the GPU:
Info: Benchmark time: 8 CPUs 0.206626 s/step 1.19575 days/ns 720.852 MB
memory

The only case where I get better performance with the GPU is when I run
NAMD in serial mode (a single core):

With the GPU:
Info: Benchmark time: 1 CPUs 0.26001 s/step 1.50469 days/ns 256.984 MB
memory

Without the GPU:
Info: Benchmark time: 1 CPUs 0.808154 s/step 4.67682 days/ns 504.398 MB
memory
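
To put the four timings above side by side, here is a minimal Python sketch
that parses the "Info: Benchmark time:" lines and computes the GPU speedup
for each core count. The regular expression and the function names
(parse_benchmark, gpu_speedup) are illustrative assumptions, not part of
NAMD; the log lines are the ones quoted above, joined onto a single line
each.

```python
import re

# Benchmark lines from the octahedron runs above, keyed by (mode, cores).
LOG_LINES = {
    ("gpu", 8): "Info: Benchmark time: 8 CPUs 0.238132 s/step 1.37808 days/ns 359.961 MB memory",
    ("cpu", 8): "Info: Benchmark time: 8 CPUs 0.206626 s/step 1.19575 days/ns 720.852 MB memory",
    ("gpu", 1): "Info: Benchmark time: 1 CPUs 0.26001 s/step 1.50469 days/ns 256.984 MB memory",
    ("cpu", 1): "Info: Benchmark time: 1 CPUs 0.808154 s/step 4.67682 days/ns 504.398 MB memory",
}

# Hypothetical pattern matching the line layout shown in this message.
BENCH_RE = re.compile(
    r"Benchmark time:\s+(\d+)\s+CPUs\s+([\d.]+)\s+s/step\s+([\d.]+)\s+days/ns"
)


def parse_benchmark(line):
    """Return (cores, seconds_per_step, days_per_ns) from one benchmark line."""
    match = BENCH_RE.search(line)
    if match is None:
        raise ValueError("not a benchmark line: %r" % line)
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


def gpu_speedup(cores):
    """CPU-only s/step divided by GPU s/step; a value above 1 means the GPU run is faster."""
    _, cpu_s_per_step, _ = parse_benchmark(LOG_LINES[("cpu", cores)])
    _, gpu_s_per_step, _ = parse_benchmark(LOG_LINES[("gpu", cores)])
    return cpu_s_per_step / gpu_s_per_step


for cores in (8, 1):
    print("%d core(s): GPU speedup %.2fx" % (cores, gpu_speedup(cores)))
```

A ratio below 1 on 8 cores and above 1 on a single core matches what the
raw numbers show: the GPU only helps when one core is used.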

For the apoa1 benchmark, NAMD complained about "++local", so I used the
following command line:

${NAMD_HOME}//charmrun +p8 ${NAMD_HOME}//namd2 +idlepoll +devices 0
apoa1.namd

This time the performance was almost the same with and without the GPU:

With the GPU:
Info: Benchmark time: 8 CPUs 0.22935 s/step 2.65451 days/ns 280.375 MB
memory

Without the GPU:
Info: Benchmark time: 8 CPUs 0.223781 s/step 2.59006 days/ns 696.184 MB
memory
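
Note that the s/step values of the two systems are similar while the
days/ns values differ by about a factor of two. The sketch below converts
s/step to days/ns and back-calculates the timestep implied by each
benchmark line. The timesteps are not stated in this message; they are
inferred from the reported numbers, and the function names are
illustrative only.

```python
SECONDS_PER_DAY = 86400.0
FS_PER_NS = 1.0e6  # femtoseconds in one nanosecond


def days_per_ns(s_per_step, timestep_fs):
    """Wall-clock days needed to simulate 1 ns at the given cost per step."""
    steps_per_ns = FS_PER_NS / timestep_fs
    return s_per_step * steps_per_ns / SECONDS_PER_DAY


def implied_timestep_fs(s_per_step, reported_days_per_ns):
    """Timestep (fs) that makes s/step consistent with the reported days/ns."""
    return s_per_step * FS_PER_NS / (SECONDS_PER_DAY * reported_days_per_ns)


# GPU runs on 8 cores, values taken from the benchmark lines above.
octahedron_dt = implied_timestep_fs(0.238132, 1.37808)
apoa1_dt = implied_timestep_fs(0.22935, 2.65451)

print("octahedron system: implied timestep %.2f fs" % octahedron_dt)
print("apoa1 benchmark:   implied timestep %.2f fs" % apoa1_dt)
```

If the inferred timesteps are right (about 2 fs for my system and about
1 fs for apoa1), the days/ns figures of the two systems should not be
compared directly; s/step is the fairer number.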

Is there any parameter I can tweak to get better GPU performance for my
system? Below is the GPU assignment when I run on all available cores.

Pe 7 physical rank 7 will use CUDA device of pe 4
Pe 2 physical rank 2 will use CUDA device of pe 4
Pe 3 physical rank 3 will use CUDA device of pe 4
Pe 6 physical rank 6 will use CUDA device of pe 4
Pe 1 physical rank 1 will use CUDA device of pe 4
Pe 5 physical rank 5 will use CUDA device of pe 4
Pe 4 physical rank 4 binding to CUDA device 0 on thomasASUS: 'GeForce GT
650M' Mem: 2047MB Rev: 3.0
Pe 0 physical rank 0 will use CUDA device of pe 4

Thanks,
Thomas
