From: Ron Stubbs (rons_at_duke.edu)
Date: Mon Sep 21 2009 - 14:56:42 CDT
Hi All,
I have a Tesla C1060 installed in my workstation along with a Quadro FX
570 video card.
My problem is that NAMD-CUDA is using the FX 570 instead of the Tesla
card. Is there a way to pass an argument to NAMD-CUDA to select the
desired device? If not, I guess I'll need to download the source and
modify it to run on device 0.
I've swapped device slots, but the Tesla is still enumerated as device 0
and the video card as device 1. NAMD-CUDA appears to look for a device
at ID 1.
Here's the relevant excerpt from my output file:
Info: 1 NAMD CVS Linux-x86_64-CUDA 1 ocracoke.pratt.duke.edu rons
Info: Running on 1 processors.
Info: Charm++/Converse parallel runtime startup completed at 0.00278807 s
Did not find +devices i,j,k,... argument, using all
Pe 0 physical rank 0 binding to CUDA device 1 on
ocracoke.pratt.duke.edu: 'Quadro FX 570' Mem: 255MB Rev: 1.1
Info: 1.62163 MB of memory in use based on CmiMemoryUsage
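To check which card a run actually bound to without reading the log by eye, something like the following rough Python sketch could be used. It is not part of NAMD; the function name find_bindings and the regular expression are assumptions based only on the message format shown in the excerpt above, and other NAMD builds may word the message differently.

```python
import re

# Sample text copied from the log excerpt above. The binding message wraps
# across two lines here, so whitespace is matched loosely with \s+.
LOG = """\
Did not find +devices i,j,k,... argument, using all
Pe 0 physical rank 0 binding to CUDA device 1 on
ocracoke.pratt.duke.edu: 'Quadro FX 570' Mem: 255MB Rev: 1.1
"""

# Pattern inferred from the single message above (an assumption, not a
# documented NAMD log format).
BINDING = re.compile(
    r"Pe\s+(\d+)\s+physical rank\s+(\d+)\s+binding to CUDA device\s+(\d+)"
    r"\s+on\s+(\S+):\s+'([^']+)'"
)


def find_bindings(log_text):
    """Return a list of (pe, device_id, device_name) tuples found in the log."""
    return [
        (int(m.group(1)), int(m.group(3)), m.group(5))
        for m in BINDING.finditer(log_text)
    ]


for pe, device_id, name in find_bindings(LOG):
    print(f"Pe {pe} is bound to CUDA device {device_id} ({name})")
```

Note that the first line of the excerpt mentions a "+devices i,j,k,..." argument, which suggests the device list may be selectable at launch.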
Here are the results of running deviceQuery:
rons_at_ocracoke:~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release> ./deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There are 2 devices supporting CUDA
Device 0: "Tesla C1060"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294705152 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host
threads can use this device simultaneously)
Device 1: "Quadro FX 570"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 268107776 bytes
Number of multiprocessors: 2
Number of cores: 16
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 0.92 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host
threads can use this device simultaneously)
Test PASSED
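Since the NAMD startup message above mentions a "+devices i,j,k,..." argument, one option might be to pick the device ID from the deviceQuery output and pass it that way. Below is a minimal Python sketch of that idea. The helper names parse_devices and devices_argument are hypothetical, choosing the card with the most multiprocessors is just one possible rule, and the exact "+devices 0" syntax is inferred from the log message rather than confirmed.

```python
import re

# Abbreviated copy of the deviceQuery output above (only the fields used here).
DEVICE_QUERY = """\
Device 0: "Tesla C1060"
Number of multiprocessors: 30
Device 1: "Quadro FX 570"
Number of multiprocessors: 2
"""


def parse_devices(text):
    """Collect the ID, name and multiprocessor count of each listed device."""
    devices = []
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r'Device (\d+): "(.+)"', line)
        if m:
            devices.append({"id": int(m.group(1)), "name": m.group(2)})
            continue
        m = re.match(r"Number of multiprocessors:\s+(\d+)", line)
        if m and devices:
            # Attach the count to the most recently seen "Device N" entry.
            devices[-1]["multiprocessors"] = int(m.group(1))
    return devices


def devices_argument(devices):
    """Build a '+devices <id>' argument for the device with the most multiprocessors."""
    best = max(devices, key=lambda d: d.get("multiprocessors", 0))
    return ["+devices", str(best["id"])]


print(" ".join(devices_argument(parse_devices(DEVICE_QUERY))))
```

The resulting argument would then be added to the NAMD command line, assuming the flag works the way the startup message implies.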
Any comments/suggestions would be greatly appreciated.
Thanks,
Ron
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Ron Stubbs
Senior Systems Programmer
Research Computing
Pratt School of Engineering
1454A Fitzpatrick Center
Box 90271
Duke University, Durham, N.C. 27708-0271
office: (919)660-5339 cell:(919)641-5689
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~