Re: All CUDA devices are in prohibited mode, of compute capability 1.0, or otherwise unusable.

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Thu Aug 15 2013 - 02:12:58 CDT

The Ti has about 30% better performance, and the price differs by a similar
factor. The same holds for the 700 series models. Pricing scales almost
perfectly with performance, apart from the faster memory in the newer
series. So in the end it doesn't really matter which one you pick.

 

Norman Geist.

 

From: Lucas [mailto:lucasbleicher_at_gmail.com]
Sent: Thursday, 15 August 2013 08:53
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: All CUDA devices are in prohibited mode, of compute
capability 1.0, or otherwise unusable.

 

As for LD_LIBRARY_PATH, I simply put the namd2 command line in a shell
script, with an "export LD_LIBRARY_PATH=libcudartpath" line before it.
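
A minimal sketch of such a wrapper script, for readers who want to reproduce the setup. The install directory /opt/namd, the config name sim.conf and the helper prepend_path are assumptions for illustration and do not come from the original message; +p4 stands for the four processors mentioned in this thread.

```shell
#!/bin/sh
# prepend_path DIR CURRENT
# Prints DIR, followed by ":CURRENT" when CURRENT is non-empty, so that an
# existing LD_LIBRARY_PATH is kept rather than overwritten.
prepend_path() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Hypothetical directory that holds namd2 and the libcudart it was built against.
NAMD_DIR="${NAMD_DIR:-/opt/namd}"

LD_LIBRARY_PATH=$(prepend_path "$NAMD_DIR" "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH

# Start NAMD only if the binary is actually there; sim.conf is a placeholder.
if [ -x "$NAMD_DIR/namd2" ]; then
    "$NAMD_DIR/namd2" +p4 +devices 0 "${1:-sim.conf}"
fi
```

Prepending instead of overwriting keeps any other library directories that were already set in the environment.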

While fiddling with the "+devices" directive I have also tried other
combinations, including 0,0,0,0, but all of them turned out to be slower than
simply using the CPUs. I'll try the other options and see if they help, but
it seems the hardware isn't that great...

The good news is that it isn't my machine - I'm visiting another group which
happens to have a desktop with that configuration that sees virtually no use.
I'm actually deciding which model to buy for my own server, and I'm quite
confused by all those different models. Since you mentioned the GTX660,
how does it compare to the GTX660Ti? Amazon is listing the first for $194
and the second for $317.

Cheers,
Lucas

 

2013/8/14 Norman Geist <norman.geist_at_uni-greifswald.de>

Alright, I just wanted to check which libcudart it is using. How do you pass
LD_LIBRARY_PATH to namd2? Do you use ++runscript? Possibly namd is
getting the wrong cudart version dynamically linked.
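
One way to make that check, sketched under assumptions: the helper name missing_libs and the path ./namd2 are made up for illustration. The filter only looks for the "=> not found" marker that ldd prints for unresolved libraries, as seen for libcudart.so.4 in the ldd output quoted further down.

```shell
#!/bin/sh
# missing_libs: reads ldd output on stdin and prints the name of every
# shared library that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical location of the binary; adjust as needed.
NAMD_BIN="${NAMD_BIN:-./namd2}"

if [ -x "$NAMD_BIN" ]; then
    # Run this in the same environment (same LD_LIBRARY_PATH) that the
    # simulation is started from, otherwise the result is not comparable.
    ldd "$NAMD_BIN" | missing_libs
fi
```

An empty result only says that every library was found somewhere; it does not prove that the libcudart picked up is the version namd2 was built against, so the full ldd line for libcudart is still worth reading.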

It seems namd is unable to enumerate your devices for some reason, maybe
because of the libcudart issue above. But it is fine if it runs when you
pass the device ID.

What version of namd are you using? If it is newer than 2.8, try
+ignoresharing or +devices 0,0,0,0 (one entry per CPU) on the namd2 command
line, to try oversubscription. Additionally, for all versions, try
"twoawayx yes" for small boxes and "fullelectfrequency 4" in your namd
config, to reduce PCIe communication in general.
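
As a rough sketch of the oversubscription variant, the device list can be built from the CPU count so that every process maps to the same GPU. The helper device_list, the variable NCPU and the file name sim.conf are assumptions for illustration; the script only prints the command instead of running it.

```shell
#!/bin/sh
# device_list N ID
# Prints ID repeated N times, comma separated, e.g. "0,0,0,0" for N=4, ID=0.
device_list() {
    n=$1
    id=$2
    out=""
    i=0
    while [ "$i" -lt "$n" ]; do
        out="${out:+$out,}$id"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

NCPU="${NCPU:-4}"   # number of CPU processes, four in this thread

# Print the command line that would share GPU 0 among all processes.
echo namd2 "+p$NCPU" +devices "$(device_list "$NCPU" 0)" sim.conf

# The two tuning options are config-file settings, not command-line flags.
# They would go into the NAMD config (sim.conf here) as:
#   twoawayx            yes
#   fullelectfrequency  4
```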

But I think you won't get much more performance out of the quadro, as it has
only 96 cuda cores, which is very few compared to the 960 on a GTX660 (1344
on the Ti), which costs only about 200 bucks.

I understand why people buy themselves an nvidia tesla series card, but I
don't understand why people pay that much money for this quadro rubbish. The
key advantages they have are showing GPU utilization and doing quad-buffered
stereo. Much cheaper is, for instance, a GTX660 plus a passive 3D monitor,
together about 500 bucks, with much better performance and no need for an
expensive 140Hz monitor or a quadro card that sometimes costs a few
thousand.

If your machine has one more pcie x16 slot, try to get yourself a geforce ;)

Good luck

Norman Geist.

> -----Original Message-----
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> Behalf Of Lucas
> Sent: Thursday, 15 August 2013 00:10
> To: Norman Geist; namd-l_at_ks.uiuc.edu
> Subject: Re: namd-l: All CUDA devices are in prohibited mode, of
> compute capability 1.0, or otherwise unusable.
>
> ldd returns:
> linux-vdso.so.1 => (0x00007fffe39ff000)
> libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003049200000)
> libdl.so.2 => /lib64/libdl.so.2 (0x0000003048e00000)
> libcudart.so.4 => not found
> libm.so.6 => /lib64/libm.so.6 (0x0000003049600000)
> libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003055a00000)
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003054a00000)
> libc.so.6 => /lib64/libc.so.6 (0x0000003048a00000)
> /lib64/ld-linux-x86-64.so.2 (0x0000003048600000)
>
> nvidia-smi -q -g 0 actually returns an error (Invalid combination of
> input arguments), but I've included the full output from a
> "nvidia-smi -q" at the end of the e-mail.
>
> I was fiddling with the command line options this morning and found
> out that the program actually runs if I include the +devices directive
> (I thought that namd would simply assume "+devices all" if
> that option was not used). What is weird about it is that the
> benchmark times for four processors and the GPU are actually much
> slower than simply using the four processors (even if I use a very
> high value for outputEnergies, such as 500). I know the Quadro 600
> isn't one of the most powerful cards around, but it is still weird to
> see such performance (about 3 days/ns with CPUs only, 3.6 days/ns when
> using +devices 0).
>
> ==========================================
> Output from nvidia-smi -q:
>
> Timestamp : Wed Aug 14 11:20:21 2013
> Driver Version : 319.37
>
> Attached GPUs : 1
> GPU 0000:01:00.0
> Product Name : Quadro 600
> Display Mode : Enabled
> Display Active : Enabled
> Persistence Mode : Disabled
> Accounting Mode : N/A
> Accounting Mode Buffer Size : N/A
> Driver Model
> Current : N/A
> Pending : N/A
> Serial Number : N/A
> GPU UUID : GPU-0fb91d62-ad2d-2b61-a988-91870abafeb1
> VBIOS Version : 70.08.88.00.01
> Inforom Version
> Image Version : N/A
> OEM Object : N/A
> ECC Object : N/A
> Power Management Object : N/A
> GPU Operation Mode
> Current : N/A
> Pending : N/A
> PCI
> Bus : 0x01
> Device : 0x00
> Domain : 0x0000
> Device Id : 0x0DF810DE
> Bus Id : 0000:01:00.0
> Sub System Id : 0x083510DE
> GPU Link Info
> PCIe Generation
> Max : 2
> Current : 1
> Link Width
> Max : 16x
> Current : 16x
> Fan Speed : 30 %
> Performance State : P12
> Clocks Throttle Reasons : N/A
> Memory Usage
> Total : 1023 MB
> Used : 68 MB
> Free : 955 MB
> Compute Mode : Default
> Utilization
> Gpu : 0 %
> Memory : 12 %
> Ecc Mode
> Current : N/A
> Pending : N/A
> ECC Errors
> Volatile
> Single Bit
> Device Memory : N/A
> Register File : N/A
> L1 Cache : N/A
> L2 Cache : N/A
> Texture Memory : N/A
> Total : N/A
> Double Bit
> Device Memory : N/A
> Register File : N/A
> L1 Cache : N/A
> L2 Cache : N/A
> Texture Memory : N/A
> Total : N/A
> Aggregate
> Single Bit
> Device Memory : N/A
> Register File : N/A
> L1 Cache : N/A
> L2 Cache : N/A
> Texture Memory : N/A
> Total : N/A
> Double Bit
> Device Memory : N/A
> Register File : N/A
> L1 Cache : N/A
> L2 Cache : N/A
> Texture Memory : N/A
> Total : N/A
> Retired Pages
> Single Bit ECC : N/A
> Double Bit ECC : N/A
> Pending : N/A
> Temperature
> Gpu : 49 C
> Power Readings
> Power Management : N/A
> Power Draw : N/A
> Power Limit : N/A
> Default Power Limit : N/A
> Enforced Power Limit : N/A
> Min Power Limit : N/A
> Max Power Limit : N/A
> Clocks
> Graphics : 50 MHz
> SM : 101 MHz
> Memory : 135 MHz
> Applications Clocks
> Graphics : N/A
> Memory : N/A
> Default Applications Clocks
> Graphics : N/A
> Memory : N/A
> Max Clocks
> Graphics : 641 MHz
> SM : 1282 MHz
> Memory : 800 MHz
> Compute Processes : None

 
