Re: cuda error cudastreamcreate

From: Ajasja Ljubetič (ajasja.ljubetic_at_gmail.com)
Date: Tue Jun 14 2011 - 08:23:23 CDT

Are you sure the CUDA drivers are correctly installed?
Try building and running some example programs from the CUDA SDK. Personally,
I found this guide invaluable:
http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Getting_Started_Linux.pdf
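
Before building the SDK samples, a quick sanity check can narrow things down. The following is a minimal sketch, not taken from the guide, assuming a Linux machine with the proprietary NVIDIA driver. It checks whether the "nvidia" kernel module is listed in /proc/modules and whether the device files /dev/nvidiactl and /dev/nvidia0, /dev/nvidia1, ... exist. The function names and the GPU count of 2 are illustrative assumptions (the count matches the 2x GTX 470 in the quoted post).

```python
import glob


def nvidia_module_loaded(proc_modules_text):
    """Return True if any line of /proc/modules starts with the module name 'nvidia'."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "nvidia":
            return True
    return False


def missing_device_nodes(existing_paths, gpu_count):
    """Return the expected NVIDIA device files that are not in existing_paths."""
    # The driver uses one control node plus one node per GPU.
    expected = ["/dev/nvidiactl"] + ["/dev/nvidia%d" % i for i in range(gpu_count)]
    return [path for path in expected if path not in existing_paths]


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            modules_text = handle.read()
    except OSError:
        # /proc/modules is Linux-specific; treat it as empty elsewhere.
        modules_text = ""
    print("nvidia kernel module loaded:", nvidia_module_loaded(modules_text))

    # Assumption: 2 GPUs, as in the machine described in the quoted message.
    present = set(glob.glob("/dev/nvidia*"))
    missing = missing_device_nodes(present, gpu_count=2)
    print("missing device files:", missing if missing else "none")
```

If the module is loaded but a device file such as /dev/nvidia1 is reported missing, that would be consistent with the "could not open the device file" message quoted below; on machines that do not start X, these files may not be created automatically.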

Best regards,
Ajasja

On Tue, Jun 14, 2011 at 07:45, Francesco Pietra <chiendarret_at_gmail.com> wrote:

> Hello:
> With a gaming machine
> Gigabyte GA 890FXAUD5
> Six-core AMD PhenomII 1075T
> 2x GTX 470
> NAMD_CVS-2011-06-04_Linux-x86_64-CUDA.tar.gz
> Debian GNU-Linux amd64 wheezy
>
> I could run plain MD:
>
> Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
> Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
> Info: 1 NAMD CVS-2011-06-04 Linux-x86_64-CUDA 6 gig64 francesco
> Info: Running on 6 processors, 6 nodes, 1 physical nodes.
> Info: CPU topology information available.
> Info: Charm++/Converse parallel runtime startup completed at 0.00650811 s
> Pe 5 sharing CUDA device 1 first 1 next 1
> Pe 2 sharing CUDA device 0 first 0 next 4
> Did not find +devices i,j,k,... argument, using all
> Pe 5 physical rank 5 binding to CUDA device 1 on gig64: 'GeForce GTX
> 470' Mem: 1279MB Rev: 2.0
> Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'GeForce GTX
> 470' Mem: 1279MB Rev: 2.0
> Pe 0 sharing CUDA device 0 first 0 next 2
> Pe 3 sharing CUDA device 1 first 1 next 5
> Pe 1 sharing CUDA device 1 first 1 next 3
> Pe 1 physical rank 1 binding to CUDA device 1 on gig64: 'GeForce GTX
> 470' Mem: 1279MB Rev: 2.0
> Pe 0 physical rank 0 binding to CUDA device 0 on gig64: 'GeForce GTX
> 470' Mem: 1279MB Rev: 2.0
> Pe 3 physical rank 3 binding to CUDA device 1 on gig64: 'GeForce GTX
> 470' Mem: 1279MB Rev: 2.0
> Pe 4 sharing CUDA device 0 first 0 next 0
> Pe 4 physical rank 4 binding to CUDA device 0 on gig64: 'GeForce GTX
> 470' Mem: 1279MB Rev: 2.0
> Info: 1.64104 MB of memory in use based on CmiMemoryUsage
> Info: Configuration file is min-02.conf
>
> Yesterday's failure, "cuda error cudastreamcreate", was resolved
> by visiting, step by step,
>
> ----/var/lib/dkms/nvidia/270.41.19/2.6.38-2-amd64/x86_64/module/nvidia.ko
>
> and
>
> ----/lib/module/2.6.38-2-amd64/update/dkms/nvidia.ko
>
> and (perhaps, unsure whether this next action was really carried out):
>
> ---reboot
>
> whereby the machine worked nicely for several different tasks all day
> and night long.
>
> Today the same error, "cuda error cudastreamcreate", appeared again, and
> the procedure above, including a reboot, does not get NAMD running. The
> log file says:
>
> Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
> Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
> Info: 1 NAMD CVS-2011-06-04 Linux-x86_64-CUDA 6 gig64 francesco
> Info: Running on 6 processors, 6 nodes, 1 physical nodes.
> Info: CPU topology information available.
> Info: Charm++/Converse parallel runtime startup completed at 0.0124412 s
> Pe 5 sharing CUDA device 0 first 0 next 0
> Pe 5 physical rank 5 binding to CUDA device 0 on gig64: 'Device
> Emulation (CPU)' Mem: 0MB Rev: 9999.9999
> FATAL ERROR: CUDA error cudaStreamCreate on Pe 5 (gig64 device 0): no
> CUDA-capable device is available
> ------------- Processor 5 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 5 (gig64 device
> 0): no CUDA-capable device is available
>
> Did not find +devices i,j,k,... argument, using all
> Pe 0 sharing CUDA device 0 first 0 next 1
> Pe 0 physical rank 0 binding to CUDA device 0 on gig64: 'Device
> Emulation (CPU)' Mem: 0MB Rev: 9999.9999
> Pe 3 sharing CUDA device 0 first 0 next 4
> Pe 3 physical rank 3 binding to CUDA device 0 on gig64: 'Device
> Emulation (CPU)' Mem: 0MB Rev: 9999.9999
> Pe 1 sharing CUDA device 0 first 0 next 2
> Pe 1 physical rank 1 binding to CUDA device 0 on gig64: 'Device
> Emulation (CPU)' Mem: 0MB Rev: 9999.9999
> FATAL ERROR: CUDA error cudaStreamCreate on Pe 0 (gig64 device 0): no
> CUDA-capable device is available
> ------------- Processor 0 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 0 (gig64 device
> 0): no CUDA-capable device is available
>
> FATAL ERROR: CUDA error cudaStreamCreate on Pe 3 (gig64 device 0): no
> CUDA-capable device is available
> ------------- Processor 3 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 3 (gig64 device
> 0): no CUDA-capable device is available
>
> FATAL ERROR: CUDA error cudaStreamCreate on Pe 1 (gig64 device 0): no
> CUDA-capable device is available
> ------------- Processor 1 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 1 (gig64 device
> 0): no CUDA-capable device is available
>
> Pe 2 sharing CUDA device 0 first 0 next 3
> Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'Device
> Emulation (CPU)' Mem: 0MB Rev: 9999.9999
> FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device 0): no
> CUDA-capable device is available
> ------------- Processor 2 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device
> 0): no CUDA-capable device is available
>
> Pe 4 sharing CUDA device 0 first 0 next 5
> Pe 4 physical rank 4 binding to CUDA device 0 on gig64: 'Device
> Emulation (CPU)' Mem: 0MB Rev: 9999.9999
> FATAL ERROR: CUDA error cudaStreamCreate on Pe 4 (gig64 device 0): no
> CUDA-capable device is available
> ------------- Processor 4 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 4 (gig64 device
> 0): no CUDA-capable device is available
>
> [0] Stack Traceback:
>
> --------------------------------
> nvidia-smi -r (or nvidia-smi -a)
> NVIDIA: could not open the device file /dev/nvidia1 (no such file)
> Failed to initialize NVML: unknown error.
>
> If "nvidia-smi" is for Tesla only, how can I check the GTX 470 cards?
>
> Thanks for advice
>
> francesco pietra
>
>
