Re: Trouble with NAMD Win64-CUDA 2.12 on my new PC

From: Florian Blanc (blanc.flori_at_gmail.com)
Date: Mon Jan 22 2018 - 11:29:45 CST

Hello,

The GTX 570 is too old for the CUDA build of NAMD 2.12. It is based on
the Fermi architecture (compute capability 2.0), for which support was
dropped. See
http://www.ks.uiuc.edu/Research/namd/2.13/features.html

This is what the "FATAL ERROR: CUDA error all devices are in prohibited
mode, of compute capability < 3.0, unable to map host memory, too small,
or otherwise unusable on Pe 11 (yuzhou-AMD8 device 0)" message is
telling you.
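To make the version mismatch concrete, here is a small illustrative sketch (the table is hand-assembled from public NVIDIA specs, not queried from the driver; card names and the helper function are my own, not part of NAMD):

```python
# NAMD 2.12 CUDA builds require a GPU of compute capability >= 3.0
# (Kepler or newer). Selected GeForce cards and their capabilities:
NAMD_212_MIN_CC = 3.0
COMPUTE_CAPABILITY = {
    "GeForce GTX 570": 2.0,   # Fermi  -- the card in this report
    "GeForce GTX 680": 3.0,   # Kepler -- oldest generation that works
    "GeForce GTX 980": 5.2,   # Maxwell
    "GeForce GTX 1080": 6.1,  # Pascal
}

def namd_cuda_supported(card: str) -> bool:
    """True if the card meets NAMD 2.12's minimum compute capability."""
    return COMPUTE_CAPABILITY[card] >= NAMD_212_MIN_CC

print(namd_cuda_supported("GeForce GTX 570"))   # False: Fermi is 2.0 < 3.0
print(namd_cuda_supported("GeForce GTX 1080"))  # True
```

Note that the second log below confirms this directly: NAMD reports the card as "Rev: 2.0", i.e. compute capability 2.0, below the 3.0 minimum.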

All the best,

Florian

On 01/22/2018 06:24 PM, Yu Zhou wrote:
> Dear NAMD users,
>
> I just built a desktop with AMD Ryzen 7 eight-core CPU and GTX 570
> running on Windows 7. The non-CUDA version of NAMD 2.12 works fine on
> the PC. But the Win64-CUDA version of NAMD doesn’t work. With a
> command line like "namd2 +p16 input.conf > output.conf", the output of the
> CUDA-NAMD is:
> ________________
> Charm++: standalone mode (not using charmrun)
> Charm++> Running in Multicore mode:  16 threads
> Charm++> Using recursive bisection (scheme 3) for topology aware
> partitions
> Charm++ warning> fences and atomic operations not available in native
> assembly
> Converse/Charm++ Commit ID:
> v6.7.1-0-gbdf6a1b-namd-charm-6.7.1-build-2016-Nov-07-136676
> [0] isomalloc.c> Disabling isomalloc because mmap() does not work
> CharmLB> Load balancer assumes all CPUs are same.
> Charm++> Running on 1 unique compute nodes (16-way SMP).
> Charm++> cpu topology info is gathered in 0.000 seconds.
> Info: Built with CUDA version 6050
> Did not find +devices i,j,k,... argument, using all
> FATAL ERROR: CUDA error all devices are in prohibited mode, of compute
> capability < 3.0, unable to map host memory, too small, or otherwise
> unusable on Pe 0 (yuzhou-AMD8 device 0)
> FATAL ERROR: CUDA error all devices are in prohibited mode, of compute
> capability < 3.0, unable to map host memory, too small, or otherwise
> unusable on Pe 11 (yuzhou-AMD8 device 0)
> ________________
>
> If I run CUDA-NAMD by "namd2 +p16 +devices 0 input.conf >output.conf"
> then the output is a little longer:
> ________________
> Charm++: standalone mode (not using charmrun)
> Charm++> Running in Multicore mode:  16 threads
> Charm++> Using recursive bisection (scheme 3) for topology aware
> partitions
> Charm++ warning> fences and atomic operations not available in native
> assembly
> Converse/Charm++ Commit ID:
> v6.7.1-0-gbdf6a1b-namd-charm-6.7.1-build-2016-Nov-07-136676
> [0] isomalloc.c> Disabling isomalloc because mmap() does not work
> CharmLB> Load balancer assumes all CPUs are same.
> Charm++> Running on 1 unique compute nodes (16-way SMP).
> Charm++> cpu topology info is gathered in 0.000 seconds.
> Info: Built with CUDA version 6050
> Pe 7 physical rank 7 will use CUDA device of pe 4
> Pe 14 physical rank 14 will use CUDA device of pe 12
> Pe 9 physical rank 9 will use CUDA device of pe 8
> Pe 12 sharing CUDA device 0
> Pe 15 physical rank 15 will use CUDA device of pe 12
> Pe 11 physical rank 11 will use CUDA device of pe 8
> Pe 8 sharing CUDA device 0
> Pe 3 physical rank 3 will use CUDA device of pe 2
> Pe 4 sharing CUDA device 0
> Pe 5 physical rank 5 will use CUDA device of pe 4
> Pe 10 physical rank 10 will use CUDA device of pe 8
> Pe 13 physical rank 13 will use CUDA device of pe 12
> Pe 6 physical rank 6 will use CUDA device of pe 4
> Pe 2 sharing CUDA device 0
> Pe 0 physical rank 0 will use CUDA device of pe 2
> Info: NAMD 2.12 for Win64-multicore-CUDA
> Info:
> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
> Info: for updates, documentation, and support information.
> Info:
> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
> Info: in all publications reporting results obtained with NAMD.
> Info:
> Info: Based on Charm++/Converse 60701 for multicore-win64
> Info: Built Wed, Dec 21, 2016 11:23:47 AM by jim on europa
> Info: Running on 16 processors, 1 nodes, 1 physical nodes.
> Info: CPU topology information available.
> Info: Charm++/Converse parallel runtime startup completed at 0.0930002 s
> Pe 1 physical rank 1 will use CUDA device of pe 2
> Pe 8 physical rank 8 binding to CUDA device 0 on yuzhou-AMD8: 'GeForce
> GTX 570'  Mem: 1280MB  Rev: 2.0
> Pe 12 physical rank 12 binding to CUDA device 0 on yuzhou-AMD8:
> 'GeForce GTX 570'  Mem: 1280MB  Rev: 2.0
> Pe 2 physical rank 2 binding to CUDA device 0 on yuzhou-AMD8: 'GeForce
> GTX 570'  Mem: 1280MB  Rev: 2.0
> Pe 4 physical rank 4 binding to CUDA device 0 on yuzhou-AMD8: 'GeForce
> GTX 570'  Mem: 1280MB  Rev: 2.0
> FATAL ERROR: CUDA error device not of compute capability 3.0 or higher
> on Pe 8 (yuzhou-AMD8 device 0)
> FATAL ERROR: CUDA error device not of compute capability 3.0 or higher
> on Pe 2 (yuzhou-AMD8 device 0)
> FATAL ERROR: CUDA error device not of compute capability 3.0 or higher
> on Pe 12 (yuzhou-AMD8 device 0)
> FATAL ERROR: CUDA error device not of compute capability 3.0 or higher
> on Pe 4 (yuzhou-AMD8 device 0)
> ________________
>
>
> Thanks for your help
>
> Yu Zhou
>
> Department of Anesthesiology
> Washington University School of Medicine

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2018 - 23:20:48 CST