From: Michelle Kuttel (mkuttel_at_cs.uct.ac.za)
Date: Thu Aug 03 2017 - 03:30:21 CDT
Hello
I’m having a problem running the NAMD 2.12 Linux-x86_64-multicore-CUDA (NVIDIA CUDA acceleration) build on a single node with NVIDIA M2090 cards (a node with K40s runs fine).
A run as follows:
export LD_LIBRARY_PATH=/opt/exp_soft/NAMD_2.12_Linux-x86_64-multicore-CUDA/:$LD_LIBRARY_PATH
/opt/exp_soft/NAMD_2.12_Linux-x86_64-multicore-CUDA/namd2 +p12 +idlepoll +noAnytimeMigration +LBSameCpus runEQ.conf > runEQ.log
Generates this error:
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 12 threads
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.7.1-0-gbdf6a1b-namd-charm-6.7.1-build-2016-Nov-07-136676
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (12-way SMP).
Charm++> cpu topology info is gathered in 0.010 seconds.
Info: Built with CUDA version 6050
Did not find +devices i,j,k,... argument, using all
FATAL ERROR: CUDA error all devices are in prohibited mode, of compute capability < 3.0, unable to map host memory, too small, or otherwise unusable on Pe 9 (srvslsgpu002 device 0)
AFAIK, NAMD 2.12 supports GPUs with a compute capability of 2.0 (i.e. M2090s)?
This runs fine on the node with K40s.
Suggestions for fixes will be (greatly) appreciated (note that specifying “+devices all” does not help).
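The error message lists several possible causes, one of which is a device in prohibited compute mode. As a minimal sketch for checking only that one cause, the script below prints each GPU's compute mode. It assumes the installed driver's nvidia-smi supports the --query-gpu option; gpu_mode_status is a hypothetical helper name used for illustration, not part of NAMD or nvidia-smi. It does not check compute capability or the other causes named in the error.

```shell
#!/usr/bin/env bash
# Sketch: report which GPUs are in a compute mode that blocks new CUDA contexts.
# gpu_mode_status is a hypothetical helper, not part of NAMD or nvidia-smi.

gpu_mode_status() {
    # $1 is a compute mode string as reported by nvidia-smi, e.g. "Default".
    case "$1" in
        *Prohibited*) echo "unusable" ;;
        *)            echo "usable" ;;
    esac
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # One CSV line per device: index, name, compute_mode
    nvidia-smi --query-gpu=index,name,compute_mode --format=csv,noheader |
    while IFS=, read -r idx name mode; do
        name="${name# }"   # drop the space that follows each comma
        mode="${mode# }"
        echo "device ${idx}: ${name} (${mode}) -> $(gpu_mode_status "$mode")"
    done
else
    echo "nvidia-smi not found; skipping device query"
fi
```

If every device is reported as usable here, the compute mode is probably not what triggers the error, and the remaining causes in the message would need to be examined separately.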
regards
Michelle
-----------------------------
Michelle Kuttel
mkuttel_at_cs.uct.ac.za
-----------------------------