Newbie GPU user, please help with my submission line

From: Jose Borreguero (borreguero_at_gmail.com)
Date: Thu Oct 02 2014 - 14:03:24 CDT

Dear NAMD users,

I am trying to run the NAMD 2.10b GPU version on a 16-core node containing one
GPU (Tesla K20X) and one 16-core CPU (2.2 GHz AMD Opteron 6274 Interlagos).

This is my submission line. I am using one node and setting aside one
thread for communication.

aprun -n 1 -N 1 -d 16 namd2 +ppn 15 +pemap 1-15 +commap 0 test_gpu.conf >& test_gpu.log
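
For reference, here is how the same launch would look with the GPU pinned
explicitly via +devices (the log below notes that no +devices argument was
found, so all devices are used). The PBS header and wrapper lines are only a
generic sketch of a single-node batch script, not my actual job file:

#!/bin/bash
#PBS -l nodes=1
#PBS -l walltime=00:30:00
cd $PBS_O_WORKDIR
# One process on one node, 16 CPU slots reserved (-d 16):
# 15 worker threads on cores 1-15, the communication thread on core 0,
# and CUDA device 0 assigned explicitly instead of auto-detected.
aprun -n 1 -N 1 -d 16 namd2 +ppn 15 +pemap 1-15 +commap 0 +devices 0 test_gpu.conf >& test_gpu.log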

The log file shows the following error:
Pe 8 has 666 local and 665 remote patches and 17982 local and 17955 remote
computes.
------------- Processor 8 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error before memcpy to atoms on Pe 8 (physical
node 0 device 0): invalid configuration argument

Below are the relevant lines from the log file. Could you please let me know
if anything pops up as a red flag? This is my first time running the GPU
version, so I am not familiar with it yet.
Charm++> Running in SMP mode: numNodes 1, 15 worker threads per process
Charm++> cpuaffinity PE-core map : 1-15
Charm++> set comm 0 on node 0 to core #0
Charm++> Running on 1 unique compute nodes (16-way SMP).
Did not find +devices i,j,k,... argument, using all
Pe 0 physical rank 0 will use CUDA device of pe 8
Pe 13 physical rank 13 will use CUDA device of pe 8
Pe 12 physical rank 12 will use CUDA device of pe 8
Pe 7 physical rank 7 will use CUDA device of pe 8
Pe 2 physical rank 2 will use CUDA device of pe 8
Pe 1 physical rank 1 will use CUDA device of pe 8
Pe 6 physical rank 6 will use CUDA device of pe 8
Pe 3 physical rank 3 will use CUDA device of pe 8
Pe 4 physical rank 4 will use CUDA device of pe 8
Pe 5 physical rank 5 will use CUDA device of pe 8
Pe 10 physical rank 10 will use CUDA device of pe 8
Pe 9 physical rank 9 will use CUDA device of pe 8
Pe 11 physical rank 11 will use CUDA device of pe 8
Pe 14 physical rank 14 will use CUDA device of pe 8
Pe 8 physical rank 8 binding to CUDA device 0 on physical node 0: 'Tesla
K20X' Mem: 5759MB Rev: 3.5
Info: NAMD 2.10b1 for CRAY-XE-ugni-smp-Titan-CUDA
....Pe 4 hosts 7 local and 8 remote patches for pe 8
Pe 6 hosts 10 local and 10 remote patches for pe 8
Pe 7 hosts 7 local and 7 remote patches for pe 8
Pe 10 hosts 10 local and 10 remote patches for pe 8
Pe 3 hosts 10 local and 9 remote patches for pe 8
Pe 11 hosts 12 local and 13 remote patches for pe 8
Pe 8 hosts 12 local and 11 remote patches for pe 8
Pe 9 hosts 38 local and 38 remote patches for pe 8
Pe 12 hosts 44 local and 43 remote patches for pe 8
Pe 1 hosts 17 local and 16 remote patches for pe 8
Pe 13 hosts 16 local and 17 remote patches for pe 8
Pe 2 hosts 48 local and 49 remote patches for pe 8
Pe 5 hosts 43 local and 43 remote patches for pe 8
Pe 14 hosts 196 local and 195 remote patches for pe 8
Pe 0 hosts 196 local and 196 remote patches for pe 8
..
Pe 8 has 666 local and 665 remote patches and 17982 local and 17955 remote
computes.
------------- Processor 8 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error before memcpy to atoms on Pe 8 (physical
node 0 device 0): invalid configuration argument
FATAL ERROR: CUDA error at cuda stream completed on Pe 8 (physical node 0
device 0): invalid configuration argument
aborting job:
FATAL ERROR: CUDA error at cuda stream completed on Pe 8 (physical node 0
device 0): invalid configuration argument
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 10 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 0 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 11 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 2 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 5 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 7 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 12 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 6 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 3 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 13 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 14 (physical node 0
device 0): unload of CUDA runtime failed
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 9 (physical node 0
device 0): unload of CUDA runtime failed
aborting job:
FATAL ERROR: CUDA error in cuda_check_pme_forces on Pe 10 (physical node 0
device 0): unload of CUDA runtime failed
