NAMD-2.12 handful of issues with CUDA

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Fri Mar 10 2017 - 03:16:13 CST

Dear experts,

I am having a lot of problems with NAMD-2.12. All CUDA jobs will:

1. Immediately fail for single-process SMP runs when using more than one
thread via ++ppn:

FATAL ERROR: CUDA error cudaStreamSynchronize(stream) in file src/CudaTileListKernel.cu, function sortTileLists
on Pe 4 (gpu5 device 1): an illegal memory access was encountered
------------- Processor 4 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamSynchronize(stream) in file src/CudaTileListKernel.cu, function sortTileLists
on Pe 4 (gpu5 device 1): an illegal memory access was encountered

This happens with my self-compiled builds (CUDA-7.5) as well as with the
precompiled multicore version (CUDA-6.5).
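
When comparing many failed runs, it can help to pull the PE, host, device, and failing call out of each log automatically. The following is only a minimal sketch, assuming the message layout quoted above. The function name parse_namd_cuda_error and the regular expression are hypothetical helpers for illustration and are not part of NAMD.

```python
import re

# Assumption: the fatal error has the shape quoted in this post. Whitespace
# (including line breaks) may appear after "in file" and before "on Pe".
ERROR_RE = re.compile(
    r"FATAL ERROR: CUDA error (?P<call>.+?) in file\s+(?P<file>\S+), "
    r"function (?P<function>\w+)\s+"
    r"on Pe (?P<pe>\d+) \((?P<host>\S+) device (?P<device>\d+)\): "
    r"(?P<reason>.+)"
)


def parse_namd_cuda_error(log_text):
    """Return one dict per distinct CUDA fatal error found in log_text."""
    records = []
    for match in ERROR_RE.finditer(log_text):
        record = match.groupdict()
        record["pe"] = int(record["pe"])
        record["device"] = int(record["device"])
        record["reason"] = record["reason"].strip()
        # The abort "Reason:" line repeats the same error, so skip duplicates.
        if record not in records:
            records.append(record)
    return records


# Sample text copied from the error output in this post.
SAMPLE = """FATAL ERROR: CUDA error cudaStreamSynchronize(stream) in file
src/CudaTileListKernel.cu, function sortTileLists

on Pe 4 (gpu5 device 1): an illegal memory access was encountered

------------- Processor 4 Exiting: Called CmiAbort ------------

Reason: FATAL ERROR: CUDA error cudaStreamSynchronize(stream) in file
src/CudaTileListKernel.cu, function sortTileLists

on Pe 4 (gpu5 device 1): an illegal memory access was encountered
"""

records = parse_namd_cuda_error(SAMPLE)
for record in records:
    print(record)
```

Grouping the resulting records by host and device across several logs would show whether the failures cluster on one particular GPU or occur everywhere.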

2. Fail after a random number of steps (a few ps up to tens of ns) with
either a segfault or even an illegal instruction O_o (MPI + CUDA-7.5 + SMP
build)

I already upgraded the GPU driver, but nothing changed. By the way, I
remember that I also had problems with NAMD-2.11 and GBIS when using CUDA
(illegal instruction).

Any hints?

Regards

Norman Geist
