Re: NAMD-2.12 handful of issues with CUDA

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Fri Mar 10 2017 - 04:55:47 CST

3. Constraint errors also occur at random; perhaps some memory is left
uninitialized somewhere?

 

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Norman Geist
Sent: Friday, 10 March 2017 10:16
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: NAMD-2.12 handful of issues with CUDA

 

Dear experts,

 

I am having a lot of problems with the NAMD-2.12 version. All CUDA jobs
will:

 

1. Fail immediately in single-process SMP runs when more than one
thread is requested via ++ppn:

FATAL ERROR: CUDA error cudaStreamSynchronize(stream) in file
src/CudaTileListKernel.cu, function sortTileLists
on Pe 4 (gpu5 device 1): an illegal memory access was encountered
------------- Processor 4 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamSynchronize(stream) in file
src/CudaTileListKernel.cu, function sortTileLists
on Pe 4 (gpu5 device 1): an illegal memory access was encountered

 

This happens for my own compiled versions (CUDA-7.5) as well as for the
precompiled multicore version (CUDA-6.5).

 

2. Fail after a random number of steps (a few ps up to tens of ns) with
either a segfault or even an illegal instruction O_o (MPI + CUDA-7.5 + SMP
build).

 

I already upgraded the GPU driver, but nothing changed. By the way, I
remember also having problems (illegal instruction) with namd-2.11 and GBIS
when using CUDA.

 

Any hints?

 

Regards

 

Norman Geist
