AW: CUDA error in cuda_check_local_progress

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Thu Apr 17 2014 - 03:21:44 CDT

Next message: Norman Geist: "AW: CUDA error in cuda_check_local_progress"
Previous message: Abhishek TYAGI: "CUDA error in cuda_check_local_progress"
In reply to: Abhishek TYAGI: "CUDA error in cuda_check_local_progress"
Next in thread: Abhishek TYAGI: "RE: CUDA error in cuda_check_local_progress"
Reply: Abhishek TYAGI: "RE: CUDA error in cuda_check_local_progress"
Reply: Abhishek TYAGI: "RE: CUDA error in cuda_check_local_progress"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Which GPUs are these? This error occurs, for example, if your cutoff,
pairlistdist, etc. are too large for the GPU's memory. What is the
output of "nvidia-smi -q"? Maybe there are multiple GPUs and one of them is
only for display and therefore doesn't have enough memory. Try setting
+devices to select the GPU IDs manually and see whether it works with one
GPU on its own.
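
A minimal sketch of that check, assuming the installed nvidia-smi supports the
--query-gpu CSV options. The helper name pick_devices and the 2048 MiB
threshold are hypothetical and only for illustration; they are not part of
NAMD or nvidia-smi.

```shell
#!/bin/sh
# Hypothetical helper: reads "index, memory.total" rows on stdin (one GPU per
# line, memory in MiB) and prints a comma-separated list of the device ids
# that have at least the given amount of memory.
pick_devices() {
    min_mib="$1"
    awk -F', *' -v min="$min_mib" '
        $2 + 0 >= min { ids = ids sep $1; sep = "," }
        END { print ids }
    '
}

# Possible usage on the GPU node (the threshold is an assumption, adjust it):
#   devices=$(nvidia-smi --query-gpu=index,memory.total \
#                        --format=csv,noheader,nounits | pick_devices 2048)
#   namd2 +idlepoll +p4 +devices "$devices" eq1.namd > eq1.log
#
# To test a single GPU on its own, pass one id directly, e.g. +devices 0.
```

If the run succeeds on one device but fails on another, that points at the
failing card (for example a small display-only GPU) rather than at the
configuration file.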

Norman Geist.

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Abhishek TYAGI
Sent: Thursday, April 17, 2014 09:41
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: CUDA error in cuda_check_local_progress

Hi,

I am running a simulation of a graphene and DNA system. On my CPU it runs
without error, but on a GPU cluster (Nvidia, CUDA), using the NAMD build
available on the website (NAMD_2.9_Linux-x86_64-multicore-CUDA.tar.gz), the
following error appears every time. I tried changing the timestep, the
frequencies and other settings, but I really don't understand what to do in
this case.

I run the minimization with the command below, but it fails every time:

% charmrun namd2 +idlepoll +p4 eq1.namd > eq1.log &

------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error in cuda_check_local_progress on Pe 0 (gpu10
device 0): unspecified launch failure

Charm++ fatal error:
FATAL ERROR: CUDA error in cuda_check_local_progress on Pe 0 (gpu10 device
0): unspecified launch failure

The eq1.namd conf file is as follows:

#############################################################
## JOB DESCRIPTION ##
#############################################################

# Minimization and Equilibration of
# COMMENT ON YOUR SYSTEM HERE

#############################################################
## ADJUSTABLE PARAMETERS ##
#############################################################

structure ionized.psf
coordinates ionized.pdb

set temperature 298
set outputname eq1

firsttimestep 0

#############################################################
## SIMULATION PARAMETERS ##
#############################################################

# Input
paraTypeCharmm on
parameters par_all27_na.prm
parameters par_graphene.prm
temperature $temperature

# Force-Field Parameters
exclude scaled1-4
1-4scaling 1.0
cutoff 12.
switching on
switchdist 10.
pairlistdist 13.5

# Integrator Parameters
timestep 0.5
rigidBonds all
nonbondedFreq 2
fullElectFrequency 4
stepspercycle 10

# Constant Temperature Control
langevin off
langevinDamping 5
langevinTemp $temperature
langevinHydrogen off

# Output
outputName $outputname

restartfreq 500 ;# 500 steps = every 0.25 ps (timestep 0.5 fs)
dcdfreq 300
outputEnergies 100
outputPressure 100

#############################################################
## PBC PARAMETERS ##
#############################################################

# Periodic Boundary Conditions
cellBasisVector1 40.0 0.0 0.0
cellBasisVector2 0.0 40.0 0.0
cellBasisVector3 0.0 0.0 30.0
cellOrigin 0.0 0.0 0.0

#############################################################
## EXECUTION SCRIPT ##
#############################################################

# Minimization
minimize 100000
reinitvels $temperature

run 50000

Please suggest how I can resolve this issue.

Thanks in advance

Abhishek



This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:22:21 CST