AW: AW: CUDA problem?

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Thu Apr 05 2012 - 01:22:38 CDT

I guess the developers will fix this soon. As 2.9b2 is a beta, bugs are expected, and reports are welcome.

 

Norman Geist.

 

From: Albert [mailto:mailmd2011_at_gmail.com]
Sent: Thursday, 5 April 2012 08:16
To: Norman Geist; namd-l_at_ks.uiuc.edu
Subject: Re: AW: namd-l: CUDA problem?

 

Hello,
  thank you very much for your kind message.
  Is there a solution for this problem?

best
A

On 04/05/2012 08:12 AM, Norman Geist wrote:

Hi,

 

there seems to be something wrong with the new GPU-accelerated minimization, as Francesco posted the same issue and I answered him a few seconds ago. I first thought this could be a hardware issue with a single GPU, but two people with a broken GPU is really unlikely. So it's the developers' turn.

 

Best wishes

 

Norman Geist.

 

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf Of Albert
Sent: Wednesday, 4 April 2012 21:03
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: CUDA problem?

 

Dear all,
  I've built a membrane system with CHARMM-GUI and used its equilibration protocol to relax the system. Everything goes well with the default settings, and the run finished in CUDA mode. However, there is a ligand in my system and I would like to restrain it during step 6.1 (see the input file below). Here is what I did to add the restraint for my ligand:

set sel [atomselect top all]
$sel set beta 0
set fix [atomselect top "protein and backbone or (resname LIG and not hydrogen)"]
$fix set beta 1
$sel writepdb bb_rmsd.ref
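
One way to confirm that the flags set above actually ended up in bb_rmsd.ref is to count the atoms with a non-zero beta column. The following is only a minimal Python sketch, assuming the file is a standard fixed-column PDB (temperature factor in columns 61-66). The names count_flagged_atoms and make_pdb_line are hypothetical helpers, not part of VMD or NAMD.

```python
def count_flagged_atoms(pdb_text):
    """Return (total_atoms, flagged_atoms) for ATOM/HETATM records.

    An atom counts as flagged when its beta (temperature factor)
    column is non-zero. Assumes standard fixed-column PDB layout.
    """
    total = 0
    flagged = 0
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            total += 1
            # Columns 61-66 (0-based slice 60:66) hold the beta value.
            beta = float(line[60:66])
            if beta != 0.0:
                flagged += 1
    return total, flagged


def make_pdb_line(serial, name, resname, beta):
    """Build a synthetic ATOM record for demonstration only."""
    return "ATOM  %5d %-4s %3s A%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, resname, 1, 0.0, 0.0, 0.0, 1.0, beta)


if __name__ == "__main__":
    # Synthetic two-atom example: one restrained heavy atom, one free hydrogen.
    demo = "\n".join([
        make_pdb_line(1, "CA", "ALA", 1.0),
        make_pdb_line(2, "HA", "ALA", 0.0),
        "END",
    ])
    print(count_flagged_atoms(demo))
```

For a real check, the text of bb_rmsd.ref would be read from disk and passed to count_flagged_atoms; a flagged count of zero would suggest the selection matched nothing.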

After that I tried to run this step 6.1 input with the command:

charmrun ++local +p4 namd2 +idlepoll step6.1_equilibration.inp > log

A few minutes later, it stopped with the following log:

---------log----------------
LINE MINIMIZER BRACKET: DX 7.96611e-05 0.000159322 DU -84.715 50.7203 DUDX -1.52698e+06 -592619 1.21989e+06
ENERGY: 1776 5819.8403 10258.1721 9471.5998 94.8591 -182114.5405 16169.6595 0.0000 3.2133 0.0000 -140297.1965 0.0000 -140297.1965 -140297.1965 0.0000 3492.2283 3770.7578 593110.5555 3492.2283 3770.7578

LINE MINIMIZER BRACKET: DX 5.18225e-05 0.0001075 DU -15.3777 66.098 DUDX -592619 3098.88 1.21989e+06
ENERGY: 1777 5817.4042 10259.2760 9467.1949 94.8526 -182109.7937 16170.7783 0.0000 3.2124 0.0000 -140297.0753 0.0000 -140297.0753 -140297.0753 0.0000 3495.3068 3772.9724 593110.5555 3495.3068 3772.9724

LINE MINIMIZER BRACKET: DX 5.18225e-06 0.0001075 DU -0.121147 66.098 DUDX -56148.6 3098.88 1.21989e+06
------------- Processor 2 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: cuda_check_remote_progress polled 1000000 times over 101.085352 s on step 1778

FATAL ERROR: cuda_check_remote_progress polled 1000000 times over 101.085352 s on step 1778
Charm++ fatal error:
FATAL ERROR: cuda_check_remote_progress polled 1000000 times over 101.085352 s on step 1778
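
To see where the run died relative to the last completed minimization step, the log can be scanned for the last ENERGY line and the first fatal message. This is a minimal Python sketch that assumes the log format shown above; summarize_log is a hypothetical helper name, not a NAMD tool.

```python
import re

# Pattern for the fatal message as it appears in the log above.
FATAL_RE = re.compile(
    r"FATAL ERROR: cuda_check_remote_progress polled (\d+) times "
    r"over ([\d.]+) s on step (\d+)"
)


def summarize_log(log_text):
    """Return (last_energy_step, fatal_info) from a NAMD log.

    last_energy_step is the step of the last ENERGY: line, or None.
    fatal_info is a dict with polls/seconds/step for the first
    matching fatal message, or None if no such message is found.
    """
    last_energy_step = None
    fatal = None
    for line in log_text.splitlines():
        if line.startswith("ENERGY:"):
            # The second whitespace-separated field is the step number.
            last_energy_step = int(line.split()[1])
        match = FATAL_RE.search(line)
        if match and fatal is None:
            fatal = {
                "polls": int(match.group(1)),
                "seconds": float(match.group(2)),
                "step": int(match.group(3)),
            }
    return last_energy_step, fatal


if __name__ == "__main__":
    sample = "\n".join([
        "ENERGY: 1776 5819.8403 10258.1721",
        "ENERGY: 1777 5817.4042 10259.2760",
        "FATAL ERROR: cuda_check_remote_progress polled 1000000 times "
        "over 101.085352 s on step 1778",
    ])
    print(summarize_log(sample))
```

Applied to the log above, this would report that the last energy output was for step 1777 and that the abort was raised on step 1778, i.e. during the very next minimization step.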

However, if I don't use CUDA mode, everything goes well and the simulation finishes without any error. Could you please give me some advice on this?

----------step 6.1.inp-------------
structure ../step5_assembly.xplor_ext.psf
coordinates ../step5_assembly.pdb

set temp 310;
set outputname step6.1_equilibration;

# read system values written by CHARMM (need to convert uppercases to lowercases)
exec tr "\[:upper:\]" "\[:lower:\]" < ../step5_assembly.str | sed -e "s/ = //g" > step5_assembly.namd.str
source step5_assembly.namd.str

temperature $temp;

outputName step6.1_equilibration_a; # base name for output from this run
# NAMD writes two files at the end, final coord and vel
# in the format of first-dyn.coor and first-dyn.vel
firsttimestep 0; # last step of previous run
restartfreq 500; # 500 steps = every 0.5 ps (with timestep 1.0 fs)
dcdfreq 1000;
dcdUnitCell yes; # the file will contain unit cell info in the style of
# charmm dcd files. if yes, the dcd files will contain
# unit cell information in the style of charmm DCD files.
xstFreq 1000; # XSTFreq: control how often the extended system configuration
# will be appended to the XST file
outputEnergies 125; # 125 steps = every 0.125 ps (with timestep 1.0 fs)
# The number of timesteps between each energy output of NAMD
outputTiming 1000; # The number of timesteps between each timing output shows
# time per step and time to completion

# Force-Field Parameters
paraTypeCharmm on; # We're using charmm type parameter file(s)
# multiple definitions may be used but only one file per definition

exec mkdir -p toppar
exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all22_prot.prm > toppar/par_all22_prot.prm
exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all27_na.prm > toppar/par_all27_na.prm
exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all36_carb.prm > toppar/par_all36_carb.prm
exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all36_lipid.prm > toppar/par_all36_lipid.prm
exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all36_cgenff.prm > toppar/par_all36_cgenff.prm
exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" -e "1,/read para/d" \
-e "278,296d" -e "s/^BOM/!&/g" -e "s/^WRN/!&/g" ../toppar/toppar_water_ions.str > toppar/toppar_water_ions.str
exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" -e "1,/read para/d" \
-e "278,296d" -e "s/^BOM/!&/g" -e "s/^WRN/!&/g" ../toppar/toppar_all36_lipid_cholesterol.str > toppar/toppar_all36_lipid_cholesterol.str

parameters toppar/par_all27_prot_na.prm;
parameters toppar/par_all36_lipid.prm;
parameters toppar/par_all22_prot.prm;
parameters toppar/par_all27_na.prm;
parameters toppar/par_all36_carb.prm;
parameters toppar/par_all36_cgenff.prm;
parameters toppar/par_all35_ethers.prm;
parameters toppar/lig.prm;

parameters toppar/toppar_water_ions.str;
parameters toppar/toppar_all36_lipid_cholesterol.str;

# These are specified by CHARMM
exclude scaled1-4 # non-bonded exclusion policy to use "none,1-2,1-3,1-4,or scaled1-4"
# 1-2: all atoms pairs that are bonded are going to be ignored
# 1-3: 3 consecutively bonded are excluded
# scaled1-4: include all the 1-3, and modified 1-4 interactions
# electrostatic scaled by 1-4scaling factor 1.0
# vdW special 1-4 parameters in charmm parameter file.
1-4scaling 1.0
switching on
vdwForceSwitching yes; # New option for force-based switching of vdW
# if both switching and vdwForceSwitching are on CHARMM force
# switching is used for vdW forces.
seed 1333525265 # Specifies a specific seed

# You have some freedom choosing the cutoff
cutoff 12.0; # may use smaller, maybe 10., with PME
switchdist 10.0; # cutoff - 2.
# switchdist - where you start to switch
# cutoff - where you stop accounting for nonbond interactions.
# correspondence in charmm:
# (cutnb,ctofnb,ctonnb = pairlistdist,cutoff,switchdist)
pairlistdist 16.0; # stores all pairs within this distance; it should be larger
# than cutoff (+ 2.)
stepspercycle 20; # 20 steps per cycle; pairlists are redone every ten steps
pairlistsPerCycle 2; # 2 is the default
# cycle represents the number of steps between atom reassignments
# this means every 20/2=10 steps the pairlist will be updated

# Integrator Parameters
timestep 1.0; # fs/step
rigidBonds all; # Bond constraint: all bonds involving H are fixed in length
nonbondedFreq 1; # nonbonded forces every step
fullElectFrequency 1; # PME every step

# Constant Temperature Control ONLY DURING EQUILB
reassignFreq 500; # reassignFreq: use this to reassign velocity every 500 steps
reassignTemp $temp;

# Periodic Boundary conditions. Need this since for a start...
cellBasisVector1 $a 0.0 0.0; # vector to the next image
cellBasisVector2 0.0 $b 0.0;
cellBasisVector3 0.0 0.0 $c;
cellOrigin 0.0 0.0 $zcen; # the *center* of the cell

wrapWater on; # wrap water to central cell
wrapAll on; # wrap other molecules too
wrapNearest off; # use for non-rectangular cells (wrap to the nearest image)

# PME (for full-system periodic electrostatics)
exec python ../checkfft.py $a $b $c > checkfft.str
source checkfft.str

PME yes;
PMEInterpOrder 6; # interpolation order (spline order 6 in charmm)
PMEGridSizeX $fftx; # should be close to the cell size
PMEGridSizeY $ffty; # corresponds to the charmm input fftx/y/z
PMEGridSizeZ $fftz;

# Pressure and volume control
useGroupPressure yes; # use a hydrogen-group based pseudo-molecular virial to calculate pressure; it
# has less fluctuation and is needed for rigid bonds (rigidBonds/SHAKE)
useFlexibleCell yes; # yes for anisotropic system like membrane
useConstantRatio yes; # keeps the ratio of the unit cell in the x-y plane constant A=B

langevin on
langevinDamping 10
langevinTemp $temp
langevinHydrogen no

# planar restraint
colvars on
exec sed -e "s/Constant \$fc/Constant 5/g" -e "s/\$bb/10.0/g" -e "s/\$sc/5.0/g" membrane_lipid_restraint.namd.col > restraints/$outputname.col
colvarsConfig restraints/$outputname.col

# dihedral restraint
extraBonds yes
exec sed -e "s/\$FC/500/g" restraints/dihe.txt > restraints/$outputname.dihe
extraBondsFile restraints/$outputname.dihe

minimize 10000

numsteps 90000000
run 3000000; # 3 ns
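
The step counts in this input translate into simulated time through the timestep value (1.0 fs here). As a rough cross-check, here is a minimal Python sketch of that arithmetic; steps_to_ps and pairlist_interval are hypothetical helper names, and the numbers are taken from the input above.

```python
def steps_to_ps(steps, timestep_fs):
    """Convert a number of MD steps to picoseconds for a timestep in fs."""
    return steps * timestep_fs / 1000.0


def pairlist_interval(steps_per_cycle, pairlists_per_cycle):
    """Number of steps between pairlist regenerations within a cycle."""
    return steps_per_cycle // pairlists_per_cycle


if __name__ == "__main__":
    timestep_fs = 1.0  # "timestep 1.0" in the input above
    print("run 3000000      ->", steps_to_ps(3000000, timestep_fs), "ps")
    print("restartfreq 500  ->", steps_to_ps(500, timestep_fs), "ps")
    print("outputEnergies 125 ->", steps_to_ps(125, timestep_fs), "ps")
    print("pairlist update every", pairlist_interval(20, 2), "steps")
```

With a 1.0 fs timestep, 3000000 steps correspond to 3000 ps (3 ns), a restart file is written every 0.5 ps, and energies are printed every 0.125 ps.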

 

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:21:24 CST