Re: AW: CUDA problem?

From: Albert (mailmd2011_at_gmail.com)
Date: Thu Apr 05 2012 - 01:15:52 CDT

Hello,
Thank you very much for your kind message.
Is there a solution for this problem?

best
A

On 04/05/2012 08:12 AM, Norman Geist wrote:
>
> Hi,
>
> there seems to be something wrong with the new GPU-accelerated
> minimization, as Francesco posted the same issue and I answered him a
> few seconds ago. I first thought this could be a hardware issue
> with a single GPU, but two people with a broken GPU is really unlikely.
> So it’s the developers’ turn.
>
> Best wishes
>
> Norman Geist.
>
> *From:* owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] *On
> behalf of* Albert
> *Sent:* Wednesday, April 4, 2012 21:03
> *To:* namd-l_at_ks.uiuc.edu
> *Subject:* namd-l: CUDA problem?
>
> Dear all,
> I've built a membrane system with CHARMM-GUI and used its
> equilibration protocol to relax my system. Everything goes well if I
> use the default settings, and the run finishes in CUDA mode. However,
> there is a ligand in my system and I would like to restrain it during
> step 6.1 (see the input file below). Here is what I did to add the
> restraint for my ligand:
>
> set sel [atomselect top all]
> $sel set beta 0
> set fix [atomselect top "protein and backbone or (resname LIG and not hydrogen)"]
> $fix set beta 1
> $sel writepdb bb_rmsd.ref
>
>
>
> After that I tried to run step 6.1 with the command:
>
> charmrun ++local +p4 namd2 +idlepoll step6.1_equilibration.inp > log
>
>
>
>
> A few minutes later, it stopped with the following log:
>
>
> ---------log----------------
> LINE MINIMIZER BRACKET: DX 7.96611e-05 0.000159322 DU -84.715 50.7203
> DUDX -1.52698e+06 -592619 1.21989e+06
> ENERGY: 1776 5819.8403 10258.1721 9471.5998 94.8591 -182114.5405
> 16169.6595 0.0000 3.2133 0.0000 -140297.1965 0.0000 -140297.1965
> -140297.1965 0.0000 3492.2283 3770.7578 593110.5555 3492.2283 3770.7578
>
> LINE MINIMIZER BRACKET: DX 5.18225e-05 0.0001075 DU -15.3777 66.098
> DUDX -592619 3098.88 1.21989e+06
> ENERGY: 1777 5817.4042 10259.2760 9467.1949 94.8526 -182109.7937
> 16170.7783 0.0000 3.2124 0.0000 -140297.0753 0.0000 -140297.0753
> -140297.0753 0.0000 3495.3068 3772.9724 593110.5555 3495.3068 3772.9724
>
> LINE MINIMIZER BRACKET: DX 5.18225e-06 0.0001075 DU -0.121147 66.098
> DUDX -56148.6 3098.88 1.21989e+06
> ------------- Processor 2 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: cuda_check_remote_progress polled 1000000 times
> over 101.085352 s on step 1778
>
> FATAL ERROR: cuda_check_remote_progress polled 1000000 times over
> 101.085352 s on step 1778
> Charm++ fatal error:
> FATAL ERROR: cuda_check_remote_progress polled 1000000 times over
> 101.085352 s on step 1778
>
>
> However, if I don't use CUDA mode, everything goes well and the
> simulation finishes without any error. Could you please give
> me some advice on this?
>
>
> ----------step 6.1.inp-------------
> structure ../step5_assembly.xplor_ext.psf
> coordinates ../step5_assembly.pdb
>
> set temp 310;
> set outputname step6.1_equilibration;
>
> # read system values written by CHARMM (need to convert uppercase to lowercase)
> exec tr "\[:upper:\]" "\[:lower:\]" < ../step5_assembly.str | sed -e "s/ = //g" > step5_assembly.namd.str
> source step5_assembly.namd.str
>
> temperature $temp;
>
> outputName step6.1_equilibration_a; # base name for output from this run
> # NAMD writes two files at the end, final coord and vel
> # in the format of first-dyn.coor and first-dyn.vel
> firsttimestep 0; # last step of previous run
> restartfreq 500; # 500 steps = every 0.5 ps at the 1 fs timestep set below
> dcdfreq 1000;
> dcdUnitCell yes; # if yes, the DCD files will contain unit cell
> # information in the style of CHARMM DCD files.
> xstFreq 1000; # XSTFreq: controls how often the extended system configuration
> # will be appended to the XST file
> outputEnergies 125; # 125 steps = every 0.125 ps at the 1 fs timestep set below
> # The number of timesteps between each energy output of NAMD
> outputTiming 1000; # The number of timesteps between each timing output; shows
> # time per step and time to completion
>
> # Force-Field Parameters
> paraTypeCharmm on; # We're using charmm type parameter file(s)
> # multiple definitions may be used but only one file per definition
>
> exec mkdir -p toppar
> exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all22_prot.prm > toppar/par_all22_prot.prm
> exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all27_na.prm > toppar/par_all27_na.prm
> exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all36_carb.prm > toppar/par_all36_carb.prm
> exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all36_lipid.prm > toppar/par_all36_lipid.prm
> exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" ../toppar/par_all36_cgenff.prm > toppar/par_all36_cgenff.prm
> exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" -e "1,/read para/d" \
> -e "278,296d" -e "s/^BOM/!&/g" -e "s/^WRN/!&/g" ../toppar/toppar_water_ions.str > toppar/toppar_water_ions.str
> exec sed -e "s/^ATOM/!&/g" -e "s/^MASS/!&/g" -e "1,/read para/d" \
> -e "278,296d" -e "s/^BOM/!&/g" -e "s/^WRN/!&/g" ../toppar/toppar_all36_lipid_cholesterol.str > toppar/toppar_all36_lipid_cholesterol.str
>
> parameters toppar/par_all27_prot_na.prm;
> parameters toppar/par_all36_lipid.prm;
> parameters toppar/par_all22_prot.prm;
> parameters toppar/par_all27_na.prm;
> parameters toppar/par_all36_carb.prm;
> parameters toppar/par_all36_cgenff.prm;
> parameters toppar/par_all35_ethers.prm;
> parameters toppar/lig.prm;
>
>
> parameters toppar/toppar_water_ions.str;
> parameters toppar/toppar_all36_lipid_cholesterol.str;
>
> # These are specified by CHARMM
> exclude scaled1-4 # non-bonded exclusion policy to use: "none, 1-2, 1-3, 1-4, or scaled1-4"
> # 1-2: all atoms pairs that are bonded are going to be ignored
> # 1-3: 3 consecutively bonded are excluded
> # scaled1-4: include all the 1-3, and modified 1-4 interactions
> # electrostatic scaled by 1-4scaling factor 1.0
> # vdW special 1-4 parameters in charmm parameter file.
> 1-4scaling 1.0
> switching on
> vdwForceSwitching yes; # New option for force-based switching of vdW
> # if both switching and vdwForceSwitching are on CHARMM force
> # switching is used for vdW forces.
> seed 1333525265 # Specifies a specific seed
>
> # You have some freedom choosing the cutoff
> cutoff 12.0; # may use smaller, maybe 10., with PME
> switchdist 10.0; # cutoff - 2.
> # switchdist - where you start to switch
> # cutoff - where you stop accounting for nonbond interactions.
> # correspondence in charmm:
> # (cutnb,ctofnb,ctonnb = pairlistdist,cutoff,switchdist)
> pairlistdist 16.0; # stores all the pairs within this distance; it should be larger
> # than cutoff (+ 2.)
> stepspercycle 20; # 20 redo pairlists every ten steps
> pairlistsPerCycle 2; # 2 is the default
> # cycle represents the number of steps between atom reassignments
> # this means every 20/2=10 steps the pairlist will be updated
>
> # Integrator Parameters
> timestep 1.0; # fs/step
> rigidBonds all; # Bond constraint: all bonds involving H are fixed in length
> nonbondedFreq 1; # nonbonded forces every step
> fullElectFrequency 1; # PME every step
>
>
> # Constant Temperature Control ONLY DURING EQUILB
> reassignFreq 500; # reassignFreq: use this to reassign velocities every 500 steps
> reassignTemp $temp;
>
> # Periodic Boundary conditions. Need this since for a start...
> cellBasisVector1 $a 0.0 0.0; # vector to the next image
> cellBasisVector2 0.0 $b 0.0;
> cellBasisVector3 0.0 0.0 $c;
> cellOrigin 0.0 0.0 $zcen; # the *center* of the cell
>
> wrapWater on; # wrap water to central cell
> wrapAll on; # wrap other molecules too
> wrapNearest off; # use for non-rectangular cells (wrap to the nearest image)
>
> # PME (for full-system periodic electrostatics)
> exec python ../checkfft.py $a $b $c > checkfft.str
> source checkfft.str
>
> PME yes;
> PMEInterpOrder 6; # interpolation order (spline order 6 in charmm)
> PMEGridSizeX $fftx; # should be close to the cell size
> PMEGridSizeY $ffty; # corresponds to the charmm input fftx/y/z
> PMEGridSizeZ $fftz;
>
> # Pressure and volume control
> useGroupPressure yes; # use a hydrogen-group based pseudo-molecular virial to calculate pressure;
> # it has less fluctuation and is needed for rigid bonds (rigidBonds/SHAKE)
> useFlexibleCell yes; # yes for anisotropic system like membrane
> useConstantRatio yes; # keeps the ratio of the unit cell in the x-y plane constant (A=B)
>
> langevin on
> langevinDamping 10
> langevinTemp $temp
> langevinHydrogen no
>
> # planar restraint
> colvars on
> exec sed -e "s/Constant \$fc/Constant 5/g" -e "s/\$bb/10.0/g" -e "s/\$sc/5.0/g" membrane_lipid_restraint.namd.col > restraints/$outputname.col
> colvarsConfig restraints/$outputname.col
>
> # dihedral restraint
> extraBonds yes
> exec sed -e "s/\$FC/500/g" restraints/dihe.txt > restraints/$outputname.dihe
> extraBondsFile restraints/$outputname.dihe
>
> minimize 10000
>
> numsteps 90000000
> run 3000000; # 3 ns
>
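
For anyone checking the frequency settings in the quoted input, the step counts can be converted to simulated time with a few lines of Python. This is only a minimal sketch: the values are copied from the posted file (timestep 1.0 fs, stepspercycle 20, pairlistsPerCycle 2), and the helper name steps_to_ps is hypothetical, not part of NAMD or CHARMM-GUI.

```python
# Sketch: convert step counts from the posted step6.1 input into simulated time.
# All numbers are copied from the quoted file; steps_to_ps is a hypothetical helper.

TIMESTEP_FS = 1.0  # "timestep 1.0" in the posted input


def steps_to_ps(steps, timestep_fs=TIMESTEP_FS):
    """Return the simulated time in picoseconds covered by `steps` MD steps."""
    return steps * timestep_fs / 1000.0


settings = {
    "restartfreq": 500,
    "dcdfreq": 1000,
    "outputEnergies": 125,
    "run": 3000000,
}

for name, steps in settings.items():
    print(f"{name:15s} {steps:>8d} steps = {steps_to_ps(steps):g} ps")

# Pairlist refresh interval = stepspercycle / pairlistsPerCycle
pairlist_interval = 20 // 2
print(f"pairlists rebuilt every {pairlist_interval} steps")
```

With a 1 fs timestep, 500 steps cover 0.5 ps and the 3000000-step run covers 3 ns; changing TIMESTEP_FS to 2.0 doubles every interval.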
