Re: Slow vacuum/GBIS simulations (time to completion keeps increasing)

From: Aron Broom (broomsday_at_gmail.com)
Date: Tue Jul 02 2013 - 13:10:49 CDT

I guess the best quick check is to see how large the output files from your
failed runs ended up being before you killed them. I take your point, though:
if the slowdown happens that quickly, it does seem like something else is
going on.
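
For example, something like this from the run's output directory (the file
names here are taken from the tutorial conf file and submit script quoted
below; substitute the names from your own run):

# size of the log, trajectory, and restart files the run produced
ls -lh ubq_gbis_eq_1.log ubq_gbis_eq.dcd ubq_gbis_eq.restart.*
# total size of everything the job wrote
du -sh .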

On Tue, Jul 2, 2013 at 1:57 PM, Ole Juul Andersen <oja_at_chem.au.dk> wrote:

> Whoops, my bad. I usually use 1 fs time steps and forgot that the tutorial
> files use 2 fs. You have a valid point; however, I don't think this is the
> actual cause of the problem. The simulations we performed on our own system
> had more reasonable output frequencies, and the increase in time to
> completion is observed very soon after the simulations start. I have just
> submitted a job using 5000 as the output frequency to see if it helps =).
>
> /Ole
>
>
> Ph.D. student Ole J. Andersen
> iNANO, University of Aarhus
> Department of Chemistry, Aarhus University
> Langelandsgade 140,
> Building 1510-421
> 8000 Aarhus C
> Denmark
> Tel.: +45 87 15 53 16 / +45 26 39 61 50
> Mail: oja_at_chem.au.dk
> ------------------------------
> *From:* Aron Broom [broomsday_at_gmail.com]
> *Sent:* Tuesday, July 02, 2013 7:45 PM
> *To:* Ole Juul Andersen
> *Cc:* namd-l_at_ks.uiuc.edu
> *Subject:* Re: namd-l: Slow vacuum/GBIS simulations (time to completion
> keeps increasing)
>
> Just to clarify, you realize the config file you posted is asking for
> 1000 ns of simulation time? That is really quite long. More importantly,
> your output frequencies are set very low (i.e. output is written very
> often), so by the end you would have written about 5 million energy lines
> and 2 million DCD frames! That is borderline insane. And it might not just
> be a question of sanity: at that scale, your slowdown may be because the
> files NAMD is writing to are becoming SO large that I/O becomes the
> limiting factor.
>
> I would set all the output frequencies to ~5000 (10 ps) and try it.
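>
> For example, something along these lines in the conf file (an illustrative
> snippet only; 5000 steps is 10 ps at your 2 fs timestep):
>
> restartfreq      5000    ;# 5000 steps * 2 fs = every 10 ps
> dcdfreq          5000
> xstFreq          5000
> outputEnergies   5000
> outputPressure   5000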
>
> ~Aron
>
>
> On Tue, Jul 2, 2013 at 1:33 PM, Ole Juul Andersen <oja_at_chem.au.dk> wrote:
>
>> Dear all,
>>
>> We would like to run implicit solvent simulations of a protein complex
>> (9135 atoms) using the GBIS model implemented in NAMD. However, we find
>> that the simulations are VERY slow. We have therefore been
>> searching for errors in the configuration file and in the submit script,
>> using the mailing list archive, the NAMD tutorial, and the NAMD user guide
>> as resources. As none of these sources gave a helpful answer (we might have
>> missed it?), we turned to the example files provided with the implicit
>> solvent tutorial. The pdb file of ubiquitin contains 1231 atoms, and if we
>> start a production run of 500 ns on a node with 12 cores (using the
>> tutorial conf file), the simulation is estimated to take 1,000 hours. If we
>> simply switch GBIS from 'on' to 'off' in the conf file, the time drops to
>> ~200 hours. For both of these simulations, we see the estimated time to
>> completion rise (the GBIS 'on' simulation currently has 3,500 hours
>> remaining). This is exactly the same problem we experienced while running
>> GBIS simulations on our own system. At one point, the log file stated that
>> the job had over 10,000 hours remaining for less than 500 ns of simulation
>> time, running on 4 nodes!
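>>
>> (One way to see the trend directly is to pull the wall-clock time per step
>> out of NAMD's TIMING lines in the log. A rough sketch, assuming the log
>> name produced by the submit script below; the awk field positions may
>> differ slightly between NAMD versions:
>>
>> grep '^TIMING:' ubq_gbis_eq_1.log | awk '{print $2, $8}'  # step, wall time/step
>>
>> If the seconds per step themselves keep growing, the slowdown is real and
>> not just an artifact of the remaining-time estimate.)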
>>
>> The jobs are run using NAMD 2.8; however, the problem also occurs when
>> they are run using NAMD 2.9 on GPUs. The structures seem rather stable
>> throughout the simulations, and I therefore do not believe that the
>> increase in time to completion arises from an increase in the number of
>> neighbouring atoms (supported by the fact that the GBIS 'off' simulations
>> also show an increase in simulation time). We don't experience problems
>> when using explicit solvation and PME, as the time to completion decreases
>> at a steady rate for these simulations. Have any of you experienced
>> similar problems, and if so, did you manage to find the reason, or even
>> better, the solution? The submit script and configuration file are
>> presented below.
>>
>> Thank you.
>>
>> All the best,
>> Ole
>>
>> ------------------------------------- SUBMIT SCRIPT
>> -------------------------------------
>> #!/bin/sh
>> # Request resources - ppn is the number of processors requested per node
>> #PBS -q q12
>> #PBS -l nodes=1:ppn=12
>> #PBS -l walltime=24:00:00
>> #PBS -m abe
>> #PBS -N ubq_test
>> # Find additional options on
>> #
>> http://www.clusterresources.com/wiki/doku.php?id=torque:2.1_job_submission
>> # and try "man qsub"
>>
>> #Directory where input files are to be found
>> export JOB=/home/oja/GPU/namd-tutorial-files/1-4-gbis/grendel
>>
>> #SCR is where the job is run (could be redundant)
>> export SCR=/scratch/$PBS_JOBID
>>
>> #The program directory
>> export PROG=/com/namd/NAMD_2.8_Linux-x86_64-OpenMPI
>>
>> #Copy all(!) the input files to the scratch directory
>> export name=ubq
>> export confname=ubq_gbis_eq
>>
>> cp -p $JOB/par_all27_prot_lipid.inp $SCR/
>> cp -p $JOB/$confname.conf $SCR/
>> cp -p $JOB/$name.psf $SCR/
>> cp -p $JOB/$name.pdb $SCR/
>>
>> #Enter the working directory
>> cd $SCR
>>
>> ######################
>> # RUN THE SIMULATION #
>> ######################
>> source /com/OpenMPI/1.4.5/intel/bin/openmpi.sh
>>
>> mpirun -bynode --mca btl self,openib $PROG/namd2 +setcpuaffinity $confname.conf > "$confname"_1.log
>> rsync -rlptDz $SCR/* $JOB/
>>
>> #That's it
>>
>>
>> ----------------------------------------------------------------------------------------------
>>
>> ---------------------------------------- CONF FILE
>> ----------------------------------------
>>
>>
>> #############################################################
>> ## JOB DESCRIPTION ##
>> #############################################################
>>
>> # Minimization and Equilibration of
>> # Ubiquitin in generalized Born implicit solvent
>>
>>
>> #############################################################
>> ## ADJUSTABLE PARAMETERS ##
>> #############################################################
>>
>> structure ubq.psf
>> coordinates ubq.pdb
>>
>> set temperature 310
>> set outputname ubq_gbis_eq
>>
>> firsttimestep 0
>>
>>
>> #############################################################
>> ## SIMULATION PARAMETERS ##
>> #############################################################
>>
>> # Input
>> paraTypeCharmm on
>> parameters par_all27_prot_lipid.inp
>> temperature $temperature
>>
>> # Implicit Solvent
>> gbis on
>> alphaCutoff 12.0
>> ionConcentration 0.3
>>
>> # Force-Field Parameters
>> exclude scaled1-4
>> 1-4scaling 1.0
>> cutoff 14.0
>> switching on
>> switchdist 13.0
>> pairlistdist 16.0
>>
>>
>> # Integrator Parameters
>> timestep 2.0 ;# 2fs/step
>> rigidBonds all ;# needed for 2fs steps
>> nonbondedFreq 1
>> fullElectFrequency 2
>> stepspercycle 10
>>
>>
>> # Constant Temperature Control
>> langevin on ;# do langevin dynamics
>> langevinDamping 1 ;# damping coefficient (gamma) of 1/ps
>> langevinTemp $temperature
>> langevinHydrogen off ;# don't couple langevin bath to hydrogens
>>
>> # Output
>> outputName $outputname
>>
>> restartfreq 500 ;# 500 steps = every 1 ps
>> dcdfreq 250
>> xstFreq 250
>> outputEnergies 100
>> outputPressure 100
>>
>>
>> #############################################################
>> ## EXTRA PARAMETERS ##
>> #############################################################
>>
>>
>> #############################################################
>> ## EXECUTION SCRIPT ##
>> #############################################################
>>
>> # Minimization
>> minimize 100
>> reinitvels $temperature
>>
>> run 500000000 ;# 500,000,000 steps * 2 fs = 1,000 ns
>>
>>
>> ----------------------------------------------------------------------------------------------
>>
>> Ph.D. student Ole J. Andersen
>> iNANO, University of Aarhus
>> Department of Chemistry, Aarhus University
>> Langelandsgade 140,
>> Building 1510-421
>> 8000 Aarhus C
>> Denmark
>> Tel.: +45 87 15 53 16 / +45 26 39 61 50
>> Mail: oja_at_chem.au.dk
>>
>
>
>
> --
> Aron Broom M.Sc
> PhD Student
> Department of Chemistry
> University of Waterloo
>

-- 
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
