Re: Slow vacuum/GBIS simulations (time to completion keeps increasing)

From: Ole Juul Andersen (oja_at_chem.au.dk)
Date: Wed Jul 03 2013 - 07:28:20 CDT

Axel, yes I have. Ubiquitin is rather globular; however, it does have a flexible terminus that can move around, and it does in fact stick to the protein in the last part of the simulation. Still, compared to the size of the rest of the system, the increased number of contacts does not seem to justify a factor-of-10 increase in simulation time, but I might be mistaken.

Aron, running the 9k system on a couple of nodes, I am actually getting 4.3 ns/day. I get this number by ignoring the timing statistics in the log file and simply looking at how many hours it has taken to reach the current number of steps. This does not agree with the timing statistics, so it seems that NAMD simply has a hard time estimating the time to completion for smaller systems. That, combined with my naive assumption that an implicit solvation model should deliver well over 4 ns/day for a 9k-atom system, seems to have gotten me confused =). Lesson learned!
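For reference, the arithmetic behind that number is trivial; a minimal sketch (the step count and wall-clock hours below are placeholders, not the actual values from our queue):

# Estimate ns/day from the step count reached and the wall-clock time spent,
# ignoring NAMD's own timing estimate. Placeholder numbers -- substitute your own.
steps_completed = 5_000_000   # steps reached so far (placeholder)
timestep_fs = 2.0             # fs per step (placeholder; use your actual timestep)
elapsed_hours = 48.0          # wall-clock hours spent so far (placeholder)

simulated_ns = steps_completed * timestep_fs * 1e-6   # fs -> ns
ns_per_day = simulated_ns / (elapsed_hours / 24.0)
print(f"{simulated_ns:.1f} ns in {elapsed_hours:.0f} h  ->  {ns_per_day:.2f} ns/day")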

I thank you for your time…

Best regards,
Ole

Ph.D. student Ole J. Andersen
iNANO, University of Aarhus
Department of Chemistry, Aarhus University
Langelandsgade 140,
Building 1510-421
8000 Aarhus C
Denmark
Tel.: +45 87 15 53 16 / +45 26 39 61 50
Mail: oja_at_chem.au.dk

On Jul 3, 2013, at 1:19 PM, Axel Kohlmeyer wrote:

On Wed, Jul 3, 2013 at 12:06 PM, Ole Juul Andersen <oja_at_chem.au.dk> wrote:
The output frequency does not seem to be the problem. After running
2,600,000 steps (~10 hours), the largest file (the dcd file) is only 7.4 MB,
and the estimated time to completion has gone from 300 hours at the beginning of
the simulation to 3,000 hours now. There is plenty of memory left on
the node, so memory doesn't look like the bottleneck either.

Have you looked at how the system has changed?

If you started with a very spread-out structure and it collapses, there
will be many more entries in the neighbor lists. For a system
with solvent this makes no difference, but for a system without, it
can result in a big increase in the number of interactions to be
computed. Remember that the real-space interactions scale as O(N^2) with
the number of particles inside the cutoff.

axel.

Nevertheless, thank you for your suggestion!

/Ole

Ph.D. student Ole J. Andersen
iNANO, University of Aarhus
Department of Chemistry, Aarhus University
Langelandsgade 140,
Building 1510-421
8000 Aarhus C
Denmark
Tel.: +45 87 15 53 16 / +45 26 39 61 50
Mail: oja_at_chem.au.dk
________________________________
From: Aron Broom [broomsday_at_gmail.com]
Sent: Tuesday, July 02, 2013 8:10 PM

To: Ole Juul Andersen
Cc: namd-l_at_ks.uiuc.edu
Subject: Re: namd-l: Slow vacuum/GBIS simulations (time to completion keeps
increasing)

I guess the best quick check is to see how large the files from your failed
runs ended up being before you killed them. I take your point that if the
slowdown happens quickly, it seems like it would be something else.
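Something like this would do for the check (a sketch; pass it the directory your job wrote its output to):

# Print the size of every file a run left behind.
import sys
from pathlib import Path

for f in sorted(Path(sys.argv[1]).iterdir()):   # run directory given on the command line
    if f.is_file():
        print(f"{f.name:40s} {f.stat().st_size / 1e6:8.1f} MB")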

On Tue, Jul 2, 2013 at 1:57 PM, Ole Juul Andersen <oja_at_chem.au.dk> wrote:

Whoops, my bad. I usually use 1 fs time steps and forgot that the tutorial
files use 2 fs. You have a valid point; however, I don't think this is the
actual cause of the problem. The simulations we performed on our own system
had more reasonable output frequencies, and the increase in time to
completion is observed very soon after the simulations start. I have just
submitted a job using 5000 as the output frequency to see if it helps =).

/Ole

Ph.D. student Ole J. Andersen
iNANO, University of Aarhus
Department of Chemistry, Aarhus University
Langelandsgade 140,
Building 1510-421
8000 Aarhus C
Denmark
Tel.: +45 87 15 53 16 / +45 26 39 61 50
Mail: oja_at_chem.au.dk
________________________________
From: Aron Broom [broomsday_at_gmail.com]
Sent: Tuesday, July 02, 2013 7:45 PM
To: Ole Juul Andersen
Cc: namd-l_at_ks.uiuc.edu
Subject: Re: namd-l: Slow vacuum/GBIS simulations (time to completion
keeps increasing)

Just to clarify, you realize the config file you posted is asking for 1000
ns of simulation time? That is really quite long. More importantly, your
output intervals are very short, so by the end you would have written about
5 million energy lines and 2 million DCD frames! That is borderline
insane. But it might not just be a question of sanity: at that magnitude,
your slowdown may be because the files NAMD is writing to are becoming so
large that the I/O is becoming the limiting factor.
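The arithmetic, using the numbers straight from the conf file you posted (run 500000000, timestep 2.0, outputEnergies 100, dcdfreq 250):

# Output volume implied by the posted conf file.
run_steps = 500_000_000        # run 500000000
timestep_fs = 2.0              # timestep 2.0
energy_freq = 100              # outputEnergies 100
dcd_freq = 250                 # dcdfreq 250

print(f"simulated time: {run_steps * timestep_fs * 1e-6:,.0f} ns")
print(f"energy outputs: {run_steps // energy_freq:,}")
print(f"DCD frames:     {run_steps // dcd_freq:,}")
print(f"DCD frames at dcdfreq 5000: {run_steps // 5000:,}")

Raising the output intervals to ~5000 steps cuts the output volume by a factor of 20 to 50.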

I would set all the output frequencies to ~5000 (10 ps) and try it.

~~Aron

On Tue, Jul 2, 2013 at 1:33 PM, Ole Juul Andersen <oja_at_chem.au.dk> wrote:

Dear all,

We would like to run implicit solvent simulations of a protein complex
(9135 atoms) using the GBIS model implemented in NAMD. However, we
experience that the simulations are VERY slow. We have therefore been
searching for errors in the configuration file and in the submit script,
using the mailing list archive, the NAMD tutorial, and the NAMD user guide
as resources. As none of these sources gave a helpful answer (we might have
missed it?), we turned to the example files provided with the implicit
solvent tutorial. The pdb file of ubiquitin contains 1231 atoms, and if we
start a production run of 500 ns on a node with 12 cores (using the tutorial
conf file), the simulation is estimated to take 1,000 hours. If we simply
switch GBIS from 'on' to 'off' in the conf file, the time drops to ~200
hours. For both of these simulations, we see the estimated time to completion
rise (the GBIS 'on' simulation currently has 3,500 hours remaining). This is
exactly the same problem that we experienced while running GBIS simulations on
our own system. At one point, the log file stated that the job had over
10,000 hours remaining for less than 500 ns of simulation time while running
on 4 nodes!
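For reference, the rising estimate can be followed by pulling the 'hours remaining' figure out of the periodic TIMING: lines in the log. A rough sketch, assuming the usual "TIMING: <step> ... <x> hours remaining ..." format:

# Scan a NAMD log and print how the time-to-completion estimate evolves.
import re
import sys

pattern = re.compile(r"^TIMING:\s+(\d+).*?([0-9.eE+-]+)\s+hours remaining")

with open(sys.argv[1]) as log:          # pass the .log file as an argument
    for line in log:
        m = pattern.match(line)
        if m:
            step, hours_left = int(m.group(1)), float(m.group(2))
            print(f"step {step:>12,}  ~{hours_left:,.1f} h remaining")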

The jobs are run using NAMD 2.8; however, the problem also occurs when
they are run using NAMD 2.9 on GPUs. The structures seem rather stable
throughout the simulations, and I therefore do not believe that the increase
in time to completion arises from an increase in the number of neighbouring atoms
(supported by the fact that the GBIS 'off' simulations also show an increase
in simulation time). We don't experience problems when using explicit
solvation and PME, as the time to completion decreases at a steady rate for
these simulations. Have any of you experienced similar problems, and if so,
did you manage to find the reason, or even better, the solution? The submit
script and configuration file are presented below.

Thank you.

All the best,
Ole

------------------------------------- SUBMIT SCRIPT -------------------------------------
#!/bin/sh
# Request resources - ppn is the number of processors requested per node
#PBS -q q12
#PBS -l nodes=1:ppn=12
#PBS -l walltime=24:00:00
#PBS -m abe
#PBS -N ubq_test
# Find additional options at
# http://www.clusterresources.com/wiki/doku.php?id=torque:2.1_job_submission
# and try "man qsub"

#Directory where input files are to be found
export JOB=/home/oja/GPU/namd-tutorial-files/1-4-gbis/grendel

#SCR is where the job is run (could be redundant)
export SCR=/scratch/$PBS_JOBID

#The program directory
export PROG=/com/namd/NAMD_2.8_Linux-x86_64-OpenMPI

#Copy all(!) the input files to the scratch directory
export name=ubq
export confname=ubq_gbis_eq

cp -p $JOB/par_all27_prot_lipid.inp $SCR/
cp -p $JOB/$confname.conf $SCR/
cp -p $JOB/$name.psf $SCR/
cp -p $JOB/$name.pdb $SCR/

#Enter the working directory
cd $SCR

######################
# RUN THE SIMULATION #
######################
source /com/OpenMPI/1.4.5/intel/bin/openmpi.sh

mpirun -bynode --mca btl self,openib $PROG/namd2 +setcpuaffinity \
    $confname.conf > "$confname"_1.log
rsync -rlptDz $SCR/* $JOB/

#That's it

----------------------------------------------------------------------------------------------

---------------------------------------- CONF FILE ----------------------------------------

#############################################################
## JOB DESCRIPTION ##
#############################################################

# Minimization and Equilibration of
# Ubiquitin in generalized Born implicit solvent

#############################################################
## ADJUSTABLE PARAMETERS ##
#############################################################

structure ubq.psf
coordinates ubq.pdb

set temperature 310
set outputname ubq_gbis_eq

firsttimestep 0

#############################################################
## SIMULATION PARAMETERS ##
#############################################################

# Input
paraTypeCharmm on
parameters par_all27_prot_lipid.inp
temperature $temperature

# Implicit Solvent
gbis on
alphaCutoff 12.0
ionConcentration 0.3

# Force-Field Parameters
exclude scaled1-4
1-4scaling 1.0
cutoff 14.0
switching on
switchdist 13.0
pairlistdist 16.0

# Integrator Parameters
timestep 2.0 ;# 2fs/step
rigidBonds all ;# needed for 2fs steps
nonbondedFreq 1
fullElectFrequency 2
stepspercycle 10

# Constant Temperature Control
langevin on ;# do langevin dynamics
langevinDamping 1 ;# damping coefficient (gamma) of 1/ps
langevinTemp $temperature
langevinHydrogen off ;# don't couple langevin bath to hydrogens

# Output
outputName $outputname

restartfreq 500 ;# 500 steps = every 1 ps
dcdfreq 250
xstFreq 250
outputEnergies 100
outputPressure 100

#############################################################
## EXTRA PARAMETERS ##
#############################################################

#############################################################
## EXECUTION SCRIPT ##
#############################################################

# Minimization
minimize 100
reinitvels $temperature

run 500000000 ;# 500,000,000 steps x 2 fs/step = 1000 ns

----------------------------------------------------------------------------------------------

Ph.D. student Ole J. Andersen
iNANO, University of Aarhus
Department of Chemistry, Aarhus University
Langelandsgade 140,
Building 1510-421
8000 Aarhus C
Denmark
Tel.: +45 87 15 53 16 / +45 26 39 61 50
Mail: oja_at_chem.au.dk<mailto:oja_at_chem.au.dk>

--
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
--
Dr. Axel Kohlmeyer  akohlmey_at_gmail.com  http://goo.gl/1wk0
International Centre for Theoretical Physics, Trieste. Italy.
