Re: colvars and parallel-tempering with namd2.10

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Sat Aug 17 2013 - 11:51:08 CDT

> In my opinion you could give the tclBC option a try.

My hope rests on this suggestion by Giacomo Fiorin. Unless I made mistakes,
ordinary rmsd colvars turn out to slow down the simulation beyond feasibility
with my system of 500,000 atoms (protein, peptide ligands, POPC bilayer, TIP3
water).

I set up rmsd colvars for about 1300 CA atoms in six different segments
(protein and ligands mostly well defined by X-ray diffraction and well behaved
in a namd2.9 equilibration with the CHARMM force field), leaving out only the
poorly defined parts.

Let me show how I defined the six colvars, so that you can spot errors, if
any, that namd did not complain about:

xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
colvar {
  name colvar2              # needed to identify the variable

  lowerBoundary -2.0
  upperBoundary 2.0

  lowerWall -1.5            # a little higher than lowerBoundary
  upperWall 1.5             # a little lower than upperBoundary; the wall
                            # force kicks in when the rmsd exceeds this value

  lowerWallConstant 100.0   # force constant in kcal/mol
  upperWallConstant 100.0

  outputSystemForce yes     # also report the system force on this colvar
                            # in addition to its current value

  rmsd {
    atoms {
      # all the C-alphas within residues 42 to 449 of segment "P2"
      psfSegID P2
      atomNameResidueRange CA 42-449
    }
    refPositionsFile npt-02.pdb
  }
}

followed by:

harmonic {
  name pippo
  colvars { colvar1 colvar2 colvar3 colvar4 colvar5 colvar6 }
  centers 0.0 0.0 0.0 0.0 0.0 0.0   # one center per colvar
  forceConstant 10.0
}

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
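
For completeness, the above goes into a separate colvars configuration file
that the NAMD config loads in the usual way (the file name is of course just
a placeholder for mine):

colvars on
colvarsConfig colvars-rmsd.in   ;# file with the colvar and harmonic blocks above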

Ordinary MD was run on 64 nodes/1024 processors, with namd benchmark lines as
follows (the 4096 reported are processes, not CPUs):

Without colvars: 4096 CPUs 0.0286588 s/step 0.331699 days/ns

With colvars: 4096 CPUs 0.14067 s/step 1.62813 days/ns

In other words, 10 min of wallclock time allow fewer than 2000 steps with:

outputEnergies 500   # multiple of fullElectFrequency or vice versa
restartfreq 100
DCDfreq 500
ts 1 fs

Clearly unattractive results in view of the parallel tempering I planned at
the start of this thread.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Mistakes or not in all of the above (particularly in the tentative values of
the boundaries and force constants), I would be much indebted to anyone
willing to share a tclBC script, or a skeleton of one, for the above case.
Of course I will try to build one myself (finally deciding to learn some
serious Tcl scripting), but an efficient one would help.
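
In the meantime, here is the kind of skeleton I have in mind, just to make
the request concrete (untested, so please correct me). It assumes a file
ca_refs.dat, to be written out beforehand (e.g. with VMD) from npt-02.pdb,
with one line per restrained CA atom: the atom ID followed by its reference
x, y, z coordinates. The file name, the force constant and the choice of
wrapmode are my own guesses, not anything prescribed by namd:

xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# restrain_ca.tcl -- sketch of a tclBC positional restraint on listed CA atoms
# loaded from the NAMD configuration with:
#   tclBC on
#   tclBCScript { source restrain_ca.tcl }

set k 10.0   ;# kcal/mol/A^2, same tentative value as the harmonic forceConstant

# read "atomID x y z" lines into an array keyed by atom ID
set fd [open ca_refs.dat r]
while {[gets $fd line] >= 0} {
    set ref([lindex $line 0]) [lrange $line 1 3]
}
close $fd

proc calcforces {step unique} {
    global k ref
    wrapmode input   ;# keep coordinates in the same frame as the reference (my guess)
    while {[nextatom]} {
        set id [getid]   ;# assumed to match the atom numbering used in ca_refs.dat
        if {![info exists ref($id)]} {
            dropatom     ;# not restrained: skip this atom in all later calls
            continue
        }
        set r  [getcoord]
        set r0 $ref($id)
        # harmonic pull toward the reference position: F = -k (r - r0)
        set fx [expr {-$k * ([lindex $r 0] - [lindex $r0 0])}]
        set fy [expr {-$k * ([lindex $r 1] - [lindex $r0 1])}]
        set fz [expr {-$k * ([lindex $r 2] - [lindex $r0 2])}]
        addforce [list $fx $fy $fz]
    }
}
xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

If this is roughly the right shape, each node would only ever touch the atoms
it owns, which, as far as I understand Giacomo's remark, is the whole point
of tclBC.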

Thanks to all
francesco pietra

On Tue, Aug 13, 2013 at 7:07 PM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:

> On Tue, Aug 13, 2013 at 7:00 PM, Giacomo Fiorin
> <giacomo.fiorin_at_gmail.com> wrote:
> > Hello Sunhwan, that's indeed what I was thinking. In your case you've
> > cleverly used a "collective" variable such as RMSD because the most
> > natural option (applying individual restraints to very many dihedrals
> > throughout the system) was not immediately available to you.
> >
> > For collective variables, communication to the masternode (at least of
> > their values) is necessary, because the colvars represent "important"
> > thermodynamic quantities that summarize the state of the entire system:
> > so the potential applied to all of them must be evaluated together.
> >
> > In your case, the 5,000 variables do not need to be communicated to the
> > masternode, because you have a simple potential function that, when
> > applied to one variable, doesn't need to know the value of the others.
> > So in a sense, your variables are not "collective" at all.
> >
> > In my opinion you could give the tclBC option a try (since you're new to
> > NAMD, you might have overlooked it). With that you can apply a script
> > that is executed only on the atoms of each node individually, and their
> > coordinates are not passed around between nodes. This is definitely
> > scalable, though it can carry a moderate overhead because of the Tcl
> > interpreter.
> >
> > Alternatively, if the dihedrals that you're trying to restrain are unique
> > combinations of four atom types connected with each other, you could
> > temporarily change their force field parameters during equilibration, and
> > revert to the original parameters for the production run.
>
> there is one more suggestion: since the issue at hand is not a
> production run issue, but a matter of initial equilibration, there is
> no need to do this with a huge system, but rather take a small subset
> and then pre-equilibrate this (quickly on a few CPUs and without
> parallel scaling issues) and then move on to larger system, simply by
> replicating the system (best in stages) until you have a large enough
> bilayer that is ready for production. whatever the method used, this
> divide-and-conquer strategy is going to be the most efficient.
>
> axel.
>
>
> >
> > Giacomo
> >
> >
> > On Tue, Aug 13, 2013 at 12:17 PM, Sunhwan Jo <sunhwan_at_uchicago.edu> wrote:
> >>
> >> Thanks for your input.
> >>
> >> I am fully aware of the situation and simply wanted to see what the
> >> current performance is. I'm certainly not going to use 25,000 atoms to
> >> define collective variables in production. Since I have these data on
> >> hand, I can do a better job designing my simulation later, if I have to
> >> use colvars. And that was the reason I shared the table.
> >>
> >> Regarding the scalability with the number of colvars, how many colvars
> >> did you define and how many atoms did they use in total? I'd be curious
> >> to know if you are defining e.g. 5,000 individual variables on 5,000
> >> atoms. Normally, the performance should be comparable to when you define
> >> a single colvar over 5,000 atoms.
> >>
> >>
> >> Let me elaborate on this a little bit. When you build a lipid bilayer
> >> system, each lipid head group has a chiral center, and they can flip
> >> during minimization when the lipids are placed badly. The same is true
> >> for the cis double bonds found in unsaturated lipids. So, I usually
> >> define dihedral angle restraints to hold them until they are
> >> equilibrated. Now, when you have 250 unsaturated lipids in your system,
> >> you end up having 750 colvar harmonic restraints, which is not too bad,
> >> but it can get worse as people run larger and larger lipid bilayer
> >> patches. For some glycolipid systems, I have seen someone trying to
> >> apply 5000 colvars (each affecting 4 atoms) and it wasn't pretty.
> >>
> >> Now, do you have any suggestions on how we can increase the performance
> >> of colvars in parallel? I was thinking the issue is two-fold: a) reducing
> >> the communication to the main node (because each atom position has to be
> >> handed over to the main node in each time step) and b) parallelizing the
> >> colvar calculation (only the main node does the calculation).
> >>
> >> I am new to NAMD, so I'm not sure if this is feasible. For (a), can we
> >> identify a node close to the nodes where all the colvar atoms are
> >> readily accessible after spatial decomposition, and perform the colvar
> >> calculation on that node? And for (b), can we make separate force
> >> objects for each colvar (or for a group of colvars)?
> >>
> >> Again, I'm new to NAMD, so any suggestion or link to helpful documents
> >> would be appreciated.
> >>
> >> Thanks,
> >> Sunhwan
> >>
> >>
> >> On Aug 13, 2013, at 10:51 AM, Giacomo Fiorin <giacomo.fiorin_at_gmail.com>
> >> wrote:
> >>
> >> Hello Sunhwan, this is not uncommon for bulk systems (i.e. where the
> >> important atoms belong to thousands of small molecules, instead of a few
> >> macromolecules or just one protein).
> >>
> >> Nevertheless, can you simplify the problem using fewer atoms from each
> >> molecule (ideally, one atom from each)? In the third example, you're
> >> using a total of 25,000 atoms to define collective variables: you
> >> certainly don't have 25,000 lipids or carbohydrates in the system.
> >>
> >> Regarding the scalability with the number of colvars, how many colvars
> >> did you define and how many atoms did they use in total? I'd be curious
> >> to know if you are defining e.g. 5,000 individual variables on 5,000
> >> atoms. Normally, the performance should be comparable to when you define
> >> a single colvar over 5,000 atoms.
> >>
> >> Giacomo
> >>
> >>
> >>
> >> On Tue, Aug 13, 2013 at 11:29 AM, Sunhwan Jo <sunhwan_at_uchicago.edu> wrote:
> >>>
> >>> I was curious about the parallel performance of colvars. But I could
> >>> certainly think of a case where I would like to restrain thousands of
> >>> atoms at a time.
> >>>
> >>> Another performance problem of colvars is the inability to scale when
> >>> a large number of colvars is applied, which is desirable during
> >>> equilibration of crowded systems, e.g., systems containing unsaturated
> >>> lipids, lipids in general, and carbohydrates.
> >>>
> >>> Best,
> >>> Sunhwan
> >>>
> >>> On Aug 13, 2013, at 9:51 AM, Jérôme Hénin <jerome.henin_at_ibpc.fr>
> >>> wrote:
> >>>
> >>> > Hi Sunhwan,
> >>> >
> >>> > Thanks for sharing the benchmark figures. One question: are you sure
> >>> > you need to restrain thousands of atoms at a time?
> >>> >
> >>> > Cheers,
> >>> > Jerome
> >>> >
> >>> > ----- Original Message -----
> >>> >> Francesco,
> >>> >>
> >>> >>
> >>> >> I've been testing colvar parallel performance lately, so I'd like to
> >>> >> share it with you.
> >>> >>
> >>> >>
> >>> >> I've used three systems:
> >>> >>
> >>> >>
> >>> >> #1:
> >>> >> Total 135K atoms
> >>> >> 2 RMSD type colvar (1556 backbone and 1561 side chain atoms
> >>> >> restrained)
> >>> >>
> >>> >>
> >>> >> #2:
> >>> >> Total 506K atoms
> >>> >> 2 RMSD type colvar (6224 and 6244 atoms)
> >>> >>
> >>> >>
> >>> >> #3:
> >>> >> Total 443K atoms
> >>> >> 2 RMSD type colvar (12448 and 12488 atoms)
> >>> >>
> >>> >>
> >>> >> Each number is the average (n=3) time in seconds taken to finish 1000
> >>> >> MD steps starting from a restart file.
> >>> >> These were run on 8-core machines equipped with InfiniBand.
> >>> >>
> >>> >> (w/o = without colvars, w = with colvars)
> >>> >>
> >>> >> # procs    #1 w/o   #1 w    #2 w/o   #2 w    #3 w/o   #3 w
> >>> >>     1        1836   1845      5738   5841      5180      -
> >>> >>     8         213    219       798    827       715    829
> >>> >>    16         113    116       433    454       412    485
> >>> >>    32          62     65       225    253       219    415
> >>> >>    64          35     40       132    171       115    370
> >>> >>   128          25     29        87    140        73    352
> >>> >>
> >>> >>
> >>> >>
> >>> >> Hope this helps.
> >>> >>
> >>> >>
> >>> >> Thanks,
> >>> >> Sunhwan
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Aug 12, 2013, at 9:54 AM, Francesco Pietra <chiendarret_at_gmail.com> wrote:
> >>> >>
> >>> >>
> >>> >> Hello:
> >>> >> My aim is to carry out parallel tempering of a transmembrane protein
> >>> >> endowed with peptide ligands in a periodic TIP3 box. I would like to
> >>> >> restrain all except the mobile parts of the protein. To this end I am
> >>> >> examining the multifaceted opportunities of colvars with MPI-compiled
> >>> >> namd2.10.
> >>> >>
> >>> >> I would be very grateful for input on where, in the now quite complex
> >>> >> panorama of colvars, to concentrate attention in view of the above
> >>> >> task. My major concern is the large number of atoms to include in the
> >>> >> colvars, which may well conflict with the "few thousand" set forth in
> >>> >> the manual, i.e., ca 20,000 protein + peptide atoms and, if I also
> >>> >> want to restrain the double-layer membrane, an additional 27,000 atoms.
> >>> >>
> >>> >> Thanks for advice
> >>> >>
> >>> >> francesco pietra
> >>> >>
> >>> >>
> >>> >>
> >>>
> >>>
> >>
> >>
> >
>
>
>
> --
> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
> International Centre for Theoretical Physics, Trieste. Italy.
>
