Re: colvars and parallel-tempering with namd2.10

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Mon Aug 19 2013 - 01:41:12 CDT

Giacomo: Thanks a lot. The correct layout of the harmonic bias improves the
performance by a very slight margin (1.57 vs the previous 1.60 days/ns).
Anyway, the log file does warn about the incorrect layout I used before.
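
For the record, a minimal sketch of what the corrected block looks like,
following Giacomo's point (quoted below) that "colvars" and "centers" should
each appear only once, as lists; the colvar names, centers and force constant
are those from the configuration quoted further down in this thread:

harmonic {
    name           pippo
    colvars        colvar1 colvar2 colvar3 colvar4 colvar5 colvar6
    centers        0.0 0.0 0.0 0.0 0.0 0.0
    forceConstant  10.0
}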

The wallclock and cpuTime figures are in huge disagreement: they indicate a
slowdown of only about a factor of two with these colvars, but the NAMD
performance wiki says that wallclock is unreliable for judging performance.

As to the settings in the NAMD conf file, I can't claim that they are optimal,
especially for the standard MPI/OpenMP compilation of namd2.9 with threads
on Blue Gene (I have another build without threads, but I use that one for
parallel tempering).

As to the settings:
(1) One striking - surely well known - observation is that Blue Gene changes
my "margin 0" to "margin 0.48" because this is a constant-pressure
simulation. That somewhat contrasts with the indication on the said NAMD
wiki.
(2) The patch grid is 17 x 9 x 13 = 1989 patches, i.e., correctly more than
the number of processors (1024).

But there are so many other possibilities of tuning...

At any event, it seems to me that the only hope is in what you suggested:
tclBC. Hopefully a script skeleton will be provided by some kind NAMD user,
as I found no similar scripting while looking around. Applying so many
patches might seem crazy, but I think I have good reasons to do so. Another
possibility could be to carry out parallel tempering on a system consisting
of only the parts of the protein that I would like to submit to a thermal
cycle. However, I have no idea how to recombine this part with the one from
which it was separated, with the further complication that there is a
bilayer membrane.
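
In the meantime, here is a rough, untested sketch of the kind of tclBC setup
Giacomo described (per-node harmonic restraints of selected atoms toward
reference positions, with no coordinate communication to the master node).
The file name ref_ca.txt, its format (one line per restrained atom:
atomID x y z) and the force constant passed through tclBCArgs are
assumptions to be adapted:

tclBC on
tclBCScript {
  # Assumed input file (hypothetical name/format): one line per restrained
  # CA atom, "<1-based atomID> <x> <y> <z>", taken from the reference PDB.
  set fp [open "ref_ca.txt" r]
  while {[gets $fp line] >= 0} {
    set id [lindex $line 0]
    set ::refpos($id) [lrange $line 1 3]
  }
  close $fp

  # calcforces is called on each node for the atoms it holds;
  # k is passed in from tclBCArgs below.
  proc calcforces {step unique k} {
    while {[nextatom]} {
      set id [getid]
      if {![info exists ::refpos($id)]} {
        dropatom    ;# not a restrained atom: ignore it from now on
        continue
      }
      foreach {x y z} [getcoord] break
      foreach {x0 y0 z0} $::refpos($id) break
      # simple harmonic restraint toward the reference position
      addforce [list [expr {-$k*($x - $x0)}] \
                     [expr {-$k*($y - $y0)}] \
                     [expr {-$k*($z - $z0)}]]
    }
  }
}
tclBCArgs { 10.0 }   ;# force constant (kcal/mol/A^2), a placeholder value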

francesco

On Sun, Aug 18, 2013 at 5:40 PM, Giacomo Fiorin <giacomo.fiorin_at_gmail.com> wrote:

> Francesco, the syntax of the "harmonic" bias is not correct. colvars and
> centers should both be lists, appearing only once in the block.
>
> Giacomo
>
>
> On Sat, Aug 17, 2013 at 12:51 PM, Francesco Pietra <chiendarret_at_gmail.com> wrote:
>
>> > In my opinion you could give the tclBC option a try.
>>
>> My hope rests on this suggestion by Giacomo Fiorin. Unless I made mistakes,
>> ordinary rmsd colvars turn out to slow down the simulation beyond
>> feasibility with my system of 500,000 atoms (protein, peptide ligands, POPC
>> bilayer, TIP3 water).
>>
>> I set rmsd colvars for about 1300 CA atoms in six different segments
>> (protein and ligands mostly well defined by X-ray diffraction and well
>> behaved in the namd2.9 equilibration with the CHARMM force field), leaving
>> out only the poorly defined residues.
>>
>> Let me show how I defined the six colvars, so you can spot errors, if
>> any, that NAMD does not catch:
>>
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> colvar {
>>   name colvar2                 # needed to identify the variable
>>   lowerBoundary -2.0
>>   upperBoundary 2.0            # force will kick in when rmsd exceeds 2 A
>>
>>   lowerWall -1.5               # a little higher than lowerBoundary
>>   upperWall 1.5                # a little lower than upperBoundary
>>
>>   lowerWallConstant 100.0      # force constant in kcal/mol
>>   upperWallConstant 100.0
>>
>>   outputSystemForce yes        # reports also the system force on this
>>                                # colvar in addition to the current value
>>
>>   rmsd {
>>     atoms {
>>       # all the C-alphas within residues 42 to 449 of segment "P2"
>>       psfSegID P2
>>       atomNameResidueRange CA 42-449
>>     }
>>     refPositionsFile npt-02.pdb
>>   }
>> }
>>
>>
>> followed by:
>>
>>
>> harmonic {
>>   name pippo
>>   colvars { colvar1 }
>>   centers 0.0
>>   colvars { colvar2 }
>>   centers 0.0
>>   colvars { colvar3 }
>>   centers 0.0
>>   colvars { colvar4 }
>>   centers 0.0
>>   colvars { colvar5 }
>>   centers 0.0
>>   colvars { colvar6 }
>>   centers 0.0
>>   forceConstant 10.0
>> }
>>
>>
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>
>>
>> Ordinary MD was run on 64 nodes / 1024 processors, with the following NAMD
>> benchmarks:
>>
>> Without colvars: 4096 CPUs, 0.0286588 s/step, 0.331699 days/ns (4096 are
>> processes, not CPUs)
>>
>> With colvars: 4096 CPUs, 0.14067 s/step, 1.62813 days/ns
>>
>> In other words, 10 min of wallclock allow fewer than 2000 steps with:
>> outputEnergies 500   # a multiple of fullElectFrequency, or vice versa
>> restartfreq 100
>> DCDfreq 500
>> timestep 1 fs
>>
>> Clearly unattractive results in view of my projected parallel tempering,
>> as at the start of this thread.
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>
>> Mistakes or not in all of the above (particularly in the tentative values
>> of the boundaries and constants), I would be much indebted to anyone
>> sharing a tclBC script, or a skeleton of one, for the above case. Of course
>> I'll try to build one myself (finally deciding to learn some serious Tcl
>> scripting), but an efficient one would help.
>>
>> Thanks to all
>> francesco pietra
>>
>>
>> On Tue, Aug 13, 2013 at 7:07 PM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:
>>
>>> On Tue, Aug 13, 2013 at 7:00 PM, Giacomo Fiorin
>>> <giacomo.fiorin_at_gmail.com> wrote:
>>> > Hello Sunhwan, that's indeed what I was thinking. In your case you've
>>> > cleverly used a "collective" variable such as RMSD because the most
>>> > natural option (applying individual restraints to very many dihedrals
>>> > throughout the system) was not immediately available to you.
>>> >
>>> > For collective variables, communication to the master node (at least of
>>> > their values) is necessary, because the colvars represent "important"
>>> > thermodynamic quantities that summarize the state of the entire system:
>>> > so the potential applied to all of them must be evaluated together.
>>> >
>>> > In your case, the 5,000 variables do not need to be communicated to the
>>> > master node, because you have a simple potential function that, when
>>> > applied to one variable, doesn't need to know the value of the others.
>>> > So in a sense, your variables are not "collective" at all.
>>> >
>>> > In my opinion you could give the tclBC option a try (since you're new to
>>> > NAMD, you might have overlooked it). With that you can apply a script
>>> > that is executed only on the atoms of each node individually, and their
>>> > coordinates are not passed around between nodes. This is definitely
>>> > scalable, though it can carry a moderate overhead because of the Tcl
>>> > interpreter.
>>> >
>>> > Alternatively, if the dihedrals that you're trying to restrain are
>>> > unique combinations of four atom types connected with each other, you
>>> > could temporarily change their force field parameters during
>>> > equilibration, and revert to the original parameters for the production
>>> > run.
>>>
>>> there is one more suggestion: since the issue at hand is not a
>>> production run issue, but a matter of initial equilibration, there is
>>> no need to do this with a huge system. rather, take a small subset,
>>> pre-equilibrate it (quickly, on a few CPUs and without parallel
>>> scaling issues), and then move on to a larger system, simply by
>>> replicating the system (best in stages) until you have a large enough
>>> bilayer that is ready for production. whatever the method used, this
>>> divide-and-conquer strategy is going to be the most efficient.
>>>
>>> axel.
>>>
>>>
>>> >
>>> > Giacomo
>>> >
>>> >
>>> > On Tue, Aug 13, 2013 at 12:17 PM, Sunhwan Jo <sunhwan_at_uchicago.edu> wrote:
>>> >>
>>> >> Thanks for your input.
>>> >>
>>> >> I am fully aware of the situation and wanted simply to see what the
>>> >> current performance is. I'm certainly not going to use 25,000 atoms to
>>> >> define collective variables in production. Since I have these data on
>>> >> hand, I can better design my simulation later, if I have to use colvars.
