Re: colvars and parallel-tempering with namd2.10

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Fri Aug 23 2013 - 02:57:53 CDT

Giacomo Fiorin wrote on Aug 13, regarding RMSD colvars vs. tclBC:
In my opinion you could give the tclBC option a try. With that you can
apply a script that is executed only on the atoms of each node
individually, and their coordinates are not passed around between nodes.
This is definitely scalable, though it can carry a moderate overhead
because of the Tcl interpreter.

As an RMSD colvar applied to most CA atoms of my large transmembrane protein
system proved to slow down the MD by nearly 5-fold, I started considering
tclBC, as described in the "User-Defined Forces in NAMD" tutorial:

The main difference between tclBC and tclForces is that an independent
instance of tclBC is running on each processor, and only atoms forming
the patch treated by that processor are visible to the given instance of
tclBC. This feature makes tclBC more efficient than tclForces, which is
run on just one CPU, but also limits its capabilities. If you need to
apply forces to each atom irrespective of the position of other atoms,
use tclBC. On the other hand, if you have to consider mutual positions
of two or more atoms or gather information about the whole system, you
will need tclForces.

I am now confused by the above statement. What might the result be of
applying tclBC to the same set of CA atoms as above? My aim is that the
thermal cycle of parallel tempering should operate only on the parts free
from tclBC, while the protein dihedrals subjected to tclBC should remain
substantially unaltered. It now seems to me that this endeavor, which
concerns the whole system, is not what tclBC was devised for.
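What tclBC is devised for, as far as I understand, is a force that depends
only on each atom's own coordinates, for example a per-atom harmonic
positional restraint along the lines of this minimal sketch (the restrained
atom IDs, reference coordinates and force constant are hypothetical and
would have to be set up by the script itself):

# restrain_ca.tcl -- per-atom harmonic positional restraints via tclBC
# In the NAMD configuration:
#   tclBC       on
#   tclBCScript { source restrain_ca.tcl }

set k 5.0   ;# force constant, kcal/mol/A^2 (hypothetical)

# restrainedRef(<atomID>) -> {x0 y0 z0}; only these atoms get a force.
# Sketch only: this array must be filled in, e.g. from a reference PDB
# parsed by this same script.
array set restrainedRef {}

proc calcforces { step unique args } {
    global k restrainedRef
    while { [nextatom] } {
        set id [getid]
        if { ! [info exists restrainedRef($id)] } {
            # not a restrained atom: drop it from future calls on this node
            dropatom
            continue
        }
        foreach { x y z }    [getcoord]          { break }
        foreach { x0 y0 z0 } $restrainedRef($id) { break }
        # harmonic pull back toward the reference position; only this
        # atom's own coordinates are needed, so nothing is communicated
        # between nodes
        addforce [list [expr {-$k*($x - $x0)}] \
                       [expr {-$k*($y - $y0)}] \
                       [expr {-$k*($z - $z0)}]]
    }
}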

Such an endeavor of applying parallel tempering to only a part of the system
appeared in the early literature on the technique, around the year 2005,
based on something other than Tcl forces. Unfortunately, I am now unable
to trace back the correct reference.

Thanks for advice
francesco pietra

On Tue, Aug 13, 2013 at 7:00 PM, Giacomo Fiorin <giacomo.fiorin_at_gmail.com> wrote:

> Hello Sunhwan, that's indeed what I was thinking. In your case you've
> cleverly used a "collective" variable such as RMSD because the most natural
> option (applying individual restraints to very many dihedrals throughout
> the system) was not immediately available to you.
>
> For collective variables, communication to the masternode (at least of
> their values) is necessary, because the colvars represent "important"
> thermodynamic quantities that summarize the state of the entire system: so
> the potential applied to all of them must be evaluated together.
>
> In your case, the 5,000 variables do not need to be communicated to the
> masternode, because you have a simple potential function that when applied
> to one variable, doesn't need to know the value of the others. So in a
> sense, your variables are not "collective" at all.
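>
> For illustration, a truly "collective" restraint looks like this in the
> colvars configuration (atom numbers and constants are hypothetical); the
> single RMSD value has to be assembled on the master node before the
> harmonic bias on it can be evaluated:
>
>   colvar {
>     name backboneRMSD
>     rmsd {
>       atoms { atomNumbersRange 1-6224 }
>       refPositionsFile reference.pdb
>     }
>   }
>   harmonic {
>     colvars       backboneRMSD
>     centers       0.0
>     forceConstant 10.0
>   }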
>
> In my opinion you could give the tclBC option a try (since you're new to
> NAMD, you might have overlooked it). With that you can apply a script that
> is executed only on the atoms of each node individually, and their
> coordinates are not passed around between nodes. This is definitely
> scalable, though it can carry a moderate overhead because of the Tcl
> interpreter.
>
> Alternatively, if the dihedrals that you're trying to restrain are unique
> combinations of four atom types connected with each other, you could
> temporarily change their force field parameters during equilibration, and
> revert to the original parameters for the production run.
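>
> A related, scalable route for many dihedral restraints is NAMD's
> extraBonds facility, which reads extra harmonic dihedral terms from a
> plain text file and bypasses colvars entirely. A sketch, assuming the
> zero-based atom indices described in the User's Guide (the indices,
> force constant and reference angle themselves are hypothetical):
>
>   # in the NAMD configuration, for equilibration only:
>   extraBonds     on
>   extraBondsFile dihedral_restraints.txt
>
>   # dihedral_restraints.txt -- one line per restrained dihedral:
>   # dihedral <i> <j> <k> <l> <force constant> <reference angle, deg>
>   dihedral 1024 1026 1029 1031  50.0  120.0
>   dihedral 1156 1158 1161 1163  50.0  120.0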
>
> Giacomo
>
>
> On Tue, Aug 13, 2013 at 12:17 PM, Sunhwan Jo <sunhwan_at_uchicago.edu> wrote:
>
>> Thanks for your input.
>>
>> I am fully aware of the situation and simply wanted to see what the
>> current performance is. I'm certainly not going to use 25,000 atoms to define
>> collective variables in production. Since I have these data on hand, I can
>> better design my simulation later, if I have to use colvars. That was the
>> reason I shared the table.
>>
>> Regarding the scalability with the number of colvars, how many colvars
>> did you define and how many atoms did they use in total? I'd be curious to
>> know if you are defining e.g. 5,000 individual variables on 5,000 atoms.
>> Normally, the performance should be comparable to when you define a single
>> colvar over 5,000 atoms.
>>
>>
>> Let me elaborate on this a little bit. When you build a lipid bilayer system,
>> each lipid head group has a chiral center, and they can flip during
>> minimization when the lipids are placed badly. The same is true for the
>> cis double bonds found in unsaturated lipids. So I usually define
>> dihedral angle restraints to hold them until they are equilibrated. Now, when
>> you have 250 unsaturated lipids in your system, you end up with 750
>> colvar harmonic restraints, which is not too bad, but it can get worse as
>> people run larger and larger lipid bilayer patches. For one glycolipid
>> system, I have seen someone trying to apply 5,000 colvars (each affecting 4
>> atoms) and it wasn't pretty.
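>>
>> Each of those restraints is a small, independent colvar, something like
>> the following (atom numbers and constants hypothetical), repeated once
>> per dihedral:
>>
>>   colvar {
>>     name lipid1chiral
>>     dihedral {
>>       group1 { atomNumbers 101 }
>>       group2 { atomNumbers 103 }
>>       group3 { atomNumbers 105 }
>>       group4 { atomNumbers 108 }
>>     }
>>   }
>>   harmonic {
>>     colvars       lipid1chiral
>>     centers       120.0
>>     forceConstant 0.05
>>   }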
>>
>> Now, do you have any suggestions on how we can increase the performance of
>> colvars in parallel? I was thinking the issue is two-fold: a) reducing the
>> communication to the main node (because each colvar atom position has to be
>> handed over to the main node at each time step), and b) parallelizing the
>> colvar calculation (only the main node does the calculation).
>>
>> I am new to NAMD, so I'm not sure if this is feasible. For (a), could we
>> identify a node close to the nodes where all the colvar atoms are
>> readily accessible after spatial decomposition, and perform the colvar
>> calculation on that node? And for (b), could we make separate force objects
>> for each colvar (or for a group of colvars)?
>>
>> Again, I'm new to NAMD, so any suggestion or link to helpful documents
>> would be appreciated.
>>
>> Thanks,
>> Sunhwan
>>
>>
>> On Aug 13, 2013, at 10:51 AM, Giacomo Fiorin <giacomo.fiorin_at_gmail.com>
>> wrote:
>>
>> Hello Sunhwan, this is not uncommon for bulk systems (i.e. where the
>> important atoms belong to thousands of small molecules, instead of a few
>> macromolecules or just one protein).
>>
>> Nevertheless, can you simplify the problem using fewer atoms from each
>> molecule (ideally, one atom from each)? In the third example, you're using
>> a total of 25,000 atoms to define collective variables: you certainly don't
>> have 25,000 lipids or carbohydrates in the system.
>>
>> Regarding the scalability with the number of colvars, how many colvars
>> did you define and how many atoms did they use in total? I'd be curious to
>> know if you are defining e.g. 5,000 individual variables on 5,000 atoms.
>> Normally, the performance should be comparable to when you define a single
>> colvar over 5,000 atoms.
>>
>> Giacomo
>>
>>
>>
>> On Tue, Aug 13, 2013 at 11:29 AM, Sunhwan Jo <sunhwan_at_uchicago.edu> wrote:
>>
>>> I was curious about the parallel performance of colvars. But I can
>>> certainly think of a case where I would like to restrain thousands of atoms
>>> at a time.
>>>
>>> Another performance problem of colvars is the inability to scale when a
>>> large number of colvars is applied, which is desirable during equilibration
>>> of crowded systems, e.g., systems containing unsaturated lipids, lipids in
>>> general, and carbohydrates.
>>>
>>> Best,
>>> Sunhwan
>>>
>>> On Aug 13, 2013, at 9:51 AM, Jérôme Hénin <jerome.henin_at_ibpc.fr>
>>> wrote:
>>>
>>> > Hi Sunhwan,
>>> >
>>> > Thanks for sharing the benchmark figures. One question: are you sure
>>> you need to restrain thousands of atoms at a time?
>>> >
>>> > Cheers,
>>> > Jerome
>>> >
>>> > ----- Original Message -----
>>> >> Francesco,
>>> >>
>>> >>
>>> >> I've been testing colvar parallel performance lately, so I'd like to
>>> >> share it with you.
>>> >>
>>> >>
>>> >> I've used three systems:
>>> >>
>>> >>
>>> >> #1:
>>> >> Total 135K atoms
>>> >> 2 RMSD type colvar (1556 backbone and 1561 side chain atoms
>>> >> restrained)
>>> >>
>>> >>
>>> >> #2:
>>> >> Total 506K atoms
>>> >> 2 RMSD type colvar (6224 and 6244 atoms)
>>> >>
>>> >>
>>> >> #3:
>>> >> Total 443K atoms
>>> >> 2 RMSD type colvar (12448 and 12488 atoms)
>>> >>
>>> >>
>>> >> Each number is the average (n=3) time in seconds taken to finish 1000
>>> >> MD steps starting from a restart file.
>>> >> These were run on 8-core machines equipped with InfiniBand.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>                    #1            #2            #3
>>> >>  # processors    w/o     w     w/o     w     w/o     w
>>> >>        1        1836  1845    5738  5841    5180     -
>>> >>        8         213   219     798   827     715   829
>>> >>       16         113   116     433   454     412   485
>>> >>       32          62    65     225   253     219   415
>>> >>       64          35    40     132   171     115   370
>>> >>      128          25    29      87   140      73   352
>>> >>
>>> >>  (w/o = without colvars, w = with colvars)
>>> >>
>>> >>
>>> >>
>>> >> Hope this helps.
>>> >>
>>> >>
>>> >> Thanks,
>>> >> Sunhwan
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Aug 12, 2013, at 9:54 AM, Francesco Pietra <chiendarret_at_gmail.com> wrote:
>>> >>
>>> >>
>>> >> Hello:
>>> >> My aim is to carry out parallel tempering of a transmembrane protein
>>> >> endowed with peptide ligands in a periodic TIP3 water box. I would like to
>>> >> restrain everything except the mobile parts of the protein. To this end I am
>>> >> examining the multifaceted opportunities of colvars with MPI-compiled
>>> >> namd2.10.
>>> >>
>>> >> I would be very grateful for input on where, in the now quite
>>> >> complex panorama of colvars, to concentrate attention in view of the
>>> >> above task. My major concern is the large number of atoms to include
>>> >> in colvars, which may well conflict with the "few thousand" limit set forth
>>> >> in the manual, i.e., ca. 20,000 protein + peptide atoms and, if I also
>>> >> want to restrain the lipid bilayer membrane, an additional 27,000 atoms.
>>> >>
>>> >> Thanks for advice
>>> >>
>>> >> francesco pietra
>>> >>
>>> >>
>>> >>
>>>
>>>
>>>
>>
>>
>
