Re: About the scope of show_replicas.vmd

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Fri Sep 20 2013 - 12:39:16 CDT

Hi Kkenno:

As mentioned earlier, there are different kinds of convergence. You seem to
> be trying to verify convergence in the sense of your replica exchange
> simulation being equilibrated. However, it sounds like the kind of
> convergence you are really looking for is convergence of the conformational
> ensemble. As also mentioned earlier, fundamentally spoken, this kind of
> convergence is impossible to check because of the wide range in magnitude
> of conformational barriers. However, pragmatically spoken, it is often
> possible to make an educated guess.
>
> Now, to tackle your actual problem at hand. Assuming the phase space
> overlap between your replicas is already adequate, I don't think you can
> improve conformational sampling significantly by simply increasing the
> number of replicas. In your situation, the best way to improve sampling
> seems to be to increase the temperature of the highest replica. Of course,
> you need to maintain phase space overlap while doing so, which means that,
> as the temperature of the highest replica increases, more replicas will be
> needed anyway. So I would recommend graphing the histograms of the energy
> of the different temperature windows together on one plot, and use that
> information to choose a set of 32 temperature windows with the highest
> temperature as high as possible, yet low enough so that you can be
> confident the histograms will have enough overlap.
>
> Oh yeah, T-remd is indeed known to be pretty painful in explicit solvent,
> so your decision to go implicit makes sense.
>

Thanks for the comments.

But no convergence check in the world will give you information on how bad
> the implicit solvent approximation is in your particular system. Only
> comparison with explicit solvent or experiment will do that.
>

Do you have any evidence that T-remd simulations in explicit water are so
much reliable? I mean, do you know of a paper published before than
experimental results were published (by a different team of research) that
proved that? If so, I could perhaps try to compare GB T-remd with explicit
water T-remd for one of my peptides (should built-in remd with namd-2.10
will prove substantially faster than tcl-driven remd with namd-2.9). It
would be a scientifically sound approach, hopefully lending credibility to
the protein I am attempting to complete after partial failure of X-ray
diffraction (NMR proved much too complex).

cheers
francesco

On Fri, Sep 20, 2013 at 6:00 PM, Kenno Vanommeslaeghe <
kvanomme_at_rx.umaryland.edu> wrote:

> As mentioned earlier, there are different kinds of convergence. You seem
> to be trying to verify convergence in the sense of your replica exchange
> simulation being equilibrated. However, it sounds like the kind of
> convergence you are really looking for is convergence of the conformational
> ensemble. As also mentioned earlier, fundamentally spoken, this kind of
> convergence is impossible to check because of the wide range in magnitude
> of conformational barriers. However, pragmatically spoken, it is often
> possible to make an educated guess.
>
> Now, to tackle your actual problem at hand. Assuming the phase space
> overlap between your replicas is already adequate, I don't think you can
> improve conformational sampling significantly by simply increasing the
> number of replicas. In your situation, the best way to improve sampling
> seems to be to increase the temperature of the highest replica. Of course,
> you need to maintain phase space overlap while doing so, which means that,
> as the temperature of the highest replica increases, more replicas will be
> needed anyway. So I would recommend graphing the histograms of the energy
> of the different temperature windows together on one plot, and use that
> information to choose a set of 32 temperature windows with the highest
> temperature as high as possible, yet low enough so that you can be
> confident the histograms will have enough overlap.
>
> Oh yeah, T-remd is indeed known to be pretty painful in explicit solvent,
> so your decision to go implicit makes sense. But no convergence check in
> the world will give you information on how bad the implicit solvent
> approximation is in your particular system. Only comparison with explicit
> solvent or experiment will do that.
>
>
>
>
> On 09/20/2013 03:33 AM, Francesco Pietra wrote:
>
>> Hi Niklaus:
>>
>> What I am trying to do is defining the conformation of starting and ending
>> parts of a homotrimer. These portions, 34aa and 12aa for each subunit in
>> our hands do not diffract adequately under x-ray at low temp. So, I guess
>> that the energy barriers are low. This is why my hope to get that type of
>> T-remd convergence I alluded to. Once modeled, those parts should be
>> replaced in the original (by superimposing a rmsd-colvars restrained small
>> helical portion).
>>
>> Computer time does matter for me indeed, as I am trying to accomplish the
>> investigation under a fixed budget. I first tried the 34aa in TIP3 water
>> but the request of hardware, in terms of physical nodes, was too high to
>> fix "Stray PME grid charges detected" that was raised. Therefore, I
>> changed to implicit GB, following other-people idea that most of the
>> resources for explicit waster T-remd are wasted with swapping water
>> interactions. GB runs fast, however I don't know if is as good as with
>> latest adjustment of parameters for amber (C. Simmerling 2013).
>>
>> Clearly, I need a test of convergence that is not that provided by
>> "show_replicas.vmd". I wonder whether the collection of replicas obtained
>> with show_replicas.vmd could in place be elaborated, such as for measuring
>> RMSD for the various replicas against the initial unfolded situation. Just
>> to have a measure of how much the various replicas differ from one
>> another. As I am using 32 replicas (0.7 exchange ratio), the matter is
>> rather tricky. With 16 replicas the exchange ratio is still very good
>> (0.4) but the computer time saved is minimal (and I pay the same as for 32
>> replicas). I hope that convergence is faster with 32 replicas. As a
>> biochemist, I never engaged myself in extensive coding, while it would be
>> probably not too difficult to adapt show_replicas.vmd to the task. I have
>> also thought to move to namd_2.10, where remd is built in, so that it
>> should be faster.
>>
>> cheers
>> francesco
>>
>>
>> On Thu, Sep 19, 2013 at 7:30 PM, Niklaus Johner <niki.johner_at_gmail.com
>> <mailto:niki.johner_at_gmail.com>**> wrote:
>>
>> Yes each replica samples from the same "extended" phase-space so that
>> if you wait long enough, each replica will have visited each
>> conformational state often enough to have a converged statistic of
>> that state's occupation at the different temperatures and therefore
>> you'll be able to say that, as Jason points out, your replicas have
>> converged.
>>
>> So fundamentally I agree with you that obtaining this kind of
>> convergence would be ideal (if not proof of convergence). I just think
>> it's not realistic with our current simulation capabilities, except
>> for very small systems with a reasonably small number of degrees of
>> freedom, like the alanine-dipeptide (that's why all the papers
>> introducing new sampling methods only look at the dipeptide and the
>> TRP-cage :-)). So if you get convergence of the probabilities of
>> occupancy of the few most sampled states, that would already be an
>> achievement and is, in my opinion, enough to give you some confidence
>> in the conformations you predict.
>>
>> How big is your peptide?
>>
>> Obviously looking at different measures of convergence will increase
>> your confidence in your sampling. You could look at frequency of
>> certain structural elements, convergence of contact maps etc. Split
>> the simulation in pieces and compare, compare different replicas...
>>
>> A good way to speed up convergence is to start each replica from a
>> different configuration. Again, I know it shouldn't matter if you wait
>> long enough, but again, I think it's usually not an option to wait
>> that long.
>>
>> Best,
>>
>> N.
>>
>>
>> Niklaus Johner
>> Weill Cornell Medical College
>> Harel Weinstein Lab
>> Department of Physiology and Biophysics
>> 1300 York Avenue, Room D-501
>> New York, NY 10065
>>
>>
>>
>> On Sep 19, 2013, at 11:53 AM, Francesco Pietra wrote:
>>
>> Absolutely no. Each individual replica samples from the same pool.
>>> Therefore, if you do not get the same from each individual replica,
>>> this means no convergence. It might be difficult to get convergence
>>> but, if not obtained, it would be non scientific to go to
>>> "sortreplicas" to compute properties at the temperature of your
>>> interest.
>>>
>>> This is the way I understand T-remd. But I am prone to reeducate
>>> myself in the light of compelling reasoning. This was not the case
>>> so far.
>>>
>>> cheers
>>> francesco pietra
>>>
>>>
>>> On Thu, Sep 19, 2013 at 4:48 PM, Niklaus Johner
>>> <niki.johner_at_gmail.com <mailto:niki.johner_at_gmail.com>**> wrote:
>>>
>>> What do you mean by "giving the same answer"? Are you hoping
>>> that all your replicas will converge to a unique structure? The
>>> whole point of T-remd is that the replicas switch from one
>>> temperature to another, to avoid getting stuck in a particular
>>> minimum. I think the expectation is that even if you could run
>>> long enough to have "convergence", meaning that you have the
>>> correct populations for all the important states, which is more
>>> than you can hope for, this would be a global convergence of the
>>> T-remd and not of the individual replicas. Different replicas
>>> might explore the phase-space around different local minima, but
>>> should populate the lower temperatures in the simulation
>>> according to the relative energies of these minima. So I think
>>> what you want to test is if the population of states in the
>>> lowest temperatures is stable. So I would do clustering on
>>> different parts of the trajectory and compare the clusters of
>>> highest occupancy, or something like this.
>>>
>>> Be aware that you can never really know if a simulation is
>>> converged. How could you know if you've seen all the important
>>> conformations? Maybe the most important conformation, with
>>> lowest free-energy can be reached from your starting
>>> configuration only by crossing a very high energy barrier, so
>>> that you will never see it, and all the measures you can come up
>>> with could tell you that it's converged. It's probably one of
>>> the biggest issues with MD.
>>>
>>> Good luck,
>>>
>>> N.
>>>
>>> Niklaus Johner
>>> Weill Cornell Medical College
>>> Harel Weinstein Lab
>>> Department of Physiology and Biophysics
>>> 1300 York Avenue, Room D-501
>>> New York, NY 10065
>>>
>>>
>>>
>>> On Sep 19, 2013, at 2:48 AM, Francesco Pietra wrote:
>>>
>>> Hello:
>>>> I posed the same question to the vmd forum but probably is to
>>>> the namd forum pertinent for the question.
>>>>
>>>> Thus, I carried out a short T-remd on alanin with 8 replicas,
>>>> as provided by namd_2.9. I used the final namd-provided folded
>>>> pdb for comparison. The script show_replicas.vmd with these
>>>> replicas warned "not converged".
>>>>
>>>> With my peptide, 32 replicas, after 370,000 steps still far
>>>> from convergence, I used, for the vmd comparison, the starting
>>>> unfolded.pdb also as a fake folded.pdb. In this case,
>>>> show_replicas.vmd did not raise any warning. Looking also at
>>>> the code, it seems to me that show_replicas.vmd assumes, as a
>>>> criterion of convergence, the comparison of the various
>>>> replicas with the pdb file that you give as folded peptide in
>>>> the fold.peptide.conf file. That works if your T-remd is just
>>>> devised to check if you are able to reproduce an experimentally
>>>> defined situation.
>>>>
>>>> Suppose instead that your T-remd is devised to search for the
>>>> best conformation, or cluster of conformations, for an
>>>> experimentally undefined peptide. I can imagine many situations
>>>> where the experimental approach is problematic. Then you need a
>>>> real criterion of convergence of yor T-remd, before starting to
>>>> examine the (sorted) same-T replicas. Obviously, a reliable
>>>> criterion of convergence is that what you get must be the same
>>>> from each replica, for example that the average structure is
>>>> the same from all replicas.
>>>>
>>>> Therefore, I got the impression that the warning "not
>>>> converged" raised by show_replicas.vmd is misleading. Is any
>>>> script available to check when any replica gives the same
>>>> answer.
>>>>
>>>> Thanks for advice, even if showing that I am wrong.
>>>>
>>>> francesco pietra
>>>>
>>>
>>>
>>>
>>
>>

This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:21:42 CST