Re: different seeds and protein behavior

From: Aron Broom (broomsday_at_gmail.com)
Date: Thu Jul 19 2012 - 20:21:44 CDT

If the random seed (which sets the initial velocities) is different, or even
if it is the same but the precision of the calculation is insufficient, the
behaviour can differ. If you simply consider an average protein with a delta-G
of folding around -5 kcal/mol, this suggests the folded state is more populated
than the unfolded state by a factor of e^(-dG/RT); at 298 K that is roughly
5000-fold. This
means that if one were to simultaneously watch 5000 such proteins, at any
given time 1 of 5000 would be unfolded, which also means that if you were
to pick 5000 folded copies and watch them for some arbitrary time, on
average, 1 of those 5000 would unfold while you were watching it. So then,
the answer to your question in this case (assuming the forcefield is good)
is that if you ran 5000 different trajectories you would see the protein
unfold in one of them. Of course, how long you would need to simulate for,
in order to get to this equilibrium distribution, from your non-equilibrium
starting point of all folded, depends on the kinetic stability of your
protein.
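
To make the arithmetic above concrete, here is a minimal sketch of the
Boltzmann estimate. The function names are just ones I made up for
illustration, and it assumes a simple two-state (folded/unfolded) picture
with R in kcal/(mol*K):

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def folded_to_unfolded_ratio(delta_g_kcal, temperature_k):
    """Equilibrium population ratio folded/unfolded = exp(-dG/RT).

    delta_g_kcal is the free energy of folding (negative means the
    folded state is favoured). Two-state model, an assumption.
    """
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


def unfolded_fraction(delta_g_kcal, temperature_k):
    """Fraction of copies expected to be unfolded at equilibrium."""
    ratio = folded_to_unfolded_ratio(delta_g_kcal, temperature_k)
    return 1.0 / (1.0 + ratio)


if __name__ == "__main__":
    ratio = folded_to_unfolded_ratio(-5.0, 298.0)
    print(f"folded:unfolded is about {ratio:.0f}:1")
    print(f"unfolded fraction is about {unfolded_fraction(-5.0, 298.0):.1e}")
```

With dG = -5 kcal/mol at 298 K this gives a ratio of a few thousand to one,
which is where the "1 in ~5000" figure comes from.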

So I guess the overall answer is that it depends on the kinetic stability
(height of the energy barriers) around your system. In terms of what
people report, many actually do take the results of tens to thousands of
simulation runs into account. The limiting factor is computational time,
so in the cases you mention where there is just one run, that run probably
took so long that it wasn't feasible to do more. Also, some results
are naturally the average of several simulations, such as with replica
exchange, in which case you needn't be quite as concerned about
reproducibility as you would for a single standard run where you see
something fantastic. As you mention, whatever the case, you should always
check that your variable of interest has converged before deciding that a
run is long enough.
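
As a rough illustration of what checking for convergence could look like,
here is a sketch using block averages of a per-frame observable (an RMSD or
a CA-CA distance, say). It is only a heuristic, not an established
protocol: the function names, the choice of 4 blocks, and the 0.5 Angstrom
tolerance are all assumptions of mine, and the data at the bottom is
synthetic.

```python
import statistics


def block_means(series, n_blocks):
    """Split a time series into n_blocks contiguous, equal-sized blocks
    and return the mean of each block (leftover frames are dropped)."""
    size = len(series) // n_blocks
    if size == 0:
        raise ValueError("series is shorter than n_blocks")
    return [
        statistics.fmean(series[i * size:(i + 1) * size])
        for i in range(n_blocks)
    ]


def looks_converged(series, n_blocks=4, tolerance=0.5):
    """Heuristic: treat the first block as equilibration, then require
    the remaining block means to agree to within `tolerance`."""
    means = block_means(series, n_blocks)[1:]
    return max(means) - min(means) <= tolerance


def runs_agree(runs, tolerance=0.5):
    """Compare independent runs (different seeds) by the mean of the
    second half of each trajectory."""
    second_half_means = [block_means(run, 2)[1] for run in runs]
    return max(second_half_means) - min(second_half_means) <= tolerance


# Synthetic example data (hypothetical values in Angstroms).
run_a = [2.0 + 0.01 * (i % 5) for i in range(100)]  # flat around 2.0
run_b = [3.5 + 0.01 * (i % 5) for i in range(100)]  # flat around 3.5
drifting = [0.02 * i for i in range(100)]           # steadily increasing

if __name__ == "__main__":
    print("run_a converged:", looks_converged(run_a))
    print("drifting converged:", looks_converged(drifting))
    print("run_a and run_b agree:", runs_agree([run_a, run_b]))
```

Note that two runs can each look converged on their own and still disagree
with each other, which is exactly the situation described in the question:
each run may be sitting in a different basin, separated by a barrier that
is not crossed on the simulated timescale.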

~Aron

On Thu, Jul 19, 2012 at 7:29 PM, Dr. Eddie <eackad_at_gmail.com> wrote:

> Hi,
> My colleague and I are having a gentleman's disagreement about MD for
> proteins that I am hoping you experts can solve. If we run MD calculations
> using NAMD (at constant pressure and temperature, with an initial
> minimization of 1600 cycles) with the same parameters but a different seed,
> we can get different "types" of behavior. That is, two runs may find that
> some distance distribution between two residue CAs is centered around some
> value while another two runs are centered at a different distance (in the
> 1-2.5 Angstrom range, not too different). In most papers where people
> report results using NAMD they only show one single run. Presumably this is
> because runs with the same parameters (but a different seed) give
> essentially the same result, right?
>
> We can see this in something as large as the full protein's RMSD. Two runs
> will have an RMSD centered at nearly the same value while two more will
> have theirs differ by ~1-2 Angstroms (for the backbone CAs). It seems that
> at 310 K it is plausible that a protein can have multiple behaviors. Has
> anyone else seen this, or is more time required for the protein to
> stabilize and always give a similar result (RMSD, distance distributions,
> etc.)? The runs I am referring to are done for 16 ns.
>
> Thanks!
> Eddie
>
>

-- 
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
