Re: FEP - one long simulation vs many short ones

From: Floris Buelens
Date: Mon Jul 04 2005 - 07:22:55 CDT

Dear Jérôme,

Thanks for the very helpful reply, which answered my
questions and raised many interesting points.

> That's a very interesting question, because as you
> mention, in the case
> of a large distributed computing infrastructure with
> poor or average
> communication performance, running many independent
> FEP windows is
> obviously a very efficient approach. So how
> necessary is it to follow
> the "traditional", sequential protocol?

You speak of large distributed systems - I'm working
with a system of 5 linked 28-CPU blade clusters, with
fast connections within each cluster and 2x1Gb
ethernet between clusters, which is capable of pretty
respectable scaling, probably better than most
researchers today will have available. However, even
with this setup, if I scale to 32 CPUs, the speedup is
at best only about 16-fold. As I mentioned, with 80
independent simulations, 2 CPUs each, I can get out
32ns in 36h. If I tried to run a single big
simulation, with other limitations of my system, the
very best I could hope for in 36h would be ~10ns.
With my numbers, you'll understand I'd take a lot of
convincing to go back to sequential simulations. If we
assume the sampling of a sequential simulation is more
valid, it'd have to be far better to outweigh the
advantage of simply having about 3 times more MD time
to sample.
I could go further and argue that even for high-end
parallel systems, a modest penalty from scaling, say
30%, still means 30% fewer MD steps than if you'd
simulated many independent windows. We need to
consider what we're getting in return for this 30%.
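
As a rough sketch of the arithmetic above (the function names are
made up for illustration; the figures are the ones quoted in this
post, and nothing here is measured independently):

```python
# Back-of-the-envelope comparison of independent FEP windows vs. one
# large parallel run, using only the numbers quoted in this post.
# The helper names are hypothetical, chosen for this sketch.

def parallel_efficiency(speedup, n_cpus):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / n_cpus

def throughput_ratio(independent_ns, sequential_ns):
    """How many times more MD time the independent runs deliver."""
    return independent_ns / sequential_ns

# Quoted: scaling to 32 CPUs gives at best a ~16-fold speedup.
efficiency = parallel_efficiency(16.0, 32)

# Quoted: 80 independent simulations yield 32 ns in total in 36 h,
# versus ~10 ns at best for a single big simulation in the same time.
ns_per_window = 32.0 / 80
ratio = throughput_ratio(32.0, 10.0)

print(f"parallel efficiency at 32 CPUs: {efficiency:.0%}")
print(f"MD time per independent window: {ns_per_window:.1f} ns")
print(f"independent / sequential MD time: {ratio:.1f}x")
```

This only counts nanoseconds of MD; it says nothing about whether
a nanosecond in one protocol samples as well as a nanosecond in the
other, which is the open question.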

> Well, the canonical point of view would probably
> imply that the "drift"
> means that your system is moving to regions of phase
> space
> well-characterized by the Hamiltonian with the
> current value of lambda.
> In other words, the system should be following the
> Hamiltonian.
> You raise a good point there, however: what part of
> the calculated
> hysteresis is due to incomplete sampling or poor
> ensemble overlap, and
> what part results from some slow degrees of freedom
> having drifted
> between the forward and reverse calculations? I
> don't think this
> distinction is clearly made in most papers
> presenting FEP results. I
> have to say, however, that if you come to fear that
> your system could
> drift in a non-physical/non-desired way if you run a
> longer simulation,
> then you somehow distrust your setup or
> forcefield/simulation
> conditions. Then you might want to consider
> carefully whether it is
> worth carrying out free energy calculations at all.
> I mean no offense
> there, that's a question I've already asked myself
> in the past.

A valid point and no offense taken :-) I have no
particular reason to question the stability of my
system. However, I think these concepts of drift and
hysteresis are important to this discussion. I can't
claim much theoretical background on these concepts
and I'm sure these points have been discussed in much
detail in the past. However, my intuition says that
variability between lambda windows measured in
different time frames of a long simulation needn't
necessarily be a quantity that has to be included in
our measurement.
I'd say we could divide what I called "drift" over a
long simulation, starting from a crystal structure,
into two categories. Firstly we have desirable
relaxation of the structure, as it transitions from an
unphysiological X-ray structure to a realistic
dynamic, solvated protein system. Secondly we could
consider some undesirable tendencies that arise from
the differences between the formulation of the
Hamiltonian and the chemical physics of the real
system - the approximations of the empirical force
field as well as countless all-but-impossible-to-model
factors such as ionic strength, pH, cofactors...
The very nature of our method means we can't separate
these two categories of behaviour. However, it is
inevitable that over the many nanoseconds essential
for a protein FEP simulation, the balance between
these behaviours gradually shifts.
To cut a long story short, what I referred to as
"drift" was my intuition that, say, 10ns into a long
simulation, the dE average for any given lambda will
be sampled in a different region of phase space than
if the same window was measured, for example, 2ns in.
I don't claim that this difference in sampling is
automatically undesirable; as you imply, we may
strengthen our data if we can include the relaxation
of slow degrees of freedom, including those that are
only negligibly affected by lambda. However, I return
to my question from before - what do we get in
exchange for our minimum 30% scaling penalty? If the
relaxation of slow degrees of freedom is put forward
as an advantage, one would still have to show that
this sampling advantage outweighs the instant 30%
reduction (or in my case >60% reduction) in MD time
that we have to sacrifice straight away.
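
One way to put a number on "outweighs" is sketched below. It
assumes, purely for illustration, that sampling quality can be
summarised as a single per-nanosecond efficiency factor, which is a
strong simplification; the function names are hypothetical.

```python
# Break-even estimate: how much more effective per nanosecond would
# sequential sampling have to be to make up for the MD time lost to
# parallel scaling? Assumes (for illustration only) that sampling
# quality scales as (MD time) x (a single efficiency factor).

def md_time_lost(independent_ns, sequential_ns):
    """Fraction of MD time given up by running one big simulation."""
    return 1.0 - sequential_ns / independent_ns

def break_even_gain(fraction_lost):
    """Per-ns efficiency gain needed to offset the lost MD time."""
    if not 0.0 <= fraction_lost < 1.0:
        raise ValueError("fraction_lost must be in [0, 1)")
    return 1.0 / (1.0 - fraction_lost)

# Hypothetical high-end machine with a 30% scaling penalty.
gain_30 = break_even_gain(0.30)

# The figures quoted earlier in this thread: 32 ns vs ~10 ns in 36 h.
lost_here = md_time_lost(32.0, 10.0)
gain_here = break_even_gain(lost_here)

print(f"30% penalty: sequential must be {gain_30:.2f}x better per ns")
print(f"this setup: {lost_here:.0%} of MD time lost, "
      f"break-even at {gain_here:.1f}x per ns")
```

With the 32 ns vs ~10 ns figures, the lost fraction comes out a
little under 70%, consistent with the ">60%" quoted above.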

Sorry for such a long post, I hope you'll find time to
give your thoughts. I'm very grateful for the insights
I've gained so far, this will make excellent material
for my imminent thesis write-up...
Best regards,

Floris Buelens
Crystallography, Birkbeck College

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:40:55 CST