Re: Measuring protein diffusion: results too small?

From: Joshua D. Moore (jdmoore_at_unity.ncsu.edu)
Date: Fri Jun 05 2009 - 21:49:22 CDT

This might be a silly questions, but I didn't see anyone address it.
You mention getting different results with different box sizes (at
least at long times....you said "steeper").

Are you measuring your MSD using wrapped coordinates in the trajectory?

The Langevin thermostat (especially with 5ps^-1 damping coefficient)
should slow down the diffusion, (possibly by an order of magnitude..at
least I've seen some systems in which this is the case, but not bulk
systems), depending on the system. I would also check the COM of your
system to see if it is drifting.

Josh

On Thu, Jun 4, 2009 at 3:56 AM, Stephen Hicks<shicks_at_ccmr.cornell.edu> wrote:
> On Wed, Jun 3, 2009 at 6:53 PM, Axel Kohlmeyer
> <akohlmey_at_cmm.chem.upenn.edu> wrote:
>> On Wed, 2009-06-03 at 17:49 -0400, Stephen Hicks wrote:
>>> Hi,
>>
>> hi stephen,
>
> Thanks for the response!
>
>>> I'm attempting to measure the diffusion constant of a more or less
>>> spherical protein, the C-terminal domain of HIV capsid protein 1AUM.
>>> It has 70 amino acids and I estimate the radius to be about a=1.3nm.
>>> At 310K I estimate the viscosity of water to be anywhere from
>>> eta=.25cP to .7cP (dependent on the model - I understand TIP3P is much
>>> lower than experiment).  So I ran NVT simulations in a large TIP3P
>>> water box (10nm to a side) with a 0.8fs timestep (langevin thermostat,
>>> rigid bonds) and measured D as follows:
>>>
>>> D = <(x(t+t')-x(t'))^2>/6t
>>
>>
>>>
>>> where x is the (3-vector) position of the protein and the <..> are
>>> averaging over times t'.  For the t part, I calculate the average as a
>>> function of t and fit the part near zero to a line to compute the
>>> slope, which is my diffusion constant.  But when I do this, my result
>>> is always on the order of about 3e-7 cm^2/s, while I expect, using
>>
>> actually, it might be better to use the time derivative of your
>> expression or fit a straight line to the large time part (last
>> third or so) of your MSD vs. time data. you definitely don't want
>> to fit so that D(0)=0, because the einstein relation only holds
>> for large t. have a look at, e.g. page 10 and beyond in.
>>
>> http://www.theochem.ruhr-uni-bochum.de/~axel.kohlmeyer/files/talk-trieste2004-water.pdf
>>
>> you also have to factor in system size effects. there should be
>> a paper by gerhard hummer deriving an estimate formula for that,
>> but i forgot exactly where.
>
> That's exactly the problem, though.  I did the same simulation in a
> 5nm and a 10nm box and measured exactly the same diffusion, while I
> expect I should get some difference.  As far as the measurement goes,
> the plot of MSD (of the center of mass of the protein) vs. time looks
> basically like this: at very short times (less than 200fs) it's
> quadratic: the protein is moving more or less ballistically.  At
> longer times (from 200fs to, say, 2ps) the MSD is beautifully linear.
> After about 2ps, the slope falls off a bit, but is still linear up to
> about 60ps (or 1/10 the total simulation time).  Running the
> simulation longer extends this linear section pretty regularly, out to
> about 1/10 the total time.  But the slope does not change, and if I
> plot the results of the 10nm box on top of the results of the 5nm box,
> the curves basically lie on top of one another, with the 5nm a bit
> steeper (!), and both slopes a factor of 10 smaller than what
> Einstein-Stokes predicts.  As far as the long-time limit goes, it's my
> understanding that it just needs to be outside the ballistic regime.
> After that point, the further out you go, the worse the data becomes,
> because there's fewer windows to average over.  If you want to see for
> yourself, http://pages.physics.cornell.edu/~shicks/trajectory-5nm-500ps.dat
> (time in ns; x,y,z in Ĺ) is pretty characteristic of my trajectories.
>
> Or am I misunderstanding the point of the "t->infinity" limit?  Am I
> looking at the wrong timescale?  (i.e. some qualitative change occurs
> at 100ps that I just can't see through the noise, and that's the slope
> I actually need... something similar is shown but not explained in the
> Hummer paper, but only in the presence of a driving force).  Or is the
> fact that I only simulated ~1ns really causing the slope at ~10ps to
> be 10x too small, and consistently so?  (i.e. if I break up my
> simulation into three separate simulations of 1/3 the length, they
> still cluster pretty well at the wrong slope, so it seems like my
> errorbars at ~10ps are sufficiently smaller than the discrepancy from
> what I expect).
>
> Here's what gets me: Hummer's 20bp RNA is more or less the same size
> as my protein, yet they measure 10x the diffusion.  2.3e-6 cm^2/s
> means that in 1ns I should see on average 5Ĺ drift.  In my
> simulations, after 1ns, I never see more than 1-2Ĺ drift - a factor of
> sqrt(10) difference...  so for all these reasons, I'm skeptical that
> I'm analyzing the trajectory wrong, and so I conclude that it must be
> something going wrong in the simulation itself...?
>
> If anybody has any sort of trajectory files exhibiting what looks like
> normal diffusive behavior of a globular protein, I would be much
> obliged if I could have a look at them, since I'm pretty much at my
> wits end in my inability to reproduce what seems to be a pretty
> standard result.
>
> Steve Hicks
> Ph.D. Candidate, Henley Group
> Laboratory of Atomic and Solid State Physics
> Cornell University
>
>

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:52:54 CST