Re: Sampling frequency

From: Jeff Comer (jeffcomer_at_gmail.com)
Date: Tue Sep 01 2015 - 08:40:14 CDT

Hi Eddie,

This answer might be too simplistic, but in general you want to sample with
a period on the order of the correlation time of the process you're
studying. This correlation time can be quantified by computing the
autocorrelation function of the quantity of interest. If your samples are
taken with a period much shorter than the correlation time, they contain
redundant information and are occupying more disk space than is necessary.
If your samples are taken at a period much longer than the correlation
time, then you are throwing away perfectly good samples and doing more
computation than necessary to get a precise average.
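
As a rough illustration of the autocorrelation approach described above, here
is a minimal Python sketch. It is not a definitive recipe: the 1/e threshold is
just one common convention for defining a correlation time, the function names
are made up for this example, and the AR(1) toy series only stands in for a
real observable taken from a short, finely sampled pilot run.

```python
import math
import random


def autocorrelation(series, max_lag):
    """Return the normalized autocorrelation C(k) for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    acf = []
    for lag in range(max_lag + 1):
        # Average the lagged product over the n - lag available pairs.
        cov = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
        acf.append(cov / var)
    return acf


def correlation_time(acf):
    """First lag (in samples) where C(k) drops below 1/e, or None if it never does."""
    threshold = 1.0 / math.e
    for lag, c in enumerate(acf):
        if c < threshold:
            return lag
    return None


def synthetic_series(n, phi, seed=0):
    """Toy AR(1) data, x[t] = phi * x[t-1] + noise (an assumption, not MD output)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


series = synthetic_series(20000, 0.9)
acf = autocorrelation(series, 50)
tau = correlation_time(acf)
print("correlation time ~", tau, "samples")
```

Multiplying the result by the time between samples gives the correlation time
in physical units. The estimate is only meaningful if the input series was
itself sampled more finely than the correlation time.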

The correlation time is a guide for doing things like calculating the mean
number of water molecules inside of a cavity (you want a dcdFreq just a bit
shorter than the typical time for molecules to enter and leave the cavity,
which could be ~ns) or calculating the mean force on a solute molecule in
solution (for small solutes the force is correlated on the ~50 fs
timescale).
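
To make the two timescales above concrete, the short Python sketch below
converts a correlation time into an output period in MD steps. The 2 fs
timestep is an assumption for illustration (use whatever your simulation
uses), and the helper name is hypothetical.

```python
def sample_period_in_steps(correlation_time_fs, timestep_fs):
    """Number of MD steps spanning one correlation time (at least 1)."""
    return max(1, round(correlation_time_fs / timestep_fs))


TIMESTEP_FS = 2.0  # assumed timestep; not stated in the original message

# Water exchange in a cavity, correlation time ~1 ns = 1e6 fs
print(sample_period_in_steps(1.0e6, TIMESTEP_FS))  # 500000

# Force on a small solute, correlation time ~50 fs
print(sample_period_in_steps(50.0, TIMESTEP_FS))   # 25
```

The five orders of magnitude between the two results show why no single
output period suits every observable.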

Practically speaking, it sometimes happens to me that I use a dcdFreq of,
say, 20000, to save disk space, but later I realize that I want to analyze
something that takes place on a faster timescale. So it's often best to
choose a dcdFreq lower than what you think you will need (that is, to write
frames more often than you expect to need). One option is to
sample just one or a few particular variables with high time resolution,
using Colvars and a small colvarsTrajFrequency. Of course if you later
decide you're interested in another variable, you have to go to the dcd.
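
The disk-space side of this trade-off can be estimated with a few lines of
Python. This is only a back-of-the-envelope sketch: it assumes each DCD frame
costs about 12 bytes per atom (three single-precision coordinates) and ignores
the file header and unit-cell records. The system size and run length are
made-up example values.

```python
def dcd_size_gb(n_atoms, total_steps, dcd_freq):
    """Approximate DCD trajectory size in GB (1e9 bytes)."""
    frames = total_steps // dcd_freq
    bytes_per_frame = 12 * n_atoms  # 3 coordinates x 4-byte floats per atom
    return frames * bytes_per_frame / 1e9


N_ATOMS = 100_000         # hypothetical system size
TOTAL_STEPS = 50_000_000  # e.g. 100 ns at an assumed 2 fs timestep

for freq in (20000, 2000):
    print(freq, dcd_size_gb(N_ATOMS, TOTAL_STEPS, freq), "GB")
```

Under these assumptions, lowering dcdFreq by a factor of ten raises the
trajectory size by the same factor (here from about 3 GB to about 30 GB).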

Regards,
Jeff

–––––––––––––––––––––––––––––––––––———————
Jeffrey Comer, PhD
Assistant Professor
Institute of Computational Comparative Medicine
Nanotechnology Innovation Center of Kansas State
Kansas State University
Office: P-213 Mosier Hall
Phone: 785-532-6311

On Mon, Aug 31, 2015 at 2:02 PM, Dr. Eddie <eackad_at_gmail.com> wrote:

> Hello all,
> I have a simple question which I know does not have a simple answer: what
> is the largest sampling frequency (dcd output) I need?
>
> I'm looking for any theoretical work on how to know when: 1) the system
> has converged 2) the sampling rate of the system is sufficiently small to
> NOT alias too much information?
>
> I understand in work with conformational changes in the protein this is
> not an issue. However, for those working on studying the steady-state
> behaviour, how do you tell when your system has a "good" sampling of the
> conformational space so you can extract statistically relevant information
> from it? My goal is to understand this for regular MD so that I can use
> replica exchange which (hopefully) will give the same information in less
> time.
>
> Any references or help would be very helpful!
> Thanks,
> Eddie
>
