Re: namdEnergy gives significantly different values from those in log file?

From: Peter Freddolino (petefred_at_ks.uiuc.edu)
Date: Wed Nov 26 2008 - 12:10:37 CST

Oops... I completely read past the fact that Steve was including the
kinetic energies. I should have been more clear and said "total
potential energy"; my point was that one should not just take the
potential energy of the protein (or some other component of the system).
Sorry for any confusion that generated; I shouldn't have missed that,
but got caught up in the namdenergy problem of calculating the potential
energies of interactions.

And yes, it is now clear that Steve is properly accumulating microstates
for analysis.

Thanks,
Peter

Jerome Henin wrote:
> Hi everyone,
> Peter: to answer your question about a frame being a "state", here
> Steve is using the word in the sense of microstate.
>
> Steve: since you are interested in statistics in conformation space,
> if I were you I'd try reweighting based on the potential energy only
> (using the target temperature as you did: I don't think the
> instantaneous "temperature" is meaningful). You can always think of
> your MD trajectories as if they were MC.
>
> Jerome
>
> On Wed, Nov 26, 2008 at 1:18 AM, Steven Samuel Plotkin
> <steve_at_physics.ubc.ca> wrote:
>
>> Hi Peter,
>>
>> Yes I am careful to get good equilibrium sampling by writing each "state" to
>> the log and dcd files every 20ps. The reweighting of states should be
>> independent of whatever reduced coordinate I choose, I think. However there
>> are 2 ways to do the reweighting, and I'm not sure which is the appropriate
>> one. One is to use T_real for the current temperature of the state, the
>> other is to use T_targ for the current temperature of the state. I noticed
>> that the state that dominated my statistics had an anomalously high T_real.
>> Here is a segment of the Temperatures T_real (blue) and T_targ (green) vs
>> time:
>>
>> http://www.physics.ubc.ca/~steve/NAMD/Repx/TtargAndTreal900-950.jpg
>>
>> The blue peak in the graph corresponds to the state that dominated my
>> statistics. I think the reweighting is best done with T_targ rather than
>> T_real, which gives a more robust answer. Does that make sense? The T_real's
>> fluctuate about the target temperatures. ( I have to apologize for my
>> confusing notation previously, where I used T_target for the temperature I
>> want to obtain my free energy surface at. )
>>
>> Using T_targ instead of T_real alleviated but did not eliminate the problem.
>> I think I simply need to collect more statistics. I thought 8000 states
>> would be enough for a decent 1-dimensional free energy surface but
>> apparently not in this case (the free E vs RMSD is quite rugged).
>>
>> The NVT simulations seem to be running fine.
>>
>> Thanks for your latest advice-- I do realize this thread has lately had more
>> to do with data analysis than with the workings of NAMD.
>>
>> Cheers,
>>
>> Steve
>>
>>
>> Peter Freddolino wrote:
>>
>> Hi Steve,
>>
>> I'm a little unclear on what you're doing... are you labeling as a "state"
>> each single frame from your trajectory? This is unlikely to give good
>> statistics as you need to average over solvent conformations and such (not
>> to mention local fluctuations in the protein). Or are you defining some set
>> of reduced coordinates to calculate your free energy surface? In the latter
>> case I'd recommend looking over some papers in the literature with complete
>> solutions for applying WHAM-style analysis to replica exchange simulations:
>> J. Phys. Chem. B, 2005, 109 (14), pp 67226731
>> JCTC, 2007, 3, pp 26-41
>>
>> All you should need to do to run at constant volume is turn off
>> langevinPiston and comment out the settings following it.
>>
>> Best,
>> Peter
>>
>> Steven Samuel Plotkin wrote:
>>
>> Hi Peter,
>>
>> I think you are right, and this was why I took the total energies initially,
>> which includes both KE and PE of the whole peptide plus water. This is what
>> governs the original sampling probabilities, so I have discarded the idea of
>> cherry-picking parts of the system. To find the re-weighted probabilities
>> for each state I have been simply reweighting each state by the factor:
>> exp( E_i ( beta_i - beta_targ) )
>> where E_i is the total energy of the state, beta_i is 1 over kB times the
>> temperature at that sample time, and beta_targ is 1 over kB times the target
>> temperature. This should be fine I think, at least for const volume
>> simulations.
>>
>> After doing this re-weighting however, the new exponents in the Boltzmann
>> factors are (when sorted from highest)
>> 1673.3
>> 1657
>> 1638
>> 1623.2
>> 1616.6
>> 1606.1
>> 1596.6
>> .
>> .
>> .
>> so one particular state ends up having all the Boltzmann weight in the
>> partition function (i.e. exp(16.3) more weight than the next weightiest
>> state). I didn't think this was reasonable, it says entropy isnt important
>> in the partition function. What are the solutions for this problem? Perhaps
>> more sampled configurations, but I have 8000 now which I thought should be
>> enough to at least get a rough idea what the potential looks like. Maybe I
>> need a narrower temperature range- the beta_i - beta_targ factors are making
>> the exponents too large. However previous studies of REMD seem to have gone
>> to 500K without difficulty... I must be missing something.
>>
>> I did not add a pressure correction to the replica exchange scripts. I
>> hadn't thought this through carefully enough and didn't know the REMD
>> scripts had not accounted for it. According to the reference you mentioned,
>> there is an extra term in the transition matrix, and an extra term in the
>> probabilities ~ Vi^N exp(-beta P V_i ). I will have to check how large this
>> factor is. I don't think the volume fluctuations should be very large but I
>> could be wrong. Then I would have to rerun my simulations at constant
>> volume. I assume from reading some previous posts on the forum that removing
>> all statements pertaining to pressure and variable volume such as
>>
>> useGroupPressure yes ;# needed for rigidBonds
>> useFlexibleCell no
>> useConstantArea no
>> langevinPiston on
>> langevinPistonTarget 1.01325 ;# in bar -> 1 atm
>> langevinPistonPeriod 100.
>> langevinPistonDecay 50.
>>
>> from the conf file should do the trick.
>>
>>
>> Thanks for any help,
>> Steve
>>
>>
>>
>> Peter Freddolino wrote:
>>
>> By the way, regarding the text below... I do not think that you should be
>> taking the energies of a subset of your system to use in the analysis. You
>> really ought to be using the total energy of your system for doing the type
>> of analysis that you're doing; otherwise you're basically cherry-picking
>> part of the total energy of your system without regard for the system as a
>> whole. There are several variants of WHAM for analyzing replica exchange
>> data in the literature that should be applicable to your problem. Also, the
>> namd replica exchange scripts don't contain a pressure correction (Chem.
>> Phys. Lett. 335:435439.), so you should be running each simulation at
>> constant volume unless you've added it...
>> Best,
>> Peter
>>
>> Steven Samuel Plotkin wrote:
>>
>> Dear NAMD users,
>>
>> I have run namdEnergy on a dcd file, printing out All energies. First I
>> noticed the kinetic energy was not printed, then I compared the "TOTAL"
>> potential energy from namdEnergy with the TOTAL energy minus the KINETIC
>> energy in my log file. The two are quite different -- about 370 kcal/mol on
>> average. I'm confused by this, I'm not sure what is being calculated
>> differently? I've provided links to my dcd, log, namdEnergy settings, and
>> conf file are below. I've also posted a long version of my question below to
>> help put it in context.
>>
>> (I'm using namd 2.6 and vmd 1.8.6).
>>
>> Thanks for any help,
>>
>> Steve P.
>>
>>
>> FILES:
>> log and dcd file:
>> http://www.physics.ubc.ca/~steve/NAMD/Repx/fold_dse2.job0.0.log
>> http://www.physics.ubc.ca/~steve/NAMD/Repx/fold_dse2.job0.0.dcd
>>
>> Conf files:
>> http://www.physics.ubc.ca/~steve/NAMD/Repx/fold_dse2.conf
>> http://www.physics.ubc.ca/~steve/NAMD/Repx/dse2_base.namd
>>
>> Screenshot of my namdEnergy gui, with settings:
>> http://www.physics.ubc.ca/~steve/NAMD/Repx/Screenshot-NAMDEnergy.png
>>
>> xsc file (if needed):
>> http://www.physics.ubc.ca/~steve/NAMD/Repx/fold_dse2.job0.0.xsc
>>
>>
>>
>> /Long version of my question:/
>>
>> I have been able to modify the Alanin replica exchange tcl code to run on a
>> small peptide which has been solvated and is in ionic solution. I am using
>> namd 2.6 on an x86_64 amd cluster. I ran 16 replicas on 8 CPUs across 2
>> nodes of a cluster, with temperatures ranging from 280K to 440K. My conf
>> file has some new information not in the alanin example because of my
>> solvated box: I have cellBasisVectors, wrapWater, wrapAll, PME, and Pressure
>> control options including langevinPiston. I was careful to choose the
>> langevinPistonTemp to be $NEWTEMP in replica_exchange.tcl.
>>
>> I want to construct a free energy profile for the small peptide. However
>> this must be obtained at one (target) temperature, so I reweighted all
>> states by what their Boltzmann weights should have been at the target
>> temperature. This doesn't work for me: because the system includes the
>> peptide plus the whole water box, after re-weighting, one particular state
>> ends up having all the Boltzmann weight in the partition function. I didn't
>> think this was feasible.
>>
>> So then what I must use is the total energy (KE +PE) of the protein itself,
>> which includes self interactions /plus/ interactions with water and ions.
>> Then I can reweight these energies at the target temperature and get a new
>> partition function and free energy profile. I thought namdEnergy would be
>> the tool for this.
>>
>> As a sanity check, I ran namdEnergy on one of my dcd files, printing out
>> 'All' energies. First I noticed the kinetic energy was not printed, then I
>> compared the "TOTAL" potential energy with the TOTAL energy minus the
>> KINETIC energy in my log file. The two are quite different -- about 370
>> kcal/mol on average. However the average total potential energy is about
>> 4460 kcal/mol, so the mean error is about 8%. Am I doing something wrong
>> here? Or is this to be expected?
>>
>> At the end of the day what I believe I want is simply the namdEnergy numbers
>> for selection 1= PROTEIN , selection 2= "everything else", added to the
>> PROTEIN-PROTEIN namdEnergies. However the errors present in my sanity check
>> are giving me pause that the namdEnergies are trustable.
>>
>>
>>
>>
>>
>>
>>

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:48:42 CST