# Re: How to interpret Thermodynamic Integration output

Date: Tue Feb 27 2018 - 12:26:17 CST

Brian and Josh,

Thanks for your replies. As a rough analysis, I plotted the 'non-averaged'
columns vs. timesteps. The plots look noisy for some energy components.
Also, I multiplied each entry of the columns by 10 (since I sampled every
10 timesteps), and summed them up. I believe that is the same as
integration; correct me if I'm wrong. Then I summed the sums of
BOND+ELECT+VDW columns for each group (I guess that is the total DeltaG).
The DeltaG is -10.9 kcal/mol for group 1 and -15.4 kcal/mol for group 2. I
assume in an ideal case they should be equal. If the procedure I explained
is correct, I wonder what is the best way to get rid of noisy data and
having two DeltaG values close together. I assume increasing the number of
lambdas form 16 to 30+ or increasing the length of lambdas from 20ns to
40ns may help.

Thanks and sorry if some of my questions look basic,

On Sun, Feb 25, 2018 at 6:23 PM, Vermaas, Joshua <Joshua.Vermaas_at_nrel.gov>
wrote:

>
> To add to what Brian said, the way I tend to parse the output is in a
> two-step process. In the first step, you are only looking to extract the
> non-averaged columns of the output file, which may be a simple call to awk
> or numpy.loadtxt, depending on your comfort level in bash or python. Taking
> the non-averaged columns let you have some idea of how noisy your averaging
> process needs to be, and lets you rigorously compute error estimates for
> each point. From there, you have the value for dG/dlambda along your lambda
> coordinate, which you can plot/integrate to get your final answer. I would,
> however, NOT recommend a simple trapezoidal rule in general for the
> integration. While it is normally ok for TI, if you ever decide to switch
> to FEP, I find that a spline interpolation deals better with the spiky end
> points I see on occasion, so learning how to integrate an interpolated
> curve is a useful skill to have (see https://docs.scipy.org/doc/
> scipy/reference/generated/scipy.interpolate.BSpline.
> html#scipy.interpolate.BSpline).
>
> -Josh
>
> On 02/25/2018 11:52 AM, Brian Radak wrote:
>
> Yes, this is indeed a slightly undocumented update in the latest version
> (though not much less documented than the old version).
>
> The derivatives are separated by alchemical groups (group 1 is scaled with
> lambda and group 2 is scaled with 1 - lambda, these are specified in the
> input PDB with 1 and -1).
>
> Note also that everything in the alchOutFile is an average. The normal
> output is a local/block average over the last alchOutFreq steps and the AVG
> columns are cumulative averages. There's really no good reason to have
> alchOutFreq as small as you have it - it just hurts performance due to
> frequent I/O.
>
> It is up to the user to decide what they want to do with the output. If
> you are using simple trapezoidal integration, you would have to scale each
> column according to the appropriate deltaLambda increment (which you have
> to compute yourself).
>
> HTH,
> BKR
>
> P.S. Note also that the instantaneous TI derivatives are now also sent to
> stdout according to outputEnergies. You can safely suppress all output to
> alchOutFile by setting alchOutFreq 0 and just use the stdout if you prefer
> - I sincerely doubt you would be able to detect a difference in the
> analysis of any reasonable calculation.
>
> On Fri, Feb 23, 2018 at 7:09 PM, Sadegh Faramarzi Ganjabad <
> wrote:
> Hello all,
>
> I have done a TI calculation, but can't find any tutorial or document
> about how to interpret the output. Here is how my output file starts
>
> #TITITLE: TS BOND1 AVGBOND1 ELECT1
> AVGELECT1 VDW1 AVGVDW1 BOND2 AVGBOND2
> ELECT2 AVGELECT2 VDW2 AVGVDW2
> #NEW TI WINDOW: LAMBDA 0
> #PARTITION 1 SCALING: BOND 0 VDW 0 ELEC 0
> #PARTITION 2 SCALING: BOND 1 VDW 1 ELEC 1
> #CONSTANT TEMPERATURE: 310 K
> TI: 0 19.7158 19.7158 -31.3316
> -31.3316 -6.5855 -6.5855 12.4099 12.4099
> -40.2905 -40.2905 21.8477 21.8477
> TI: 10 19.4744 19.4964 -31.2402
> -31.2486 -6.5801 -6.5806 10.7752 10.9238
> -40.0014 -40.0277 18.3626 18.6795
> TI: 20 24.9347 22.0861 -29.0597
> -30.2062 -6.5748 -6.5778 10.0471 10.5063
> -38.9418 -39.5106 12.0912 15.5422
> TI: 30 32.2653 25.3697 -25.5063
> -28.6901 -6.5840 -6.5798 9.2265 10.0935
> -38.3273 -39.1289 10.4395 13.8962
> TI: 40 42.3413 29.5091 -20.9719
> -26.8077 -6.6005 -6.5848 9.9239 10.0521
> -38.7601 -39.0389 13.5623 13.8147
> TI: 50 58.5311 35.1997 -15.9560
> -24.6799 -6.6079 -6.5894 11.9345 10.4212
> -39.2660 -39.0835 12.5075 13.5584
> TI: 60 80.3605 42.6031 -12.8759
> -22.7448 -6.6090 -6.5926 10.8168 10.4861
> -39.5107 -39.1535 6.7353 12.4399
> TI: 70 102.0040 50.9694 -14.6704
> -21.6076 -6.6234 -6.5969 10.6856 10.5142
> -39.9275 -39.2625 5.4385 11.4538
> TI: 80 119.1639 59.3885 -20.0657
> -21.4172 -6.6514 -6.6037 12.3398 10.7396
> -39.4442 -39.2849 8.3762 11.0738
> TI: 90 130.3328 67.1846 -23.9356
> -21.6940 -6.6710 -6.6111 9.7465 10.6304
> -37.6627 -39.1067 11.6755 11.1399
> #100 STEPS OF EQUILIBRATION AT LAMBDA 0 COMPLETED
> #STARTING COLLECTION OF ENSEMBLE AVERAGE
> TI: 100 127.3925 127.3925 -23.4962
> -23.4962 -6.6485 -6.6485 9.9449 9.9449
> -37.7934 -37.7934 9.3023 9.3023
> TI: 110 123.5767 123.9236 -23.1205
> -23.1546 -6.6329 -6.6343 10.3050 10.2723
> -39.3857 -39.2410 6.6564 6.8969
>
> It seems like BOND1 and BOND2 etc. are for the initial and final values of
> lambda, respectively. If so, is AVGBOND1+AVGELECT1+AVGVDW1 the total TI
> energy for each timestep? then if I sum the total energies of all timesteps
> I will get the total TI for the total transition from lambda=0 to lambda=1?
>