 
 
 
 
 
   This section contains exercises that will let you analyze properties of a system in equilibrium. The exercises mostly use pre-equilibrated trajectories from different ensembles to calculate properties of the ubiquitin system.
![\framebox[\textwidth]{
\begin{minipage}[r]{0.9\textwidth}
\noindent{\textbf{Ob...
...ith each residue colored according to its average RMSD value.}
\end{minipage} }](img68.png) 
You will use a script within VMD that will allow you to compute the average RMSD of each residue in your protein, and assign this value to the User field for each residue.
![\framebox[\textwidth]{
\begin{minipage}{.2\textwidth}
\includegraphics[width=2...
... contain pertinent PDB information!) like the User field can.}
\end{minipage} }](img69.png) 
In the previous unit, you had VMD open with the equilibration trajectory of the ubiquitin periodic system loaded. You will repeat this process:
 New Molecule... menu item. Browse for and Load the file ubq_wb.psf in the directory common.
 New Molecule... menu item. Browse for and Load the file ubq_wb.psf in the directory common.   
| source residue_rmsd.tcl | 
This command itself does not actually perform any calculations. It reads the script residue_rmsd.tcl which contains a procedure called rmsd_residue_over_time.  Calling this procedure will calculate the average RMSD for each residue you select over all the frames in a trajectory. The procedure is called as:
  
  
| rmsd_residue_over_time <mol> <sel_resid> | 
| set sel_resid [[atomselect top "protein and alpha"] get resid] | 
 -carbons in the protein. Since there is just one and only one
-carbons in the protein. Since there is just one and only one  -carbon per residue, it is a good option. The command will create a list of residue numbers in the variable $sel_resid.
-carbon per residue, it is a good option. The command will create a list of residue numbers in the variable $sel_resid. 
| rmsd_residue_over_time top $sel_resid | 
The procedure also sets the value of the User field of all the atoms of the residues in the selection to the computed RMSD value. You will now color the protein according to this value. You will be able to recognize which residues are free to move more and which ones move less during the equilibration.
In order to use these new values to learn about the mobility of the residues, you will create a representation of the protein colored by User value.   
 Representations... menu item.
 Representations... menu item. 
You should now see the protein colored according to average RMSD values. The residues displayed in blue are more mobile while the ones in red move less. This is somewhat counterintuitive, so you will change the color scale.
 Colors
 Colors menu item. Choose the Color Scale tab. In the Method pull-down menu, choose BGR. Your figure should now look similar to Fig. 9.
 menu item. Choose the Color Scale tab. In the Method pull-down menu, choose BGR. Your figure should now look similar to Fig. 9. 
|   | 
 Programs and finding your version of Excel.
 Programs and finding your version of Excel. 
 Open
 Open , navigate to the directory 2-1-rmsd/, and choose the file residue_rmsd.dat. Make sure the dropdown menu Files of type is set to All Files.
, navigate to the directory 2-1-rmsd/, and choose the file residue_rmsd.dat. Make sure the dropdown menu Files of type is set to All Files. 
 Chart
 Chart , choosing XY (Scatter) as your Chart Type, selecting the last Chart sub-type in the lower-right, and clicking Finish. Excel should plot Column A on the x-axis and B on the y-axis, and the plot of RMSD data for each residue will appear.
, choosing XY (Scatter) as your Chart Type, selecting the last Chart sub-type in the lower-right, and clicking Finish. Excel should plot Column A on the x-axis and B on the y-axis, and the plot of RMSD data for each residue will appear. 
Take a look at the RMSD distribution. You can see regions where a set of residues shows less mobility. Compare with the structural feature of your protein. You will find a correlation between the location of these regions and secondary structure features like  helices or
 helices or  sheets.
 sheets.  
 Delete Molecule.
 Delete Molecule. 
![\framebox[\textwidth]{
\begin{minipage}[r]{0.9\textwidth}
\noindent{\textbf{Ob...
...esponds to the Maxwell distribution for a given temperature. }
\end{minipage} }](img70.png) 
For this exercise, you need a file that contains the velocities of every atom for one instant of the simulation. You will use a file corresponding to the last frame of an equilibration of ubiquitin. The simulation that generated this file was run exactly as the one you performed in directory 1-3-box/, except that the time step used here was 1fs. You can compare the configuration file used for that simulation to the one in the example-output/ of this directory. As discussed in the first unit, one should use rigidBonds for water in any production simulation since water molecules have been parametrized as rigid molecules. But for the illustration of Maxwell-Boltzmann energy distribution, the effect of force field can be negligible. To be simple, we are not using the option rigidBonds. Here, we provide you with the file ubq_wb_eq_1fs.restart.vel. Alternatively, you can use any velocity file, as long as constraints were not used in the simulation.
 New Molecule... menu item. Browse for and Load the file ubq_wb.psf in the directory common.
 New Molecule... menu item. Browse for and Load the file ubq_wb.psf in the directory common. 
The molecule you loaded looks terrible! That is because VMD is reading the velocities as if they were coordinates. It does not matter how it looks, because what we want is to make VMD write a useful file: one that contains the masses and velocities for every atom.
| set all [atomselect top all] | 
| set fil [open energy.dat w] | 
 , and write it to a file
, and write it to a file
| foreach m [$all get mass] v [$all get {x y z}] { | |
| puts $fil [expr 0.5 * $m * [vecdot $v $v] ] | |
| } | 
| close $fil | 
| 1.26764386127 | |
| 0.189726622306 | |
| 0.268275447408 | |
| ... | 
| source get_energy.tcl | 
You now have a file of raw data that you can use to fit the obtained energy distribution to the Maxwell-Boltzmann distribution. You can obtain the temperature of the distribution from that fit. In the rest of this section, we will show how to do this with Excel. If you are more familiar with another tool, you may use it instead.
 Open
 Open , navigate to the directory 2-2-maxwell/, and choose the file energy.dat. Make sure the dropdown menu Files of type is set to All Files.
, navigate to the directory 2-2-maxwell/, and choose the file energy.dat. Make sure the dropdown menu Files of type is set to All Files. 
We will construct a histogram plot of the energy values and fit a Maxwell-Boltzmann distribution to it to find the average temperature of your simulations.
 Data Analysis
 Data Analysis . If that option is not present, you will need to add in the Data Analysis tool by clicking Tools
. If that option is not present, you will need to add in the Data Analysis tool by clicking Tools  Add Ins
 Add Ins and selecting Analysis ToolPak from the list.
 and selecting Analysis ToolPak from the list. 
| Input Range: $A$1:$A$7051 | |
| Bin Range: $B$1:$B$71 | 
To obtain the normalized frequencies, we will obtain the area under our histogram and divide each frequency by it.
 Chart
 Chart , choosing Column as your Chart Type, selecting the first Chart sub-type in the upper-left, and clicking Next.
, choosing Column as your Chart Type, selecting the first Chart sub-type in the upper-left, and clicking Next. 
 , clicking the Options tab, and setting the Gap width: to 0.
, clicking the Options tab, and setting the Gap width: to 0. 
Now, you will fit a Maxwell-Boltzmann distribution to this plot and find out the temperature that the distribution corresponds to. (Hopefully, the temperature you wanted to have in your simulation!)
 
| = (2 / SQRT(pi() * G$2  3)) * SQRT(A2) * EXP(-A2 / G$2) | 
 in units of kcal/mol. (These are the energy units that NAMD uses.)
 in units of kcal/mol. (These are the energy units that NAMD uses.) 
 . Click the Series tab and add a series with Values: in Column F, Rows 2-72. Click OK.
. Click the Series tab and add a series with Values: in Column F, Rows 2-72. Click OK. 
 , choose Line as your Chart Type, and choose the first Chart sub-type in the upper-left. Click OK.
, choose Line as your Chart Type, and choose the first Chart sub-type in the upper-left. Click OK. 
You may want to change the line weight of your Maxwell-Boltzmann fit by right-clicking the line in your histogram and selecting Format Data Series .
.  
We have the Maxwell-Boltzmann expression plotted, but the fit is not optimal. Indeed, if we change the value for  in cell G2, we will see the fit change. We will now determine the value of
 in cell G2, we will see the fit change. We will now determine the value of  that yields the optimal fit.
 that yields the optimal fit.  
 2. Then click and hold the lower-right corner of cell H2 and drag down to row 72 to fill in the error values for each bin.
2. Then click and hold the lower-right corner of cell H2 and drag down to row 72 to fill in the error values for each bin. 
 Solver
 Solver . If that option is not present, you will need to add in the Solver tool by clicking Tools
. If that option is not present, you will need to add in the Solver tool by clicking Tools  Add Ins
 Add Ins and selecting Solver Add-in from the list.
 and selecting Solver Add-in from the list. 
| Set Target Cell: $I$2 | |
| Equal To: Min | |
| By Changing Cells: $G$2 | 
 (cell G2) which optimally fits the Maxwell-Boltzmann distribution to our raw energy data. Your plot should look like Fig. 10. Obtain the temperature
 (cell G2) which optimally fits the Maxwell-Boltzmann distribution to our raw energy data. Your plot should look like Fig. 10. Obtain the temperature  for this distribution with
 for this distribution with 
 . Your result is hopefully close to
. Your result is hopefully close to  K!
K! 
The deviation from your value and the expected one is due to the poor sampling as well as to the finite size of the protein; the latter implies that temperature can be ``measured'' only with an accuracy of about  even in case of perfect sampling.
 even in case of perfect sampling.  
![\framebox[\textwidth]{
\begin{minipage}[r]{0.9\textwidth}
\noindent{\textbf{Ob...
...different internal energies) over the course of a simulation.}
\end{minipage} }](img80.png) 
You will look at the kinetic energy and all internal energies: bonded (bonds, angles and dihedrals) and non-bonded (electrostatic, van der Waals) over the course of the water box simulation which you ran in the previous unit. This section will introduce you to a tcl script called namdstats.tcl which is very useful for calculating average energy data for a simulation or extracting energy data over time.
We will investigate the simulation you ran in directory ../1-3-box/. Namdstats.tcl will search all the information in the NAMD output file and extract the energy data for you. You should investigate it with your text editor to familiarize yourself with it.
 Programs
 Programs  VMD.
 VMD. 
| source namdstats.tcl | |
| data_avg ../1-3-box/ubq_wb_eq.log 101 last | 
 
| BOND: | 232.561128 | 
| ANGLE: | 690.805988 | 
| DIHED: | 397.649676 | 
| IMPRP: | 41.250136 | 
| ELECT: | -23674.863868 | 
| VDW: | 1541.108344 | 
| BOUNDARY: | 0.0 | 
| MISC: | 0.0 | 
| KINETIC: | 4579.147764 | 
| TOTAL: | -16192.340832 | 
| TEMP: | 313.429264 | 
| TOTAL2: | -16173.821268 | 
| TOTAL3: | -16174.663916 | 
| TEMPAVG: | 313.140436 | 
| PRESSURE: | -47.513352 | 
| GPRESSURE: | -19.105588 | 
| VOLUME: | 71087.006196 | 
| PRESSAVG: | 3.340104 | 
| GPRESSAVG: | 3.4207 | 
 
The temperature of your system should be close to 310 K, the temperature we specified for the water box simulation. It will not be exact because temperature fluctuations occur due to the finite size of the system. We will investigate these fluctuations in the next section.
![\framebox[\textwidth]{
\begin{minipage}[r]{0.9\textwidth}
\noindent{\textbf{Ob...
...in an NVE ensemble, and analyze the temperature distribution.}
\end{minipage} }](img83.png) 
For this exercise, you need a simulation that is long enough to sample the quantity of interest very well. Here, we need to sample the fluctuations in kinetic energy, and hence in the temperature. Instead of doing such a simulation yourself, we recommend that you use the provided file example-output/ubq-nve.log. However, if you choose to run your own simulation, you can find all the necessary files and instructions below. We recommend that you read the whole section even if you don't perform the simulations.
 
The simulation is performed in the microcanonical ensemble (NVE, i.e., constant number, volume, and energy). The configuration file provided to start the NAMD run is called ubq-nve.conf. The main features of this configuration file are:
 K, the temperature of the water sphere simulation.
K, the temperature of the water sphere simulation.
 
| copy example-output  ubq-nve.log . | 
| data_time TEMP ubq-nve.log | 
In order to plot the data, you can use Excel.
 Open
 Open , navigate to the directory 2-4-temp/, and choose the file TEMP.dat. Make sure the dropdown menu Files of type is set to All Files.
, navigate to the directory 2-4-temp/, and choose the file TEMP.dat. Make sure the dropdown menu Files of type is set to All Files. 
We will construct a histogram plot of the temperature values and fit a Gaussian distribution to it.
 Data Analysis
 Data Analysis . If that option is not present, you will need to add in the Data Analysis tool by clicking Tools
. If that option is not present, you will need to add in the Data Analysis tool by clicking Tools  Add Ins
 Add Ins and selecting Analysis ToolPak from the list.
 and selecting Analysis ToolPak from the list. 
| Input Range: $B$1:$B$10000 | |
| Bin Range: $C$1:$C$121 | 
 Chart
 Chart , choosing Column as your Chart Type, selecting the first Chart sub-type in the upper-left, and clicking Next.
, choosing Column as your Chart Type, selecting the first Chart sub-type in the upper-left, and clicking Next. 
 , clicking the Options tab, and setting the Gap width: to 0.
, clicking the Options tab, and setting the Gap width: to 0. 
Does this distribution look any familiar? Your distribution should look like a Gaussian distribution.
 
| = D$2 * EXP(-((A2-D$3)  2)/D$4) | 
 .
. 
 . Click the Series tab and add a new series with Values: in Column C, Rows 2-122. Click OK.
. Click the Series tab and add a new series with Values: in Column C, Rows 2-122. Click OK. 
 , choose Line as your Chart Type, and choose the first Chart sub-type in the upper-left. Click OK.
, choose Line as your Chart Type, and choose the first Chart sub-type in the upper-left. Click OK. 
You may want to change the line weight of your Gaussian fit by right-clicking the line in your histogram and selecting Format Data Series .
.  
We have the Gaussian plotted, but the fit is not optimal. If we change any of the values for parameters Column D, we will see the fit change. We will now determine the optimal parameters.
 2. Then click and hold the lower-right corner of cell E2 and drag down to row 122 to fill in the error values for each bin.
2. Then click and hold the lower-right corner of cell E2 and drag down to row 122 to fill in the error values for each bin. 
 Solver
 Solver . If that option is not present, you will need to add in the Solver tool by clicking Tools
. If that option is not present, you will need to add in the Solver tool by clicking Tools  Add Ins
 Add Ins and selecting Solver Add-in from the list.
 and selecting Solver Add-in from the list. 
| Set Target Cell: $F$2 | |
| Equal To: Min | |
| By Changing Cells: $D$2:$D$4 | 
| data_avg ubq-nve.log | 
 which can be calculated from your data. Note how it is dependent on the size of the system!
 which can be calculated from your data. Note how it is dependent on the size of the system! 
Take home message: You may note from this example that single proteins, on one hand, behave like infinite thermodynamic ensembles, reproducing the respective mean temperature. On the other hand, they show signs of their finiteness in the form of temperature fluctuations. In general, it is interesting to note that the velocities of the atoms and, therefore, the kinetic energy of the protein serves as its thermometer!
![\framebox[\textwidth]{
\begin{minipage}[r]{0.9\textwidth}
\noindent{\textbf{Objective:} Find the specific heat of ubiquitin.}
\end{minipage} }](img91.png) 
Specific heat is an important property of thermodynamic systems. It is the amount of heat needed to raise the temperature of an object per degree of temperature increase per unit mass, and a sensitive measure of the degree of order in a system. (Near a bistable point, the specific heat becomes large.)
 
For this exercise, you need to carry out a long simulation in the NVT (canonical) ensemble that samples sufficiently the averages 
 and
 and 
 that arise in the definition of
 that arise in the definition of  . To produce the data for the total energy of the molecule over time, you need to work through the steps presented below. This exercise is long and computationally intensive. You may wish to work only through the steps marked with *, that correspond to the analysis of the data. Nevertheless, we encourage you to read through the whole section since it will provide you with tools for future analysis.
. To produce the data for the total energy of the molecule over time, you need to work through the steps presented below. This exercise is long and computationally intensive. You may wish to work only through the steps marked with *, that correspond to the analysis of the data. Nevertheless, we encourage you to read through the whole section since it will provide you with tools for future analysis.  
In order to produce an NVT run, you will restart the equilibration you performed in Unit 1 for the water sphere. All necessary files are present in your directory.
 
 
| copy example-output  ubq-nvt* . | 
You now have an equilibration run of ubiquitin in the NVT ensemble, and you want to look at the fluctuation in the energy of ubiquitin. In order to calculate the specific heat for the protein only (as opposed to the whole system), you need the total energies for the atoms in the protein only, however the NAMD output for the run you just performed contains the energies for all atoms in the system.
You would like NAMD to write the energies for only a part of the system, rather than for the entire system, as is reported in the log file. One way to do this is to run a separate simulation with only the part of the system you are interested in, but this may be unrealistic from a biological standpoint. Fortunately, VMD has a plugin which will call NAMD and allow you to calculate energies for any part of your system that you specify.
| cd ../2-5-spec-heat/ | 
| mol new ../common/ubq_ws.psf | |
| mol addfile ubq-nvt.dcd | 
| mol new ../common/example-output/ubq_ws.psf | |
| mol addfile ubq-nvt.dcd | 
 Analysis
 Analysis  NAMD Energy.
 NAMD Energy. 
 Help...
 Help... 
Now you have a file called myenergy.dat which contains the energy of only the ubiquitin protein atoms in your system. In order to calculate the specific heat of the molecule, we must derive both the average and squared-average total energy of the molecule from this data. We will use a tcl script to do so.
| source average.tcl | 
The script will search your output file for the total energy data for the system, calculate both the average and squared-average total energy over all timesteps, and output the values for you. You should become familiar with the script to fully utilize the power of Tcl scripting in analyzing NAMD output. This script will teach you how to read input files, extract data from them, and perform calculations on that data.
 and
 and  K. What is your result? Try to convert your results to units of specific heat used everyday, i.e.,
K. What is your result? Try to convert your results to units of specific heat used everyday, i.e., 
 using the conversion factor
 using the conversion factor 
 . Note that this units correspond to the specific heat per unit mass. You calculated the fluctuation in the energy of the whole protein, so you must take into account the mass of your protein by dividing your obtained value by
. Note that this units correspond to the specific heat per unit mass. You calculated the fluctuation in the energy of the whole protein, so you must take into account the mass of your protein by dividing your obtained value by 
 . Compare with some specific heats that are given to you in Table 2.
. Compare with some specific heats that are given to you in Table 2. 
| Substance | Specific Heat | 
|  | |
| Human Body(average) | 3470 | 
| Wood | 1700 | 
| Water | 4180 | 
| Alcohol | 2400 | 
| Ice | 2100 | 
| Gold | 130 | 
| Strawberries | 3890 | 
 
 Delete Molecule.
 Delete Molecule. 
 
 
 
 
