next up previous
Next: Building a shape-based coarse-grained Up: Shape-Based Coarse Graining Previous: Coarse-graining an atomic structure

Subsections

Parameterizing SBCG force field

As we mentioned above, the original coarse-graining procedure creates an SBCG parameter file, but the energies associated with various interactions are assigned uniformly and arbitrarily. Refining these interaction parameters to the extent where they realistically reproduce general properties of the original all-atom system is the most important, and perhaps the most difficult, part of the coarse-graining. In this section, we will learn basic techniques for refining the CG parameters, using information about the original static all-atom structure, or an MD simulation of this structure, as an input.

Navigate to the directory 2_parameters/. Most files that serve as input for this section have been prepared at the steps described in the previous section (check the contents of the folder example-input). Remember that the sample CG PDB, PSF, topology and parameter files provided in example-input will not be compatible with your own analogous files, because each run of the SBCG Builder contains stochastic elements, so that the final distribution of CG beads in the SBCG model is never exactly the same.

Non-bonded interaction parameters.

In the SBCG model, non-bonded interactions are represented by the Coulomb and 6-12 Lennard-Jones (LJ) potentials, just as it is commonly done for all-atom simulations. Since each CG bead inherits the mass and charge of the group of atoms it represents, the charges are already assigned, and we do not need to worry about Coulomb interactions. The LJ interactions appear to be a more complex issue. In early applications of the SBCG model, the interaction strength of the LJ potential was set to a uniform value (see Arkhipov et al., Structure, 14:1767, 2006; Biophys. J., 91:4589, 2006). Later (Arkhipov et al., Biophys. J., 95:2806, 2008), the procedure was extended to introduce more specificity for each CG bead. In NAMD, the LJ energy constant $\epsilon_{ij}$ for the pair of beads $i$ and $j$ is computed as $\epsilon_{ij} = \sqrt{\epsilon_i \epsilon_j}$, where $\epsilon_i$ and $\epsilon_j$ are the strengths for each bead. In the SBCG model, the value of $\epsilon_i$ is assigned for each bead $i$ based on the hydrophobic solvent accessible surface area (SASA) for the protein domain represented by the bead,

\begin{displaymath}
\epsilon_i = \epsilon_{max} \left(\frac{SASA^{hphob}_i}{SASA^{tot}_i}\right)^2,
\end{displaymath} (1)

where $SASA^{hphob}_i$ and $SASA^{tot}_i$ are the hydrophobic and total SASA of the domain $i$; $\epsilon_{max}$ is the user-defined constant. SASA for an all-atom domain is computed in the context of the whole protein, i.e., atoms that are at the surface between two domains, but are buried inside the protein, do not contribute to the computed value. The idea behind using the SASA to determine $\epsilon_i$ is to let hydrophobic beads aggregate and hydrophilic beads dissolve in the solvent, which is only implicitly present in SBCG simulations. For a pair of completely hydrophilic beads, $\epsilon_{ij}=0$, in which case the two beads are free to dissociate unless they are bound to other particles; $\epsilon_{ij}$ for two completely hydrophobic beads is $\epsilon_{max}$, which should be tuned to represent hydrophobic aggregation realistically. For 150 atoms per CG bead, an appropriate value is approximately $\epsilon_{max} = 10$kcal/mol.

The formula and parameter values used here (such as $\epsilon_{max} = 10$kcal/mol) were found to be optimal for certain SBCG applications (Arkhipov et al., Biophys. J., 95:2806, 2008). For SBCG applications to significantly different systems, one may need to modify the formula or parameters, or both, which can be done by editing the plugin scripts that are delivered together with VMD.



Figure: GUI for assigning parameters for non-bonded (LJ) interactions for the SBCG model.
\begin{figure}\begin{center}
\includegraphics[width=\linewidth]{figs/gui-LJ}
\end{center}
\end{figure}


Assigning these parameters is easily done with another GUI in the VMD CG tools.

1. Start VMD if you have closed it, and load the reference all-atom monomer PDB aa_ref_monomer.pdb.

2. Start the SBCG LJ GUI in VMD: Extensions $\rightarrow$ Modeling $\rightarrow$ CG Builder $\rightarrow$ Assign Lennard-Jones Params For CG Model From All-Atom.

3. In the field ``Original CG Param File'', provide the path to the CG parameter file, cg_monomer.par, that was created when you ran SBCG Builder. The easiest way is to copy this file directly in the folder you are working in, 2_parameters/. Then, you can just type ``cg_monomer.par'' in the corresponding field in the GUI (make sure that VMD's working directory is 2_parameters/ - you can check it my using command ``pwd'' in the VMD's Tk Console, and if necessary navigate to the right directory using command ``cd''). Otherwise, provide the full path in the GUI's field.

4. For ``All Atom Structure'', choose the molecule aa_ref_monomer.pdb.

5. ``Max energy for LJ well depth'' is the aforementioned $\epsilon_{max}$; set it to 10kcal/mol.

6. The LJ radius $\sigma_{ij}$ for the interaction between beads $i$ and $j$ is obtained in NAMD by default as $\sigma_{ij} = (\sigma_{i} + \sigma_{j})/2$, where $\sigma_{i}$ and $\sigma_{j}$ are radii associated with each atom. In SBCG model, each $\sigma_{i}$ is obtained as a radius of gyration of the groups of atoms represented by the CG bead, increased by a constant value $\Delta\sigma$, which accounts for the fact that each atom has a radius (typically, 1-2Å). This value is chosen by the user in the GUI field ``Addition to LJ radius''. Set $\Delta\sigma = 1$Å.

7. Choose the Output Parameter filename'' to be cg_monomer_updated_LJ.par and hit the ``Assign Parameters'' button.

8. Compare entries for non-bonded interactions in the originally produced parameter file cg_monomer.par with those in the new file, cg_monomer_updated_LJ.par. The LJ radii $\sigma_{i}$ are very similar between the two files, because the original SBCG Builder algorithm uses almost the same procedure as we just employed to assign these values ( $\Delta\sigma = 1$Å in both cases). However, LJ energies in cg_monomer.par are set uniformly to 4kcal/mol by the SBCG Builder, and now they are changed significantly in the updated file, accounting for the choice $\epsilon_{max} = 10$kcal/mol and for the specificity of each CG bead.

\framebox[\textwidth]{
\begin{minipage}{.2\textwidth}
\includegraphics[width=2...
...the correct VMD molecule ID for the all-atom reference file!
}
\end{minipage} }

Obtaining initial guess for bonded interaction parameters from all-atom simulations.

The terms for bonded interactions in the SBCG method are described by potentials $V_{bond}(r) = K_b(L - L_0)^2$ and $V_a(\theta) = K_a(\theta - \theta_0)^2$ for bond length $L$ and angle $\theta$, where $K_b, L_0, K_a$, and $\theta_0$ are the force-field parameters. The procedure for extracting the values for these parameters from an all-atom simulation is based on the concept of the so-called Boltzmann inversion: for each variable $x$ (such as $i$-th bond length $L_i$), one obtains the distribution $\rho(x)$ from the all-atom simulation, and uses the Boltzmann relation $\rho(x) = \rho_0 \exp\left[-V(x)/k_BT\right]$ to obtain $V(x)$. This procedure can be illustrated by the simple example of a one-dimensional harmonic oscillator, with a particle moving along the $x$ coordinate in the potential $V(x) = f(x-x_0)^2$ (note that there is no $1/2$ factor, according to the CHARMM force-field notation). With the system in equilibrium at temperature $T$, the average position $\langle x\rangle$ is equal to $x_0$, and the root mean square deviation of $x$ (RMSD) is given by $\langle x^2 \rangle - \langle x\rangle^2 = k_BT/(2 f)$ ($k_B$ is the Boltzmann constant). Using an MD simulation, one can compute $\langle x\rangle$ and the RMSD, thus obtaining $x_0$ and $f$; note that in this case it is not necessary to compute complete $\rho(x)$. These formulas are used in the SBCG method to obtain an initial guess for $K_b, L_0, K_a$, and $\theta_0$.



Figure: Extracting bonds and angles for the CG model from an all-atom simulation. The GUI is shown on the left. An example of analyzing a CG bond is shown on the right. $L$ is the distance between the centers of mass of all-atom domains (blue and red) that correspond to the two CG beads being analyzed (green).
\begin{figure}\begin{center}
\includegraphics[width=\linewidth]{figs/bonds}
\end{center}
\end{figure}


The procedure consists of computing positions of centers of mass for each all-atom domain that is represented by one CG bead, at each time frame from the all-atom trajectory. Then, for each pair of beads that forms a bond, or triple of beads that form an angle, the distance $L$ or angle $\theta$ between the respective centers of mass is computed and averaged over all time frames; RMSDs are also calculated.

To obtain the initial guess for the bonded CG parameters, we need to perform the following actions.

1. Start the SBCG Bonds GUI in VMD: Extensions $\rightarrow$ Modeling $\rightarrow$ CG Builder $\rightarrow$ Extract Bond/Angle Params of CG Model from All-Atom Simulation.

2. In the GUI, provide the path to the CG PSF and PDB files, cg_monomer-psfgen.psf and cg_monomer-psfgen.pdb. Those are the files that you have created after coarse-graining the BAR domain monomer. For convenience, you may want to copy these files directly into the folder you are working in, 2_parameters/. Then, you can just type, e.g., ``cg_monomer-psfgen.psf'' in the respective field of the GUI. If you want to do this, make sure that VMD's working directory is 2_parameters/, as in the previous example. Otherwise, provide the full path in the GUI's field.

3. The all-atom trajectory is provided: 2_parameters/aa_ref-monomer-noh.dcd. Add this file name to the GUI's field ``AA DCD File''.

4. To save space, the all-atom trajectory contains only heavy protein atoms, i.e., no hydrogen atoms. To match this with a reference PDB file, we need to devoid it of hydrogen atoms as well. This can be easily done in VMD by loading the full reference all-atom PDB, aa_ref-monomer.pdb, which you created at the first stage of the coarse-graining, and saving only non-hydrogen atoms (selection ``noh'') in a new PDB file (using either File $\rightarrow$ Save Coordinates or the Tk Console command writepdb). Call this new file aa_ref-monomer-noh.pdb, and add its name to the corresponding field in the GUI.

5. Set Simulation Temperature to 310K (this is the temperature at which the all-atom simulation was run).

6. Set Output Parameter Filename, and filenames for the bond and angle datafiles, to from-aa.par, from-aa-bonds.dat, and from-aa-angles.dat. The latter two files are for analysis purposes only. They contain the same parameters for bonds and angles that the output parameter file does, but in addition they also contain the corresponding values of average $L$ for each bond, its RMSD, and the same for angles. All these values are conveniently organized in columns, which makes it easy to visualize the values of interest using plotting programs. This may become necessary for further refinement of the parameters (see below).

7. Hit the Extract Parameters button. Since we have not provided a parameter file as an input, the output file from-aa.par does not contain any other information besides the parameters for bonds and angles. To rectify this, just append the entries for the non-bonded parameters from the file cg_monomer_updated_LJ.par to from-aa.par.

\framebox[\textwidth]{
\begin{minipage}{.2\textwidth}
\includegraphics[width=2...
....dcd 310.0 from-aa.par from-aa-bonds.dat from-aa-angles.dat}
}
\end{minipage} }

Iterative refinement of bonded interaction parameters.

The Boltzmann relation for a single variable $x$, which we used in the previous step, holds only if $x$ is an independent variable not affected by other potentials. For a network of bonds, which is often the case for SBCG protein models, the bond lengths and angles are not independent, and when parameters for each of them are derived individually using Boltzmann inversion, the stiffness of the structure may be overestimated. Therefore, the direct Boltzmann inversion can be used only to obtain the initial values for force constants $K_b$ and $K_a$, which need to be further adjusted until the stiffness of the CG model becomes closer to that of the all-atom model. In this section, we will learn an exemplary (and simplified) procedure used to adjust $K_b$ and $K_a$. For the equilibrium bond length $L_0$ and angle $\theta_0$, Boltzmann inversion usually provides a very good guess, so that these parameters do not require any further modification.

1. Let us first run a SBCG simulation using the initial guess for bonds and angles in the parameter file from-aa.par. This will be our iteration 1. We will use NAMD for the simulation. Set up a folder iteration1, similar to that found in 2_parameters/example-output/.

2. In your folder iteration1, create an empty folder output, and a folder input; fill the latter one with the files that you created at previous steps: cg_monomer-psfgen.pdb, cg_monomer-psfgen.psf, and from-aa.par. Copy the NAMD configuration file iteration1.conf from 2_parameters/example-output/iteration1/ to your folder iteration1. Check the contents of the configuration file. You will see that many settings are similar to those used for all-atom simulations, but others are different. For example, the time step used is 100fs, instead of 1fs commonly used for all-atom simulations. These parameters can be, in principle, changed depending on what you want to simulate.

3. Now, use NAMD to run the simulation.

4. We will now use the SBCG Bonds GUI to analyse the SBCG simulation trajectory and obtain RMSDs of bonds and angles, just the same way as we did before for an all-atom trajectory. The SBCG Bonds GUI requires a reference PDB file (see above), where the beta-values of atoms are assigned according to the number of CG bead that the atom corresponds to. Here, the structure we analyze is the same as the SBCG model. Thus, the beta-values for each CG bead in the reference PDB file should be the same as the number of that bead. Note that this number starts from 1, i.e., it is offset by one from the bead index in VMD, which starts from 0. You can copy your CG PDB file cg_monomer-psfgen.pdb to create the reference PDB, cg_monomer-beta.pdb; fill the beta fields in the file using a text editor (see the example in the folder example-input/). Alternatively, you can use the simple script, example-input/beta.tcl, to do this automatically.

5. Employ the SBCG Bonds GUI as you did before for an all-atom simulation. Use cg_monomer-psfgen.psf and cg_monomer-psfgen.pdb for CG PSF and PDB files. For the trajectory to analyze (``AA DCD File''), use the CG trajectory that you just obtained from the simulation: iteration1/output/iteration1.dcd. For the reference file, type the name of the just created CG reference PDB, cg_monomer-beta.pdb. Set Simulation Temperature to 310K. Set Output Parameter Filename, and filenames for the bond and angle datafiles, to iteration1/from-iter1.par, iteration1/from-iter1-bonds.dat, and iteration1/from-iter1-angles.dat. Hit the Extract Parameters button.

6. The extracted parameters are in the folder iteration1. We can ignore the file from-iter1.par. Let us look at the files from-iter1-bonds.dat and from-iter1-angles.dat, which contain columns of data obtained from the CG simulation, in the following order. For the file from-iter1-bonds.dat: name of bead 1, name of bead 2, average distance between the two beads, its RMSD, and the bond spring constant obtained from the RMSD using Boltzmann inversion (see above). For the file from-iter1-angles.dat: name of bead 1, name of bead 2, name of bead 3, average angle between the three beads, its RMSD, and the angle spring constant similarly obtained from the RMSD.

7. We can now compare these data with those obtained from the all-atom simulation using a plotting program (e.g., compare iteration1/from-iter1-bonds.dat with from-aa-bonds.dat). One can see that average bond lengths and angles, which for CG simulation are mainly defined by $L_0$ and $\theta_0$ in the CG parameter file, are essentially the same in the all-atom and CG simulations. Thus, the original assignment of $L_0$ and $\theta_0$ is appropriate, and we do not need to tune them further.

8. We can further compare the RMSDs of bonds and angles, which demonstrate the structure flexibility. However, practically it is more convenient to compare inverse RMSDs, which, in Boltzmann inversion procedure, are proportional to $K_b$ and $K_a$. Thus, we can compare the last columns of the *dat files for bonds and angles obtained from the CG simulation with those from the all-atom simulation. Note again the difference: what we compare is not the parameter values used in the two simulations, but stiffnesses of bonds and angles observed in those simulations.



Figure: Comparison of bond and angle strengths, $K_b$ and $K_a$, obtained using Boltzmann inversion from simulations with successively scaled parameter file entries. Since the Boltzmann procedure is used, the values of $K_b$ and $K_a$ are inversely proportional to the RMSDs of corresponding bond lengths and angles, as observed in simulations. Thus, essentially, here we compare stiffnesses of the structure in simulations with different parameters. Comparing RMSDs directly is not as informative as comparing their inverses, or $K_b$ and $K_a$. Left: bonds; right: angles. Black circles: from all-atom simulation. Red squares: from SBCG, iteration 1. Green diamonds: from SBCG, iteration 2. Blue triangles: from SBCG, iteration 3. The iteration numbers correspond to the files provided in the folder 2_parameters/example-output/.
\begin{figure}\begin{center}
\includegraphics[width=\linewidth]{figs/scaling}
\end{center}
\end{figure}


9. You will see that the protein structure is much stiffer in the SBCG simulation than it was in the all-atom simulation (i.e., bond and angle RMSDs are smaller, or, inverse RMSDs are higher). This is due to the strong interconnection between beads in the SBCG model, which makes the direct Boltzmann inversion procedure not quite adequate, as mentioned above. To overcome this, we need to decrease $K_b$ and $K_a$ in our parameter file. One can change these parameters one-by-one, and run many CG simulations to find the set of values that gives the right stiffness for every bond and angle; this is, however, very tedious, and, for such coarse model as ours, is probably unnecessary. Below, we follow a simpler approach, where $K_b$ and $K_a$ are scaled uniformly over all bonds and angles, to match the structure stiffness roughly.

10. To scale $K_b$ and $K_a$, we will use the Scaling GUI in the SBCG tool set (entitled ``Scale Bond/Angle Spring Constants''). Note that this GUI can be used for any CHARMM-style parameter file, not necessarily for CG only. Remember, the parameter file we want to scale is the one we used for the simulation in iteration 1, i.e., the file from-aa.par. Do not make a mistake of scaling the file iteration1/from-iter1.par; parameters in this file reflect the properties of the CG model in iteration 1, while we are interested in reproducing the properties of the all-atom simulation.



Figure: GUI for scaling bonds and angles in a parameter file.
\begin{figure}\begin{center}
\includegraphics[width=\linewidth]{figs/scaling-gui}
\end{center}
\end{figure}


11. Set the Input Parameter File in the GUI to from-aa.par. Set scaling constants, for example, to 0.3 for both Bond Spring Constant Scaling and Angle Spring Constant Scaling.

12. The problem with the direct Boltzmann inversion is that it usually makes the CG structure too stiff, i.e., bond and angle spring constants are too large. However, we can see that for some of the spring constants that have relatively small values, the results of the all-atom simulation and of the CG one in iteration 1 agree quite well. It is, therefore, counterproductive to scale $K_b$ and $K_a$ for such weak bonds and angles, since that would make them much weaker than we want. Thus, we do not want to apply scaling to the bonds and angles for which $K_b$ and $K_a$ are below certain threshold. This can be achieved by setting the cutoff level for scaling in the GUI, for example, setting the Bond Spring Constant Cutoff to 3.5kcal/(mol Å$^2$) and Angle Spring Constant Cutoff to 170kcal/(mol rad$^2$). You should choose the actual values for scaling and cutoffs according to the agreement between your CG and all-atom simulations.

13. Make a new folder, iteration2, for the new CG simulation. Add the same files and subfolders into this folder as you did for iteration1. The only difference is that we will need to replace the parameter file from-aa.par in folder iteration2/input by the one with scaled $K_b$ and $K_a$. In the GUI, set Output Parameter File to iteration2/input/scaled1.par, and hit the Scale Values button.

14. We can now run the new CG simulation in the folder iteration2. After it is done, analyze the CG trajectory using the SBCG Bonds GUI, as in the step 5 above (you can reuse the SBCG reference PDB file that you created earlier). This procedure will give you the following files in the folder iteration2: from-iter2.par, from-iter2-bonds.dat, and from-iter2-angles.dat. Plot the data from these files and compare them with the data from the all-atom simulation and from iteration 1. The bond and angles stiffness in interaction 2 should be closer to those observed in the all-atom simulation than the stiffness from iteration 1.

15. The scaling steps should be repeated several times, possibly, separately for bonds and angles. Usually, for the next iteration it is convenient to scale the values from the previous iteration, e.g., scale values from the file from-aa.par to obtain scaled1.par, then scale values from scaled1.par to obtain scaled2.par, and so on.

16. The final stiffness of CG bonds and angles does not have to be exactly the same as that from the all-atom simulation. If the average deviation is $\pm$25%, this already can be considered a reasonable agreement for such a coarse model. In the example-output, we performed three iterations. In iteration 1, the original file from-aa.par, with parameters obtained directly from the all-atom simulation, was used. For iteration 2, the values in from-aa.par were scaled as described in steps 11-12, to yield scaled1.par. For iteration 3, the Bond Spring Constant Scaling was set to 1.0, Bond Spring Constant Cutoff to 100.0kcal/(mol Å$^2$), Angle Spring Constant Scaling to 0.7, and Angle Spring Constant Cutoff to 300kcal/(mol rad$^2$), i.e., bonds were not scaled.

17. Now the SBCG protein structure is established and the protein force-field is fully parameterized. We can run production SBCG simulations! Note that, depending on the stiffness of the bonds and angles in the resulting model, the time step for simulations can be larger or smaller. The time step of 100fs, which we used during the parameterization run, can be potentially increased, resulting in the faster simulations.

\framebox[\textwidth]{
\begin{minipage}{.2\textwidth}
\includegraphics[width=2...
...onstants puts cg\_monomer.par 0.3 3.5 0.3 170.0 scaled1.par}
}
\end{minipage} }


next up previous
Next: Building a shape-based coarse-grained Up: Shape-Based Coarse Graining Previous: Coarse-graining an atomic structure
tutorial-l@ks.uiuc.edu