Coarse-graining an atomic structure

We will work with the amphiphysin BAR domain dimer from Drosophila (Peter et al., Science, 303:495, 2004). It is a homodimer, i.e., it consists of two identical monomers. It is natural to employ exactly the same SBCG model for each monomer, which can be done by coarse-graining one monomer first, and then copying the resulting SBCG model and aligning it with the orientation and position of the second monomer. In this section, we will learn how to use SBCG VMD plugins for both steps.

**Figure:** BAR domain homodimer. The two monomers are shown in green and purple. The all-atom structure is shown on the left, and an example of an SBCG structure is shown on the right. Both the all-atom and the SBCG structures are shown from the top and from the side.

Mapping an SBCG structure created for one protein onto other copies of that protein is a common task when coarse-graining large macromolecular assemblies, which often contain multiple copies of one protein. Viral capsids, the protein shells enclosing the genetic material of viruses, are a good example. Most known viral capsids are highly symmetric (e.g., icosahedral) structures composed of multiple copies of a few proteins (see, e.g., Arkhipov et al., Structure, 14:1767, 2006).

Navigate to the directory 1_build_cg_model/. You can examine the whole dimer in VMD using the files dimer.pdb and dimer.psf found there. One monomer is designated as segname P1, and the other as segname P2. You can save each monomer from VMD to a separate PDB file by applying the writepdb command to the atom selections segname P1 and segname P2, and save a PSF using the writepsf command (one PSF file works for either monomer, since they are identical except for the atoms' positions). These files have already been created: see monomer.psf, monomer.pdb, and monomer-2.pdb in the directory 1_build_cg_model/. Note that both dimer.psf and monomer.psf contain information about individual atoms and the bonds between them, but not about angles, dihedrals, etc. The reason is that they were created using the VMD command writepsf (not to be confused with the psfgen command writepsf), and VMD does not store that information. Consequently, these PSF files cannot be used for MD simulations, but they are sufficient for SBCG conversions.
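The splitting of the dimer described above can be sketched in the VMD Tk Console as follows. This is only an illustration, assuming VMD was started in 1_build_cg_model/; the output names prefixed with my_ are hypothetical and were chosen so that the provided files are not overwritten.

```tcl
# Load the dimer: structure (PSF) first, then coordinates (PDB) into the same molecule.
mol new dimer.psf
mol addfile dimer.pdb waitfor all

# First monomer: write coordinates and a PSF (atoms and bonds only, as noted above).
set p1 [atomselect top "segname P1"]
$p1 writepdb my_monomer.pdb   ;# hypothetical output name
$p1 writepsf my_monomer.psf   ;# hypothetical output name

# Second monomer: only coordinates are needed, since one PSF serves both monomers.
set p2 [atomselect top "segname P2"]
$p2 writepdb my_monomer-2.pdb ;# hypothetical output name

# Free the selections once the files are written.
$p1 delete
$p2 delete
```

The resulting files should be equivalent to the provided monomer.psf, monomer.pdb, and monomer-2.pdb, which are the ones used in the rest of this section.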

Coarse-graining of a BAR domain monomer.

1. Start VMD and load the all-atom monomer structure (load monomer.psf and monomer.pdb into the same molecule).

2. Open the CG Builder in VMD (Extensions -> Modeling -> CG Builder), and choose the option ``Create SBCG Model''. This will bring you to the SBCG Builder GUI.

3. Make sure that in the GUI you choose ``Molecule'', and not ``Electron Density Map''. The latter allows one to construct an SBCG model from a density map in Situs or .dx format (such maps can be obtained from cryo-electron microscopy).

5. We do not need to specify the mass of the molecule, since VMD obtains that information from the PSF file you loaded. Without a PSF, VMD guesses atomic masses from the atom names in the PDB file; the guess is usually good, but it can be compromised if the PDB file contains non-conventional atom names. Thus, it is usually better to specify the mass of the molecule if a PSF is not available, and especially if you are working with a density map.

6. Set ``Number of CG Beads'' to 25. This corresponds to approximately 150 atoms per CG bead. Commonly used ratios in SBCG applications are 150 to 500 atoms per CG bead.

**Figure:** SBCG Builder GUI.

**Note:** ... the overall shape of the protein is maintained each time.

7. Once CG bead positions are assigned, the algorithm connects some of them by bonds. By default, a bond between two beads is established if the parts of the protein represented by the two beads are directly connected by the protein backbone (``Determine Bonds From All Atom''). Toggle on the other switch, ``Provide Bond Cutoff'', and set the cutoff value to 18. Now, a bond between two beads will be established if the beads are 18 Å apart or closer. Which of the two options to choose depends on the application. Choosing connectivity according to the protein backbone is more realistic, but for the exercise with BAR domains this choice does not matter much for the end result.

8. Change the ``CG Residue Name'' to ``BAR'', and ``CG Name Prefix'' to ``A''. Names of the CG beads will be ``A1'', ``A2'', ``A3'', and so on, up to ``A25''.

9. Hit the ``Build Coarse Grain Model'' button. Completion of the SBCG algorithm will take a few moments.

10. The main result of running the algorithm is a set of output files written to disk, namely the SBCG topology, parameter, and PDB files, and an all-atom reference PDB file. If you want specific names for those files, change them in the SBCG Builder GUI before hitting the ``Build Coarse Grain Model'' button. The output PDB file containing the newly constructed SBCG model is automatically loaded in VMD as a new molecule, overlapped with the original all-atom model.

**Note:** ... and use ``ps'' instead of ``puts''.

11. Sometimes, the SBCG algorithm does not converge well during the allocated learning steps, or the resulting CG model does not look the way you want. One common problem is that the algorithm did not assign positions to all the beads, in which case one or more beads are left ``empty'' (a warning will appear in the bottom part of the SBCG Builder GUI if this is the case). The simplest solution is to re-run the SBCG Builder, which usually solves such problems immediately.

12. The SBCG output PDB and topology files determine the structure of the coarse-grained protein model. To obtain the complete structure for display in VMD, or for subsequent simulations, we need to make a PSF file. This can be done in the same way as for all-atom structures, namely, using a PSFgen script or the AutoPSF VMD plugin. An example PSFgen script is provided: build.tcl. Just run the following command in the VMD Tk Console: source build.tcl. Note that the script uses the SBCG PDB and topology files you have just created, cg_monomer.pdb and cg_monomer.top; if you did not place these files in the directory where build.tcl is located, you will need to edit build.tcl and specify the correct path to the files. If you choose to employ the AutoPSF plugin (Extensions -> Modeling -> Automatic PSF Builder), remember to delete the default topology file from the list of topologies in the plugin, and add the CG topology file that you created.
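For orientation, a PSFgen script for a single-segment CG model could look roughly like the sketch below. It is not a copy of build.tcl, which remains the script to use in this tutorial. The segment name P1, the first none and last none lines, and the output names cg_monomer-psfgen.psf and cg_monomer-psfgen.pdb are assumptions made for illustration.

```tcl
# Minimal psfgen sketch; run in the VMD Tk Console from the directory
# that contains cg_monomer.top and cg_monomer.pdb.
package require psfgen
resetpsf

# Use the CG topology written by the SBCG Builder instead of an all-atom one.
topology cg_monomer.top

# Build one segment from the CG PDB. The segment name is an assumption.
segment P1 {
    first none   ;# assumption: no terminal patches are needed for CG beads
    last none
    pdb cg_monomer.pdb
}

# Read the bead coordinates for that segment.
coordpdb cg_monomer.pdb P1

# Write the PSF/PDB pair. The output names are assumptions.
writepsf cg_monomer-psfgen.psf
writepdb cg_monomer-psfgen.pdb
```

Unlike the PSF files written by VMD's own writepsf, a PSF produced by psfgen also contains the angle terms that psfgen generates from the bonds, which is what makes it usable for simulations.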

SBCG Builder output files.

Please also note that if you are creating a CG structure using a density map as an input, there is usually much less information available to parameterize the CG force field than when an all-atom structure is used as an input. For a density map, one usually does not know, for example, the charge distribution over the map. One would have to guess and tune most of the CG force-field parameters based on assumptions about, e.g., the structure stiffness, which is usually unknown. In the case of an all-atom structure, many characteristics, e.g., the charge distribution, can be obtained easily. Characteristics such as the structure stiffness can be estimated using all-atom simulations and employed to parameterize the SBCG force field, as shown in the next section.

Mapping the coarse-grained monomer structure onto a different copy of the monomer.

We will now create the CG model for the second BAR domain monomer, which is structurally identical to the CG model of the first monomer, by mapping the first model onto the position and orientation of the second monomer.

1. In the CG Builder window, go back from the SBCG Builder GUI to the main CG menu, by hitting the button ``Back To Previous Screen''.

2. Choose the option ``Map A Previously Generated SBCG Model To An All-Atom Model'', and hit the button Next->.

3. The Mapping GUI requires you to choose, from the list of molecules currently loaded into VMD, the original CG model, the reference all-atom structure, and the all-atom model to map onto. For the Coarse-Grained Molecule, choose cg_monomer.pdb that you have just created (you can load it into VMD or use the copy that was loaded automatically after running the SBCG Builder).

**Figure:** Mapping the SBCG structure of one BAR domain monomer onto the position and orientation of the other monomer. The all-atom structure of the second monomer is shown as a transparent purple surface.

4. For the Reference Molecule, load aa_ref_monomer.pdb, which was created by the SBCG Builder, into VMD and choose it in the Mapping GUI. This so-called reference all-atom PDB file is very important for mapping and for fine-tuning the SBCG parameter file (see next section). The reference PDB file contains the same all-atom structure that was used as an input for the SBCG Builder, but its beta field is filled with numbers that show which CG bead each atom belongs to. Thus, the reference PDB file records the direct mapping of atoms to CG beads for this specific instance of coarse-graining; a reference all-atom PDB file is always created by the SBCG Builder when a new SBCG model is constructed.

5. For the All-Atom Molecule To Map Onto, load into VMD and choose in the Mapping GUI monomer-2.pdb, the all-atom PDB file for the second monomer.

7. Hit the Map Model button. The program will produce the file cg_monomer-2.pdb, which will be automatically loaded into VMD. You can create a PSF/PDB pair for this second monomer's CG model in the same way as for the first monomer. Use the same CG topology file. If you want to use the PSFgen script build.tcl, do not forget to change cg_monomer.pdb to cg_monomer-2.pdb there, and give the output files different names, e.g., cg_monomer-2-psfgen.psf and cg_monomer-2-psfgen.pdb.

**Note:** ... ::cgtools::mapCGMolecule puts 2 1 3 cg_monomer-2.pdb
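To make the role of the reference PDB file more concrete, the sketch below shows one plausible way to picture the mapping: atoms are grouped by the bead index stored in the beta field, and each bead is placed at the center of mass of its atoms in the target structure. This is an assumption made for illustration, not necessarily how the plugin is implemented. The procedure name bead_centers and its input format are hypothetical.

```tcl
# Hypothetical helper: each atom is a list {bead x y z mass}, where bead is
# the index taken from the beta field of the reference PDB and x y z are the
# coordinates of the same atom in the structure to map onto.
# Returns a dict that maps each bead index to its center of mass {x y z}.
proc bead_centers {atoms} {
    set sums [dict create]
    foreach atom $atoms {
        lassign $atom bead x y z m
        if {![dict exists $sums $bead]} {
            dict set sums $bead {0.0 0.0 0.0 0.0}
        }
        # Accumulate mass-weighted coordinates and the total mass per bead.
        lassign [dict get $sums $bead] sx sy sz sm
        dict set sums $bead [list \
            [expr {$sx + $m*$x}] [expr {$sy + $m*$y}] \
            [expr {$sz + $m*$z}] [expr {$sm + $m}]]
    }
    set centers [dict create]
    dict for {bead s} $sums {
        lassign $s sx sy sz sm
        dict set centers $bead [list \
            [expr {$sx/$sm}] [expr {$sy/$sm}] [expr {$sz/$sm}]]
    }
    return $centers
}
```

In VMD, the bead indices could be read with the atom selection call get beta on aa_ref_monomer.pdb, and the coordinates with get {x y z} on monomer-2.pdb, assuming that both files list the atoms in the same order.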