Coarse Graining using the Coarse Grain Builder

CG Builder provides a simple set of tools for transforming structures between coarse grained and all-atom representations.

CG Builder supports two methods of creating a coarse grained model:

  1. A residue-based method, in which two or more particles from an all-atom representation map onto a single "bead". Each bead is placed at the center of mass of the atomic group that defines it. Any atoms not assigned to a CG bead are left alone. Prior to coarse graining, CG bead definitions are read from a file using the format specified below.
  2. A shape-based method, in which a neural network learning algorithm determines the placement of neurons (the CG beads). The bead masses are correlated to the clusters of atoms that the beads represent. The shape-based method can be applied to molecules in PDB form, or to electron density maps for which an all-atom PDB might not even be available.
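
As a minimal sketch of the residue-based placement rule, the Python snippet below computes a bead position as the center of mass of the atoms mapped onto it. The function name center_of_mass and the (mass, x, y, z) tuple layout are illustrative assumptions and are not part of CG Builder.

```python
def center_of_mass(atoms):
    """Return the mass-weighted mean position of (mass, x, y, z) tuples."""
    total_mass = sum(mass for mass, _, _, _ in atoms)
    if total_mass <= 0:
        raise ValueError("total mass must be positive")
    # Weight each coordinate by the atom mass, then divide by the total mass.
    return tuple(
        sum(mass * coord[axis] for mass, *coord in atoms) / total_mass
        for axis in range(3)
    )


# Hypothetical three-atom group (masses in amu, coordinates in angstroms).
group = [
    (15.999, 0.000, 0.000, 0.000),
    (1.008, 0.957, 0.000, 0.000),
    (1.008, -0.240, 0.927, 0.000),
]
bead_position = center_of_mass(group)
print(bead_position)
```

Because the heavy atom dominates the sum, the bead lands close to it, which is the expected behavior for a mass-weighted average.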

The graphical interface is available under Extensions | Modeling | CG Builder from within VMD. Start by choosing the task: convert an all-atom model to a coarse-grained one (and select the method to use), or convert a previously coarse-grained model back to all-atom.

In a typical workflow, the first step is to convert an all-atom representation to a coarse-grained representation.

Residue-based Coarse Graining

The CG Builder requires you to have the appropriate all-atom molecule loaded into VMD. You can then choose the correct VMD molecule from the dropdown list. If you don't have the molecule loaded, you can load it in VMD and then choose it from this dropdown.

You must define the relationship between the desired CG beads and the atoms in your all atom representation. Sample relationships (database files) are given for proteins and for water. If you want to use these, you can just click the 'Add' button next to the file and it will be used for the CG mapping.

You can also create your own bead database definitions, following the instructions given below. Once you have one or more files with custom bead definitions, you can load them into the interface. Browse to each bead definition file and then select 'Add' next to the Browse button. VMD will load the bead definitions, and the number of 'Bead Definitions Currently Loaded' should increase by the expected amount.

The Residue-based CG Builder produces two output files. The first is the revised PDB file reflecting the coarse grained beads instead of all-atom. A sample filename is given based on the molecule name loaded, but you can change it as desired.

To be able to return to an all-atom representation, CG Builder needs a work file called the 'Reverse Coarse Graining File'. Again, a sample filename is given. Be careful not to lose this file, because it will be needed when you want to convert the coarse-grained system back to the all-atom representation.

Shape-based Coarse Graining

Shape-based coarse graining uses a neural network algorithm to learn the best location in which to place the CG beads. The placement is then adjusted (usually slightly) based on the centers of mass of the atomic clusters comprising each bead. A technical description of what the shape-based CG algorithm does is available at the TCBG's Shape-based CG web page.

Shape-based coarse graining can act upon either a molecule loaded in PDB/PSF form, or an electron density map. Choosing the proper option will slightly modify which elements of the form need to be given.

Starting From A Molecule - If you have a molecule, the CG Builder requires you to have the appropriate all-atom molecule loaded into VMD. You can then choose the correct VMD molecule from the dropdown list. If you don't have the molecule loaded, you can load it in VMD and then choose it from this dropdown. In addition, you will want to load an appropriate PSF file for the molecule, as certain values from the PSF file are used for the placement and assignment of properties for the CG beads. Specifically, the PSF file contains the mass and charge of every atom. If the PSF file is not provided, VMD will guess the masses of the atoms (usually, VMD's guess is quite good), but all charges will be assumed to be 0.0. In this case, the CG model will reflect the mass distribution well, but it will not contain any information about the charge distribution.

Starting From An Electron Density Map - If using an electron density map, you will need to choose the location on your disk of a SITUS or .DX file that contains the map.

Mass of the CG model - If you choose to do so, you can specify the desired total mass of the resulting CG model (which will be written to the topology file). When starting with a molecule, this typically won't be necessary, because the molecule contains enough information about the atoms for the mass to be determined properly. You might still want to specify the final mass if, for whatever reason, you want to scale the masses of the CG beads.
When starting with an electron density map, being able to specify the total mass of the CG model is much more useful. After calculating the location of each CG bead and how much of the density map is represented by each bead, the algorithm will take the total mass value that you have provided and scale each CG bead accordingly.
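
The scaling step described above can be illustrated with a short Python sketch. It assumes that each bead carries a raw weight (for example, the amount of density it represents) and that masses are scaled proportionally so they sum to the requested total; the plugin's exact procedure may differ, and the function name scale_bead_masses is hypothetical.

```python
def scale_bead_masses(raw_weights, total_mass):
    """Scale per-bead weights proportionally so that they sum to total_mass."""
    weight_sum = sum(raw_weights)
    if weight_sum <= 0:
        raise ValueError("bead weights must sum to a positive value")
    factor = total_mass / weight_sum
    return [weight * factor for weight in raw_weights]


# Hypothetical example: three beads representing 1, 1 and 2 units of density,
# with a requested total mass of 100 (arbitrary units).
masses = scale_bead_masses([1.0, 1.0, 2.0], 100.0)
print(masses)
```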

Learning Parameters

Number of Beads - Choose the number of beads that you want to create from the original molecule. If you choose a number that is too large, there is a good chance that some beads will be created with no atoms associated with them. If this happens, rerun the model creation with fewer beads. When a molecule is used as input, the plugin defaults to the number of atoms divided by 500. When an electron density map is used, you will need to provide a reasonable value (the default is the number of density points divided by 550).

Number of Learning Steps - By default, this is 200 times the number of desired beads. You can set it to anything that you wish, though.

"Lambda" and "eps" are parameters used by the learning algorithm. The default values for Initial/Final eps and Initial/Final Lambda in the plugin are a reasonable choice that will work in most cases. You can change these values if needed. By default, the initial value for lambda will be 0.2 times the number of desired beads. Other default values are independent of the number of beads.
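
To make the default values concrete, the following Python sketch derives them from the input size using the ratios stated above. The use of integer division, the minimum of one bead, and the function name default_learning_parameters are assumptions made for illustration; the plugin may round differently.

```python
def default_learning_parameters(n_points, from_density_map=False):
    """Estimate default learning parameters from the number of input points.

    n_points is the number of atoms, or the number of density points when
    from_density_map is True.
    """
    divisor = 550 if from_density_map else 500
    # Assumption: round down and never return fewer than one bead.
    n_beads = max(1, n_points // divisor)
    return {
        "n_beads": n_beads,
        "n_steps": 200 * n_beads,         # 200 learning steps per bead
        "initial_lambda": 0.2 * n_beads,  # initial lambda scales with beads
    }


print(default_learning_parameters(50000))
```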

Bond Cutoff - Cut-off distance for establishing bonds between beads (in angstroms).
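
One plausible reading of a distance cutoff is sketched below in Python: every pair of beads closer than the cutoff is bonded. Whether the plugin applies the cutoff in exactly this way (for instance, inclusive or exclusive comparison, or additional criteria) is an assumption, and bonds_within_cutoff is a hypothetical name.

```python
import math
from itertools import combinations


def bonds_within_cutoff(positions, cutoff):
    """Return index pairs (i, j), i < j, of beads no farther apart than cutoff."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(positions), 2)
        if math.dist(a, b) <= cutoff
    ]


# Hypothetical bead positions in angstroms, placed along the x axis.
beads = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (25.0, 0.0, 0.0)]
print(bonds_within_cutoff(beads, 12.0))
```

A larger cutoff produces a more densely connected model, so the value is worth checking against the typical spacing between neighboring beads.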

Frac Cutoff - This parameter (only used when working with density maps) should be a number between zero and one. Regions of the map with density values below this number (times the maximum density value in the current map) will be neglected.
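
The thresholding rule can be sketched in Python as follows. The map is represented here as a flat list of density values, which is a simplification; the function name apply_frac_cutoff is hypothetical.

```python
def apply_frac_cutoff(densities, frac_cutoff):
    """Keep only density values at or above frac_cutoff * max(densities)."""
    if not 0.0 <= frac_cutoff <= 1.0:
        raise ValueError("frac_cutoff must lie between zero and one")
    threshold = frac_cutoff * max(densities)
    # Values below the threshold are neglected.
    return [value for value in densities if value >= threshold]


# Hypothetical density values; with a cutoff of 0.25 the threshold is 0.5.
print(apply_frac_cutoff([0.1, 0.5, 1.0, 2.0], 0.25))
```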

CG Residue Name - Three characters or fewer. This residue name will be printed in the output files.

CG Name Prefix - For naming the CG atoms and types. This should be a single character. If 'A' is used, atom names will be A1, A2, etc.
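
The naming scheme can be illustrated with a small Python sketch. The validation of the prefix length and the function name bead_names are assumptions for illustration only.

```python
def bead_names(prefix, n_beads):
    """Return names made of a one-character prefix and a 1-based counter."""
    if len(prefix) != 1:
        raise ValueError("prefix should be a single character")
    return [f"{prefix}{index}" for index in range(1, n_beads + 1)]


print(bead_names("A", 3))
```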

Output Files - The Coarse-Grained PDB file, the topology file, and the parameter file will contain information about the beads, their locations, connections, charges, etc. The All-Atom Reference PDB (which only applies if you are starting from a molecule and not an electron density map) will be the same as the original molecule, but the beta field for each atom will contain the index number of the bead to which the atom was assigned.
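
As an illustration of how the All-Atom Reference PDB could be used, the Python sketch below groups atom serial numbers by the bead index stored in the beta field. It relies on the standard PDB fixed-column layout (serial number in columns 7-11, beta field in columns 61-66); the function name atoms_per_bead is hypothetical, and the sketch ignores any nonstandard serial numbering used for very large systems.

```python
from collections import defaultdict


def atoms_per_bead(pdb_lines):
    """Group atom serial numbers by the bead index found in the beta column."""
    groups = defaultdict(list)
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip REMARK, CRYST1, END and similar records
        serial = int(line[6:11])                # columns 7-11
        bead_index = round(float(line[60:66]))  # columns 61-66 (beta field)
        groups[bead_index].append(serial)
    return dict(groups)
```

For example, passing the lines of the reference file (read with open(...).read().splitlines()) would return a dictionary mapping each bead index to the atoms assigned to it.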

After building the coarse-grained model, one would normally run psfgen on the coarse-grained structure (using a coarse-grained topology file) and then be ready to begin simulations. This can be done conveniently with the AutoPSF plugin of VMD (Extensions | Modeling | CG Automatic PSF builder).

Reverse Coarse Graining

After running the simulations, you will likely have coarse-grained molecules that you need to convert back to all-atom. The CG Builder plugin currently supports reverse coarse graining for residue-based coarse graining. The coarse-grained molecule that you wish to convert back to all-atom needs to be loaded in VMD and selected as the Coarse-Grained Molecule. In addition, CG Builder needs to have the original all-atom molecule available and loaded into VMD. Select this molecule as the All-Atom Molecule. Specify the work file that was saved in the earlier step as the Rev CG File. The reconstructed all-atom molecule will be saved as a PDB file with the given name.

A simulated annealing run from NAMD will usually need to be run after reconstruction of the all-atom model. The annealing run needs to be run in a specific way, so the CG builder tool can create the proper NAMD configuration files to use. By default the CHARMM parameter file (used by several other VMD plugins) will be used for the config file, but you can alter this as desired. In addition, the PSF filename will be needed for the NAMD simulation.

Residue-Based Coarse Graining Text Interface

To access the plugin via the text interface, the relevant commands for coarse graining a system are:

It is also useful in some CG schemes to include water in the system as it is coarse-grained. To make sure waters are properly assigned to CG beads (if one uses a cgc entry such as the water entry below), they must have consecutive residue numbers, which can be assigned using the function provided for this purpose.

To reverse the coarse-graining procedure, one must first load the original molecule and the coarse-grained timestep that is to be reversed. The relevant commands are then:

CG database file format

The current bead definition database format (which is subject to change) is a single file with a set of one or more bead blocks, each of which starts with a CGBEGIN statement and ends with a CGEND statement. Any lines outside of a CGBEGIN ... CGEND block will be ignored, as will blank lines or lines starting with a # character.

Each line within a bead block contains three whitespace-delimited pieces of information: a residue name, an atom name, and a resid offset (relative to that of the key atom). The first line in the block contains the information that will be applied to the newly created CG bead. The second line of the block is the "key" atom, which will be used to identify clusters of atoms that should be turned into a bead. All subsequent lines are component atoms of the bead. Note that one bead will be created for each key atom found (unless it has been previously incorporated into another bead), but that missing component atoms will be ignored.

The following block is an example CG block for coarse graining of TIP3 water; application of this block would map every cluster of four consecutively numbered TIP3 water residues onto a single CG bead with resname TIP3 and name H2O.

CGBEGIN
TIP3 H2O 0
TIP3 OH2 0
TIP3 H1  0
TIP3 H2  0
TIP3 OH2 1
TIP3 H1  1
TIP3 H2  1
TIP3 OH2 2
TIP3 H1  2
TIP3 H2  2
TIP3 OH2 3
TIP3 H1  3
TIP3 H2  3
CGEND
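
To show how the format described above can be read, here is a minimal Python sketch of a parser. It is not the plugin's own reader: the function name parse_bead_definitions and the dictionary keys are hypothetical, and treating indented # lines as comments is an assumption.

```python
def parse_bead_definitions(text):
    """Parse CGBEGIN ... CGEND blocks into a list of bead dictionaries."""
    beads = []
    block = None  # None while outside a block, a list while inside one
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        if line == "CGBEGIN":
            block = []
        elif line == "CGEND":
            if block is not None:
                if len(block) < 2:
                    raise ValueError("a block needs a bead line and a key atom")
                beads.append({
                    "bead": block[0],         # applied to the new CG bead
                    "key": block[1],          # key atom identifying a cluster
                    "components": block[2:],  # remaining atoms of the bead
                })
            block = None
        elif block is not None:
            resname, atom_name, offset = line.split()
            block.append((resname, atom_name, int(offset)))
        # Lines outside of a block fall through and are ignored.
    return beads
```

Applied to the TIP3 block above, this sketch would yield one definition whose bead entry is ("TIP3", "H2O", 0), whose key atom is ("TIP3", "OH2", 0), and which has eleven component atoms.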