From: Andrew Dalke (dalke_at_ks.uiuc.edu)
Date: Fri Feb 07 1997 - 14:37:24 CST

  For the people who joined the VMD mailing list during last week, we
(the developers) are writing a weekly issue about some aspect of VMD
we think you all might want to hear about. Each one should cover some
aspect of VMD in detail. We will also include much of this material
in the new version of the documentation, so please tell us if there
are any problems.

  In this week's edition we'll cover a new feature added by popular
request to the 1.1 vesion of VMD; RMSD calculation and best-fit
alignment. We didn't get around to discussing it fully in the manual,
so will probably be your first full explanation of how it works.

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

                        RMSD and Best-Fit Alignments

When you have two similar structures you often want to compare them.
What's the difference between two X-ray structures? How much did the
simulation change during a simulation? To answer these you must first
figure out how to compare two stuctures, which usually means to find
the RMSD.

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
                                Background

  Formally, given N atom positions from structure x and the
corresponding N atoms from structure y with a weighting factor w(i),
the RMSD is defined as:

                       / N \ 1/2
                       | -- 2 |
        RMSD(N; x,y) = | \ w ||x - y || / |
                       | / i i i / |
                       | -- / [sum(wieghts)*N |
                       \ i=1 /

(And I hope no one here is using proportional fonts! :)

Using this equation by itself probably won't give you the answer you
are looking for. Imaging two identical structures offset by some
distance. The RMSD should be 0, but the offset prevents that from
happening.

So what you really want is the minimum RMSD between two given
structures; the best-fit. There are many ways to do this, but for VMD
we implemented the method of Kabsch (Acta Cryst. (1978) A34, 827-828
or see Measure.C). I won't discuss it here other than to say it
computes the transformation needed to move one structure onto another
to minimize the RMSD, in our case, a 4x4 matrix.

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
                        Which Atoms?

So the math is all well and fine, but the problem I ignorned in the
previous section is how to choose which atoms to compare. If you want
to compare all the atoms in both structures, and they both have the
same number of atoms, then the problem is easy -- "N" is everything.
This occurs most often in MD simulations when the only thing different
between two structures are the coordinates.

But what about homologous sequences? In this case, the number of
atoms differ because while the number of residues is the same, the
sidechains have different numbers of atoms. The usual solution is to
determine the RMSD based solely on the backbone atoms or, in some
X-ray structures where only the CAs have been determined, on the CAs.

VMD can use two other methods for fitting. Fitting by heavy atoms
omits the hydrogens, since their positions are not often well
determined. Fitting by "picked atoms" performs only a translation to
bring one atom directly on top of another molecule.

It is about to get more confusing, so let me mention some terminology.
In VMD, a "molecule" means all the atoms from a structure file. The
file may contain multiple molecules under standard chemical usage, but
VMD still thinks of that as one molecule. Instead, those individual
parts are called "fragments," for lack of a better term.

Suppose you have loaded a dimer and want to compare it with another
dimer. It is easy enough to say "compare all the atoms from molecule
1 with all the atoms from molecule 2" but you might also want to
"compare one monomer from molecule 1 with the other monomer in molecule
1." In other words, sometimes you want to use all the atoms in the
molecule, and sometimes just the fragments of interest.

For example, you might want to:

        Compute the RMSD between the backbone atoms of the second
        monomer of molecule 1 with those on the second monomer of
        molecule 3.

  -or-

        Do the best fit of molecule 7 onto molecule 6 using all
        the heavy atoms in the structures.

You can do these with the mouse. In the graphics pop-up menu, go to
the "Fit" item. The bottom half says "Two Molecules" and "Two
Fragments" and means exactly what was discussed above. The subitems
for both are identical, and choose which atoms in the selections are
to be used. Then, to pick the two molecules or fragments, click on
one atom of each.

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
                        Computing RMSD

  As a simple example of computing RMSD, load the same molecule twice.
Press the '8' key (this puts VMD into the "MoveMolecule" pick mode and
lets you use the mouse to change the coordinates of a molecule).
Click and drag one of the molecules away from the other so there is a
space between the two.

By default the fit routines are configured to do the best fit but for
this example you will only compute the RMSD, so pick "Fit -> Print
RMSD." Now go to the pop-up menu and pick "Fit -> Two Molecules ->
All Atoms." Look at the VMD console window and you'll see a new line
was printed to show the current state.

  The mouse should look like a crosshair. Pick one of the atoms in
the first molecule then one of the atoms in the second molecule. The
value of the RMSD will be printed to the screen, for example:

        RMSD between the two molecules is: 1.123410

To stop computing RMSDs, simply change the mouse mode to what you want
to do next, for example, press 'r' to go to the rotate mode.

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
                An Improvement You Can Do

One problem with the current implementation is VMD doesn't really tell
you what is going on. It is pretty hard to tell if you really did
pick an atom as VMD give you no notification. If you want more
information (like we do) go to $VMDDIR/scripts/vmd/fit.tcl, change

    # if there have been two atoms, do the fit

(this is around line 71) to

    puts "Picked atom $vmd_fit_count"
    # if there have been two atoms, do the fit

and change like 78 (or 79) from

        set vmd_fit_count 0

to
        puts "Please pick two atoms"
        set vmd_fit_count 0

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
                Performing a Best-Fit Alignment

As I mentioned, this RMSD is not necessarily the minimum RMSD. You
will have to do the best-fit alignment first. To do this, choose "Fit
-> Do RMSD Fit" from the pop-up menu then choose the selection method
you want to use. For this example, again choose "Fit -> Two Molecules
-> All Atoms."

First pick an atom in the molecule you want to move. Then pick on an
atom in the reference molecule. VMD thinks for a little bit as it
computes and applies the transformation, then you should see the first
molecule on top of the second one. The way to remember which molecule
goes where is "I want to move this thing<click> to that one<click>."
In addition, the final RMSD between the two molecules is printed in
the text window.

To prove to yourself that things work reasonably, you might want to
move a residue to the side using "Pick Item Mode -> MoveResidue" and
do the alignment again. The resultant alignment should show that this
method is reasonable.

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
                        The Text Command

As you may expect by now, these actions can also be done with the VMD
scripting language. If you've read (and understood) the sections in
the manual on atom selections you should be able to follow this next
section, otherwise, skim it just to humor me.

 - - - - - RMSD Computation

There are two atom selections needed to do an RMSD computation, the
list of atoms to compare in both molecules. The first atom of the
first selection is compared to the first atom of the second selection,
fifth to fifth, and so on. The actual order is identical to the order
from the input PDB file.

Once the two selections are made, the RMSD calculation is a matter of
calling the "measure rmsd" function. Here's an example:

        set sel1 [atomselect 0 "backbone"]
        set sel2 [atomselect 1 "backbone"]
        measure rmsd $sel1 $sel2
  Info) 10.403014

This prints the RMSD between the backbone atoms of molecule 0 with
those of molecule 1. You could also using a weighting factor in these
calculations. The best way to understand how to do this is to see
another example:

        set weighted_rmsd [measure rmsd $sel1 $sel2 weight mass]
  Info) 10.403022

In this case, the weight is determined by the mass of each atom.
Actually, the term is really one of the standard keywords available to
an atom selection. Other ones include index and resid (which would
both be rather strange to use) as well as charge, beta and occupancy.
These last terms useful if you want to specify your own values for the
weighting factors.

 - - - - - Computing the Alignment

The best-fit alignment is done in two steps. The first is to compute
the 4x4 matrix transformation that takes one set of coordinates onto
the other. This is done with the "measure fit" command. Assuming the
same selections as before:

        set transformation_matrix [measure fit $sel1 $sel2]
Info) {0.971188 0.00716391 0.238206 -13.2877} {0.0188176 0.994122
-0.106619 3.25415} {-0.23757 0.108029 0.965345 -2.97617} {0.0 0.0 0.0
1.0}

As with the RMSD calculation, you could also add an optional "weight
<keyword>" term on the end.

The next step is to apply the matrix to a set of atoms using the move
command. So far you have two coordinate sets. You might think you
could do something like "$sel1 move $transformation_matrix" to apply
the matrix to all the atoms of that selection. You could, but that's
not the right selection, and it took me a long time to figure that
out.

The thing to recall is that $sel1 is the selection for the backbone
atoms. You really want to move the whole fragment to which it is
attached, or even the whole molecule. (This is where the discussion
earlier comes into play.) So you need to make a third selection
containing all the atoms which are to be moved, and apply the
transformation to those atoms.

        # molecule 0 is the same molecule used for $sel1
        set move_sel [atomselect 0 "all"]
        $move_sel move $transformation_matrix

As a more complicated example, with the mouse you cannot do something
like "align all of molecule 1 with molecule 9 using only the backbone
atoms of residues 4 to 10 in both systems" with the mouse. However,
you can do it with the text command. Here's how:

        # compute the transformation matrix
        set reference_sel [atomselect 9 "backbone and resid 4 to 10"]
        set comparison_sel [atomselect 1 "backbone and resid 4 to 10"]
        set transformation_mat [measure fit $comparison_sel $reference_sel]

        # apply it to all of the molecule 1
        set move_sel [atomselect 1 "all"]
        $move_sel move $transformation_mat

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
                        A Simulation Example Script

  Here's a longer script which you might find useful. The problem is
to compute the RMSD between each timestep of the simulation and the
first frame. Usually in a simulation there is no initial global
velocity, so the center of mass doesn't move, but because of angular
rotations and because of numerical imprecisions that slowly build up,
the script aligns the molecule before computing its RMSD.

        # Prints the RMSD of the protein atoms between each timestep
        # and the first timestep for the given molecule id (default: top)
        proc print_rmsd_through_time {{mol top}} {
                # use frame 0 for the reference
                set reference [atomselect $mol "protein" frame 0]
                # the frame being compared
                set compare [atomselect $mol "protein"]

                set num_steps [molinfo $mol get numframes]
                for {set frame 0} {$frame < $num_steps} {incr frame} {
                        # get the correct frame
                        $compare frame $frame

                        # compute the transformation
                        set trans_mat [measure fit $compare $reference]
                        # do the alignment
                        $compare move $trans_mat
                        # compute the RMSD
                        set rmsd [measure rmsd $compare $reference]
                        # print the RMSD
                        puts "RMSD of $frame is $rmsd"
                }
        }

  To use this, load a molecule with an animation (hopefully you have
one handy; the next release of VMD will contain one). Then run
'print_rmsd_through_time'. Example output is shown here:

vmd > print_rmsd_through_time
RMSD of 0 is 0.000000
RMSD of 1 is 1.060704
RMSD of 2 is 0.977208
RMSD of 3 is 0.881330
RMSD of 4 is 0.795466
RMSD of 5 is 0.676938
RMSD of 6 is 0.563725
RMSD of 7 is 0.423108
RMSD of 8 is 0.335384
RMSD of 9 is 0.488800
RMSD of 10 is 0.675662
RMSD of 11 is 0.749352
 [...]

If you wanted you could do all sorts of things, like graph this value
through time.

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

                        But I Want To Do .... ?

VMD's atom alignment should be able to handle most common tasks, but
there are some it cannot do. If this occurs, you might want to look
at Andrew Martin's ProFit at
http://www.biochem.ucl.ac.uk/~martin/text/ProFit.readme (yes, Andrews
are everywhere!).

 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

  We hope this proved to be helpful and, as always, if you have any
ideas, suggestions, or questions about VMD, send mail to the list or
to us at vmd_at_ks.uiuc.edu.

                                                Andrew Dalke
                                                dalke_at_ks.uiuc.edu