
xMDFF: MDFF for Low-Resolution X-ray Crystallography

Although originally developed for fitting crystal structures into cryo-EM densities, MDFF can also be used to refine structures against low-resolution X-ray crystallographic diffraction data, an approach termed xMDFF. For this purpose, the MDFF protocol is modified to work with densities derived from molecular replacement, which combines the phases $\phi_{calc}$ calculated from a tentative model with the amplitudes $\left\vert F_{obs}\right\vert$ from the X-ray diffraction data. The resulting density is biased by the model, but contains enough information from the $\left\vert F_{obs}\right\vert$ to determine the experimental structure. These densities are generated as $2mF_{obs}-DF_{calc}$ maps using the PHENIX software suite. Because of this reliance on PHENIX, you will need a recent version installed and executable from the command line; it can be obtained from http://www.phenix-online.org/.
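For orientation, a $2mF_{obs}-DF_{calc}$ map is commonly written as the Fourier synthesis below. This is the standard textbook form, not a formula taken from the xMDFF scripts. The symbols $m$ (figure of merit), $D$ (the $\sigma_A$-derived scale factor), $V$ (unit-cell volume) and $\mathbf{h}$ (reflection index) are notation assumed for this illustration.

```latex
\rho(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{h}}
  \left( 2m \left\vert F_{obs}(\mathbf{h}) \right\vert
       - D \left\vert F_{calc}(\mathbf{h}) \right\vert \right)
  e^{i \phi_{calc}(\mathbf{h})} \, e^{-2 \pi i \, \mathbf{h} \cdot \mathbf{r}}
```

The amplitudes carry the experimental information, while every phase comes from the model, which is the source of the model bias.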

Unlike in standard MDFF simulations, the density map in xMDFF changes throughout the course of the refinement. Once the tentative model is fit to the generated density, the xMDFF-fitted structure provides new $\phi_{calc}$ that, together with $\left\vert F_{obs}\right\vert$, are used to regenerate the electron density. The fitted structure is then employed as an updated search model to be driven into the new density map obtained from a molecular replacement procedure, and this process continues iteratively until sufficiently low $R_{\text{free}}$ and $R_{\text{work}}$ values are obtained.
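The convergence criterion can be made concrete with the usual crystallographic R-factor, shown here in its standard form. The scale factor $k$ and the reflection index $\mathbf{h}$ are notation assumed for this illustration.

```latex
R = \frac{\sum_{\mathbf{h}} \Big\vert \left\vert F_{obs}(\mathbf{h}) \right\vert
          - k \left\vert F_{calc}(\mathbf{h}) \right\vert \Big\vert}
         {\sum_{\mathbf{h}} \left\vert F_{obs}(\mathbf{h}) \right\vert}
```

$R_{\text{work}}$ is this sum taken over the reflections used in refinement, while $R_{\text{free}}$ is taken over a small test set of reflections held out of refinement, so a large gap between the two is a common sign of over-fitting.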

Please note that you should at least briefly read through Section 2, or refer back to it as you work through this section of the tutorial. xMDFF shares many of its initial steps with MDFF, which are covered in earlier sections. Overlapping material is presented here in a complete yet more concise form to avoid excessive repetition.

Preparing the initial structure

Normally, xMDFF requires low-resolution experimental reflection data and an initial homology model or predicted structure to begin refinement. For the following example, we will use the open conformation of the D-ribose binding protein (PDB: 1URP) as the initial model and refine it against synthetically created diffraction data from a known closed conformation (PDB: 2DRI) at 5 Å resolution (Fig. 13).

**Figure 13:** The open conformation of the ribose binding protein (PDB: 1URP), colored in cyan, will be used as an initial phasing model to be refined against reflection data from the closed conformation (PDB: 2DRI), colored in red.

The initial steps of xMDFF are identical to those of a standard MDFF simulation as outlined in Section 2, which you should refer to for further information. First we must generate a PSF file to provide NAMD with connectivity and partial atomic charge information.

1: Load the initial structure in VMD by typing:

mol new 1urp-initial.pdb

2: Use the AutoPSF plugin as in Section 2.1. If you are working in the same VMD session from the beginning of the tutorial, make sure you click the Reset AutoPSF button and then choose the correct molecule in the AutoPSF plugin. You should be able to generate the files 1urp-initial_autopsf.psf and 1urp-initial_autopsf.pdb.

As in previous sections, we need to generate secondary structure, chirality, and cispeptide restraints to help prevent overfitting during the simulation.

3: Define restraints for $\phi$ and $\psi$ dihedral angles for amino acid residues in helices or sheets, as well as restraints for hydrogen bonds involving backbone atoms from the same residues:

ssrestraints -psf 1urp-initial_autopsf.psf \
    -pdb 1urp-initial_autopsf.pdb -o 1urp-extrabonds.txt -hbonds

4: Make sure the initial structure generated by AutoPSF is loaded as the top molecule in VMD. If not, you can load it by running:

mol new 1urp-initial_autopsf.psf

mol addfile 1urp-initial_autopsf.pdb

5: Use the cispeptide plugin to restrain peptide bonds to their current cis/trans configuration:

cispeptide restrain -o 1urp-extrabonds-cispeptide.txt

6: Analogously, use the chirality plugin to restrain chiral centers to their current handedness:

chirality restrain -o 1urp-extrabonds-chirality.txt

7: You now need to generate a PDB file containing the per-atom scaling factors in Equation 1. These scaling factors are set to the atomic mass by the mdff gridpdb command in the VMD mdff plugin, which you can load by typing package require mdff in the VMD Tk Console. In xMDFF it is especially useful to couple only certain atoms at different stages of refinement. For example, the initial density maps are usually so poor and noisy that you may want to apply xMDFF forces only to the $\alpha$-carbons or the backbone. This allows greater flexibility in the structure for increased sampling early on and helps avoid over-fitting to noise in the potential. As the refinement progresses, you can couple more atoms (e.g. side chains) to improve the overall fit. As such, you should generate three PDB files here, selecting the $\alpha$-carbons, the backbone, and all protein atoms excluding hydrogens, respectively.

mdff gridpdb -psf 1urp-initial_autopsf.psf \
    -pdb 1urp-initial_autopsf.pdb -o cagrid.pdb -seltext "name CA"

mdff gridpdb -psf 1urp-initial_autopsf.psf \
    -pdb 1urp-initial_autopsf.pdb -o backgrid.pdb -seltext "backbone"

mdff gridpdb -psf 1urp-initial_autopsf.psf \
    -pdb 1urp-initial_autopsf.pdb -o nohgrid.pdb -seltext "protein and noh"
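If you prefer not to type the three commands interactively, they could be collected into a small Tcl batch script. The sketch below writes such a script from the shell. The file name make_grids.tcl is hypothetical, and running mdff gridpdb in VMD's text mode is an assumption, although the vmd -dispdev text -e invocation is the same form the xMDFF template uses later in this section.

```shell
#!/bin/sh
set -e
psf=1urp-initial_autopsf.psf
pdb=1urp-initial_autopsf.pdb
script=make_grids.tcl   # hypothetical file name

{
    echo 'package require mdff'
    # One "output file|selection text" pair per line.
    while IFS='|' read -r out sel; do
        printf 'mdff gridpdb -psf %s -pdb %s -o %s -seltext "%s"\n' \
            "$psf" "$pdb" "$out" "$sel"
    done <<'EOF'
cagrid.pdb|name CA
backgrid.pdb|backbone
nohgrid.pdb|protein and noh
EOF
    echo 'exit'
} > "$script"

cat "$script"
# To run it (requires VMD on the PATH):
#   vmd -dispdev text -e make_grids.tcl
```

Keeping the output names and selections in one table makes it easy to add a further coupling stage later.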

Setting Up NAMD Configuration Files for xMDFF

In this section we will create the NAMD configuration files and Tcl script required to run xMDFF.

1: You must first generate a NAMD configuration file, which is automated by the MDFF plugin in VMD. In the VMD Tk Console window, type mdff setup for usage information.

2: Generate a NAMD configuration file using the command:

mdff setup -o 2dri -psf 1urp-initial_autopsf.psf \
    -pdb 1urp-initial_autopsf.pdb \
    -griddx step1.dx \
    -gridpdb cagrid.pdb \
    -extrab {1urp-extrabonds.txt 1urp-extrabonds-cispeptide.txt \
    1urp-extrabonds-chirality.txt} -gscale 0.1 -numsteps 700000 \
    --xmdff -refs 2DRI.mtz

Most options above specify the names of files you generated in previous steps and should be self-explanatory. As discussed previously, we will initially apply forces only to the $\alpha$-carbons using "cagrid.pdb". The option -gscale defines the scaling factor $\xi$ in Equation 1. In addition to the initial $\alpha$-carbon coupling, we will use a low scaling factor (-gscale 0.1) to decrease the magnitude of the xMDFF forces and allow for greater initial flexibility. Unlike in previous sections, the potential map specified by -griddx does not yet exist; it will be generated automatically during the simulation with the name given. The flag --xmdff causes the plugin to accept xMDFF-specific options and to generate a few additional lines in the NAMD configuration file, which will be discussed next. The -refs option is required for any xMDFF simulation and refers to the file containing the reflection data you wish to refine the structure against. This file can be in either the mtz or cif format. Additional options we are not specifying here will also be discussed next.

3: Open the configuration file 2dri-step1.namd with a text editor. We will be taking a look at some of the options in this file along with xMDFF-specific entries that differ from normal MDFF simulations. The first things you should notice near the beginning of the file are the variables:

set REFINESTEP 20000

set REFS 2DRI.mtz

set BFS 0

set MASK 0

set CRYSTPDB 0

REFINESTEP sets the number of timesteps between regenerations of the density map from the new phases taken from the current structure. A value of 20,000 was found to provide enough time for the structure to fit to the current density. You can increase this number to allow more sampling of each density, or lower it to capture smaller changes in phases at the cost of computational speed. 20,000 is the default value, but this can be changed with the -refsteps option of mdff setup.

REFS indicates the reflection data file as discussed previously.

BFS turns on (1) or off (0) individual ADP (atomic displacement parameter) refinement using PHENIX for the calculation of B-factors. This option makes the refinement take longer every time a map is regenerated and is not required for this example, so we will leave it turned off. This option can be turned on with mdff setup using the --bfs flag.

MASK turns on (1) or off (0) masking the generated density so that only the density immediately around the structure is used during fitting. Masking the density can help remove noise, but is not always needed. This option can be turned on with mdff setup using the --mask flag.

CRYSTPDB gives the name of a text file (can be a PDB) with a properly formatted CRYST1 line with space group and unit cell information. This is not always required as the reflection data file usually contains this information, but in the case that it does not or is improperly specified, you can supply the information with this file. This file can be given with mdff setup using the -crystpdb option.
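If you need to write such a file yourself, note that the CRYST1 record has a fixed column layout in the PDB format. The following is a minimal sketch with hypothetical cell parameters and space group (they are not the 2DRI values, so replace them with those of your crystal) and a hypothetical output name, cryst.pdb.

```shell
#!/bin/sh
# Use the C locale so that "." is the decimal separator.
LC_ALL=C; export LC_ALL

# Hypothetical unit cell, for illustration only.
a=50.000; b=60.000; c=70.000
alpha=90.00; beta=90.00; gamma=90.00
spacegroup="P 21 21 21"
z=4

# CRYST1 layout: a, b, c in columns 7-33 (9.3f each), angles in columns
# 34-54 (7.2f each), space group left-justified in columns 56-66,
# Z value right-justified in columns 67-70.
line=$(printf 'CRYST1%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f %-11s%4d' \
    "$a" "$b" "$c" "$alpha" "$beta" "$gamma" "$spacegroup" "$z")

printf '%s\n' "$line" > cryst.pdb
cat cryst.pdb
```

The resulting file could then be supplied to mdff setup through the -crystpdb option described above.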

4: Now open the file xmdff_template.namd with a text editor. This file contains more NAMD configuration parameters, including xMDFF-specific commands. First take a look at the following command near the bottom of the file:

if {[info exists INPUTNAME]} {
    set env(VMDARG) [list $PSFFILE $INPUTNAME $GRIDFILE $REFS $BFS $MASK $CRYSTPDB $MASKRES $MASKCUTOFF $XMDFFSEL $AVERAGE]
    exec -ignorestderr vmd -dispdev text -e xmdff_phenix.tcl > map.log
} else {
    set env(VMDARG) [list $PSFFILE $PDBFILE $GRIDFILE $REFS $BFS $MASK $CRYSTPDB $MASKRES $MASKCUTOFF $XMDFFSEL $AVERAGE]
    exec -ignorestderr vmd -dispdev text -e xmdff_phenix.tcl > map.log
}

This section marks the beginning of the most significant changes to the standard MDFF NAMD configuration file. This first block generates the initial density map to which the structure will be fit. If you are restarting from a previous run, it will use the last restart coordinates provided by INPUTNAME as the phasing model. If you are beginning from just a PDB, then that will be used instead. The important part of this section of code is the line beginning exec -ignorestderr vmd ..., which runs a Tcl script through VMD to perform the required steps of map generation. This script will be investigated in the next section.

if {$ITEMP != $FTEMP} {
    set ANNEALSTEP [expr abs($FTEMP-$ITEMP)*100]
    run $ANNEALSTEP
    set env(VMDARG) [list $PSFFILE $OUTPUTNAME $GRIDFILE $REFS $BFS $MASK $CRYSTPDB $MASKRES $MASKCUTOFF $XMDFFSEL $AVERAGE]
    exec -ignorestderr vmd -dispdev text -e xmdff_phenix.tcl > map.log

    for {set i 0} {$i < [llength $REFS]} {incr i} {
        set grid [expr [llength $GRIDFILE]-[llength $REFS]+$i]
        reloadGridforceGrid $grid
    }
}

This block of code performs the appropriate number of steps for any simulated annealing you might do, then generates the map just as in the previous block. There is a new command here, reloadGridforceGrid, which tells NAMD to reload the potential map derived from the density we are fitting to (since we just created a new one).

for {set i 0} {$i < $TS/$REFINESTEP} {incr i} {
    run $REFINESTEP
    set env(VMDARG) [list $PSFFILE $OUTPUTNAME $GRIDFILE $REFS $BFS $MASK $CRYSTPDB $MASKRES $MASKCUTOFF $XMDFFSEL $AVERAGE]
    exec -ignorestderr vmd -dispdev text -e xmdff_phenix.tcl > map.log
    for {set j 0} {$j < [llength $REFS]} {incr j} {
        set grid [expr [llength $GRIDFILE]-[llength $REFS]+$j]
        reloadGridforceGrid $grid
    }
}

This is where the standard refinement occurs by running NAMD for the number of specified refinement steps while iteratively re-generating the map each time from the latest restart coordinates.
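To get a feel for how often the map is rebuilt, the loop bounds in the template can be evaluated by hand. The sketch below does this with plain shell arithmetic. It assumes that TS holds the value passed to -numsteps, and it uses hypothetical temperatures (ITEMP 300, FTEMP 0) purely to illustrate the annealing formula. Neither assumption is confirmed by the template excerpts shown here.

```shell
#!/bin/sh
# Values from the tutorial's mdff setup call and generated configuration file.
ts=700000          # -numsteps; assumed to be what the template calls TS
refinestep=20000   # REFINESTEP

# Main loop bound: $TS/$REFINESTEP, evaluated with integer division.
cycles=$((ts / refinestep))
leftover=$((ts % refinestep))

# Annealing stage: ANNEALSTEP = abs(FTEMP - ITEMP) * 100.
itemp=300          # hypothetical initial temperature
ftemp=0            # hypothetical final temperature
diff=$((ftemp - itemp))
if [ "$diff" -lt 0 ]; then
    diff=$((0 - diff))
fi
annealstep=$((diff * 100))

echo "map regenerations in the main loop: $cycles"
echo "steps left over after the last full cycle: $leftover"
echo "annealing steps for $itemp K -> $ftemp K: $annealstep"
```

With the tutorial's settings the map is regenerated 35 times, once every 20,000 steps, with no steps left over.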

5: Using a text editor, open xmdff_phenix.tcl. This file is a Tcl script that is run through VMD to prepare a structure file for map generation and to call PHENIX. There is nothing in this file you should need to modify; however, since it simply makes use of VMD's Tcl scripting capabilities, you can make a wide variety of changes to suit your particular system.

6: Run NAMD using the configuration file generated by VMD, i.e., run the following command in a terminal:

namd2 2dri-step1.namd > 2dri-step1.log

This will begin the NAMD simulation and refinement process. If you do not wish to wait for this simulation to complete, you may use the supplied 2dri-step1_result.dcd file and skip to the analysis in the next section.

Analysis of xMDFF refinements

1: To analyze the results from the first step of the xMDFF refinement, you will need to start VMD and load your structure files and trajectory:
mol new 1urp-initial_autopsf.psf
mol addfile 1urp-initial_autopsf.pdb
mol addfile 2dri-step1.dcd waitfor all
or if you are using the supplied dcd:
mol addfile 2dri-step1_result.dcd waitfor all

2: Plot the backbone RMSD with respect to the initial structure for each trajectory frame using the command

mdff check -rmsd

A window similar to the one depicted in Fig. 14 should appear. Note how the RMSD levels off toward the end of the simulation.

**Figure 14:** RMSD plot of refinement, showing this step has converged.

3: Now plot the backbone RMSD with respect to the target structure using the command

mdff check -rmsd -refpdb 2DRI.pdb

A window similar to the one depicted in Fig. 15 should now appear. Note how the RMSD decreases (the fitting improves) which indicates the refinement of the structure over the course of the simulation.

**Figure 15:** RMSD plot of refinement relative to target structure, indicating refinement.

4: We will also analyze the refined structure using a PHENIX program, phenix.model_vs_data, which provides $R_{\text{work}}$ and $R_{\text{free}}$ values in addition to structural statistics from MolProbity (e.g. the percentage of Ramachandran-favored angles). First we should obtain a PDB of the final frame and refine the B-factors:

In VMD, make sure the molecule containing your refinement trajectory is the "top" molecule, then type in the VMD Tk Console:

set sel [atomselect top "protein and noh"]

$sel frame last

$sel writepdb final.pdb

Open final.pdb with a text editor and delete the first line, which begins with "CRYST1". Additionally, you will have to replace every occurrence of "HSD" in the file with "HIS". Any text editor with a search-and-replace feature can do this; for example, in vi you can type :%s/HSD/HIS/g. Then run PHENIX from a normal shell terminal:

phenix.refine final.pdb 2DRI.mtz refinement.refine.strategy=individual_adp --overwrite
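The manual clean-up of final.pdb that precedes the phenix.refine call could also be scripted. The sketch below uses sed on a two-line stand-in file with made-up coordinates and writes the result to final_fixed.pdb, a hypothetical name. You would either rename that file to final.pdb or pass it to phenix.refine instead.

```shell
#!/bin/sh
set -e
workdir=$(mktemp -d)

# Two-line stand-in for the final.pdb written by VMD (hypothetical contents).
cat > "$workdir/final.pdb" <<'EOF'
CRYST1    0.000    0.000    0.000  90.00  90.00  90.00 P 1           1
ATOM      1  N   HSD A  10      11.104  13.207   2.100  1.00  0.00           N
EOF

# Drop the CRYST1 record and rename HSD residues to HIS.
sed -e '/^CRYST1/d' -e 's/HSD/HIS/g' \
    "$workdir/final.pdb" > "$workdir/final_fixed.pdb"

cat "$workdir/final_fixed.pdb"
```

Writing to a new file rather than editing in place keeps the original VMD output available in case the substitution needs to be checked.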

Now we will run phenix.model_vs_data on the structure with the refined B-factors:
phenix.model_vs_data final_refine_001.pdb 2DRI.mtz > model.log

Once this is complete, you can open model.log with a text editor and view the statistics of your refined structure.

5: One thing you may notice from the RMSD plots and the phenix.model_vs_data output is that our structure is not as refined as it could be. To improve it, you will have to run further simulations using the previous one as input while coupling additional atoms to the density, increasing the global scaling factor, and using any additional techniques such as kicked maps and B-factor sharpening. It is also often useful to bring the simulation temperature down to 0 K for the final structure by setting the FTEMP variable to 0 in the last NAMD simulation. Some example steps to set up the next simulation are:

Copy 2dri-step1.namd to 2dri-step2.namd, then open 2dri-step2.namd with a text editor and make the following changes:

Edit the following lines to reflect these new values:
set GRIDPDB backgrid.pdb to couple all backbone atoms to the density.
set GSCALE 0.3 to increase the global scaling factor of the applied forces.

Add the following line just before OUTPUTNAME:
set INPUTNAME 2dri-step1 to restart from the previous step's progress.

Edit the following line:
set OUTPUTNAME 2dri-step2 to name the output of this simulation as step2.

Run NAMD again using the new configuration file:
namd2 2dri-step2.namd > 2dri-step2.log
Once this simulation is complete, repeat the previous steps by making a 2dri-step3.namd, using nohgrid.pdb as the GRIDPDB, and increasing the GSCALE to 0.6. You should also experiment with using kicked maps and B-factor sharpening for different steps and analyze the differences in the resulting structures. An example of a more complete refinement trajectory, 2dri_result.dcd, can be found in the tutorial files.
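The copy-and-edit procedure for steps 2 and 3 could be scripted along the following lines. This is only a sketch: it operates on a three-line stand-in for 2dri-step1.namd, assumes the configuration file contains lines of the form set GRIDPDB, set GSCALE and set OUTPUTNAME as quoted above, and assumes that the output prefix of step 1 is 2dri-step1 (the prefix of the 2dri-step1.dcd trajectory loaded during analysis). The helper name make_step is hypothetical.

```shell
#!/bin/sh
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the generated 2dri-step1.namd (only the lines that get edited).
cat > 2dri-step1.namd <<'EOF'
set GRIDPDB cagrid.pdb
set GSCALE 0.1
set OUTPUTNAME 2dri-step1
EOF

# make_step PREV NEW GRIDPDB GSCALE: derive NEW.namd from PREV.namd.
make_step() {
    awk -v prev="$1" -v new="$2" -v gridpdb="$3" -v gscale="$4" '
        /^set INPUTNAME /  { next }   # drop an old restart line; re-added below
        /^set GRIDPDB /    { print "set GRIDPDB " gridpdb; next }
        /^set GSCALE /     { print "set GSCALE " gscale; next }
        /^set OUTPUTNAME / { print "set INPUTNAME " prev
                             print "set OUTPUTNAME " new; next }
        { print }
    ' "$1.namd" > "$2.namd"
}

make_step 2dri-step1 2dri-step2 backgrid.pdb 0.3
make_step 2dri-step2 2dri-step3 nohgrid.pdb 0.6

cat 2dri-step2.namd 2dri-step3.namd
```

Each generated file restarts from the previous step's output, so the NAMD runs still have to be executed in order.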


