Next: Modeling amino acid insertions Up: Rosetta/MDFF Tutorial Previous: Required software   Contents

Folding protein termini using ModelMaker

Here, we predict a structural model for the truncated C-terminal tail (amino acids 217 to 306) of chain B of 4OCM. Create a folder named terminus for this section and navigate to it. Sample input and output files are provided in the 3.terminus folder in the tutorial folder.

First, obtain a template structure, which will be completed by modeling its structurally unresolved segments. In this tutorial we complete Rpn11, the deubiquitylation subunit of the 26S proteasome, using chain B of PDB structure 4OCM as the template. Go to the Protein Data Bank (PDB, http://www.rcsb.org/pdb/home/home.do) and download the structure with the PDB-ID 4OCM. Alternatively, download the structure directly through VMD: go to File → New Molecule, type 4OCM in the Filename box, and click the Load button. The PDB structure 4OCM contains two Rpn8/Rpn11 dimers in the unit cell. For now we need only one Rpn11, so we use chain B. To create a PDB file containing only chain B, run the following command in the Tk Console (Extensions → Tk Console):

[atomselect top "chain B and protein"] writepdb rpn11_yeast_4ocm.pdb
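Outside VMD, the same chain extraction can be approximated in a few lines of Python. This is a minimal sketch and not part of the tutorial files: the function name select_chain is hypothetical, and it relies only on the fixed-column PDB layout (record name in columns 1-6, chain identifier in column 22). It keeps ATOM records only, which is close to, but not identical to, VMD's "chain B and protein" selection (for example, modified residues stored as HETATM records are dropped).

```python
def select_chain(pdb_text, chain_id):
    """Return the ATOM records of one chain as PDB text, terminated by END."""
    kept = []
    for line in pdb_text.splitlines():
        # Fixed-column PDB format: the chain identifier is column 22 (index 21).
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"
```

To use it, read the downloaded 4OCM PDB file into a string, call select_chain(text, "B"), and write the result to rpn11_yeast_4ocm.pdb.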


Structure prediction

In order to use ModelMaker for structure prediction you first need to modify the configuration file.

1. Preparing the configuration file:
Copy the prepared configuration file fold_rpn11_terminus.tcl from the tutorial files to your terminus folder, open it in a text editor and change the following variables to fit your workstation configuration.

Table 1: Default configuration variables
variable        description
packagePath     path of the ModelMaker plugin files
vmdexe          path of your vmd executable
gnuplotexe      path of your gnuplot executable
rosettapath     directory path containing the Rosetta binaries
rosettaDBpath   Rosetta database path
platform        "linuxgccrelease" or "macosclangrelease"


2. Obtaining the target amino acid residue sequence:
As we are only going to fold the C-terminus of Rpn11, we disregard the missing N-terminus in the following steps. Go to the UniProt website (http://www.uniprot.org) and download the FASTA sequence of yeast Rpn11 (UniProt ID: P43588). Create a folder called input. With a text editor, remove the first 22 amino acids from the sequence and save the file in the input folder as rpn11_yeast_23-306.fasta. To facilitate the sub-sequence extraction, a Python script, subrange.py, is provided in the scripts folder; it returns the amino acid codes for a given range. Simply copy the script to your working directory and make the following changes:

Running

python subrange.py

creates a file <name>_<start>-<end>.fasta from the input fasta sequence.
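For readers who want to see what such a sub-range extraction involves, here is a minimal sketch. It is not the subrange.py shipped with the tutorial: the function name subrange_fasta and its arguments are hypothetical, it assumes a single-record FASTA input, and it returns the output file name and contents instead of writing to disk.

```python
def subrange_fasta(fasta_text, name, start, end, width=60):
    """Cut residues start..end (1-based, inclusive) out of a one-record FASTA.

    Returns (filename, fasta_text) following the <name>_<start>-<end>.fasta
    naming pattern described above.
    """
    lines = [ln.strip() for ln in fasta_text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    header = lines[0]
    sequence = "".join(lines[1:])
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("requested range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    sub = sequence[start - 1:end]
    body = "\n".join(sub[i:i + width] for i in range(0, len(sub), width))
    return f"{name}_{start}-{end}.fasta", f"{header}\n{body}\n"
```

Called with the downloaded Rpn11 sequence, name "rpn11_yeast", start 23, and end 306, this yields the file name rpn11_yeast_23-306.fasta used in the following steps.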
3. Generating the fragment file library:
Use the Robetta server to generate two library files containing internal coordinates for the target sequence structure. The server performs a homology search in the Protein Data Bank using running windows of 3 and 9 amino acids and produces two files (3mer and 9mer) listing the best 200 results for each window. To do so, go to http://www.robetta.org and set up an academic user account. Submit the target sequence rpn11_yeast_23-306.fasta to the Fragment file server; as soon as the search is finished, you will receive an email with a link to download the results. Save the 3mer and 9mer files as rpn11_yeast_23-306.frag3 and rpn11_yeast_23-306.frag9 in your input folder.
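To get a feeling for the size of the fragment libraries, the running windows can be enumerated with a short sketch. It only counts window positions and does not perform any homology search. The sequence is a placeholder of the target length (residues 23 to 306), the function name sliding_windows is hypothetical, and the candidate totals assume that every window position receives the 200 results mentioned above.

```python
def sliding_windows(sequence, size):
    """Return every contiguous window of `size` residues, in order."""
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


# Placeholder sequence with the length of the target (306 - 23 + 1 = 284).
target = "A" * (306 - 23 + 1)
for size in (3, 9):
    n_windows = len(sliding_windows(target, size))
    # window size, number of window positions, candidates at 200 per position
    print(size, n_windows, n_windows * 200)
# prints:
# 3 282 56400
# 9 276 55200
```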
4. Building one complete model for the target amino acid sequence:
Create a full_length_model folder and copy the file rpn11_yeast_4ocm.pdb to it. Also copy the file run_full_length_model.sh from the tutorial files to your full_length_model folder. Make the following changes in run_full_length_model.sh:

Run run_full_length_model.sh to generate a complete template model. Afterwards, rename the output file rpn11_yeast_4ocm.pdb_full_length.pdb to rpn11_yeast_23-306_complete.pdb.

./run_full_length_model.sh

mv rpn11_yeast_4ocm.pdb_full_length.pdb rpn11_yeast_23-306_complete.pdb

Rosetta's full-length model application yields PDB files that do not keep the original amino acid numbering. To restore it, copy the script renumber.tcl from the scripts folder to the full_length_model folder. In the mols list, specify the file name of the input PDB file; in this case, the line should look like:

set mols [list rpn11_yeast_23-306_complete.pdb]

In the next line, set the newstart variable to 23, so that the output PDB file starts its numbering at 23, as in the FASTA file. Run

vmd -dispdev text -e renumber.tcl

to get the output file rpn11_yeast_23-306_complete-numb.pdb, then replace the old file with the new one:

mv rpn11_yeast_23-306_complete-numb.pdb rpn11_yeast_23-306_complete.pdb
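The idea behind the renumbering can be illustrated without VMD. The sketch below is not renumber.tcl: the function name renumber is hypothetical, and it simply shifts every residue number by a constant offset so that the first residue becomes new_start. That is only valid under the assumption that the input numbering is contiguous. It relies on the fixed-column PDB layout, in which the residue sequence number occupies columns 23-26.

```python
def renumber(pdb_text, new_start):
    """Shift residue numbers so that the first residue becomes new_start."""
    out = []
    offset = None
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            resseq = int(line[22:26])  # columns 23-26: residue sequence number
            if offset is None:
                # The first residue encountered defines the shift.
                offset = new_start - resseq
            line = line[:22] + "%4d" % (resseq + offset) + line[26:]
        out.append(line)
    return "\n".join(out) + "\n"
```

With new_start set to 23, a model numbered from 1 ends up numbered from 23, matching rpn11_yeast_23-306.fasta.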

5. Running Rosetta from VMD:
Now that we have prepared all necessary input files, we can complete the configuration file and run Rosetta from VMD to predict the C-terminal structure of Rpn11. The literature recommends predicting between 5,000 and 20,000 models. In our test case we predict only 100 structures for demonstration purposes. We use RosettaScripts with a Brokered Environment and run the classic Rosetta de novo protocol within it. The ModelMaker plugin can generate the Rosetta input files automatically, so we just need to add a few lines to fold_rpn11_terminus.tcl.

Execute the configuration file in VMD text mode and wait for the tasks to finish. Depending on the number of structures to generate, this may take a while.

vmd -dispdev text -e fold_rpn11_terminus.tcl

If no error occurs, go to the folder called rosetta_output_rpn11_terminus containing the results of your run.


Table 2: Rosetta ab initio procedure arguments
arg.    description                                   ...
...     ... 0 or 1 (only set to 1 for test cases!)    0



Table 3: Rosetta ab initio analysis procedure arguments
arg.    description                                   ...
...     ... analysis tasks (previously defined)       $comps



Interactive fitting to a cryo-EM density with iMDFF and QwikMD

Create a new folder named mdff in your working directory.
1. Aligning the predicted model with the cryo-EM density map:
In order to perform interactive molecular dynamics flexible fitting (iMDFF), you first need to place the modeled structure in the right position inside the density map. Download the cryo-EM density map of the 26S proteasome (EMDB-ID 2594) from the Electron Microscopy Data Bank (http://www.ebi.ac.uk/pdbe/emdb/): emd_2594.map. In this special case, a near-atomic structural model (PDB-ID 4CR2) already exists for this map. Download the structure with the PDB-ID 4CR2 from the PDB (http://www.rcsb.org/pdb/home/home.do). Use align_segments.tcl to align the output structure from the structure prediction (ss_average_100.pdb) to chain V of 4CR2.pdb, which yields ss_average_100_aligned.pdb.

2. Generating a density map file for MDFF:
To generate a density map file that MDFF can read, first rename emd_2594.map to emd_2594.ccp4, then run the following command in your terminal (the script is in the scripts folder):

vmd -dispdev text -e get_density.tcl

which will execute

mdff griddx -i emd_2594.ccp4 -o emd_2594_potential.dx
mdff griddx -i emd_2594_potential.dx -o emd_2594_density.dx

to obtain the density file emd_2594_density.dx, which can be read by MDFF.

3. Crop the density:
In order to crop the density to the area of interest, which is here around the predicted structural model of Rpn11, run crop_density.tcl.

vmd -dispdev text -e crop_density.tcl

The script generates the density file rpn11_model_5_2594_density.dx, which contains the density of emd_2594_density.dx within a cutoff of 5 Å around the predicted Rpn11 model.
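The cutoff criterion itself is simple to state in code. The following sketch is purely illustrative and says nothing about how crop_density.tcl is implemented: the function name crop_points is hypothetical, grid points and atoms are plain (x, y, z) tuples in Å, and a grid point is kept if it lies within the cutoff of at least one atom.

```python
import math


def crop_points(grid_points, atoms, cutoff=5.0):
    """Keep the grid points that lie within `cutoff` of at least one atom."""
    kept = []
    for point in grid_points:
        # math.dist (Python 3.8+) returns the Euclidean distance.
        if any(math.dist(point, atom) <= cutoff for atom in atoms):
            kept.append(point)
    return kept
```

This brute-force loop scales with the number of grid points times the number of atoms, which is fine for illustration but too slow for a full proteasome map.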

4. Fitting the modeled part to the cryo-EM density:
We use the VMD plugin QwikMD to set up the interactive MDFF run, as it automatically generates all the required input files and structures.

Structure preparation with QwikMD

Run MDFF

Hint: If you have problems moving the structure to the correct positions, load the final Rpn11 structure rpn11_yeast.pdb from the tutorial folder into VMD while performing iMDFF.

Short introduction to iMDFF:

In the first step, drag the predicted structure into the density by applying forces (Mouse → Forces → Atom) to the atoms you click on. In this step a grid spacing of 0.3 should be applied. For detailed instructions on the usage of the MDFF GUI and interactive fitting, see the MDFF tutorial and the YouTube tutorial https://www.youtube.com/watch?v=-KJiH_WF65s. As soon as the predicted segment fits the density, a second MDFF run with a grid spacing of 0.6 can be performed.
Hint: Use cartoon representation for the protein with different coloring for the fixed and flexible segments. Represent the density as a solid surface with white color and transparent material. Use the CPK representation for the backbone atoms of the flexible segment and apply forces only to these backbone atoms.


www.ks.uiuc.edu/Training/Tutorials