Next: Sequence and structural alignments Up: Bioinformatics Tutorial Previous: SCOP fold classification


Sequence Alignment Algorithms

In this section you will optimally align two short protein sequences using pen and paper, then search for homologous proteins by using a computer program to align several, much longer, sequences.

Dynamic programming algorithms are recursive algorithms modified to store intermediate results, which improves efficiency for certain problems. The Needleman-Wunsch algorithm uses dynamic programming to find the optimal global alignment of two sequences, $a$ and $b$; the Smith-Waterman algorithm is the corresponding method for the optimal local alignment. Both are based on finding the elements of a matrix $H$. In the global case, the element $H_{i,j}$ is the optimal score for aligning the sequence ($a_1$,$a_2$,...,$a_i$) with ($b_1$,$b_2$,...,$b_j$). Two similar amino acids (e.g. arginine and lysine) receive a high score, while two dissimilar amino acids (e.g. arginine and glycine) receive a low score. The higher the score of a path through the matrix, the better the alignment. The matrix $H$ is filled in progressively, starting at $H_{1,1}$ and proceeding in the directions of increasing $i$ and $j$. Each element is set according to:


\begin{displaymath}H_{i,j} = \textrm{max} \left\{ \begin{array}{l}
H_{i-1,j-1} + S_{i,j} \\
H_{i-1,j} - d \\
H_{i,j-1} - d \\
\end{array} \right.
\end{displaymath}

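where $S_{i,j}$ is the similarity score of comparing amino acid $a_i$ to amino acid $b_j$ (obtained here from the BLOSUM40 similarity table) and $d$ is the penalty for a single gap. The matrix is initialized with $H_{0,0} = 0$, and for the global alignment the first row and column hold the accumulated gap penalties, $H_{i,0} = -i d$ and $H_{0,j} = -j d$ (Table 3 uses $d = 8$). When obtaining the local Smith-Waterman alignment, $H_{i,j}$ is modified: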


\begin{displaymath}H_{i,j} = \textrm{max} \left\{ \begin{array}{l}
0 \\
H_{i-1,j-1} + S_{i,j} \\
H_{i-1,j} - d \\
H_{i,j-1} - d \\
\end{array} \right.
\end{displaymath}
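The two update rules can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used later in the tutorial: the function name `fill_matrix`, the `local` flag, and the small `PAIR_SCORES` lookup are assumptions introduced here. The four similarity scores are the ones listed for the pairs K/H, K/G, T/H and T/G in Table 4, and $d = 8$ is the value implied by the border of Table 3.

```python
from typing import Callable, List


def fill_matrix(a: str, b: str, score: Callable[[str, str], int],
                d: int, local: bool = False) -> List[List[int]]:
    """Fill H for sequences a (rows) and b (columns) with a linear gap penalty d."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalised along the borders.
        for i in range(1, rows):
            H[i][0] = -i * d
        for j in range(1, cols):
            H[0][j] = -j * d
    for i in range(1, rows):
        for j in range(1, cols):
            best = max(
                H[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align a_i with b_j
                H[i - 1][j] - d,                              # a_i opposite a gap
                H[i][j - 1] - d,                              # b_j opposite a gap
            )
            # Local alignment never lets a score drop below zero.
            H[i][j] = max(0, best) if local else best
    return H


# Similarity scores for the four residue pairs needed here (from Table 4).
PAIR_SCORES = {("K", "H"): -1, ("K", "G"): -2, ("T", "H"): -2, ("T", "G"): -2}


def lookup(x: str, y: str) -> int:
    return PAIR_SCORES[(x, y)]


H_global = fill_matrix("KT", "HG", lookup, d=8)
H_local = fill_matrix("KT", "HG", lookup, d=8, local=True)
```

With these inputs the global matrix reproduces the four example scores given in Table 4 (-1, -9, -9, -3), while the local matrix stays at zero everywhere because every pair score in this small corner is negative.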

The gap penalty can be modified; for instance, $d$ can be replaced by $(d \times k)$, where $d$ is the penalty for a single gap and $k$ is the number of consecutive gaps.
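As a quick arithmetic check of the linear penalty, take $d = 8$, the value implied by the border of Table 3 (an inference from the table, not a value stated in the text). A run of $k = 3$ consecutive gaps then costs

\begin{displaymath}
d \times k = 8 \times 3 = 24,
\end{displaymath}

which is why the third entry after the origin in the first row of Table 3 reads $-24$.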

Once the optimal alignment score is found, a ``traceback'' through $H$ along the optimal path recovers the sequence alignment that corresponds to that score. In the next set of exercises you will manually carry out the Needleman-Wunsch alignment for a pair of short sequences, then perform global sequence alignments with a computer program developed by Anurag Sethi, which is based on the Needleman-Wunsch algorithm with an affine gap penalty, $d + e (k-1)$, where $d$ is the penalty for opening a gap, $e$ is the penalty for extending it, and $k$ is the gap length. The output file will be in the GCG format, one of the two standard formats in bioinformatics for storing sequence information (the other standard format is FASTA).
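The traceback step can be sketched as follows. This is a minimal illustration with a linear gap penalty, not the affine-gap program mentioned above; the function name `needleman_wunsch` and the `toy_score` scheme (+1 for identical letters, -1 otherwise) are hypothetical choices made only to keep the example small. When several predecessors are possible, this sketch prefers the diagonal step, then the step up, then the step left; other tie-breaking rules give equally optimal alignments.

```python
from typing import Callable, List, Tuple


def needleman_wunsch(a: str, b: str, score: Callable[[str, str], int],
                     d: int) -> Tuple[int, str, str]:
    """Return (optimal score, aligned a, aligned b) for a linear gap penalty d."""
    n, m = len(a), len(b)
    H: List[List[int]] = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = -i * d
    for j in range(1, m + 1):
        H[0][j] = -j * d
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(H[i - 1][j - 1] + score(a[i - 1], b[j - 1]),
                          H[i - 1][j] - d,
                          H[i][j - 1] - d)

    # Traceback: start at the lower-right corner and step to a neighbour
    # that could have produced the current value.
    out_a: List[str] = []
    out_b: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])      # diagonal: a_i aligned with b_j
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and H[i][j] == H[i - 1][j] - d:
            out_a.append(a[i - 1])      # up: a_i opposite a gap
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")           # left: b_j opposite a gap
            out_b.append(b[j - 1])
            j -= 1
    return H[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def toy_score(x: str, y: str) -> int:
    """Hypothetical scoring: +1 for identical letters, -1 otherwise."""
    return 1 if x == y else -1


best, top, bottom = needleman_wunsch("ACGT", "AGT", toy_score, d=2)
```

For this toy input the sketch aligns `ACGT` with `A-GT`: three matches and one gap, for a score of 1.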


Manually perform a Needleman-Wunsch alignment

In the first exercise you will test the Needleman-Wunsch algorithm on short sequence fragments of hemoglobin (PDB code 1AOW) and myoglobin (PDB code 1AZI).


Table 3: The empty matrix with initial gap penalties.
    H G S A Q V K G H G
  0 -8 -16 -24 -32 -40 -48 -56 -64 -72 -80
K -8                    
T -16                    
E -24                    
A -32                    
E -40                    
M -48                    
K -56                    
A -64                    
S -72                    
E -80                    
D -88                    
L -96                    
K -104                    
K -112                    
H -120                    
G -128                    
T -136                    


A  5 
R -2  9
N -1  0  8  
D -1 -1  2  9 
C -2 -3 -2 -2 16 
Q  0  2  1 -1 -4  8 
E -1 -1 -1  2 -2  2  7 
G  1 -3  0 -2 -3 -2 -3  8 
H -2  0  1  0 -4  0  0 -2 13 
I -1 -3 -2 -4 -4 -3 -4 -4 -3  6  
L -2 -2 -3 -3 -2 -2 -2 -4 -2  2  6 
K -1  3  0  0 -3  1  1 -2 -1 -3 -2  6 
M -1 -1 -2 -3 -3 -1 -2 -2  1  1  3 -1  7 
F -3 -2 -3 -4 -2 -4 -3 -3 -2  1  2 -3  0  9 
P -2 -3 -2 -2 -5 -2  0 -1 -2 -2 -4 -1 -2 -4 11 
S  1 -1  1  0 -1  1  0  0 -1 -2 -3  0 -2 -2 -1  5 
T  0 -2  0 -1 -1 -1 -1 -2 -2 -1 -1  0 -1 -1  0  2  6 
W -3 -2 -4 -5 -6 -1 -2 -2 -5 -3 -1 -2 -2  1 -4 -5 -4 19 
Y -2 -1 -2 -3 -4 -1 -2 -3  2  0  0 -1  1  4 -3 -2 -1  3  9 
V  0 -2 -3 -3 -2 -3 -3 -4 -4  4  2 -2  1  0 -3 -1  1 -3 -1  5 
B -1 -1  4  6 -2  0  1 -1  0 -3 -3  0 -3 -3 -2  0  0 -4 -3 -3  5 
Z -1  0  0  1 -3  4  5 -2  0 -4 -2  1 -2 -4 -1  0 -1 -2 -2 -3  2  5
X  0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1  0 -1 -2  0  0 -2 -1 -1 -1 -1 -1  
   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  B  Z  X
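Because the similarity table above is printed as a lower triangle, looking up a pair such as K/R means finding the row of whichever residue comes later in the column order. A small Python sketch can expand the triangle into a symmetric lookup. The names `LOWER_TRIANGLE` and `parse_lower_triangle` are assumptions made for this illustration, and only the first twelve rows of the table are copied in to keep it short.

```python
from typing import Dict, Tuple

# First twelve rows of the lower-triangular table above, copied as printed.
LOWER_TRIANGLE = """
A  5
R -2  9
N -1  0  8
D -1 -1  2  9
C -2 -3 -2 -2 16
Q  0  2  1 -1 -4  8
E -1 -1 -1  2 -2  2  7
G  1 -3  0 -2 -3 -2 -3  8
H -2  0  1  0 -4  0  0 -2 13
I -1 -3 -2 -4 -4 -3 -4 -4 -3  6
L -2 -2 -3 -3 -2 -2 -2 -4 -2  2  6
K -1  3  0  0 -3  1  1 -2 -1 -3 -2  6
"""


def parse_lower_triangle(text: str) -> Dict[Tuple[str, str], int]:
    """Expand a lower-triangular table into a symmetric (x, y) -> score lookup."""
    rows = [line.split() for line in text.strip().splitlines()]
    order = [row[0] for row in rows]   # row labels double as column labels
    scores: Dict[Tuple[str, str], int] = {}
    for row in rows:
        label, values = row[0], row[1:]
        for column_label, value in zip(order, values):
            scores[(label, column_label)] = int(value)
            scores[(column_label, label)] = int(value)   # the table is symmetric
    return scores


S = parse_lower_triangle(LOWER_TRIANGLE)
arg_lys = S[("R", "K")]   # arginine vs lysine, a similar pair
arg_gly = S[("R", "G")]   # arginine vs glycine, a dissimilar pair
```

Here `arg_lys` is 3 and `arg_gly` is -3, in line with the earlier remark that similar residues score higher than dissimilar ones.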


Table 4: Alignment score worksheet. In all alignment boxes, the similarity score $S_{i,j}$ from the BLOSUM40 matrix lookup is supplied (small text, bottom of square). Four alignment scores are provided as examples (large text, top of square); try to calculate at least four more, following the directions provided in the text for calculating $H_{i,j}$.
    H G S A Q V K G H G
  0 -8 -16 -24 -32 -40 -48 -56 -64 -72 -80
K -8 $\begin{array}{c}
\mathbf{-1}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-9}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{6}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $
T -16 $\begin{array}{c}
\mathbf{-9}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-3}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $
E -24 $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $
A -32 $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{5}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $
E -40 $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $
M -48 $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $
K -56 $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{6}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $
A -64 $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{5}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $
S -72 $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{5}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $
E -80 $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $
D -88 $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $
L -96 $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-4}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-4}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-4}}\\
\end{array} $
K -104 $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{6}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $
K -112 $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{6}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $
H -120 $\begin{array}{c}
\mathbf{}\\
{_{13}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-4}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{13}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $
G -128 $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{8}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-4}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{8}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{8}}\\
\end{array} $
T -136 $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{}\\
{_{-2}}\\
\end{array} $



Table 5: Traceback worksheet. The completed alignment score matrix $H$ (large text, top of each square) with the BLOSUM40 lookup scores $S_{i,j}$ (small text, bottom of each square). To find the alignment, trace back starting from the lower right (T vs G, score -21) and proceed diagonally (to the left and up), left, or up. Only proceed, however, if the square in that direction could have been a predecessor, according to the conditions described in the text.
H G S A Q V K G H G
0 -8 -16 -24 -32 -40 -48 -56 -64 -72 -80
K -8 $\begin{array}{c}
\mathbf{-1}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-9}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -16}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-24}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -31}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-39}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -42 }\\
{_{6}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-50 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -58 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-66}\\
{_{-2}}\\
\end{array} $
T -16 $\begin{array}{c}
\mathbf{-9}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-3}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -7}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-15}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -23}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-30}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -38 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-44 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -52 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-60}\\
{_{-2}}\\
\end{array} $
E -24 $\begin{array}{c}
\mathbf{ -16}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-11}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -3}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -8}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -13}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-21}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -29 }\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-37 }\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -44 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-52}\\
{_{-3}}\\
\end{array} $
A -32 $\begin{array}{c}
\mathbf{-24}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-15}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -10}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 2}\\
{_{5}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -6}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-13}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -21 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-28 }\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -36 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-43}\\
{_{1}}\\
\end{array} $
E -40 $\begin{array}{c}
\mathbf{-32}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-23}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-15}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -6}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 4}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-4}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -12 }\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-20 }\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -28 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-36}\\
{_{-3}}\\
\end{array} $
M -48 $\begin{array}{c}
\mathbf{-39}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-31}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -23}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-14}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -4}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 5}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -3 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-11 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -19 }\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-27}\\
{_{-2}}\\
\end{array} $
K -56 $\begin{array}{c}
\mathbf{-47}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-39}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -31}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-22}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -12 }\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-3}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 11 }\\
{_{6}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 3 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -5 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-13}\\
{_{-2}}\\
\end{array} $
A -64 $\begin{array}{c}
\mathbf{-55}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-46}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -38}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-26}\\
{_{5}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-20}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -11}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 3}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 12}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 4 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -4}\\
{_{1}}\\
\end{array} $
S -72 $\begin{array}{c}
\mathbf{-63 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-54 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -41 }\\
{_{5}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-34}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-25 }\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-19}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-5 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 4 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{11 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 4 }\\
{_{0}}\\
\end{array} $
E -80 $\begin{array}{c}
\mathbf{-71 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-62 }\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -49 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-42}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-32 }\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-27}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -13 }\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-4}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 4 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 8}\\
{_{-3}}\\
\end{array} $
D -88 $\begin{array}{c}
\mathbf{-79 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-70 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -57 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-50}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-40 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-35}\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -21 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-12 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-4 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ 2}\\
{_{-2}}\\
\end{array} $
L -96 $\begin{array}{c}
\mathbf{-87 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-78 }\\
{_{-4}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -65 }\\
{_{-3}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-58}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-48 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-38}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -29 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-20 }\\
{_{-4}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-12 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-6}\\
{_{-4}}\\
\end{array} $
K -104 $\begin{array}{c}
\mathbf{-95 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-86 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -73 }\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-66}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-56 }\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-46}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -32 }\\
{_{6}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-28 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-20 }\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-14}\\
{_{-2}}\\
\end{array} $
K -112 $\begin{array}{c}
\mathbf{-103}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -94}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -81}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -74}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-64}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-54}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -40}\\
{_{6}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -34}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-28}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-22}\\
{_{-2}}\\
\end{array} $
H -120 $\begin{array}{c}
\mathbf{-99}\\
{_{13}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-102}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -89}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -82}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-72}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-62}\\
{_{-4}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -48}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -42}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-21}\\
{_{13}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -29 }\\
{_{-2}}\\
\end{array} $
G -128 $\begin{array}{c}
\mathbf{-107}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -91}\\
{_{8}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-97}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -88}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-80}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-70}\\
{_{-4}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -56}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -40}\\
{_{8}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -29 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-13}\\
{_{8}}\\
\end{array} $
T -136 $\begin{array}{c}
\mathbf{-115}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -99}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-89}\\
{_{2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -96}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-88}\\
{_{-1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-78}\\
{_{1}}\\
\end{array} $ $\begin{array}{c}
\mathbf{ -64}\\
{_{0}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-48 }\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-37}\\
{_{-2}}\\
\end{array} $ $\begin{array}{c}
\mathbf{-21}\\
{_{-2}}\\
\end{array} $


[Exercise box; the beginning of the text was lost in extraction] ... output of the pair program. Do the alignments match?


Finding homologous pairs of Class II tRNA synthetases

Homologous proteins are proteins derived from a common ancestral gene. In this exercise you will use the Needleman-Wunsch algorithm to study the sequence identity of several class II tRNA synthetases, which come from different domains of life (Eukarya, Eubacteria, or Archaea) or differ in the aminoacylation reaction they catalyze. Table 6 lists the specificity, the organism, the PDB accession code and chain name, and the ASTRAL catalytic domain of each class II tRNA synthetase used.


Table 6: Domain types, origins, and accession codes
Specificity Organism PDB code:chain ASTRAL catalytic domain
Aspartyl Eubacteria 1EQR:B d1eqrb3
Aspartyl Archaea 1B8A:A d1b8aa2
Aspartyl Eukarya 1ASZ:A d1asza2
Glycyl Archaea 1ATI:A d1atia2
Histidyl Eubacteria 1ADJ:C d1adjc2
Lysyl Eubacteria 1BBW:A d1bbwa2
Aspartyl Eubacteria 1EFW:A d1efwa3
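Sequence identity can be computed directly from a pairwise alignment. The sketch below is one possible definition, not necessarily the one used by the alignment program in this tutorial: it counts identical residues over the columns where neither sequence has a gap. The function name `percent_identity` and the two short aligned strings are hypothetical and serve only as an illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of gap-free alignment columns whose residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue            # columns containing a gap are not counted
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical aligned fragments: five gap-free columns, four of them identical.
identity = percent_identity("KTEAEM", "KT-AQM")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identities reported by different programs are only comparable when the denominator is the same.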


[Exercise box; the beginning of the text was lost in extraction] ...utionarily divergent pair according to the sequence identity?


zan@uiuc.edu