Bioinformatics (2011/2012)
Lecture GK5
Sequence alignment
Aims

To introduce methods for assessing the significance of sequence
alignment scores.

To explain how PAM and BLOSUM substitution matrices are derived.

To describe heuristic methods for finding local alignments.

To give examples of applications of pattern matching in bioinformatics.
Objectives
After this lecture you will be able to:

understand how z-scores are used in assessing the significance of
global alignment scores;

understand how PAM and BLOSUM substitution matrices are derived;

describe how the values in substitution matrices reflect similarities and
differences in the properties of amino acid residues, and also reflect the
relative abundances of different amino acid residues;

be aware of which substitution matrices are suitable for different tasks;

describe applications of pattern matching with DNA and protein sequences;

describe how covariance can give clues about RNA secondary structure.
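
As an illustration of the first objective, the following is a minimal
sketch of one common way to obtain a z-score for a global alignment:
score the real pair, then score one sequence against many shuffled
copies of the other, and express the real score in standard deviations
above the shuffled mean. The scoring scheme (match +1, mismatch -1,
linear gap -2), the number of shuffles, and the function names
global_score and z_score are assumptions made for this example, not
details taken from the lecture.

```python
import random
import statistics


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions; a real analysis of
    proteins would normally use a substitution matrix such as BLOSUM62.
    """
    # prev holds the previous row of the dynamic programming table.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def z_score(a, b, n_shuffles=200, seed=0):
    """Z-score of the a/b alignment score against shuffled copies of b."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = global_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps composition, destroys residue order
        shuffled_scores.append(global_score(a, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mean) / sd
```

Shuffling preserves the length and composition of the sequence, so a
large z-score suggests that the observed score depends on the order of
the residues rather than on composition alone.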
Supplementary Material
The lecture handout, featuring some of the lecture slides, is
available online.
Download PDF
Tutorial on The Statistics of Sequence Similarity Scores by Stephen
Altschul, author of BLAST.
The method for deriving BLOSUM matrices is described in:
Henikoff, S. and Henikoff, J.G. (1992)
Amino acid substitution matrices from protein blocks.
Proc Natl Acad Sci USA, 89, 10915-10919
(PubMed).
There is an overview of BLAST and substitution matrices in:
Pertsemlidis, A. and Fondon, J.W. (2001)
Having a BLAST with bioinformatics (and avoiding BLASTphemy),
Genome Biol., 2, reviews2002.1-reviews2002.10.
Last Modified: 6 February 2012
by Graham Kemp