To introduce methods for assessing the significance of sequence alignment scores.
To explain how PAM and BLOSUM substitution matrices are derived.
To describe heuristic methods for finding local alignments.
To give examples of applications of pattern matching in bioinformatics.
After this lecture you will be able to:
understand how z-scores are used in assessing the significance of
global alignment scores;
understand how PAM and BLOSUM substitution matrices are derived;
describe how the values in substitution matrices reflect similarities and
differences in the properties of amino acid residues, and also reflect the
relative abundances of different amino acid residues;
be aware of which substitution matrices are suitable for different tasks;
describe applications of pattern matching with DNA and protein sequences;
describe how covariance can give clues about RNA secondary structure.
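As a rough illustration of the first objective, the sketch below estimates a z-score for a global alignment score by comparing the observed score with scores obtained after repeatedly shuffling one of the sequences. The scoring scheme (match +1, mismatch -1, linear gap penalty -2), the function names global_score and z_score, the number of shuffles and the example sequence are all assumptions made for this illustration, not details taken from the lecture.

```python
import random
import statistics


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The default scores are illustrative assumptions, not a real
    substitution matrix such as PAM or BLOSUM.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def z_score(a, b, n_shuffles=200, seed=0):
    """Z-score of the observed score against a shuffled-sequence background.

    Shuffling b keeps its length and composition but destroys its order,
    so the background scores approximate what unrelated sequences give.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = global_score(a, b)
    background = []
    for _ in range(n_shuffles):
        shuffled = list(b)
        rng.shuffle(shuffled)
        background.append(global_score(a, "".join(shuffled)))
    mean = statistics.mean(background)
    sd = statistics.pstdev(background)
    if sd == 0:
        raise ValueError("background scores have zero variance")
    return (observed - mean) / sd


if __name__ == "__main__":
    seq = "ACGTTGCAAGCTTAGGCATC"  # hypothetical example sequence
    print(global_score(seq, seq))  # identical sequences: one match per position
    print(z_score(seq, seq))       # large positive value expected
```

A z-score near zero means the observed score is typical of shuffled sequences, while a large positive value means it stands well above that background. With a protein substitution matrix, the match and mismatch constants would be replaced by a lookup of the score for each residue pair.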
The lecture handout, featuring some of the lecture slides, is
Statistics of Sequence Similarity Scores by Stephen Altschul, author
The method for deriving BLOSUM matrices is described in:
Henikoff, S. and Henikoff, J.G. (1992)
Amino acid substitution matrices from protein blocks.
Proc Natl Acad Sci USA, 89, 10915-10919.
There is an overview of BLAST and substitution matrices in:
Pertsemlidis, A. and Fondon, J.W. (2001)
Having a BLAST with bioinformatics (and avoiding BLASTphemy).
Genome Biol., 2, reviews2002.1-reviews2002.10.
Last Modified: 6 February 2012
by Graham Kemp