Bioinformatics (2013/2014)
Lecture GK-5
Sequence alignment
Aims
- To introduce methods for assessing the significance of sequence alignment scores.
- To explain how PAM and BLOSUM substitution matrices are derived.
- To describe heuristic methods for finding local alignments.
- To give examples of applications of pattern matching in bioinformatics.
Objectives
After this lecture you will be able to:
- describe how z-scores are used in assessing the significance of global alignment scores;
- describe how PAM and BLOSUM substitution matrices are derived;
- describe how the values in substitution matrices reflect similarities and differences in the properties of amino acid residues, and also reflect the relative abundances of different amino acid residues;
- be aware of which substitution matrices are suitable for different tasks;
- describe applications of pattern matching with DNA and protein sequences;
- describe how covariance can give clues about RNA secondary structure.
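As a rough illustration of the first objective, the sketch below estimates a z-score for a global alignment score by comparing it with the scores obtained after shuffling one of the sequences. It is a minimal example under stated assumptions, not the procedure used in the lecture: the function names (global_score, z_score), the match/mismatch/gap values, the linear gap penalty and the number of shuffles are all illustrative choices.

```python
import random
import statistics


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions; a real analysis would
    normally use a substitution matrix such as PAM or BLOSUM.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def z_score(a, b, shuffles=200, seed=0):
    """Z-score of the observed score against scores for shuffled copies of b.

    Shuffling keeps the length and composition of b but destroys its order,
    so the shuffled scores approximate what unrelated sequences would give.
    """
    rng = random.Random(seed)
    observed = global_score(a, b)
    letters = list(b)
    background = []
    for _ in range(shuffles):
        rng.shuffle(letters)
        background.append(global_score(a, "".join(letters)))
    mean = statistics.mean(background)
    sd = statistics.stdev(background)
    if sd == 0:
        raise ValueError("shuffled scores do not vary; z-score is undefined")
    return (observed - mean) / sd
```

A large positive z-score suggests that the observed score is unlikely to arise from sequences of the same composition in random order. Because the distribution of alignment scores is not necessarily normal, a z-score obtained this way is a guide rather than an exact probability.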
Supplementary Material
The lecture handout, featuring some of the lecture slides, is available on-line (one per page, four per page).
Tutorial on The Statistics of Sequence Similarity Scores by Stephen Altschul, author of BLAST.
The method for deriving BLOSUM matrices is described in:
Henikoff, S. and Henikoff, J.G. (1992) Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA, 89, 10915-10919 (PubMed).
There is an overview of BLAST and substitution matrices in:
Pertsemlidis, A. and Fondon, J.W. (2001) Having a BLAST with bioinformatics (and avoiding BLASTphemy). Genome Biol., 2, reviews2002.1-reviews2002.10.
Last Modified: 24 February 2014
by Graham Kemp