Bioinformatics (2014/2015)
Lecture GK-5
Sequence alignment
Aims
- To explain how PAM and BLOSUM substitution matrices are derived.
- To describe heuristic methods for finding local alignments.
- To give examples of applications of pattern matching in bioinformatics.
- To describe the sequence assembly problem.
Objectives
After this lecture you will be able to:
- describe how PAM and BLOSUM substitution matrices are derived;
- describe how the values in substitution matrices reflect similarities and
  differences in the properties of amino acid residues, and also reflect the
  relative abundances of different amino acid residues;
- be aware of which substitution matrices are suitable for different tasks;
- describe applications of pattern matching with DNA and protein sequences;
- describe how covariance can give clues about RNA secondary structure;
- describe the sequence assembly problem and how a program addressing this
  problem can be implemented.
Supplementary Material
The lecture handout, featuring some of the lecture slides, is
available on-line (one per page, four per page).
Tutorial on The Statistics of Sequence Similarity Scores by
Stephen Altschul, one of the authors of BLAST.
The method for deriving BLOSUM matrices is described in:
Henikoff, S. and Henikoff, J.G. (1992)
Amino acid substitution matrices from protein blocks.
Proc Natl Acad Sci USA, 89, 10915-10919
(PubMed).
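As a rough illustration of the log-odds idea behind BLOSUM scores, the
following Python sketch counts residue pairs in the columns of a single
ungapped block and converts observed versus expected pair frequencies into
half-bit scores. It is a simplification, not the full Henikoff and Henikoff
procedure: it omits the clustering of similar sequences (the step that
distinguishes, for example, BLOSUM62 from BLOSUM80), and the function name
blosum_like_scores and the toy block are made up for this example.

```python
import math
from collections import Counter
from itertools import combinations


def blosum_like_scores(block):
    """Return {(a, b): score} with a <= b, in half-bit units.

    `block` is a list of equal-length, ungapped aligned sequences.
    Assumption: no sequence clustering or weighting is applied.
    """
    # Count every unordered residue pair within each alignment column.
    pair_counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            pair_counts[tuple(sorted((a, b)))] += 1

    # Observed pair frequencies q_ab.
    total = sum(pair_counts.values())
    q = {pair: n / total for pair, n in pair_counts.items()}

    # Residue frequencies p_a implied by the pair frequencies.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2

    # Log-odds score: observed frequency relative to chance expectation.
    scores = {}
    for (a, b), freq in q.items():
        expected = p[a] ** 2 if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores


if __name__ == "__main__":
    toy_block = ["AL", "AL", "AL", "SL"]  # hypothetical data
    for pair, score in sorted(blosum_like_scores(toy_block).items()):
        print(pair, score)
```

Pairs that never occur in the block are simply absent from the result; a
real matrix is built from many blocks, so that every pair is observed. The
score for a pair also depends on how common each residue is, which is why
the values reflect relative abundances as well as chemical similarity.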
There is an overview of BLAST and substitution matrices in:
Pertsemlidis, A. and Fondon, J.W. (2001)
Having a BLAST with bioinformatics (and avoiding BLASTphemy).
Genome Biol., 2, reviews2002.1-reviews2002.10.
Last Modified: 11 February 2015
by Graham Kemp