TMS145 (2010/2011) | Graham Kemp's classes
Lecture: Sequence Alignment 2
Aims
- To introduce methods for assessing the significance of sequence alignment scores.
- To describe heuristic methods for finding local alignments.
- To explain how PAM and BLOSUM substitution matrices are derived.
- To describe methods for constructing multiple sequence alignments.
Objectives
After this lecture you will:
- understand how z-scores are used in assessing the significance of global alignment scores;
- understand how the BLAST and FASTA programs find local alignments when searching for matches in large databases;
- be aware of e-values and p-values as indicators of the significance of BLAST hits;
- understand how PAM and BLOSUM substitution matrices are derived;
- be aware of which substitution matrices are suitable for different tasks;
- understand minimum entropy and sum of pairs methods for scoring multiple alignments;
- understand how multi-dimensional dynamic programming can yield an optimal multiple sequence alignment;
- understand the Feng-Doolittle algorithm for progressive multiple sequence alignment.
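As a rough illustration of the two multiple alignment scoring schemes named in the objectives, the sketch below scores a small alignment column by column. It is not taken from the lecture handout: the function names, the `toy_score` values and the choice to treat a gap as one more symbol when computing entropy are all assumptions made for the example. Entropy here is the plain Shannon entropy per column, which ranks alignments the same way as a count-weighted version when every column has the same number of rows.

```python
import math
from collections import Counter
from itertools import combinations


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(rows):
    """Sum of column entropies; a lower total means more conserved columns."""
    return sum(column_entropy(col) for col in zip(*rows))


def sum_of_pairs(rows, score):
    """Add score(a, b) over every pair of rows in every column."""
    return sum(
        score(a, b)
        for col in zip(*rows)
        for a, b in combinations(col, 2)
    )


def toy_score(a, b):
    # Hypothetical values: +1 match, -1 mismatch, -2 residue against gap,
    # 0 for gap against gap. A real analysis would use a substitution matrix.
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return -2
    return 1 if a == b else -1


rows = ["ACGT", "ACGA", "AC-T"]  # three aligned sequences of equal length
entropy_total = alignment_entropy(rows)
sp_total = sum_of_pairs(rows, toy_score)
```

The minimum entropy method prefers the alignment with the smallest `entropy_total`, while the sum of pairs method prefers the largest `sp_total`.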
Supplementary Material
The lecture handout, featuring some of the lecture slides, is
available on-line (PDF download).
Notes on multiple sequence alignment
(local access only).
The method for deriving BLOSUM matrices is described in:
Henikoff, S. and Henikoff, J.G. (1992)
Amino acid substitution matrices from protein blocks.
Proc Natl Acad Sci USA, 89, 10915-10919
(PubMed).
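The core of the BLOSUM calculation is a log-odds score that compares how often two residues are seen aligned with how often they would be expected to pair by chance. The sketch below shows that step only, assuming pair counts have already been collected from aligned blocks. The function name `log_odds_scores` and the toy counts are hypothetical, and the clustering of similar sequences that distinguishes, for example, BLOSUM62 from BLOSUM80 is left out.

```python
import math
from collections import defaultdict


def log_odds_scores(pair_counts):
    """Half-bit log-odds scores from counts of aligned residue pairs.

    pair_counts maps an unordered residue pair, e.g. ("A", "S"),
    to the number of times that pair was observed in the blocks.
    """
    total = sum(pair_counts.values())

    # q[(i, j)]: observed frequency of each unordered pair
    q = defaultdict(float)
    for (a, b), n in pair_counts.items():
        q[tuple(sorted((a, b)))] += n / total

    # p[i]: frequency of residue i implied by the observed pairs
    p = defaultdict(float)
    for (a, b), f in q.items():
        if a == b:
            p[a] += f
        else:
            p[a] += f / 2
            p[b] += f / 2

    scores = {}
    for (a, b), f in q.items():
        # Expected frequency of the pair if residues paired at random
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        # Log-odds in bits, scaled to half-bits and rounded to an integer
        scores[(a, b)] = round(2 * math.log2(f / expected))
    return scores


# Toy counts, invented for illustration only
toy_counts = {("A", "A"): 6, ("A", "S"): 2, ("S", "S"): 2}
toy_scores = log_odds_scores(toy_counts)
```

Pairs seen more often than chance predicts receive positive scores, and pairs seen less often receive negative scores.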
The BLAST program is described in:
Pertsemlidis, A. and Fondon, J.W. 3rd (2001)
Having a BLAST with bioinformatics (and avoiding BLASTphemy).
Genome Biol., 2(10):REVIEWS2002
(PubMed).
Tutorial on The Statistics of Sequence Similarity Scores by Stephen Altschul, author
of BLAST.
Last Modified: 24 October 2010
by Graham Kemp