Computational methods in bioinformatics (2016-2017)

Lecture 3

Protein conformation; Protein domains

Aims

To introduce concepts related to protein conformation.
To describe how protein domains can be assigned automatically.

Objectives

After this lecture you will be able to:

describe the polypeptide backbone and the 20 common side chain types;
describe protein secondary structure elements, and the main chain hydrogen bond patterns associated with these;
discuss factors that affect protein stability;
implement code to compute distances and angles;
describe how protein secondary structure elements can be recognised automatically;
interpret distance maps;
describe the domain assignment problem;
describe approaches to solving the domain assignment problem.

Supplementary Material

Some of the lecture slides are available online (one slide per page, or four slides per page).

For a comprehensive description of protein structure, see:

Richardson, J.S. (1981) The Anatomy and Taxonomy of Protein Structure. Advances in Protein Chemistry, 34, 167-339.
Web version

Within that article, I particularly recommend that you look at the section on Amino Acids and Backbone Conformation, and that you become familiar with the main elements of protein structure, including α-helices, β-sheets and domains.

The 20 common amino acid residues are described in section V of:

Richardson, J.S. and Richardson, D.C. (1989) Principles and Patterns of Protein Conformation. In Prediction of Protein Structure and the Principles of Protein Conformation, G.D. Fasman (ed.) Plenum Press, New York, 1-98.
SpringerLink

The article Abbreviations and Symbols for the Description of the Conformation of Polypeptide Chains (Eur. J. Biochem., 1969, 17, 193-201) contains the Rules prepared by the IUPAC-IUB Commission on Biochemical Nomenclature. The article Nomenclature and Symbolism for Amino Acids and Peptides (Biochem. J., 1984, 219, 345-373) describes the three-letter and one-letter systems for naming amino acid residues.

The source code for the geometry routines mentioned in the lecture is available online.
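
As a rough illustration of the kind of geometry calculation referred to above, the following Python sketch computes a distance, a bond angle and a dihedral (torsion) angle from atomic coordinates. It is not the lecture's own source code; the function names are chosen here for illustration, and points are assumed to be (x, y, z) tuples in Ångströms.

```python
import math

def sub(a, b):
    """Vector a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Length of a 3-vector."""
    return math.sqrt(dot(a, a))

def distance(p, q):
    """Distance between points p and q."""
    return norm(sub(p, q))

def angle(p, q, r):
    """Angle p-q-r in degrees, with q at the vertex."""
    u, v = sub(p, q), sub(r, q)
    c = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def dihedral(p1, p2, p3, p4):
    """Torsion angle about the p2-p3 bond, in degrees (cis = 0, trans = +/-180)."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = norm(b2) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

With these helpers, the backbone angle φ of residue i would be dihedral(C(i-1), N(i), CA(i), C(i)) and ψ would be dihedral(N(i), CA(i), C(i), N(i+1)).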

The DSSP program is widely used to make secondary structure assignments in a standard way. The DSSP home page is maintained by Gert Vriend. DSSP is described in:

Kabsch, W. and Sander, C. (1983) Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577-2637.
doi:10.1002/bip.360221211
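
The hydrogen bond test at the heart of DSSP can be sketched in a few lines. Kabsch and Sander estimate the electrostatic interaction energy between a backbone C=O group and a backbone N-H group, and accept a hydrogen bond when the energy is below -0.5 kcal/mol. The Python below is only an illustration of that criterion, not the DSSP program itself: the function names are invented here, coordinates are assumed to be (x, y, z) tuples in Ångströms, and the position of the H atom is assumed to be already known (it usually has to be built from the backbone geometry).

```python
import math

Q1 = 0.42   # partial charge on C and O, in units of the electron charge
Q2 = 0.20   # partial charge on N and H, in units of the electron charge
F = 332.0   # dimensional factor giving kcal/mol when distances are in Angstroms
HBOND_CUTOFF = -0.5  # kcal/mol; a lower energy counts as a hydrogen bond

def hbond_energy(c, o, n, h):
    """Kabsch-Sander electrostatic energy between a C=O and an N-H group."""
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)

def is_hbond(c, o, n, h):
    """True if the C=O ... H-N pair satisfies the energy criterion."""
    return hbond_energy(c, o, n, h) < HBOND_CUTOFF
```

Secondary structure elements are then recognised from repeating patterns of such bonds, for example a run of bonds from the C=O of residue i to the N-H of residue i+4 in an α-helix.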

Data files for experimentally determined macromolecular structures can be downloaded from the Worldwide Protein Data Bank (wwPDB). Documentation on that site includes a description of PDB File Format.
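
Because the PDB format uses fixed columns, coordinates can be read by slicing each line rather than splitting on whitespace. The sketch below extracts the Cα atom of each residue from the ATOM records of a file. It is a minimal example under stated assumptions: the function name is invented here, only the first model is read, and only the first alternate location (blank or "A") is kept.

```python
def parse_ca_atoms(lines):
    """Return a list of (chain, residue_number, (x, y, z)) for C-alpha atoms."""
    atoms = []
    for line in lines:
        if line.startswith("ENDMDL"):
            break  # assumption: use only the first model of a multi-model file
        if not line.startswith("ATOM  "):
            continue  # skips HETATM records, so calcium ions are not picked up
        if line[12:16].strip() != "CA":
            continue  # columns 13-16 hold the atom name
        if line[16] not in (" ", "A"):
            continue  # column 17 is the alternate location indicator
        chain = line[21]              # column 22
        resseq = int(line[22:26])     # columns 23-26
        xyz = (float(line[30:38]),    # columns 31-38
               float(line[38:46]),    # columns 39-46
               float(line[46:54]))    # columns 47-54
        atoms.append((chain, resseq, xyz))
    return atoms
```

A typical call would be parse_ca_atoms(open(path)), where path is a hypothetical name for a downloaded PDB-format file.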

Several distance maps are shown on The Protein Kinase Resource web site.
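
A distance map is simply the matrix of distances between every pair of residues, usually measured between Cα atoms. The following sketch shows one way to compute such a map and to threshold it into a contact map. The function names and the 8 Å cutoff are assumptions made for this example, not values taken from the lecture.

```python
import math

def distance_map(coords):
    """Square matrix of pairwise distances between (x, y, z) points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]

def contact_map(dmap, cutoff=8.0):
    """Boolean matrix: True where two residues lie within the cutoff (in Angstroms)."""
    return [[d <= cutoff for d in row] for row in dmap]

def render(cmap):
    """Crude text picture of a contact map, one character per residue pair."""
    return "\n".join("".join("#" if c else "." for c in row) for row in cmap)
```

The map is symmetric with zeros on the diagonal. In such a picture, α-helices tend to show up as a thickened band along the diagonal, and β-sheets as off-diagonal stripes running parallel or perpendicular to it.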

The algorithm used by DOMAK to identify structural domains is described in:

Siddiqui, A.S. and Barton, G.J. (1995) Continuous and discontinuous domains: an algorithm for the automatic generation of reliable protein domain definitions. Protein Science, 4, 872-884.
doi:10.1002/pro.5560040507

The algorithm used by STRUDL to identify structural domains is described in:

Wernisch, L., Hunting, M. and Wodak, S.J. (1999) Identification of structural domains in proteins by a graph heuristic. Proteins: Structure, Function and Genetics, 35, 338-352.
doi:10.1002/(SICI)1097-0134(19990515)35:3<338::AID-PROT8>3.0.CO;2-I

Voronoi diagrams are described at MathWorld.

Last Modified: 4 November 2016