An Introduction to Protein Contact Prediction
A fundamental problem in molecular biology is the prediction of the three-dimensional structure of a protein from its amino acid sequence. However, molecular modeling to find the structure is at present intractable and is likely to remain so for some time; hence intermediate steps, such as predicting which residue pairs are in contact, have been developed. Predicted contact pairs have been used for fold prediction, as an initial condition or constraint for molecular modeling, and as a filter to rank multiple models arising from homology modeling. As contact prediction has advanced, it has become more common for 3D structure predictors to integrate contact prediction into structure building, as this often gives information that is orthogonal to that produced by other methods. This chapter shows how evolutionary information contained in protein sequences and multiple sequence alignments can be used to predict protein structure, and reviews the state-of-the-art predictors and their methodologies.
Key words: Protein structure prediction, contact prediction, contact map, multiple sequence alignments, CASP
The authors gratefully acknowledge financial support from the University of Queensland, the ARC Australian Centre for Bioinformatics and the Institute for Molecular Bioscience. The first author would also like to acknowledge the support of Prof. Kevin Burrage's Australian Federation Fellowship.