Summary
Hidden Markov Models (HMMs) are an extremely versatile statistical representation that can be used to model any set of one-dimensional discrete symbol data. HMMs can model protein sequences in many ways, depending on what features of the protein are represented by the Markov states. For protein structure prediction, states have been chosen to represent homologous sequence positions, local or secondary structure types, or transmembrane locality. The resulting models can be used to predict common ancestry, secondary or local structure, or membrane topology by applying one of the two standard algorithms for comparing a sequence to a model. In this chapter, we review those algorithms and discuss how HMMs have been constructed and refined for the purpose of protein structure prediction.
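The summary does not name the two standard algorithms. A minimal sketch, assuming they are the forward algorithm (the total probability of a sequence given the model) and the Viterbi algorithm (the single most probable state path), might look like the following. The two-state model, its reduced "h"/"p" alphabet, and all probabilities are hypothetical values chosen for illustration; they are not taken from the chapter.

```python
# Hypothetical two-state toy HMM: "M" = membrane-like, "L" = loop-like.
# All numbers below are illustrative assumptions, not trained parameters.
STATES = ("M", "L")
START = {"M": 0.5, "L": 0.5}
TRANS = {"M": {"M": 0.9, "L": 0.1}, "L": {"M": 0.1, "L": 0.9}}
# Emissions over a reduced alphabet: "h" = hydrophobic, "p" = polar.
EMIT = {"M": {"h": 0.8, "p": 0.2}, "L": {"h": 0.3, "p": 0.7}}


def forward(seq):
    """Return P(seq | model), summed over all possible state paths."""
    f = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for x in seq[1:]:
        f = {
            s: EMIT[s][x] * sum(f[r] * TRANS[r][s] for r in STATES)
            for s in STATES
        }
    return sum(f.values())


def viterbi(seq):
    """Return (most probable state path, joint probability of path and seq)."""
    v = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    back = []  # one back-pointer table per position after the first
    for x in seq[1:]:
        ptr, nv = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: v[r] * TRANS[r][s])
            ptr[s] = best
            nv[s] = v[best] * TRANS[best][s] * EMIT[s][x]
        back.append(ptr)
        v = nv
    last = max(STATES, key=lambda s: v[s])
    path = [last]
    for ptr in reversed(back):  # trace back from the final state
        path.append(ptr[path[-1]])
    return "".join(reversed(path)), v[last]


if __name__ == "__main__":
    seq = "hhhhpppp"
    print("forward probability:", forward(seq))
    print("Viterbi path:", viterbi(seq)[0])
```

With these toy parameters, the Viterbi path for `hhhhpppp` labels the hydrophobic stretch as `M` and the polar stretch as `L`. Plain probabilities are used here for readability; for sequences of realistic length, an implementation would normally work in log space or rescale at each position to avoid numerical underflow.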
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant DBI-0448072 to C.B.
Copyright information
© 2008 Humana Press Inc
About this protocol
Cite this protocol
Bystroff, C., Krogh, A. (2008). Hidden Markov Models for Prediction of Protein Features. In: Zaki, M.J., Bystroff, C. (eds) Protein Structure Prediction. Methods in Molecular Biology™, vol 413. Humana Press. https://doi.org/10.1007/978-1-59745-574-9_7
DOI: https://doi.org/10.1007/978-1-59745-574-9_7
Publisher Name: Humana Press
Print ISBN: 978-1-58829-752-5
Online ISBN: 978-1-59745-574-9
eBook Packages: Springer Protocols