Hidden Markov Models for Prediction of Protein Features

Bystroff, Christopher; Krogh, Anders

doi:10.1007/978-1-59745-574-9_7

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 413)

Summary

Hidden Markov Models (HMMs) are an extremely versatile statistical representation that can be used to model any set of one-dimensional discrete symbol data. HMMs can model protein sequences in many ways, depending on what features of the protein are represented by the Markov states. For protein structure prediction, states have been chosen to represent homologous sequence positions, local or secondary structure types, or transmembrane locality. The resulting models can be used to predict common ancestry, secondary or local structure, or membrane topology by applying one of the two standard algorithms for comparing a sequence to a model. In this chapter, we review those algorithms and discuss how HMMs have been constructed and refined for the purpose of protein structure prediction.
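
The summary does not name the two standard algorithms. A minimal sketch follows, assuming they are the forward algorithm (total probability of a sequence given the model) and the Viterbi algorithm (most probable state path), as in standard HMM treatments. The two-state toy model, its state names ("M" for membrane, "O" for other), the two-letter alphabet ("h" for hydrophobic, "p" for polar), and all probabilities are hypothetical illustrations, not values from the chapter.

```python
import math

# Hypothetical toy model: every name and number here is an illustrative assumption.
STATES = ("M", "O")
START = {"M": 0.5, "O": 0.5}
TRANS = {"M": {"M": 0.9, "O": 0.1}, "O": {"M": 0.1, "O": 0.9}}
EMIT = {"M": {"h": 0.8, "p": 0.2}, "O": {"h": 0.3, "p": 0.7}}


def forward(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Probability of obs under the model, summed over all state paths."""
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit[s][symbol] * sum(alpha[r] * trans[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


def viterbi(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Most probable state path for obs and its joint log-probability."""
    score = {s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}
    back = []  # one dict of back-pointers per position after the first
    for symbol in obs[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            best = max(states, key=lambda r: score[r] + math.log(trans[r][s]))
            pointers[s] = best
            new_score[s] = (
                score[best] + math.log(trans[best][s]) + math.log(emit[s][symbol])
            )
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


if __name__ == "__main__":
    seq = "pppphhhhhhpppp"
    labels, log_p = viterbi(seq)
    print("".join(labels), log_p, math.log(forward(seq)))
```

In this sketch the forward probability is the quantity one would use to score how well a sequence fits a model, while the Viterbi path assigns one state label to each position. For long sequences the forward recursion would normally be carried out in log space or with scaling to avoid numerical underflow; that is omitted here for brevity.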

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant DBI-0448072 to C.B.

Editor information

Editors and Affiliations

Rensselaer Polytechnic Institute, Troy, New York, USA
Mohammed J. Zaki & Christopher Bystroff

About this protocol

Cite this protocol

Bystroff, C., Krogh, A. (2008). Hidden Markov Models for Prediction of Protein Features. In: Zaki, M.J., Bystroff, C. (eds) Protein Structure Prediction. Methods in Molecular Biology™, vol 413. Humana Press. https://doi.org/10.1007/978-1-59745-574-9_7

DOI: https://doi.org/10.1007/978-1-59745-574-9_7
Publisher Name: Humana Press
Print ISBN: 978-1-58829-752-5
Online ISBN: 978-1-59745-574-9
eBook Packages: Springer Protocols
