Bayesian Protein Structure Prediction

  • Conference paper
Case Studies in Bayesian Statistics

Part of the book series: Lecture Notes in Statistics (LNS, volume 162)

Abstract

An important role for statisticians in the age of the Human Genome Project has developed in the emerging area of “structural bioinformatics”. Sequence analysis and structure prediction for biopolymers is a crucial step on the path to turning newly sequenced genomic data into biologically and pharmaceutically relevant information in support of molecular medicine. We describe our work on Bayesian models for prediction of protein structure from sequence, based on analysis of a database of experimentally determined protein structures. We have previously developed segment-based models of protein secondary structure which capture fundamental aspects of the protein folding process. These models provide predictive performance at the level of the best available methods in the field (Schmidler et al., 2000). Here we show that this Bayesian framework is naturally generalized to incorporate information based on non-local sequence interactions. We demonstrate this idea by presenting a simple model for β-strand pairing and a Markov chain Monte Carlo (MCMC) algorithm for inference. We apply the approach to prediction of 3-dimensional contacts for two example proteins.

References

  • Asai, K., Hayamizu, S., and Handa, K. (1993). Prediction of protein secondary structure by the hidden Markov model. Comp. Appl. Biosci., 9(2):141–146.

  • Aurora, R. and Rose, G. D. (1998). Helix capping. Prot. Sci., 7:21–38.

  • Baldwin, R. L. and Rose, G. D. (1999). Is protein folding hierarchic? I. Local structure and peptide folding. Trends Biochem. Sci., 24:26–33.

  • Barton, G. J. (1995). Protein secondary structure prediction. Curr. Opin. Struct. Biol., 5:372–376.

  • Burley, S. K., Almo, S. C., Bonanno, J. B., Capel, M., Chance, M. R., Gaasterland, T., Lin, D., Sali, A., Studier, F. W., and Swaminathan, S. (1999). Structural genomics: Beyond the Human Genome Project. Nat. Genet., 23:151–157.

  • Cohen, B. I., Presnell, S. R., and Cohen, F. E. (1993). Origins of structural diversity within sequentially identical hexapeptides. Prot. Sci., 2:2134–2145.

  • Collins, F. S., Patrinos, A., Jordan, E., Chakravarti, A., Gesteland, R., and Walters, L. (1998). New goals for the U.S. Human Genome Project: 1998–2003. Science, 282:682–689.

  • Dill, K. A. (1999). Polymer principles and protein folding. Prot. Sci., 8:1166–1180.

  • Eyrich, V. A., Standley, D. M., and Friesner, R. A. (1999). Prediction of protein tertiary structure to low resolution: Performance for a large and structurally diverse test set. J. Mol. Biol., 288:725–742.

  • Fischer, D. and Eisenberg, D. (1996). Protein fold recognition using sequence-derived predictions. Prot. Sci., 5:947–955.

  • Frishman, D. and Argos, P. (1996). Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence. Prot. Eng., 9(2):133–142.

  • Gilks, W. R., Richardson, S., and Spiegelhalter, D. J., editors (1996). Markov Chain Monte Carlo in Practice. Chapman & Hall.

  • Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82(4):711–732.

  • Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57:97–109.

  • Hubbard, T. J. and Park, J. (1995). Fold recognition and ab initio structure predictions using hidden Markov models and β-strand pair potentials. Proteins: Struct. Funct. Genet., 23:398–402.

  • Hutchinson, E. G., Sessions, R. B., Thornton, J. M., and Woolfson, D. N. (1998). Determinants of strand register in antiparallel β-sheets of proteins. Prot. Sci., 7:2287–2300.

  • Kabsch, W. and Sander, C. (1984). On the use of sequence homologies to predict protein structure: Identical pentapeptides can have completely different conformations. Proc. Natl. Acad. Sci. USA, 81(4):1075–1078.

  • King, R. D. and Sternberg, M. J. E. (1996). Identification and application of the concepts important for accurate and reliable protein secondary structure prediction. Prot. Sci., 5:2298–2310.

  • Klingler, T. M. and Brutlag, D. L. (1994). Discovering structural correlations in α-helices. Prot. Sci., 3:1847–1857.

  • Krogh, A. and Riis, S. K. (1996). Prediction of beta sheets in proteins. In Touretzky, D. S., Mozer, M. C., and Hasselmo, M. E., editors, Advances in Neural Information Processing Systems 8. MIT Press.

  • Lifson, S. and Sander, C. (1980). Specific recognition in the tertiary structure of β-sheets of proteins. J. Mol. Biol., 139:627–639.

  • Liu, J. S. (1996). Peskun’s theorem and a modified discrete-state Gibbs sampler. Biometrika, 83:681–682.

  • Minor, D. L., Jr. and Kim, P. S. (1996). Context-dependent secondary structure formation of a designed protein sequence. Nature, 380:730–734.

  • Monge, A., Friesner, R. A., and Honig, B. (1994). An algorithm to generate low-resolution protein tertiary structures from knowledge of secondary structure. Proc. Natl. Acad. Sci. USA, 91:5027–5029.

  • Montelione, G. T. and Anderson, S. (1999). Structural genomics: Keystone for a Human Proteome Project. Nat. Struct. Biol., 6:11–12.

  • Neumaier, A. (1997). Molecular modeling of proteins and mathematical prediction of protein structure. SIAM Rev., 39(3):407–460.

  • Russell, R. B., Copley, R. R., and Barton, G. J. (1996). Protein fold recognition by mapping predicted secondary structures. J. Mol. Biol., 259:349–365.

  • Schmidler, S. C. (2000). Statistical Models and Monte Carlo Methods for Protein Structure Prediction. PhD thesis, Stanford University.

  • Schmidler, S. C., Liu, J. S., and Brutlag, D. L. (2000). Bayesian segmentation of protein secondary structure. J. Comp. Biol., 7(1):233–248.

  • Stultz, C. M., White, J. V., and Smith, T. F. (1993). Structural analysis based on state-space modeling. Prot. Sci., 2:305–314.

Copyright information

© 2002 Springer Science+Business Media New York

About this paper

Cite this paper

Schmidler, S.C., Liu, J.S., Brutlag, D.L. (2002). Bayesian Protein Structure Prediction. In: Gatsonis, C., et al. Case Studies in Bayesian Statistics. Lecture Notes in Statistics, vol 162. Springer, New York, NY. https://doi.org/10.1007/978-1-4613-0035-9_10

  • DOI: https://doi.org/10.1007/978-1-4613-0035-9_10

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-95169-0

  • Online ISBN: 978-1-4613-0035-9

  • eBook Packages: Springer Book Archive
