Abstract
An important role for statisticians in the age of the Human Genome Project has developed in the emerging area of “structural bioinformatics”. Sequence analysis and structure prediction for biopolymers are crucial steps on the path to turning newly sequenced genomic data into biologically and pharmaceutically relevant information in support of molecular medicine. We describe our work on Bayesian models for prediction of protein structure from sequence, based on analysis of a database of experimentally determined protein structures. We have previously developed segment-based models of protein secondary structure which capture fundamental aspects of the protein folding process. These models provide predictive performance at the level of the best available methods in the field (Schmidler et al., 2000). Here we show that this Bayesian framework generalizes naturally to incorporate information based on non-local sequence interactions. We demonstrate this idea by presenting a simple model for β-strand pairing and a Markov chain Monte Carlo (MCMC) algorithm for inference. We apply the approach to prediction of three-dimensional contacts for two example proteins.
References
Asai, K., Hayamizu, S., and Handa, K. (1993). Prediction of protein secondary structure by the hidden Markov model. Comp. Appl. Biosci., 9(2):141–146.
Aurora, R. and Rose, G. D. (1998). Helix capping. Prot. Sci., 7:21–38.
Baldwin, R. L. and Rose, G. D. (1999). Is protein folding hierarchic? I. Local structure and peptide folding. Trends Biochem. Sci., 24:26–33.
Barton, G. J. (1995). Protein secondary structure prediction. Curr. Opin. Struct. Biol., 5:372–376.
Burley, S. K., Almo, S. C., Bonanno, J. B., Capel, M., Chance, M. R., Gaasterland, T., Lin, D., Sali, A., Studier, F. W., and Swaminathan, S. (1999). Structural genomics: Beyond the Human Genome Project. Nat. Genet., 23:151–157.
Cohen, B. I., Presnell, S. R., and Cohen, F. E. (1993). Origins of structural diversity within sequentially identical hexapeptides. Prot. Sci., 2:2134–2145.
Collins, F. S., Patrinos, A., Jordan, E., Chakravarti, A., Gesteland, R., and Walters, L. (1998). New goals for the U.S. Human Genome Project: 1998–2003. Science, 282:682–689.
Dill, K. A. (1999). Polymer principles and protein folding. Prot. Sci., 8:1166–1180.
Eyrich, V. A., Standley, D. M., and Friesner, R. A. (1999). Prediction of protein tertiary structure to low resolution: Performance for a large and structurally diverse test set. J. Mol. Biol., 288:725–742.
Fischer, D. and Eisenberg, D. (1996). Protein fold recognition using sequence-derived predictions. Prot. Sci., 5:947–955.
Frishman, D. and Argos, P. (1996). Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence. Prot. Eng., 9(2):133–142.
Gilks, W. R., Richardson, S., and Spiegelhalter, D. J., editors (1996). Markov Chain Monte Carlo in Practice. Chapman & Hall.
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82(4):711–732.
Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57:97–109.
Hubbard, T. J. and Park, J. (1995). Fold recognition and ab initio structure predictions using hidden Markov models and β-strand pair potentials. Proteins: Struct. Funct. Genet., 23:398–402.
Hutchinson, E. G., Sessions, R. B., Thornton, J. M., and Woolfson, D. N. (1998). Determinants of strand register in antiparallel β-sheets of proteins. Prot. Sci., 7:2287–2300.
Kabsch, W. and Sander, C. (1984). On the use of sequence homologies to predict protein structure: Identical pentapeptides can have completely different conformations. Proc. Natl. Acad. Sci. USA, 81(4):1075–1078.
King, R. D. and Sternberg, M. J. E. (1996). Identification and application of the concepts important for accurate and reliable protein secondary structure prediction. Prot. Sci., 5:2298–2310.
Klingler, T. M. and Brutlag, D. L. (1994). Discovering structural correlations in α-helices. Prot. Sci., 3:1847–1857.
Krogh, A. and Riis, S. K. (1996). Prediction of beta sheets in proteins. In Touretzky, D. S., Mozer, M. C., and Hasselmo, M. E., editors, Advances in Neural Information Processing Systems 8. MIT Press.
Lifson, S. and Sander, C. (1980). Specific recognition in the tertiary structure of β-sheets of proteins. J. Mol. Biol., 139:627–639.
Liu, J. S. (1996). Peskun’s theorem and a modified discrete-state Gibbs sampler. Biometrika, 83:681–682.
Minor, D. L., Jr. and Kim, P. S. (1996). Context-dependent secondary structure formation of a designed protein sequence. Nature, 380:730–734.
Monge, A., Friesner, R. A., and Honig, B. (1994). An algorithm to generate low-resolution protein tertiary structures from knowledge of secondary structure. Proc. Natl. Acad. Sci. USA, 91:5027–5029.
Montelione, G. T. and Anderson, S. (1999). Structural genomics: Keystone for a Human Proteome Project. Nat. Struct. Biol., 6:11–12.
Neumaier, A. (1997). Molecular modeling of proteins and mathematical prediction of protein structure. SIAM Rev., 39(3):407–460.
Russell, R. B., Copley, R. R., and Barton, G. J. (1996). Protein fold recognition by mapping predicted secondary structures. J. Mol. Biol., 259:349–365.
Schmidler, S. C. (2000). Statistical Models and Monte Carlo Methods for Protein Structure Prediction. PhD thesis, Stanford University.
Schmidler, S. C., Liu, J. S., and Brutlag, D. L. (2000). Bayesian segmentation of protein secondary structure. J. Comp. Biol., 7(1):233–248.
Stultz, C. M., White, J. V., and Smith, T. F. (1993). Structural analysis based on state-space modeling. Prot. Sci., 2:305–314.
Copyright information
© 2002 Springer Science+Business Media New York
About this paper
Cite this paper
Schmidler, S.C., Liu, J.S., Brutlag, D.L. (2002). Bayesian Protein Structure Prediction. In: Gatsonis, C., et al. Case Studies in Bayesian Statistics. Lecture Notes in Statistics, vol 162. Springer, New York, NY. https://doi.org/10.1007/978-1-4613-0035-9_10
DOI: https://doi.org/10.1007/978-1-4613-0035-9_10
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-95169-0
Online ISBN: 978-1-4613-0035-9
eBook Packages: Springer Book Archive