Protein Structure Prediction

Biomedical Applications of Biophysics

Part of the book series: Handbook of Modern Biophysics (HBBT, volume 3)

Abstract

The molecular basis of life rests on the activity of large biomolecules, mostly nucleic acids (DNA and RNA), carbohydrates, lipids, and proteins. While each of these molecules has its role, there is something special about proteins, as they are the lead performers of cellular functions. This was dramatized by Jacques Monod, who stated that “C’est à ce niveau d’organisation chimique que gît, s’il y en a un, le secret de la vie,” i.e., that it is at this level of chemical organization that the secret of life lies, if there is one [1]. To understand how these molecules function we first need to know their shapes; consequently, structural molecular biology has emerged as a new line of experimental research focused on revealing the structure of these biomolecules. This branch of biology has recently received a major boost from the development of high-throughput structural studies, the structural genomics projects, aimed at developing a comprehensive view of the protein structure universe. All these initiatives are expected to help us unravel the connections between the sequence, structure, and function of a protein. Experimental data at the molecular level remain scarce, however, which has led to the development of many modeling initiatives to shed light on these connections. Probably the most famous is the study of the protein-folding problem, the “holy grail” of the structural biology community. Its elusive goal is to predict the detailed three-dimensional structure of a protein from its sequence, as well as to decipher the sequence of events the protein goes through to reach its folded state. This chapter is dedicated to the first part of this task, namely the protein structure prediction problem. Work on the structure prediction problem benefits from two different approaches to science, which differ in the importance they give to experimental data.

Further Reading

  1. Branden C, Tooze J. 1991. Introduction to protein structure. New York: Garland Publishing.

  2. Creighton TE. 1993. Proteins. New York: W.H. Freeman & Co.

  3. Taylor WR, May ACW, Brown NP, Aszodi A. 2001. Protein structure: geometry, topology and classification. Rep Prog Phys 64:517-590.

  4. Marti-Renom MA, Stuart AC, Fiser A, Sanchez R, Melo F, Sali A. 2000. Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 29:291-325.

  5. Bonneau R, Baker D. 2001. Ab initio protein structure prediction: progress and prospects. Annu Rev Biophys Biomol Struct 30:173-189.

  6. Dill KA, Bromberg S, Yue KZ, Fiebig KM, Yee DP, Thomas PD, Chan HS. 1995. Principles of protein folding: a perspective from simple exact models. Protein Sci 4:561-602.

References

  1. Monod J. 1973. Le hasard et la nécessité. Paris: Seuil.

  2. Levy Y, Wolynes PG, Onuchic JN. 2004. Protein topology determines binding mechanism. Proc Natl Acad Sci USA 101:511-516.

  3. Plaxco KW, Simons KT, Baker D. 1998. Contact order, transition state placement and the refolding rates of single domain proteins. J Mol Biol 277:985-994.

  4. Alm E, Baker D. 1999. Prediction of protein-folding mechanisms from free energy landscapes derived from native structures. Proc Natl Acad Sci USA 96:11305-11310.

  5. Munoz V, Eaton WA. 1999. A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc Natl Acad Sci USA 96:11311-11316.

  6. Alm E, Morozov AV, Kortemme T, Baker D. 2002. Simple physical models connect theory and experiments in protein folding kinetics. J Mol Biol 322:463-476.

  7. Koehl P, Levitt M. 2002. Protein topology and stability defines the space of allowed sequences. Proc Natl Acad Sci USA 99:1280-1285.

  8. Smalheiser NR. 2002. Informatics and hypothesis-driven research. EMBO Rep 3:702.

  9. Kell DB, Oliver SG. 2003. Here is the evidence, now what is the hypothesis? The complementary role of inductive and hypothesis-driven science in the post-genomic era. Bioessays 26:99-105.

  10. Liolios K, Mavrommatis K, Tavernarakis N, Kyrpides NC. 2007. The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata. Nucl Acids Res 36:D475-D479.

  11. Bernstein FC, Koetzle TF, William G, Meyer DJ, Brice MD, Rodgers JR. 1977. The protein databank: a computer-based archival file for macromolecular structures. J Mol Biol 112:535-542.

  12. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H. 2000. The Protein Data Bank. Nucl Acids Res 28:235-242.

  13. Schulz GE, Schirmer RH. 1979. Principles of protein structure. New York: Springer-Verlag.

  14. Cantor CR, Schimmel PR. 1980. Biophysical chemistry: the conformation of biological macromolecules. New York: W.H. Freeman Company.

  15. Branden C, Tooze J. 1991. Introduction to protein structure. New York: Garland Publishing.

  16. Creighton TE. 1993. Proteins. New York: W.H. Freeman & Co.

  17. Taylor WR, May ACW, Brown NP, Aszodi A. 2001. Protein structure: geometry, topology and classification. Rep Prog Phys 64:517-590.

  18. Timberlake KC. 2004. General, organic, and biological chemistry: structures of life. San Francisco: Benjamin Cummings.

  19. Brooks C, Karplus M, Pettitt M. 1988. Proteins: a theoretical perspective of dynamics, structure and thermodynamics. Adv Chem Phys 71:1-259.

  20. Kendrew J, Dickerson R, Strandberg B, Hart R, Davies D, Philips D. 1960. Structure of myoglobin: a three dimensional Fourier synthesis at 2 angstrom resolution. Nature (London) 185:422-427.

  21. Perutz M, Rossmann M, Cullis A, Muirhead G, Will G, North A. 1960. Structure of haemoglobin: a three-dimensional Fourier synthesis at 5.5 angstrom resolution, obtained by X-ray analysis. Nature (London) 185:416-422.

  22. Levitt M, Chothia C. 1976. Structural patterns in globular proteins. Nature (London) 261:552-558.

  23. Lesk AM, Chothia C. 1980. How different amino-acid sequences determine similar protein structures: the structure and evolutionary dynamics of the globins. J Mol Biol 136:225-270.

  24. Chothia C, Janin J. 1981. Relative orientation of close packed beta pleated sheets in proteins. Proc Nat Acad Sci USA 78:4146-4150.

  25. Cohen FE, Sternberg MJE, Taylor WR. 1981. Analysis of the tertiary structure of protein beta sheet sandwiches. J Mol Biol 148:253-272.

  26. Chothia C, Janin J. 1982. Orthogonal packing of beta pleated sheets in proteins. Biochemistry 21:3955-3965.

  27. Cohen FE, Sternberg MJE, Taylor WR. 1982. Analysis and prediction of the packing of alpha helices against a beta sheet in the tertiary structure of globular proteins. J Mol Biol 156:821-862.

  28. Chou KC. 1995. A novel approach to predicting protein structural classes in a (20-1)-D amino acid composition space. Proteins: Struct Funct Genet 21:319-344.

  29. Chou KC, Zhang CT. 1995. Prediction of protein structural classes. Crit Rev Biochem Molec Biol 30:275-349.

  30. Bahar I, Atilgan AR, Jernigan RL, Erman B. 1997. Understanding the recognition of protein structural classes by amino acid composition. Proteins: Struct Funct Genet 29:172-185.

  31. Liu WM, Chou KC. 1998. Prediction of protein structural classes by modified mahalanobis discriminant algorithm. J Prot Chem 17:209-217.

  32. Chou KC, Liu WM, Maggiora GM, Zhang CT. 1998. Prediction and classification of domain structural classes. Proteins: Struct Funct Genet 31:97-103.

  33. Cai YD, Li YX, Chou KC. 2000. Using neural networks for prediction of domain structural classes. Biochim Biophys Acta 1476:1-2.

  34. Zhou GP, Assa-Munt N. 2001. Some insights into protein structural class prediction. Proteins: Struct Funct Genet 44:57-59.

  35. Luo RY, Feng ZP, Liu JK. 2002. Prediction of protein structural class by amino acid and polypeptide composition. Eur J Biochem 269:4219-4225.

  36. Xiao X, Lin W-Z, Chou KC. 2008. Using grey dynamic modeling and pseudo amino acid composition to predict protein structural classes. J Comput Chem 29:2018-2024.

  37. Hutchinson EG, Thornton JM. 1993. The Greek key motif: extraction, classification and analysis. Protein Eng 6:233-245.

  38. Meirovitch H. 2007. Recent developments in methodologies for calculating the entropy and free energy of biological systems by computer simulation. Curr Opin Struct Biol 17:181-186.

  39. Dill KA, Shortle D. 1991. Denatured states of proteins. Annu Rev Biochem 60:795-825.

  40. Cozetto D, Tramontano A. 2005. Relationship between multiple sequence alignments and quality of protein comparative models. Proteins: Struct Funct Genet 58:151-157.

  41. Chothia C, Lesk A. 1986. The relation between the divergence of sequence and structure in proteins. EMBO J 5:823-826.

  42. Flores TP, Orengo C, Moss DS, Thornton J. 1993. Comparison of conformation characteristics in structurally similar protein pairs. Protein Sci 2:1811-1826.

  43. Russel RB, Saqi AS, Sayle RA, Bates PA, Sternberg MJE. 1997. Recognition of analogous and homologous protein folds: analysis of sequence and structure conservation. J Mol Biol 269:423-439.

  44. Sauder JM, Arthur JW, Dunbrack RL. 2000. Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins: Struct Funct Genet 40:6-22.

  45. Lipman DJ, Pearson WR. 1985. Rapid and sensitive protein similarity searches. Science 227:1435-1441.

  46. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403-410.

  47. Pearson WR. 1995. Comparison of methods for searching protein sequence databases. Protein Sci 4:1145-1160.

  48. Agarwal P, States DJ. 1998. Comparative accuracy of methods for protein sequence similarity search. Bioinformatics 14:40-47.

  49. Brenner SE, Chothia C, Hubbard TJ. 1998. Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships. Proc Nat Acad Sci USA 95:6073-6078.

  50. Rost B. 1999. Twilight zone of protein sequence alignments. Protein Eng 12:85-94.

  51. Park J, Karplus K, Barrett C, Hughey R, Haussler D, Hubbard T, Chothia C. 1998. Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol 284:1201-1210.

  52. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 25:3389-3402.

  53. Eddy SR. 1996. Hidden Markov models. Curr Opin Struct Biol 6:361-365.

  54. Jones DT. 1997. Progress in protein structure prediction. Curr Opin Struct Biol 7:377-387.

  55. Marchler-Bauer A, Bryant SH. 1997. A measure of success in fold recognition. Trends Biochem Sci 22:236-240.

  56. Levitt M. 1997. Competitive assessment of protein fold recognition and alignment accuracy. Proteins: Struct Funct Genet Suppl 1:92-104.

  57. Godzik A. 2003. Fold recognition methods. Methods Biochem Anal 44:525-546.

  58. Chothia C. 1992. One thousand fold families for the molecular biologist? Nature (London) 357:543.

  59. Sali A, Blundell TL. 1993. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol 234:779-815.

  60. Sanchez R, Sali A. 1997. Evaluation of comparative protein structure modelling by MODELLER-3. Proteins Suppl 1:50-58.

  61. Lemer CMR, Rooman MJ, Wodak SJ. 1995. Protein structure prediction by threading methods: evaluation of current techniques. Proteins: Struct Funct Genet 23:337-355.

  62. Marti-Renom MA, Stuart AC, Fiser A, Sanchez R, Melo F, Sali A. 2000. Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 29:291-325.

  63. Go N, Scheraga HA. 1970. Ring closure and local conformational deformations of chain molecules. Macromolecules 3:178-187.

  64. Palmer KA, Scheraga HA. 1991. Standard-geometry chains fitted to X-ray derived structures: validation of the rigid-geometry approximation, 1: chain closure through a limited search of loop conformations. J Comput Chem 12:505-526.

  65. Wedemeyer WJ, Scheraga HA. 1999. Exact analytical loop closure in proteins using polynomial equations. J Comput Chem 20:819-844.

  66. Bruccoleri RE, Karplus M. 1985. Chain closure with bond angle variations. Macromolecules 18:2767-2773.

  67. Moult J, James MNG. 1986. An algorithm which predicts the conformation of short lengths of chain in proteins. J Mol Graphics 4:180.

  68. Deane CM, Blundell TL. 2000. A novel exhaustive search algorithm for predicting the conformation of polypeptide segments in proteins. Proteins: Struct Funct Genet 40:135-144.

  69. Bruccoleri RE, Karplus M. 1990. Conformational sampling using high-temperature molecular dynamics. Biopolymers 29:1847-1862.

  70. Carlacci L, Englander SW. 1993. The Loop problem in proteins: a Monte-Carlo simulated annealing approach. Biopolymers 33:1271-1286.

  71. Ring CS, Cohen FE. 1994. Conformational sampling of loop structures using genetic algorithms. Israel J Chem 34:245-252.

  72. Zheng Q, Rosenfeld R, Vajda S, Delisi C. 1993. Loop closure via bond scaling and relaxation. J Comput Chem 14:556-565.

  73. Zheng Q, Rosenfeld R, Delisi C, Kyle JD. 1994. Multiple copy sampling in protein loop modeling: computational efficiency and sensitivity to dihedral angle perturbations. Protein Sci 3:493-506.

  74. Lavalle SM, Finn PW, Kavraki LE, Latombe JC. 2000. A randomized kinematics-based approach to pharmacophore-constrained conformational search and database screening. J Comput Chem 21:731-747.

  75. Fine RM, Wang H, Shenkin PS, Yarmush DL, Levinthal C. 1996. Predicting antibody hypervariable loop conformations, II: minimization and molecular dynamics studies of mcp603 from many randomly generated loop conformations. Proteins: Struct Funct Genet 1:342-362.

  76. Canutescu AA, Dunbrack RL. 2003. Cyclic coordinate descent: a robotics algorithm for protein loop closure. Protein Sci 12:963-972.

  77. Jones TA, Thirup S. 1986. Using known substructures in protein model building and crystallography. EMBO J 5:819-822.

  78. Fidelis K, Stern PS, Bacon D, Moult J. 1994. Comparison of systematic search and database methods for constructing segments of protein-structure. Protein Eng 7:953-960.

  79. Kolodny R, Guibas L, Levitt M, Koehl P. 2005. Inverse kinematics in biology: the protein loop closure problem. Int J Rob Res 24:151-163.

  80. Ponder JW, Richards FM. 1987. Tertiary templates for proteins: use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775-791.

  81. Dunbrack RL, Karplus M. 1994. Conformational-analysis of the backbone-dependent rotamer preferences of protein side-chains. Nat Struct Biol 1:334-340.

  82. Pierce NA, Winfree E. 2002. Protein design is NP-hard. Protein Eng 15:779-782.

  83. Chazelle B, Kingsfort C, Singh MA. 2004. A semi-definite programming approach to side-chain positioning with new rounding strategies. INFORMS J Comput 16:380-392.

  84. Desmet J, Maeyer MD, Hazes B, Lasters I. 1992. The dead end elimination theorem and its use in protein side-chain positioning. Nature (London) 356:539-542.

  85. Lasters I, Maeyer MD, Desmet J. 1995. Enhanced dead-end elimination in the search for the global minimum conformation of a collection of protein side chains. Protein Eng 8:815-822.

  86. Goldstein RF. 1994. Efficient rotamer elimination applied to protein side-chains and related spin glasses. Biophys J 66:1335-1340.

  87. Gordon DB, Mayo SL. 1998. Radical performance enhancements for combinatorial optimization algorithms based on the dead-end elimination theorem. J Comput Chem 19:1505-1514.

  88. Looger LL, Hellinga HW. 2001. Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. J Mol Biol 307:429-445.

  89. Holm L, Sander C. 1991. Database algorithm for generating protein backbone and side-chain co-ordinates from a C-alpha trace: Application to model building and detection of co-ordinate errors. J Mol Biol 218:183-194.

  90. Peterson RW, Dutton PL, Wand AJ. 2004. Improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library. Protein Sci 13:735-751.

  91. Lu M, Dousis AD, Ma J. 2008. OPUS-Rota: a fast and accurate method for side-chain modeling. Protein Sci 17:1576-1585.

  92. Xiang Z, Honig B. 2001. Extending the accuracy limits of prediction for side-chain conformations. J Mol Biol 311:421-430.

  93. Samudrala R, Moult J. 1998. A graph theoretic algorithm for comparative modeling of protein structure. J Mol Biol 279:298-302.

  94. Canutescu AA, Shelenkov AA, Dunbrack RL. 2003. A graph theory algorithm for rapid protein side-chain prediction. Protein Sci 12:2001-2014.

  95. Dukka-Bahadur KC, Tomita E, Suzuki J, Akutsu T. 2005. Protein side-chain packing problem: a maximum edge-weight clique algorithmic approach. J Bioinfo Comput Biol 3:103-126.

  96. Koehl P, Delarue M. 1994. Application of a self consistent mean field theory to predict protein side-chains conformation and estimate their conformational entropy. J Mol Biol 239:249-275.

  97. Koehl P, Delarue M. 1996. Mean-field minimization methods for biological macromolecules. Curr Opin Struct Biol 6:222-226.

  98. Koehl P, Delarue M. 1995. A self consistent mean field approach to simultaneous gap closure and side-chain positioning in homology modelling. Nat Struct Biol 2:163-170.

  99. Levitt M, Lifson S. 1969. Refinement of protein conformations using a macromolecular energy minimization procedure. J Mol Biol 46:269-279.

  100. Koehl P, Levitt M. 1999. A brighter future for protein structure prediction. Nat Struct Biol 6:108-111.

  101. Venclovas C, Zemla A, Fidelis K, Moult J. 2003. Assessment of progress over the CASP experiments. Proteins: Struct Funct Genet 53:585-595.

  102. Laskowski RA, Mc Arthur MW, Moss DS, Thornton J. 1993. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst 26:283-291.

  103. Hooft RW, Vriend G, Sander C, Abola EE. 1996. Errors in protein structures. Nature (London) 381:272.

  104. Bowie JU, Lüthy R, Eisenberg D. 1991. A method to identify protein sequences that fold into a known three-dimensional structure. Science 253:164-170.

  105. Lüthy R, Bowie JU, Eisenberg D. 1992. Assessment of protein models with three-dimensional profiles. Nature (London) 356:83-85.

  106. Eisenberg D, Luthy R, Bowie JU. 1997. VERIFY3D, assessment of protein models with three-dimensional profiles. Methods Enzymol 277:396-404.

  107. Sippl MJ. 1993. Recognition of errors in three-dimensional structures of proteins. Proteins: Struct Funct Genet 17:355-362.

  108. Wiederstein M, Sippl MJ. 2007. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res 35:W407-W410.

  109. Pawlowski M, Gajda MJ, Matlak R, Bujnicki JM. 2008. MetaMQAP: a meta-server for the quality assessment of protein models. BMC Bioinformatics 9:403.

  110. Jones DT. 2001. Evaluating the potential of using fold-recognition models for molecular replacement. Acta Cryst D57:1428-1434.

  111. Rossmann MG. 2001. Molecular replacement—historical background. Acta Crystallogr D Biol Crystallogr 57:1360-1366.

  112. Ilari A, Savino C. 2008. Protein structure determination by x-ray crystallography. Methods Mol Biol 452:63-87.

  113. Taylor G. 2003. The phase problem. Acta Crystallogr D Biol Crystallogr 59:1881-1890.

  114. Friedberg I, Jaroszewski L, Ye Y, Godzik A. 2004. The interplay of fold recognition and experimental structure determination in structural genomics. Curr Opin Struct Biol 14:307-312.

  115. Claude J-B, Suhre K, Notredame C, Claverie J-M, Abergel C. 2004. CaspR: a web server for automated molecular replacement using homology modeling. Nucl Acids Res 32:W606-W609.

  116. Giorgetti A, Raimondo D, Miele AE, Tramontano A. 2005. Evaluating the usefulness of protein structure models for molecular replacement. Bioinformatics 21:72-76.

  117. Qian B, Raman S, Das R, Bradley P, McCoy AJ, Read RJ, Baker D. 2007. High-resolution structure prediction and the crystallographic phase problem. Nature (London) 450:259-264.

  118. Topf M, Sali A. 2005. Combining electron microscopy and comparative protein structure modeling. Curr Opin Struct Biol 15:578-585.

  119. Zheng W, Doniach S. 2002. Protein structure prediction constrained by solution X-ray scattering data and structural homology identification. J Mol Biol 316:173-187.

  120. Chen SW, Pellequer JL. 2004. Identification of functionally important residues in proteins using comparative models. Curr Med Chem 11:595-605.

  121. Skrabanek L, Saini HK, Bader GD, Enright AJ. 2008. Computational prediction of protein-protein interactions. Mol Biotechnol 38:1-17.

  122. Hutchins C, Greer J. 1991. Comparative modeling of proteins in the design of novel renin inhibitors. Crit Rev Biochem Mol Biol 26:77-127.

  123. Hillisch A, Pineda LF, Hilgenfeld R. 2004. Utility of homology models in the drug discovery process. Drug Discovery Today 9:659-669.

  124. Rockey WM, Elcock AH. 2006. Structure selection for protein kinase docking and virtual screening: homology models or crystal structures? Curr Protein Pept Sci 7:437-457.

  125. Villoutreix BO, Renault N, Lagorce D, Sperandio O, Montes M, Miteva MA. 2007. Free resources to assist structure-based virtual ligand screening experiments. Curr Protein Pept Sci 8:381-411.

  126. Roessler CG, Hall BM, Anderson WJ, Ingram WM, Roberts SA, Montfort WR, Cordes MH. 2008. Transitive homology-guided structural studies lead to the discovery of Cro proteins with 40% sequence identity but different folds. Proc Nat Acad Sci USA 105:2343-2348.

  127. Bradley P, Misura KM, Baker D. 2005. Toward high-resolution de novo structure prediction for small proteins. Science 309:1868-1871.

  128. Bonneau R, Baker D. 2001. Ab initio protein structure prediction: progress and prospects. Annu Rev Biophys Biomol Struct 30:173-189.

  129. Hardin C, Pogorelov TV, Luthey-Schulten Z. 2002. Ab initio protein structure prediction. Curr Opin Struct Biol 12:176-181.

  130. Chivian D, Robertson T, Bonneau R, Baker D. 2003. Ab initio methods. Methods Biochem Anal 44:547-557.

  131. Jauch R, Yeo HC, Kolatkar PR, Clarke ND. 2007. Assessment of CASP7 structure predictions for template free targets. Proteins: Struct Funct Genet 69(Suppl 8):57-67.

  132. Dill KA, Ozkan SB, Welkl TR, Chodera JD, Voetz VA. 2007. The protein folding problem: when will it be solved? Curr Opin Struct Biol 17:342-346.

  133. Zhang Y. 2008. Progress and challenges in protein structure prediction. Curr Opin Struct Biol 18:342-348.

  134. Dill KA, Bromberg S, Yue KZ, Fiebig KM, Yee DP, Thomas PD, Chan HS. 1995. Principles of protein folding: a perspective from simple exact models. Protein Sci 4:561-602.

  135. Covell DG, Jernigan RL. 1990. Conformations of folded proteins in restricted space. Biochemistry 29:3287-3294.

  136. Park BH, Levitt M. 1995. The complexity and accuracy of discrete state models of protein structure. J Mol Biol 249:493-507.

  137. Lau KF, Dill K. 1989. A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules 22:3986-3997.

  138. Shakhnovich EI, Gutin AM. 1993. Engineering of stable and fast-folding sequences of model proteins. Proc Natl Acad Sci USA 90:7195-7199.

  139. Go N, Takemoti H. 1978. Respective roles of short- and long-range interactions in protein folding. Proc Nat Acad Sci USA 75:559-563.

  140. Miyazawa S, Jernigan RL. 1985. Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation. Macromolecules 18:534-552.

  141. Chan HS, Dill K. 1989. Compact polymers. Macromolecules 22:4559-4573.

  142. Chan HS, Dill K. 1990. Origins of structure in globular proteins. Proc Nat Acad Sci USA 87:6388-6392.

  143. Karplus M, McCammon JA. 2002. Molecular dynamics simulations of biomolecules. Nat Struct Biol 9:646-652.

  144. Duan Y, Kollman PA. 1998. Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science 282:740-744.

  145. Pitera JW, Swope W. 2003. Understanding folding and design: replica-exchange simulations of "Trp-cage" miniproteins. Proc Nat Acad Sci USA 100:7587-7592.

  146. Lei H, Wu C, Liu H, Duan Y. 2007. Folding free energy landscape of villin headpiece subdomain from molecular dynamic simulations. Proc Nat Acad Sci USA 104:4925-4930.

  147. Zagrovic B, Snow CD, Shirts MR, Pande VS. 2002. Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide-distributed computing. J Mol Biol 323:927-937.

  148. Chou PY, Fasman GD. 1974. Conformational parameters for amino-acids in helical, beta-sheet, and random coil regions calculated from proteins. Biochemistry 13:211-222.

  149. Garnier J, Osguthorpe D, Robson B. 1978. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J Mol Biol 120:97-120.

  150. Heringa J. 2000. Computational methods for protein secondary structure prediction using multiple sequence alignments. Curr Protein Pept Sci 1:273-301.

  151. Rost B. 2001. Review: protein secondary structure prediction continues to rise. J Struct Biol 134:204-218.

  152. Rost B, Eyrich VA. 2001. EVA: large-scale analysis of secondary structure prediction. Proteins: Struct Funct Genet Suppl 5:192-199.

  153. Rost B, Sander C. 1993. Prediction of protein secondary structure at better than 70% accuracy. J Mol Biol 232:584-599.

  154. Montgomerie S, Sundararaj S, Gallin WJ, Wishart DS. 2006. Improving the accuracy of protein secondary structure prediction using structural alignment. BMC Bioinformatics 7:301.

  155. Fain B, Levitt M. 2001. A novel method for sampling alpha-helical protein backbones. J Mol Biol 305:191-201.

  156. Bradley P, Baker D. 2006. Improved beta-protein structure prediction by multilevel optimization of nonlocal strand pairings and local backbone conformation. Proteins: Struct Funct Genet 65:922-929.

  157. Wu GA, Coutsias EA, Dill KA. 2008. Iterative assembly of helical proteins by optimal hydrophobic packing. Structure 16:1257-1266.

  158. Orengo C, Bray J, Hubbard T, Lo Conte L, Sillitoe I. 1999. Analysis and assessment of ab initio three-dimensional prediction, secondary structure, and contacts prediction. Proteins: Struct Funct Genet 37:149-170.

  159. Ortiz AR, Kolinski A, Skolnick J. 1998. Native-like topology assembly of small proteins using predicted restraints in Monte Carlo folding simulations. Proc Nat Acad Sci USA 95:1020-1025.

  160. Rohl CA, Strauss CE, Misura KM, Baker D. 2004. Protein structure prediction using Rosetta. Methods Enzymol 383:66-93.

  161. Das R, Baker D. 2008. Macromolecular modeling with Rosetta. Annu Rev Biochem 77:363-382.

  162. Lazaridis T, Karplus M. 2000. Effective energy functions for protein structure prediction. Curr Opin Struct Biol 10:139-145.

  163. Huang ES, Samudrala R, Park BH. 2000. Scoring functions for ab initio protein structure prediction. Methods Mol Biol 143:223-245.

  164. Ngan S-C, Hung LH, Liu T, Samudrala R. 2008. Scoring functions for de novo protein structure prediction revisited. Methods Mol Biol 413:243-281.

  165. Roux B, Simonson T. 1999. Implicit solvent models. Biophys Chem 78:1-20.

  166. Koehl P. 2006. Electrostatics calculations: latest methodological advances. Curr Opin Struct Biol 16:142-151.

  167. Sippl M. 1990. Calculation of conformational ensembles from potentials of mean force: an approach to the knowledge-based prediction of local structures in globular proteins. J Mol Biol 213:859-883.

  168. Sippl M. 1993. Boltzmann’s principle, knowledge-based mean fields and protein folding: an approach to the computational determination of protein structures. J Comput Aided Mol Des 7:473-501.

  169. Samudrala R, Moult J. 1998. An all-atom distance dependent conditional probability discriminatory function for protein structure prediction. J Mol Biol 275:895-916.

  170. Moult J, Pedersen JT, Judson RS, Fidelis K. 1995. A large scale experiment to assess protein structure prediction methods. Proteins: Struct Funct Genet 23:R2-R4.

  171. Subbiah S, Laurents DV, Levitt M. 1993. Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core. Curr Biol 3:141-148.

Author information

Corresponding author

Correspondence to Patrice Koehl.

1.1 Electronic Supplementary material

Figure 1.1.

Amino acids: the building blocks of proteins. (A) Each amino acid has a mainchain (N, Cα, C, and O) to which a sidechain, schematically represented as R, is attached. The mainchain can itself be partitioned into three groups: the amino group, the central Cα group, and the carboxyl group. Note that even though the amino group and the carboxyl group are both charged at neutral pH, the amino acid is neutral overall: we say that it is a zwitterion. (B) Amino acids in proteins are linked through planar peptide bonds, each connecting atom C of the current residue to atom N of the following residue. Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration. (PDF 2,785 KB)

Figure 1.2.

The three most common arrangements of secondary structure elements (SSE) found in proteins. (A) The regular α-helix is a right-handed helix in which all residues adopt similar conformations. The α-helix is characterized by hydrogen bonds between the oxygen O of residue i and the polar backbone hydrogen HN (bound to N) of residue i + 4. Note that all C=O and N–HN bonds are parallel to the main axis of the helix. (B) An anti-parallel β-sheet. Two strands (stretches of extended backbone) run in an anti-parallel geometry. The atoms HN and O of residue i in the first strand hydrogen bond with the atoms O and HN of residue j in the opposite strand, respectively, while residues i + 1 and j + 1 face outward. (C) A parallel β-sheet. The two strands are parallel, and the atoms HN and O of residue i in the first strand hydrogen bond with the O of residue j and the HN of residue j + 2, respectively. The same alternating pattern, with one residue hydrogen bonded to the opposite strand and the next facing outward, is observed in both parallel and anti-parallel β-sheets. A strand can therefore be involved in two different sheets. For simplicity, sidechains and non-polar hydrogens are ignored. Figure drawn using Pymol (http://www.pymol.org). Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration. (PDF 2,783 KB)

Figure 1.3.

The three main types of proteins. (A) Collagen is the main protein of connective tissues in animals and the most abundant protein in mammals, making up close to 30% of their body protein content. It is a fiber protein, with each fiber made up of three polypeptide strands, each adopting a left-handed helical conformation. These three left-handed helices are twisted together into a right-handed coiled coil, a cooperative quaternary structure stabilized by numerous hydrogen bonds. (B) Bacteriorhodopsin is a mainly α protein, containing seven helices, that crosses the membrane of a cell (a few lipids of the membrane are shown as a space-filling diagram in green). It serves as an ion pump, and is found in bacteria that can survive in high salt concentrations. (C) TIM is a globular protein that belongs to the α-β class. The protein chain alternates between β and α secondary structure elements, giving rise to a β-sheet barrel in the center surrounded by a large ring of α-helices on the outside. This structure, first seen in the triose phosphate isomerase of chicken, has since been observed in many unrelated proteins. Figure drawn using Pymol (http://www.pymol.org). Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration. (PDF 2,792 KB)

Figure 1.7.

A self-consistent mean field (SCMF) approach to the problem of predicting sidechain conformation. (A) The multicopy approach. Let us assume that residue i in the protein of interest is a phenylalanine, and that this phenylalanine can adopt three possible conformations. A systematic enumeration of all possible sidechain conformations in the protein would require that all three conformations of phenylalanine i be considered. If the protein contains 100 residues, each with three possible conformations, the size of the corresponding conformational space is 3^100 (about 5 × 10^47), a number out of reach of modern computers. As an alternative, we construct a chimera molecule, where sidechains are represented as an ensemble of discrete conformations: phenylalanine i is now represented with 3 conformations, each with a weight P(i,j), such that the sum of the weights is 1. (B,C) The mean field. The chimera molecule considered contains all conformations of all sidechains in the protein. The energy of conformation k for residue i includes the internal energy for conformation k, the energy of interaction of conformation k for i with the backbone, and all interactions with all conformations of the remaining sidechains of the protein, each weighted by its probability. (D) Updating the probabilities. The initial probabilities are chosen to be uniform. Using the equations given in (C) we get the energies of all conformations of all residues in the chimera protein. These energies are then used to update the probabilities of these conformations. We have shown that updating the probabilities using a Boltzmann law is equivalent to minimizing the total free energy of the chimera molecule [97]. The new probabilities are then used to compute new energies; this procedure is repeated until we reach convergence (“self-consistency”), i.e., when the probabilities and energies do not change anymore. For each residue, we choose the conformation with the highest converged probability as its predicted conformation. Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration. (PDF 2,802 KB)
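
The iteration described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of reference [97]: the function name `scmf_sidechains`, the nested-list layout of the energy tables, the temperature factor `kT`, and the `damping` term (mixing old and new probabilities to stabilize the iteration) are all hypothetical choices made here, and the toy energies in the usage note are invented.

```python
import math


def scmf_sidechains(self_energy, pair_energy, kT=1.0, damping=0.5,
                    tol=1e-8, max_iter=1000):
    """Toy self-consistent mean field loop for sidechain conformations.

    self_energy[i][k]       : internal + backbone energy of conformation k
                              of residue i.
    pair_energy[i][j][k][l] : interaction energy between conformation k of
                              residue i and conformation l of residue j
                              (entries with i == j are never read).
    Returns (probs, prediction).
    """
    n = len(self_energy)
    # Step (D): start from uniform probabilities for every residue.
    probs = [[1.0 / len(row)] * len(row) for row in self_energy]
    for _ in range(max_iter):
        largest_change = 0.0
        updated = []
        for i in range(n):
            # Steps (B, C): mean-field energy of each conformation k of
            # residue i, with the other sidechains weighted by probability.
            energies = []
            for k in range(len(self_energy[i])):
                e = self_energy[i][k]
                for j in range(n):
                    if j == i:
                        continue
                    e += sum(probs[j][l] * pair_energy[i][j][k][l]
                             for l in range(len(self_energy[j])))
                energies.append(e)
            # Boltzmann update (shifted by the minimum for numerical safety).
            e_min = min(energies)
            weights = [math.exp(-(e - e_min) / kT) for e in energies]
            z = sum(weights)
            # Assumption: damped update, not mentioned in the caption.
            row = [damping * p + (1.0 - damping) * w / z
                   for p, w in zip(probs[i], weights)]
            largest_change = max(
                largest_change,
                max(abs(a - b) for a, b in zip(row, probs[i])))
            updated.append(row)
        probs = updated
        if largest_change < tol:  # self-consistency reached
            break
    # Predicted conformation: highest converged probability per residue.
    prediction = [max(range(len(row)), key=row.__getitem__) for row in probs]
    return probs, prediction
```

As a usage example, take two residues with two conformations each, where conformation 0 of both residues clash: `self_energy = [[0.0, 0.5], [0.0, 2.0]]`, `clash = [[5.0, 0.0], [0.0, 0.0]]`, `pair_energy = [[None, clash], [clash, None]]`. Calling `scmf_sidechains(self_energy, pair_energy)` moves the first residue to its second conformation to avoid the clash, while the second residue keeps its first.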

Figure 1.8.

Lattice model of a protein structure. The figure depicts an example of a compact self-avoiding structure of a protein chain of 27 “residues” on a regular cubic lattice. This structure contains 28 contacts between non-sequential residues (shown as dashed lines). The total energy of this conformation is the sum of the energies over these contacts. Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration. (PDF 2,789 KB)
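
The contact count and energy described in this caption can be checked with a short Python sketch. The conformation used here is an assumption: a simple "snake" path filling a 3 × 3 × 3 cube, which is not necessarily the structure drawn in the figure, and the function names and the uniform contact energy of -1 are hypothetical. Any compact 27-residue chain on this cube has 54 nearest-neighbor pairs, 26 of which are chain bonds, leaving 28 non-sequential contacts.

```python
from itertools import combinations


def snake_chain(size=3):
    """Hypothetical compact conformation: a snake path filling a cube."""
    chain = []
    for z in range(size):
        ys = range(size) if z % 2 == 0 else range(size - 1, -1, -1)
        for y in ys:
            # Alternate the x direction on every row so that consecutive
            # residues always sit on neighboring lattice sites.
            row_index = len(chain) // size
            xs = range(size) if row_index % 2 == 0 else range(size - 1, -1, -1)
            for x in xs:
                chain.append((x, y, z))
    return chain


def lattice_distance(a, b):
    """Manhattan distance; 1 means nearest neighbors on the cubic lattice."""
    return sum(abs(p - q) for p, q in zip(a, b))


def is_self_avoiding_walk(chain):
    """No site is visited twice and consecutive residues are neighbors."""
    if len(set(chain)) != len(chain):
        return False
    return all(lattice_distance(a, b) == 1 for a, b in zip(chain, chain[1:]))


def contacts(chain):
    """Pairs (i, j) of non-sequential residues on neighboring sites."""
    return [(i, j) for i, j in combinations(range(len(chain)), 2)
            if j - i > 1 and lattice_distance(chain[i], chain[j]) == 1]


def total_energy(chain, contact_energy):
    """Sum of contact energies over all non-sequential contacts."""
    return sum(contact_energy(i, j) for i, j in contacts(chain))
```

With `chain = snake_chain()`, `len(contacts(chain))` gives the 28 contacts quoted in the caption; a sequence-dependent model would replace the uniform `contact_energy` with a function of the residue types at positions i and j.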

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Koehl, P. (2010). Protein Structure Prediction. In: Jue, T. (eds) Biomedical Applications of Biophysics. Handbook of Modern Biophysics, vol 3. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60327-233-9_1
