Protein Structure Prediction

Koehl, Patrice

doi:10.1007/978-1-60327-233-9_1

Part of the book series: Handbook of Modern Biophysics (HBBT, volume 3)

Abstract

The molecular basis of life rests on the activity of large biomolecules, mostly nucleic acids (DNA and RNA), carbohydrates, lipids, and proteins. While each of these molecules has its role, there is something special about proteins, as they are the lead performers of cellular functions. This was dramatized by Jacques Monod, who stated that “C’est à ce niveau d’organisation chimique que gît, s’il y en a un, le secret de la vie,” i.e., that it is at this level of organization that lies the secret of life, if there is one [1]. To understand how these molecules function we first need to know their shapes; consequently, structural molecular biology has emerged as a new line of experimental research focused on revealing the structure of these biomolecules. This branch of biology has recently experienced a major uplift through the development of high-throughput structural studies, the structural genomics projects, aimed at developing a comprehensive view of the protein structure universe. All these initiatives are expected to help us unravel the connections between the sequence, structure, and function of a protein. Experimental data at a molecular level are scarce, however; this has led to the development of many modeling initiatives to shed light on these connections. Probably the most famous is the study of the protein-folding problem, the “holy grail” for the structural biology community. Its elusive goal is to predict the detailed three-dimensional structure of a protein from its sequence as well as to decipher the sequence of events the protein goes through to reach its folded state. This chapter is dedicated to the first part of this task, namely the protein structure prediction problem. Efforts to solve the structure prediction problem benefit from two different approaches to science, which differ in the importance they give to experimental data.

References

Monod J. 1973. Le hasard et la nécessité. Paris: Seuil.
Levy Y, Wolynes PG, Onuchic JN. 2004. Protein topology determines binding mechanism. Proc Natl Acad Sci USA 101:511-516.
Plaxco KW, Simons KT, Baker D. 1998. Contact order, transition state placement and the refolding rates of single domain proteins. J Mol Biol 277:985-994.
Alm E, Baker D. 1999. Prediction of protein-folding mechanisms from free energy landscapes derived from native structures. Proc Natl Acad Sci USA 96:11305-11310.
Munoz V, Eaton WA. 1999. A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc Natl Acad Sci USA 96:11311-11316.
Alm E, Morozov AV, Kortemme T, Baker D. 2002. Simple physical models connect theory and experiments in protein folding kinetics. J Mol Biol 322:463-476.
Koehl P, Levitt M. 2002. Protein topology and stability defines the space of allowed sequences. Proc Natl Acad Sci USA 99:1280-1285.
Smalheiser NR. 2002. Informatics and hypothesis-driven research. EMBO Rep 3:702.
Kell DB, Oliver SG. 2003. Here is the evidence, now what is the hypothesis? The complementary role of inductive and hypothesis driven science in the post genomic era. Bioessays 26:99-105.
Liolios K, Mavrommatis K, Tavernarakis N, Kyrpides NC. 2007. The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata. Nucl Acids Res 36:D475-D479.
Bernstein FC, Koetzle TF, William G, Meyer DJ, Brice MD, Rodgers JR. 1977. The protein databank: a computer-based archival file for macromolecular structures. J Mol Biol 112:535-542.
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H. 2000. The Protein Data Bank. Nucl Acids Res 28:235-242.
Schulz GE, Schirmer RH. 1979. Principles of protein structure. New York: Springer-Verlag.
Cantor CR, Schimmel PR. 1980. Biophysical chemistry: the conformation of biological macromolecules. New York: W.H. Freeman Company.
Branden C, Tooze J. 1991. Introduction to protein structure. New York: Garland Publishing.
Creighton TE. 1993. Proteins. New York: W.H. Freeman & Co.
Taylor WR, May ACW, Brown NP, Aszodi A. 2001. Protein structure: geometry, topology and classification. Rep Prog Phys 64:517-590.
Timberlake KC. 2004. General, organic, and biological chemistry: structures of life. San Francisco: Benjamin Cummings.
Brooks C, Karplus M, Pettitt M. 1988. Proteins: a theoretical perspective of dynamics, structure and thermodynamics. Adv Chem Phys 71:1-259.
Kendrew J, Dickerson R, Strandberg B, Hart R, Davies D, Philips D. 1960. Structure of myoglobin: a three dimensional Fourier synthesis at 2 angstrom resolution. Nature (London) 185:422-427.
Perutz M, Rossmann M, Cullis A, Muirhead G, Will G, North A. 1960. Structure of haemoglobin: a three-dimensional Fourier synthesis at 5.5 angstrom resolution, obtained by X-ray analysis. Nature (London) 185:416-422.
Levitt M, Chothia C. 1976. Structural patterns in globular proteins. Nature (London) 261:552-558.
Lesk AM, Chothia C. 1980. How different amino-acid sequences determine similar protein structures: the structure and evolutionary dynamics of the globins. J Mol Biol 136:225-270.
Chothia C, Janin J. 1981. Relative orientation of close packed beta pleated sheets in proteins. Proc Natl Acad Sci USA 78:4146-4150.
Cohen FE, Sternberg MJE, Taylor WR. 1981. Analysis of the tertiary structure of protein beta sheet sandwiches. J Mol Biol 148:253-272.
Chothia C, Janin J. 1982. Orthogonal packing of beta pleated sheets in proteins. Biochemistry 21:3955-3965.
Cohen FE, Sternberg MJE, Taylor WR. 1982. Analysis and prediction of the packing of alpha helices against a beta sheet in the tertiary structure of globular proteins. J Mol Biol 156:821-862.
Chou KC. 1995. A novel approach to predicting protein structural classes in a (20-1)-D amino acid composition space. Proteins: Struct Funct Genet 21:319-344.
Chou KC, Zhang CT. 1995. Prediction of protein structural classes. Crit Rev Biochem Molec Biol 30:275-349.
Bahar I, Atilgan AR, Jernigan RL, Erman B. 1997. Understanding the recognition of protein structural classes by amino acid composition. Proteins: Struct Funct Genet 29:172-185.
Liu WM, Chou KC. 1998. Prediction of protein structural classes by modified Mahalanobis discriminant algorithm. J Prot Chem 17:209-217.
Chou KC, Liu WM, Maggiora GM, Zhang CT. 1998. Prediction and classification of domain structural classes. Proteins: Struct Funct Genet 31:97-103.
Cai YD, Li YX, Chou KC. 2000. Using neural networks for prediction of domain structural classes. Biochim Biophys Acta 1476:1-2.
Zhou GP, Assa-Munt N. 2001. Some insights into protein structural class prediction. Proteins: Struct Funct Genet 44:57-59.
Luo RY, Feng ZP, Liu JK. 2002. Prediction of protein structural class by amino acid and polypeptide composition. Eur J Biochem 269:4219-4225.
Xiao X, Lin W-Z, Chou KC. 2008. Using grey dynamic modeling and pseudo amino acid composition to predict protein structural classes. J Comput Chem 29:2018-2024.
Hutchinson EG, Thornton JM. 1993. The Greek key motif: extraction, classification and analysis. Protein Eng 6:233-245.
Meirovitch H. 2007. Recent developments in methodologies for calculating the entropy and free energy of biological systems by computer simulation. Curr Opin Struct Biol 17:181-186.
Dill KA, Shortle D. 1991. Denatured states of proteins. Annu Rev Biochem 60:795-825.
Cozetto D, Tramontano A. 2005. Relationship between multiple sequence alignments and quality of protein comparative models. Proteins: Struct Funct Genet 58:151-157.
Chothia C, Lesk A. 1986. The relation between the divergence of sequence and structure in proteins. EMBO J 5:823-826.
Flores TP, Orengo C, Moss DS, Thornton J. 1993. Comparison of conformation characteristics in structurally similar protein pairs. Protein Sci 2:1811-1826.
Russel RB, Saqi AS, Sayle RA, Bates PA, Sternberg MJE. 1997. Recognition of analogous and homologous protein folds: analysis of sequence and structure conservation. J Mol Biol 269:423-439.
Sauder JM, Arthur JW, Dunbrack RL. 2000. Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins: Struct Funct Genet 40:6-22.
Lipman DJ, Pearson WR. 1985. Rapid and sensitive protein similarity searches. Science 227:1435-1441.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403-410.
Pearson WR. 1995. Comparison of methods for searching protein sequence databases. Protein Sci 4:1145-1160.
Agarwal P, States DJ. 1998. Comparative accuracy of methods for protein sequence similarity search. Bioinformatics 14:40-47.
Brenner SE, Chothia C, Hubbard TJ. 1998. Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships. Proc Natl Acad Sci USA 95:6073-6078.
Rost B. 1999. Twilight zone of protein sequence alignments. Protein Eng 12:85-94.
Park J, Karplus K, Barrett C, Hughey R, Haussler D, Hubbard T, Chothia C. 1998. Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol 284:1201-1210.
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 25:3389-3402.
Eddy SR. 1996. Hidden Markov models. Curr Opin Struct Biol 6:361-365.
Jones DT. 1997. Progress in protein structure prediction. Curr Opin Struct Biol 7:377-387.
Marchler-Bauer A, Bryant SH. 1997. A measure of success in fold recognition. Trends Biochem Sci 22:236-240.
Levitt M. 1997. Competitive assessment of protein fold recognition and alignment accuracy. Proteins: Struct Funct Genet Suppl 1:92-104.
Godzik A. 2003. Fold recognition methods. Methods Biochem Anal 44:525-546.
Chothia C. 1992. One thousand fold families for the molecular biologist? Nature (London) 357:543.
Sali A, Blundell TL. 1993. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol 234:779-815.
Sanchez R, Sali A. 1997. Evaluation of comparative protein structure modelling by MODELLER-3. Proteins Suppl 1:50-58.
Lemer CMR, Rooman MJ, Wodak SJ. 1995. Protein structure prediction by threading methods: evaluation of current techniques. Proteins: Struct Funct Genet 23:337-355.
Marti-Renom MA, Stuart AC, Fiser A, Sanchez R, Melo F, Sali A. 2000. Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 29:291-325.
Go N, Scheraga HA. 1970. Ring closure and local conformational deformations of chain molecules. Macromolecules 3:178-187.
Palmer KA, Scheraga HA. 1991. Standard-geometry chains fitted to X-ray derived structures: validation of the rigid-geometry approximation, 1: chain closure through a limited search of loop conformations. J Comput Chem 12:505-526.
Wedemeyer WJ, Scheraga HA. 1999. Exact analytical loop closure in proteins using polynomial equations. J Comput Chem 20:819-844.
Bruccoleri RE, Karplus M. 1985. Chain closure with bond angle variations. Macromolecules 18:2767-2773.
Moult J, James MNG. 1986. An algorithm which predicts the conformation of short lengths of chain in proteins. J Mol Graphics 4:180.
Deane CM, Blundell TL. 2000. A novel exhaustive search algorithm for predicting the conformation of polypeptide segments in proteins. Proteins: Struct Funct Genet 40:135-144.
Bruccoleri RE, Karplus M. 1990. Conformational sampling using high-temperature molecular dynamics. Biopolymers 29:1847-1862.
Carlacci L, Englander SW. 1993. The loop problem in proteins: a Monte-Carlo simulated annealing approach. Biopolymers 33:1271-1286.
Ring CS, Cohen FE. 1994. Conformational sampling of loop structures using genetic algorithms. Israel J Chem 34:245-252.
Zheng Q, Rosenfeld R, Vajda S, Delisi C. 1993. Loop closure via bond scaling and relaxation. J Comput Chem 14:556-565.
Zheng Q, Rosenfeld R, Delisi C, Kyle JD. 1994. Multiple copy sampling in protein loop modeling: computational efficiency and sensitivity to dihedral angle perturbations. Protein Sci 3:493-506.
Lavalle SM, Finn PW, Kavraki LE, Latombe JC. 2000. A randomized kinematics-based approach to pharmacophore-constrained conformational search and database screening. J Comput Chem 21:731-747.
Fine RM, Wang H, Shenkin PS, Yarmush DL, Levinthal C. 1996. Predicting antibody hyper-variable loop conformations, II: minimization and molecular dynamics studies of mcp603 from many randomly generated loop conformations. Proteins: Struct Funct Genet 1:342-362.
Canutescu AA, Dunbrack RL. 2003. Cyclic coordinate descent: a robotics algorithm for protein loop closure. Protein Sci 12:963-972.
Jones TA, Thirup S. 1986. Using known substructures in protein model building and crystallography. EMBO J 5:819-822.
Fidelis K, Stern PS, Bacon D, Moult J. 1994. Comparison of systematic search and database methods for constructing segments of protein-structure. Protein Eng 7:953-960.
Kolodny R, Guibas L, Levitt M, Koehl P. 2005. Inverse kinematics in biology: the protein loop closure problem. Int J Rob Res 24:151-163.
Ponder JW, Richards FM. 1987. Tertiary templates for proteins: use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775-791.
Dunbrack RL, Karplus M. 1994. Conformational-analysis of the backbone-dependent rotamer preferences of protein side-chains. Nat Struct Biol 1:334-340.
Pierce NA, Winfree E. 2002. Protein design is NP-hard. Protein Eng 15:779-782.
Chazelle B, Kingsfort C, Singh MA. 2004. A semi-definite programming approach to side-chain positioning with new rounding strategies. INFORMS J Comput 16:380-392.
Desmet J, Maeyer MD, Hazes B, Lasters I. 1992. The dead end elimination theorem and its use in protein side-chain positioning. Nature (London) 356:539-542.
Lasters I, Maeyer MD, Desmet J. 1995. Enhanced dead-end elimination in the search for the global minimum conformation of a collection of protein side chains. Protein Eng 8:815-822.
Goldstein RF. 1994. Efficient rotamer elimination applied to protein side-chains and related spin glasses. Biophys J 66:1335-1340.
Gordon DB, Mayo SL. 1998. Radical performance enhancements for combinatorial optimization algorithms based on the dead-end elimination theorem. J Comput Chem 19:1505-1514.
Looger LL, Hellinga HW. 2001. Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. J Mol Biol 307:429-445.
Holm L, Sander C. 1991. Database algorithm for generating protein backbone and side-chain co-ordinates from a C-alpha trace: Application to model building and detection of co-ordinate errors. J Mol Biol 218:183-194.
Peterson RW, Dutton PL, Wand AJ. 2004. Improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library. Protein Sci 13:735-751.
Lu M, Dousis AD, Ma J. 2008. OPUS-Rota: a fast and accurate method for side-chain modeling. Protein Sci 17:1576-1585.
Xiang Z, Honig B. 2001. Extending the accuracy limits of prediction for side-chain conformations. J Mol Biol 311:421-430.
Samudrala R, Moult J. 1998. A graph theoretic algorithm for comparative modeling of protein structure. J Mol Biol 279:298-302.
Canutescu AA, Shelenkov AA, Dunbrack RL. 2003. A graph theory algorithm for rapid protein side-chain prediction. Protein Sci 12:2001-2014.
Dukka-Bahadur KC, Tomita E, Suzuki J, Akutsu T. 2005. Protein side-chain packing problem: a maximum edge-weight clique algorithmic approach. J Bioinfo Comput Biol 3:103-126.
Koehl P, Delarue M. 1994. Application of a self consistent mean field theory to predict protein side-chains conformation and estimate their conformational entropy. J Mol Biol 239:249-275.
Koehl P, Delarue M. 1996. Mean-field minimization methods for biological macromolecules. Curr Opin Struct Biol 6:222-226.
Koehl P, Delarue M. 1995. A self consistent mean field approach to simultaneous gap closure and side-chain positioning in homology modelling. Nat Struct Biol 2:163-170.
Levitt M, Lifson S. 1969. Refinement of protein conformations using a macromolecular energy minimization procedure. J Mol Biol 46:269-279.
Koehl P, Levitt M. 1999. A brighter future for protein structure prediction. Nat Struct Biol 6:108-111.
Venclovas C, Zemla A, Fidelis K, Moult J. 2003. Assessment of progress over the CASP experiments. Proteins: Struct Funct Genet 53:585-595.
Laskowski RA, Mc Arthur MW, Moss DS, Thornton J. 1993. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst 26:283-291.
Hooft RW, Vriend G, Sander C, Abola EE. 1996. Errors in protein structures. Nature (London) 381:272.
Bowie JU, Lüthy R, Eisenberg D. 1991. A method to identify protein sequences that fold into a known three-dimensional structure. Science 253:164-170.
Lüthy R, Bowie JU, Eisenberg D. 1992. Assessment of protein models with three-dimensional profiles. Nature (London) 356:83-85.
Eisenberg D, Lüthy R, Bowie JU. 1997. VERIFY3D, assessment of protein models with three-dimensional profiles. Methods Enzymol 277:396-404.
Sippl MJ. 1993. Recognition of errors in three-dimensional structures of proteins. Proteins: Struct Funct Genet 17:355-362.
Wiederstein M, Sippl MJ. 2007. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucl Acids Res 35:W407-W410.
Pawlowski M, Gajda MJ, Matlak R, Bujnicki JM. 2008. MetaMQAP: a meta-server for the quality assessment of protein models. BMC Bioinformatics 9:403.
Jones DT. 2001. Evaluating the potential of using fold-recognition models for molecular replacement. Acta Cryst D57:1428-1434.
Rossmann MG. 2001. Molecular replacement: historical background. Acta Crystallogr D Biol Crystallogr 57:1360-1366.
Ilari A, Savino C. 2008. Protein structure determination by x-ray crystallography. Methods Mol Biol 452:63-87.
Taylor G. 2003. The phase problem. Acta Crystallogr D Biol Crystallogr 59:1881-1890.
Friedberg I, Jaroszewski L, Ye Y, Godzik A. 2004. The interplay of fold recognition and experimental structure determination in structural genomics. Curr Opin Struct Biol 14:307-312.
Claude J-B, Suhre K, Notredame C, Claverie J-M, Abergel C. 2004. CaspR: a web server for automated molecular replacement using homology modeling. Nucl Acids Res 32:W606-W609.
Giorgetti A, Raimondo D, Miele AE, Tramontano A. 2005. Evaluating the usefulness of protein structure models for molecular replacement. Bioinformatics 21:72-76.
Qian B, Raman S, Das R, Bradley P, McCoy AJ, Read RJ, Baker D. 2007. High-resolution structure prediction and the crystallographic phase problem. Nature (London) 450:259-264.
Topf M, Sali A. 2005. Combining electron microscopy and comparative protein structure modeling. Curr Opin Struct Biol 15:578-585.
Zheng W, Doniach S. 2002. Protein structure prediction constrained by solution X-ray scattering data and structural homology identification. J Mol Biol 316:173-187.
Chen SW, Pellequer JL. 2004. Identification of functionally important residues in proteins using comparative models. Curr Med Chem 11:595-605.
Skrabanek L, Saini HK, Bader GD, Enright AJ. 2008. Computational prediction of protein-protein interactions. Mol Biotechnol 38:1-17.
Hutchins C, Greer J. 1991. Comparative modeling of proteins in the design of novel renin inhibitors. Crit Rev Biochem Mol Biol 26:77-127.
Hillisch A, Pineda LF, Hilgenfeld R. 2004. Utility of homology models in the drug discovery process. Drug Discovery Today 9:659-669.
Rockey WM, Elcock AH. 2006. Structure selection for protein kinase docking and virtual screening: homology models or crystal structures? Curr Protein Pept Sci 7:437-457.
Villoutreix BO, Renault N, Lagorce D, Sperandio O, Montes M, Miteva MA. 2007. Free resources to assist structure-based virtual ligand screening experiments. Curr Protein Pept Sci 8:381-411.
Roessler CG, Hall BM, Anderson WJ, Ingram WM, Roberts SA, Montfort WR, Cordes MH. 2008. Transitive homology-guided structural studies lead to the discovery of Cro proteins with 40% sequence identity but different folds. Proc Natl Acad Sci USA 105:2343-2348.
Bradley P, Misura KM, Baker D. 2005. Toward high-resolution de novo structure prediction for small proteins. Science 309:1868-1871.
Bonneau R, Baker D. 2001. Ab initio protein structure prediction: progress and prospects. Annu Rev Biophys Biomol Struct 30:173-189.
Hardin C, Pogorelov TV, Luthey-Schulten Z. 2002. Ab initio protein structure prediction. Curr Opin Struct Biol 12:176-181.
Chivian D, Robertson T, Bonneau R, Baker D. 2003. Ab initio methods. Methods Biochem Anal 44:547-557.
Jauch R, Yeo HC, Kolatkar PR, Clarke ND. 2007. Assessment of CASP7 structure predictions for template free targets. Proteins: Struct Funct Genet 69(Suppl 8):57-67.
Dill KA, Ozkan SB, Welkl TR, Chodera JD, Voetz VA. 2007. The protein folding problem: when will it be solved? Curr Opin Struct Biol 17:342-346.
Zhang Y. 2008. Progress and challenges in protein structure prediction. Curr Opin Struct Biol 18:342-348.
Dill KA, Bromberg S, Yue KZ, Fiebig KM, Yee DP, Thomas PD, Chan HS. 1995. Principles of protein folding: a perspective from simple exact models. Protein Sci 4:561-602.
Covell DG, Jernigan RL. 1990. Conformations of folded proteins in restricted space. Biochemistry 29:3287-3294.
Park BH, Levitt M. 1995. The complexity and accuracy of discrete state models of protein structure. J Mol Biol 249:493-507.
Lau KF, Dill K. 1989. A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules 22:3986-3997.
Shakhnovich EI, Gutin AM. 1993. Engineering of stable and fast-folding sequences of model proteins. Proc Natl Acad Sci USA 90:7195-7199.
Go N, Takemoti H. 1978. Respective roles of short- and long-range interactions in protein folding. Proc Natl Acad Sci USA 75:559-563.
Miyazawa S, Jernigan RL. 1985. Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation. Macromolecules 18:534-552.
Chan HS, Dill K. 1989. Compact polymers. Macromolecules 22:4559-4573.
Chan HS, Dill K. 1990. Origins of structure in globular proteins. Proc Natl Acad Sci USA 87:6388-6392.
Karplus M, McCammon JA. 2002. Molecular dynamics simulations of biomolecules. Nat Struct Biol 9:646-652.
Duan Y, Kollman PA. 1998. Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science 282:740-744.
Pitera JW, Swope W. 2003. Understanding folding and design: replica-exchange simulations of "Trp-cage" miniproteins. Proc Natl Acad Sci USA 100:7587-7592.
Lei H, Wu C, Liu H, Duan Y. 2007. Folding free energy landscape of villin headpiece subdomain from molecular dynamic simulations. Proc Natl Acad Sci USA 104:4925-4930.
Zagrovic B, Snow CD, Shirts MR, Pande VS. 2002. Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide-distributed computing. J Mol Biol 323:927-937.
Chou PY, Fasman GD. 1974. Conformational parameters for amino-acids in helical, beta-sheet, and random coil regions calculated from proteins. Biochemistry 13:211-222.
Garnier J, Osguthorpe D, Robson B. 1978. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J Mol Biol 120:97-120.
Heringa J. 2000. Computational methods for protein secondary structure prediction using multiple sequence alignments. Curr Protein Pept Sci 1:273-301.
Rost B. 2001. Review: protein secondary structure prediction continues to rise. J Struct Biol 134:204-218.
Rost B, Eyrich VA. 2001. EVA: large-scale analysis of secondary structure prediction. Proteins: Struct Funct Genet Suppl 5:192-199.
Rost B, Sander C. 1993. Prediction of protein secondary structure at better than 70% accuracy. J Mol Biol 232:584-599.
Montgomerie S, Sundararaj S, Gallin WJ, Wishart DS. 2006. Improving the accuracy of protein secondary structure prediction using structural alignment. BMC Bioinformatics 7:301.
Fain B, Levitt M. 2001. A novel method for sampling alpha-helical protein backbones. J Mol Biol 305:191-201.
Bradley P, Baker D. 2006. Improved beta-protein structure prediction by multilevel optimization of nonlocal strand pairings and local backbone conformation. Proteins: Struct Funct Genet 65:922-929.
Wu GA, Coutsias EA, Dill KA. 2008. Iterative assembly of helical proteins by optimal hydrophobic packing. Structure 16:1257-1266.
Orengo C, Bray J, Hubbard T, Lo Conte L, Sillitoe I. 1999. Analysis and assessment of ab initio three-dimensional prediction, secondary structure, and contacts prediction. Proteins: Struct Funct Genet 37:149-170.
Ortiz AR, Kolinski A, Skolnick J. 1998. Native-like topology assembly of small proteins using predicted restraints in Monte Carlo folding simulations. Proc Natl Acad Sci USA 95:1020-1025.
Rohl CA, Strauss CE, Misura KM, Baker D. 2004. Protein structure prediction using Rosetta. Methods Enzymol 383:66-93.
Das R, Baker D. 2008. Macromolecular modeling with Rosetta. Annu Rev Biochem 77:363-382.
Lazaridis T, Karplus M. 2000. Effective energy functions for protein structure prediction. Currr Opin Struct Biol 10:139-145.
CAS Google Scholar
Huang ES, Samudrala R, Park BH. 2000. Scoring functions for ab initio protein structure prediction. Methods Mol Biol 143:223-245.
CAS PubMed Google Scholar
Ngan S-C, Hung LH, Liu T, Samudrala R. 2008. Scoring functions for de novo protein structure prediction revisited. Methods Mol Biol 413:243-281.
CAS PubMed Google Scholar
Roux B, Simonson T. 1999. Implicit solvent models. Biophys Chem 78:1-20.
CAS PubMed Google Scholar
Koehl P. 2006. Electrostatics calculations: latest methodological advances. Curr Opin Struct Biol 16:142-51.
CAS PubMed Google Scholar
Sippl M. 1990. Calculation of conformational ensembles from potentials of mean force: an approach to the knowledge-based prediction of local structures in globular proteins. J Mol Biol 1990:859-883.
Google Scholar
Sippl M. 1993. Boltzmann’s principle, knowledge-based mean fields and protein folding: an approach to the computational determination of protein structures. J Comput Aided Mol Des 7:473-501.
CAS PubMed Google Scholar
Samudrala R, Moult J. 1998. An all-atom distance dependent conditional probability discriminatory function for protein structure prediction. J Mol Biol 275:895-916.
Moult J, Pedersen JT, Judson RS, Fidelis K. 1995. A large scale experiment to assess protein structure prediction methods. Proteins: Struct Funct Genet 23:R2-R4.
Subbiah S, Laurents DV, Levitt M. 1993. Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core. Curr Biol 3:141-148.

Author information

Authors and Affiliations

Department of Computer Science and Genome Center, University of California, Davis, One Shields Avenue, Davis, 95616, USA
Patrice Koehl

Corresponding author

Correspondence to Patrice Koehl.

Editor information

Editors and Affiliations

Department of Biochemistry and Molecular Medicine, University of California Davis, One Shields Avenue, Davis, CA, 95616, USA
Thomas Jue Ph.D.

1.1 Electronic Supplementary material

Figure 1.1.

Amino acids: the building blocks of proteins. (A) Each amino acid has a mainchain (N, C_α, C, and O) to which a sidechain, schematically represented as R, is attached. The mainchain can itself be partitioned into three groups: the amino group, the central C_α group, and the carboxyl group. Note that even though the amino group and the carboxyl group are charged at neutral pH, the amino acid as a whole is neutral: we say that it is a zwitterion. (B) Amino acids in proteins are attached through planar peptide bonds, connecting atom C of the current residue to atom N of the following residue. Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration.

Figure 1.2.

The three most common arrangements of secondary structure elements (SSE) found in proteins. (A) The regular α–helix is a right–handed helix, in which all residues adopt similar conformations. The α–helix is characterized by hydrogen bonds between the oxygen O of residue i and the polar backbone hydrogen HN (bound to N) of residue i + 4. Note that all C=O and N–HN bonds are parallel to the main axis of the helix. (B) An anti-parallel β–sheet. Two strands (stretches of extended backbone segments) run in an anti-parallel geometry. The atoms HN and O of residue i in the first strand hydrogen bond with the atoms O and HN of residue j in the opposite strand, respectively, while residues i + 1 and j + 1 face outward. (C) A parallel β–sheet. The two strands are parallel, and the atoms HN and O of residue i in the first strand hydrogen bond with the O of residue j and the HN of residue j + 2, respectively. The same alternating pattern of residues, either hydrogen-bonded to the opposite strand or facing outward, is observed in parallel and anti-parallel β–sheets. A strand can therefore be involved in two different sheets. For simplicity, sidechains and non-polar hydrogens are ignored. Figure drawn using Pymol (http://www.pymol.org). Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration.

Figure 1.3.

The three main types of proteins. (A) Collagen is the main protein of connective tissues in animals and the most abundant protein in mammals, making up close to 30% of their body protein content. It is a fiber protein, with each fiber made up of three polypeptide strands possessing the conformation of left-handed helices. These three left-handed helices are twisted together into a right-handed coiled coil, a cooperative quaternary structure stabilized by numerous hydrogen bonds. (B) Bacteriorhodopsin is a mainly α protein, containing seven helices, that crosses the membrane of a cell (a few lipids of the membrane are shown as a space-filling diagram in green). It serves as an ion pump, and is found in bacteria that can survive in high salt concentrations. (C) TIM is a globular protein that belongs to the α–β class. The protein chain alternates between β and α secondary structure types, giving rise to a β–sheet barrel in the center surrounded by a large ring of α–helices on the outside. This structure, first seen in the triose phosphate isomerase of chicken, has since been observed in many unrelated proteins. Figure drawn using Pymol (http://www.pymol.org). Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration.

Figure 1.7.

A self-consistent mean field (SCMF) approach to the problem of predicting sidechain conformation. (A) The multicopy approach. Let us assume that residue i in the protein of interest is a phenylalanine, and that this phenylalanine can adopt three possible conformations. A systematic enumeration of all possible sidechain conformations in the protein would require that all three conformations of phenylalanine i be considered. If the protein contains 100 residues, each with three possible conformations, the size of the corresponding conformational space is 3¹⁰⁰, a number out of reach of modern computers. As an alternative, we construct a chimera molecule, where sidechains are represented as an ensemble of discrete conformations: phenylalanine i is now represented with 3 conformations, each with a weight P(i,j), such that the sum of the weights is 1. (B,C) The mean field. The chimera molecule considered contains all conformations of all sidechains in the protein. The energy of conformation k for residue i includes the internal energy for conformation k, the energy of interaction of conformation k for i with the backbone, and all interactions with all conformations of the remaining sidechains of the protein, each weighted with their probabilities. (D) Updating the probabilities. The initial probabilities are chosen to be uniform. Using the equations given in (C) we get the energies of all conformations of all residues in the chimera protein. These energies are then used to update the probabilities of these conformations. We have shown that updating the probabilities using a Boltzmann law is equivalent to minimizing the total free energy of the chimera molecule [97]. The new probabilities are then used to compute new energies; this procedure is repeated until we reach convergence (“self-consistency”), i.e., when the probabilities and energies do not change anymore. For each residue, we choose the conformation with the highest converged probability as its predicted conformation. Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration.
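
The iteration described in this caption can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation of [97]: the names `scmf`, `predict`, `self_energy`, `pair_energy`, and `kT` are hypothetical, all residues are updated synchronously from the previous probabilities, and no damping is applied (practical implementations may need damping to avoid oscillations).

```python
import math


def scmf(self_energy, pair_energy, kT=1.0, max_iter=200, tol=1e-8):
    """Self-consistent mean-field iteration over discrete conformations.

    self_energy[i][k]: internal plus backbone energy of conformation k
        of residue i (assumed precomputed).
    pair_energy(i, k, j, l): interaction energy between conformation k of
        residue i and conformation l of residue j (hypothetical callback).
    Returns P, where P[i][k] is the weight of conformation k of residue i.
    """
    n = len(self_energy)
    # Step (D): start from uniform probabilities.
    P = [[1.0 / len(e)] * len(e) for e in self_energy]
    for _ in range(max_iter):
        new_P = []
        for i in range(n):
            energies = []
            for k in range(len(self_energy[i])):
                # Step (B, C): mean-field energy, weighting every
                # conformation of every other residue by its probability.
                e = self_energy[i][k]
                for j in range(n):
                    if j == i:
                        continue
                    for l, p in enumerate(P[j]):
                        e += p * pair_energy(i, k, j, l)
                energies.append(e)
            # Boltzmann update; shifting by the minimum avoids overflow.
            e_min = min(energies)
            weights = [math.exp(-(e - e_min) / kT) for e in energies]
            z = sum(weights)
            new_P.append([w / z for w in weights])
        change = max(
            abs(a - b) for pa, pb in zip(P, new_P) for a, b in zip(pa, pb)
        )
        P = new_P
        if change < tol:  # self-consistency reached
            break
    return P


def predict(P):
    """Pick, for each residue, the conformation with the highest weight."""
    return [max(range(len(p)), key=p.__getitem__) for p in P]
```

Each sweep costs on the order of (number of residues × conformations per residue)² energy evaluations, in contrast to the 3¹⁰⁰ combinations that an exhaustive enumeration of the 100-residue example would require.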

Figure 1.8.

Lattice model of a protein structure. The figure depicts an example of a compact self-avoiding structure of a protein chain of 27 “residues” on a regular cubic lattice. This structure contains 28 contacts between non-sequential residues (shown as dashed lines). The total energy of this conformation is the sum of the energies over these contacts. Please visit http://extras.springer.com/ to view a high-resolution full-color version of this illustration.
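
The contact count and energy of such a lattice conformation can be sketched as follows. This is an illustrative sketch only: the function names, the `contact_energy` callback, and the particular "snake" path used to fill the cube are assumptions introduced here, not the conformation drawn in the figure.

```python
def contacts(coords):
    """Return pairs (i, j), j > i + 1, of residues on adjacent lattice sites.

    coords: list of integer (x, y, z) positions, one per residue, in chain
    order. Raises ValueError if two residues share a site.
    """
    index = {tuple(c): i for i, c in enumerate(coords)}
    if len(index) != len(coords):
        raise ValueError("chain is not self-avoiding")
    pairs = []
    for i, (x, y, z) in enumerate(coords):
        # Looking only in the +x, +y, +z directions counts each pair once.
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            j = index.get((x + dx, y + dy, z + dz))
            if j is not None and abs(i - j) > 1:  # skip chain neighbours
                pairs.append((min(i, j), max(i, j)))
    return sorted(pairs)


def energy(coords, sequence, contact_energy):
    """Total energy: sum of contact_energy(a, b) over non-sequential contacts."""
    return sum(
        contact_energy(sequence[i], sequence[j]) for i, j in contacts(coords)
    )


def snake(n=3):
    """A simple compact self-avoiding path filling an n x n x n cube."""
    path = []
    row = 0
    for z in range(n):
        ys = range(n) if z % 2 == 0 else range(n - 1, -1, -1)
        for y in ys:
            xs = range(n) if row % 2 == 0 else range(n - 1, -1, -1)
            path.extend((x, y, z) for x in xs)
            row += 1
    return path


chain = snake(3)                      # 27 residues in a 3 x 3 x 3 cube
n_contacts = len(contacts(chain))     # 28
total = energy(chain, "A" * 27, lambda a, b: -1.0)
```

The value 28 does not depend on the path chosen: a 3 × 3 × 3 cube has 54 pairs of adjacent sites, and the 26 chain bonds of a 27-residue chain use up 26 of them, leaving 28 non-sequential contacts for any fully compact conformation.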

About this chapter

Cite this chapter

Koehl, P. (2010). Protein Structure Prediction. In: Jue, T. (eds) Biomedical Applications of Biophysics. Handbook of Modern Biophysics, vol 3. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60327-233-9_1

DOI: https://doi.org/10.1007/978-1-60327-233-9_1
Published: 13 August 2010
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60327-232-2
Online ISBN: 978-1-60327-233-9
eBook Packages: Biomedical and Life Sciences (R0)
