Toward Quantitative Protein Structure Prediction

  • Teresa Head-Gordon


We review a constrained optimization strategy, known as the antlion method, for protein structure prediction. The method uses neural network predictions of secondary and tertiary structure to systematically deform a protein energy hypersurface so that it retains only a single minimum near the native structure. Successful constrained optimization as applied to protein folding relies on (1) an understanding of the chemistry that distinguishes the native minimum from other metastable structures, (2) the incorporation of such information as robust constraints on the energy function to isolate the native structure minimum, and (3) progress toward a quantitative representation of the potential or free energy function. We discuss our completed work that begins to address these three problem areas as we move toward our goal of quantitative protein structure prediction.
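The idea of deforming an energy surface until a single minimum survives can be illustrated with a toy one-dimensional sketch. This is not the antlion method itself: the energy function, the harmonic penalty, and the names `energy`, `penalized_energy`, `count_local_minima`, `target`, and `strength` are all assumptions chosen for illustration, with `target` standing in for a structural prediction such as a neural network might supply.

```python
import math


def energy(x):
    # Toy rugged "energy surface" with several local minima (illustrative only).
    return math.cos(3.0 * x) + 0.05 * x * x


def penalized_energy(x, target, strength):
    # Hypothetical harmonic penalty pulling x toward a predicted value `target`.
    return energy(x) + strength * (x - target) ** 2


def count_local_minima(f, lo=-5.0, hi=5.0, n=2001):
    # Count interior grid points lower than the left neighbour and
    # no higher than the right neighbour.
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]
    return sum(
        1
        for i in range(1, n - 1)
        if ys[i] < ys[i - 1] and ys[i] <= ys[i + 1]
    )


def grid_argmin(f, lo=-5.0, hi=5.0, n=2001):
    # Location of the lowest sampled value on the grid.
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return min(xs, key=f)


if __name__ == "__main__":
    target, strength = 1.0, 5.0  # assumed "prediction" and penalty weight

    def deformed(x):
        return penalized_energy(x, target, strength)

    print("minima before penalty:", count_local_minima(energy))
    print("minima after penalty: ", count_local_minima(deformed))
    print("surviving minimum near x =", round(grid_argmin(deformed), 3))
```

With a penalty weight of 5.0 the second derivative of the deformed function is positive everywhere, so the deformed surface is convex and keeps one minimum, located close to the assumed target. A real application would face the harder question raised in the abstract: whether the constraints are robust enough to isolate the native minimum rather than some other structure.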


Keywords: Boolean Function; Penalty Function; Hidden Neuron; Protein Structure Prediction; Nonbonded Interaction





Copyright information

© Birkhäuser Boston 1994
