Computational Complexity, Protein Structure Prediction, and the Levinthal Paradox

  • J. Thomas Ngo
  • Joe Marks
  • Martin Karplus


A protein molecule is a covalent chain of amino acid residues. Although it is topologically linear, under physiological conditions it folds into a unique (though flexible) three-dimensional structure. This structure, which has been determined by X-ray crystallography and nuclear magnetic resonance for many proteins (Bernstein et al., 1977; Abola et al., 1987), is referred to as the native structure. As demonstrated by the experiments of Anfinsen and co-workers (Anfinsen et al., 1961; Anfinsen, 1973), at least some protein molecules, when denatured (unfolded) by disrupting conditions in their environment (such as acidity or high temperature), can spontaneously refold to their native structures when proper physiological conditions are restored. Thus, all of the information necessary to determine the native structure can be contained in the amino acid sequence.






  1. Abola EE, Bernstein FC, Bryant SH, Koetzle TF, Weng J (1987): Protein data bank. In Allen FH, Bergerhoff G, Sievers R, eds. Crystallographic Databases-Information Content, Software Systems, Scientific Applications, pp. 107–132. Data Commission of the International Union of Crystallography, Bonn/Cambridge/Chester
  2. Aho AV, Hopcroft JE, Ullman JD (1974): The Design and Analysis of Computer Algorithms. Reading, MA: Addison-Wesley
  3. Aho AV, Hopcroft JE, Ullman JD (1982): Data Structures and Algorithms. Reading, MA: Addison-Wesley
  4. Amara P, Hsu D, Straub JE (1993): Global energy minimum searches using an approximate solution of the imaginary time Schrödinger equation. Journal of Physical Chemistry 97:6715–6721
  5. Anfinsen CB (1973): Principles that govern the folding of protein chains. Science 181(4096):223–230
  6. Anfinsen CB, Haber E, Sela M, White FH (1961): The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proceedings of the National Academy of Sciences, USA 47:1309–1314
  7. Arora S, Lund C, Motwani R, Sudan M, Szegedy M (1992): Proof verification and hardness of approximation problems. In Thirty-Third Annual Symposium on Foundations of Computer Science (FOCS)
  8. Baldwin RL (1989): How does protein folding get started? Trends Biochem Sci 14:291–294
  9. Baraff D (1991): Coping with friction for non-penetrating rigid body simulation. Computer Graphics 25(4):31–40
  10. Barahona F (1982): On the computational complexity of Ising spin glass models. Journal of Physics A: Mathematical and General 15:3241–3253
  11. Berg BA, Neuhaus T (1992): Multicanonical ensemble: A new approach to simulate first-order phase transitions. Physical Review Letters 68(1):9–12
  12. Bernstein FC, Koetzle TF, Williams GJB, Meyer EF Jr., Brice MD, Rodgers JR, Kennard O, Shimanouchi T, Tasumi M (1977): The protein data bank: A computer-based archival file for macromolecular structures. Journal of Molecular Biology 112:535–542
  13. Bierzynski A, Kim PS, Baldwin RL (1982): A salt bridge stabilizes the helix formed by isolated C-peptide of RNAse A. Proceedings of the National Academy of Sciences, USA 79:2470–2474
  14. Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M (1983): CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry 4(2):187–217
  15. Brown JE, Klee WA (1971): Helix-coil transition of the isolated amino terminus. Biochemistry 10(3):470–476
  16. Bruccoleri RE, Karplus M (1987): Prediction of the folding of short polypeptide segments by uniform conformational sampling. Biopolymers 26:137–168
  17. Brünger AT, Clore GM, Gronenborn AM, Karplus M (1986): Three-dimensional structure of proteins determined by molecular dynamics with interproton distance restraints: Application to crambin. Proceedings of the National Academy of Sciences, USA 83:3801–3805
  18. Bryngelson JD, Wolynes PG (1989): Intermediates and barrier crossing in a random energy model (with applications to protein folding). Journal of Physical Chemistry 93:6902–6915
  19. Caflisch A, Miranker A, Karplus M (1993): Multiple copy simultaneous search and construction of ligands in binding sites: Application to inhibitors of HIV-1 aspartic proteinase. Journal of Medicinal Chemistry 36:2142–2167
  20. Chan HS, Dill KA (1991): Polymer principles in protein structure and stability. Annual Reviews of Biophysics and Biophysical Chemistry 20:447–490
  21. Chang G, Guida WC, Still WC (1989): An internal coordinate Monte Carlo method for searching conformational space. Journal of the American Chemical Society 111:4379
  22. Cheeseman P, Kanefsky B, Taylor WM (1991): Where the really hard problems are. In Proceedings of IJCAI ’91, pp. 163–169
  23. Christofides N (1976): Worst-case analysis of a new heuristic for the travelling salesman problem. Technical report, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh, PA
  24. Cook SA (1971): The complexity of theorem-proving procedures. In Proceedings of the Third Annual ACM Symposium on the Theory of Computing, pp. 151–158
  25. Creighton TE, ed. (1992): Protein Folding. New York: WH Freeman
  26. Crippen GM (1975): Global optimization and polypeptide conformation. Journal of Computational Physics 18:224–231
  27. Crippen GM, Scheraga HA (1969): Minimization of polypeptide energy. VIII. Application of the deflation technique to a dipeptide. Proceedings of the National Academy of Sciences, USA 64:42–49
  28. Crippen GM, Scheraga HA (1971): Minimization of polypeptide energy. XI. Method of gentlest ascent. Archives of Biochemistry and Biophysics 144:462–466
  29. Dandekar T, Argos P (1992): Potential of genetic algorithms in protein folding and protein engineering simulations. Protein Engineering 5(7):637–645
  30. Dantzig GB (1963): Linear Programming and Extensions. Princeton, NJ: Princeton University Press
  31. Davis L (1991): Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold
  32. Dunbrack RL Jr., Karplus M (1993): Backbone-dependent rotamer library for proteins: Application to side-chain prediction. Journal of Molecular Biology 230:543–574
  33. Dyson HJ, Rance M, Houghten RA, Lerner RA, Wright PE (1988a): Folding of immunogenic peptide fragments of proteins in water solution. I. Sequence requirements for the formation of a reverse turn. Journal of Molecular Biology 201(1):161–200
  34. Dyson HJ, Rance M, Houghten RA, Wright PE, Lerner RA (1988b): Folding of immunogenic peptide fragments of proteins in water solution. II. The nascent helix. Journal of Molecular Biology 201(1):201–217
  35. Elber R, Karplus M (1987): Multiple conformational states of proteins: A molecular dynamics analysis of myoglobin. Science 235:318–321
  36. Epstein CJ, Goldberger RF, Anfinsen CB (1963): The genetic control of tertiary protein structure: Studies with model systems. Cold Spring Harbor Symposium on Quantitative Biology 28:439–449
  37. Fasman GD, ed. (1988): Prediction of Protein Structure and the Principles of Protein Conformation. New York: Plenum Press
  38. Finkelstein AV, Reva BA (1992): Search for the stable state of a short chain in a molecular field. Protein Engineering 5(7):617–624
  39. Formann M, Wagner F (1991): A packing problem with applications to lettering of maps. In Proceedings of the Seventh Annual Symposium on Computational Geometry, pp. 281–288, North Conway, NH: ACM
  40. Fraenkel AS (1993): Complexity of protein folding. Bulletin of Mathematical Biology 55(6):1199–1210
  41. Franco J, Paull M (1983): Probabilistic analysis of the Davis-Putnam procedure for solving the satisfiability problem. Discrete Applied Mathematics 5:77–87
  42. Garey MR, Johnson DS (1979): Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco: WH Freeman and Company
  43. Gibson KD, Scheraga HA (1986): Predicted conformations for the immunodominant region of the circumsporozoite protein of the human malaria parasite. Proceedings of the National Academy of Sciences, USA 83:5649–5653
  44. Goldberg A (1979): On the complexity of the satisfiability problem. Courant Computer Science Report 16, New York University
  45. Goldberg A, Purdom PW Jr., Brown CA (1982): Average time analysis of simplified Davis-Putnam procedures. Information Processing Letters 15:72–75. See also “Errata,” vol. 16, 1983, p. 213
  46. Gordon HL, Somorjai RL (1992): Applicability of the method of smoothed functionals as a global minimizer for model polypeptides. Journal of Physical Chemistry 96:7116–7121
  47. Greengard L (1987): The Rapid Evaluation of Potential Fields in Particle Systems. ACM Distinguished Dissertations. Cambridge, MA: MIT Press
  48. Greengard L, Rokhlin V (1989): On the evaluation of electrostatic interactions in molecular modeling. Chemica Scripta 29A:139–144
  49. Harrison SC, Durbin R (1985): Is there a single pathway for the folding of a polypeptide chain? Proceedings of the National Academy of Sciences, USA 82:4028–4030
  50. Head-Gordon T, Stillinger FH (1993): Predicting polypeptide and protein structures from amino acid sequence: Antlion method applied to melittin. Biopolymers 33(2):293–303
  51. Head-Gordon T, Stillinger FH, Arrecis J (1991): A strategy for finding classes of minima on a hypersurface: Implications for approaches to the protein folding problem. Proceedings of the National Academy of Sciences, USA 88:11076–11080
  52. Jaenicke R (1987): Protein folding and protein association. Progress in Biophysics and Molecular Biology 49:117–237
  53. Karplus M, Shakhnovich E (1992): Protein folding: Theoretical studies of thermodynamics and dynamics. In Creighton TE, ed., Protein Folding, chapter 4, pp. 127–196. New York: WH Freeman
  54. Karplus M, Weaver DL (1976): Protein-folding dynamics. Nature 260:404–406
  55. Karplus M, Weaver DL (1979): Diffusion-collision model for protein folding. Biopolymers 18:1421–1437
  56. Khachiyan LG (1979): A polynomial time algorithm in linear programming. Soviet Math Dokl 20:191–194
  57. Kim PS, Baldwin RL (1990): Intermediates in the folding reactions of small proteins. Annual Reviews of Biochemistry 59:631–660
  58. Kirkpatrick S, Gelatt CD Jr., Vecchi MP (1983): Optimization by simulated annealing. Science 220:671–680
  59. Kostrowicki J, Piela L (1991): Diffusion equation method of global minimization: Performance for standard test functions. Journal of Optimization Theory and Applications 69(2):269–284
  60. Kostrowicki J, Piela L, Cherayil BJ, Scheraga HA (1991): Performance of the diffusion equation method in searches for optimum structures of clusters of Lennard-Jones atoms. Journal of Physical Chemistry 95(10):4113–4119
  61. Kostrowicki J, Scheraga HA (1992): Application of the diffusion equation method for global optimization to oligopeptides. Journal of Physical Chemistry 96:7442–7449
  62. Ladner RE (1975): On the structure of polynomial time reducibility. Journal of the Association for Computing Machinery 22:155–171
  63. Lee C, Subbiah S (1991): Prediction of protein side-chain conformation by packing optimization. Journal of Molecular Biology 217:373–388
  64. LeGrand S, Merz K Jr. (1993): The application of the genetic algorithm to the minimization of potential energy functions. Journal of Global Optimization 3:49–66
  65. Levinthal C (1966): Molecular model-building by computer. Scientific American 214
  66. Levinthal C (1968): Are there pathways for protein folding? Journal de Chimie Physique 65(1):44–45
  67. Levinthal C (1969): In Mössbauer Spectroscopy in Biological Systems, pp. 22–24. Urbana, IL: University of Illinois Press. Proceedings of a meeting held at Allerton House, Monticello, IL
  68. Lewis HR, Papadimitriou CH (1978): The efficiency of algorithms. Scientific American 238(1):96–109
  69. Lewis HR, Papadimitriou CH (1981): Elements of the Theory of Computation. Englewood Cliffs, NJ: Prentice-Hall
  70. Lipton M, Still WC (1988): The multiple minimum problem in molecular modeling. Tree searching internal coordinate conformational space. Journal of Computational Chemistry 9(4):343–355
  71. Marks J, Shieber S (1991): The computational complexity of cartographic label placement. Technical Report TR-05-91, Harvard University, Cambridge, MA
  72. Marqusee S, Baldwin RL (1987): Helix stabilization by Glu-Lys salt bridges in short peptides of de novo design. Proceedings of the National Academy of Sciences, USA 84:8898–8902
  73. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953): Equation of state calculations by fast computing machines. Journal of Chemical Physics 21:1087–1092
  74. Mitchell D, Selman B, Levesque H (1992): Hard and easy distributions of SAT problems. In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI ’92), pp. 459–465. San Jose, CA: AAAI Press/MIT Press
  75. Momany FA, McGuire RF, Burgess AW, Scheraga HA (1975): Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids. Journal of Physical Chemistry 79(22):2361–2381
  76. Moult J, James MNG (1986): An algorithm for determining the conformation of polypeptide segments in proteins by systematic search. PROTEINS: Structure, Function, and Genetics 1:146–163
  77. Moult J, Unger R (1991): An analysis of protein folding pathways. Biochemistry 30:3816–3824
  78. Ngo JT, Marks J (1991): Computational complexity of a problem in molecular structure prediction. Technical Report TR-17-91, Harvard University, Cambridge, MA. Older version in which 90° angles were employed
  79. Ngo JT, Marks J (1992): Computational complexity of a problem in molecular structure prediction. Protein Engineering 5(4):313–321
  80. Nilsson NJ (1980): Principles of Artificial Intelligence. Palo Alto, CA: Tioga Publishing Company
  81. Papadimitriou CH, Steiglitz K (1982): Combinatorial Optimization: Algorithms and Complexity. Englewood Cliffs, NJ: Prentice-Hall
  82. Papadimitriou CH, Yannakakis M (1991): Optimization, approximation, and complexity classes. Journal of Computer and System Sciences 43:425–440
  83. Piela L, Kostrowicki J, Scheraga HA (1989): The multiple-minima problem in the conformational analysis of molecules. Deformation of the potential energy hypersurface by the diffusion equation method. Journal of Physical Chemistry 93(8):3339–3346
  84. Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1988): Numerical Recipes in C. The Art of Scientific Computing. Cambridge, UK: Cambridge University Press
  85. Ptitsyn OB (1987): Protein folding: Hypotheses and experiments. Journal of Protein Chemistry 6:273–293
  86. Purisima EO, Scheraga HA (1986): An approach to the multiple-minima problem by relaxing dimensionality. Proceedings of the National Academy of Sciences, USA 83:2782–2786
  87. Purisima EO, Scheraga HA (1987): An approach to the multiple-minima problem in protein folding by relaxing dimensionality. Tests on enkephalin. Journal of Molecular Biology 196:697–709
  88. Rabin MO (1976): Probabilistic algorithms. In Traub JF, ed., Algorithms and Complexity: New Directions and Recent Results, pp. 21–39. New York: Academic Press
  89. Rabin MO (1980): Probabilistic algorithms for testing primality. Journal of Number Theory 12(1):128–138
  90. Reeke GN Jr. (1988): Protein folding: Computational approaches to an exponential-time problem. In Annual Reviews of Computer Science 3:59–84; Annual Reviews, Inc.
  91. Robson B, Platt E, Fishleigh RV, Marsden A, Millard P (1987): Expert system for protein engineering: Its application in the study of chloramphenicol acetyltransferase and avian pancreatic polypeptide. Journal of Molecular Graphics 5(1):8–17
  92. Roder H, Elöve GA, Englander SW (1988): Structural characterization of folding intermediates in cytochrome c by H-exchange labelling and proton NMR. Nature 335:700–704
  93. Šali A, Shakhnovich EI, Karplus M (1994): Kinetics of protein folding: A lattice model study of the requirements for folding to the native state. Journal of Molecular Biology 235:1614–1636
  94. Saunders M, Houk KN, Wu Y-D, Still WC, Lipton M, Chang G, Guida WC (1990): Conformations of cycloheptadecane: A comparison of methods for conformational searching. Journal of the American Chemical Society 112:1419–1427
  95. Scheraga HA (1992): Some approaches to the multiple-minima problem in the calculation of polypeptide and protein structures. International Journal of Quantum Chemistry 42:1529–1536
  96. Schmidt KE, Lee MA (1991): Implementing the fast multipole method in three dimensions. Journal of Statistical Physics 63(5/6):1223–1235
  97. Shakhnovich EI, Farztdinov GM, Gutin AM, Karplus M (1991): Protein folding bottlenecks: A lattice Monte-Carlo simulation. Physical Review Letters 67(12):1665–1668
  98. Shakhnovich EI, Gutin AM (1989): Frozen states of a disordered globular heteropolymer. Journal of Physics A 22(10):1647–1659
  99. Shalloway D (1992): Application of the renormalization group to deterministic global minimization of molecular conformation energy functions. Journal of Global Optimization 2:281–311
  100. Shoemaker KR, Kim PS, Brems DN, Marqusee S, York EJ, Chaiken IM, Stewart JM, Baldwin RL (1985): Nature of the charged group effect on the stability of the C-peptide helix. Proceedings of the National Academy of Sciences, USA 82:2349–2353
  101. Shoemaker KR, Kim PS, York EJ, Stewart JM, Baldwin RL (1987): Tests of the helix dipole model for stabilization of α-helices. Nature 326:563–567
  102. Shubert BO (1972a): A sequential method seeking the global maximum of a function. SIAM Journal on Numerical Analysis 9(3):379–388
  103. Shubert BO (1972b): Sequential optimization of multimodal discrete function with bounded rate of change. Management Science 18(11):687–693
  104. Sikorski A, Skolnick J (1990): Dynamic Monte Carlo simulations of globular protein folding/unfolding pathways. II. α-helical motifs. Journal of Molecular Biology 212:819–836
  105. Simon I, Glasser L, Scheraga HA (1991): Calculation of protein conformation as an assembly of stable overlapping segments: Application to bovine pancreatic trypsin inhibitor. Proceedings of the National Academy of Sciences, USA 88:3661–3665
  106. Summers NL, Karplus M (1990): Modeling of globular proteins: A distance-based data search procedure for the construction of insertion-deletion regions and Pro ↔ non-Pro mutations. Journal of Molecular Biology 216(4):991–1016
  107. Sun S (1993): Reduced representation model of protein structure prediction: Statistical potential and genetic algorithms. Protein Science 2(5):762–785
  108. Tsong TY, Baldwin RL, McPhie P (1972): A sequential model of nucleation-dependent protein folding: Kinetic studies of ribonuclease A. Journal of Molecular Biology 63(3):453–475
  109. Udgaonkar JB, Baldwin RL (1988): NMR evidence for an early framework intermediate on the folding pathway of ribonuclease A. Nature 335:694–699
  110. Unger R, Moult J (1993): Finding the lowest free energy conformation of a protein is an NP-hard problem: Proof and implications. Bulletin of Mathematical Biology 55(6):1183–1198
  111. van Gunsteren WF, Berendsen HJC, Colonna F, Perahia D, Hollenberg JP, Lellouch D (1984): On searching neighbors in computer simulations of macromolecular systems. Journal of Computational Chemistry 5(3):272–279
  112. Vásquez M, Scheraga HA (1985): Use of buildup and energy-minimization procedures to compute low-energy structures of the backbone of enkephalin. Biopolymers 24:1437–1447
  113. Wawak RJ, Wimmer MM, Scheraga HA (1992): An application of the diffusion equation method of global optimization to water clusters. Journal of Physical Chemistry 96:5138–5145
  114. Webster (1991): Webster’s Ninth Collegiate Dictionary. Springfield, MA: Merriam-Webster
  115. Weiner SJ, Kollman P, Nguyen D, Case DA (1986): An all-atom force field for simulations of proteins and nucleic acids. Journal of Computational Chemistry 7(2):230–252
  116. Wetlaufer DB (1973): Nucleation, rapid folding, and globular intrachain regions in proteins. Proceedings of the National Academy of Sciences, USA 70:697–701
  117. Zwanzig R, Szabo A, Bagchi B (1992): Levinthal’s paradox. Proceedings of the National Academy of Sciences, USA 89:20–22

Copyright information

© Birkhäuser Boston 1994
