A Historical Perspective of Template-Based Protein Structure Prediction

  • Jun-tao Guo
  • Kyle Ellrott
  • Ying Xu
Part of the Methods in Molecular Biology™ book series (MIMB, volume 413)


This chapter presents a broad historical overview of the protein structure prediction problem. The main classes of prediction methods, namely homology modeling, fold recognition (FR)/protein threading, ab initio/de novo approaches, and hybrid techniques that combine several of these, are introduced in their historical context. The progress of the field as a whole, and of the threading/FR area in particular, is reviewed as reflected in the CASP/CAFASP contests. The chapter closes with a discussion of the challenges that remain in protein structure prediction.


Structure prediction, fold recognition, protein threading, CASP/CAFASP, meta-server, fragment assembly, energy function, comparative/homology modeling



This work was supported in part by the National Institutes of Health (R01 AG18927), the National Science Foundation (DBI-0354771/ITR-IIS-0407204), and the Georgia Cancer Coalition (a “Distinguished Cancer Scholar” grant).


  163. 163.
    Sippl, M. J., Lackner, P., Domingues, F. S., Prlic, A., Malik, R., Andreeva, A., and Wiederstein, M. (2001) Assessment of the CASP4 fold recognition category. Proteins Suppl 5, 55–67.PubMedCrossRefGoogle Scholar
  164. 164.
    Fischer, D., Elofsson, A., Rychlewski, L., Pazos, F., Valencia, A., Rost, B., Ortiz, A. R., and Dunbrack, R. L., Jr. (2001) CAFASP2: the second critical assessment of fully automated structure prediction methods. Proteins Suppl 5, 171–83.PubMedCrossRefGoogle Scholar
  165. 165.
    Lundstrom, J., Rychlewski, L., Bujnicki, J., and Elofsson, A. (2001) Pcons: a neural-network-based consensus predictor that improves fold recognition. Protein Sci 10, 2354–62.PubMedCrossRefGoogle Scholar
  166. 166.
    Schulz, G. E., Barry, C. D., Friedman, J., Chou, P. Y., Fasman, G. D., Finkelstein, A. V., Lim, V. I., Ptitsyn, O. B., Kabat, E. A., Wu, T. T., Levitt, M., Robson, B., and Nagano, K. (1974) Comparison of predicted and experimentally determined secondary structure of adenyl kinase. Nature 250, 140–2.
  167.
    Matthews, B. W. (1975) Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta 405, 442–51.
  168.
    Argos, P., and Schwarz, J. (1976) An assessment of protein secondary structure prediction methods based on amino acid sequence. Biochim Biophys Acta 439, 261–73.
  169.
    Bujnicki, J. M., and Fischer, D. (2004) “Meta” approaches to protein structure prediction. In Practical Bioinformatics (Bujnicki, J. M., Ed.), Vol. 15, pp. 23–34, Springer, Berlin.
  170.
    Zhang, Y., Arakaki, A. K., and Skolnick, J. (2005) TASSER: an automated method for the prediction of protein tertiary structures in CASP6. Proteins 61 (Suppl 7), 91–8.
  171.
    Jones, D. T. (2001) Predicting novel protein folds by using FRAGFOLD. Proteins Suppl 5, 127–32.
  172.
    Skolnick, J., Kolinski, A., Kihara, D., Betancourt, M., Rotkiewicz, P., and Boniecki, M. (2001) Ab initio protein structure prediction via a combination of threading, lattice folding, clustering, and structure refinement. Proteins Suppl 5, 149–56.
  173.
    Zhang, Y., Kolinski, A., and Skolnick, J. (2003) TOUCHSTONE II: a new approach to ab initio protein structure prediction. Biophys J 85, 1145–64.
  174.
    Tai, C. H., Lee, W. J., Vincent, J. J., and Lee, B. (2005) Evaluation of domain prediction in CASP6. Proteins 61 (Suppl 7), 183–92.
  175.
    Shortle, D. (1999) Structure prediction: the state of the art. Curr Biol 9, R205–9.
  176.
    Guo, J. T., Xu, D., Kim, D., and Xu, Y. (2003) Improving the performance of DomainParser for structural domain partition using neural network. Nucleic Acids Res 31, 944–52.

Copyright information

© Humana Press Inc 2008
