Systematic assessment of accuracy of comparative model of proteins belonging to different structural fold classes

  • Original Paper
Journal of Molecular Modeling

Abstract

In the absence of experimental structures, comparative modeling remains the method of choice for obtaining structural information on target proteins. However, models lack the accuracy of experimental structures. Alignment errors and structural divergence between target and template influence model accuracy the most. Here, we examine the potential additional impact of backbone geometry, as our previous studies suggested that the structural class (all-α, αβ, all-β) of a protein may influence the accuracy of its model. In the twilight zone (sequence identity ≤ 30%) and at a similar level of target-template divergence, the accuracy of protein models does indeed follow the trend all-α > αβ > all-β. This is mainly because alignment accuracy follows the same trend (all-α > αβ > all-β), with backbone geometry playing only a minor role. Differences in the diversity of sequences belonging to different structural classes lead to the observed accuracy differences, thus enabling the accuracy of alignments and models to be estimated a priori in a class-dependent manner. This study systematically describes and quantifies the structural class-dependent effect in comparative modeling. It also suggests that datasets for large-scale sequence/structure analyses should represent the different structural classes equally to avoid class-dependent bias.
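As an illustration only, two ideas from the abstract can be sketched in Python: testing whether a target-template pair falls in the twilight zone (sequence identity ≤ 30%), and drawing a dataset with equal numbers of proteins per structural class. This is not the authors' pipeline. The function names, the gap character `-`, and the choice to compute identity over gap-free aligned columns are all assumptions made for the example; other identity definitions use a different denominator.

```python
import random
from collections import defaultdict

# Twilight-zone threshold (percent identity), taken from the abstract.
TWILIGHT_MAX_IDENTITY = 30.0


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: both strings come from the same pairwise alignment and
    use '-' for gaps. The denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def in_twilight_zone(aligned_a: str, aligned_b: str) -> bool:
    """True if the pair is at or below the 30% identity threshold."""
    return percent_identity(aligned_a, aligned_b) <= TWILIGHT_MAX_IDENTITY


def balanced_subsample(entries, seed=0):
    """Pick the same number of proteins from every structural class.

    entries: iterable of (protein_id, structural_class) tuples.
    Returns a dict mapping each class to a sorted list of sampled IDs,
    sized to the smallest class (hypothetical helper for illustration).
    """
    by_class = defaultdict(list)
    for protein_id, structural_class in entries:
        by_class[structural_class].append(protein_id)
    if not by_class:
        return {}
    n = min(len(ids) for ids in by_class.values())
    rng = random.Random(seed)
    return {cls: sorted(rng.sample(ids, n)) for cls, ids in by_class.items()}
```

Sampling down to the smallest class is the simplest way to equalize representation; it discards data from the larger classes, so a weighting scheme may be preferable when the smallest class is very small.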

Acknowledgments

We thank Sucheta Godbole for helping us with the profile–profile alignments. We thank Zhanwen Li of the Godzik Laboratory at the Burnham Institute for helping us with the Fold and Function Assignment (FFAS) server when investigating the test cases of profile–profile alignments. SC thanks Prof. Ming-Ming Zhou for encouragement. The study was supported by the National Institute of General Medicine at the National Institutes of Health [grant 1R01GM081713 (RS)], and South Dakota State University’s (SDSU) Agricultural Experiment Station and Center for Biological Control and Analysis by Applied Photonics (BCAAP) [grant 3SG163 (SC)].

Author information

Corresponding authors

Correspondence to Suvobrata Chakravarty or Roberto Sanchez.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

ESM 1 (PDF 2482 kb)

About this article

Cite this article

Chakravarty, S., Ghersi, D. & Sanchez, R. Systematic assessment of accuracy of comparative model of proteins belonging to different structural fold classes. J Mol Model 17, 2831–2837 (2011). https://doi.org/10.1007/s00894-011-0976-9

  • DOI: https://doi.org/10.1007/s00894-011-0976-9
