Systematic assessment of accuracy of comparative model of proteins belonging to different structural fold classes

Chakravarty, Suvobrata; Ghersi, Dario; Sanchez, Roberto

doi:10.1007/s00894-011-0976-9

Original Paper
Published: 08 February 2011

Journal of Molecular Modeling, Volume 17, pages 2831–2837 (2011)

Suvobrata Chakravarty¹, Dario Ghersi²,³ & Roberto Sanchez²

Abstract

In the absence of experimental structures, comparative modeling continues to be the chosen method for retrieving structural information on target proteins. However, models lack the accuracy of experimental structures. Alignment error and structural divergence (between target and template) influence model accuracy the most. Here, we examine the potential additional impact of backbone geometry, as our previous studies have suggested that the structural class (all-α, αβ, all-β) of a protein may influence the accuracy of its model. In the twilight zone (sequence identity ≤ 30%) and at a similar level of target-template divergence, the accuracy of protein models does indeed follow the trend all-α > αβ > all-β. This is mainly because the alignment accuracy follows the same trend (all-α > αβ > all-β), with backbone geometry playing only a minor role. Differences in the diversity of sequences belonging to different structural classes lead to the observed accuracy differences, thus enabling the accuracy of alignments/models to be estimated a priori in a class-dependent manner. This study provides a systematic description of the structural class-dependent effect in comparative modeling and quantifies it. The study also suggests that datasets for large-scale sequence/structure analyses should have equal representations of different structural classes to avoid class-dependent bias.
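The two quantities the abstract relies on, sequence identity and alignment accuracy, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define either measure, so the sketch assumes one common convention, in which identity is computed over aligned (non-gap) columns and alignment accuracy is the fraction of residue pairs in a reference (for example, structure-based) alignment that a test (sequence-based) alignment reproduces. The function names and the `-` gap character are hypothetical choices made for illustration.

```python
from typing import Set, Tuple

# Threshold taken from the abstract: twilight zone = sequence identity <= 30%.
TWILIGHT_MAX_IDENTITY = 30.0


def aligned_pairs(row_a: str, row_b: str) -> Set[Tuple[int, int]]:
    """Return the (i, j) residue index pairs matched by a gapped pairwise alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for a, b in zip(row_a, row_b):
        if a != "-" and b != "-":
            pairs.add((i, j))
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return pairs


def percent_identity(row_a: str, row_b: str) -> float:
    """Percent identical residues over aligned (non-gap) columns (assumed convention)."""
    matched = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not matched:
        return 0.0
    return 100.0 * sum(a == b for a, b in matched) / len(matched)


def in_twilight_zone(row_a: str, row_b: str) -> bool:
    """True if the aligned pair falls at or below the 30% identity threshold."""
    return percent_identity(row_a, row_b) <= TWILIGHT_MAX_IDENTITY


def alignment_accuracy(test: Tuple[str, str], reference: Tuple[str, str]) -> float:
    """Fraction of reference residue pairs that the test alignment reproduces."""
    ref_pairs = aligned_pairs(*reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(*test)) / len(ref_pairs)


# Toy example: the test alignment misplaces one gap relative to the reference.
reference = ("ACDEFG", "ACD-FG")
test = ("ACDEFG", "AC-DFG")
accuracy = alignment_accuracy(test, reference)
```

In the toy example the test alignment recovers four of the five reference residue pairs. Comparing such accuracies across all-α, αβ and all-β targets at matched identity levels would be one way to look for the class-dependent trend the abstract describes.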

Acknowledgments

We thank Sucheta Godbole for helping us with the profile–profile alignments. We thank Zhanwen Li of the Godzik Laboratory at the Burnham Institute for helping us with the Fold and Function Assignment (FFAS) server when investigating the test cases of profile–profile alignments. SC thanks Prof. Ming-Ming Zhou for encouragement. The study was supported by the National Institute of General Medicine at the National Institutes of Health [grant 1R01GM081713 (RS)], and South Dakota State University’s (SDSU) Agricultural Experiment Station and Center for Biological Control and Analysis by Applied Photonics (BCAAP) [grant 3SG163 (SC)].

Author information

Authors and Affiliations

Department of Chemistry & Biochemistry, South Dakota State University, Box 2202, Brookings, SD, 57007, USA
Suvobrata Chakravarty
Department of Structural and Chemical Biology, Mount Sinai School of Medicine, Box 1677, 1425 Madison Avenue, New York, NY, 10029, USA
Dario Ghersi & Roberto Sanchez
Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, 08540, USA
Dario Ghersi

Corresponding authors

Correspondence to Suvobrata Chakravarty or Roberto Sanchez.

Electronic Supplementary Material

ESM 1 (PDF 2482 kb)

About this article

Cite this article

Chakravarty, S., Ghersi, D. & Sanchez, R. Systematic assessment of accuracy of comparative model of proteins belonging to different structural fold classes. J Mol Model 17, 2831–2837 (2011). https://doi.org/10.1007/s00894-011-0976-9

Received: 08 December 2010
Accepted: 17 January 2011
Published: 08 February 2011
Issue Date: November 2011
DOI: https://doi.org/10.1007/s00894-011-0976-9
