
Prediction of Protein Function from Theoretical Models

From Protein Structure to Function with Bioinformatics

Abstract

Explicit 3D models can be obtained in three main ways: by comparative protein modelling, a mature and predictable technique; by fragment-assembly ab initio methods, for smaller proteins with novel or unrecognisable folds; and by contact-based methods, for large protein families. Each modelling method has limitations in model accuracy, and these vary further with the characteristics of the target. As a result, the performance of structure-based function prediction algorithms applied to models is variable. Nevertheless, with care, a wide variety of structure-based methods can be productively applied to protein models, frequently facilitating the planning of experiments and the interpretation of their results. This chapter first surveys the literature on the applicability of structure-based methods specifically to models, then discusses a selection of examples in more detail.



Author information


Corresponding author

Correspondence to Daniel J. Rigden.


Copyright information

© 2017 Springer Science+Business Media B.V.

About this chapter

Cite this chapter

Rigden, D.J., Cymerman, I.A., Bujnicki, J.M. (2017). Prediction of Protein Function from Theoretical Models. In: Rigden, D.J. (eds) From Protein Structure to Function with Bioinformatics. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-1069-3_15
