Abstract
Explicit 3D models can be obtained in three main ways: by comparative protein modelling, a mature and predictable technique; by fragment-assembly ab initio methods, for smaller proteins with novel or unrecognisable folds; and by contact-based methods, for large protein families. Each modelling method has its own limits on model accuracy, and these vary further with the characteristics of the target. As a result, the performance of structure-based function prediction algorithms applied to models is variable. Nevertheless, with care, a wide variety of structure-based methods can be productively applied to protein models, frequently facilitating the planning of experiments and the interpretation of their results. This chapter first surveys the literature on the applicability of structure-based methods specifically to models, then discusses a selection of examples in more detail.
Copyright information
© 2017 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Rigden, D.J., Cymerman, I.A., Bujnicki, J.M. (2017). Prediction of Protein Function from Theoretical Models. In: Rigden, D.J. (eds) From Protein Structure to Function with Bioinformatics. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-1069-3_15
DOI: https://doi.org/10.1007/978-94-024-1069-3_15
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-024-1067-9
Online ISBN: 978-94-024-1069-3
eBook Packages: Biomedical and Life Sciences (R0)