Function Diversity Within Folds and Superfamilies

  • Benoit H. Dessailly
  • Natalie L. Dawson
  • Sayoni Das
  • Christine A. Orengo


The structural genomics initiatives have significantly increased the number of three-dimensional structures available for proteins of unknown function. However, the extent to which structural information helps in understanding function is still a matter of debate. Here, the value of detecting structural relationships at different levels (typically fold and superfamily) for transferring functional annotations between proteins is reviewed. First, the functional diversity of proteins sharing the same fold is investigated. Although identifying a fold can in some cases provide clues to functional properties, the diversity of functions within a fold can be so great that this information is of very limited use for some particularly diverse folds (e.g. superfolds). Next, since structural data can help detect homology in the absence of sequence similarity, functional diversity between proteins from the same superfamily (homologous proteins) is analysed. The evolutionary causes and mechanisms that have generated the observed functional diversity between related proteins are discussed, and helpful tools for the correlated analysis of structure, function and evolution are reviewed.


Keywords: Protein structure; Protein function; Function annotation; Function prediction; Protein function diversity; Protein evolution; Protein function evolution; Functional sites; Protein folds; Superfolds; Protein superfamilies; Structural genomics initiatives; Homology



Copyright information

© Springer Science+Business Media B.V. 2017

Authors and Affiliations

  • Benoit H. Dessailly (1)
  • Natalie L. Dawson (1)
  • Sayoni Das (1)
  • Christine A. Orengo (1)
  1. Department of Structural and Molecular Biology, University College London, London, UK
