Journal of Molecular Evolution, Volume 60, Issue 4, pp 484–498

Universal Sharing Patterns in Proteomes and Evolution of Protein Fold Architecture and Life

  • Gustavo Caetano-Anollés
  • Derek Caetano-Anollés


Protein evolution is imprinted in both the sequence and the structure of evolutionary building blocks known as protein domains. These domains share a common ancestry and can be unified into a comparatively small set of folding architectures, the protein folds. We have traced the distribution of protein folds between and within proteomes belonging to Eukarya, Archaea, and Bacteria along the branches of a universal phylogeny of protein architecture. This tree was reconstructed from global fold-usage statistics derived from a structural census of proteomes. We found that folds shared by the three organismal domains were placed almost exclusively at the base of the rooted tree and that there were marked heterogeneities in fold distribution and clear evolutionary patterns related to protein architecture and organismal diversification. These include a relative timing for the emergence of prokaryotes, congruent episodes of architectural loss and diversification in Archaea and Bacteria, and a late and quite massive rise of architectural novelties in Eukarya perhaps linked to multicellularity.
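The sharing patterns described above amount to classifying each fold by the set of organismal domains whose proteomes contain it. A minimal Python sketch of that bookkeeping follows, assuming a hypothetical toy census: the proteome names, fold labels, and function names are invented placeholders, not data or methods taken from the study.

```python
from collections import defaultdict

# Hypothetical toy census: proteome -> (organismal domain, set of fold labels).
# Every name and fold label here is invented for illustration only.
CENSUS = {
    "archaeon_1": ("Archaea", {"fold_a", "fold_b"}),
    "bacterium_1": ("Bacteria", {"fold_a", "fold_c"}),
    "bacterium_2": ("Bacteria", {"fold_a", "fold_b", "fold_c"}),
    "eukaryote_1": ("Eukarya", {"fold_a", "fold_c", "fold_d"}),
}


def sharing_patterns(census):
    """Map each fold to the sorted tuple of organismal domains it occurs in."""
    domains_by_fold = defaultdict(set)
    for domain, folds in census.values():
        for fold in folds:
            domains_by_fold[fold].add(domain)
    return {fold: tuple(sorted(doms)) for fold, doms in domains_by_fold.items()}


def group_by_pattern(patterns):
    """Invert the mapping: sharing pattern -> sorted list of folds showing it."""
    grouped = defaultdict(list)
    for fold, pattern in patterns.items():
        grouped[pattern].append(fold)
    return {pattern: sorted(folds) for pattern, folds in grouped.items()}


patterns = sharing_patterns(CENSUS)
groups = group_by_pattern(patterns)

for pattern, folds in sorted(groups.items()):
    print(" + ".join(pattern), "->", ", ".join(folds))
```

In this toy input, a fold present in all three organismal domains falls into the universal group. The sketch covers only the classification step; placing those groups along a phylogeny of folds is outside its scope.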


Keywords: Archaea, Bacteria, Eukarya, Organismal diversification, Origins of life, Phylogenetic tracing, Protein structure, Proteome diversification



We would like to thank Jay Mittenthal (University of Illinois) and Dietz Bauer (Ohio State University) for valuable comments and suggestions.



Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  • Gustavo Caetano-Anollés (1)
  • Derek Caetano-Anollés (2)
  1. Department of Crop Sciences, University of Illinois, Urbana, USA
  2. Vital NRG, Knoxville, USA
