Journal of Molecular Evolution, Volume 58, Issue 5, pp 527–539

The Consistent Phylogenetic Signal in Genome Trees Revealed by Reducing the Impact of Noise

  • Bas E. Dutilh
  • Martijn A. Huynen
  • William J. Bruno
  • Berend Snel


Phylogenetic trees based on gene repertoires are remarkably similar to the current consensus of life history. Yet it has been argued that shared gene content is unreliable for phylogenetic reconstruction because of convergence in gene content due to horizontal gene transfer and parallel gene loss. Here we test this argument by using two independent methods to filter out, as noise, those orthologous groups that have an inconsistent phylogenetic distribution. The resulting phylogenies do indeed contain small but significant improvements. More importantly, we find that the majority of orthologous groups contain some phylogenetic signal and that the resulting phylogeny is the only detectable signal present in the gene distribution across genomes. Horizontal gene transfer and parallel gene loss do not cause systematic biases in the gene content tree.


Keywords: Genome phylogeny; Horizontal gene transfer; Gene loss; Genome evolution; Character weighting; Thermophilic Bacteria





Copyright information

© Springer-Verlag New York Inc. 2004

Authors and Affiliations

  • Bas E. Dutilh (1), corresponding author
  • Martijn A. Huynen (1)
  • William J. Bruno (2)
  • Berend Snel (1)

  1. Center for Molecular and Biomolecular Informatics/Nijmegen Center for Molecular Life Sciences, University of Nijmegen, Nijmegen, The Netherlands
  2. Theoretical Biology and Biophysics, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
