The Consistent Phylogenetic Signal in Genome Trees Revealed by Reducing the Impact of Noise

Journal of Molecular Evolution

Abstract

Phylogenetic trees based on gene repertoires are remarkably similar to the current consensus on the history of life. Yet it has been argued that shared gene content is unreliable for phylogenetic reconstruction because of convergence in gene content due to horizontal gene transfer and parallel gene loss. Here we test this argument by filtering out as noise those orthologous groups that have an inconsistent phylogenetic distribution, using two independent methods. The resulting phylogenies do indeed contain small but significant improvements. More importantly, we find that the majority of orthologous groups contain some phylogenetic signal and that the resulting phylogeny is the only detectable signal present in the gene distribution across genomes. Horizontal gene transfer and parallel gene loss do not cause systematic biases in the gene content tree.
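
As an illustration of what a gene content comparison involves, the following minimal sketch computes pairwise distances between genomes from the presence or absence of orthologous groups. The abstract does not state which distance measure the authors used, so the formula here (one minus the number of shared groups divided by the size of the smaller repertoire) is an assumption chosen for illustration, and the genome names, group identifiers, and function names are hypothetical.

```python
from itertools import combinations


def gene_content_distance(a, b):
    """Distance between two gene repertoires (sets of orthologous group IDs).

    Assumed measure for illustration: 1 - shared / size of smaller repertoire.
    """
    if not a or not b:
        raise ValueError("each genome needs at least one orthologous group")
    shared = len(a & b)
    return 1.0 - shared / min(len(a), len(b))


def distance_matrix(genomes):
    """Return a symmetric dict {(name1, name2): distance} for all genome pairs."""
    names = sorted(genomes)
    dist = {(n, n): 0.0 for n in names}
    for x, y in combinations(names, 2):
        d = gene_content_distance(genomes[x], genomes[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


# Hypothetical toy data: which orthologous groups occur in which genome.
genomes = {
    "genome_A": {"OG1", "OG2", "OG3", "OG4"},
    "genome_B": {"OG1", "OG2", "OG3"},
    "genome_C": {"OG4", "OG5"},
}

dist = distance_matrix(genomes)
for (x, y), d in sorted(dist.items()):
    if x < y:
        print(f"{x} vs {y}: {d:.2f}")
```

A matrix of this kind could be passed to a distance-based tree method such as neighbor joining. Filtering noisy orthologous groups, as described in the abstract, would amount to removing selected group identifiers from every set before the distances are computed.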


Author information

Corresponding author

Correspondence to Bas E. Dutilh.

About this article

Cite this article

Dutilh, B.E., Huynen, M.A., Bruno, W.J. et al. The Consistent Phylogenetic Signal in Genome Trees Revealed by Reducing the Impact of Noise. J Mol Evol 58, 527–539 (2004). https://doi.org/10.1007/s00239-003-2575-6

  • DOI: https://doi.org/10.1007/s00239-003-2575-6
