Phylogeny of Prokaryotes and Chloroplasts Revealed by a Simple Composition Approach on All Protein Sequences from Complete Genomes Without Sequence Alignment

Journal of Molecular Evolution

Abstract

The complete genomes of living organisms have provided much information on their phylogenetic relationships. Similarly, the complete genomes of chloroplasts have helped to resolve the evolution of this organelle in photosynthetic eukaryotes. In this paper we propose an alternative method of phylogenetic analysis that uses compositional statistics for all protein sequences from complete genomes. The new method is conceptually simpler than, and computationally as fast as, the one proposed by Qi et al. (2004b) and Chu et al. (2004). We analyze the same data sets used in Qi et al. (2004b) and Chu et al. (2004) with the new method. Our distance-based phylogenetic tree of the 109 prokaryotes and eukaryotes agrees with the biologists' “tree of life” based on 16S rRNA comparison in a predominant majority of basic branchings and most lower taxa. Our phylogenetic analysis also shows that the chloroplast genomes separate into two major clades corresponding to chlorophytes s.l. and rhodophytes s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding of chloroplast evolution.
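The abstract does not spell out the compositional statistic, so the following is only a generic sketch of an alignment-free, composition-based distance, not the authors' method. It counts overlapping K-strings over all protein sequences of a genome and compares two genomes through the cosine of their frequency vectors. The function names, the choice K = 3, and the (1 - cosine) / 2 distance are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kstring_frequencies(proteins, k=3):
    """Relative frequency of each overlapping k-string over all proteins.

    `proteins` is an iterable of amino-acid strings (one per protein).
    Sequences shorter than k contribute nothing.
    """
    counts = Counter(
        seq[i:i + k]
        for seq in proteins
        for i in range(len(seq) - k + 1)
    )
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}


def composition_distance(p, q):
    """Illustrative distance (1 - cosine) / 2 between two frequency vectors.

    Frequencies are non-negative, so the result lies in [0, 0.5]:
    0 for identical composition, 0.5 when no k-string is shared.
    """
    dot = sum(v * q.get(s, 0.0) for s, v in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        raise ValueError("empty composition vector")
    return (1.0 - dot / (norm_p * norm_q)) / 2.0


def distance_matrix(genomes, k=3):
    """Pairwise distances for a dict {name: list of protein sequences}."""
    names = sorted(genomes)
    vectors = {n: kstring_frequencies(genomes[n], k) for n in names}
    return {
        (a, b): composition_distance(vectors[a], vectors[b])
        for a in names
        for b in names
    }
```

A matrix of this kind could then be passed to a distance-based tree-building method such as neighbor-joining (Saitou and Nei 1987); that step is omitted here.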

References

  • Adachi J, Waddell PJ, Martin W, Hasegawa M (2000) Plastid genome phylogeny and a model of amino acid substitution for proteins encoded by chloroplast DNA. J Mol Evol 50:348–358

  • Brown JR, Doolittle WF (1997) Archaea and the prokaryote-to-eukaryote transition. Microbiol Mol Biol Rev 61:456–502

  • Charlebois RL, Beiko RG, Ragan MA (2003) Branching out. Nature 421:217. https://doi.org/10.1038/421217a

  • Chatton E (1937) Titres et travaux scientifiques. Sottano, Sète, France

  • Chu KH, Qi J, Yu ZG, Anh VV (2004) Origin and phylogeny of chloroplasts revealed by a simple correlation analysis of complete genome. Mol Biol Evol 21:200–206. https://doi.org/10.1093/molbev/msh002

  • De Las Rivas J, Lozano JJ, Ortiz AR (2002) Comparative analysis of chloroplast genomes: Functional annotation, genome-based phylogeny, and deduced evolutionary patterns. Genome Res 12:567–583. https://doi.org/10.1101/gr.209402

  • Doolittle RF (1998) Microbial genomes opened up. Nature 392:339–342. https://doi.org/10.1038/32789

  • Doolittle RF (1999) Phylogenetic classification and the universal tree. Science 284:2124–2128. https://doi.org/10.1126/science.284.5423.2124

  • Edwards SV, Fertil B, Giron A, Deschavanne P (2002) A genomic schism in birds revealed by phylogenetic analysis of DNA strings. Syst Biol 51:599–613. https://doi.org/10.1080/10635150290102285

  • Eisen JA, Fraser CM (2003) Phylogenomics: intersection of evolution and genomics. Science 300:1706–1707. https://doi.org/10.1126/science.1086292

  • Felsenstein J (1993) PHYLIP (phylogeny inference package) version 3.5c. Distributed by the author at http://evolution.genetics.washington.edu/phylip.html

  • Fitch WM, Margoliash E (1967) Construction of phylogenetic trees. Science 155:279–284

  • Fitz-Gibbon ST, House CH (1999) Whole genome-based phylogenetic analysis of free-living microorganisms. Nucleic Acids Res 27:4218–4222. https://doi.org/10.1093/nar/27.21.4218

  • Gray MW (1992) The endosymbiont hypothesis revisited. Int Rev Cytol 141:233–357

  • Gray MW (1999) Evolution of organellar genomes. Curr Opin Genet Dev 9:678–687

  • Gupta RS (1998) Protein phylogenies and signature sequences: A reappraisal of evolutionary relationships among Archaebacteria, Eubacteria, and Eukaryotes. Microbiol Mol Biol Rev 62:1435–1491

  • Lemieux C, Otis C, Turmel M (2000) Ancestral chloroplast genome in Mesostigma viride reveals an early branch of green plant evolution. Nature 403:649–652. https://doi.org/10.1038/35001059

  • Li M, Badger JH, Chen X, Kwong S, Kearney P, Zhang H (2001) An information-based sequence distance and its application to whole mitochondrial genome phylogeny. Bioinformatics 17:149–154. https://doi.org/10.1093/bioinformatics/17.2.149

  • Lin J, Gerstein M (2000) Whole-genome trees based on the occurrence of folds and orthologs, implications for comparing genomes at different levels. Genome Res 10:808–818. https://doi.org/10.1101/gr.10.6.808

  • Martin W, Herrmann RG (1998) Gene transfer from organelles to the nucleus: How much, what happens, and why? Plant Physiol 118:9–17. https://doi.org/10.1104/pp.118.1.9

  • Martin W, Stoebe B, Goremykin V, Hansmann S, Hasegawa M, Kowallik KV (1998) Gene transfer to the nucleus and the evolution of chloroplasts. Nature 393:162–165

  • Martin W, Rujan T, Richly E, Hansen A, Cornelsen S, Lins T, Leister D, Stoebe B, Hasegawa M, Penny D (2002) Evolutionary analysis of Arabidopsis, cyanobacterial, and chloroplast genomes reveals plastid phylogeny and thousands of cyanobacterial genes in the nucleus. Proc Natl Acad Sci USA 99:12246–12251

  • Mayr E (1998) Two empires or the three. Proc Natl Acad Sci USA 95:9720–9723. https://doi.org/10.1073/pnas.95.17.9720

  • McFadden GI (2001a) Primary and secondary endosymbiosis and the origin of plastids. J Phycol 37:951–959. https://doi.org/10.1046/j.1529-8817.2001.01126.x

  • McFadden GI (2001b) Chloroplast origin and integration. Plant Physiol 125:50–53. https://doi.org/10.1104/pp.125.1.50

  • Moreira DH, Le Guyader H, Philippe H (2000) The origin of red algae and the evolution of chloroplasts. Nature 405:69–72

  • Palmer JD, Delwiche CF (1998) The origin and evolution of plastids and their genomes. In: Soltis DE, Soltis PS, Doyle JJ (eds) Molecular systematics of plants II: DNA sequencing. Kluwer, London, pp 345–409

  • Pennisi E (1999) Is it the time to uproot the tree of life? Science 284:1305–1308. https://doi.org/10.1126/science.284.5418.1305

  • Qi J, Luo H, Hao B (2004a) CVTree: a phylogenetic tree reconstruction tool based on whole genomes. Nucleic Acids Res 32:W45–W47. https://doi.org/10.1093/nar/gnh180

  • Qi J, Wang B, Hao B (2004b) Whole proteome prokaryote phylogeny without sequence alignment: a K-string composition approach. J Mol Evol 58:1–11. https://doi.org/10.1007/s00239-003-2493-7

  • Ragan MA (2001) Detection of lateral gene transfer among microbial genomes. Curr Opin Genet Dev 11:620–626

  • Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425

  • Sankoff D, Leduc G, Antoine N, Paquin B, Lang BF, Cedergren R (1992) Gene order comparisons for phylogenetic inference: Evolution of the mitochondrial genome. Proc Natl Acad Sci USA 89:6575–6579

  • Stirewalt VL, Michalowski CB, Loffelhardt W, Bohnert HJ, Bryant DA (1995) Nucleotide sequence of the cyanelle genome from Cyanophora paradoxa. Plant Mol Biol Rep 13:327–332

  • Stuart GW, Moffet K, Baker S (2002a) Integrated gene species phylogenies from unaligned whole genome protein sequences. Bioinformatics 18:100–108. https://doi.org/10.1093/bioinformatics/18.1.100

  • Stuart GW, Moffet K, Leader JJ (2002b) A comprehensive vertebrate phylogeny using vector representations of protein sequences from whole genomes. Mol Biol Evol 19:554–562

  • Tekaia F, Lazcano A, Dujon B (1999) The genomic tree as revealed from whole proteome comparisons. Genome Res 9:550–557

  • Turmel M, Otis C, Lemieux C (1999) The complete chloroplast DNA sequence of the green alga Nephroselmis olivacea: Insights into the architecture of ancestral chloroplast genomes. Proc Natl Acad Sci USA 96:10248–10253. https://doi.org/10.1073/pnas.96.18.10248

  • Turmel M, Otis C, Lemieux C (2002) The chloroplast and mitochondrial genome sequences of the charophyte Chaetosphaeridium globosum: Insights into the timing of the events that restructured organelle DNAs within the green algal lineage that led to land plants. Proc Natl Acad Sci USA 99:11275–11280. https://doi.org/10.1073/pnas.162203299

  • Weiss O, Jimenez MA, Herzel H (2000) Information content of protein sequences. J Theor Biol 206:379–386. https://doi.org/10.1006/jtbi.2000.2138

  • Woese CR (1987) Bacterial evolution. Microbiol Rev 51:221–271

  • Woese CR, Kandler O, Wheelis ML (1990) Towards a natural system of organisms: Proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 87:4576–4579

  • Yu ZG, Jiang P (2001) Distance, correlation and mutual information among portraits of organisms based on complete genomes. Phys Lett A 286:34–46. https://doi.org/10.1016/S0375-9601(01)00336-X

  • Yu ZG, Anh V, Lau KS (2003a) Multifractal and correlation analysis of protein sequences from complete genome. Phys Rev E 68:021913. https://doi.org/10.1103/PhysRevE.68.021913

  • Yu ZG, Anh V, Lau KS, Chu KH (2003b) The genomic tree of living organisms based on a fractal model. Phys Lett A 317:293–302. https://doi.org/10.1016/j.physleta.2003.08.040

  • Yu ZG, Anh V, Lau KS (2004) Chaos game representation, and multifractal and correlation analysis of protein sequences from complete genome based on detailed HP model. J Theor Biol 226:341–348. https://doi.org/10.1016/j.jtbi.2003.09.009

Acknowledgments

One of the authors, Zu-Guo Yu, thanks Dr. Ji Qi, ITP, Chinese Academy of Sciences, for useful discussions and for sharing his data and source code. Financial support was provided by the Youth Foundation of the Chinese National Natural Science Foundation (Grant 10101022) and Postdoctoral Research Support Grant 9900658 from Queensland University of Technology (Z.-G. Yu), and by the AoE Fund of The Chinese University of Hong Kong (K.H. Chu).

Author information

Corresponding author

Correspondence to Z.G. Yu.

Additional information

Reviewing Editor: Dr. John Oakeshott

About this article

Cite this article

Yu, Z., Zhou, L., Anh, V. et al. Phylogeny of Prokaryotes and Chloroplasts Revealed by a Simple Composition Approach on All Protein Sequences from Complete Genomes Without Sequence Alignment. J Mol Evol 60, 538–545 (2005). https://doi.org/10.1007/s00239-004-0255-9

  • DOI: https://doi.org/10.1007/s00239-004-0255-9
