Journal of Biological Physics, Volume 28, Issue 3, pp 439–447

Phylogeny Based on Whole Genome as inferred from Complete Information Set Analysis

  • W. Li
  • W. Fang
  • L. Ling
  • J. Wang
  • Z. Xuan
  • R. Chen
Article

DOI: 10.1023/A:1020316706928

Cite this article as:
Li, W., Fang, W., Ling, L. et al. Journal of Biological Physics (2002) 28: 439. doi:10.1023/A:1020316706928

Abstract

Previous molecular phylogeny algorithms rely mainly on multi-sequence alignments of carefully selected characteristic sequences, and are thus not directly appropriate for whole genome phylogeny, where events such as rearrangements make full-length alignments impossible. We introduce here the concept of the Complete Information Set (CIS) and its implementation as a measure of evolutionary distance without reference to sequence sizes. As a proof test of the method, the 16S rRNA sequences of 22 completely sequenced Bacteria and Archaea species are used to reconstruct a phylogenetic tree, which is generally consistent with the commonly accepted one. Our further efforts, based on whole genomes, yield a highly robust whole genome phylogenetic tree, supporting separate monophyletic clusters of species with similar phenotypes as well as the early evolution of thermophilic Bacteria and the late divergence of Eukarya. The purpose of this work is not to contradict or confirm previous phylogeny standards but rather to bring a new algorithm and tool to the phylogeny research community. The software to estimate the sequence distance and the materials used in this study are available upon request from the corresponding author.

comparative genomics, information discrepancy, molecular evolution, sequence analysis
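
The abstract does not specify how the CIS distance is computed, so the following is only a minimal sketch of the general idea of an alignment-free, size-independent sequence distance. It is not the authors' measure: it swaps in the Jensen-Shannon divergence between k-mer frequency profiles as a generic stand-in, and the function names (`kmer_profile`, `js_divergence`, `sequence_distance`) and the default `k=3` are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def kmer_profile(seq, k):
    """Relative frequencies of all overlapping k-mers in seq (hypothetical helper)."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    # Normalising by the total count removes the dependence on sequence length.
    return {kmer: n / total for kmer, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two profiles; lies in [0, 1]."""
    div = 0.0
    for kmer in set(p) | set(q):
        pi, qi = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mi = (pi + qi) / 2  # mixture frequency of this k-mer
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


def sequence_distance(a, b, k=3):
    """Alignment-free distance between two sequences (stand-in, not the CIS measure)."""
    return js_divergence(kmer_profile(a, k), kmer_profile(b, k))
```

Because the profiles are relative frequencies, two sequences with the same composition but very different lengths are at distance zero under this stand-in, and sequences sharing no k-mers are at the maximum distance of 1. A matrix of such pairwise distances could then be passed to a standard distance-based tree-building method.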

Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • W. Li (1)
  • W. Fang (2)
  • L. Ling (1)
  • J. Wang (1)
  • Z. Xuan (1)
  • R. Chen (1)
  1. Laboratory of Bioinformatics, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
  2. Academy of Mathematical and Systemic Sciences, Chinese Academy of Sciences, Beijing, China