A Novel Approach for Compressing Phylogenetic Trees

  • Suzanne J. Matthews
  • Seung-Jin Sul
  • Tiffani L. Williams
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6053)

Abstract

Phylogenetic trees are tree structures that depict relationships between organisms. Popular analysis techniques often produce large collections of candidate trees, which are expensive to store. We introduce TreeZip, a novel algorithm to compress phylogenetic trees based on their shared evolutionary relationships. We evaluate TreeZip’s performance on fourteen tree collections ranging from 2,505 trees on 328 taxa to 150,000 trees on 525 taxa corresponding to 0.6 MB to 434 MB in storage. Our results show that TreeZip is very effective, typically compressing a tree file to less than 2% of its original size. When coupled with standard compression methods such as 7zip, TreeZip can compress a file to less than 1% of its original size. Our results strongly suggest that TreeZip is very effective at compressing phylogenetic trees, which allows for easier exchange of data with colleagues around the world.
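The idea of exploiting shared relationships across a tree collection can be illustrated with a small sketch. It is not TreeZip's actual encoding, which the abstract does not describe. It assumes rooted Newick strings without branch lengths or internal labels, and it treats a "relationship" as a clade (the set of taxa below an internal node). The function names `clusters` and `shared_index` are hypothetical. Each distinct clade is stored once, together with the indices of the trees that contain it, so relationships repeated across many trees cost little extra space.

```python
from collections import defaultdict


def clusters(newick):
    """Return the non-trivial clades of a rooted Newick string.

    Assumes no branch lengths and no internal node labels.
    Each clade is a frozenset of taxon names; the root clade is skipped.
    """
    found = set()
    stack = []   # one set of taxa per currently open parenthesis
    name = ""
    for ch in newick:
        if ch.isspace():
            continue
        if ch == "(":
            stack.append(set())
        elif ch in ",);":
            if name and stack:
                stack[-1].add(name)   # finish the current taxon name
            name = ""
            if ch == ")":
                done = stack.pop()
                if stack:             # not the root: record and merge upward
                    found.add(frozenset(done))
                    stack[-1] |= done
        else:
            name += ch
    return found


def shared_index(trees):
    """Map each distinct clade to the list of tree indices containing it."""
    index = defaultdict(list)
    for i, tree in enumerate(trees):
        for clade in clusters(tree):
            index[clade].append(i)
    return dict(index)


# Toy collection (made up for illustration, not data from the paper).
trees = ["((A,B),(C,D));", "((A,B),C,D);", "((A,C),(B,D));"]
index = shared_index(trees)
for clade, tree_ids in sorted(index.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(clade), tree_ids)
```

In this toy collection the clade {A, B} occurs in the first two trees but is stored only once. The more relationships a collection shares, the smaller such an index becomes relative to the raw tree file.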

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Suzanne J. Matthews (1)
  • Seung-Jin Sul (1)
  • Tiffani L. Williams (1)
  1. Texas A&M University, College Station, USA
