Extracting Conflict-Free Information from Multi-labeled Trees

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNBI, volume 7534)

Abstract

A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious. We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximally reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content have the same maximally reduced form. This introduces an equivalence relation among MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both quality of the reduced tree and the degree of data reduction achieved.
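To make the notion of information content concrete, the following is a minimal brute-force sketch in Python, not the efficient algorithm of the paper. It assumes one reading of the abstract: a quartet topology ab|cd is implied by an unrooted MUL-tree if some edge separates leaves labeled a and b from leaves labeled c and d, and it is conflict-free if no other topology on the same four labels is implied. The tree, the node names, and the function names (`side_labels`, `implied_quartets`, `conflict_free_quartets`, `show`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def side_labels(adj, labels, u, v):
    """Labels of the leaves on v's side of the edge (u, v)."""
    seen, stack, found = {u, v}, [v], set()
    while stack:
        node = stack.pop()
        if node in labels:
            found.add(labels[node])
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return found


def implied_quartets(adj, labels):
    """All quartet topologies ab|cd on four distinct labels split by some edge."""
    quartets = set()
    for u in adj:
        for v in adj[u]:
            left = side_labels(adj, labels, v, u)   # side containing u
            right = side_labels(adj, labels, u, v)  # side containing v
            for a, b in combinations(sorted(left), 2):
                for c, d in combinations(sorted(right), 2):
                    if len({a, b, c, d}) == 4:
                        quartets.add(frozenset({frozenset({a, b}),
                                                frozenset({c, d})}))
    return quartets


def conflict_free_quartets(adj, labels):
    """Quartets whose four labels admit exactly one implied topology."""
    by_taxa = defaultdict(set)
    for q in implied_quartets(adj, labels):
        by_taxa[frozenset().union(*q)].add(q)
    return {next(iter(qs)) for qs in by_taxa.values() if len(qs) == 1}


def show(q):
    """Render a quartet as a string such as 'bc|de'."""
    return "|".join(sorted("".join(sorted(pair)) for pair in q))


# Hypothetical unrooted MUL-tree ((a,b),c,(d,(a,e))): label "a" occurs twice.
adj = {
    1: ["a1", "b", 2], 2: [1, "c", 3], 3: [2, "d", 4], 4: [3, "a2", "e"],
    "a1": [1], "b": [1], "c": [2], "d": [3], "a2": [4], "e": [4],
}
labels = {"a1": "a", "a2": "a", "b": "b", "c": "c", "d": "d", "e": "e"}

print(sorted(show(q) for q in implied_quartets(adj, labels)))
print(sorted(show(q) for q in conflict_free_quartets(adj, labels)))
```

In this toy tree every set of four labels containing the duplicated label a is implied with two different topologies, so under the assumed definition only bc|de survives as conflict-free information.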

Keywords

  • Information Content
  • Internal Edge
  • Contractible Edge
  • Supertree Method
  • Quartet Topology

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This work was supported in part by the National Science Foundation under grant DEB-0829674.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Deepak, A., Fernández-Baca, D., McMahon, M.M. (2012). Extracting Conflict-Free Information from Multi-labeled Trees. In: Raphael, B., Tang, J. (eds) Algorithms in Bioinformatics. WABI 2012. Lecture Notes in Computer Science, vol 7534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33122-0_7

  • DOI: https://doi.org/10.1007/978-3-642-33122-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33121-3

  • Online ISBN: 978-3-642-33122-0

  • eBook Packages: Computer Science (R0)