Groves of Phylogenetic Trees

Annals of Combinatorics

Abstract

A major challenge in the biological sciences is the reconstruction of the Tree of Life. To this end, large genomic databases such as GenBank and SwissProt are being mined for clusters from which phylogenies can be inferred. Systematists and comparative biologists commonly combine such phylogenies into informative supertrees, which reveal information not explicitly displayed in any of the original phylogenies. However, whether a supertree is informative depends on particular overlap properties among the clusters from which it originates. In this work we formally introduce the concept of groves: sets of clusters with the potential to construct informative supertrees. Groves make it possible to identify maximal candidate sets of clusters for informative supertree construction in large databases before inferring a tree for each cluster. Groves also have the potential to lead to informative supermatrix construction. We developed methods that (i) efficiently identify particular types of groves and (ii) find lower and upper bounds on the minimal number of groves needed to cover all the trees or data sets in a database. Finally, we apply our methods to the green plant sequences from GenBank.
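
The abstract does not state the formal definition of a grove, so the following is only a rough sketch of the general idea that clusters can be grouped by their taxon overlap before any tree is inferred. It assumes, purely for illustration, that two clusters are linked when they share at least `min_shared` taxa, and it returns the connected components of the resulting overlap graph. The function name `overlap_components`, the threshold `min_shared`, and the toy cluster names are hypothetical and are not taken from the article; the article's actual grove criteria may differ.

```python
from itertools import combinations


def overlap_components(clusters, min_shared=2):
    """Group clusters into connected components of a taxon-overlap graph.

    clusters: dict mapping a cluster name to a set of taxon labels.
    min_shared: hypothetical threshold; two clusters are linked when they
        share at least this many taxa (an assumption, not the paper's rule).
    """
    names = list(clusters)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Link every pair of clusters whose taxon sets overlap enough.
    for a, b in combinations(names, 2):
        if len(clusters[a] & clusters[b]) >= min_shared:
            parent[find(a)] = find(b)

    # Collect clusters by their representative.
    groups = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(group) for group in groups.values())


# Toy input: c1 and c2 share taxa B and C; c3 shares nothing with either.
toy_clusters = {
    "c1": {"A", "B", "C"},
    "c2": {"B", "C", "D"},
    "c3": {"X", "Y", "Z"},
}
print(overlap_components(toy_clusters))
```

Raising `min_shared` splits the groups more finely, while lowering it merges them; which threshold, if any, corresponds to a grove in the authors' sense is not something the abstract specifies.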

Correspondence to Cécile Ané.

About this article

Cite this article

Ané, C., Eulenstein, O., Piaggio-Talice, R. et al. Groves of Phylogenetic Trees. Ann. Comb. 13, 139–167 (2009). https://doi.org/10.1007/s00026-009-0017-x

  • DOI: https://doi.org/10.1007/s00026-009-0017-x
