Algorithmica, Volume 67, Issue 2, pp 142–160

FlipCut Supertrees: Towards Matrix Representation Accuracy in Polynomial Time

  • Malte Brinkmeyer
  • Thasso Griebel
  • Sebastian Böcker

Abstract

In computational phylogenetics, supertree methods provide a way to reconstruct larger clades of the Tree of Life. The supertree problem can be formalized in different ways, to cope with contradictory information in the input. In particular, there exist methods based on encoding the input trees in a matrix, and methods based on finding minimum cuts in some graph. Matrix representation methods compute supertrees of superior quality, but the underlying optimization problems are computationally hard. In contrast, graph-based methods have polynomial running time, but the supertrees they produce are inferior in quality.

In this paper, we present a novel approach for the computation of supertrees called FlipCut supertree. Our method combines the computation of minimum cuts from graph-based methods with a matrix representation method, namely Minimum Flip Supertrees. Here, the input trees are encoded in a 0/1/?-matrix. We present a heuristic to search for a minimum set of 0/1-flips such that the resulting matrix admits a directed perfect phylogeny. We then extend our approach by using edge weights to weight the columns of the 0/1/?-matrix.
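To make the matrix encoding concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual rooted matrix representation: each inner vertex of an input tree other than the root yields one column, with entry 1 for taxa below that vertex, 0 for taxa in the tree but not below it, and ? for taxa missing from the tree. The nested-tuple tree format and the function names `leaves`, `clades`, `encode` and `admits_perfect_phylogeny` are illustrative assumptions. The compatibility test uses the classical criterion for complete 0/1 matrices only (the 1-sets of any two columns are disjoint or nested). It does not handle ?-entries and does not search for flips.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of taxon labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the leaf set of every inner node except the root."""
    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            yield frozenset(leaves(node))
        for child in node:
            yield from walk(child, False)
    yield from walk(tree, True)


def encode(trees):
    """Build the 0/1/?-matrix: one row per taxon, one column per clade."""
    taxa = sorted(set().union(*(leaves(t) for t in trees)))
    columns = []
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            columns.append(
                ["1" if x in clade else "0" if x in present else "?" for x in taxa]
            )
    rows = {x: "".join(col[i] for col in columns) for i, x in enumerate(taxa)}
    return taxa, rows


def admits_perfect_phylogeny(rows):
    """Check a complete 0/1 matrix: all column 1-sets must be disjoint or nested."""
    if any("?" in r for r in rows.values()):
        raise ValueError("this sketch only handles complete matrices (no '?')")
    n_cols = len(next(iter(rows.values()), ""))
    one_sets = [{x for x, r in rows.items() if r[j] == "1"} for j in range(n_cols)]
    return all(
        a.isdisjoint(b) or a <= b or b <= a for a, b in combinations(one_sets, 2)
    )


# Two small conflicting input trees on overlapping taxon sets (made-up example).
taxa, rows = encode([(("a", "b"), "c"), (("b", "c"), "d")])
for taxon in taxa:
    print(taxon, rows[taxon])
```

In this toy input the two columns come from the clades {a, b} and {b, c}. Resolving the ?-entries and flipping 0/1-entries so that the columns become pairwise disjoint or nested is the optimization task that the heuristic in the paper addresses.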

In our evaluation, we show that our method is extremely swift in practice, and orders of magnitude faster than the runner-up. Concerning supertree quality, our method is sometimes on par with the “gold standard” Matrix Representation with Parsimony.

Keywords

Phylogenetics, Supertrees, Algorithms, Minimum cut, Perfect phylogeny, Minimum flip supertree problem


Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  • Malte Brinkmeyer (1)
  • Thasso Griebel (1)
  • Sebastian Böcker (1)
  1. Lehrstuhl für Bioinformatik, Friedrich-Schiller-Universität Jena, Jena, Germany
