# FlipCut Supertrees: Towards Matrix Representation Accuracy in Polynomial Time


## Abstract

In computational phylogenetics, supertree methods provide a way to reconstruct larger clades of the *Tree of Life*. The supertree problem can be formalized in different ways, to cope with contradictory information in the input. In particular, there exist methods based on encoding the input trees in a matrix, and methods based on finding minimum cuts in some graph. Matrix representation methods compute supertrees of superior quality, but the underlying optimization problems are computationally hard. In contrast, graph-based methods have polynomial running time, but supertrees are inferior in quality.

In this paper, we present a novel approach for the computation of supertrees called FlipCut supertree. Our method combines the computation of minimum cuts from graph-based methods with a matrix representation method, namely Minimum Flip Supertrees. Here, the input trees are encoded in a 0/1/?-matrix. We present a heuristic to search for a minimum set of 0/1-flips such that the resulting matrix admits a directed perfect phylogeny. We then extend our approach by using edge weights to weight the columns of the 0/1/?-matrix.
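To make the matrix encoding concrete, the following is a minimal sketch, not the FlipCut implementation. It assumes rooted input trees given as nested tuples of taxon names and a common encoding convention: one column per non-trivial clade of each input tree, with `1` for taxa inside the clade, `0` for other taxa of that tree, and `?` for taxa the tree does not contain. The function names `clades`, `encode`, and `admits_directed_perfect_phylogeny` are hypothetical. The compatibility test covers only matrices without `?` entries, where a directed perfect phylogeny exists exactly when the 1-sets of every two columns are nested or disjoint.

```python
from itertools import combinations


def clades(tree):
    """Return (leaf set, clades of non-root inner vertices) of a nested-tuple tree."""
    found = []

    def leaves(node, is_root):
        if isinstance(node, str):          # a leaf is a taxon name
            return frozenset([node])
        below = frozenset().union(*(leaves(child, False) for child in node))
        if not is_root:                    # the root clade carries no information
            found.append(below)
        return below

    return leaves(tree, True), found


def encode(trees):
    """Encode rooted trees as a 0/1/?-matrix: one row per taxon, one column per clade."""
    taxa = sorted(set().union(*(clades(t)[0] for t in trees)))
    columns = []
    for t in trees:
        tree_taxa, tree_clades = clades(t)
        for clade in tree_clades:
            columns.append({
                x: ("1" if x in clade else "0") if x in tree_taxa else "?"
                for x in taxa
            })
    return {x: "".join(col[x] for col in columns) for x in taxa}


def admits_directed_perfect_phylogeny(matrix):
    """Check a complete 0/1 matrix (no '?'): 1-sets must be pairwise nested or disjoint."""
    rows = list(matrix.values())
    if any("?" in row for row in rows):
        raise ValueError("this sketch handles only matrices without '?' entries")
    n_cols = len(rows[0]) if rows else 0
    ones = [
        {taxon for taxon, row in matrix.items() if row[j] == "1"}
        for j in range(n_cols)
    ]
    return all(a <= b or b <= a or not (a & b) for a, b in combinations(ones, 2))


# Two small rooted trees with overlapping taxon sets.
matrix = encode([(("a", "b"), "c"), (("b", "c"), "d")])

# A conflicting complete matrix, and the same matrix after a single 1 -> 0 flip.
conflicting = {"a": "10", "b": "11", "c": "01"}
repaired = {"a": "10", "b": "10", "c": "01"}
```

Here `conflicting` fails the test because the clades {a, b} and {b, c} overlap without being nested, and flipping one entry for taxon `b` resolves the conflict. Finding a smallest such set of flips in general, and handling the `?` entries, is the hard part that the heuristic in the paper addresses; this sketch does not attempt it.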

In our evaluation, we show that our method is extremely swift in practice, and orders of magnitude faster than the runner-up. Concerning supertree quality, our method is sometimes on par with the “gold standard” Matrix Representation with Parsimony.

## Keywords

Phylogenetics, Supertrees, Algorithms, Minimum cut, Perfect phylogeny, Minimum flip supertree problem

## Notes

### Acknowledgement

Additional implementation by Markus Fleischhauer.
