Abstract
Understanding the evolution of a set of genes or species is a fundamental problem in evolutionary biology. The problem we study here takes as input a set of trees describing possibly discordant evolutionary scenarios for a given set of genes or species, and aims at finding a single tree that minimizes the leaf-removal distance to the input trees. This problem is a specific instance of the general consensus/supertree problem, widely used to combine or summarize discordant evolutionary trees. The problem we introduce is specifically tailored to address the case of discrepancies between the input trees due to the misplacement of individual taxa. Most supertree or consensus tree problems are computationally intractable, and we show that the problem we introduce is also NP-hard. We provide tractability results in the form of a 2-approximation algorithm and a parameterized algorithm with respect to the number of removed leaves. We also introduce a variant that minimizes the maximum number d of leaves that are removed from any input tree, and provide a parameterized algorithm for this problem with parameter d.
All missing proofs are provided in [6].
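To make the objective concrete, the following is a minimal brute-force sketch, not the algorithms of the paper. It assumes that the leaf-removal distance between two rooted binary trees on the same leaf set is the smallest number of leaves whose removal from both trees makes them identical, and that the consensus objective sums this distance over the input trees; neither definition is spelled out in the abstract. Trees are encoded as nested pairs with string leaf labels, and all function names are hypothetical.

```python
from itertools import combinations

def leaves(tree):
    """Return the set of leaf labels of a nested-pair tree."""
    if not isinstance(tree, tuple):
        return {tree}
    return leaves(tree[0]) | leaves(tree[1])

def restrict(tree, keep):
    """Restrict a tree to the leaves in `keep`, suppressing internal nodes
    left with a single child. Returns None if no leaf is kept."""
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    left, right = restrict(tree[0], keep), restrict(tree[1], keep)
    if left is None:
        return right
    if right is None:
        return left
    return (left, right)

def canonical(tree):
    """Canonical form that ignores the left/right order of children."""
    if not isinstance(tree, tuple):
        return tree
    return frozenset({canonical(tree[0]), canonical(tree[1])})

def leaf_removal_distance(t1, t2):
    """Assumed definition: fewest leaves to remove so both trees coincide.
    Exponential-time enumeration, for tiny illustrative inputs only."""
    labels = leaves(t1)
    if labels != leaves(t2):
        raise ValueError("trees must share the same leaf set")
    for k in range(len(labels) + 1):
        for removed in combinations(sorted(labels), k):
            keep = labels - set(removed)
            if canonical(restrict(t1, keep)) == canonical(restrict(t2, keep)):
                return k

def total_distance(candidate, trees):
    """Assumed objective: sum of distances from a candidate to all inputs."""
    return sum(leaf_removal_distance(candidate, t) for t in trees)

# Two trees that disagree only on the placement of one taxon.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(leaf_removal_distance(t1, t2))
print(total_distance(t1, [t1, t2]))
```

In this example, removing any one of the leaves a, b or c from both trees leaves the same three-leaf tree, which is the single misplaced taxon scenario the abstract describes.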
Notes
1. All trees we consider here are uniquely leaf-labeled, rooted (i.e., are out-trees) and binary; see next section for formal definitions.
References
Aberer, A.J., Krompass, D., Stamatakis, A.: Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice. Syst. Biol. 62(1), 162–166 (2013). http://dx.doi.org/10.1093/sysbio/sys078
Amir, A., Keselman, D.: Maximum agreement subtree in a set of evolutionary trees: metrics and efficient algorithms. SIAM J. Comput. 26, 1656–1669 (1997). http://dx.doi.org/10.1137/S0097539794269461
Bryant, D.: Building trees, hunting for trees, and comparing trees. Ph.D. thesis, University of Canterbury (1997)
Bryant, D., McKenzie, A., Steel, M.: The size of a maximum agreement subtree for random binary trees. DIMACS Ser. Discrete Math. Theor. Comput. Sci. 61, 55–66 (2003)
Byrka, J., Guillemot, S., Jansson, J.: New results on optimizing rooted triplets consistency. Discrete Appl. Math. 158, 1136–1147 (2010). http://dx.doi.org/10.1016/j.dam.2010.03.004
Chauve, C., Jones, M., Lafond, M., Scornavacca, C., Weller, M.: Constructing a consensus phylogeny from a leaf-removal distance. http://arxiv.org/abs/1705.05295
Chester, A., Dondi, R., Wirth, A.: Resolving rooted triplet inconsistency by dissolving multigraphs. In: Hubert Chan, T.-H., Lau, L.C., Trevisan, L. (eds.) TAMC 2013. LNCS, vol. 7876, pp. 260–271. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38236-9_24
Cole, R., Farach-Colton, M., Hariharan, R., Przytycka, T.M., Thorup, M.: An O(n log n) algorithm for the maximum agreement subtree problem for binary trees. SIAM J. Comput. 30, 1385–1404 (2000). http://dx.doi.org/10.1137/S0097539796313477
Deng, Y., Fernández-Baca, D.: Fast compatibility testing for rooted phylogenetic trees. In: Combinatorial Pattern Matching, Leibniz International Proceedings in Informatics (LIPIcs), vol. 54, pp. 12:1–12:12 (2016). http://drops.dagstuhl.de/opus/volltexte/2016/6088
Fernández-Baca, D., Guillemot, S., Shutters, B., Vakati, S.: Fixed-parameter algorithms for finding agreement supertrees. SIAM J. Comput. 44, 384–410 (2015). http://dx.doi.org/10.1137/120897559
Guillemot, S., Mnich, M.: Kernel and fast algorithm for dense triplet inconsistency. Theoret. Comput. Sci. 494, 134–143 (2013). http://dx.doi.org/10.1016/j.tcs.2012.12.032
Gusfield, D.: Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)
Hellmuth, M., Wieseke, N., Lechner, M., Lenhof, H.P., Middendorf, M., Stadler, P.F.: Phylogenomics with paralogs. Proc. Natl. Acad. Sci. USA 112, 2058–2063 (2015). http://dx.doi.org/10.1073/pnas.1412770112
Jarvis, E.D., et al.: Whole-genome analyses resolve early branches in the tree of life of modern birds. Science 346, 1320–1331 (2014). http://dx.doi.org/10.1126/science.1253451
Scornavacca, C., Galtier, N.: Incomplete lineage sorting in Mammalian phylogenomics. Syst. Biol. 66, 112–120 (2017). http://dx.doi.org/10.1093/sysbio/syw082
Scornavacca, C., Jacox, E., Szollösi, G.J.: Joint amalgamation of most parsimonious reconciled gene trees. Bioinformatics 31, 841–848 (2015). http://dx.doi.org/10.1093/bioinformatics/btu728
Szollösi, G.J., Boussau, B., Abby, S.S., Tannier, E., Daubin, V.: Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations. Proc. Natl. Acad. Sci. USA 109, 17513–17518 (2012). http://dx.doi.org/10.1073/pnas.1202997109
Vachaspati, P., Warnow, T.: FastRFS: fast and accurate Robinson-Foulds supertrees using constrained exact optimization. Bioinformatics 33, 631–639 (2017). http://dx.doi.org/10.1093/bioinformatics/btw600
Whidden, C., Zeh, N., Beiko, R.G.: Supertrees based on the subtree prune-and-regraft distance. Syst. Biol. 63, 566–581 (2014). http://dx.doi.org/10.1093/sysbio/syu023
Acknowledgements
MJ was partially supported by Labex NUMEV (ANR-10-LABX-20) and Vidi grant 639.072.602 from The Netherlands Organization for Scientific Research (NWO). CC was supported by NSERC Discovery Grant 249834. CS was partially supported by the French Agence Nationale de la Recherche Investissements d’Avenir/Bioinformatique (ANR-10-BINF-01-01, ANR-10-BINF-01-02, Ancestrome). ML was supported by NSERC PDF Grant. MW was partially supported by the Institut de Biologie Computationnelle (IBC).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Chauve, C., Jones, M., Lafond, M., Scornavacca, C., Weller, M. (2017). Constructing a Consensus Phylogeny from a Leaf-Removal Distance (Extended Abstract). In: Fici, G., Sciortino, M., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2017. Lecture Notes in Computer Science, vol 10508. Springer, Cham. https://doi.org/10.1007/978-3-319-67428-5_12
DOI: https://doi.org/10.1007/978-3-319-67428-5_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67427-8
Online ISBN: 978-3-319-67428-5
eBook Packages: Computer Science, Computer Science (R0)