
Constructing a Consensus Phylogeny from a Leaf-Removal Distance (Extended Abstract)

  • Cedric Chauve
  • Mark Jones
  • Manuel Lafond
  • Céline Scornavacca
  • Mathias Weller
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10508)

Abstract

Understanding the evolution of a set of genes or species is a fundamental problem in evolutionary biology. The problem we study here takes as input a set of trees describing possibly discordant evolutionary scenarios for a given set of genes or species, and aims at finding a single tree that minimizes the leaf-removal distance to the input trees. This problem is a specific instance of the general consensus/supertree problem, which is widely used to combine or summarize discordant evolutionary trees. It is specifically tailored to the case where discrepancies between the input trees are due to the misplacement of individual taxa. Most supertree or consensus tree problems are computationally intractable, and we show that the problem we introduce is also NP-hard. We provide tractability results in the form of a 2-approximation algorithm and a parameterized algorithm with respect to the number of removed leaves. We also introduce a variant that minimizes the maximum number d of leaves that are removed from any input tree, and provide a parameterized algorithm for this problem with parameter d.
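To make the notion of a leaf-removal distance concrete, the following is a minimal Python sketch. It assumes, as an illustration only, that the distance between two rooted trees on the same leaf set is the smallest number of leaves whose removal from both trees makes them identical. The nested-tuple tree encoding and the function names (restrict, leaves, leaf_removal_distance) are hypothetical and not taken from the paper, and the exhaustive search is exponential: it is not the approximation or parameterized algorithm the abstract refers to.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def restrict(tree, keep):
    """Restrict a tree to the leaves in `keep`.

    Internal nodes left with a single child are suppressed. The result is an
    order-independent canonical form (nested frozensets), or None if no leaf
    of the subtree survives.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [c for c in (restrict(child, keep) for child in tree) if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return frozenset(kept)


def leaf_removal_distance(t1, t2):
    """Brute force: fewest leaves to remove so that both trees become identical.

    Assumption (illustrative definition): both trees share the same leaf set.
    """
    labels = leaves(t1)
    if labels != leaves(t2):
        raise ValueError("trees must have the same leaf set")
    for k in range(len(labels) + 1):
        for removed in combinations(sorted(labels), k):
            keep = labels - set(removed)
            if restrict(t1, keep) == restrict(t2, keep):
                return k


# Two trees that differ only in the placement of a single taxon.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(leaf_removal_distance(t1, t2))
```

In this toy input the two trees disagree only on where one of the taxa a, b, c attaches, so removing a single leaf is enough to reconcile them. This is the kind of discrepancy caused by a misplaced taxon that the problem is tailored to.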

Keywords

Computational biology, Phylogenetics, Parameterized algorithms, Approximation, Consensus trees, Leaf deletion

Notes

Acknowledgements

MJ was partially supported by Labex NUMEV (ANR-10-LABX-20) and Vidi grant 639.072.602 from The Netherlands Organization for Scientific Research (NWO). CC was supported by NSERC Discovery Grant 249834. CS was partially supported by the French Agence Nationale de la Recherche Investissements d’Avenir/Bioinformatique (ANR-10-BINF-01-01, ANR-10-BINF-01-02, Ancestrome). ML was supported by NSERC PDF Grant. MW was partially supported by the Institut de Biologie Computationnelle (IBC).


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Cedric Chauve (1)
  • Mark Jones (2)
  • Manuel Lafond (3)
  • Céline Scornavacca (4)
  • Mathias Weller (5)

  1. Department of Mathematics, Simon Fraser University, Burnaby, Canada
  2. Delft Institute of Applied Mathematics, Delft University of Technology, Delft, The Netherlands
  3. Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada
  4. Institut des Sciences de l’Evolution, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
  5. Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier, Université de Montpellier, IBC, Montpellier, France
