Constructing a Consensus Phylogeny from a Leaf-Removal Distance (Extended Abstract)

  • Cedric Chauve
  • Mark Jones
  • Manuel Lafond
  • Céline Scornavacca
  • Mathias Weller
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10508)


Understanding the evolution of a set of genes or species is a fundamental problem in evolutionary biology. The problem we study here takes as input a set of trees describing possibly discordant evolutionary scenarios for a given set of genes or species, and aims at finding a single tree that minimizes the leaf-removal distance to the input trees. This problem is an instance of the general consensus/supertree problem, widely used to combine or summarize discordant evolutionary trees. It is tailored to the case where discrepancies between the input trees are due to the misplacement of individual taxa. Most supertree or consensus tree problems are computationally intractable, and we show that the problem we introduce is also NP-hard. We provide tractability results in the form of a 2-approximation algorithm and a parameterized algorithm with respect to the number of removed leaves. We also introduce a variant that minimizes the maximum number d of leaves that are removed from any input tree, and provide a parameterized algorithm for this problem with parameter d.
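To make the objective concrete, the following is a minimal brute-force sketch, not the approximation or parameterized algorithms of the paper. It assumes rooted trees on a common leaf set, encoded as nested tuples with string leaves, and it assumes that the leaf-removal distance between two trees is the minimum number of leaves whose removal from both makes them identical. The names `restrict`, `leaf_removal_distance` and `total_cost` are hypothetical, and the search is exponential in the number of leaves, so it is only suitable for toy inputs.

```python
from itertools import combinations

def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))

def restrict(tree, keep):
    """Restrict a tree to the leaves in `keep`, suppressing unary nodes.

    The result is an unordered canonical form (nested frozensets),
    or None if no kept leaf lies below this node.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]  # suppress a node left with a single child
    return frozenset(kids)

def leaf_removal_distance(t1, t2):
    """Assumed definition: fewest leaves to remove so both trees coincide."""
    labels = leaves(t1)
    if labels != leaves(t2):
        raise ValueError("trees must share the same leaf set")
    for k in range(len(labels) + 1):
        for removed in combinations(sorted(labels), k):
            keep = labels - set(removed)
            if restrict(t1, keep) == restrict(t2, keep):
                return k

def total_cost(candidate, input_trees):
    """Sum of leaf-removal distances from a candidate tree to all inputs."""
    return sum(leaf_removal_distance(candidate, t) for t in input_trees)

# Two trees that differ only in the placement of taxa c and d.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "b"), "d"), "c")
print(leaf_removal_distance(t1, t2))   # 1
print(total_cost(t1, [t1, t2, t1]))    # 1
```

Under these assumptions, a consensus tree is a candidate minimizing `total_cost`; the variant mentioned above would instead minimize the maximum of the individual distances.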


Keywords: Computational biology, Phylogenetics, Parameterized algorithms, Approximation, Consensus trees, Leaf deletion



MJ was partially supported by Labex NUMEV (ANR-10-LABX-20) and Vidi grant 639.072.602 from The Netherlands Organization for Scientific Research (NWO). CC was supported by NSERC Discovery Grant 249834. CS was partially supported by the French Agence Nationale de la Recherche Investissements d’Avenir/Bioinformatique (ANR-10-BINF-01-01, ANR-10-BINF-01-02, Ancestrome). ML was supported by NSERC PDF Grant. MW was partially supported by the Institut de Biologie Computationnelle (IBC).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Cedric Chauve (1)
  • Mark Jones (2), corresponding author
  • Manuel Lafond (3)
  • Céline Scornavacca (4)
  • Mathias Weller (5)

  1. Department of Mathematics, Simon Fraser University, Burnaby, Canada
  2. Delft Institute of Applied Mathematics, Delft University of Technology, Delft, The Netherlands
  3. Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada
  4. Institut des Sciences de l’Evolution, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
  5. Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier, Université de Montpellier, IBC, Montpellier, France
