Annals of Combinatorics, Volume 21, Issue 4, pp 573–604

On the Complexity of Computing MP Distance Between Binary Phylogenetic Trees

Open Access

Abstract

Within the field of phylogenetics there is great interest in distance measures to quantify the dissimilarity of two trees. Recently, a new distance measure has been proposed: the Maximum Parsimony (MP) distance. This is based on the difference of the parsimony scores of a single character on both trees under consideration, and the goal is to find the character which maximizes this difference. Here we show that computation of MP distance on two binary phylogenetic trees is NP-hard. This is a highly nontrivial extension of an earlier NP-hardness proof for two multifurcating phylogenetic trees, and it is particularly relevant given the prominence of binary trees in the phylogenetics literature. As a corollary to the main hardness result we show that computation of MP distance is also hard on binary trees if the number of states available is bounded. In fact, via a different reduction we show that it is hard even if only two states are available. Finally, as a first response to this hardness we give a simple Integer Linear Program (ILP) formulation which is capable of computing the MP distance exactly for small trees (and for larger trees when only a small number of character states are available) and which is used to computationally verify several auxiliary results required by the hardness proofs.
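
To make the definition concrete, the following is a minimal brute-force sketch of the MP distance for very small trees. It is not the ILP formulation mentioned above, and it is not taken from the article. It assumes each tree is given as nested 2-tuples of leaf labels (an arbitrary rooting of the binary tree, which does not affect the parsimony score), computes parsimony scores with Fitch's algorithm, and enumerates every character on the taxon set. The names `fitch_score` and `mp_distance` are hypothetical, chosen for illustration only. The enumeration grows as n^n in the number of taxa n, so it is usable only for a handful of taxa.

```python
from itertools import product


def fitch_score(tree, character):
    """Parsimony score of one character on a rooted binary tree (Fitch's algorithm).

    tree: nested 2-tuples whose tips are leaf labels, e.g. (("a", "b"), ("c", "d")).
    character: dict mapping each leaf label to its state.
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if not isinstance(node, tuple):
            # A leaf: its state set is the single observed state.
            return {character[node]}
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:
            return common
        # Disjoint child sets force one state change at this node.
        changes += 1
        return left | right

    state_set(tree)
    return changes


def mp_distance(tree1, tree2, taxa):
    """Brute-force MP distance: maximum score difference over all characters.

    A character on n taxa uses at most n distinct states, so states
    0..n-1 suffice. Illustrative assumption: exhaustive search, small n only.
    """
    best = 0
    for states in product(range(len(taxa)), repeat=len(taxa)):
        character = dict(zip(taxa, states))
        diff = abs(fitch_score(tree1, character) - fitch_score(tree2, character))
        best = max(best, diff)
    return best


taxa = ["a", "b", "c", "d"]
t1 = (("a", "b"), ("c", "d"))  # quartet ab|cd
t2 = (("a", "c"), ("b", "d"))  # quartet ac|bd
print(mp_distance(t1, t2, taxa))
```

For the two quartet trees above, the character with a = b = 0 and c = d = 1 has parsimony score 1 on `t1` and 2 on `t2`, which is a witness for a score difference of 1.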

Mathematics Subject Classification

05C15, 05C35, 90C35, 92D15

Keywords

Maximum Parsimony, phylogenetics, tree metrics, NP-hard, binary trees

Copyright information

© The Author(s) 2017

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Department of Knowledge Engineering, Maastricht University, Maastricht, The Netherlands
  2. Department for Mathematics and Computer Science, Ernst-Moritz-Arndt University of Greifswald, Greifswald, Germany
