On the Maximum Parsimony Distance Between Phylogenetic Trees
- 83 Downloads
- 7 Citations
Abstract
Within the field of phylogenetics there is great interest in distance measures to quantify the dissimilarity of two trees. Here, based on an idea of Bruen and Bryant, we propose and analyze a new distance measure: theMaximum Parsimony (MP) distance. This is based on the difference of the parsimony scores of a single character on both trees under consideration, and the goal is to find the character which maximizes this difference. In this article we show that this new distance is a metric and provides a lower bound to the well-known Subtree Prune and Regraft (SPR) distance. We also show that to compute the MP distance it is sufficient to consider only characters that are convex on one of the trees, and prove several additional structural properties of the distance. On the complexity side, we prove that calculating the MP distance is in general NP-hard, and identify an interesting island of tractability in which the distance can be calculated in polynomial time.
Keywords
maximum parsimony tree metric subtree prune and regraft (SPR)Mathematics Subject Classification
05C15 05C35 90C35 92D15Preview
Unable to display preview. Download preview PDF.
References
- 1.Alimonti P., Kann V.: Some APX-completeness results for cubic graphs. Theoret. Comput. Sci. 237(1-2), 123–134 (2000)CrossRefMathSciNetMATHGoogle Scholar
- 2.Allen B.L., Steel M.: Subtree transfer operations and their induced metrics on evolutionary trees. Ann. Combin. 5(1), 1–15 (2001)CrossRefMathSciNetGoogle Scholar
- 3.Archie, J., Day, W., Felsenstein, J., Maddison, W., Meacham, C., Rohlf, F., Swofford, D.: The newick tree format. Avaiable online at: http://evolution.genetics.washington.edu/phylip/newicktree.html (2000)
- 4.Beiko, R.G., Hamilton, N.: Phylogenetic identification of lateral genetic transfer events. BMC Evol. Biol. 6, #P15 (2006)Google Scholar
- 5.Bonet M.L., St. John K.: On the complexity of uSPR distance. IEEE/ACM Trans. Comput. Biol. Bioinform. 7(3), 572–576 (2010)CrossRefGoogle Scholar
- 6.Bordewich M., Semple C.: On the computational complexity of the rooted subtree prune and regraft distance. Ann. Combin. 8(4), 409–423 (2004)CrossRefMathSciNetMATHGoogle Scholar
- 7.Brooks R.L.: On colouring the nodes of a network. Proc. Cambridge Philos. Soc. 37(2), 194–197 (1941)CrossRefMathSciNetGoogle Scholar
- 8.Bruen T.C., Bryant D.: Parsimony via consensus. Syst. Biol. 57(2), 251–256 (2008)CrossRefGoogle Scholar
- 9.Bryant D.: The splits in the neighborhood of a tree. Ann. Combin. 8(1), 1–11 (2004)CrossRefMATHGoogle Scholar
- 10.Buneman P.: The recovery of trees from measures of dissimilarity. In: Hodson F.R., Kendall D.G., Tautu P.T. (eds.) Mathematics in the Archaeological and Historical Sciences, pp. 387-395. Edinburgh University Press, Edinburgh (1971)Google Scholar
- 11.Catlin P.A.: Brooks’ graph-coloring theorem and the independence number. J. Combin. Theory Ser. B 27(1), 42–48 (1979)CrossRefMathSciNetMATHGoogle Scholar
- 12.Chor B., Tuller T.: Finding a maximum likelihood tree is hard. J. ACM 53(5), 722–744 (2006)CrossRefMathSciNetGoogle Scholar
- 13.Collins, J.S.: Rekernelisation algorithms in hybrid phylogenies. Master Thesis. University of Canterbury, New Zealand (2009)Google Scholar
- 14.Dailey D.P.: Uniqueness of colorability and colorability of planar 4-regular graphs are NP-complete. Discrete Math. 30(3), 289–293 (1980)CrossRefMathSciNetMATHGoogle Scholar
- 15.Ding Y., Grünewald S., Humphries P.J.: On agreement forests. J. Combin. Theory Ser. A 118(7), 2059–2065 (2011)CrossRefMathSciNetMATHGoogle Scholar
- 16.Fitch W.: Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20(4), 406–416 (1971)CrossRefGoogle Scholar
- 17.Foulds L.R., Graham R.L.: The Steiner problem in phylogeny is NP-complete. Adv. Appl. Math. 3(1), 43–49 (1982)CrossRefMathSciNetMATHGoogle Scholar
- 18.Hartigan J.A.: Minimum mutation fits to a given tree. Biometrics 29(1), 53–65 (1973)CrossRefGoogle Scholar
- 19.Kundu S., Misra J.: A linear tree partitioning algorithm. SIAM J. Comput. 6(1), 151–154 (1977)CrossRefMathSciNetMATHGoogle Scholar
- 20.Lin Y., Rajan V., Moret B.M.: A metric for phylogenetic trees based on matching. IEEE/ACM Trans. Comput. Biol. Bioinform. 9(4), 1014–1022 (2012)CrossRefGoogle Scholar
- 21.Linz S., Semple C.: A cluster reduction for computing the subtree distance between phylogenies. Ann. Combin. 15(3), 465–484 (2011)CrossRefMathSciNetMATHGoogle Scholar
- 22.Robinson D.F., Foulds L.R.: Comparison of phylogenetic trees. Math. Biosci. 53(1-2), 131–147 (1981)CrossRefMathSciNetMATHGoogle Scholar
- 23.Roch S.: A short proof that phylogenetic tree reconstruction by maximum likelihood is hard. IEEE/ACM Trans. Comput. Biol. Bioinform. 3(1), 92–94 (2006)CrossRefMathSciNetGoogle Scholar
- 24.Rodrigues E.M., Sagot M.-F., Wakabayashi Y.: The maximum agreement forest problem: approximation algorithms and computational experiments. Theoret. Comput. Sci. 374(1-3), 91–110 (2007)CrossRefMathSciNetMATHGoogle Scholar
- 25.Semple C., Steel M.: Phylogenetics. Oxford University Press, Oxford (2003)MATHGoogle Scholar
- 26.van Iersel L., Kelk S., Lekić N., Scornavacca C.: A practical approximation algorithm for solving massive instances of hybridization number. In: Raphael B., Tang J. (eds.) Algorithms in Bioinformatics, pp. 430-440. Springer-Verlag, Berlin (2012)CrossRefGoogle Scholar
- 27.van Iersel L., Kelk S., Lekić N., Stougie L.: Approximation algorithms for nonbinary agreement forests. SIAM J. Discrete Math. 28(1), 49–66 (2014)CrossRefMathSciNetMATHGoogle Scholar
- 28.Whidden C., Beiko R.G., Zeh N.: Fixed-parameter algorithms for maximum agreement forests. SIAM J. Comput. 42(4), 1431–1466 (2013)CrossRefMathSciNetMATHGoogle Scholar