Identifying Rogue Taxa through Reduced Consensus: NP-Hardness and Exact Algorithms
A rogue taxon in a collection of phylogenetic trees is one whose position varies drastically from tree to tree. The presence of such taxa can greatly reduce the resolution of the consensus tree (e.g., the majority-rule or strict consensus) for a collection. The reduced consensus approach aims to identify and eliminate rogue taxa to produce more informative consensus trees. Given a collection of phylogenetic trees over the same leaf set, the goal is to find a set of taxa whose removal maximizes the number of internal edges in the consensus tree of the collection. We show that this problem is NP-hard for strict and majority-rule consensus. We give a polynomial-time algorithm for reduced strict consensus when the maximum degree of the strict consensus of the original trees is bounded. We describe exact integer linear programming formulations for computing reduced strict, majority and loose consensus trees. In experimental tests, our exact solutions improved over heuristic methods on several problem instances.
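The optimization problem stated above can be illustrated with a small brute-force sketch for the strict consensus case. It is not the paper's polynomial-time algorithm or its integer linear programming formulation; it only enumerates candidate removal sets. The sketch assumes rooted trees encoded as nested tuples of leaf names, and it counts internal edges as nontrivial clusters (at least two taxa, fewer than all retained taxa). The names `leaves`, `clusters`, `strict_consensus_clusters`, `best_removal`, and the bound `max_removed` are hypothetical, introduced here for illustration.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf set of a tree given as nested tuples of leaf names."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree, keep):
    """Nontrivial clusters of `tree` after restricting it to the taxa in `keep`."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node]) & keep
        below = frozenset().union(*(visit(child) for child in node))
        # Assumption: a cluster is nontrivial if it holds at least two taxa
        # and is not the full retained leaf set.
        if 2 <= len(below) < len(keep):
            found.add(below)
        return below

    visit(tree)
    return found


def strict_consensus_clusters(trees, keep):
    """Clusters shared by every restricted input tree."""
    return set.intersection(*(clusters(tree, keep) for tree in trees))


def best_removal(trees, max_removed):
    """Try every removal set of size <= max_removed (exponential time)."""
    taxa = leaves(trees[0])
    best = (len(strict_consensus_clusters(trees, taxa)), frozenset())
    for k in range(1, max_removed + 1):
        for removed in combinations(sorted(taxa), k):
            keep = taxa - frozenset(removed)
            score = len(strict_consensus_clusters(trees, keep))
            if score > best[0]:
                best = (score, frozenset(removed))
    return best


# Taxon "x" sits next to "a" in one tree and next to "d" in the other.
t1 = (((("a", "x"), "b"), "c"), "d")
t2 = ((("a", "b"), "c"), ("d", "x"))
trees = [t1, t2]
print(best_removal(trees, max_removed=1))  # (2, frozenset({'x'}))
```

With "x" present, the two trees share no nontrivial cluster, so the strict consensus is unresolved. Removing "x" recovers the clusters {a, b} and {a, b, c}. The enumeration grows exponentially with `max_removed`, which is consistent with the hardness result stated in the abstract.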
Keywords: Integer Linear Programming; Consensus Tree; Internal Edge; Input Tree; Consensus Method