Lower Bounds for Maximum Parsimony with Gene Order Data

  • Abraham Bachrach
  • Kevin Chen
  • Chris Harrelson
  • Radu Mihaescu
  • Satish Rao
  • Apurva Shah
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3678)

Abstract

In this paper, we study lower bound techniques for branch-and-bound algorithms for maximum parsimony, with a focus on gene order data. We give a simple O(n3) time dynamic programming algorithm for computing the maximum circular ordering lower bound, where n is the number of leaves. The well-known gene order phylogeny program, GRAPPA, currently implements two heuristic approximations to this lower bounds. Our experiments show a significant improvement over both these methods in practice. Next, we show that the linear programming-based lower bound of Tang and Moret (Tang and Moret, 2005) can be greatly simplified, allowing us to solve the LP in O*n3) time in the worst case, and in O*(n2.5) time amortized over all binary trees. Finally, we formalize the problem of computing the circular ordering lower bound, when the tree topologies are generated bottom-up, as a Path-Constrained Traveling Salesman Problem, and give a polynomial-time 3-approximation algorithm for it. This is a special case of the more general Precedence-Constrained Travelling Salesman Problem and has not previously been studied, to the best of our knowledge.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Moret, B., Wyman, S., Bader, D., Warnow, T., Yan, M.: A new implementation and detailed study of breakpoint analysis. In: PSMB (2001)Google Scholar
  2. 2.
    Sankoff, D., Blanchette, M.: Multiple genome rearrangement and breakpoint phylogeny. J. Comput. Biol. 5(3), 555–570 (1998)CrossRefGoogle Scholar
  3. 3.
    Bourque, G., Pevzner, P.: Genome-scale evolution: Reconstructing gene orders in the ancestral species. Genome Res. 12(1), 26–36 (2002)Google Scholar
  4. 4.
    Bryant, D.: A lower bound for the breakpoint phylogeny problem. Journal of Discrete Algorithms 2, 229–255 (2004)MATHCrossRefMathSciNetGoogle Scholar
  5. 5.
    Purdom, P., Bradford, P., Tamura, K., Kumar, S.: Single column discrepency and dynamic max-mini optimizations for quickly finding the most parsimonious evolutionary trees. Bioinformatics 16(2), 140–151 (2000)CrossRefGoogle Scholar
  6. 6.
    Holland, B., Huber, K., Penny, D., Moulton, V.: The minmax squeeze: Guarenteeing a minimal tree for population data. Mol. Biol. and Evol. 22(2), 235–242 (2005)CrossRefGoogle Scholar
  7. 7.
    Tang, J.: A study of bounding methods for reconstructing phylogenies from gene-order data. PhD Thesis (2003)Google Scholar
  8. 8.
    Tang, J., Moret, B., Cui, L., de Pamphilis, C.: Phylogenetic reconstruction from arbitrary gene-order data. In: BIBE (2004)Google Scholar
  9. 9.
    Tang, J., Moret, B.: Linear programming for phylogenetic reconstruction based on gene rearrangements. In: CPM (2005)Google Scholar
  10. 10.
    Moret, B., Tang, J., Warnow, T.: Reconstructing phylogenies from gene-content and gene-order data. In: Gascuel, O. (ed.) Mathematics of Evolution and Phylogeny. Oxford Univ. Press, Oxford (2004)Google Scholar
  11. 11.
    Chaudhuri, K., Chen, K., Mihaescu, R., Rao, S.: On the tandem duplication-random loss model of genome rearrangement (in review)Google Scholar
  12. 12.
    Charikar, M., Motwani, R., Raghavan, P., Silverstein, C.: Constrained TSP and low power computing. In: WADS (1997)Google Scholar
  13. 13.
    Moret, B.: Personal communication (2005)Google Scholar
  14. 14.
    Lancia, G., Ravi, R.: GESTALT: Genomic steiner alignments. In: CPM (1999)Google Scholar
  15. 15.
    Young, N.: Sequential and parallel algorithms for mixed packing and covering. In: FOCS (2001)Google Scholar
  16. 16.
    Flajolet, P., Odlyzko, A.M.: The average height of binary trees and other simple trees. J. Computer System Sci. 25, 171–213 (1982)MATHCrossRefMathSciNetGoogle Scholar
  17. 17.
    Bar-Joseph, Z., Demaine, E.D., Gifford, D.K., Hamel, A.M., Jaakkola, T.S., et al.: K-ary clustering with optimal leaf ordering for gene expression data. Bioinformatics 19(9), 1070–1078 (2003)CrossRefGoogle Scholar
  18. 18.
    Burkard, R.E., Deineko, V.G., Woeginger, G.J.: The travelling salesman and the pq-tree. Mathematics of Operations Research 24, 262–272 (1999)CrossRefMathSciNetGoogle Scholar
  19. 19.
    Farach, M., Kannan, S., Warnow, T.: A robust model for finding optimal evolutionary trees. In: STOC (1993)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Abraham Bachrach
    • 1
  • Kevin Chen
    • 1
  • Chris Harrelson
    • 1
  • Radu Mihaescu
    • 1
  • Satish Rao
    • 1
  • Apurva Shah
    • 1
  1. 1.Department of Computer ScienceUC Berkeley 

Personalised recommendations