Fast Computation of the Exact Hybridization Number of Two Phylogenetic Trees

  • Yufeng Wu
  • Jiayin Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6053)


Hybridization is a reticulate evolutionary process. An established problem on hybridization is computing the minimum number of hybridization events, called the hybridization number, needed in the evolutionary history of two phylogenetic trees. This problem is known to be NP-hard. In this paper, we present a new practical method to compute the exact hybridization number. Our approach is based on an integer linear programming formulation. Results on biological and simulated datasets show that our method (as implemented in the program SPRDist) is more efficient and robust than an existing method.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yufeng Wu (1)
  • Jiayin Wang (1)
  1. Department of Computer Science and Engineering, University of Connecticut, Storrs, U.S.A.
