A Practical Approximation Algorithm for Solving Massive Instances of Hybridization Number

  • Leo van Iersel
  • Steven Kelk
  • Nela Lekić
  • Celine Scornavacca
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7534)

Abstract

Reticulate events play an important role in determining evolutionary relationships. The problem of computing the minimum number of such events to explain discordance between two phylogenetic trees is a hard computational problem. In practice, exact solvers struggle to solve instances with reticulation number larger than 40. For such instances, one has to resort to heuristics and approximation algorithms. Here we present the algorithm CycleKiller which is the first approximation algorithm that can produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations. Theoretically, the algorithm is an exponential-time 2-approximation (or 4-approximation in its fastest mode). However, using simulations we demonstrate that in practice the algorithm runs quickly for large and difficult instances, producing solutions within one percent of optimality. An implementation of this algorithm, which extends the theoretical work of [14], has been made publicly available.

Keywords

Phylogenetic Network, Massive Instance, Agreement Forest, Hybridization Number, Binary Phylogenetic Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Albrecht, B., Scornavacca, C., Cenci, A., Huson, D.H.: Fast computation of minimum hybridization networks. Bioinformatics 28(2), 191–197 (2012)
  2. Baroni, M., Grünewald, S., Moulton, V., Semple, C.: Bounding the number of hybridisation events for a consistent evolutionary history. Mathematical Biology 51, 171–182 (2005)
  3. Bordewich, M., Linz, S., St. John, K., Semple, C.: A reduction algorithm for computing the hybridization number of two trees. Evolutionary Bioinformatics 3, 86–98 (2007)
  4. Bordewich, M., Semple, C.: Computing the minimum number of hybridization events for a consistent evolutionary history. Discrete Applied Mathematics 155(8), 914–928 (2007)
  5. Chen, Z.-Z., Wang, L.: Hybridnet: a tool for constructing hybridization networks. Bioinformatics 26(22), 2912–2913 (2010)
  6. Chen, Z.-Z., Wang, L.: Algorithms for reticulate networks of multiple phylogenetic trees. IEEE/ACM Transactions on Computational Biology and Bioinformatics 9(2), 372–384 (2012)
  7. Collins, J., Linz, S., Semple, C.: Quantifying hybridization in realistic time. Journal of Computational Biology 18, 1305–1318 (2011)
  8. Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer (2006)
  9. Gascuel, O. (ed.): Mathematics of Evolution and Phylogeny. Oxford University Press, Inc. (2005)
  10. Gascuel, O., Steel, M. (eds.): Reconstructing Evolution: New Mathematical and Computational Advances. Oxford University Press, USA (2007)
  11. Huson, D.H., Rupp, R., Scornavacca, C.: Phylogenetic Networks: Concepts, Algorithms and Applications. Cambridge University Press (2011)
  12. Huson, D.H., Scornavacca, C.: Dendroscope 3 - a program for computing and drawing rooted phylogenetic trees and networks (2011) (in preparation), Software, http://www.dendroscope.org
  13. Huson, D.H., Scornavacca, C.: A survey of combinatorial methods for phylogenetic networks. Genome Biology and Evolution 3, 23–35 (2011)
  14. Kelk, S.M., van Iersel, L.J.J., Lekić, N., Linz, S., Scornavacca, C., Stougie, L.: Cycle killer.. qu’est ce que c’est? on the comparative approximability of hybridization number and directed feedback vertex set. Submitted, preliminary version arXiv:1112.5359v1 (math.CO)
  15. Nakhleh, L.: Evolutionary phylogenetic networks: models and issues. In: The Problem Solving Handbook for Computational Biology and Bioinformatics. Springer (2009)
  16. Rodrigues, E.M., Sagot, M.F., Wakabayashi, Y.: The maximum agreement forest problem: Approximation algorithms and computational experiments. Theoretical Computer Science 374(1-3), 91–110 (2007)
  17.
  18. Whidden, C., Beiko, R.G., Zeh, N.: Fixed-parameter and approximation algorithms for maximum agreement forests. Submitted, preliminary version arXiv:1108.2664v1 (q-bio.PE)
  19. Whidden, C., Beiko, R.G., Zeh, N.: Fast FPT Algorithms for Computing Rooted Agreement Forests: Theory and Experiments. In: Festa, P. (ed.) SEA 2010. LNCS, vol. 6049, pp. 141–153. Springer, Heidelberg (2010)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Leo van Iersel (1)
  • Steven Kelk (2)
  • Nela Lekić (2)
  • Celine Scornavacca (3)

  1. Centrum Wiskunde & Informatica (CWI), Amsterdam, The Netherlands
  2. Department of Knowledge Engineering (DKE), Maastricht University, Maastricht, The Netherlands
  3. Institut des Sciences de l’Evolution (ISEM, UMR 5554 CNRS), Université Montpellier II, Montpellier Cedex 5, France
