Constructing Minimal Phylogenetic Networks from Softwired Clusters is Fixed Parameter Tractable

Abstract

Here we show that, given a set of clusters \({\mathcal{C}}\) on a set of taxa \({\mathcal{X}}\), where \(|{\mathcal{X}}|=n\), it is possible to determine in time f(k)⋅poly(n) whether there exists a level-≤k network (i.e. a network where each biconnected component has reticulation number at most k) that represents all the clusters in \({\mathcal{C}}\) in the softwired sense, and if so to construct such a network. This extends a result from Kelk et al. (in IEEE/ACM Trans. Comput. Biol. Bioinform. 9:517–534, 2012) which showed that the problem is polynomial-time solvable for fixed k. By defining “k-reticulation generators” analogous to “level-k generators”, we then extend this fixed parameter tractability result to the problem where k refers not to the level but to the reticulation number of the whole network.
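
As a rough illustration of the notion of level used above, the following Python sketch computes the reticulation number and the level of a small network given as a list of directed edges. It assumes the general definition of reticulation number as the sum of (indegree minus one) over all reticulation vertices, counts indegrees using only the edges inside each biconnected component, and finds the biconnected components of the underlying undirected graph with a standard depth-first search. The function names and the example network are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def reticulation_number(edges):
    """Sum of (indegree - 1) over vertices with indegree > 1 (assumed definition)."""
    indeg = defaultdict(int)
    for _, v in edges:
        indeg[v] += 1
    return sum(d - 1 for d in indeg.values() if d > 1)


def biconnected_components(edges):
    """Edge lists of the biconnected components of the underlying undirected graph."""
    adj = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    depth, low, stack, comps = {}, {}, [], []

    def dfs(u, parent_edge):
        low[u] = depth[u]
        for w, i in adj[u]:
            if i == parent_edge:
                continue
            if w not in depth:
                # Tree edge of the DFS: descend, then test whether u separates w's subtree.
                stack.append(i)
                depth[w] = depth[u] + 1
                dfs(w, i)
                low[u] = min(low[u], low[w])
                if low[w] >= depth[u]:
                    comp = []
                    while True:
                        j = stack.pop()
                        comp.append(edges[j])
                        if j == i:
                            break
                    comps.append(comp)
            elif depth[w] < depth[u]:
                # Back edge to an ancestor of u.
                stack.append(i)
                low[u] = min(low[u], depth[w])

    for s in list(adj):
        if s not in depth:
            depth[s] = 0
            dfs(s, None)
    return comps


def level(edges):
    """Largest reticulation number over all biconnected components."""
    return max((reticulation_number(c) for c in biconnected_components(edges)), default=0)


# Hypothetical example: two cycles joined by the cut edge (h, w).
two_galls = [
    ("r", "u"), ("r", "v"), ("u", "a"), ("u", "h"), ("v", "h"), ("v", "c"),
    ("h", "w"), ("w", "x"), ("w", "y"), ("x", "d"), ("x", "g"), ("y", "g"),
    ("y", "e"), ("g", "f"),
]
print(reticulation_number(two_galls), level(two_galls))  # 2 1
```

The example network has two reticulation vertices in total, but each lies in its own biconnected component, so it is a level-1 network with reticulation number 2. This is the distinction between the two parameterizations mentioned in the abstract.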

Notes

  1.

    This is the definition when all reticulation vertices have indegree 2; for more general networks, the reticulation number is defined slightly differently. See the Preliminaries for more information.

  2.

    Alternatively, we say that a network N represents a cluster \(C \subset \mathcal{X}\) “in the hardwired sense” if there exists a tree edge (u,v) of N such that C is the set of leaf descendants of v.

  3.

    Otherwise \({\mathcal{C}}\) can be trivially represented by the star tree on \({\mathcal{X}}\).

  4.

    Note that to determine the reticulation number of a biconnected component, the indegree of each node is computed using only edges belonging to this biconnected component.

  5.

    Recall that, by Lemma 1 of [20], the existence of a level-k network representing a separating set of clusters \({\mathcal{C}}\) on \({\mathcal{X}}\) implies that a simple level-k network representing \({\mathcal{C}}\) has to exist.

  6.

    Note that the number of level-k generators grows rapidly in k, lying between \(2^{k-1}\) and \(k!^{2}50^{k}\) [8].

  7.

    Indeed, short sides can be allocated taxa in only two ways: firstly, indirectly via Algorithm 1; secondly, in Algorithm 3, at the very end of the entire procedure, when there are no longer any unfinished long sides.
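
To make the contrast between the hardwired sense of Note 2 and the softwired sense concrete, here is a minimal brute-force sketch in Python. It assumes the usual softwired definition: a cluster is represented if, after keeping exactly one incoming edge for each reticulation vertex, it is the set of taxa reachable from the head of some tree edge. The code enumerates every such choice, so it is exponential in the number of reticulations and is not the fixed parameter tractable algorithm of the paper. All function names and the example network are hypothetical.

```python
from itertools import product


def _parents(edges):
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    return parents


def _clusters(kept_edges, tree_edges, leaves):
    """Taxa reachable from the head of each tree edge, using only kept_edges."""
    children = {}
    for u, v in kept_edges:
        children.setdefault(u, []).append(v)

    def below(v):
        if v in leaves:
            return frozenset([v])
        return frozenset().union(*(below(c) for c in children.get(v, [])))

    return {below(v) for _, v in tree_edges} - {frozenset()}


def _split(edges):
    parents = _parents(edges)
    leaves = {v for _, v in edges} - {u for u, _ in edges}
    # A tree edge is an edge whose head has indegree 1.
    tree_edges = [(u, v) for u, v in edges if len(parents[v]) == 1]
    retics = [v for v, ps in parents.items() if len(ps) > 1]
    return parents, leaves, tree_edges, retics


def hardwired_clusters(edges):
    """Clusters in the hardwired sense: all leaf descendants below a tree edge."""
    _, leaves, tree_edges, _ = _split(edges)
    return _clusters(edges, tree_edges, leaves)


def softwired_clusters(edges):
    """Clusters in the softwired sense, by enumerating every switching."""
    parents, leaves, tree_edges, retics = _split(edges)
    result = set()
    for choice in product(*(parents[r] for r in retics)):
        # Keep all tree edges plus one chosen incoming edge per reticulation.
        kept = tree_edges + list(zip(choice, retics))
        result |= _clusters(kept, tree_edges, leaves)
    return result


# Hypothetical network on taxa a, b, c, d with one reticulation vertex h.
network = [
    ("r", "u"), ("r", "v"), ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"), ("v", "d"), ("h", "b"),
]
extra = softwired_clusters(network) - hardwired_clusters(network)
print(sorted(sorted(c) for c in extra))
```

In this example the cluster {c, d} is represented only in the softwired sense: it appears below v when the edge (v, h) is switched off, whereas the hardwired cluster of v always contains b.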

References

  1.

    Bordewich, M., Semple, C.: Computing the hybridization number of two phylogenetic trees is fixed-parameter tractable. IEEE/ACM Trans. Comput. Biol. Bioinf. 4(3), 458–466 (2007)

  2.

    Bordewich, M., Semple, C.: Computing the minimum number of hybridization events for a consistent evolutionary history. Discrete Appl. Math. 155(8), 914–928 (2007)

  3.

    Bordewich, M., Linz, S., St. John, K., Semple, C.: A reduction algorithm for computing the hybridization number of two trees. Evol. Bioinform. 3, 86–98 (2007)

  4.

    Chen, Z.-Z., Wang, L.: Algorithms for reticulate networks of multiple phylogenetic trees. IEEE/ACM Trans. Comput. Biol. Bioinform. 9, 372–384 (2012)

  5.

    Collins, J., Linz, S., Semple, C.: Quantifying hybridization in realistic time. J. Comput. Biol. 18(10), 1305–1318 (2011)

  6.

    Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer, Berlin (1999)

  7.

    Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer, Berlin (2006)

  8.

    Gambette, P., Berry, V., Paul, C.: The structure of level-k phylogenetic networks. In: Proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching, CPM ’09, pp. 289–300. Springer, Berlin (2009)

  9.

    Gascuel, O. (ed.): Mathematics of Evolution and Phylogeny. Oxford University Press, Oxford (2005)

  10.

    Gascuel, O., Steel, M. (eds.): Reconstructing Evolution: New Mathematical and Computational Advances. Oxford University Press, Oxford (2007)

  11.

    Gramm, J., Nickelsen, A., Tantau, T.: Fixed-parameter algorithms in phylogenetics. Comput. J. 51(1), 79–101 (2008)

  12.

    Gusfield, D., Bansal, V., Bafna, V., Song, Y.: A decomposition theory for phylogenetic networks and incompatible characters. J. Comput. Biol. 14(10), 1247–1272 (2007)

  13.

    Gusfield, D., Hickerson, D., Eddhu, S.: An efficiently computed lower bound on the number of recombinations in phylogenetic networks: theory and empirical study. Discrete Appl. Math. 155(6–7), 806–830 (2007)

  14.

    Huson, D.H., Scornavacca, C.: A survey of combinatorial methods for phylogenetic networks. Genome Biol. Evol. 3, 23–35 (2011)

  15.

    Huson, D.H., Rupp, R., Berry, V., Gambette, P., Paul, C.: Computing galled networks from real data. Bioinformatics 25(12), i85–i93 (2009)

  16.

    Huson, D.H., Rupp, R., Scornavacca, C.: Phylogenetic Networks: Concepts, Algorithms and Applications. Cambridge University Press, Cambridge (2011)

  17.

    Huynh, T.N.D., Jansson, J., Nguyen, N.B., Sung, W.-K.: Constructing a smallest refining galled phylogenetic network. In: Research in Computational Molecular Biology (RECOMB). Lecture Notes in Bioinformatics, vol. 3500, pp. 265–280 (2005)

  18.

    Jansson, J., Sung, W.-K.: Inferring a level-1 phylogenetic network from a dense set of rooted triplets. Theor. Comput. Sci. 363(1), 60–68 (2006)

  19.

    Jansson, J., Nguyen, N.B., Sung, W.-K.: Algorithms for combining rooted triplets into a galled phylogenetic network. SIAM J. Comput. 35(5), 1098–1121 (2006)

  20.

    Kelk, S., Scornavacca, C., van Iersel, L.: On the elusiveness of clusters. IEEE/ACM Trans. Comput. Biol. Bioinform. 9, 517–534 (2012)

  21.

    Myers, S.R., Griffiths, R.C.: Bounds on the minimum number of recombination events in a sample history. Genetics 163, 375–394 (2003)

  22.

    Nakhleh, L.: Evolutionary phylogenetic networks: models and issues. In: The Problem Solving Handbook for Computational Biology and Bioinformatics. Springer, Berlin (2009)

  23.

    Niedermeier, R.: Invitation to Fixed Parameter Algorithms. Oxford Lecture Series in Mathematics and Its Applications. Oxford University Press, Oxford (2006)

  24.

    Semple, C.: Hybridization networks. In: Reconstructing Evolution: New Mathematical and Computational Advances. Oxford University Press, Oxford (2007)

  25.

    Semple, C., Steel, M.: Phylogenetics. Oxford University Press, Oxford (2003)

  26.

    To, T.-H., Habib, M.: Level-k phylogenetic networks are constructable from a dense triplet set in polynomial time. In: CPM09. LNCS, vol. 5577, pp. 275–288 (2009)

  27.

    van Iersel, L., Kelk, S.: Constructing the simplest possible phylogenetic network from triplets. Algorithmica 60, 207–235 (2011)

  28.

    van Iersel, L.J.J., Kelk, S.M.: When two trees go to war. J. Theor. Biol. 269(1), 245–255 (2011)

  29.

    van Iersel, L.J.J., Keijsper, J.C.M., Kelk, S.M., Stougie, L., Hagen, F., Boekhout, T.: Constructing level-2 phylogenetic networks from triplets. IEEE/ACM Trans. Comput. Biol. Bioinform. 6(4), 667–681 (2009)

  30.

    van Iersel, L.J.J., Kelk, S.M., Mnich, M.: Uniqueness, intractability and exact algorithms: reflections on level-k phylogenetic networks. J. Bioinform. Comput. Biol. 7(2), 597–623 (2009)

  31.

    van Iersel, L.J.J., Kelk, S.M., Rupp, R., Huson, D.H.: Phylogenetic networks do not need to be complex: Using fewer reticulations to represent conflicting clusters. Bioinformatics 26, i124–i131 (2010). Special issue: Proceedings of Intelligent Systems for Molecular Biology 2010 (ISMB2010), 10th–13th September (2010)

  32.

    Whidden, C., Zeh, N.: A unifying view on approximation and FPT of agreement forests. In: Salzberg, S., Warnow, T. (eds.) Algorithms in Bioinformatics. Lecture Notes in Computer Science, vol. 5724, pp. 390–402. Springer, Berlin (2009)

  33.

    Whidden, C., Beiko, R.G., Zeh, N.: Fixed-parameter and approximation algorithms for maximum agreement forests. arXiv:1108.2664v1 [q-bio.PE]

  34.

    Wu, Y.: Close lower and upper bounds for the minimum reticulate network of multiple phylogenetic trees. Bioinformatics 26, i140–i148 (2010). Special issue: Proceedings of Intelligent Systems for Molecular Biology 2010 (ISMB2010), 10th–13th September (2010)

  35.

    Wu, Y., Gusfield, D.: A new recombination lower bound and the minimum perfect phylogenetic forest problem. J. Comb. Optim. 16(3), 229–247 (2008)

Author information

Corresponding author

Correspondence to Celine Scornavacca.

About this article

Cite this article

Kelk, S., Scornavacca, C. Constructing Minimal Phylogenetic Networks from Softwired Clusters is Fixed Parameter Tractable. Algorithmica 68, 886–915 (2014). https://doi.org/10.1007/s00453-012-9708-5

Keywords

  • Phylogenetics
  • Fixed parameter tractability
  • Directed acyclic graphs