Polynomial-Time Algorithms for Phylogenetic Inference Problems

  • Leo van Iersel
  • Remie Janssen
  • Mark Jones
  • Yukihiro Murakami
  • Norbert Zeh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10849)


A common problem in phylogenetics is to infer a species phylogeny from gene trees. We consider two variants of this problem. The first variant, called Unrestricted Minimal Episodes Inference, aims at inferring a species tree based on a model of speciation and duplication in which duplications are clustered in duplication episodes. The goal is to minimize the number of such episodes. The second variant, Parental Hybridization, aims at inferring a species network based on a model of speciation and reticulation. The goal is to minimize the number of reticulation events. It is a variant of the well-studied Hybridization Number problem with a more generous view of which gene trees are consistent with a given species network. We show that these seemingly different problems are in fact closely related and can, surprisingly, both be solved in polynomial time, using a structure we call “beaded trees”. However, we also show that methods based on these problems must be used with care because the optimal species phylogenies always have a restricted form. We discuss several ways to overcome this problem.


Keywords: Phylogenetic inference problems; Polynomial-time algorithms



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Leo van Iersel (1, corresponding author)
  • Remie Janssen (1)
  • Mark Jones (1)
  • Yukihiro Murakami (1)
  • Norbert Zeh (2)

  1. Delft Institute of Applied Mathematics, Delft University of Technology, Delft, The Netherlands
  2. Faculty of Computer Science, Dalhousie University, Halifax, Canada
