Abstract
A common problem in phylogenetics is to infer a species phylogeny from gene trees. We consider different variants of this problem. The first variant, called Unrestricted Minimal Episodes Inference, aims at inferring a species tree based on a model of speciation and duplication where duplications are clustered in duplication episodes. The goal is to minimize the number of such episodes. The second variant, Parental Hybridization, aims at inferring a species network based on a model of speciation and reticulation. The goal is to minimize the number of reticulation events. It is a variant of the well-studied Hybridization Number problem with a more generous view on which gene trees are consistent with a given species network. We show that these seemingly different problems are in fact closely related and can, surprisingly, both be solved in polynomial time, using a structure we call “beaded trees”. However, we also show that methods based on these problems have to be used with care because the optimal species phylogenies always have some restricted form. We discuss several ways to overcome this problem.
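To make the Parental Hybridization objective concrete, the following minimal Python sketch counts reticulation events in a species network. It assumes the convention common in the Hybridization Number literature, in which a node with in-degree d ≥ 2 contributes d − 1 events; the abstract itself does not spell out this definition. The function name `reticulation_number`, the arc-list representation, and the toy networks are hypothetical, chosen only for illustration, and the sketch does not implement the paper's algorithm.

```python
from collections import Counter


def reticulation_number(arcs):
    """Count reticulation events in a directed network.

    `arcs` is a list of (parent, child) pairs; parallel arcs may be
    listed more than once. Assumed convention: a node with in-degree
    d >= 2 contributes d - 1 reticulation events.
    """
    indegree = Counter(child for _, child in arcs)
    return sum(d - 1 for d in indegree.values() if d >= 2)


# Hypothetical toy network on leaves a, b, c with one hybrid node "h".
network = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]

# Hypothetical tree on the same leaves: no node has two parents.
tree = [("root", "u"), ("root", "c"), ("u", "a"), ("u", "b")]

print(reticulation_number(network))
print(reticulation_number(tree))
```

Under this convention a tree has zero reticulation events, so the quantity measures how far a network departs from being a tree. This is the value the Parental Hybridization problem seeks to minimize.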
Research funded in part by the Netherlands Organization for Scientific Research (NWO), including Vidi grant 639.072.602, the 4TU Applied Mathematics Institute, the Natural Sciences and Engineering Research Council of Canada and the Canada Research Chairs program.
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
van Iersel, L., Janssen, R., Jones, M., Murakami, Y., Zeh, N. (2018). Polynomial-Time Algorithms for Phylogenetic Inference Problems. In: Jansson, J., Martín-Vide, C., Vega-Rodríguez, M. (eds) Algorithms for Computational Biology. AlCoB 2018. Lecture Notes in Computer Science, vol 10849. Springer, Cham. https://doi.org/10.1007/978-3-319-91938-6_4
DOI: https://doi.org/10.1007/978-3-319-91938-6_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91937-9
Online ISBN: 978-3-319-91938-6
eBook Packages: Computer Science (R0)