Abstract
Gene tree reconciliation problems seek the minimum number of evolutionary events needed to reconcile the evolution of a gene with a species tree. Here we focus on the deep coalescence (DC) problem: given an unrooted gene tree and a rooted species tree, find a rooting of the gene tree that minimizes the number of DC events, or DC cost, when reconciling the gene tree with the species tree. We describe an O(n) time and space algorithm for the DC problem, where n is the size of the input trees, which improves on the time complexity of the best-known solution by a factor of n. Moreover, we provide an O(n) time and space algorithm that computes the DC cost of every rooting of the given gene tree. We also describe intriguing properties of the DC cost, which can be used to identify credible rootings in gene trees. Finally, we demonstrate the performance of our new algorithms in an empirical study using data from public databases.
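To make the problem statement concrete, the following is a minimal brute-force sketch and not the linear-time algorithm of the paper. It assumes the common "extra lineages" reading of the DC cost (for every species tree edge, the number of gene lineages passing through it minus one), which may differ from the chapter's exact definition by an additive constant. It also assumes that every species occurs in the gene tree. The tree encodings (nested tuples for rooted trees, an edge list for the unrooted gene tree) and all function names are hypothetical choices made for illustration.

```python
from collections import defaultdict

def species_index(tree):
    """Parent and depth maps for a rooted species tree given as nested tuples."""
    parent, depth = {}, {}
    def walk(node, par, d):
        parent[node], depth[node] = par, d
        if isinstance(node, tuple):
            for child in node:
                walk(child, node, d + 1)
    walk(tree, None, 0)
    return parent, depth

def lca(a, b, parent, depth):
    """Naive least common ancestor: repeatedly lift the deeper node."""
    while a != b:
        if depth[a] >= depth[b]:
            a = parent[a]
        else:
            b = parent[b]
    return a

def dc_cost(gene, species):
    """Extra-lineages cost of a rooted gene tree (leaves are species names)."""
    parent, depth = species_index(species)
    lineages = defaultdict(int)  # species node u -> lineages on edge (u, parent(u))
    def place(node):
        if not isinstance(node, tuple):
            return node  # a gene leaf maps to its species
        images = [place(child) for child in node]
        here = images[0]
        for img in images[1:]:
            here = lca(here, img, parent, depth)
        # Each child lineage crosses every species edge between its image and `here`.
        for img in images:
            u = img
            while u != here:
                lineages[u] += 1
                u = parent[u]
        return here
    place(gene)
    return sum(count - 1 for count in lineages.values())

def rootings(edges, label):
    """Yield (edge, rooted tree) for every way of rooting the unrooted tree on an edge."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    def hang(node, came_from):
        if node in label:
            return label[node]
        return tuple(hang(n, node) for n in adj[node] if n != came_from)
    for u, v in edges:
        yield (u, v), (hang(u, v), hang(v, u))

def dc_costs_of_rootings(edges, label, species):
    """DC cost of every rooting, by brute force (not linear time)."""
    return {edge: dc_cost(tree, species) for edge, tree in rootings(edges, label)}

# Hypothetical example: a four-leaf unrooted gene tree and a caterpillar species tree.
species = ((("a", "b"), "c"), "d")
edges = [("g1", "x"), ("g2", "x"), ("x", "y"), ("g3", "y"), ("g4", "y")]
label = {"g1": "a", "g2": "b", "g3": "c", "g4": "d"}
costs = dc_costs_of_rootings(edges, label, species)
best_edge = min(costs, key=costs.get)
```

In this example the rooting on the edge leading to the gene from species d reproduces the species tree and has cost 0, while rooting on the edges leading to the genes from a or b costs 3. The sketch recomputes the cost from scratch for each rooting, so it is useful only as a reference against which a faster implementation could be checked.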
This work was supported by MNiSW (#N N301 065236) to PG, by the National Science Foundation (#0830012 and #106029) to OE, and by NCN (#2011/01/B/ST6/02777) and the NIMBioS Working Group “Gene Tree Reconciliation” to PG and OE.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Górecki, P., Eulenstein, O. (2012). Deep Coalescence Reconciliation with Unrooted Gene Trees: Linear Time Algorithms. In: Gudmundsson, J., Mestre, J., Viglas, T. (eds) Computing and Combinatorics. COCOON 2012. Lecture Notes in Computer Science, vol 7434. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32241-9_45
DOI: https://doi.org/10.1007/978-3-642-32241-9_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32240-2
Online ISBN: 978-3-642-32241-9
eBook Packages: Computer Science (R0)