New Algorithms for the Genomic Duplication Problem

  • Jarosław Paszek
  • Paweł Górecki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10562)


One of the fundamental issues in evolutionary molecular biology is to discover genomic duplication events and their correspondence to the species tree. Such events can be reconstructed by clustering single gene duplications that are inferred by reconciling a set of gene trees with a species tree. Here we propose the first solution to the genomic duplication problem in which every reconciliation with the minimal number of single gene duplications is allowed and clustering follows the minimum episodes method, under the assumption that the input gene trees are unrooted. We also present an evaluation study of the proposed algorithms on empirical datasets.
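To make the notion of a single gene duplication inferred by reconciliation concrete, the following minimal sketch uses the classical least-common-ancestor (LCA) mapping on a rooted binary gene tree. It is an illustration only and not the algorithm proposed in this paper, which handles unrooted gene trees, admits every reconciliation with the minimal number of duplications, and then clusters the duplications into episodes. The nested-tuple tree encoding, the convention that each gene-tree leaf is labelled with the name of its species, and the function names `species_clusters`, `lca` and `duplications` are assumptions made for this example.

```python
def species_clusters(tree, out):
    """Collect the leaf cluster (a frozenset) of every node of a species tree.

    Assumed encoding: a leaf is a species name (str),
    an internal node is a pair (left, right).
    """
    if isinstance(tree, str):
        cluster = frozenset([tree])
    else:
        cluster = species_clusters(tree[0], out) | species_clusters(tree[1], out)
    out.add(cluster)
    return cluster


def lca(clusters, a, b):
    """Least common ancestor of species nodes a and b, both given as clusters:
    the smallest species-tree cluster that contains both."""
    union = a | b
    return min((c for c in clusters if union <= c), key=len)


def duplications(gene_tree, clusters):
    """Return the species clusters to which duplication nodes are mapped.

    Under the LCA mapping, an internal gene-tree node is a duplication
    when it maps to the same species node as at least one of its children.
    """
    dups = []

    def visit(g):
        if isinstance(g, str):  # gene leaf labelled by its species
            return frozenset([g])
        left, right = visit(g[0]), visit(g[1])
        mapped = lca(clusters, left, right)
        if mapped == left or mapped == right:
            dups.append(mapped)
        return mapped

    visit(gene_tree)
    return dups


# Species tree ((a,b),c) and a gene tree that conflicts with it.
clusters = set()
species_clusters((("a", "b"), "c"), clusters)
gene_tree = (("a", "c"), ("b", "c"))
print(duplications(gene_tree, clusters))
```

In this toy case both children of the gene-tree root map to the root of the species tree, so the root is the only node reported as a duplication. A gene tree congruent with the species tree, such as `(("a", "b"), "c")`, yields no duplications.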


Genomic duplication · Duplication episode · Minimum episodes problem · Reconciliation · Unrooted gene tree · Species tree



We would like to thank the reviewers for their detailed comments, which allowed us to improve our paper. Support was provided by NCN grants #2015/19/N/ST6/01193 and #2015/19/B/ST6/00726.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
