Mean Values of Gene Duplication and Loss Cost Functions

  • Paweł Górecki
  • Jarosław Paszek
  • Agnieszka Mykowiecka
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9683)


Reconciliation-based cost functions play a crucial role in comparing gene family trees with their species tree. To provide a better understanding of tree reconciliation, we derive mean formulas for the gene duplication, gene loss, and gene duplication-loss cost functions for a fixed species tree under the uniform model of gene trees. We then analyse the time complexity and study the mathematical properties of these formulas. Finally, we report several computational experiments on empirical datasets for the duplication, duplication-loss, and deep coalescence means under the uniform model.
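To make the notion of a mean duplication cost under the uniform model concrete, the following is a minimal brute-force sketch, not the closed-form formulas derived in the paper. It assumes the standard LCA-mapping definition of the duplication cost (an internal gene-tree node counts as a duplication when it maps to the same species-tree node as one of its children) and bijectively labelled gene trees, each taken with equal probability. The nested-tuple tree encoding and all function names are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def all_gene_trees(labels):
    """Yield every rooted binary tree whose leaves are exactly `labels`."""
    labels = tuple(labels)
    if len(labels) == 1:
        yield labels[0]
        return
    first, rest = labels[0], labels[1:]
    # Keep `first` in the left subtree so each unordered tree appears once.
    for k in range(len(rest)):
        for chosen in combinations(rest, k):
            left = (first,) + chosen
            right = tuple(x for x in rest if x not in chosen)
            for l in all_gene_trees(left):
                for r in all_gene_trees(right):
                    yield (l, r)


def clusters(tree):
    """Return the leaf set (cluster) of every node; the root's comes last."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    left, right = clusters(tree[0]), clusters(tree[1])
    return left + right + [left[-1] | right[-1]]


def duplication_cost(gene, species):
    """Count duplication nodes of `gene` under the LCA mapping into `species`."""
    species_clusters = clusters(species)

    def lca(leaves):
        # The smallest species cluster containing `leaves` is their LCA.
        return min((c for c in species_clusters if leaves <= c), key=len)

    def visit(node):
        # Returns (leaf set, image under the LCA mapping, duplications below).
        if isinstance(node, str):
            leaf = frozenset([node])
            return leaf, leaf, 0
        l_leaves, l_map, l_cost = visit(node[0])
        r_leaves, r_map, r_cost = visit(node[1])
        leaves = l_leaves | r_leaves
        mapped = lca(leaves)
        is_dup = mapped == l_map or mapped == r_map
        return leaves, mapped, l_cost + r_cost + int(is_dup)

    return visit(gene)[2]


def mean_duplication_cost(species):
    """Average the duplication cost over all gene trees on the species set."""
    labels = sorted(clusters(species)[-1])
    costs = [duplication_cost(g, species) for g in all_gene_trees(labels)]
    return Fraction(sum(costs), len(costs))


species = (("a", "b"), "c")
print(mean_duplication_cost(species))  # 2/3
```

For three species there are three gene trees: the one matching the species tree has cost 0 and the other two have cost 1, which gives the mean 2/3. The number of gene trees grows as (2n-3)!!, so this enumeration is only practical for very small n.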


Keywords: Tree reconciliation; Duplication-loss model; Deep coalescence; Speciation; Gene duplication; Gene loss; Bijectively labelled tree; Uniform model of trees; Mean value



We would like to thank the four reviewers for their detailed comments that allowed us to improve our paper. JP was supported by the DSM funding for young researchers of the Faculty of Mathematics, Informatics and Mechanics of the University of Warsaw.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Paweł Górecki (1)
  • Jarosław Paszek (1)
  • Agnieszka Mykowiecka (1)
  1. Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
