Removing Noise from Gene Trees

  • Andrea Doroftei
  • Nadia El-Mabrouk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6833)


Reconciliation is the method commonly used to infer the evolutionary scenario of a gene family. It consists of “embedding” an inferred gene tree into a known species tree, which reveals the evolution of the gene family through duplications and losses. The main criticism of reconciliation is that the inferred scenario depends strongly on the gene tree considered: a few misplaced leaves may lead to a completely different history, with significantly more duplications and losses. Because different phylogenetic methods, run with different parameters, may produce different gene trees, criteria are needed for choosing the appropriate tree for reconciliation among them. In this paper, following the conclusion of a previous paper, we flag certain duplication vertices of a gene tree, the “non-apparent duplication” (NAD) vertices, as resulting from the misplacement of leaves, and consider the optimization problem of removing the minimum number of leaves so as to obtain a tree without any NAD vertex. We develop a polynomial-time algorithm that is exact for two special classes of gene trees and performs well on simulated data sets in the general case.
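
The abstract does not spell out how NAD vertices are identified, so the following Python sketch rests on a definition assumed from the reconciliation literature: under the LCA mapping, a gene-tree vertex is a duplication if it maps to the same species-tree vertex as one of its children, and such a duplication is non-apparent when the species sets below its two children are disjoint. The tree encoding (nested tuples, with each gene leaf labelled by its species) and all function names are hypothetical. The exhaustive search in `min_leaf_removal` only illustrates the optimization problem on tiny inputs; it takes exponential time and is not the polynomial-time algorithm of the paper.

```python
from itertools import combinations

def leaf_set(t):
    """Species labelling the leaves below t (a leaf is a species name)."""
    if isinstance(t, str):
        return frozenset([t])
    return leaf_set(t[0]) | leaf_set(t[1])

def clusters(s):
    """Leaf set of every vertex of the species tree s."""
    if isinstance(s, str):
        return [frozenset([s])]
    return [leaf_set(s)] + clusters(s[0]) + clusters(s[1])

def lca_map(species, s_clusters):
    """LCA mapping: the smallest species-tree cluster containing `species`."""
    return min((c for c in s_clusters if species <= c), key=len)

def vertex_kinds(g, s_clusters):
    """Classify the internal vertices of gene tree g, in preorder."""
    if isinstance(g, str):
        return []
    left, right = leaf_set(g[0]), leaf_set(g[1])
    m = lca_map(left | right, s_clusters)
    if m in (lca_map(left, s_clusters), lca_map(right, s_clusters)):
        # Duplication; assumed to be non-apparent when the children
        # share no species.
        kind = "apparent duplication" if left & right else "NAD"
    else:
        kind = "speciation"
    return ([kind] + vertex_kinds(g[0], s_clusters)
            + vertex_kinds(g[1], s_clusters))

def prune(g, removed, start=0):
    """Delete leaves by left-to-right index and suppress unary vertices.

    Returns (pruned tree or None, number of leaves of g).
    """
    if isinstance(g, str):
        return (None if start in removed else g), 1
    l, nl = prune(g[0], removed, start)
    r, nr = prune(g[1], removed, start + nl)
    if l is None:
        return r, nl + nr
    if r is None:
        return l, nl + nr
    return (l, r), nl + nr

def min_leaf_removal(g, s):
    """Brute force: a smallest set of leaf indices whose removal leaves
    no NAD vertex. Exponential time; for illustration only."""
    s_clusters = clusters(s)
    n = prune(g, frozenset())[1]
    for k in range(n):  # a single remaining leaf has no internal vertex
        for removed in combinations(range(n), k):
            pruned, _ = prune(g, set(removed))
            if "NAD" not in vertex_kinds(pruned, s_clusters):
                return k, removed

S = (("a", "b"), "c")   # species tree
G = (("a", "c"), "b")   # gene tree with leaves in conflict with S

print(vertex_kinds(G, clusters(S)))
print(min_leaf_removal(G, S))
```

In this toy instance the root of `G` is flagged as NAD because its children cover the disjoint species sets {a, c} and {b}, yet the root maps to the same species-tree vertex as its left child. Deleting a single leaf is enough to obtain a tree with no NAD vertex.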


Species Tree; Gene Tree; MAST Problem; Internal Vertex; Weighted Tree





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Andrea Doroftei (1)
  • Nadia El-Mabrouk (2)
  1. DIRO, Université de Montréal, Canada
  2. DIRO, Canada
