Inferring Evolutionary Scenarios in the Duplication, Loss and Horizontal Gene Transfer Model

Part of the Lecture Notes in Computer Science book series (LNCS, volume 7230)


An H-tree is a formal model of an evolutionary scenario. It can represent any process that combines gene duplication and loss, horizontal gene transfer (HGT), and speciation events. The H-tree model, introduced in [26], extends the duplication-loss model (DL-model). Like its predecessor, it has a number of interesting mathematical and biological properties; it is, however, computationally more complex than the DL-model. In this paper, we primarily address the problem of inferring H-trees that are compatible with a given gene tree and a given phylogeny of species with HGTs. These results provide a mathematical and computational foundation for the more general and practical problem of inferring HGTs from given gene and species trees with HGTs. We also demonstrate how our model can be used to support HGT hypotheses based on empirical data sets.


Phylogenetic tree, Duplication-loss model, Rewrite system, Horizontal gene transfer




  1. Akerborg, O., Sennblad, B., Arvestad, L., Lagergren, J.: Simultaneous bayesian gene tree reconstruction and reconciliation analysis. PNAS 106(14), 5714–5719 (2009)
  2. Arca, B., Lombardo, F., Valenzuela, J.G., Francischetti, I.M.B., Marinotti, O., Coluzzi, M., Ribeiro, J.A.C.: An updated catalogue of salivary gland transcripts in the adult female mosquito, anopheles gambiae. J. Exp. Biol. 208, 3971–3986 (2005)
  3. Arvestad, L., Berglund, A.-C., Lagergren, J., Sennblad, B.: Bayesian gene/species tree reconciliation and orthology analysis using mcmc. Bioinformatics 19(Suppl. 1), i7–i15 (2003)
  4. Arvestad, L., Berglund, A.-C., Lagergren, J., Sennblad, B.: Gene tree reconstruction and orthology analysis based on an integrated model for duplications and sequence evolution. In: RECOMB, pp. 326–335 (2004)
  5. Arvestad, L., Lagergren, J., Sennblad, B.: The gene evolution model and computing its associated probabilities. J. ACM 56(2) (2009)
  6. Bansal, M.S., Eulenstein, O.: The multiple gene duplication problem revisited. Bioinformatics 24(13), i132–i138 (2008)
  7. Bansal, M.S., Eulenstein, O., Wehe, A.: The gene-duplication problem: near-linear time algorithms for nni-based local searches. IEEE/ACM Trans. Comput. Biol. Bioinform. 6(2), 221–231 (2009)
  8. Bansal, M.S., Gogarten, J.P., Shamir, R.: Detecting Highways of Horizontal Gene Transfer. In: Tannier, E. (ed.) RECOMB-CG 2010. LNCS, vol. 6398, pp. 109–120. Springer, Heidelberg (2010)
  9. Bansal, M.S., Burleigh, J.G., Eulenstein, O., Wehe, A.: Heuristics for the Gene-Duplication Problem: A Θ(n) Speed-Up for the Local Search. In: Speed, T., Huang, H. (eds.) RECOMB 2007. LNCS (LNBI), vol. 4453, pp. 238–252. Springer, Heidelberg (2007)
  10. Bansal, M.S., Eulenstein, O.: An Ω(n²/log n) Speed-Up of TBR Heuristics for the Gene-Duplication Problem. In: Giancarlo, R., Hannenhalli, S. (eds.) WABI 2007. LNCS (LNBI), vol. 4645, pp. 124–135. Springer, Heidelberg (2007)
  11. Bansal, M.S., Eulenstein, O.: The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI Based Local Searches. In: Măndoiu, I., Wang, S.-L., Zelikovsky, A. (eds.) ISBRA 2008. LNCS (LNBI), vol. 4983, pp. 14–25. Springer, Heidelberg (2008)
  12. Berglund-Sonnhammer, A.-C., Steffansson, P., Betts, M.J., Liberles, D.A.: Optimal gene trees from sequences and species trees using a soft interpretation of parsimony. J. Mol. Evol. 63(2), 240–250 (2006)
  13. Boc, A., Makarenkov, V.: New Efficient Algorithm for Detection of Horizontal Gene Transfer Events. In: Benson, G., Page, R.D.M. (eds.) WABI 2003. LNCS (LNBI), vol. 2812, pp. 190–201. Springer, Heidelberg (2003)
  14. Bonizzoni, P., Della Vedova, G., Dondi, R.: Reconciling a gene tree to a species tree under the duplication cost model. Theor. Comput. Sci. 347(1-2), 36–53 (2005)
  15. Bordewich, M., Semple, C.: On the computational complexity of the rooted subtree prune and regraft distance. Ann. Comb. 8, 409–423 (2004)
  16. Burleigh, J.G., Bansal, M.S., Wehe, A., Eulenstein, O.: Locating large-scale gene duplication events through reconciled trees: implications for identifying ancient polyploidy events in plants. J. Comput. Biol. 16(8), 1071–1083 (2009)
  17. Charleston, M.: Jungles: a new solution to the host/parasite phylogeny reconciliation problem. Math. Biosci. 149(2), 191–223 (1998)
  18. Chauve, C., El-Mabrouk, N.: New Perspectives on Gene Family Evolution: Losses in Reconciliation and a Link with Supertrees. In: Batzoglou, S. (ed.) RECOMB 2009. LNCS, vol. 5541, pp. 46–58. Springer, Heidelberg (2009)
  19. Doyon, J.-P., Scornavacca, C., Gorbunov, K.Y., Szöllősi, G.J., Ranwez, V., Berry, V.: An Efficient Algorithm for Gene/Species Trees Parsimonious Reconciliation with Losses, Duplications and Transfers. In: Tannier, E. (ed.) RECOMB-CG 2010. LNCS, vol. 6398, pp. 93–108. Springer, Heidelberg (2010)
  20. Eulenstein, O., Mirkin, B., Vingron, M.: Duplication-based measures of difference between gene and species trees. J. Comput. Biol. 5(1), 135–148 (1998)
  21. Fellows, M.R., Hallett, M.T., Stege, U.: On the Multiple Gene Duplication Problem. In: Chwa, K.-Y., Ibarra, O.H. (eds.) ISAAC 1998. LNCS, vol. 1533, pp. 347–356. Springer, Heidelberg (1998)
  22. Foley, D.H., Bryan, J.H., Yeates, D., Saul, A.: Evolution and systematics of anopheles: Insights from a molecular phylogeny of australasian mosquitoes. Mol. Phylogenet. Evol. 9(2), 262–275 (1998)
  23. Gogarten, M.B., Gogarten, J.P., Olendzenski, L. (eds.): Horizontal Gene Transfer - Genomes in Flux. Methods in Molecular Biology, vol. 532. Springer, Heidelberg (2009)
  24. Goodman, M., Czelusniak, J., Moore, G.W., Romero-Herrera, A.E., Matsuda, G.: Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed from globin sequences. Syst. Zool. 28(2), 132–163 (1979)
  25. Górecki, P.: Reconciliation problems for duplication, loss and horizontal gene transfer. In: RECOMB, pp. 316–325 (2004)
  26. Górecki, P.: H-trees: a model of evolutionary scenarios with horizontal gene transfer. Fund. Inform. 103, 105–128 (2010)
  27. Górecki, P., Burleigh, J.G., Eulenstein, O.: Maximum likelihood models and algorithms for gene tree evolution with duplications and losses. BMC Bioinformatics 12(suppl. 1), S15 (2011)
  28. Górecki, P., Eulenstein, O.: A Linear Time Algorithm for Error-Corrected Reconciliation of Unrooted Gene Trees. In: Chen, J., Wang, J., Zelikovsky, A. (eds.) ISBRA 2011. LNCS, vol. 6674, pp. 148–159. Springer, Heidelberg (2011)
  29. Górecki, P., Tiuryn, J.: Dls-trees: A model of evolutionary scenarios. Theor. Comput. Sci. 359(1-3), 378–399 (2006)
  30. Górecki, P., Tiuryn, J.: Inferring phylogeny from whole genomes. Bioinformatics 23(2), e116–e222 (2007)
  31. Górecki, P., Tiuryn, J.: Urec: a system for unrooted reconciliation. Bioinformatics 23(4), 511–512 (2007)
  32. Guigó, R., Muchnik, I.B., Smith, T.F.: Reconstruction of ancient molecular phylogeny. Mol. Phylogenet. Evol. 6(2), 189–213 (1996)
  33. Hallett, M.T., Lagergren, J.: New algorithms for the duplication-loss model. In: RECOMB, pp. 138–146 (2000)
  34. Hallett, M.T., Lagergren, J.: Efficient algorithms for lateral gene transfer problems. In: RECOMB, pp. 149–156 (2001)
  35. Hallett, M.T., Lagergren, J., Tofigh, A.: Simultaneous identification of duplications and lateral transfers. In: RECOMB, pp. 347–356 (2004)
  36. Hill, T., Nordstrom, K., Thollesson, M., Safstrom, T., Vernersson, A., Fredriksson, R., Schioth, H.: SPRIT: Identifying horizontal gene transfer in rooted phylogenetic trees. BMC Evol. Biol. 10(1) (February 2010)
  37. Korochkina, S., Barreau, C., Pradel, G., Jeffery, E., Li, J., Natarajan, R., Shabanowitz, J., Hunt, D., Frevert, U., Vernick, K.D.: A mosquito-specific protein family includes candidate receptors for malaria sporozoite invasion of salivary glands. Cell Microbiol. 8, 163–175 (2006)
  38. Libeskind-Hadas, R., Charleston, M.A.: On the computational complexity of the reticulate cophylogeny reconstruction problem. J. Comput. Biol. 16(1), 105–117 (2009)
  39. Ma, B., Li, M., Zhang, L.: From gene trees to species trees. SIAM J. Comput. 30(3), 729–752 (2000)
  40. Mirkin, B., Muchnik, I.B., Smith, T.F.: A biologically consistent model for comparing molecular phylogenies. J. Comput. Biol. 2(4), 493–507 (1995)
  41. Nakhleh, L., Linder, C.R., Warnow, T., John, K.S.: Reconstructing reticulate evolution in species - theory and practice. In: Proc. of 8th Annual International Conference on Computational Molecular Biology, pp. 337–346 (2004)
  42. Ochiai, K., Yamanaka, T., Kimura, K., Sawada, O.: Inheritance of drug resistance (and its transfer) between Shigella strains and between Shigella and E. coli strains. Hihon Iji Shimpor, 1861 (1959) (in Japanese)
  43. Page, R.D.M.: Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas. Syst. Biol. 43(1), 58–77 (1994)
  44. Page, R.D.M.: GeneTree: comparing gene and species phylogenies using reconciled trees. Bioinformatics 14(9), 819–820 (1998)
  45. Than, C., Ruths, D., Innan, H., Nakhleh, L.: Confounding factors in hgt detection: Statistical error, coalescent effects, and multiple solutions. J. Comput. Biol. 14, 517–535 (2007)
  46. Tofigh, A., Hallett, M.T., Lagergren, J.: Simultaneous identification of duplications and lateral gene transfers. IEEE/ACM Trans. Comput. Biology Bioinform. 8(2), 517–535 (2011)
  47. Chang, W.-C., Eulenstein, O.: Reconciling Gene Trees with Apparent Polytomies. In: Chen, D.Z., Lee, D.T. (eds.) COCOON 2006. LNCS, vol. 4112, pp. 235–244. Springer, Heidelberg (2006)
  48. Woolfit, M., Iturbe-Ormaetxe, I., McGraw, E.A., O’Neill, S.L.: An Ancient Horizontal Gene Transfer between Mosquito and the Endosymbiotic Bacterium Wolbachia pipientis. Mol. Biol. Evol. 26(2), 367–374 (2009)
  49. Zhang, L., Ng, Y.K., Wu, T., Zheng, Y.: Network model and efficient method for detecting relative duplications or horizontal gene transfers. In: Mandoiu, I.I., Miyano, S., Przytycka, T.M., Rajasekaran, S. (eds.) ICCABS, pp. 214–219. IEEE Computer Society (2011)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warszawa, Poland
