Horizontal Gene Transfer pp 241-256

Part of the Methods in Molecular Biology book series (MIMB, volume 532)


Untangling Hybrid Phylogenetic Signals: Horizontal Gene Transfer and Artifacts of Phylogenetic Reconstruction

  • Robert G. Beiko
  • Mark A. Ragan


Phylogenomic methods can be used to investigate the tangled evolutionary relationships among genomes. Building ‘all the trees of all the genes’ can potentially identify common pathways of horizontal gene transfer (HGT) among taxa at varying levels of phylogenetic depth. Phylogenetic affinities can be aggregated and merged with the information about genetic linkage and biochemical function to examine hypotheses of adaptive evolution via HGT. Additionally, the use of many genetic data sets increases the power of statistical tests for phylogenetic artifacts. However, large-scale phylogenetic analyses pose several challenges, including the necessary abandonment of manual validation techniques, the need to translate inferred phylogenetic discordance into inferred HGT events, and the challenges involved in aggregating results from search-based inference methods. In this chapter we describe a tree search procedure to recover the most parsimonious pathways of HGT, and examine some of the assumptions that are made by this method.
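As a rough illustration of how phylogenetic discordance between a gene tree and a reference tree can be detected through bipartitions, the following minimal Python sketch compares the non-trivial splits of two trees. It is not the chapter's tree search procedure: it only flags conflicting bipartitions and does not infer subtree prune-and-regraft operations or HGT pathways. The nested-tuple tree encoding, the function names, and the assumption that both trees cover the same set of taxa are all hypothetical choices made for this example.

```python
from itertools import chain


def leaves(tree):
    """Return the set of leaf labels under a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def bipartitions(tree):
    """Return the non-trivial bipartitions (splits) of a tree, treated as unrooted.

    Each split is stored as the side that does NOT contain the smallest taxon
    label, so the same split has the same representation however the tree is rooted.
    """
    all_taxa = leaves(tree)
    anchor = min(all_taxa)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_taxa - clade if anchor in clade else clade
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_taxa) - 1:
            splits.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return splits


def discordant_bipartitions(gene_tree, reference_tree):
    """Splits found in the gene tree but absent from the reference tree.

    Assumes (for this sketch) that both trees contain exactly the same taxa.
    """
    return bipartitions(gene_tree) - bipartitions(reference_tree)


# Hypothetical five-taxon example: taxa B and D have swapped places in the gene tree.
reference = ((("A", "B"), "C"), ("D", "E"))
gene = ((("A", "D"), "C"), ("B", "E"))
conflicts = discordant_bipartitions(gene, reference)
```

In a large-scale analysis, a comparison of this kind would typically be restricted to well-supported bipartitions, since weakly supported conflicts may reflect reconstruction artifacts rather than HGT.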


Phylogenetics, phylogenomics, horizontal gene transfer, subtree prune-and-regraft, bipartitions, model violation



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Robert G. Beiko (1)
  • Mark A. Ragan (2)

  1. Department of Computer Science, Dalhousie University, Halifax, Canada
  2. Institute for Molecular Bioscience and Australian Research Council Centre of Excellence in Bioinformatics, St. Lucia, Australia
