Detecting Laterally Transferred Genes

  • Rajeev K. Azad
  • Jeffrey G. Lawrence
Part of the Methods in Molecular Biology book series (MIMB, volume 855)


Methods for identifying alien genes in genomes fall into two general classes. Phylogenetic methods examine the distribution of a gene’s homologues among genomes to find those with relationships not consistent with vertical inheritance. These approaches include identifying orphan genes, which lack homologues in closely related genomes, and genes with unduly high levels of similarity to genes in otherwise unrelated genomes. Rigorous statistical tests are available to place confidence intervals on predicted alien genes. Parametric methods examine the compositional properties of genes within a genome to find those with atypical properties, likely indicating the directional mutational pressures of a donor genome. These methods may compare the properties of genes to genomic averages, properties of genes to each other, or properties of large, multigene regions of the chromosome. Here, we discuss the strengths and weaknesses of each approach.
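
As an illustration of the parametric idea of comparing each gene to a genomic average, the following is a minimal sketch that scores genes by the Jensen–Shannon divergence between their dinucleotide frequencies and the pooled frequencies of all genes. It is not the chapter's own implementation: the function names (`kmer_counts`, `js_divergence`, `rank_atypical`), the choice of dinucleotides, the equal weighting of the two distributions, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter
from math import log2

ALPHABET = set("ACGT")

def kmer_counts(seq, k=2):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    return Counter(
        seq[i:i + k] for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= ALPHABET
    )

def normalize(counts):
    """Turn raw counts into relative frequencies (empty dict if no counts)."""
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(v * log2(v) for v in p.values() if v > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, with equal weights (an assumption)."""
    keys = set(p) | set(q)
    m = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in keys}
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))

def rank_atypical(genes, k=2):
    """Rank genes (name -> sequence) by divergence from the pooled background."""
    per_gene = {name: kmer_counts(seq, k) for name, seq in genes.items()}
    pooled = Counter()
    for counts in per_gene.values():
        pooled.update(counts)  # pool counts rather than concatenating sequences
    background = normalize(pooled)
    scores = {
        name: js_divergence(normalize(counts), background)
        for name, counts in per_gene.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data (hypothetical): two AT-rich "native" genes and one GC-rich gene.
genes = {
    "a": "ATATATATATAT" * 5,
    "b": "ATATTATAATAT" * 5,
    "c": "GCGCGGCCGCGC" * 5,
}
ranked = rank_atypical(genes)
for name, score in ranked:
    print(f"{name}\t{score:.3f}")
```

In this toy set, the GC-rich gene ranks as most atypical. A real analysis would still need a principled threshold and would have to account for gene length and for native genes with unusual composition, so the ranking alone should not be read as evidence of transfer.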

Key words

Phylogeny, Codon usage bias, Dinucleotide frequencies, HMM, Jensen–Shannon entropic divergence



This work was supported by NIH grant GM078092.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Department of Biological Sciences, University of Pittsburgh, Pittsburgh, USA
  2. Departments of Biological Sciences and Mathematics, University of North Texas, Denton, USA
