Testing Phylogenetic Methods to Identify Horizontal Gene Transfer

  • Maria Poptsova
Part of the Methods in Molecular Biology book series (MIMB, volume 532)


This chapter describes a methodology for assessing the power of phylogenetic methods for detecting horizontal gene transfer (HGT). Detection power is defined in the framework of hypothesis testing. Rates of false positives and false negatives can be estimated by applying an HGT detection method to HGT-free orthologous sets and to the same sets with in silico simulated HGT events. The process has three steps: obtaining HGT-free orthologous sets, simulating HGT events in silico in those sets, and submitting both sets for evaluation by each of the tested methods.
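
The error-rate estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's own code: the function name `error_rates`, the significance level `alpha`, and the convention that a gene family is flagged as transferred when its p-value falls below `alpha` are all assumptions made for the example, and the p-values in the usage note are invented.

```python
from typing import Dict, Sequence


def error_rates(
    pvals_hgt_free: Sequence[float],
    pvals_simulated: Sequence[float],
    alpha: float = 0.05,
) -> Dict[str, float]:
    """Estimate error rates of an HGT detection test.

    pvals_hgt_free: one p-value per gene family in the HGT-free control set.
    pvals_simulated: one p-value per gene family with a simulated transfer.
    Assumption: a family is called "HGT" when its p-value is below alpha,
    i.e. when the test rejects the reference topology.
    """
    if not pvals_hgt_free or not pvals_simulated:
        raise ValueError("both p-value lists must be non-empty")

    # False positive: a transfer is reported in a set known to be HGT-free.
    false_pos = sum(p < alpha for p in pvals_hgt_free) / len(pvals_hgt_free)
    # False negative: a simulated transfer is not detected.
    false_neg = sum(p >= alpha for p in pvals_simulated) / len(pvals_simulated)

    return {
        "false_positive_rate": false_pos,
        "false_negative_rate": false_neg,
        "power": 1.0 - false_neg,
    }
```

For example, `error_rates([0.5, 0.01, 0.2, 0.9], [0.001, 0.03, 0.2, 0.04])` reports one false positive out of four control families and one missed transfer out of four simulated ones.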

Phylogenetic methods of HGT detection can be roughly divided into three types: likelihood-based tests of topologies (the Kishino-Hasegawa (KH), Shimodaira-Hasegawa (SH), and Approximately Unbiased (AU) tests), tree distance methods (the symmetric difference of Robinson and Foulds (RF) and Subtree Pruning and Regrafting (SPR) distances), and genome spectral approaches (bipartition and quartet decomposition analysis). The chapter discusses the restrictions inherent to phylogenetic methods of HGT detection in general, as well as the power and precision of each method, and provides comparative analyses of the different approaches. Examples of assessing the power of phylogenetic HGT detection methods are drawn from case studies of orthologous sets from gamma-proteobacteria (Poptsova and Gogarten, BMC Evol Biol 7, 45, 2007) and cyanobacteria (Zhaxybayeva et al., Genome Res 16, 1099–108, 2006).
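
As an illustration of the tree distance approach, the Robinson-Foulds symmetric difference can be computed by collecting the non-trivial bipartitions of each tree and counting those found in only one of them. The sketch below is a simplified example rather than the implementation used in the chapter: trees are assumed to be nested Python tuples of leaf names, treated as unrooted, and the names `leaf_set`, `bipartitions`, and `rf_distance` are hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed representation: a leaf is a string, an internal node is a tuple
# of subtrees, e.g. (("A", "B"), ("C", "D"), "E").
Tree = Union[str, Tuple["Tree", ...]]


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the non-trivial bipartitions (splits) of an unrooted tree.

    Each split is stored as the side containing a fixed reference leaf,
    so the same split found from either side is counted only once.
    """
    taxa = leaf_set(tree)
    reference = min(taxa)
    splits: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        rest = taxa - clade
        # Splits with a single leaf (or nothing) on one side are trivial.
        if len(clade) > 1 and len(rest) > 1:
            splits.add(clade if reference in clade else rest)
        return clade

    visit(tree)
    return splits


def rf_distance(tree_a: Tree, tree_b: Tree) -> int:
    """Symmetric difference: number of splits present in exactly one tree."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must have identical leaf sets")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))
```

With a reference tree `(("A", "B"), ("C", "D"), "E")` and a gene tree `(("A", "B"), ("C", "E"), "D")`, `rf_distance` counts one split unique to each tree. A distance of zero means the two topologies agree on every split, while a simulated transfer that regrafts a lineage elsewhere in the tree changes one or more splits.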


Key words: Phylogenetic methods of HGT detection; power of HGT detection methods; likelihood-based tests of topologies; tree distance methods; genome spectral methods


  1. Robinson, D. R., Foulds, L. R. (1981) Comparison of phylogenetic trees. Math Biosci 53, 131–47.
  2. Swofford, D. L., Olsen, G. J. (1990) Phylogeny Reconstruction in Molecular Systematics, Sinauer Associates, Sunderland, Massachusetts.
  3. Waterman, M. S., Smith, T. F. (1978) On the similarity of dendrograms. J Theor Biol 73, 789–800.
  4. Allen, B. L., Steel, M. (2001) Subtree transfer operations and their induced metrics on evolutionary trees. Ann Combinatorics 5, 1–15.
  5. Felsenstein, J. (2004) Inferring Phylogenies, Sinauer Associates, Sunderland, Massachusetts.
  6. Beiko, R. G., Harlow, T. J., Ragan, M. A. (2005) Highways of gene sharing in prokaryotes. Proc Natl Acad Sci U S A 102, 14332–7.
  7. Lento, G. M., Hickson, R. E., Chambers, G. K., Penny, D. (1995) Use of spectral analysis to test hypotheses on the origin of pinnipeds. Mol Biol Evol 12, 28–52.
  8. Zhaxybayeva, O., Lapierre, P., Gogarten, J. P. (2004) Genome mosaicism and organismal lineages. Trends Genet 20, 254–60.
  9. Zhaxybayeva, O., Gogarten, J. P., Charlebois, R. L., Doolittle, W. F., Papke, R. T. (2006) Phylogenetic analyses of cyanobacterial genomes: quantification of horizontal gene transfer events. Genome Res 16, 1099–108.
  10. Strimmer, K., von Haeseler, A. (1996) Quartet puzzling: a quartet maximum-likelihood method for reconstructing tree topologies. Mol Biol Evol 13, 964–69.
  11. Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–91.
  12. Beiko, R. G., Hamilton, N. (2006) Phylogenetic identification of lateral genetic transfer events. BMC Evol Biol 6, 15.
  13. Goldman, N., Anderson, J. P., Rodrigo, A. G. (2000) Likelihood-based tests of topologies in phylogenetics. Syst Biol 49, 652–70.
  14. Kishino, H., Hasegawa, M. (1989) Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J Mol Evol 29, 170–9.
  15. Shimodaira, H., Hasegawa, M. (1999) Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol 16, 1114–16.
  16. Shimodaira, H. (2002) An approximately unbiased test of phylogenetic tree selection. Syst Biol 51, 492–508.
  17. Woese, C. R., Fox, G. E. (1977) Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci U S A 74, 5088–90.
  18. Hilario, E., Gogarten, J. P. (1993) Horizontal transfer of ATPase genes – the tree of life becomes a net of life. Biosystems 31, 111–9.
  19. Gogarten, J. P. (1995) The early evolution of cellular life. Trends Ecol Evol 10, 147–51.
  20. Brown, J. R., Masuchi, Y., Robb, F. T., Doolittle, W. F. (1994) Evolutionary relationships of bacterial and archaeal glutamine synthetase genes. J Mol Evol 38, 566–76.
  21. Jain, R., Rivera, M. C., Lake, J. A. (1999) Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci U S A 96, 3801–6.
  22. Nesbo, C. L., Boucher, Y., Doolittle, W. F. (2001) Defining the core of nontransferable prokaryotic genes: the euryarchaeal core. J Mol Evol 53, 340–50.
  23. Wolf, Y. I., Rogozin, I. B., Grishin, N. V., Koonin, E. V. (2002) Genome trees and the tree of life. Trends Genet 18, 472–9.
  24. Margush, T., McMorris, F. R. (1981) Consensus n-trees. Bull Math Biol 43, 239–44.
  25. Adams, E. (1972) Consensus techniques and the comparison of taxonomic trees. Syst Zool 21, 390–97.
  26. Bininda-Emonds, O. R., Sanderson, M. J. (2001) Assessment of the accuracy of matrix representation with parsimony analysis supertree construction. Syst Biol 50, 565–79.
  27. Burleigh, J., Eulenstein, O., Fernandez-Baca, D., Sanderson, M. (2004) MRF supertrees. In Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life (Bininda-Emonds, O. R. P., ed.), Kluwer, Dordrecht, pp. 65–85.
  28. Semple, C., Steel, M. (2000) A supertree method for rooted trees. Discrete Appl Math 105, 147–58.
  29. Dagan, T., Martin, W. (2006) The tree of one percent. Genome Biol 7, 118.
  30. Ciccarelli, F. D., Doerks, T., von Mering, C., Creevey, C. J., Snel, B., Bork, P. (2006) Toward automatic reconstruction of a highly resolved tree of life. Science 311, 1283–7.
  31. Zhaxybayeva, O., Gogarten, J. P. (2007) Horizontal gene transfer, gene histories and the root of the tree of life. In Astrobiology and the Origins of Life (Pudritz, R. E., Higgs, P. G., Stone, J., eds.), Cambridge University Press, Cambridge.
  32. Cortez, D. Q., Lazcano, A., Becerra, A. (2005) Comparative analysis of methodologies for the detection of horizontally transferred genes: a reassessment of first-order Markov models. In Silico Biol 5, 581–92.
  33. Poptsova, M. S., Gogarten, J. P. (2007) The power of phylogenetic approaches to detect horizontally transferred genes. BMC Evol Biol 7, 45.
  34. Lerat, E., Daubin, V., Moran, N. A. (2003) From gene trees to organismal phylogeny in prokaryotes: the case of the gamma-proteobacteria. PLoS Biol 1, E19.
  35. Bapteste, E., Boucher, Y., Leigh, J., Doolittle, W. F. (2004) Phylogenetic reconstruction and lateral gene transfer. Trends Microbiol 12, 406–11.
  36. Beiko, R. G., Charlebois, R. L. (2007) A simulation test bed for hypotheses of genome evolution. Bioinformatics 23, 825–31.
  37. Hillis, D. M., Bull, J. J. (1993) An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst Biol 42, 182–92.
  38. Zhaxybayeva, O., Gogarten, J. P. (2002) Bootstrap, Bayesian probability and maximum likelihood mapping: exploring new tools for comparative genome analyses. BMC Genomics 3, 4.
  39. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., Lipman, D. J. (1990) Basic local alignment search tool. J Mol Biol 215, 403–10.
  40. Montague, M. G., Hutchison, C. A., III (2000) Gene content phylogeny of herpesviruses. Proc Natl Acad Sci U S A 97, 5334–9.
  41. van Dongen, S. (2000) Graph Clustering by Flow Simulation. University of Utrecht, Utrecht.
  42. Poptsova, M. S., Gogarten, J. P. (2007) BranchClust: a phylogenetic algorithm for selecting gene families. BMC Bioinformatics 8, 120.
  43. Schmidt, H. A., Strimmer, K., Vingron, M., von Haeseler, A. (2002) TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18, 502–4.
  44. Shimodaira, H., Hasegawa, M. (2001) CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics 17, 1246–7.
  45. Felsenstein, J. (1993) PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. Department of Genetics, University of Washington, Seattle.
  46. Boc, A., Makarenkov, V. (2003) New efficient algorithms for detection of horizontal gene transfer events. In Algorithms in Bioinformatics (Benson, G., Page, R., eds.), pp. 190–201, 3rd Workshop on Algorithms in Bioinformatics, Springer-Verlag, New York.
  47. Makarenkov, V., Boc, A., Boubacar Diallo, A., Baniré Diallo, A. (2008) Algorithms for detecting complete and partial horizontal gene transfers: theory and practice. In CRM Proceedings and AMS Lecture Notes (Pardalos, P. M., Hansen, P., eds.), Vol. 45, pp. 159–79.
  48. Makarenkov, V. (2001) T-REX: reconstructing and visualizing phylogenetic trees and reticulation networks. Bioinformatics 17, 664–8.
  49. Nahar, N., Poptsova, M. S., Hamel, L., Gogarten, J. P. (2007) GPX: a tool for the exploration and visualization of genome evolution. Proceedings of the IEEE 7th International Symposium on Bioinformatics & Bioengineering (BIBE07), Boston, 1338–42.
  50. Kohonen, T. (2001) Self-Organizing Maps. Springer, New York.
  51. Hamel, L., Nahar, N., Poptsova, M. S., Zhaxybayeva, O., Gogarten, J. P. (2008) Unsupervised learning in detection of gene transfer. J Biomed Biotechnol, doi: 10.1155/2008/472719.
  52. Huelsenbeck, J., Rannala, B. (2004) Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Syst Biol 53, 904–13.
  53. Hein, J., Jiang, T., Wang, L., Zhang, K. (1996) On the complexity of comparing evolutionary trees. Discrete Appl Math 71, 153–69.
  54. Strimmer, K., Rambaut, A. (2002) Inferring confidence sets of possibly misspecified gene trees. Proc Biol Sci 269, 137–42.
  55. Bordewich, M., Semple, C. (2007) Computing the hybridization number of two phylogenetic trees is fixed-parameter tractable. IEEE/ACM Trans Comput Biol Bioinform 4, 458–66.
  56. Baroni, M., Grunewald, S., Moulton, V., Semple, C. (2005) Bounding the number of hybridisation events for a consistent evolutionary history. J Math Biol 51, 171–82.
  57. Baroni, M., Semple, C., Steel, M. (2006) Hybrids in real time. Syst Biol 55, 46–56.

Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Maria Poptsova (1)
  1. Department of Molecular and Cell Biology, University of Connecticut, Storrs, USA
