Identifiability Issues in Phylogeny-Based Detection of Horizontal Gene Transfer
Prokaryotic organisms share genetic material across species boundaries through a process known as horizontal gene transfer (HGT). Detecting HGT is important for understanding how prokaryotic genomes diversify. Phylogeny-based detection is one of the most commonly used approaches to this task. It rests on the observation that HGT may cause gene trees to disagree with one another, as well as with the species phylogeny. Methods that adopt this approach therefore compare gene and species trees and infer a set of HGT events that reconciles the differences among them.
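As a rough illustration of what comparing a gene tree with a species tree involves, the sketch below lists the clades of a gene tree that do not appear in the species tree. It is only a minimal first step under stated assumptions, not the reconciliation procedure studied in the paper: trees are assumed to be rooted and written as nested Python tuples of leaf names, and the function names `clades` and `clades_missing_from_species_tree` are hypothetical.

```python
def clades(tree):
    """Return the clades of a rooted tree as a set of frozensets of leaf names.

    The tree is assumed to be given as nested tuples, e.g. (("A", "B"), "C").
    Single leaves are not counted as clades.
    """
    found = set()

    def leaves_below(node):
        # A leaf is a plain string; an internal node is a tuple of children.
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves_below(child) for child in node))
        found.add(below)
        return below

    leaves_below(tree)
    return found


def clades_missing_from_species_tree(species_tree, gene_tree):
    """Return the gene-tree clades that the species tree does not contain."""
    return clades(gene_tree) - clades(species_tree)


# Hypothetical four-taxon example: the gene tree groups A with C,
# whereas the species tree groups A with B.
species_tree = (("A", "B"), ("C", "D"))
gene_tree = (("A", "C"), ("B", "D"))
disagreements = clades_missing_from_species_tree(species_tree, gene_tree)
```

A non-empty result shows only that the two trees differ. Whether the difference reflects HGT, tree reconstruction error, or lineage sorting is the identifiability question the paper addresses.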
In this paper, we address some of the identifiability issues that face phylogeny-based detection of HGT. In particular, we show how inaccuracies in the reconstructed species and gene trees affect the inferred number of HGT events. Further, we show that a large number of maximally parsimonious HGT scenarios may exist. These results indicate that accurate detection of HGT requires accurate reconstruction of the individual trees, and that more than a single scenario must be considered when explaining gene tree disagreements. Finally, we show that disagreements among trees may result not only from HGT but also from lineage sorting, and we make initial progress on incorporating HGT into the coalescent model, so as to distinguish stochastically between the two causes and obtain an accurate reconciliation. This distinction is particularly relevant when analyzing closely related organisms.
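To give a sense of why lineage sorting alone can produce disagreeing trees, the sketch below evaluates a standard three-taxon result from the neutral coalescent: with no HGT at all, a gene tree differs in topology from the species tree with probability (2/3)e^(-t), where t is the length of the internal species-tree branch in coalescent units (generations scaled by effective population size). This is only the lineage-sorting baseline, not the HGT-aware model the paper works toward, and the function name `discordance_probability` is hypothetical.

```python
import math


def discordance_probability(t):
    """Probability that a three-taxon gene tree disagrees with the species tree.

    Assumes the standard neutral coalescent with no HGT; `t` is the internal
    branch length of the species tree in coalescent units.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # The two lineages fail to coalesce on the internal branch with
    # probability exp(-t); when that happens, two of the three equally
    # likely topologies in the ancestral population disagree with the
    # species tree.
    return (2.0 / 3.0) * math.exp(-t)


# Shorter internal branches (more closely related taxa) give more discordance.
for t in (0.1, 1.0, 5.0):
    print(f"t = {t}: P(discordant) = {discordance_probability(t):.4f}")
```

Under this baseline, discordance shrinks exponentially as the internal branch lengthens, which is one way to see why separating lineage sorting from HGT matters most for closely related organisms.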