A Distance-Based Method for Inferring Phylogenetic Networks in the Presence of Incomplete Lineage Sorting

  • Yun Yu
  • Luay Nakhleh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9096)


Hybridization and incomplete lineage sorting (ILS) are two evolutionary processes that result in incongruence among gene trees and complicate the identification of the evolutionary history of species. Although a wide array of methods has been developed for inferring species phylogenies in the presence of each of these two processes individually, methods that can account for both simultaneously have been introduced only recently. However, these new methods are based on the optimization of certain criteria, such as parsimony and likelihood, and are thus computationally intensive. In this paper, we present a novel distance-based method for inferring phylogenetic networks in the presence of ILS that makes use of pairwise distances computed from multiple loci sampled across the genome. We show in simulation studies that the method infers accurate networks when the estimated pairwise distances are accurate. Furthermore, we devise a heuristic for post-processing the inferred network to remove potential false-positive reticulation events. The method is computationally very efficient and is applicable to very large data sets.


Gene Tree, Pairwise Distance, Phylogenetic Network, Incomplete Lineage Sorting, Optimal Pair





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science, Rice University, Houston, USA
