Likelihood-Based Inference of Phylogenetic Networks from Sequence Data by PhyloDAG

  • Quan Nguyen
  • Teemu Roos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9199)


Processes such as hybridization, horizontal gene transfer, and recombination result in reticulation, which can be modeled by phylogenetic networks. Earlier likelihood-based methods for inferring phylogenetic networks from sequence data have been encumbered by the computational challenges related to likelihood evaluations. Consequently, they have required that the possible network hypotheses be given explicitly or implicitly in terms of a backbone tree to which reticulation edges are added. To achieve the speed required for an unrestricted network search, rather than only adding reticulation edges to an initial tree structure, we employ several fast approximate inference techniques. Preliminary numerical and real data experiments demonstrate that the proposed method, PhyloDAG, is able to learn accurate phylogenetic networks based on limited amounts of data using moderate amounts of computational resources.


Phylogenetic networks, Likelihood-based inference, Phylogenetics, Probabilistic graphical models



This work was supported in part by the Academy of Finland (Center-of-Excellence COIN). We are grateful to Vincent Moulton for insightful comments. The anonymous reviewers suggested a comparison to the PhyloNet method and made several other suggestions that significantly improved the paper.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
