Numerical Optimization Techniques in Maximum Likelihood Tree Inference

  • Stéphane Guindon
  • Olivier Gascuel
Chapter
Part of the Computational Biology book series (COBO, volume 29)

Abstract

In this chapter, we present recent computational and algorithmic advances that improve the inference of phylogenetic trees from homologous genetic sequences under the maximum likelihood criterion. In particular, we detail how the use of matrix algebra at the core of Felsenstein’s pruning algorithm, combined with the architecture of modern-day computer processors, leads to efficient techniques for optimizing edge lengths. We also discuss some properties of the likelihood function that arise when optimizing the parameters of mixture models used to describe the variation of rates across sites.

Keywords

Maximum likelihood, Markov processes, Optimization, Mixture models

Notes

Acknowledgements

We would like to thank Alexandros Stamatakis for helpful suggestions on how to improve this chapter and Tandy Warnow for inviting us to celebrate Bernard Moret’s contributions to the field of computational evolution.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Laboratoire d’Informatique de Robotique et de Microélectronique de Montpellier, CNRS and Université Montpellier (UMR 5506), Montpellier, France
  2. Unité Bioinformatique Evolutive, C3BI, Institut Pasteur and CNRS (USR 3756), Paris, France
