Numerical Optimization Techniques in Maximum Likelihood Tree Inference

Bioinformatics and Phylogenetics

Part of the book series: Computational Biology (COBO, volume 29)

Abstract

In this chapter, we present recent computational and algorithmic advances that improve the inference of phylogenetic trees from homologous genetic sequences under the maximum likelihood criterion. In particular, we detail how the use of matrix algebra at the core of Felsenstein’s pruning algorithm, combined with the architecture of modern-day computer processors, leads to efficient techniques for optimizing edge lengths. We also discuss some properties of the likelihood function that matter when optimizing the parameters of the mixture models used to describe the variation of rates across sites.

Acknowledgements

We would like to thank Alexandros Stamatakis for helpful suggestions on how to improve this chapter and Tandy Warnow for inviting us to celebrate Bernard Moret’s contributions to the field of computational evolution.

Author information

Corresponding author

Correspondence to Olivier Gascuel.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Guindon, S., Gascuel, O. (2019). Numerical Optimization Techniques in Maximum Likelihood Tree Inference. In: Warnow, T. (eds) Bioinformatics and Phylogenetics. Computational Biology, vol 29. Springer, Cham. https://doi.org/10.1007/978-3-030-10837-3_2

  • DOI: https://doi.org/10.1007/978-3-030-10837-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-10836-6

  • Online ISBN: 978-3-030-10837-3

  • eBook Packages: Computer Science, Computer Science (R0)
