Part of the book series: Lecture Notes in Mathematics (LNMBIOS, volume 1922)

Study of the evolutionary relationships among organisms has been of interest to scientists for over 100 years. The earliest attempts at inferring evolutionary relatedness relied solely on observable species characteristics. Modern molecular techniques, however, have made available an abundance of DNA sequence data, which can be used to study these relationships. Today, it is common to consider the information contained in both types of data in order to obtain robust estimates of evolutionary histories.

Estimating the phylogenetic relationships among a collection of organisms from genetic data can be divided into two distinct problems. The first is to define the criterion by which we measure how well a phylogenetic hypothesis fits the observed data. The second is to search the space of possible phylogenies for the tree or trees that best fit the data. In this chapter, we give an overview of these two problems, with particular emphasis on the maximum parsimony and maximum likelihood criteria for comparing trees. Techniques for searching the space of trees for optimal phylogenies under these criteria are also discussed. Throughout the chapter, we use two data sets to illustrate the main ideas. We begin by defining some commonly used terminology and by carefully describing the data used in phylogenetic analysis.
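The split into a criterion and a search can be illustrated with a minimal Python sketch. It is not taken from the chapter. It assumes a made-up four-taxon alignment, trees written as nested tuples, and Fitch's counting procedure as the parsimony criterion. The names `fitch_site`, `parsimony_score`, `alignment`, and `candidate_trees` are hypothetical. With four taxa there are only three unrooted topologies, so the search step here is a plain enumeration rather than one of the heuristic searches needed for realistic data.

```python
def fitch_site(tree, states):
    """Fitch pass for one site on a rooted binary tree.

    tree is a leaf name (str) or a 2-tuple of subtrees.
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Criterion: total number of changes over all sites of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total


# Hypothetical toy alignment (assumption, not data from the chapter).
alignment = {
    "A": "ACGTT",
    "B": "ACGTA",
    "C": "GCATA",
    "D": "GCATT",
}

# The three unrooted topologies on four taxa, each rooted arbitrarily;
# the Fitch count does not depend on where the root is placed.
candidate_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

# Search: score every candidate and keep the tree with the fewest changes.
scores = {tree: parsimony_score(tree, alignment) for tree in candidate_trees}
best_tree = min(scores, key=scores.get)

for tree, score in scores.items():
    print(tree, score)
print("most parsimonious:", best_tree)
```

Swapping `parsimony_score` for a likelihood function (and `min` for `max`) would change the criterion without touching the search loop, which is the point of treating the two problems separately.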

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kubatko, L.S. (2008). Inference of Phylogenetic Trees. In: Friedman, A. (ed.) Tutorials in Mathematical Biosciences IV. Lecture Notes in Mathematics, vol 1922. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74331-6_1
