Journal of Molecular Evolution, Volume 43, Issue 3, pp 304–311

Probability distribution of molecular evolutionary trees: A new method of phylogenetic inference

  • Bruce Rannala
  • Ziheng Yang
Article

Abstract

A new method is presented for inferring evolutionary trees using nucleotide sequence data. The birth-death process is used as a model of speciation and extinction to specify the prior distribution of phylogenies and branching times. Nucleotide substitution is modeled by a continuous-time Markov process. Parameters of the branching model and the substitution model are estimated by maximum likelihood. The posterior probabilities of different phylogenies are calculated and the phylogeny with the highest posterior probability is chosen as the best estimate of the evolutionary relationship among species. We refer to this as the maximum posterior probability (MAP) tree. The posterior probability provides a natural measure of the reliability of the estimated phylogeny. Two example data sets are analyzed to infer the phylogenetic relationship of human, chimpanzee, gorilla, and orangutan. The best trees estimated by the new method are the same as those from the maximum likelihood analysis of separate topologies, but the posterior probabilities are quite different from the bootstrap proportions. The results of the method are found to be insensitive to changes in the rate parameter of the branching process.
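The final step described in the abstract, turning a prior and a likelihood for each candidate phylogeny into posterior probabilities and selecting the MAP tree, can be sketched as follows. This is a minimal illustration of Bayes' theorem over a discrete set of trees, not the authors' implementation. It assumes the log-likelihood of each topology has already been computed (in the paper this involves the substitution model and the branching times, which are not reproduced here). The function names, the three candidate topologies, and all numbers are hypothetical.

```python
import math


def posterior_probabilities(log_priors, log_likelihoods):
    """Normalize prior x likelihood over a finite set of candidate trees.

    Both arguments map a tree label to a natural-log value. The computation
    subtracts the largest log joint value before exponentiating so that very
    negative log-likelihoods do not underflow to zero.
    """
    if log_priors.keys() != log_likelihoods.keys():
        raise ValueError("priors and likelihoods must cover the same trees")
    log_joint = {t: log_priors[t] + log_likelihoods[t] for t in log_priors}
    peak = max(log_joint.values())
    weights = {t: math.exp(v - peak) for t, v in log_joint.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def map_tree(posteriors):
    """Return the label of the tree with the highest posterior probability."""
    return max(posteriors, key=posteriors.get)


# Hypothetical inputs: three candidate topologies for human (H), chimpanzee (C),
# gorilla (G), and orangutan (O), with an equal prior on each. These values are
# made up for illustration and are not taken from the paper.
trees = ["((H,C),G,O)", "((H,G),C,O)", "((C,G),H,O)"]
log_priors = {t: math.log(1.0 / len(trees)) for t in trees}
log_likelihoods = {
    "((H,C),G,O)": -2500.0,
    "((H,G),C,O)": -2503.0,
    "((C,G),H,O)": -2504.0,
}

posteriors = posterior_probabilities(log_priors, log_likelihoods)
best = map_tree(posteriors)

for tree in trees:
    print(f"{tree}: {posteriors[tree]:.4f}")
print("MAP tree:", best)
```

With an equal prior, the ratio of two posterior probabilities reduces to the likelihood ratio of the two trees. A non-uniform prior, such as one derived from a birth-death process, would shift the posteriors through the `log_priors` argument.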

Key words

Maximum likelihood, Phylogeny, Nucleotide substitution, Posterior probability, Empirical Bayes estimation, MAP tree

Copyright information

© Springer-Verlag New York Inc 1996

Authors and Affiliations

  • Bruce Rannala (1)
  • Ziheng Yang (1)
  1. Department of Integrative Biology, University of California, Berkeley, USA
