Biology & Philosophy, Volume 30, Issue 4, pp 505–525

Seeing the wood for the trees: philosophical aspects of classical, Bayesian and likelihood approaches in statistical inference and some implications for phylogenetic analysis

  • Daniel Barker


The three main approaches in statistical inference—classical statistics, Bayesian and likelihood—are in current use in phylogeny research. The three approaches are discussed and compared, with particular emphasis on theoretical properties illustrated by simple thought-experiments. The methods are problematic on axiomatic grounds (classical statistics), extra-mathematical grounds relating to the use of a prior (Bayesian inference) or practical grounds (likelihood). This essay aims to increase understanding of these limits among those with an interest in phylogeny.


Keywords: Phylogeny, Statistics, Bayesian inference, Classical statistics, Likelihood, Philosophy of science



I thank Maria Dornelas, Heleen Plaisier and Graeme Ruxton for their comments on an earlier version of the manuscript. Discussions at the University of St Andrews, particularly at the Harold Mitchell Building’s Lab Chat series organised by Mike Ritchie’s group and the Centre for Biological Diversity’s Quantitative Biology Discussion Group organised by Mike Morrisey, have also been helpful. I further thank Heleen Plaisier for pointing out the truth about librarians and farmers.



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Sir Harold Mitchell Building, School of Biology, University of St Andrews, St Andrews, Fife, UK
