The sufficiency of the evidence, the relevancy of the evidence, and quantifying both with a single number

A DOG, crossing a bridge over a stream with a piece of flesh in his mouth, saw his own shadow in the water and took it for that of another Dog, with a piece of meat double his own in size. He immediately let go of his own, and fiercely attacked the other Dog to get his larger piece from him. He thus lost both: that which he grasped at in the water, because it was a shadow; and his own, because the stream swept it away.

(Aesop’s Fables, translated by George Fyler Townsend, Amazon Digital Services, Inc., p. 18)

Abstract

Consider a data set as a body of evidence that might confirm or disconfirm a hypothesis about a parameter value. If the posterior probability of the hypothesis is high enough, then the truth of the hypothesis is accepted for some purpose such as reporting a new discovery. In that way, the posterior probability measures the sufficiency of the evidence for accepting the hypothesis. It would only follow that the evidence is relevant to the hypothesis if the prior probability were not already high enough for acceptance. A measure of the relevancy of the evidence is the Bayes factor since it is the ratio of the posterior odds to the prior odds. Measures of the sufficiency of the evidence and measures of the relevancy of the evidence are not mutually exclusive. An example falling in both classes is the likelihood ratio statistic, perhaps based on a pseudolikelihood function that eliminates nuisance parameters. There is a sense in which the likelihood ratio statistic measures both the sufficiency of the evidence and its relevancy. That result is established by representing the likelihood ratio statistic in terms of a conditional possibility measure that satisfies logical coherence rather than probabilistic coherence.
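The distinction between sufficiency and relevancy can be illustrated with a minimal numerical sketch. It is not taken from the article. It assumes two simple hypotheses, so that Bayes's theorem applies in its elementary form, and a hypothetical acceptance threshold of 0.9. The function names, the threshold, and the likelihood values are all illustrative assumptions.

```python
# Illustrative sketch only: two simple hypotheses, H and its alternative.
# ACCEPTANCE_THRESHOLD and every numerical input below are hypothetical.

ACCEPTANCE_THRESHOLD = 0.9


def odds(probability):
    """Convert a probability strictly between 0 and 1 to odds."""
    return probability / (1.0 - probability)


def posterior_probability(prior, likelihood_h, likelihood_alt):
    """Posterior probability of H given the data, by Bayes's theorem."""
    numerator = prior * likelihood_h
    return numerator / (numerator + (1.0 - prior) * likelihood_alt)


def bayes_factor(prior, posterior):
    """Ratio of the posterior odds to the prior odds of H."""
    return odds(posterior) / odds(prior)


def summarize(prior, likelihood_h, likelihood_alt):
    """Return (posterior, Bayes factor, whether the posterior passes the threshold)."""
    posterior = posterior_probability(prior, likelihood_h, likelihood_alt)
    factor = bayes_factor(prior, posterior)
    return posterior, factor, posterior >= ACCEPTANCE_THRESHOLD


# Case A: an even prior, and data that favor H by a likelihood ratio of 19.
case_a = summarize(prior=0.5, likelihood_h=0.19, likelihood_alt=0.01)

# Case B: a prior already above the threshold, and data that do not
# discriminate between the hypotheses (equal likelihoods).
case_b = summarize(prior=0.96, likelihood_h=0.3, likelihood_alt=0.3)

for label, (posterior, factor, accepted) in (("A", case_a), ("B", case_b)):
    print(f"Case {label}: posterior={posterior:.3f}, "
          f"Bayes factor={factor:.3f}, accepted={accepted}")
```

In both cases the posterior probability passes the threshold, so the evidence counts as sufficient for acceptance. Only in Case A does the Bayes factor differ from 1. In Case B the data leave the odds unchanged, so the high posterior probability reflects the prior rather than any relevancy of the evidence.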

Acknowledgements

This research was partially supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN/356018-2009), by the Canada Foundation for Innovation (CFI16604), by the Ministry of Research and Innovation of Ontario (MRI16604), and by the Faculty of Medicine of the University of Ottawa.

Author information

Corresponding author

Correspondence to David R. Bickel.

About this article

Cite this article

Bickel, D.R. The sufficiency of the evidence, the relevancy of the evidence, and quantifying both with a single number. Stat Methods Appl (2021). https://doi.org/10.1007/s10260-020-00553-3

Keywords

  • Deductive closure
  • Deductive cogency
  • General law of likelihood
  • Likelihood paradigm
  • Possibility measure
  • Possibility theory
  • Pure likelihood methods
  • Restricted parameter space
  • Strength of statistical evidence