The sufficiency of the evidence, the relevancy of the evidence, and quantifying both with a single number

Bickel, David R.

doi:10.1007/s10260-020-00553-3

Original Paper
Published: 01 January 2021

Statistical Methods & Applications, volume 30, pages 1157–1174 (2021)

David R. Bickel (ORCID: orcid.org/0000-0002-9623-3946)

A DOG, crossing a bridge over a stream with a piece of flesh in his mouth, saw his own shadow in the water and took it for that of another Dog, with a piece of meat double his own in size. He immediately let go of his own, and fiercely attacked the other Dog to get his larger piece from him. He thus lost both: that which he grasped at in the water, because it was a shadow; and his own, because the stream swept it away.

(Aesop’s Fables, translated by George Fyler Townsend, Amazon Digital Services, Inc., p. 18)

Abstract

Consider a data set as a body of evidence that might confirm or disconfirm a hypothesis about a parameter value. If the posterior probability of the hypothesis is high enough, then the truth of the hypothesis is accepted for some purpose such as reporting a new discovery. In that way, the posterior probability measures the sufficiency of the evidence for accepting the hypothesis. It would only follow that the evidence is relevant to the hypothesis if the prior probability were not already high enough for acceptance. A measure of the relevancy of the evidence is the Bayes factor since it is the ratio of the posterior odds to the prior odds. Measures of the sufficiency of the evidence and measures of the relevancy of the evidence are not mutually exclusive. An example falling in both classes is the likelihood ratio statistic, perhaps based on a pseudolikelihood function that eliminates nuisance parameters. There is a sense in which the likelihood ratio statistic measures both the sufficiency of the evidence and its relevancy. That result is established by representing the likelihood ratio statistic in terms of a conditional possibility measure that satisfies logical coherence rather than probabilistic coherence.
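
As a minimal sketch of the distinction drawn above, the following Python example compares two simple hypotheses about a normal mean. The observation, the two candidate means, the prior probability, and the acceptance threshold of 0.95 are hypothetical values chosen only for illustration; none of them comes from the article. The sketch covers only the posterior probability and the Bayes factor, not the conditional possibility measure used in the paper.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd=1.0):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2.0 * pi))


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def posterior_probability(prior, lik_h, lik_alt):
    """Bayes's theorem for a hypothesis H against a single simple alternative."""
    return prior * lik_h / (prior * lik_h + (1.0 - prior) * lik_alt)


# Hypothetical setup: one observation x; H says mean = 0, the alternative says mean = 2.
x = 0.3
prior_h = 0.5      # assumed prior probability of H
threshold = 0.95   # assumed posterior probability required to accept H

lik_h = normal_pdf(x, 0.0)
lik_alt = normal_pdf(x, 2.0)

# Sufficiency: is the posterior probability high enough for acceptance?
post_h = posterior_probability(prior_h, lik_h, lik_alt)
sufficient = post_h >= threshold

# Relevancy: the Bayes factor is the ratio of posterior odds to prior odds.
bayes_factor = odds(post_h) / odds(prior_h)

# For two simple hypotheses the Bayes factor coincides with the likelihood ratio.
likelihood_ratio = lik_h / lik_alt

print(f"posterior probability of H: {post_h:.3f}")
print(f"Bayes factor: {bayes_factor:.3f}")
print(f"likelihood ratio: {likelihood_ratio:.3f}")
print(f"sufficient for acceptance at {threshold}: {sufficient}")
```

With these illustrative numbers the Bayes factor is about 4.06, so the data are relevant to the hypothesis, yet the posterior probability is about 0.80, below the assumed threshold, so the evidence is not sufficient for acceptance.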

Acknowledgements

This research was partially supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN/356018-2009), by the Canada Foundation for Innovation (CFI16604), by the Ministry of Research and Innovation of Ontario (MRI16604), and by the Faculty of Medicine of the University of Ottawa.

Author information

Authors and Affiliations

University of Ottawa, Ottawa, Canada
David R. Bickel

Corresponding author

Correspondence to David R. Bickel.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bickel, D.R. The sufficiency of the evidence, the relevancy of the evidence, and quantifying both with a single number. Stat Methods Appl 30, 1157–1174 (2021). https://doi.org/10.1007/s10260-020-00553-3

Accepted: 23 November 2020
Published: 01 January 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s10260-020-00553-3
