NHST is still logically flawed

Schneider, Jesper W.

doi:10.1007/s11192-018-2655-4

Published: 27 January 2018

Volume 115, pages 627–635, (2018)

Scientometrics

Jesper W. Schneider¹

Abstract

In this elaborate response to Wu (in Scientometrics, 2018), I maintain that null hypothesis significance testing (NHST) is logically flawed. Wu (2018) disagrees with this claim, presented in Schneider (in Scientometrics 102(1):411–432, 2015). In this response, I examine the claim in more depth and demonstrate that since NHST is based on one conditional probability alone and framed in a probabilistic modus tollens framework of reasoning, it is by definition logically invalid. I also argue that disregarding this logical fallacy, as most researchers do, and treating the p value as a heuristic value for dichotomous decisions against the null hypothesis, is a risky business that often leads to false-positive claims.
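
The gap between the single conditional probability used in NHST, P(data | H0), and the probability a researcher usually wants, P(H0 | data), can be illustrated with Bayes' theorem. The following is a minimal sketch and not taken from the article: the function name `false_positive_risk` and the input values (prior probability of a real effect, power, and significance level) are illustrative assumptions, and the model considers only two point hypotheses.

```python
def false_positive_risk(prior_h1: float, power: float, alpha: float) -> float:
    """Return P(H0 is true | result is 'significant').

    Assumes a simplified two-hypothesis model (hypothetical inputs):
      prior_h1 -- assumed prior probability that a real effect exists
      power    -- P(significant | H1 true)
      alpha    -- P(significant | H0 true), the significance level
    """
    true_positives = prior_h1 * power
    false_positives = (1.0 - prior_h1) * alpha
    # Bayes' theorem: share of significant results that come from a true H0.
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Illustrative values only: 10% of tested effects real, 80% power, alpha = 0.05.
    risk = false_positive_risk(prior_h1=0.1, power=0.8, alpha=0.05)
    print(f"{risk:.2f}")  # prints 0.36
```

Under these assumed inputs, roughly a third of "significant" results would be false positives even though alpha is 0.05, which shows why a small P(data | H0) cannot by itself be read as a small P(H0 | data).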

Notes

http://issi-society.org/blog/posts/2017/october/issi-paper-of-the-year-award-2017/.
The latter in itself can be seen as a logical flaw, as a valid measure of strength of evidence should not include the probabilities of unobserved outcomes (Jeffreys 1939; Berger and Delampady 1987; Berger and Berry 1988a, b; Royall 1997; Goodman 1999), but this is not the main logical flaw of interest here.

References

Berger, J. O., & Berry, D. A. (1988a). The relevance of stopping rules in statistical inference (with discussion). In S. Gupta & J. O. Berger (Eds.), Statistical decision theory and related topics IV (Vol. 1, pp. 29–72). New York, NY: Springer.
Berger, J. O., & Berry, D. A. (1988b). Statistical analysis and the illusion of objectivity. American Scientist, 76(2), 159–165.
Berger, J. O., & Delampady, M. (1987). Testing precise hypotheses. Statistical Science, 2(3), 317–352.
Berger, J. O., & Sellke, T. (1987). Testing a point null hypothesis—The irreconcilability of p-values and evidence. Journal of the American Statistical Association, 82(397), 112–122.
Berkson, J. (1942). Tests of significance considered as evidence. Journal of the American Statistical Association, 37(219), 325–335.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997–1003.
Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science, 1(3), 1–16. https://doi.org/10.1098/rsos.140216.
Colquhoun, D. (2017). The reproducibility of research and the misinterpretation of P values. bioRxiv. https://doi.org/10.1101/144337.
Edwards, A. W. F. (1972). Likelihood. Cambridge: Cambridge University Press.
Falk, R., & Greenbaum, C. (1995). Significance tests die hard: The amazing persistence of a probabilistic misconception. Theory & Psychology, 5, 75–98.
Fisher, R. A. (1956). Statistical methods and scientific inference. New York, NY: Hafner.
Gigerenzer, G. (1993). The superego, the ego, and the id in statistical reasoning. In G. Keren & C. Lewis (Eds.), A handbook for data analysis in the behavioral sciences: Methodological issues (pp. 311–339). Hillsdale, NJ: Erlbaum.
Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-economics, 33(5), 587–606.
Goodman, S. N. (1999). Toward evidence-based medical statistics. 1: The P value fallacy. Annals of Internal Medicine, 130(12), 995–1004.
Hacking, I. (1965). Logic of statistical inference. Cambridge: Cambridge University Press.
Hofmann, S. G. (2002). Fisher’s fallacy and NHST’s flawed logic. American Psychologist, 57(1), 69–70.
Hubbard, R., & Lindsay, R. M. (2008). Why P values are not a useful measure of evidence in statistical significance testing. Theory & Psychology, 18(1), 69–88.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), 696–701.
Ioannidis, J. P. A., Stanley, T. D., & Doucouliagos, H. (2017). The power of bias in economics research. The Economic Journal, 127(605), F236–F265.
Jeffreys, H. (1939). Theory of probability. Oxford: Clarendon Press.
Krueger, J. (2001). Null hypothesis significance testing: On the survival of a flawed method. American Psychologist, 56(1), 16–26.
Krueger, J. I., & Heck, P. R. (2017). The heuristic value of p in inductive statistical inference. Frontiers in Psychology, 8(908), 1–16. https://doi.org/10.3389/fpsyg.2017.00908.
Lindley, D. V. (1957). A statistical paradox. Biometrika, 44(1–2), 187–192.
Meehl, P. E. (1967). Theory-testing in psychology and physics: A methodological paradox. Philosophy of Science, 34(2), 103–115.
Nickerson, R. S. (2000). Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods, 5(2), 241–301.
Pollard, P., & Richardson, J. T. (1987). On the probability of making type I errors. Psychological Bulletin, 102(1), 159–163.
Royall, R. (1997). Statistical evidence: A likelihood paradigm. London: Chapman & Hall.
Schneider, J. W. (2015). Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations. Scientometrics, 102(1), 411–432.
Sellke, T., Bayarri, M. J., & Berger, J. O. (2001). Calibration of p values for testing precise null hypotheses. The American Statistician, 55, 62–71.
Sober, E. (2008). Evidence and evolution. The logic behind science. Cambridge: Cambridge University Press.
Szucs, D., & Ioannidis, J. P. A. (2017). When null hypothesis significance testing is unsuitable for research: A reassessment. Frontiers in Human Neuroscience, 11(390), 1–21. https://doi.org/10.3389/fnhum.2017.00390.
Trafimow, D. (2003). Hypothesis testing and theory evaluation at the boundaries: Surprising insights from Bayes’s theorem. Psychological Review, 110(3), 526.
Trafimow, D., & Rice, S. (2009). A test of the null hypothesis significance testing procedure correlation argument. The Journal of General Psychology, 136(3), 261–270.
Wu, J. (2018). Is there an intrinsic logical error in null hypothesis significance tests? Commentary on: “Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations”. Scientometrics. https://doi.org/10.1007/s11192-018-2656-3.

Author information

Authors and Affiliations

Danish Centre for Studies in Research and Research Policy, Department of Political Science, Aarhus University, Bartholins Alle 7, 8000, Aarhus C, Denmark
Jesper W. Schneider

Corresponding author

Correspondence to Jesper W. Schneider.

About this article

Cite this article

Schneider, J.W. NHST is still logically flawed. Scientometrics 115, 627–635 (2018). https://doi.org/10.1007/s11192-018-2655-4

Received: 22 December 2017
Published: 27 January 2018
Issue Date: April 2018
DOI: https://doi.org/10.1007/s11192-018-2655-4
