In this elaborate response to Wu (in Scientometrics, 2018), I maintain that null hypothesis significance testing (NHST) is logically flawed. Wu (2018) disagrees with this claim, which was presented in Schneider (in Scientometrics 102(1):411–432, 2015). In this response, I examine the claim in more depth and demonstrate that, because NHST is based on a single conditional probability and framed as a probabilistic modus tollens argument, it is by definition logically invalid. I also argue that disregarding this logical fallacy, as most researchers do, and treating the p value as a heuristic for dichotomous decisions against the null hypothesis is a risky business that often leads to false-positive claims.
Keywords: Null hypothesis significance test; Statistical inference; p values; Conditional probability; Inference logic; Modus tollens
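The point about relying on one conditional probability alone can be illustrated with a small calculation. The p value concerns the probability of the data given the null hypothesis, whereas a decision against the null implicitly concerns the probability of the null given the data, and moving from one to the other requires Bayes' theorem plus information NHST does not use. The sketch below is only an illustration under stated assumptions: the function name `false_positive_risk` and the input values (prior probability of a true null, significance level, power) are hypothetical and are not taken from the paper.

```python
def false_positive_risk(prior_h0_true: float, alpha: float, power: float) -> float:
    """Return P(H0 true | significant result) via Bayes' theorem.

    alpha is P(significant | H0 true); power is P(significant | H0 false).
    All inputs are illustrative assumptions, not values from the paper.
    """
    prior_h0_false = 1.0 - prior_h0_true
    # Total probability of observing a significant result.
    p_significant = alpha * prior_h0_true + power * prior_h0_false
    # Invert the conditional: from P(significant | H0) to P(H0 | significant).
    return alpha * prior_h0_true / p_significant


# Hypothetical scenario: 90% of tested nulls are true, alpha = 0.05, power = 0.8.
risk = false_positive_risk(prior_h0_true=0.9, alpha=0.05, power=0.8)
print(round(risk, 2))  # 0.36
```

Under these assumed inputs, roughly a third of significant results would come from true null hypotheses, even though the significance level is 0.05. The result changes with the prior and the power, which is the information a single conditional probability cannot supply.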
- Berger, J. O., & Berry, D. A. (1988a). The relevance of stopping rules in statistical inference (with discussion). In S. Gupta & J. O. Berger (Eds.), Statistical decision theory and related topics IV (Vol. 1, pp. 29–72). New York, NY: Springer.
- Berger, J. O., & Berry, D. A. (1988b). Statistical analysis and the illusion of objectivity. American Scientist, 76(2), 159–165.
- Gigerenzer, G. (1993). The superego, the ego, and the id in statistical reasoning. In G. Keren & C. Lewis (Eds.), A handbook for data analysis in the behavioral sciences: Methodological issues (pp. 311–339). Hillsdale, NJ: Erlbaum.
- Wu, J. (2018). Is there an intrinsic logical error in null hypothesis significance tests? Commentary on: “Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations”. Scientometrics. https://doi.org/10.1007/s11192-018-2656-3.