Abstract
The aim of this article is to propose a way to determine to what extent a hypothesis H is confirmed once it has successfully passed a classical significance test. Bayesians have already raised many serious objections against significance testing, but in doing so they have always had to rely on epistemic probabilities and a further Bayesian analysis, both of which are rejected by classical statisticians. I therefore suggest a purely frequentist evaluation procedure for significance tests that should be acceptable to a classical statistician as well. This procedure also reveals some additional problems of significance tests. In some situations, such tests offer only weak incremental support for a hypothesis, although an absolute confirmation is needed, and they overestimate positive results for small effects, since the confirmation of H is often rather marginal in these cases. In specific cases, for example, ESP hypotheses such as precognition, this phenomenon leads too easily to a significant confirmation and can be regarded as a form of the probabilistic falsification fallacy.
Notes
In controlled experiments we try to rule out possible alternative hypotheses about the causes of the observed effect through the special design of our experiments (e.g., by comparing two groups selected by a process of randomization).
For a classical statistician the likelihood P(T ≥ x | H0) is the probability of the result T ≥ x under the assumption of the null hypothesis H0. It should express what the hypothesis says about the result and cannot be computed as P(T ≥ x ∧ H0)/P(H0), since for a classical statistician P(H0) is not defined but merely a subjective magnitude.
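The frequentist reading of this likelihood can be made concrete: for a one-sided z-test, P(T ≥ x | H0) is simply the upper-tail probability of the standard normal distribution under the null, computed without any appeal to P(H0). A minimal sketch (the observed value x = 1.96 is a hypothetical illustration, not taken from the article):

```python
import math

def p_value_one_sided_z(x: float) -> float:
    """P(T >= x | H0) for a z-statistic: upper tail of the standard normal.

    Computed directly from the null distribution alone, matching the
    frequentist reading of the likelihood in the footnote.
    """
    # Survival function of N(0, 1) via the complementary error function.
    return 0.5 * math.erfc(x / math.sqrt(2))

print(p_value_one_sided_z(1.96))  # just below the conventional 0.025 cutoff
```

Note that nothing in the computation mentions a prior probability of H0; the p-value is fixed entirely by the sampling distribution of T under the null.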
If it is determined, we will additionally take into account the type-II error (here called β-error) associated with Neyman–Pearson hypothesis testing.
Nevertheless, one of the reviewers regards the frequentist interpretation as the more plausible reading of the articles mentioned. I therefore leave it to the reader to decide what the best understanding of the PPV is.
Hagen (1997, 19) has proposed a similar but Bayesian analysis and has come to the conclusion that null hypothesis testing tells us what we want, although he determined the posterior probability of H1 only for one special case, in which it indeed increases from 0.5 to 0.89.
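A case like Hagen's can be reconstructed with the standard posterior formula P(H1 | significant) = π·(1−β) / (π·(1−β) + (1−π)·α), which is also the formula behind the PPV mentioned in the previous note. The sketch below assumes a prior of 0.5 and α = 0.05; the power value 0.40 is a hypothetical choice (not taken from Hagen) picked so that the posterior lands near the 0.89 the footnote reports:

```python
def posterior_h1(prior: float, power: float, alpha: float) -> float:
    """Posterior probability of H1 given a significant result.

    Bayes' theorem with P(significant | H1) = power and
    P(significant | H0) = alpha.
    """
    hit = prior * power                # significant results under H1
    false_alarm = (1 - prior) * alpha  # significant results under H0
    return hit / (hit + false_alarm)

# Hypothetical parameters: prior 0.5, alpha 0.05, power 0.40.
print(round(posterior_h1(prior=0.5, power=0.40, alpha=0.05), 2))  # 0.89
```

The same parameters with a lower prior of 0.1 yield a posterior of only about 0.47, which illustrates why one favourable special case does not vindicate null hypothesis testing in general.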
References
Alcock, J. (2011). Back from the future: Parapsychology and the Bem affair. Skeptical Inquirer. http://www.csicop.org/specialarticles/show/back_from_the_future.
Bartelborth, T. (2012). Die erkenntnistheoretischen Grundlagen induktiven Schließens. E-Book. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-84565.
Beck-Bornholdt, H. P., & Dubben, H. H. (1996). Is the pope an alien? Nature, 381, 730.
Beck-Bornholdt, H. P., & Dubben, H. H. (1998). Der Hund, der Eier legt. Erkennen von Fehlinformationen durch Querdenken. Hamburg: Rowohlt.
Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100, 1–19.
Berger, J. O., & Berry, D. A. (1988). Statistical analysis and the illusion of objectivity. American Scientist, 76, 159–165.
Bundschuh, M., Newman, M. C., Zubrod, J. P., Seitz, F., Rosenfeldt, R. R., & Schulz, R. (2013). Misuse of null hypothesis significance testing: Would estimation of positive and negative predictive values improve certainty of chemical risk assessment? Environmental Science and Pollution Research. doi:10.1007/s11356-013-1749-z.
Campbell, S., & Franklin, J. (2004). Randomness and the justification of induction. Synthese, 138, 79–99.
Fisher, R. A. (1935a). The design of experiments. Edinburgh: Oliver & Boyd.
Fisher, R. A. (1935b). The logic of inductive inference. Journal of the Royal Statistical Society, 98, 39–82.
Franklin, J. (2001). Resurrecting logical probability. Erkenntnis, 55, 277–305.
Hagen, R. L. (1997). In praise of the null hypothesis statistical test. American Psychologist, 52(1), 15–24.
Hájek, A. (2012). Interpretations of probability. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Winter 2012 Edition). http://plato.stanford.edu/archives/win2012/entries/probability-interpret/.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2, 696–701.
Lindley, D. V. (1957). A statistical paradox. Biometrika, 44, 187–192.
Strevens, M. (2000). Do large probabilities explain better? Philosophy of Science, 67, 366–390.
Wacholder, S., Chanock, S., Garcia-Closas, M., El ghormli, L., & Rothman, N. (2004). Assessing the probability that a positive report is false: An approach for molecular epidemiology studies. Journal of the National Cancer Institute, 96(6), 434–442.
Wagenmakers, E.-J., Wetzels, R., Borsboom, D., & van der Maas, H. L. J. (2011). Why psychologists must change the way they analyze their data: The case of psi: Comment on Bem (2011). Journal of Personality and Social Psychology, 100, 426–432.
Wetzels, R., Matzke, D., Lee, M. D., Rouder, J. N., Iverson, G. J., & Wagenmakers, E.-J. (2011). Statistical evidence in experimental psychology: An empirical comparison using 855 t-tests. Perspectives on Psychological Science, 6, 291–298.
Acknowledgments
I am grateful to two anonymous referees for very helpful comments on an earlier draft.
Cite this article
Bartelborth, T. How Strong is the Confirmation of a Hypothesis by Significant Data? J Gen Philos Sci 47, 277–291 (2016). https://doi.org/10.1007/s10838-016-9341-0