
How Strong is the Confirmation of a Hypothesis by Significant Data?

Journal for General Philosophy of Science

Abstract

The aim of this article is to propose a way of determining to what extent a hypothesis H is confirmed once it has successfully passed a classical significance test. Bayesians have raised many serious objections against significance testing, but in doing so they have always had to rely on epistemic probabilities and a further Bayesian analysis, both of which classical statisticians reject. I therefore suggest a purely frequentist evaluation procedure for significance tests that a classical statistician should also accept. This procedure likewise reveals some additional problems of significance tests. In some situations such tests offer only weak incremental support for a hypothesis, although an absolute confirmation would be necessary, and they overestimate positive results for small effects, since the confirmation of H is often rather marginal in these cases. In specific cases, for example ESP hypotheses such as precognition, this phenomenon leads too easily to a significant confirmation and can be regarded as a form of the probabilistic falsification fallacy.


Fig. 1
Fig. 2


Notes

  1. In controlled experiments we try to rule out possible rival hypotheses about the causes of the observed effect through the design of the experiment (e.g., by comparing two groups formed by a process of randomization).

  2. For a classical statistician the likelihood P(T ≥ x|H₀) is the probability of the result T ≥ x under the assumption of the null hypothesis H₀. It should express what the hypothesis says about the result and cannot be computed as P(T ≥ x ∧ H₀)/P(H₀), since for a classical statistician P(H₀) is not defined but merely a subjective magnitude.
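For a simple test statistic, the tail probability P(T ≥ x|H₀) can be computed directly from the sampling distribution alone, with no appeal to P(H₀). A minimal sketch for a hypothetical binomial experiment (the numbers are illustrative, not taken from the article):

```python
from math import comb

def p_value(n, x, p0=0.5):
    """One-sided p-value P(T >= x | H0) for T ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x, n + 1))

# Illustrative case: 60 successes in 100 trials under H0: p0 = 0.5
print(round(p_value(100, 60), 4))  # ≈ 0.0284, i.e. significant at the 5% level
```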

  3. If it is determined, we will additionally take into account the type-II error (here called the β-error), which is associated with Neyman–Pearson hypothesis testing.
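The β-error of a Neyman–Pearson test can be computed once a specific alternative is fixed. The following sketch continues the hypothetical binomial setting above; the critical value and effect size are assumptions for illustration only:

```python
from math import comb

def beta_error(n, crit, p1):
    """Type-II error beta = P(T < crit | H1: p = p1): the probability
    of failing to reject H0 although the true success rate is p1."""
    return sum(comb(n, k) * p1**k * (1 - p1)**(n - k) for k in range(crit))

# Hypothetical: critical value 59 successes out of n = 100; true rate p1 = 0.6
print(round(beta_error(100, 59, 0.6), 2))
```

The power of the test against this alternative is then simply 1 − β.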

  4. Nevertheless, one of the reviewers takes the frequentist interpretation to be the more plausible reading of the mentioned articles. Thus, I leave the decision about the best understanding of the PPV to the reader.
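In the sense of Ioannidis (2005), the PPV — the share of "significant" findings that are actually true — is determined by the significance level α, the power 1 − β, and the prior odds R that a tested hypothesis is true; on a frequentist reading, R is a relative frequency over a population of tested hypotheses. A minimal sketch (the numeric inputs are illustrative assumptions):

```python
def ppv(alpha, power, prior_odds):
    """Positive predictive value: (1-beta)*R / ((1-beta)*R + alpha)."""
    return power * prior_odds / (power * prior_odds + alpha)

# Illustrative: alpha = 0.05, power = 0.8, prior odds R = 0.25
print(round(ppv(0.05, 0.8, 0.25), 2))  # → 0.8
```

On these assumptions one in five significant results would still be a false positive; with smaller prior odds, as for ESP hypotheses, the PPV drops sharply.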

  5. Hagen (1997, 19) has proposed a similar but Bayesian analysis and has come to the conclusion that null hypothesis testing tells us what we want to know, since he determined the posterior probability of H₁ only for one special case, in which it indeed increases from 0.5 to 0.89.
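A posterior calculation of this kind is a simple Bayesian update over two rival hypotheses. The likelihood values below are hypothetical numbers chosen only so that a 0.5 prior rises to roughly 0.89 — they are not Hagen's own figures:

```python
def posterior(prior, lik_h1, lik_h0):
    """Posterior P(H1|D) via Bayes' theorem for two rival hypotheses."""
    return lik_h1 * prior / (lik_h1 * prior + lik_h0 * (1 - prior))

# Hypothetical likelihoods: P(D|H1) = 0.8, P(D|H0) = 0.1, prior P(H1) = 0.5
print(round(posterior(0.5, 0.8, 0.1), 2))  # → 0.89
```

With a less favorable prior, such as 0.1 for an ESP hypothesis, the same likelihoods would yield a much lower posterior — which is precisely why a single special case cannot vindicate null hypothesis testing in general.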

References

  • Alcock, J. (2011). Back from the future: Parapsychology and the Bem affair. Skeptical Inquirer. http://www.csicop.org/specialarticles/show/back_from_the_future.

  • Bartelborth, T. (2012). Die erkenntnistheoretischen Grundlagen induktiven Schließens. E-Book. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-84565.

  • Beck-Bornholdt, H. P., & Dubben, H. H. (1996). Is the pope an alien? Nature, 381, 730.

  • Beck-Bornholdt, H. P., & Dubben, H. H. (1998). Der Hund, der Eier legt. Erkennen von Fehlinformationen durch Querdenken. Hamburg: Rowohlt.

  • Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100, 1–19.

  • Berger, J. O., & Berry, D. A. (1988). Statistical analysis and the illusion of objectivity. American Scientist, 76, 159–165.

  • Bundschuh, M., Newman, M. C., Zubrod, J. P., Seitz, F., Rosenfeldt, R. R., & Schulz, R. (2013). Misuse of null hypothesis significance testing: Would estimation of positive and negative predictive values improve certainty of chemical risk assessment? Environmental Science and Pollution Research. doi:10.1007/s11356-013-1749-z.

  • Campbell, S., & Franklin, J. (2004). Randomness and the justification of induction. Synthese, 138, 79–99.

  • Fisher, R. A. (1935a). The design of experiments. Edinburgh: Oliver & Boyd.

  • Fisher, R. A. (1935b). The logic of inductive inference. Journal of the Royal Statistical Society, 98, 39–82.

  • Franklin, J. (2001). Resurrecting logical probability. Erkenntnis, 55, 277–305.

  • Hagen, R. L. (1997). In praise of the null hypothesis statistical test. American Psychologist, 52(1), 15–24.

  • Hájek, A. (2012). Interpretations of probability. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Winter 2012 Edition). http://plato.stanford.edu/archives/win2012/entries/probability-interpret/.

  • Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2, 696–701.

  • Lindley, D. V. (1957). A statistical paradox. Biometrika, 44, 187–192.

  • Strevens, M. (2000). Do large probabilities explain better? Philosophy of Science, 67, 366–390.

  • Wacholder, S., Chanock, S., Garcia-Closas, M., El ghormli, L., & Rothman, N. (2004). Assessing the probability that a positive report is false: An approach for molecular epidemiology studies. Journal of the National Cancer Institute, 96(6), 434–442.

  • Wagenmakers, E.-J., Wetzels, R., Borsboom, D., & van der Maas, H. L. J. (2011). Why psychologists must change the way they analyze their data: The case of psi: Comment on Bem (2011). Journal of Personality and Social Psychology, 100, 426–432.

  • Wetzels, R., Matzke, D., Lee, M. D., Rouder, J. N., Iverson, G. J., & Wagenmakers, E.-J. (2011). Statistical evidence in experimental psychology: An empirical comparison using 855 t-tests. Perspectives on Psychological Science, 6, 291–298.


Acknowledgments

I am grateful to two anonymous referees for very helpful comments on an earlier draft.

Author information

Correspondence to Thomas Bartelborth.


About this article


Cite this article

Bartelborth, T. How Strong is the Confirmation of a Hypothesis by Significant Data?. J Gen Philos Sci 47, 277–291 (2016). https://doi.org/10.1007/s10838-016-9341-0
