A power fallacy

Wagenmakers, Eric-Jan; Verhagen, Josine; Ly, Alexander; Bakker, Marjan; Lee, Michael D.; Matzke, Dora; Rouder, Jeffrey N.; Morey, Richard D.

doi:10.3758/s13428-014-0517-4

Published: 01 October 2014

Behavior Research Methods, Volume 47, pages 913–917 (2015)

Eric-Jan Wagenmakers¹,
Josine Verhagen¹,
Alexander Ly¹,
Marjan Bakker¹,
Michael D. Lee²,
Dora Matzke¹,
Jeffrey N. Rouder³ &
Richard D. Morey⁴

Abstract

The power fallacy refers to the misconception that what holds on average, across an ensemble of hypothetical experiments, also holds for each case individually. According to the fallacy, high-power experiments always yield more informative data than do low-power experiments. Here we expose the fallacy with concrete examples, demonstrating that a particular outcome from a high-power experiment can be completely uninformative, whereas a particular outcome from a low-power experiment can be highly informative. Although power is useful in planning an experiment, it is less useful (and sometimes even misleading) for making inferences from observed data. To make inferences from data, we recommend the use of likelihood ratios or Bayes factors, which are the extension of likelihood ratios beyond point hypotheses. These methods of inference do not average over hypothetical replications of an experiment, but instead condition on the data that have actually been observed. In this way, likelihood ratios and Bayes factors rationally quantify the evidence that a particular data set provides for or against the null or any other hypothesis.
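The contrast between average performance and a particular outcome can be made concrete with a small numerical sketch. The example below is a hypothetical illustration and is not taken from the article: it assumes a one-sided z test with known standard deviation and two point hypotheses (H0: mean 0; H1: a fixed positive mean). All function names and numbers are chosen for illustration only.

```python
import math

Z_CRIT = 1.6448536269514722  # one-sided critical value for alpha = 0.05


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power(n, mu1, sigma=1.0):
    """Probability of rejecting H0: mu = 0 when the true mean is mu1."""
    return 1.0 - normal_cdf(Z_CRIT - mu1 * math.sqrt(n) / sigma)


def likelihood_ratio(xbar, n, mu0, mu1, sigma=1.0):
    """Likelihood of the observed sample mean under H1 relative to H0."""
    se = sigma / math.sqrt(n)
    return normal_pdf(xbar, mu1, se) / normal_pdf(xbar, mu0, se)


# Assumed high-power design: n = 50, H1 mean = 1.0.
high_power = power(50, 1.0)
# Assumed outcome: a sample mean exactly midway between the two hypotheses.
high_p_value = 1.0 - normal_cdf(0.5 * math.sqrt(50))
high_lr = likelihood_ratio(0.5, 50, 0.0, 1.0)

# Assumed low-power design: n = 4, H1 mean = 0.5, observed sample mean 1.5.
low_power = power(4, 0.5)
low_lr = likelihood_ratio(1.5, 4, 0.0, 0.5)

print(f"high power = {high_power:.3f}, p = {high_p_value:.5f}, LR = {high_lr:.2f}")
print(f"low power  = {low_power:.3f}, LR = {low_lr:.2f}")
```

Under these assumptions, the high-power design produces a significant p value together with a likelihood ratio of 1, because the observed mean is equally likely under both hypotheses. The low-power design produces a likelihood ratio above 10 in favor of H1. Power describes the design on average; the likelihood ratio describes the data actually observed.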

Notes

While odds lie on a naturally meaningful scale calibrated by betting, characterizing evidence through verbal labels such as “moderate” and “strong” is necessarily subjective (Kass & Raftery, 1995). We believe the labels are useful because they facilitate scientific communication, but they should only be considered an approximate descriptive articulation of different standards of evidence.
In order to obtain a t value of 5 with a sample size of only 10 participants, the precognition score needs to have a large mean or a small variance.
In order to obtain a t value of 1.7 with a sample size of 100, the precognition score needs to have a small mean or a high variance.
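The arithmetic behind the last two notes can be sketched as follows. This assumes a one-sample t statistic, t = (mean / sd) × √n, which the notes do not state explicitly; the function name is hypothetical.

```python
import math


def implied_mean_to_sd_ratio(t, n):
    """Mean-to-SD ratio implied by a one-sample t statistic (assumed form)."""
    return t / math.sqrt(n)


# t = 5 with only 10 participants requires a mean well above one SD.
small_sample_ratio = implied_mean_to_sd_ratio(5.0, 10)
# t = 1.7 with 100 participants requires a mean of only 0.17 SD.
large_sample_ratio = implied_mean_to_sd_ratio(1.7, 100)

print(round(small_sample_ratio, 2), round(large_sample_ratio, 2))
```

Under this assumption, the small-sample result corresponds to a mean of roughly 1.58 standard deviations, and the large-sample result to a mean of 0.17 standard deviations.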

References

Bakker, M., van Dijk, A., & Wicherts, J. M. (2012). The rules of the game called psychological science. Perspectives on Psychological Science, 7, 543–554.
Bem, D. J. (2011). Feeling the future: experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100, 407–425.
Berger, J. O., & Wolpert, R. L. (1988). The likelihood principle Vol. 2. Hayward (CA): Institute of Mathematical Statistics.
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 1–12.
Cohen, J. (1990). Things I have learned (thus far). American Psychologist, 45, 1304–1312.
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191.
Galak, J., LeBoeuf, R. A., Nelson, L. D., & Simmons, J. P. (2012). Correcting the past: failures to replicate Psi. Journal of Personality and Social Psychology, 103, 933–948.
Gönen, M., Johnson, W. O., Lu, Y., & Westfall, P. H. (2005). The Bayesian two-sample t test. The American Statistician, 59, 252–257.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2, 696–701.
Jaynes, E. T. (2003). Probability theory: the logic of science. Cambridge: Cambridge University Press.
Jeffreys, H. (1961). Theory of probability (3rd ed.). Oxford: Oxford University Press.
Johnson, V. E. (2013). Revised standards for statistical evidence. Proceedings of the National Academy of Sciences of the United States of America, 110, 19313–19317.
Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.
Lee, M. D., & Wagenmakers, E.-J. (2013). Bayesian modeling for cognitive science: a practical course. Cambridge University Press.
Morey, R. D., & Wagenmakers, E.-J. (2014). Simple relation between Bayesian order-restricted and point-null hypothesis tests. Statistics and Probability Letters, 92, 121–124.
Pratt, J. W. (1965). Bayesian interpretation of standard inference statements. Journal of the Royal Statistical Society B, 27, 169–203.
Ritchie, S. J., Wiseman, R., & French, C. C. (2012). Failing the future: three unsuccessful attempts to replicate Bem’s ‘retroactive facilitation of recall’ effect. PLoS ONE, 7, e33423.
Sellke, T., Bayarri, M. J., & Berger, J. O. (2001). Calibration of p values for testing precise null hypotheses. The American Statistician, 55, 62–71.
Sham, P. C., & Purcell, S. M. (2014). Statistical power and significance testing in large-scale genetic studies. Nature Reviews Genetics, 15, 335–346.

Author information

Authors and Affiliations

Department of Psychology, University of Amsterdam, Weesperplein 4, 1018 XA, Amsterdam, The Netherlands
Eric-Jan Wagenmakers, Josine Verhagen, Alexander Ly, Marjan Bakker & Dora Matzke
University of California Irvine, Irvine, CA, USA
Michael D. Lee
University of Missouri, Columbia, MO, USA
Jeffrey N. Rouder
University of Groningen, Groningen, Netherlands
Richard D. Morey

Corresponding author

Correspondence to Eric-Jan Wagenmakers.

Additional information

Author Note

This work was supported by an ERC grant from the European Research Council. Correspondence concerning this article may be addressed to Eric-Jan Wagenmakers, University of Amsterdam, Department of Psychology, Weesperplein 4, 1018 XA Amsterdam, the Netherlands. Email address: EJ.Wagenmakers@gmail.com.

About this article

Cite this article

Wagenmakers, EJ., Verhagen, J., Ly, A. et al. A power fallacy. Behav Res 47, 913–917 (2015). https://doi.org/10.3758/s13428-014-0517-4

Issue Date: December 2015
