Abstract
Since its introduction, null hypothesis significance testing (NHST) has provoked much debate, and many publications have catalogued common misunderstandings of it. Despite these cautions, NHST remains one of the most prevalent, misused, and abused statistical procedures in the biomedical literature. This article is directed at practicing researchers with limited statistical background who are driven by subject-matter questions and have empirical data to analyze. For didactic purposes, we use a dialogue in the manner of ancient Greek literature. We illustrate a few of the many irritations that can arise when a researcher with minimal statistical background, but a good sense of what she wants her study to do and of what she wants to do with her study, consults a statistician. We provide insights into the meaning of several concepts, including null and alternative hypotheses, one- and two-sided null hypotheses, statistical models, test statistics, rejection and acceptance regions, type I and type II errors, p values, and the frequentist concept of endless study repetitions.
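The frequentist idea of endless study repetitions can be approximated by simulation. The Python sketch below is not taken from the article. It assumes a two-sided z test with a known standard deviation, and the function names, sample size, effect size, and number of repetitions are all illustrative choices. It repeats the same hypothetical study many times and counts how often the p value falls below the chosen alpha level, first when the null hypothesis is true (the type I error rate) and then when a specific alternative is true (the power, whose complement is the type II error rate).

```python
import math
import random


def two_sided_p_value(sample, null_mean, sigma):
    """Two-sided p value of a z test of H0: mean == null_mean, with sigma known."""
    n = len(sample)
    # Test statistic: standardized distance of the sample mean from the null value.
    z = (sum(sample) / n - null_mean) / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def rejection_rate(true_mean, null_mean=0.0, sigma=1.0, n=25,
                   alpha=0.05, repetitions=20000, seed=1):
    """Share of simulated study repetitions in which H0 is rejected at level alpha."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rejections = 0
    for _ in range(repetitions):
        # One hypothetical repetition of the study: n normal observations.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if two_sided_p_value(sample, null_mean, sigma) < alpha:
            rejections += 1
    return rejections / repetitions


# The null hypothesis is true: the rejection rate estimates the type I error rate.
type_one_error_rate = rejection_rate(true_mean=0.0)

# An assumed alternative is true (mean 0.5): the rejection rate estimates the power.
power = rejection_rate(true_mean=0.5)
type_two_error_rate = 1 - power

print("type I error rate:", type_one_error_rate)
print("power:", power)
print("type II error rate:", type_two_error_rate)
```

Under these assumptions the simulated type I error rate should lie close to the chosen alpha. That rate is a long-run property of the testing procedure over repetitions, not a statement about any single study's result.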
Acknowledgments
We would like to thank Sander Greenland PhD, Department of Epidemiology & Department of Statistics, University of California, Los Angeles, for helpful comments and suggestions.
Cite this article
Stang, A., Poole, C. The researcher and the consultant: a dialogue on null hypothesis significance testing. Eur J Epidemiol 28, 939–944 (2013). https://doi.org/10.1007/s10654-013-9861-4
DOI: https://doi.org/10.1007/s10654-013-9861-4