The p-value Case, a Review of the Debate: Issues and Plausible Remedies

Pauli, Francesco

doi:10.1007/978-3-319-73906-9_9

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 227)

Included in the following conference series:

Convegno della Società Italiana di Statistica

Abstract

We review the recent debate on the lack of reliability of scientific results and its connections to the statistical methodologies at the core of the discovery paradigm. Null hypothesis significance testing (NHST), in particular, has often been related to, if not blamed for, the present situation. We argue that a loose relation exists: although NHST, if properly used, cannot be seen as a cause, some common misuses may mask or even favour bad practices that lead to unreliable results. We discuss various proposals that have been put forward to deal with these issues.

Acknowledgements

This work was supported by the University of Trieste within the FRA project “Politiche strutturali e riforme. Analisi degli indicatori e valutazione degli effetti”.

Author information

Authors and Affiliations

DEAMS, University of Trieste, Trieste, Italy
Francesco Pauli

Corresponding author

Correspondence to Francesco Pauli.

Editor information

Editors and Affiliations

Dipartimento di Scienze Economiche e Statistiche, Università degli Studi di Salerno, Fisciano, Salerno, Italy
Cira Perna
Dipartimento di Economia e Management, Università degli Studi di Pisa, Pisa, Italy
Monica Pratesi
Toulouse School of Economics, University of Toulouse, Toulouse Cedex 6, France
Anne Ruiz-Gazen

Cite this paper

Pauli, F. (2018). The p-value Case, a Review of the Debate: Issues and Plausible Remedies. In: Perna, C., Pratesi, M., Ruiz-Gazen, A. (eds) Studies in Theoretical and Applied Statistics. SIS 2016. Springer Proceedings in Mathematics & Statistics, vol 227. Springer, Cham. https://doi.org/10.1007/978-3-319-73906-9_9

DOI: https://doi.org/10.1007/978-3-319-73906-9_9
Published: 02 April 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73905-2
Online ISBN: 978-3-319-73906-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
