P Value and the Theory of Hypothesis Testing: An Explanation for New Researchers

Biau, David Jean; Jolles, Brigitte M.; Porcher, Raphaël

doi:10.1007/s11999-009-1164-4

Clinical Research
General
Published: 17 November 2009

Volume 468, pages 885–892, (2010)

Clinical Orthopaedics and Related Research®

David Jean Biau MD¹,
Brigitte M. Jolles MD, MSc² &
Raphaël Porcher PhD¹

Abstract

In the 1920s, Ronald Fisher developed the theory behind the p value, and Jerzy Neyman and Egon Pearson developed the theory of hypothesis testing. These distinct theories have provided researchers with important quantitative tools to confirm or refute their hypotheses. The p value is the probability of obtaining an effect equal to or more extreme than the one observed, presuming the null hypothesis of no effect is true; it gives researchers a measure of the strength of evidence against the null hypothesis. As commonly used, investigators select a threshold p value below which they reject the null hypothesis. The theory of hypothesis testing allows researchers to reject a null hypothesis in favor of an alternative hypothesis of some effect. As commonly used, investigators choose Type I error (rejecting the null hypothesis when it is true) and Type II error (accepting the null hypothesis when it is false) levels and determine a critical region. If the test statistic falls into that critical region, the null hypothesis is rejected in favor of the alternative hypothesis. Despite their similarities, the p value and the theory of hypothesis testing are different theories that are often misunderstood and confused, leading researchers to improper conclusions. Perhaps the most common misconception is to consider the p value as the probability that the null hypothesis is true rather than the probability of obtaining the difference observed, or one that is more extreme, assuming the null hypothesis is true. Another concern is the risk that a substantial proportion of statistically significant results are falsely significant. Researchers should have a minimum understanding of these two theories so that they are better able to plan, conduct, interpret, and report scientific experiments.
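The distinction drawn in the abstract can be made concrete with a small numerical sketch. The example below is illustrative only and is not taken from the article: it assumes a one-sided exact binomial test with hypothetical numbers (20 patients, 15 successes, a null success rate of 0.5, an alternative of 0.8, and a Type I error level of 0.05), and the function names are invented for this sketch. It computes a Fisher-style p value, a Neyman-Pearson critical region with its power, and, under a further assumed share of tested hypotheses that are truly non-null, the proportion of significant results that are falsely significant.

```python
from math import comb


def upper_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_value(n: int, p0: float, alpha: float) -> int:
    """Smallest c with P(X >= c | H0) <= alpha (one-sided critical region)."""
    for c in range(n + 2):  # c = n + 1 gives an empty tail, probability 0
        if upper_tail(n, c, p0) <= alpha:
            return c
    raise ValueError("alpha must be non-negative")


def false_positive_share(alpha: float, power: float, prior_true_effect: float) -> float:
    """Share of significant results that come from true null hypotheses."""
    false_pos = alpha * (1 - prior_true_effect)  # true nulls wrongly rejected
    true_pos = power * prior_true_effect         # real effects correctly detected
    return false_pos / (false_pos + true_pos)


if __name__ == "__main__":
    # Hypothetical numbers chosen for illustration, not data from the article.
    n, observed = 20, 15
    p0, p1 = 0.5, 0.8   # null and alternative success rates (assumed)
    alpha = 0.05        # Type I error level fixed before seeing the data

    # Fisher: the p value measures the evidence against H0 in the observed data.
    p_value = upper_tail(n, observed, p0)

    # Neyman-Pearson: fix alpha, derive the critical region, then decide.
    c = critical_value(n, p0, alpha)
    power = upper_tail(n, c, p1)  # 1 minus the Type II error under p1
    reject = observed >= c

    print(f"p value = {p_value:.4f}")
    print(f"critical region: X >= {c}, power = {power:.3f}, reject H0: {reject}")

    # Assume only 10% of tested hypotheses correspond to a real effect.
    share = false_positive_share(alpha=0.05, power=0.8, prior_true_effect=0.1)
    print(f"falsely significant share = {share:.2f}")
```

With these assumed inputs the p value is about 0.021 and the critical region is 15 or more successes, so both approaches lead to rejection here. The p value is a statement about the data under the null hypothesis, whereas alpha and power describe the long-run behavior of the decision rule; neither is the probability that the null hypothesis is true. The last calculation shows why that probability needs extra information: it depends on the assumed share of real effects among the hypotheses tested.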

Author information

Authors and Affiliations

Département de Biostatistique et Informatique Médicale, INSERM–UMR-S 717, AP-HP, Université Paris 7, Hôpital Saint Louis, 1, Avenue Claude-Vellefaux, 75475 Paris Cedex 10, France
David Jean Biau MD & Raphaël Porcher PhD
Hôpital Orthopédique, Département de l’Appareil Locomoteur, Centre Hospitalier Universitaire Vaudois, Université de Lausanne, Lausanne, Switzerland
Brigitte M. Jolles MD, MSc

Corresponding author

Correspondence to David Jean Biau MD.

Additional information

Each author certifies that he or she has no commercial associations (eg, consultancies, stock ownership, equity interest, patent/licensing arrangements, etc) that might pose a conflict of interest in connection with the submitted article.

About this article

Cite this article

Biau, D.J., Jolles, B.M. & Porcher, R. P Value and the Theory of Hypothesis Testing: An Explanation for New Researchers. Clin Orthop Relat Res 468, 885–892 (2010). https://doi.org/10.1007/s11999-009-1164-4

Received: 11 February 2009
Accepted: 02 November 2009
Published: 17 November 2009
Issue Date: March 2010
DOI: https://doi.org/10.1007/s11999-009-1164-4
