P Value and the Theory of Hypothesis Testing: An Explanation for New Researchers

Clinical Orthopaedics and Related Research®

Abstract

In the 1920s, Ronald Fisher developed the theory behind the p value, and Jerzy Neyman and Egon Pearson developed the theory of hypothesis testing. These distinct theories have given researchers important quantitative tools to confirm or refute their hypotheses. The p value is the probability of obtaining an effect equal to or more extreme than the one observed, presuming the null hypothesis of no effect is true; it gives researchers a measure of the strength of evidence against the null hypothesis. As commonly used, investigators select a threshold p value below which they reject the null hypothesis. The theory of hypothesis testing allows researchers to reject a null hypothesis in favor of an alternative hypothesis of some effect. As commonly used, investigators choose Type I error (rejecting the null hypothesis when it is true) and Type II error (accepting the null hypothesis when it is false) levels and determine a critical region. If the test statistic falls into that critical region, the null hypothesis is rejected in favor of the alternative hypothesis. Despite their similarities, the p value and the theory of hypothesis testing are different theories that are often misunderstood and confused, leading researchers to improper conclusions. Perhaps the most common misconception is to consider the p value as the probability that the null hypothesis is true rather than the probability of obtaining the difference observed, or one more extreme, assuming the null hypothesis is true. Another concern is the risk that an important proportion of statistically significant results are falsely significant. Researchers should have a minimum understanding of these two theories so that they are better able to plan, conduct, interpret, and report scientific experiments.
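The three ideas in the abstract (the p value, the critical region, and the share of significant results that are falsely significant) can be illustrated with a minimal Python sketch. It is not taken from the article. It assumes a two-sided z test with a standard normal test statistic, and the function names, the observed statistic of 2.1, and the prior proportion of true effects are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

# Assumption: the test statistic is standard normal under the null hypothesis.
STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_sided_p_value(z: float) -> float:
    """Probability, under the null, of a statistic at least as extreme as |z|."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def reject_null(z: float, alpha: float = 0.05) -> bool:
    """Decision rule: reject when z falls in the two-sided critical region."""
    critical_value = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return abs(z) > critical_value


def false_positive_share(prior_true_effect: float, alpha: float, power: float) -> float:
    """Share of statistically significant results that are false positives.

    prior_true_effect is the (hypothetical) proportion of tested hypotheses
    for which a real effect exists; power is 1 minus the Type II error rate.
    """
    false_positives = alpha * (1.0 - prior_true_effect)
    true_positives = power * prior_true_effect
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    z_observed = 2.1  # hypothetical observed test statistic
    print(f"p value: {two_sided_p_value(z_observed):.4f}")
    print(f"reject at alpha = 0.05: {reject_null(z_observed)}")
    print(f"false positive share: {false_positive_share(0.1, 0.05, 0.8):.2f}")
```

The first function returns a continuous measure of evidence against the null hypothesis, whereas the second returns only a reject or do-not-reject decision at a prespecified Type I error level. Neither gives the probability that the null hypothesis is true. The third function shows why that distinction matters: under these assumed inputs, a sizable share of significant results can be false positives when few of the tested hypotheses correspond to real effects.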



Author information

Corresponding author

Correspondence to David Jean Biau MD.

Additional information

Each author certifies that he or she has no commercial associations (eg, consultancies, stock ownership, equity interest, patent/licensing arrangements, etc) that might pose a conflict of interest in connection with the submitted article.

About this article

Cite this article

Biau, D.J., Jolles, B.M. & Porcher, R. P Value and the Theory of Hypothesis Testing: An Explanation for New Researchers. Clin Orthop Relat Res 468, 885–892 (2010). https://doi.org/10.1007/s11999-009-1164-4


  • DOI: https://doi.org/10.1007/s11999-009-1164-4
