Comparing Effect Sizes in Follow-Up Studies: ROC Area, Cohen's d, and r

Law and Human Behavior

Abstract

In order to facilitate comparisons across follow-up studies that have used different measures of effect size, we provide a table of effect size equivalencies for the three most common measures: ROC area (AUC), Cohen's d, and r. We outline why AUC is the preferred measure of predictive or diagnostic accuracy in forensic psychology or psychiatry, and we urge researchers and practitioners to use numbers rather than verbal labels to characterize effect sizes.

Author information

Corresponding author

Correspondence to Marnie E. Rice.

Additional information

Strictly speaking, d values pertain only to variables scored on an interval scale. When the nondichotomous variable is ordinally scaled, r or AUC should be used. Nevertheless, the values in Table 1 allow one to compare the relative magnitudes across studies that have reported any of the three effect size measures.
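
As an illustration of how equivalencies among the three measures can be computed, the sketch below uses textbook conversion formulas. It is not taken from the article and is not claimed to reproduce Table 1. It assumes normally distributed scores with equal variances in the two outcome groups, and a 50% base rate by default when converting between d and the point-biserial r. The function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def auc_from_d(d):
    """AUC = Phi(d / sqrt(2)), assuming normal scores with equal variances."""
    return _STD_NORMAL.cdf(d / math.sqrt(2.0))


def d_from_auc(auc):
    """Inverse of auc_from_d: d = sqrt(2) * Phi^-1(AUC), for 0 < AUC < 1."""
    return math.sqrt(2.0) * _STD_NORMAL.inv_cdf(auc)


def r_from_d(d, base_rate=0.5):
    """Point-biserial r from d: r = d / sqrt(d^2 + 1 / (p * (1 - p))).

    base_rate (p) is the proportion of cases in one outcome group;
    the default of 0.5 is an assumption of this sketch.
    """
    p = base_rate
    return d / math.sqrt(d * d + 1.0 / (p * (1.0 - p)))


def d_from_r(r, base_rate=0.5):
    """Inverse of r_from_d: d = r / sqrt(p * (1 - p) * (1 - r^2))."""
    p = base_rate
    return r / math.sqrt(p * (1.0 - p) * (1.0 - r * r))


if __name__ == "__main__":
    # Print the AUC and r implied by a few illustrative d values.
    for d in (0.2, 0.5, 0.8):
        print(f"d={d:.2f}  AUC={auc_from_d(d):.3f}  r={r_from_d(d):.3f}")
```

Because the value of r implied by a given d depends on the base rate, passing a base_rate other than 0.5 changes the d-to-r conversion but leaves the d-to-AUC conversion untouched. When the assumptions stated above do not hold, these conversions are only approximate.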

About this article

Cite this article

Rice, M.E., Harris, G.T. Comparing Effect Sizes in Follow-Up Studies: ROC Area, Cohen's d, and r. Law Hum Behav 29, 615–620 (2005). https://doi.org/10.1007/s10979-005-6832-7


  • DOI: https://doi.org/10.1007/s10979-005-6832-7
