The Impact of APA and AERA Guidelines on Effect Size Reporting
Given the long history of effect size (ES) indices (Olejnik & Algina, Contemporary Educational Psychology, 25, 241–286, 2000) and various attempts by APA and AERA to encourage the reporting and interpretation of ES to supplement findings from inferential statistical analyses, it is essential to document the impact of the APA and AERA standards on ES reporting practices. In this paper, we investigated that impact by examining findings from 31 published reviews and our own review of 451 articles published in 2009 and 2010. The 32 reviews were divided into two periods: before and after 1999. A total of 116 journals were reviewed. Findings from these 32 reviews revealed that, since 1999, ES reporting has improved in its rate, variety, interpretation, confidence intervals, and fullness. Yet several inadequate practices persist: (1) the dominance of Cohen’s d and the unadjusted R²/η², (2) the mere labeling of ES, (3) the under-reporting of confidence intervals, and (4) a lack of integration between ES and statistical tests. The paper concludes with Internet resources and recommendations for improving ES reporting practices.
Keywords: Effect size · Impact · Statistical test · R² · Cohen’s d · η² · Meta-analysis · Review
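The abstract’s criticism of the unadjusted R²/η² can be made concrete: both indices overestimate the population effect in small samples, and a shrinkage adjustment (e.g., Ezekiel’s formula for R²) corrects for this. The sketch below is our own illustration, not code from the paper; the function names are ours.

```python
import math

# Illustrative sketch (our own, not from the paper): two quantities the
# reviews track -- Cohen's d and the Ezekiel-adjusted R^2.

def cohens_d(x, y):
    """Standardized mean difference between two groups, using a pooled SD."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)  # sum of squared deviations, group x
    ssy = sum((v - my) ** 2 for v in y)  # sum of squared deviations, group y
    s_pooled = math.sqrt((ssx + ssy) / (nx + ny - 2))
    return (mx - my) / s_pooled

def adjusted_r2(r2, n, p):
    """Ezekiel shrinkage: R^2 adjusted for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

With n = 30 cases and p = 3 predictors, an observed R² of .50 shrinks to roughly .44, which is why the reviews flag unadjusted values as overstatements of the population effect.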
- Algina, J., Keselman, H. J., & Penfield, R. D. (2006). Confidence intervals for an effect size when variances are not equal. Journal of Modern Applied Statistical Methods, 5, 2–13. Retrieved from http://www.jmasm.com.
- American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: American Psychological Association.
- American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association.
- Cochran-Smith, M., & Zeichner, K. M. (Eds.). (2005). Studying teacher education: the report of the AERA Panel on Research and Teacher Education. Mahwah, NJ: Lawrence Erlbaum.
- Cohen, J. (1965). Some statistical issues in psychological research. In B. B. Wolman (Ed.), Handbook of clinical psychology (pp. 95–121). New York: McGraw-Hill.
- Cohen, J. (1969). Statistical power analysis for the behavioral sciences. New York: Academic Press.
- Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
- Cohen, J., & Cohen, P. (1975). Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.
- Dunleavy, E. M., Barr, C. D., Glenn, D. M., & Miller, K. R. (2006). Effect size reporting in applied psychology: how are we doing? The Industrial-Organizational Psychologist, 43(4), 29–37. Retrieved from http://www.openj-gate.com/browse/Archive.aspx?year=2009&Journal_id=102632.
- Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.
- Fidler, F., Cumming, G., Thomason, N., Pannuzzo, D., Smith, J., Fyffe, P., & Schmitt, R. (2005). Evaluating the effectiveness of editorial policy to improve statistical practice: the case of the Journal of Consulting and Clinical Psychology. Journal of Consulting and Clinical Psychology, 73, 136–143. doi: 10.1037/0022-006X.73.1.136.
- Friedman, H. (1968). Magnitude of experimental effect and a table for its rapid estimation. Psychological Bulletin, 70, 245–251. doi: 10.1037/h0026258.
- Fritz, C. O., Morris, P. E., & Richler, J. J. (2012). Effect size estimates: current use, calculations, and interpretation. Journal of Experimental Psychology: General, 141, 2–18. doi: 10.1037/a0024338.
- Grissom, R. J., & Kim, J. J. (2012). Effect sizes for research: univariate and multivariate applications (2nd ed.). New York: Routledge.
- Harrison, J., Thompson, B., & Vannest, K. J. (2009). Interpreting the evidence for effective interventions to increase the academic performance of students with ADHD: relevance of the statistical significance controversy. Review of Educational Research, 79, 740–775. doi: 10.3102/0034654309331516.
- Hays, W. L. (1963). Statistics for psychologists. New York: Holt, Rinehart & Winston.
- Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
- Hess, M. R., & Kromrey, J. D. (2004). Robust confidence intervals for effect sizes: a comparative study of Cohen's d and Cliff's delta under non-normality and heterogeneous variances. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA.
- Hogarty, K. Y., & Kromrey, J. D. (2001, April). We've been reporting some effect sizes: can you guess what they mean? Paper presented at the annual meeting of the American Educational Research Association, Seattle, WA.
- Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: correcting error and bias in research findings. Thousand Oaks, CA: SAGE Publications.
- Jitendra, A. K., Griffin, C. C., Haria, P., Leh, J., Adams, A., & Kaduvettoor, A. (2007). A comparison of single and multiple strategy instruction on third-grade students’ mathematical problem solving. Journal of Educational Psychology, 99, 115–127. doi: 10.1037/0022-0663.99.1.115.
- Kelley, K. (2005). The effects of nonnormal distributions on confidence intervals around the standardized mean difference: bootstrap and parametric confidence intervals. Educational and Psychological Measurement, 65, 51–69. doi: 10.1177/0013164404264850.
- Keppel, G. (1973). Design and analysis: a researcher's handbook. Englewood Cliffs, NJ: Prentice-Hall.
- Keselman, H. J., Huberty, C. J., Lix, L. M., Olejnik, S., Cribbie, R. A., Donahue, B., Levin, J. R. (1998). Statistical practices of educational researchers: an analysis of their ANOVA, MANOVA, and ANCOVA analyses. Review of Educational Research, 68, 350–386. doi: 10.3102/00346543068003350.
- Kirk, R. E. (1996). Practical significance: a concept whose time has come. Educational and Psychological Measurement, 56, 746–759. doi: 10.1177/0013164496056005002.
- Kromrey, J. D., & Coughlin, K. B. (2007, November). ROBUST_ES: a SAS macro for computing robust estimates of effect size. Paper presented at the annual meeting of the SouthEast SAS Users Group, Hilton Head, SC. Retrieved from http://analytics.ncsu.edu/sesug/2007/PO19.pdf.
- Lipsey, M. W., & Wilson, D. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
- Lipsey, M. W., Puzio, K., Yun, C., Hebert, M. A., Steinka-Fry, K., Cole, M. W., Roberts, M., Anthony, K. S., & Busick, M. D. (2012). Translating the statistical representation of the effects of education interventions into more readily interpretable forms (NCSER 2013–3000). Washington, DC: National Center for Special Education Research, Institute of Education Sciences, US Department of Education.
- McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111, 361–365. doi: 10.1037/0033-2909.111.2.361.
- Meline, T., & Wang, B. (2004). Effect reporting practices in AJSLP and other ASHA journals, 1999–2003. American Journal of Speech-Language Pathology, 13, 202–207. Retrieved from http://ajslp.asha.org/.
- Neyman, J. (1937). Outline of a theory of statistical estimation based on the classical theory of probability. Philosophical Transactions of the Royal Society of London. Series A, 236, 333–380. Retrieved from http://rstl.royalsocietypublishing.org/.
- Pearson, K. (1905). Mathematical contributions to the theory of evolution: XIV. On the general theory of skew correlations and nonlinear regression (Draper’s Company Research Memoirs, Biometric Series II). London: Dulau.
- Peng, C.-Y. J., & Chen, L.-T. (2013). Beyond Cohen's d: alternative effect size measures for between-subject designs. The Journal of Experimental Education (in press).
- Peng, C.-Y., Chen, L.-T., Chiang, H.-M., & Chiang, Y.-C. (2013). The impact of APA and AERA guidelines on effect size reporting. Educational Psychology Review. doi: 10.1007/s10648-013-9218-2.
- Rosenthal, R. (1994). Parametric measures of effect size. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis. New York: Russell Sage Foundation.
- Staudte, R. G., & Sheather, S. J. (1990). Robust estimation and testing. New York: Wiley.
- Steiger, J. H., & Fouladi, R. T. (1997). Noncentrality interval estimation and the evaluation of statistical models. In L. Harlow, S. Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests? (pp. 221–257). Hillsdale, NJ: Erlbaum.
- Thompson, B. (1999). Improving research clarity and usefulness with effect size indices as supplements to statistical significance tests. Exceptional Children, 65, 329–337. Retrieved from http://journals.cec.sped.org/ec/.
- Thompson, B. (2006). Foundations of behavioral statistics: an insight-based approach. New York: Guilford.
- Vacha-Haase, T., & Nilsson, J. E. (1998). Statistical significance reporting: current trends and usages in MECD. Measurement and Evaluation in Counseling and Development, 31, 46–57. Retrieved from http://mec.sagepub.com.
- Wilcox, R. R. (2005). Introduction to robust estimation and hypothesis testing (2nd ed.). San Diego, CA: Elsevier Academic Press.
- Yin, P., & Fan, X. (2001). Estimating R² shrinkage in multiple regression: a comparison of different analytical methods. The Journal of Experimental Education, 69, 203–224. doi: 10.1080/00220970109600656.