
On Detecting a Minimal Important Difference among Standardized Means

Published in Current Psychology.

Abstract

As generalizations of the standardized mean difference between two independent populations, two effect size measures have been proposed to represent the degree of disparity among several treatment groups. One index is the standard deviation of the standardized means; the other is the range of the standardized means. Despite the evident utility of these measures, the associated test procedures for detecting a minimal important difference among standardized means have not been well explicated. This article reviews and compares the two approaches to testing the hypothesis that treatments have negligible effects, rather than the conventional hypothesis of no difference. The primary emphasis is on revealing the underlying properties of the two methods with respect to power behavior and sample size requirements across a variety of design configurations. To enhance practical usefulness, a complete set of computer algorithms for calculating critical values, p-values, power levels, and sample sizes is also developed.
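The two indices described above can be sketched in code. This is a minimal illustration, not the article's implementation: it assumes the standardized means are the group means divided by a common standard deviation, and that the first index uses the population (divide-by-k) form of the standard deviation, as in Cohen's f convention; the article's exact scaling may differ.

```python
import math

def effect_size_indices(means, sd):
    """Two effect-size generalizations of the standardized mean
    difference for k treatment groups (illustrative sketch).

    means : sequence of per-group means
    sd    : common standard deviation across groups
    """
    delta = [m / sd for m in means]      # standardized means
    k = len(delta)
    dbar = sum(delta) / k
    # Index 1: standard deviation of the standardized means
    sd_index = math.sqrt(sum((d - dbar) ** 2 for d in delta) / k)
    # Index 2: range of the standardized means
    range_index = max(delta) - min(delta)
    return sd_index, range_index
```

For example, with means (0, 1, 2) and a common standard deviation of 1, the standardized means are 0, 1, 2, so the first index is sqrt(2/3) ≈ 0.816 and the second is 2. Both indices are zero exactly when all group means coincide, which is what makes them natural measures of overall disparity.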



Author information


Corresponding author

Correspondence to Gwowen Shieh.

Ethics declarations

Funding

The author has no support or funding to report.

Conflict of Interest

The author declares that he has no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by the author.

Electronic supplementary material

ESM 1

(DOCX 53 kb)

ESM 2

(DOCX 96 kb)


About this article


Cite this article

Shieh, G. On Detecting a Minimal Important Difference among Standardized Means. Curr Psychol 37, 640–647 (2018). https://doi.org/10.1007/s12144-016-9549-5

