Abstract
Confidence interval (CI) estimation for an effect size (ES) provides a range of possible population ESs supported by data. In this article, we investigated the noncentral t method, Bonett’s method, and the bias-corrected and accelerated (BCa) bootstrap method for constructing CIs when a standardized linear contrast of means is defined as an ES. The noncentral t method assumes normality and equal variances, Bonett’s method assumes only normality, and the BCa bootstrap method makes no assumptions. We simulated data for three and four groups from a variety of populations (one normal and five nonnormal) with varied variance ratios (1, 2.25, 4, 8), population ESs (0, 0.2, 0.5, 0.8), and sample size patterns (one equal and two unequal). Results showed that the noncentral method performed the best among the three methods under the joint condition of ES = 0 and equal variances. Performance of the noncentral method was comparable to that of the other two methods under (1) equal sample size, unequal weight for each group, and the last group sampled from a leptokurtic distribution, or (2) equal sample size and equal weight for all groups, when all are sampled from a normal population, or only the last group sampled from a nonnormal distribution. In the remaining conditions, Bonett’s and the BCa bootstrap methods performed better than the noncentral method. The BCa bootstrap method is the method of choice when the sample size per group is 30 or more. Findings from this study have implications for simultaneous comparisons of means and of ranked means in between- and within-subjects designs.
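The two quantities at the center of the abstract can be sketched concretely. What follows is a rough illustration, not the authors’ code: the standardized linear contrast ES divides a weighted combination of group means by the square root of the unweighted average of the within-group variances (the standardizer Bonett, 2008, uses), and a BCa bootstrap interval adjusts the percentile endpoints with a bias-correction term z0 and a jackknife-based acceleration term a. The function names, the choice to resample within each group independently, and the leave-one-out jackknife over all observations are assumptions of this sketch.

```python
import numpy as np
from statistics import NormalDist


def std_contrast(groups, weights):
    """Standardized linear contrast: sum(c_j * mean_j) / sigma_hat,
    where sigma_hat^2 is the unweighted average of the group variances
    (the standardizer used by Bonett, 2008)."""
    means = np.array([g.mean() for g in groups])
    variances = np.array([g.var(ddof=1) for g in groups])
    return np.asarray(weights) @ means / np.sqrt(variances.mean())


def bca_ci(groups, weights, level=0.95, n_boot=2000, rng=None):
    """BCa bootstrap CI for the standardized contrast (illustrative sketch)."""
    rng = np.random.default_rng(rng)
    theta_hat = std_contrast(groups, weights)

    # Bootstrap distribution: resample within each group independently.
    boot = np.empty(n_boot)
    for b in range(n_boot):
        resampled = [rng.choice(g, size=len(g), replace=True) for g in groups]
        boot[b] = std_contrast(resampled, weights)

    nd = NormalDist()
    # Bias correction z0: based on the share of bootstrap values below theta_hat.
    prop = np.clip(np.mean(boot < theta_hat), 1 / n_boot, 1 - 1 / n_boot)
    z0 = nd.inv_cdf(prop)

    # Acceleration a: leave-one-out jackknife over every observation.
    jack = []
    for j, g in enumerate(groups):
        for i in range(len(g)):
            loo = [np.delete(g, i) if k == j else groups[k]
                   for k in range(len(groups))]
            jack.append(std_contrast(loo, weights))
    jack = np.array(jack)
    d = jack.mean() - jack
    a = (d ** 3).sum() / (6 * (d ** 2).sum() ** 1.5)

    # BCa-adjusted percentile endpoints.
    z_lo = nd.inv_cdf((1 - level) / 2)
    z_hi = -z_lo

    def adjusted(z):
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    lo, hi = np.quantile(boot, [adjusted(z_lo), adjusted(z_hi)])
    return lo, hi
```

For example, with three groups and contrast weights (1, −0.5, −0.5), `std_contrast` returns the mean of group 1 minus the average of the other two means, in standardizer units; `bca_ci` then brackets that estimate without assuming normality or equal variances, which is the property the simulations above probe.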
References
Algina, J., Keselman, H. J., & Penfield, R. D. (2005a). An alternative to Cohen’s standardized mean difference effect size: A robust parameter and confidence interval in the two independent groups case. Psychological Methods, 10, 317–328. doi:10.1037/1082-989X.10.3.317
Algina, J., Keselman, H. J., & Penfield, R. D. (2005b). Effect sizes and their intervals: The two-level repeated measures case. Educational and Psychological Measurement, 65, 241–258. doi:10.1177/0013164404268675
Barker, N. (2005). A practical introduction to the bootstrap using the SAS system. Proceedings of SAS conference: Phuse. Retrieved from http://www.lexjansen.com/phuse/2005/pk/pk02.pdf
Bird, K. D. (2002). Confidence intervals for effect sizes in analysis of variance. Educational and Psychological Measurement, 62, 197–226. doi:10.1177/0013164402062002001
Bonett, D. G. (2008). Confidence intervals for standardized linear contrasts of means. Psychological Methods, 13, 99–109. doi:10.1037/1082-989X.13.2.99
Bonett, D. G., & Price, R. M. (2002). Statistical inference for a linear function of medians: Confidence intervals, hypothesis testing, and sample size requirements. Psychological Methods, 7, 370–383. doi:10.1037/1082-989x.7.3.370
Bradley, J. V. (1978). Robustness? British Journal of Mathematical & Statistical Psychology, 31, 144–152. doi:10.1111/j.2044-8317.1978.tb00581.x
Chen, L.-T., & Peng, C.-Y. J. (2013). Constructing confidence intervals for effect sizes in ANOVA designs. Journal of Modern Applied Statistical Methods, 12, 82–104.
Cliff, N. (1993). Dominance statistics: Ordinal analyses to answer ordinal questions. Psychological Bulletin, 114, 494–509. doi:10.1037/0033-2909.114.3.494
Cohen, J. (1969). Statistical power analysis for the behavioral sciences. New York: Academic Press.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale: Erlbaum.
Cumming, G., & Finch, S. (2001). A primer on the understanding, use, and calculation of confidence intervals that are based on central and noncentral distributions. Educational and Psychological Measurement, 61, 532–574. doi:10.1177/0013164401614002
Deng, N., Allison, J. J., Fang, H. J., Ash, A. S., & Ware, J. E. (2013). Using the bootstrap to establish statistical significance for relative validity comparisons among patient-reported outcome measures. Health and Quality of Life Outcomes, 11, 89. doi:10.1186/1477-7525-11-89
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.
Fleishman, A. I. (1978). A method for simulating non-normal distributions. Psychometrika, 43, 521–532.
Harwell, M. (1997). An empirical study of Hedges’s homogeneity test. Psychological Methods, 2, 219–231. doi:10.1037//1082-989x.2.2.219
Harwell, M. R., Rubinstein, E. N., Hayes, W. S., & Olds, C. C. (1992). Summarizing Monte Carlo results in methodological research: The one- and two-factor fixed effects ANOVA cases. Journal of Educational and Behavioral Statistics, 17, 315–339. doi:10.3102/10769986017004315
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego: Academic Press.
Hess, M. R., & Kromrey, J. D. (2004). Robust confidence intervals for effect sizes: A comparative study of Cohen’s d and Cliff’s delta under non-normality and heterogeneous variance. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA.
Kelley, K. (2005). The effects of nonnormal distributions on confidence intervals around the standardized mean difference: Bootstrap and parametric confidence intervals. Educational and Psychological Measurement, 65, 51–69. doi:10.1177/0013164404264850
Keselman, H. J., Algina, J., Lix, L. M., Wilcox, R. R., & Deering, K. N. (2008). A generally robust approach for testing hypotheses and setting confidence intervals for effect sizes. Psychological Methods, 13, 110–129. doi:10.1037/1082-989x.13.2.110
Keselman, H. J., Huberty, C. J., Lix, L. M., Olejnik, S., Cribbie, R. A., Donahue, B., Kowalchuk, R. K., Lowman, L. L., Petoskey, M. D., Keselman, J. C., & Levin, J. R. (1998). Statistical practices of educational researchers: An analysis of their ANOVA, MANOVA, and ANCOVA analyses. Review of Educational Research, 68, 350–386. doi:10.3102/00346543068003350
Kirk, R. E. (2013). Experimental design: Procedures for the behavioral sciences (4th ed.). Thousand Oaks: Sage.
Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association.
Kratochwill, T. R., & Levin, J. R. (1992). Single-case research design and analysis: New directions for psychology and education. Hillsdale, NJ: Erlbaum.
Odgaard, E. C., & Fowler, R. L. (2010). Confidence intervals for effect sizes: Compliance and clinical significance in the Journal of Consulting and Clinical Psychology. Journal of Consulting and Clinical Psychology, 78, 287–297. doi:10.1037/a0019294
Peng, C.-Y. J., & Chen, L.-T. (2014). Beyond Cohen's d: Alternative effect size measures for between-subject designs. The Journal of Experimental Education, 82, 22–50. doi:10.1080/00220973.2012.745471
Peng, C.-Y. J., Chen, L.-T., Chiang, H.-M., & Chiang, Y.-C. (2013). The impact of APA and AERA guidelines on effect size reporting. Educational Psychology Review, 25, 157–209. doi:10.1007/s10648-013-9218-2
Ramsey, P. H., Barrera, K., Hachimine-Semprebom, P., & Liu, C.-C. (2011). Pairwise comparisons of means under realistic nonnormality, unequal variances, outliers and equal sample sizes. Journal of Statistical Computation and Simulation, 81, 125–135. doi:10.1080/00949650903219935
Snijders, T. A. B., & Bosker, R. J. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Thousand Oaks: Sage.
Steiger, J. H. (2004). Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychological Methods, 9, 164–182. doi:10.1037/1082-989X.9.2.164
Steiger, J. H., & Fouladi, R. T. (1997). Noncentrality interval estimation and the evaluation of statistical models. In L. L. Harlow, S. A. Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests? (pp. 221–257). Mahwah, NJ: Lawrence Erlbaum.
Stuart, A., & Ord, J. K. (1994). Kendall’s advanced theory of statistics (Vol. I, 6th ed.). London: Arnold.
Thompson, B. (2002). What future quantitative social science research could look like: Confidence intervals for effect sizes. Educational Researcher, 31(3), 25–32. doi:10.3102/0013189X031003025
Thompson, B. (2008). Computing and interpreting effect sizes, confidence intervals, and confidence intervals for effect sizes. In J. W. Osborne (Ed.), Best Practices in Quantitative Methods (pp. 246–262). Thousand Oaks: Sage.
Viechtbauer, W. (2007). Approximate confidence intervals for standardized effect sizes in the two-independent and two-dependent samples design. Journal of Educational and Behavioral Statistics, 32, 39–60. doi:10.3102/1076998606298034
Wilcox, R. R. (2005). Introduction to robust estimation and hypothesis testing (2nd ed.). San Diego, CA: Academic Press.
Acknowledgments
This research was supported in part by the Maris M. Proffitt and Mary Higgins Proffitt Endowment Grant of Indiana University, awarded to the second author while the first author worked on the project as a research assistant. We thank the editor, two reviewers, and Po-Ju Wu for their insightful comments on an earlier version of the manuscript.
Cite this article
Chen, L.-T., & Peng, C.-Y. J. The sensitivity of three methods to nonnormality and unequal variances in interval estimation of effect sizes. Behavior Research Methods, 47, 107–126 (2015). https://doi.org/10.3758/s13428-014-0461-3