Bias and precision of some classical ANOVA effect sizes when assumptions are violated
 Susan Troncoso Skidmore,
 Bruce Thompson
Abstract
Previous simulation research has focused on how violations of analytic assumptions affect statistics related to the F test and the associated \( p_{\mathrm{CALCULATED}} \) values. The present article evaluated the bias of classical estimates of practical significance (i.e., the sample effect size estimators \( {\widehat{\eta}^2} \), \( {\widehat{\varepsilon}^2} \), and \( {\widehat{\omega}^2} \)) in a one-way between-subjects univariate ANOVA when assumptions are violated. The simulation conditions were selected on the basis of prior empirical research. For each of the k = 2, 3, and 4 group analyses, (1) sampling error bias and (2) precision were estimated for each of the three effect size estimates, using the 5,000 samples drawn in each of the 270 conditions (5 parameter Cohen's d values × 3 group size ratios × 3 population distribution shapes × 3 variance ratios × 2 total ns). Our results corroborate the limited previous related research and suggest that \( {\widehat{\eta}^2} \) should not be used as an ANOVA effect size estimator, even though \( {\widehat{\eta}^2} \) is the only available choice in the menus of most commonly available software.
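To make the three estimators concrete, the sketch below computes them from raw scores using their standard textbook definitions: \( {\widehat{\eta}^2} = SS_{between}/SS_{total} \), \( {\widehat{\varepsilon}^2} = (SS_{between} - df_{between} MS_{within})/SS_{total} \), and \( {\widehat{\omega}^2} = (SS_{between} - df_{between} MS_{within})/(SS_{total} + MS_{within}) \). It is not the authors' simulation code. The function names, the equal-variance normal populations, the group size, and the replication count are all illustrative assumptions, and the small Monte Carlo helper covers only the null case (population effect of zero), where the mean estimate equals the bias.

```python
import random
from statistics import fmean


def anova_effect_sizes(groups):
    """Return eta^2, epsilon^2 and omega^2 for a one-way between-subjects design.

    `groups` is a list of lists of scores, one inner list per group.
    Assumes at least two groups, more scores than groups, and some
    variability in the scores (so SS_total is not zero).
    """
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    k = len(groups)
    grand_mean = fmean(all_scores)

    # Sums of squares for the one-way layout.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    ss_total = ss_between + ss_within

    df_between = k - 1
    ms_within = ss_within / (n_total - k)

    # Epsilon^2 and omega^2 subtract the expected error contribution
    # from the between-groups sum of squares; eta^2 does not.
    adjusted = ss_between - df_between * ms_within
    return {
        "eta2": ss_between / ss_total,
        "epsilon2": adjusted / ss_total,
        "omega2": adjusted / (ss_total + ms_within),
    }


def mean_estimates_under_null(k=3, n_per_group=10, reps=2000, seed=1):
    """Average each estimator over `reps` samples with no population effect.

    Hypothetical helper: all k groups are drawn from the same N(0, 1)
    population, so any nonzero average is sampling error bias.
    """
    rng = random.Random(seed)
    totals = {"eta2": 0.0, "epsilon2": 0.0, "omega2": 0.0}
    for _ in range(reps):
        groups = [
            [rng.gauss(0.0, 1.0) for _ in range(n_per_group)] for _ in range(k)
        ]
        for name, value in anova_effect_sizes(groups).items():
            totals[name] += value
    return {name: total / reps for name, total in totals.items()}
```

For the three groups `[1, 2, 3]`, `[2, 3, 4]`, and `[5, 6, 7]`, `anova_effect_sizes` gives \( {\widehat{\eta}^2} = 0.8125 \), \( {\widehat{\varepsilon}^2} = 0.75 \), and \( {\widehat{\omega}^2} \approx 0.727 \). Under the null, the expected value of \( {\widehat{\eta}^2} \) is \( (k-1)/(N-1) \) rather than zero, which is the kind of positive bias the abstract refers to.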
 Journal
 Behavior Research Methods
 Volume 45, Issue 2, pp 536–546
 Cover Date
 2013-06-01
 DOI
 10.3758/s13428-012-0257-2
 Online ISSN
 1554-3528
 Publisher
 Springer-Verlag
 Keywords

 Effect size
 Practical significance
 Analysis of variance
 Homogeneity of variance
 Type I error
 Power
 Eta squared
 Epsilon squared
 Omega squared
 Authors

 Susan Troncoso Skidmore (1)
 Bruce Thompson (2)
 Author Affiliations

 1. Sam Houston State University, Huntsville, TX, USA
 2. Texas A&M University, College Station, TX, USA