Computational Statistics, Volume 28, Issue 3, pp 1269–1297

Testing homogeneity of variances with unequal sample sizes

  • I. Parra-Frutos
Original Paper


Abstract

When sample sizes are unequal, the variables given by the absolute deviations from the median become heteroscedastic. This paper studies how the best-known heteroscedastic alternatives to the ANOVA F test perform when they are applied to these variables. This procedure tests homoscedasticity in a manner similar to Levene’s (1960) test. The difference is that the ANOVA method used by Levene’s test is not robust against unequal variances of the parent populations, and Levene’s variables may be heteroscedastic. The adjustment proposed by O’Neill and Mathews (Aust Nz J Stat 42:81–100, 2000) is approximated by the Keyes and Levy (J Educ Behav Stat 22:227–236, 1997) adjustment and used to ensure the correct null hypothesis of homoscedasticity. Structural zeros, as defined by Hines and O’Hara Hines (Biometrics 56:451–454, 2000), are eliminated. To reduce the error introduced by the approximate distribution of the test statistics, estimated critical values are used. Simulation results show that, after applying the Keyes–Levy adjustment, using estimated critical values, and removing structural zeros, the heteroscedastic tests perform better than Levene’s test. In particular, Brown–Forsythe’s test controls the Type I error rate in all situations considered, although it is slightly less powerful than Welch’s, James’s, and Alexander and Govern’s tests, which perform well except in highly asymmetric distributions, where they are moderately liberal.
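The core idea of the abstract, applying a heteroscedastic one-way test to the absolute deviations from the group medians, can be illustrated with a minimal Python sketch. It is not the paper's procedure: the Keyes–Levy adjustment, the removal of structural zeros, and the estimated critical values are all omitted, and the helper names `abs_dev_from_median` and `welch_anova` as well as the simulated data are assumptions made for illustration. The sketch only contrasts the classical ANOVA F test on these variables (the median-based Levene test) with Welch's (1951) test on the same variables.

```python
import numpy as np
from scipy import stats


def abs_dev_from_median(groups):
    """Levene-type variables z_ij = |x_ij - median_i|, one array per group."""
    return [np.abs(np.asarray(g, dtype=float) - np.median(g)) for g in groups]


def welch_anova(samples):
    """Welch's (1951) heteroscedastic one-way test; returns (F, df1, df2, p)."""
    k = len(samples)
    n = np.array([len(s) for s in samples], dtype=float)
    means = np.array([np.mean(s) for s in samples])
    variances = np.array([np.var(s, ddof=1) for s in samples])
    w = n / variances                      # precision weights n_i / s_i^2
    w_sum = w.sum()
    grand = np.sum(w * means) / w_sum      # variance-weighted grand mean
    lam = np.sum((1.0 - w / w_sum) ** 2 / (n - 1.0))
    numerator = np.sum(w * (means - grand) ** 2) / (k - 1)
    denominator = 1.0 + 2.0 * (k - 2) / (k ** 2 - 1) * lam
    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3.0 * lam)
    return f_stat, df1, df2, stats.f.sf(f_stat, df1, df2)


# Hypothetical unbalanced design: three groups, the third with a larger spread.
rng = np.random.default_rng(0)
groups = [rng.normal(0, 1, 8), rng.normal(0, 1, 15), rng.normal(0, 3, 30)]

z = abs_dev_from_median(groups)

# Classical ANOVA F on z: this is the median-based Levene test.
levene_F, levene_p = stats.f_oneway(*z)

# Welch's test on the same variables (no adjustment, no zero removal).
welch_F, df1, df2, welch_p = welch_anova(z)

print(f"ANOVA F on |x - median|: F = {levene_F:.3f}, p = {levene_p:.4f}")
print(f"Welch on |x - median|:   F = {welch_F:.3f}, df2 = {df2:.1f}, p = {welch_p:.4f}")
```

The two tests differ only in how the within-group variability of the absolute deviations is pooled, which is where unequal sample sizes matter.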


Keywords

Homoscedasticity tests; Levene’s test; Bartlett’s test; Welch’s test; Brown and Forsythe’s test; James’s second-order test; Alexander and Govern’s test; Monte Carlo simulation; Small samples; Estimated critical values; Structural zeros



Acknowledgments

The author is sincerely grateful to two anonymous referees and the Associate Editor for their time and effort in providing very constructive, helpful and valuable comments and suggestions that have led to a substantial improvement in the quality of the paper.


References

  1. Akritas MG, Papadatos N (2004) Heteroscedastic one-way ANOVA and lack-of-fit tests. J Am Stat Assoc 99:368–382
  2. Alexander RA, Govern DM (1994) A new and simpler approximation for ANOVA under variance heterogeneity. J Educ Stat 19:91–101
  3. Bartlett MS (1937) Properties of sufficiency and statistical tests. Proc R Soc Lond A Mat A 160:268–282
  4. Bathke A (2004) The ANOVA F test can still be used in some balanced designs with unequal variances and nonnormal data. J Stat Plan Inference 126:413–422
  5. Boos DD, Brownie C (1989) Bootstrap methods for testing homogeneity of variances. Technometrics 31:69–82
  6. Boos DD, Brownie C (2004) Comparing variances and other measures of dispersion. Stat Sci 19:571–578
  7. Box GEP (1954) Some theorems on quadratic forms applied in the study of analysis of variance problems. I. Effect of inequality of variance in the one-way classification. Ann Math Stat 25:290–302
  8. Bradley JV (1978) Robustness? Br J Math Stat Psychol 31:144–152
  9. Brown MB, Forsythe AB (1974a) Robust tests for equality of variances. J Am Stat Assoc 69:364–367
  10. Brown MB, Forsythe AB (1974b) The small sample behavior of some statistics which test the equality of several means. Technometrics 16:129–132
  11. Cahoy DO (2010) A bootstrap test for equality of variances. Comput Stat Data Anal 54:2306–2316
  12. Carroll RJ, Schneider H (1985) A note on Levene’s test for equality of variances. Stat Probab Lett 3:191–194
  13. Charway H, Bailer AJ (2007) Testing multiple-group variance equality with randomization procedures. J Stat Comput Simul 77:797–803
  14. Clinch JJ, Keselman HJ (1982) Parametric alternatives to the analysis of variance. J Educ Stat 7:207–214
  15. Cochran WG (1954) Some methods for strengthening the common \(\chi ^{2}\)-tests. Biometrics 10:417–451
  16. Conover WJ, Johnson ME, Johnson MM (1981) A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics 23:351–361
  17. De Beuckelaer A (1996) A closer examination on some parametric alternatives to the ANOVA F-test. Stat Pap 37:291–305
  18. Dijkstra JB, Werter PS (1981) Testing the equality of several means when the population variances are unequal. Commun Stat B-Simul 10:557–569
  19. Glass GV, Peckham PD, Sanders JR (1972) Consequences of failure to meet assumptions underlying the fixed effects analyses of variance and covariance. Rev Educ Res 42:237–288
  20. Harwell MR, Rubinstein EN, Hayes WS, Olds CC (1992) Summarizing Monte Carlo results in methodological research: the one- and two-factor fixed effects ANOVA cases. J Educ Stat 17:315–339
  21. Hui W, Gel YR, Gastwirth JL (2008) lawstat: an R package for law, public policy and biostatistics. J Stat Softw 28.
  22. Hines WGS, O’Hara Hines RJ (2000) Increased power with modified forms of the Levene (Med) test for heterogeneity of variance. Biometrics 56:451–454
  23. Hsiung T, Olejnik S, Huberty CJ (1994) Comment on a Wilcox test statistic for comparing means when variances are unequal. J Educ Stat 19:111–118
  24. Iachine I, Petersen HC, Kyvik KO (2010) Robust tests for the equality of variances for clustered data. J Stat Comput Simul 80:365–377
  25. James GS (1951) The comparison of several groups of observations when the ratios of the population variances are unknown. Biometrika 38:324–329
  26. Johansen S (1980) The Welch-James approximation of the distribution of the residual sum of squares in weighted linear regression. Biometrika 67:85–92
  27. Kenny DA, Judd CM (1986) Consequences of violating the independence assumption in analysis of variance. Psychol Bull 99:422–431
  28. Keselman HJ, Rogan JC, Feir-Walsh BJ (1977) An evaluation of some nonparametric and parametric tests for location equality. Br J Math Stat Psychol 30:213–221
  29. Keselman HJ, Wilcox RR, Algina J, Othman AR, Fradette K (2008) A comparative study of robust tests for spread: Asymmetric trimming strategies. Br J Math Stat Psychol 61:235–253
  30. Keyes TK, Levy MS (1997) Analysis of Levene’s test under design imbalance. J Educ Behav Stat 22:227–236
  31. Layard MWJ (1973) Robust large-sample tests for homogeneity of variances. J Am Stat Assoc 68:195–198
  32. Levene H (1960) Robust tests for equality of variances. In: Olkin I, Ghurye SG, Hoeffding W, Madow WG, Mann HB (eds) Contributions to probability and statistics. Essays in Honor of Harold Hotelling. Stanford University Press, Palo Alto, p 292
  33. Lim TS, Loh WY (1996) A comparison of tests of equality of variances. Comput Stat Data Anal 22:287–301
  34. Lix LM, Keselman JC, Keselman HJ (1996) Consequences of assumption violations revisited, a quantitative review of alternatives to the one-way analysis of variance F test. Rev Educ Res 66:579–619
  35. Loh WY (1987) Some modifications of Levene’s test of variance homogeneity. J Stat Comput Simul 28:213–226
  36. Markowski CA, Markowski EP (1990) Conditions for the effectiveness of a preliminary test of variance. Am Stat 44:322–326
  37. Mehrotra DV (1997) Improving the Brown-Forsythe solution to the generalized Behrens-Fisher problem. Commun Stat-Simul 26:1139–1145
  38. Neuhäuser M (2007) A comparative study of nonparametric two-sample tests after Levene’s transformation. J Stat Comput Simul 77:517–526
  39. Neuhäuser M, Hothorn LA (2000) Location-scale and scale trend tests based on Levene’s transformation. Comput Stat Data Anal 33:189–200
  40. Noguchi K, Gel YR (2010) Combination of Levene-type tests and a finite-intersection method for testing equality of variances against ordered alternatives. J Nonparametr Stat 22:897–913
  41. O’Neill ME, Mathews K (2000) A weighted least squares approach to Levene’s test of homogeneity of variance. Aust Nz J Stat 42:81–100
  42. Oshima TC, Algina J (1992) Type I error rates for James’s second-order test and Wilcox’s Hm test under heteroscedasticity and non-normality. Br J Math Stat Psychol 45:255–263
  43. Parra-Frutos I (2009) The behaviour of the modified Levene’s test when data are not normally distributed. Comput Stat 24:671–693
  44. Rogan JC, Keselman HJ (1977) Is the ANOVA F-test robust to variance heterogeneity when samples sizes are equal? Am Educ Res J 14:493–498
  45. Rubin AS (1983) The use of weighted contrasts in analysis of models with heterogeneity of variance. P Bus Eco Stat Am Stat Assoc 347–352
  46. Scheffé H (1959) The analysis of variance. Wiley, New York
  47. Schneider PJ, Penfield DA (1997) Alexander and Govern’s approximation, providing an alternative to ANOVA under variance heterogeneity. J Exp Educ 65:271–286
  48. Siegel S, Tukey JW (1960) A nonparametric sum of ranks procedure for relative spread in unpaired samples. J Am Stat Assoc 55:429–444 (corrections appear in vol. 56:1005)
  49. Welch BL (1951) On the comparison of several mean values, an alternative approach. Biometrika 38:330–336
  50. Wilcox RR (1988) A new alternative to the ANOVA F and new results on James’ second-order method. Br J Math Stat Psychol 41:109–117
  51. Wilcox RR (1989) Adjusting for unequal variances when comparing means in one-way and two-way fixed effects ANOVA models. J Educ Stat 14:269–278
  52. Wilcox RR (1990) Comparing the means of two independent groups. Biom J 32:771–780
  53. Wilcox RR (1995) ANOVA: a paradigm for low power and misleading measures of effect size? Rev Educ Res 65:51–77
  54. Wilcox RR (1997) A bootstrap modification of the Alexander-Govern ANOVA method, plus comments on comparing trimmed means. Educ Psychol Meas 57:655–665
  55. Wilcox RR, Charlin VI, Thompson KL (1986) New Monte Carlo results on the robustness of the ANOVA F, W and F-statistics. Commun Stat Simul C 15:933–943
  56. Wludyka P, Sa P (2004) A robust I-sample analysis of means type randomization test for variances for unbalanced designs. J Stat Comput Simul 74:701–726

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  1. Department of Quantitative Methods for Economics and Business, Economics and Business School, University of Murcia, Murcia, Spain
