In the present Monte Carlo study, the empirical Type I error properties and power of several statistics for testing the homogeneity hypothesis in a one-way classification are examined in the case of small sample sizes. We compare these tests under several scenarios: normal populations with heterogeneous variances, nonnormal populations with homogeneous variances, nonnormal populations with heterogeneous variances, balanced and unbalanced sample sizes, and an increasing number of populations. Overall, none of the tests considered uniformly dominates the others. Under normality and variance heterogeneity, the Brown-Forsythe and Welch tests perform well over a wide range of parameter configurations, and the modified Brown-Forsythe test by Mehrotra generally keeps the nominal level; other tests may also perform well, depending on the constellation of the parameters under study. The Welch test becomes liberal when the sample sizes are small and the number of populations is large. We propose a modified version of Welch’s test that keeps the nominal level in these cases. With the understanding that methods are unacceptable if they have Type I error rates that are too high, only the testing procedure associated with the modified Brown-Forsythe test can be recommended for both normal and nonnormal data. Under normality, the modified Welch test can also be recommended.
meta-analysis; balanced and unbalanced sample sizes; homogeneous and heterogeneous variances; nonnormality
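As an illustration of the two classical statistics named in the abstract, the following is a minimal sketch of the Welch and Brown-Forsythe statistics with their approximate F degrees of freedom, written in their commonly cited textbook forms. The function names `welch_statistic` and `brown_forsythe_statistic` are hypothetical and chosen for this example only. The modified versions discussed in the abstract (Mehrotra's modification and the proposed modified Welch test) are not reproduced, because the abstract does not state their formulas.

```python
from statistics import mean, variance


def welch_statistic(groups):
    """Return (F, df1, df2) for Welch's heteroscedastic one-way test."""
    k = len(groups)
    n = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weights n_i / s_i^2; assumes every sample variance is positive.
    w = [n_i / variance(g) for n_i, g in zip(n, groups)]
    w_sum = sum(w)
    weighted_mean = sum(w_i * m for w_i, m in zip(w, means)) / w_sum
    between = sum(
        w_i * (m - weighted_mean) ** 2 for w_i, m in zip(w, means)
    ) / (k - 1)
    lam = sum((1 - w_i / w_sum) ** 2 / (n_i - 1) for w_i, n_i in zip(w, n))
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    return f_stat, k - 1, (k ** 2 - 1) / (3 * lam)


def brown_forsythe_statistic(groups):
    """Return (F*, df1, df2) for the Brown-Forsythe test of equal means."""
    k = len(groups)
    n = [len(g) for g in groups]
    total = sum(n)
    means = [mean(g) for g in groups]
    grand_mean = sum(n_i * m for n_i, m in zip(n, means)) / total
    numerator = sum(n_i * (m - grand_mean) ** 2 for n_i, m in zip(n, means))
    # Each group contributes (1 - n_i / N) * s_i^2 to the denominator.
    terms = [(1 - n_i / total) * variance(g) for n_i, g in zip(n, groups)]
    denominator = sum(terms)
    # Satterthwaite-type approximation for the denominator degrees of freedom.
    df2 = 1 / sum((t / denominator) ** 2 / (n_i - 1) for t, n_i in zip(terms, n))
    return numerator / denominator, k - 1, df2


if __name__ == "__main__":
    sample = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]  # made-up illustrative data
    print(welch_statistic(sample))
    print(brown_forsythe_statistic(sample))
```

For the made-up data above, the Welch statistic is 18/7 (about 2.571) on (2, 4) degrees of freedom and the Brown-Forsythe statistic is 3 on (2, 6) degrees of freedom. A p-value would be the upper tail of the corresponding F distribution, for example via `scipy.stats.f.sf`. Repeating the calculation over many data sets simulated under equal means and counting rejections would give an empirical Type I error estimate of the kind the study reports.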
Asiribo, O., Gurland, J. (1990). Coping with variance heterogeneity. Commun. Statist. Theory Meth., 19, 4029–4048.
Bockenhoff, A., Hartung, J. (1998). Some corrections of the significance level in meta-analysis. Biometrical Journal, 40, 937–947.
Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. Annals of Mathematical Statistics, 25, 290–403.
Brown, M. B., Forsythe, A. B. (1974). The small sample behavior of some statistics which test the equality of several means. Technometrics, 16, 129–132.
Chalmers, T. C. (1991). Problems induced by meta-analyses. Statistics in Medicine, 10, 971–980.
Cochran, W. G. (1937). Problems arising in the analysis of a series of similar experiments. J. Roy. Stat. Soc. Supp., 4, 102–118.
Conover, W. J., Iman, R. L. (1981). Rank transformations as a bridge between parametric and nonparametric statistics. American Statistician, 35, 124–129.
De Beuckelaer, A. (1996). A closer examination on some parametric alternatives to the ANOVA F-test. Statistical Papers, 37, 291–305.
Fligner, M. A. (1981). Comment on rank transformations as a bridge between parametric and nonparametric statistics. American Statistician, 35, 131–133.
Hardy, R. J., Thompson, S. G. (1998). Detecting and describing heterogeneity in meta-analysis. Statistics in Medicine, 17, 841–856.
Hartung, J., Knapp, G. (2000). On tests of the overall treatment effect in the meta-analysis with normally distributed responses. Statistics in Medicine, to appear.
James, G. S. (1951). The comparison of several groups of observations when the ratios of population variances are unknown. Biometrika, 38, 324–329.
Keselman, H. J., Wilcox, R. R. (1999). The ‘improved’ Brown and Forsythe test for mean equality: some things can’t be fixed. Commun. Statist. Simula., 28, 687–698.