ANOVA for unbalanced data: Use Type II instead of Type III sums of squares

Abstract

Methods for analyzing unbalanced factorial designs can be traced back to Yates (1934). Today, most major statistical programs perform, by default, unbalanced ANOVA based on Type III sums of squares (Yates's weighted squares of means). As criticized by Nelder and Lane (1995), this analysis is founded on unrealistic models—models with interactions, but without all corresponding main effects. The Type II analysis (Yates's method of fitting constants) is usually not preferred because of the underlying assumption of no interactions. This argument is, however, also founded on unrealistic models. Furthermore, by considering the power of the two methods, it is clear that Type II is preferable.
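
To make the two definitions concrete, the following is a minimal sketch of how the Type II and Type III sums of squares for one main effect can be computed as differences in residual sums of squares between nested linear models. It is an illustration under stated assumptions, not the article's own computation: the 2 × 2 data set is invented, the names `cells`, `rss`, `ss_a_type2`, and `ss_a_type3` are hypothetical, and the least-squares fit uses a small pure-Python solver for the normal equations rather than a statistics library.

```python
# Invented unbalanced 2 x 2 data: (level of A, level of B) -> observations.
cells = {
    (1, 1): [4.0, 5.0, 6.0, 5.0],
    (1, 2): [7.0, 9.0],
    (2, 1): [6.0, 8.0],
    (2, 2): [10.0, 11.0, 12.0, 11.0, 12.0],
}


def rss(columns, y):
    """Residual sum of squares after least-squares regression of y on columns."""
    p = len(columns)
    # Augmented normal equations [X'X | X'y].
    aug = [
        [sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(p)]
        + [sum(u * v for u, v in zip(columns[i], y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(aug[r][k]))
        aug[k], aug[pivot] = aug[pivot], aug[k]
        for r in range(p):
            if r != k:
                factor = aug[r][k] / aug[k][k]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[k])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    fitted = [sum(bk * col[n] for bk, col in zip(beta, columns))
              for n in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Effect (sum-to-zero) coding: level 1 -> +1, level 2 -> -1.
y, a, b = [], [], []
for (i, j), observations in cells.items():
    for value in observations:
        y.append(value)
        a.append(1.0 if i == 1 else -1.0)
        b.append(1.0 if j == 1 else -1.0)
one = [1.0] * len(y)
ab = [ai * bi for ai, bi in zip(a, b)]

# Type II for A: A adjusted for B, within the model without interaction.
ss_a_type2 = rss([one, b], y) - rss([one, a, b], y)

# Type III for A: A adjusted for B and A:B. The reduced model keeps the
# interaction column but drops the main-effect column for A.
ss_a_type3 = rss([one, b, ab], y) - rss([one, a, b, ab], y)

print("Type II  SS for A:", ss_a_type2)
print("Type III SS for A:", ss_a_type3)
```

The reduced model in the Type III line contains the interaction column without the main effect of A, which is the kind of model the abstract describes as unrealistic. With equal cell counts the two quantities coincide, so the distinction only matters for unbalanced data.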

References

  1. Brandt A.E. 1933. The analysis of variance in a 2 × s table with disproportionate frequencies. Journal of the American Statistical Association 28: 164–173.

  2. Elston R.C. and Bush N. 1964. The hypotheses that can be tested when there are interactions in an analysis of variance model. Biometrics 20: 681–698.

  3. Gallo P.P. 2000. Center-weighting issues in multicenter clinical trials. Journal of Biopharmaceutical Statistics 10: 145–163.

  4. Herr D.G. 1986. On the history of ANOVA in unbalanced, factorial designs: The first 30 years. The American Statistician 40: 265–270.

  5. Kempthorne O. 1975. Fixed and mixed models in the analysis of variance. Biometrics 38: 613–621.

  6. Lewsey J.D., Gardiner W.P., and Gettinby G. 1997. A study of simple unbalanced factorial designs that use type II and type III sums of squares. Communications in Statistics-Simulation and Computation 26: 1315–1328.

  7. Lewsey J.D., Gardiner W.P., and Gettinby G. 2001. A study of type II and type III power for testing hypotheses from unbalanced factorial designs. Communications in Statistics-Simulation and Computation 30: 597–609.

  8. Nelder J.A. 1977. A reformulation of linear models (with discussion). Journal of the Royal Statistical Society Series A 140: 48–77.

  9. Nelder J.A. 1994. The statistics of linear models: Back to basics (with discussion in vol. 5 (1995) 84–111). Statistics and Computing 4: 221–234.

  10. Nelder J.A. and Lane P.W. 1995. The computer analysis of factorial experiments: In memoriam-Frank Yates. The American Statistician 49: 382–385.

  11. Overall J.E. and Spiegel D.K. 1969. Concerning least squares analysis of experimental data. Psychological Bulletin 72: 311–322.

  12. Senn S. 1998. Some controversies in planning and analysing multicentre trials. Statistics in Medicine 17: 1753–1765.

  13. Senn S. 2000a. The many modes of meta. Drug Information Journal 34: 535–549.

  14. Senn S. 2000b. Consensus and controversy in pharmaceutical statistics. Journal of the Royal Statistical Society Series D-The Statistician 49: 135–156.

  15. Shaw R.G. and Mitchell-Olds T. 1993. ANOVA for unbalanced data: An overview. Ecology 74: 1638–1645.

  16. Speed F.M., Hocking R.R., and Hackney O.P. 1978. Methods of analysis of linear models with unbalanced data. Journal of the American Statistical Association 73: 105–112.

  17. Yates F. 1934. The analysis of multiple classifications with unequal numbers in the different classes. Journal of the American Statistical Association 29: 51–66.

Cite this article

Langsrud, Ø. ANOVA for unbalanced data: Use Type II instead of Type III sums of squares. Statistics and Computing 13, 163–167 (2003). https://doi.org/10.1023/A:1023260610025

Keywords

  • unbalanced factorial design
  • linear model
  • fixed effect
  • nonorthogonal
  • fitting constants
  • constraint