Analysis of unbalanced mixed model data: A case study comparison of ANOVA versus REML/GLS

Authors: Ramon C. Littell, Department of Statistics, Institute of Food and Agricultural Sciences, University of Florida

Editor’s Invited Paper

DOI: 10.1198/108571102816

Cite this article as: Littell, R.C. JABES (2002) 7: 472. doi:10.1198/108571102816
Abstract

A major transition has occurred in recent years in statistical methods for the analysis of linear mixed model data, from analysis of variance (ANOVA) to likelihood-based methods. Prior to the early 1990s, most applications used some version of analysis of variance because computer software for likelihood-based methods was either not available or not easy to use. ANOVA is based on ordinary least squares computations, with adaptations for mixed models. Computer programs for such methodology were plagued with technical problems of estimability, weighting, and handling missing data. Likelihood-based methods mainly use a combination of residual maximum likelihood (REML) estimation of covariance parameters and generalized least squares (GLS) estimation of mean parameters. Software for REML/GLS methods became readily available early in the 1990s, but the methodology is still not universally embraced. Although many of the computational inadequacies have been overcome, conceptual problems remain. Also, technical problems with REML/GLS have emerged, such as the need for adjustments for effects due to estimating covariance parameters. This article attempts to identify the major problems with ANOVA, describe the problems that remain with REML/GLS, and discuss new problems with REML/GLS.
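As a rough illustration of the GLS step named in the abstract, the sketch below computes the GLS estimate of the overall mean in an unbalanced one-way random effects model, y_ij = mu + a_i + e_ij, using the standard formula (X'V^{-1}X)^{-1} X'V^{-1}y. The model, the function name `gls_mean`, and the example data are illustrative assumptions and are not taken from the article. The variance components are treated as known here; in a REML/GLS analysis they would first be estimated by REML.

```python
import numpy as np

def gls_mean(groups, sigma2_a, sigma2_e):
    """GLS estimate of mu in y_ij = mu + a_i + e_ij (hypothetical helper).

    groups   : list of lists, one list of observations per random-effect level
    sigma2_a : variance of the random group effect (assumed known here)
    sigma2_e : residual variance (assumed known here)
    """
    y = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    n = len(y)

    # V is block diagonal: sigma2_a * J + sigma2_e * I within each group.
    V = np.zeros((n, n))
    start = 0
    for g in groups:
        k = len(g)
        V[start:start + k, start:start + k] = (
            sigma2_a * np.ones((k, k)) + sigma2_e * np.eye(k)
        )
        start += k

    # With X a column of ones, the GLS estimate is (1'V^{-1}y) / (1'V^{-1}1).
    w = np.linalg.solve(V, np.ones(n))  # V^{-1} 1, using symmetry of V
    return float(w @ y / w.sum())

# Unbalanced example data (made up for illustration): group sizes 4 and 1.
groups = [[1.0, 2.0, 3.0, 4.0], [10.0]]
ols = float(np.mean(np.concatenate(groups)))       # ordinary least squares mean
gls = gls_mean(groups, sigma2_a=1.0, sigma2_e=1.0)  # weights depend on V
print(ols, gls)
```

With unbalanced data the two estimates differ because GLS weights each group mean by the inverse of its variance, sigma2_a + sigma2_e / n_i, whereas the OLS mean weights every observation equally. When sigma2_a is zero the two coincide.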

Key Words: Likelihood; Linear models; Random effects


© International Biometric Society 2002