Abstract
A method for dealing with the problem of missing observations in multivariate data is developed and evaluated. The method uses a transformation of the principal components of the data to estimate missing entries. The properties of this method and four alternative methods are investigated by means of a Monte Carlo study of 42 computer-generated data matrices. The methods are compared with respect to their ability to predict correlation matrices as well as missing entries.
The results indicate that whenever modest intercorrelations exist among the variables (i.e., an average off-diagonal correlation above .2), the proposed method is at least as good as the best alternative (a regression method) while being considerably faster and simpler computationally. Models for determining the best alternative based upon easily calculated characteristics of the matrix are given. The generality of these models is demonstrated using the previously published results of Timm.
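As a rough illustration of the kind of procedure the abstract describes, the sketch below fills missing entries from a low-rank principal-component approximation and computes the average off-diagonal correlation used as the .2 threshold. It is not the authors' algorithm: the abstract does not specify the transformation of the principal components, so the iterative rank-k refit used here is a generic stand-in, and the function names, the `rank`, `n_iter`, and `tol` parameters, and the use of pairwise-complete correlations are all assumptions made for illustration.

```python
import numpy as np


def mean_offdiag_corr(X):
    """Average off-diagonal correlation from pairwise-complete observations.

    Assumption: the plain (signed) mean is used; the abstract does not say
    whether signs are kept or absolute values are taken.
    """
    X = np.asarray(X, dtype=float)
    n_vars = X.shape[1]
    vals = []
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            # Use only rows where both variables are observed.
            ok = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            vals.append(np.corrcoef(X[ok, i], X[ok, j])[0, 1])
    return float(np.mean(vals))


def impute_low_rank(X, rank=1, n_iter=50, tol=1e-8):
    """Fill NaNs by repeatedly refitting a rank-`rank` SVD approximation.

    Hypothetical helper: a generic principal-component imputation, not the
    specific estimator evaluated in the paper.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column-mean imputation.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        # Best rank-`rank` least-squares approximation (Eckart and Young).
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        # Observed entries are never altered; only missing cells are updated.
        new = np.where(missing, approx, X)
        change = np.max(np.abs(new - filled))
        filled = new
        if change < tol:
            break
    return filled
```

A caller might check `mean_offdiag_corr(X) > 0.2` before choosing this kind of imputation, mirroring the condition stated in the abstract; the choice of `rank` is left to the user here, whereas the paper may determine it differently.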
References
Anderson, T. W. Maximum likelihood estimates for a multivariate normal distribution when some observations are missing. Journal of the American Statistical Association, 1957, 52, 200–203.
Buck, S. F. A method of estimation of missing values in multivariate data suitable for use with an electronic computer. Journal of the Royal Statistical Society, Series B, 1960, 22, 302–307.
Christofferson, A. A method for component analysis when the data are incomplete. Seminar communication, University Institute of Statistics, Uppsala, 1965.
Dear, R. E. A principal-component missing data method for multiple regression models. System Development Corporation, Technical Report SP-86, 1959.
Eckart, C. and Young, G. The approximation of one matrix by another of lower rank. Psychometrika, 1936, 1, 211–218.
Edgett, G. L. Multiple regression with missing observations among the independent variables. Journal of the American Statistical Association, 1956, 51, 122–132.
Glasser, M. Linear regression analysis with missing observations among the independent variables. Journal of the American Statistical Association, 1964, 59, 834–844.
Gleason, T. C. and Staelin, R. Improving the metric quality of questionnaire data. Psychometrika, 1973, 393–410.
Haitovsky, Y. Missing data in regression analysis. Journal of the Royal Statistical Society, Series B, 1968, 30, 67–82.
Horn, J. L. A rationale and test for the number of factors in factor analysis. Psychometrika, 1965, 30, 179–185.
Johnson, R. M. On a theorem stated by Eckart and Young. Psychometrika, 1963, 28, 259–264.
Srivastava, J. N. and McDonald, L. On a large class of incomplete multivariate models which can be transformed to make manova applicable. Metron, 1970, 28, 241–252.
Staelin, R. and Gleason, T. C. On the quality of principle components. American Marketing Association Combined Conference Proceedings Spring and Fall 1972, B. W. Becker and H. Becker (Eds.), 34, 484–488.
Timm, N. H. The estimation of variance-covariance and correlation matrices from incomplete data. Psychometrika, 1970, 35, 417–438.
Trawinski, I. M. and Bargmann, R. E. Maximum likelihood estimation with incomplete multivariate data. Annals of Mathematical Statistics, 1964, 35, 647–657.
Walsh, J. E. Computer-feasible method for handling incomplete data in regression analysis. Journal of the Association for Computing Machinery, 1961, 18, 201–211.
Wilks, S. S. Moments and distributions of estimates of population parameters from fragmentary samples. Annals of Mathematical Statistics, 1932, 3, 163–195.
Wold, H. Nonlinear estimation by iterative least squares procedures. In F. N. David (Ed.), Festschrift Jerzy Neyman. Wiley: New York, 1966.
Additional information
This is an extension and elaboration of a paper read at the Spring 1973 meetings of the Psychometric Society. We wish to express our appreciation to Timothy McGuire for his helpful comments.
About this article
Cite this article
Gleason, T.C., Staelin, R. A proposal for handling missing data. Psychometrika 40, 229–252 (1975). https://doi.org/10.1007/BF02291569
DOI: https://doi.org/10.1007/BF02291569