On structural equation modeling with data that are not missing completely at random
 Bengt Muthén,
 David Kaplan,
 Michael Hollis
Abstract
A general latent variable model is given which includes the specification of a missing data mechanism. This framework allows for an elucidating discussion of existing general multivariate theory bearing on maximum likelihood estimation with missing data. Here, missing completely at random is not a prerequisite for unbiased estimation in large samples, as it is when using the traditional listwise or pairwise present data approaches. The theory is connected with old and new results in the area of selection and factorial invariance. It is pointed out that in many applications, maximum likelihood estimation with missing data may be carried out by existing structural equation modeling software, such as LISREL and LISCOMP. Several sets of artificial data are generated within the general model framework. The proposed estimator is compared to the two traditional ones and found superior.
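The contrast between listwise deletion and maximum likelihood can be illustrated with a small simulation. The sketch below is not the article's latent variable model or its artificial data sets; it assumes a plain bivariate normal pair (x, y) in which y is recorded only when the always-observed x falls below a cutoff, so the data are missing at random but not completely at random. The function names, sample size, correlation, and cutoff are all hypothetical choices made for illustration. For this monotone pattern, the maximum likelihood estimate of the mean of y has a closed form: the complete-case mean of y, adjusted by the complete-case regression slope times the gap between the full-sample and complete-case means of x.

```python
import numpy as np


def simulate(n=20000, rho=0.6, cutoff=0.5, seed=0):
    """Draw standard bivariate normal (x, y) with correlation rho.

    y is treated as observed only when x < cutoff (an assumed
    missing-at-random mechanism; x itself is always observed).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    observed = x < cutoff
    return x, y, observed


def listwise_mean_y(x, y, observed):
    """Mean of y using complete cases only (listwise deletion)."""
    return y[observed].mean()


def ml_mean_y(x, y, observed):
    """Closed-form ML estimate of E[y] for a monotone missing pattern.

    Regress y on x among complete cases, then evaluate the fitted
    line at the mean of x computed from the full sample.
    """
    xc, yc = x[observed], y[observed]
    slope = np.cov(xc, yc)[0, 1] / xc.var(ddof=1)
    return yc.mean() + slope * (x.mean() - xc.mean())


if __name__ == "__main__":
    x, y, observed = simulate()
    print("true mean of y:      0.0")
    print("listwise estimate:  ", round(listwise_mean_y(x, y, observed), 3))
    print("ML estimate:        ", round(ml_mean_y(x, y, observed), 3))
```

Under these assumptions the listwise estimate is pulled well below the true mean of zero because the retained cases over-represent low values of x, while the maximum likelihood estimate stays close to zero in large samples.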
 Title
 On structural equation modeling with data that are not missing completely at random
 Journal

Psychometrika
Volume 52, Issue 3, pp 431–462
 Cover Date
 1987-09-01
 DOI
 10.1007/BF02294365
 Print ISSN
 0033-3123
 Online ISSN
 1860-0980
 Publisher
 Springer-Verlag
 Keywords

 maximum likelihood
 ignorability
 selectivity
 factor analysis
 factorial invariance
 LISREL
 Authors

 Bengt Muthén (1)
 David Kaplan (1)
 Michael Hollis (2)
 Author Affiliations

 1. Graduate School of Education, University of California, Los Angeles, CA 90024
 2. Graduate School of Architecture and Urban Planning, University of California, Los Angeles