Should “multiple imputations” be treated as “multiple indicators”?
Rubin's “multiple imputation” approach to missing data creates synthetic data sets, in which each missing variable is replaced by a draw from its predictive distribution, conditional on the observed data. By construction, analyses of such filled-in data sets as if the imputations were true values have the correct expectations for population parameters. In a recent paper, Mislevy showed how this approach can be applied to estimate the distributions of latent variables from complex samples. Multiple imputations for a latent variable bear a surface similarity to classical “multiple indicators” of a latent variable, as might be addressed in structural equation modelling or hierarchical modelling of successive stages of random sampling. This note demonstrates with a simple example why analyzing “multiple imputations” as if they were “multiple indicators” does not generally yield correct results; they must instead be analyzed by means concordant with their construction.
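The contrast drawn in the abstract can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, not the example worked in the note itself: the normal model, the error variance, the function names, and the use of parallel indicators with unit loadings are all hypothetical choices made for illustration. A latent variable θ ~ N(0, 1) is observed only through x = θ + e, and imputations are drawn from the posterior of θ given x. Treating each imputation as a true value targets the population variance of θ, whereas treating the imputations as parallel indicators identifies the latent variance with the covariance between imputations, which is smaller.

```python
import random
import statistics


def draw_imputations(n=20000, error_var=1.0, n_imputations=5, seed=0):
    """Simulate theta ~ N(0, 1), x = theta + e, and draw imputations of theta.

    Assumed model (illustrative only): e ~ N(0, error_var), so the posterior
    of theta given x is N(rho * x, 1 - rho) with rho = 1 / (1 + error_var).
    """
    rng = random.Random(seed)
    rho = 1.0 / (1.0 + error_var)      # reliability of x
    post_sd = (1.0 - rho) ** 0.5       # posterior standard deviation
    imputations = [[] for _ in range(n_imputations)]
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)
        x = theta + rng.gauss(0.0, error_var ** 0.5)
        post_mean = rho * x
        for column in imputations:
            column.append(rng.gauss(post_mean, post_sd))
    return imputations


def sample_covariance(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cross = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    return cross / (len(a) - 1)


def compare_analyses(imputations):
    """Return (imputation-based, indicator-based) estimates of Var(theta)."""
    # Multiple-imputation analysis: compute the statistic on each filled-in
    # data set as if the draws were true values, then average the results.
    mi_variance = statistics.fmean(
        statistics.variance(column) for column in imputations
    )
    # "Multiple indicators" analysis: with parallel indicators and unit
    # loadings, the common-factor variance equals the covariance between
    # any two indicators, so average the pairwise covariances.
    pairs = [
        (i, j)
        for i in range(len(imputations))
        for j in range(i + 1, len(imputations))
    ]
    indicator_variance = statistics.fmean(
        sample_covariance(imputations[i], imputations[j]) for i, j in pairs
    )
    return mi_variance, indicator_variance


mi_variance, indicator_variance = compare_analyses(draw_imputations())
print(f"imputations analyzed as true values: {mi_variance:.3f}")
print(f"imputations analyzed as indicators:  {indicator_variance:.3f}")
```

Under these assumptions the first estimate should fall close to the true variance of 1, while the second should fall close to the reliability ρ = 0.5, because the imputations for a respondent share only the posterior mean ρx and their independent posterior noise is misread as measurement error.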
Key words: multiple imputation, multiple indicators, National Assessment of Educational Progress, plausible values
- Beaton, A. E. (1987). The NAEP 1983/84 technical report (NAEP Report 15-TR-20). Princeton: Educational Testing Service.
- Beaton, A. E. (1988). The NAEP 1985/86 technical report (NAEP Report 17-TR-20). Princeton: Educational Testing Service.
- Goldstein, H. (1987). Multilevel models in educational and social research. London: Griffin; New York: Oxford University Press.
- Johnson, E. G., & Zwick, R. (1990). The NAEP 1987/88 technical report (NAEP Report 19-TR-20). Princeton: Educational Testing Service.
- Jöreskog, K. G., & Sörbom, D. (1989). LISREL 7: User's reference guide. Mooresville, IN: Scientific Software.
- Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. Psychometrika, 56, 177–196.
- Mislevy, R. J., Beaton, A. E., Kaplan, B., & Sheehan, K. M. (1992). Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29, 133–161.
- Muthén, B. (1988). LISCOMP [computer program]. Mooresville, IN: Scientific Software.
- Rubin, D. B. (1977). Formalizing subjective notions about the effect of nonrespondents in sample surveys. Journal of the American Statistical Association, 72, 538–543.
- Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.