Abstract
A general approach to analyzing categorical data with missing values is described and illustrated. The method is based on generalized linear models with composite links. Among other applications, the approach can be used to fill in contingency tables with supplementary margins, fit loglinear models when data are missing, fit latent class models (with or without missing data on observed variables), fit models with fused cells (including many models from genetics), and fill in tables or fit models to data when variables are more finely categorized for some cases than for others. Both Newton-like and EM methods are easy to implement for parameter estimation.
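As a rough illustration of the first application listed above, the following sketch fills in a two-way table that has a supplementary row margin (cases whose column category is missing) using EM. It is not taken from the paper: the function name `em_fill_table`, the saturated multinomial model, and the assumption that the missingness is ignorable are all choices made here for illustration. The matrix `C` shows the composite-link idea in its simplest form, where an observed margin is a linear collapse of the underlying cell means.

```python
import numpy as np


def em_fill_table(full, row_margin, n_iter=200, tol=1e-10):
    """EM for a saturated multinomial model on a two-way table.

    full       : counts for cases classified on both variables
    row_margin : counts for cases classified on the row variable only
    Assumes ignorable missingness (an assumption of this sketch).
    Returns the estimated cell probabilities and the filled-in table.
    """
    full = np.asarray(full, dtype=float)
    row_margin = np.asarray(row_margin, dtype=float)
    n_total = full.sum() + row_margin.sum()

    # Start from uniform cell probabilities.
    p = np.full(full.shape, 1.0 / full.size)
    filled = full.copy()
    for _ in range(n_iter):
        # E-step: split each supplementary row count across columns
        # in proportion to the current conditional column probabilities.
        cond = p / p.sum(axis=1, keepdims=True)
        filled = full + row_margin[:, None] * cond
        # M-step: for the saturated model, re-estimate by proportions.
        p_new = filled / n_total
        converged = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if converged:
            break
    return p, filled


# Hypothetical data: a 2x2 table plus 20 cases with the column missing.
p, filled = em_fill_table([[20, 10], [5, 15]], [10, 10])

# Composite-link view: the row margin is a linear collapse C @ mu of the
# cell means, where C sums the cells within each row.
C = np.kron(np.eye(2), np.ones((1, 2)))
row_probs = C @ p.ravel()
```

For this saturated case the EM iterations should converge to the closed-form estimate, in which each row's total probability uses all cases and the within-row split uses only the fully classified cases. A restricted loglinear model would replace the M-step with a model fit to the filled-in table.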
Additional information
The author thanks the editor, the reviewers, Laurie Hopp Rindskopf, and Clifford Clogg for comments and suggestions that substantially improved the paper.
Cite this article
Rindskopf, D. A general approach to categorical data analysis with missing data, using generalized linear models with composite links. Psychometrika 57, 29–42 (1992). https://doi.org/10.1007/BF02294657
DOI: https://doi.org/10.1007/BF02294657