A general approach to categorical data analysis with missing data, using generalized linear models with composite links

Psychometrika

Abstract

A general approach to analyzing categorical data with missing values is described and illustrated. The method is based on generalized linear models with composite links. Among other applications, the approach can be used to fill in contingency tables with supplementary margins, fit loglinear models when data are missing, fit latent class models (with or without missing data on observed variables), fit models with fused cells (including many models from genetics), and fill in tables or fit models when variables are more finely categorized for some cases than for others. Both Newton-like and EM methods are easy to implement for parameter estimation.
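As a rough illustration of the idea, and not the article's own code, the sketch below fills in a 2×2 table that has a supplementary margin, using EM on a composite-link formulation. Everything specific here is an assumption made for the example: the counts are invented, the names `C`, `X`, `fit_poisson_loglinear`, and `em_composite_link` are hypothetical, and the latent model (a saturated A×B association plus a single response-pattern effect) is just one convenient choice. The observed means are taken to be sums of latent cell means, mu = C · exp(X · beta), which is the composite-link structure described in the abstract.

```python
import numpy as np

# Hypothetical data: a 2x2 table (A x B) of fully classified counts, plus a
# supplementary margin of cases for which only A was observed.
full = np.array([20.0, 10.0, 5.0, 15.0])   # cells (1,1), (1,2), (2,1), (2,2)
supp = np.array([10.0, 10.0])              # A = 1 and A = 2, with B missing
y = np.concatenate([full, supp])           # the six observed counts

# Latent table: 4 (A, B) cells x 2 response patterns (B seen / B missing).
# C maps latent cell means gamma to observed means: mu = C @ gamma.
C = np.zeros((6, 8))
C[:4, :4] = np.eye(4)   # fully classified cells are observed directly
C[4, 4:6] = 1.0         # supplementary A = 1 sums latent cells (1,1), (1,2)
C[5, 6:8] = 1.0         # supplementary A = 2 sums latent cells (2,1), (2,2)

# Assumed log-linear design for the latent cells: one indicator per (A, B)
# cell, plus one indicator for the "B missing" response pattern.
X = np.hstack([np.vstack([np.eye(4), np.eye(4)]),
               np.repeat([0.0, 1.0], 4)[:, None]])


def fit_poisson_loglinear(z, X, n_iter=50, tol=1e-10):
    """M-step: Poisson log-linear fit to (fractional) counts z by Newton steps."""
    beta = np.linalg.lstsq(X, np.log(z + 0.5), rcond=None)[0]
    for _ in range(n_iter):
        gamma = np.exp(X @ beta)
        step = np.linalg.solve(X.T @ (gamma[:, None] * X), X.T @ (z - gamma))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def em_composite_link(y, C, X, n_iter=500, tol=1e-10):
    """EM for y ~ Poisson(C @ exp(X @ beta))."""
    gamma = np.full(C.shape[1], y.sum() / C.shape[1])
    for _ in range(n_iter):
        mu = C @ gamma
        # E-step: split each observed count over the latent cells it sums,
        # in proportion to the current latent means.
        z = gamma * (C.T @ (y / mu))
        # M-step: ordinary log-linear fit to the completed latent table.
        beta = fit_poisson_loglinear(z, X)
        new_gamma = np.exp(X @ beta)
        converged = np.max(np.abs(new_gamma - gamma)) < tol
        gamma = new_gamma
        if converged:
            break
    return beta, gamma


beta, gamma = em_composite_link(y, C, X)
filled = gamma[:4] + gamma[4:]   # filled-in 2x2 table, all 70 cases
print(filled.round(3))
```

For this particular saturated case the filled-in counts can be checked against the closed-form estimates, in which each supplementary row total is split across columns in the proportions seen among the fully classified cases of that row (for example, 20 × 40/30 ≈ 26.667 for cell (1,1)). A Newton-like alternative would update `beta` directly from the derivative of `C @ exp(X @ beta)` instead of alternating E- and M-steps.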

Additional information

The author thanks the editor, the reviewers, Laurie Hopp Rindskopf, and Clifford Clogg for comments and suggestions that substantially improved the paper.

About this article

Cite this article

Rindskopf, D. A general approach to categorical data analysis with missing data, using generalized linear models with composite links. Psychometrika 57, 29–42 (1992). https://doi.org/10.1007/BF02294657
