Psychometrika, Volume 69, Issue 1, pp 5–32

Building an identifiable latent class model with covariate effects on underlying and measured variables

  • Guan-Hua Huang
  • Karen Bandeen-Roche
Theory And Methods


In recent years, latent class models have proven useful for analyzing relationships between multiple measured indicators and covariates of interest. Such models summarize shared features of the multiple indicators as an underlying categorical variable, and the indicators' substantive associations with predictors are built, directly and indirectly, into distinct model parameters. In this paper, we provide a detailed study of the theory and application of building models that allow mediated relationships between primary predictors and latent class membership, but that also allow direct effects of secondary covariates on the indicators themselves. Theory for model identification is developed. We detail an Expectation-Maximization algorithm for parameter estimation, standard error calculation, and convergence properties. A comparison of the proposed model with the models underlying existing latent class modeling software is provided. A detailed analysis of how visual impairments affect older persons' functioning that requires distance vision is used for illustration.
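As a rough illustration of the Expectation-Maximization idea mentioned in the abstract, the following is a minimal sketch for a basic latent class model with binary indicators. It is a deliberate simplification and not the authors' algorithm: it omits the covariate effects on class membership and on the indicators, and it does not compute standard errors. The function name `fit_latent_class_em` and its parameters are hypothetical, chosen only for this example.

```python
import numpy as np

def fit_latent_class_em(y, n_classes, n_iter=200, seed=0):
    """EM for a basic latent class model with binary indicators.

    Assumption: no covariates; indicators are conditionally independent
    given class. y is an (n, m) array of 0/1 responses.
    Returns class prevalences eta (J,), conditional response
    probabilities p (J, m), posterior class probabilities (n, J),
    and the log-likelihood evaluated at the start of each iteration.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n, m = y.shape
    eta = np.full(n_classes, 1.0 / n_classes)           # equal starting prevalences
    p = rng.uniform(0.25, 0.75, size=(n_classes, m))    # random starting item probabilities
    loglik = []
    for _ in range(n_iter):
        # E-step: log P(y_i | class j), then posterior class probabilities
        log_cond = y @ np.log(p).T + (1.0 - y) @ np.log(1.0 - p).T
        log_joint = log_cond + np.log(eta)
        log_marg = np.logaddexp.reduce(log_joint, axis=1)
        post = np.exp(log_joint - log_marg[:, None])
        loglik.append(log_marg.sum())
        # M-step: posterior-weighted proportions
        eta = post.mean(axis=0)
        p = (post.T @ y) / post.sum(axis=0)[:, None]
        p = np.clip(p, 1e-6, 1 - 1e-6)                  # keep logs finite
    return eta, p, post, np.array(loglik)
```

In the model the abstract describes, the class prevalences and the conditional response probabilities would additionally depend on covariates, so the M-step would presumably involve regression-type updates rather than the closed-form proportions used here.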

Key words

EM algorithm; finite mixture model; identifiability; multiple discrete indicators; visual functioning





Copyright information

© The Psychometric Society 2004

Authors and Affiliations

  • Guan-Hua Huang, University of Wisconsin, Madison
  • Karen Bandeen-Roche, The Johns Hopkins University, Baltimore
