Volume 83, Issue 4, pp 871–892

Two-Step Estimation of Models Between Latent Classes and External Variables

  • Zsuzsa Bakk
  • Jouni Kuha


We consider models which combine latent class measurement models for categorical latent variables with structural regression models for the relationships between the latent classes and observed explanatory and response variables. We propose a two-step method of estimating such models. In its first step, the measurement model is estimated alone, and in the second step the parameters of this measurement model are held fixed when the structural model is estimated. Simulation studies and applied examples suggest that the two-step method is an attractive alternative to existing one-step and three-step methods. We derive estimated standard errors for the two-step estimates of the structural model which account for the uncertainty from both steps of the estimation, and show how the method can be implemented in existing software for latent variable modelling.
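The two steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes two latent classes, binary items, a single covariate `Z`, and a logistic model for class membership. All names (`step1_measurement`, `step2_structural`, `class_likelihoods`) and the simulated data are hypothetical. The sketch also omits the standard errors that account for the uncertainty from both steps.

```python
import numpy as np

def class_likelihoods(Y, pi):
    """P(Y_i | X = c) for each respondent i and class c (items independent given class)."""
    log_p = Y @ np.log(pi).T + (1 - Y) @ np.log(1 - pi).T
    return np.exp(log_p)  # shape (n, 2)

def step1_measurement(Y, n_iter=500):
    """Step 1: EM for a two-class latent class model, ignoring the covariate."""
    n, K = Y.shape
    rho = np.array([0.5, 0.5])                          # class proportions
    pi = np.vstack([np.full(K, 0.3), np.full(K, 0.7)])  # item response probabilities
    for _ in range(n_iter):
        joint = class_likelihoods(Y, pi) * rho
        post = joint / joint.sum(axis=1, keepdims=True)   # E-step: posterior class probs
        rho = post.mean(axis=0)                           # M-step: class proportions
        pi = (post.T @ Y) / post.sum(axis=0)[:, None]     # M-step: item probabilities
        pi = np.clip(pi, 1e-6, 1 - 1e-6)
    return rho, pi

def step2_structural(Y, Z, pi_fixed, n_iter=200):
    """Step 2: estimate P(X = 1 | Z) = expit(b0 + b1 * Z) with pi held fixed."""
    X = np.column_stack([np.ones(len(Z)), Z])
    beta = np.zeros(2)
    lik = class_likelihoods(Y, pi_fixed)  # computed once: measurement model is fixed
    for _ in range(n_iter):
        p1 = 1.0 / (1.0 + np.exp(-(X @ beta)))            # prior class-1 probability
        num = lik[:, 1] * p1
        w = num / (num + lik[:, 0] * (1 - p1))            # posterior class-1 probability
        grad = X.T @ (w - p1)                             # score of the log-likelihood
        hess = X.T @ (X * (p1 * (1 - p1))[:, None])
        beta = beta + np.linalg.solve(hess, grad)         # one Newton step per iteration
    return beta

# Simulated data (hypothetical values chosen only for illustration).
rng = np.random.default_rng(0)
n, K = 3000, 5
Z = rng.normal(size=n)
true_beta = np.array([-0.5, 1.0])
true_pi = np.vstack([np.full(K, 0.2), np.full(K, 0.8)])
p_class1 = 1.0 / (1.0 + np.exp(-(true_beta[0] + true_beta[1] * Z)))
cls = (rng.random(n) < p_class1).astype(int)
Y = (rng.random((n, K)) < true_pi[cls]).astype(float)

rho_hat, pi_hat = step1_measurement(Y)
beta_hat = step2_structural(Y, Z, pi_hat)
print("step 1 item probabilities:\n", pi_hat.round(2))
print("step 2 structural coefficients:", beta_hat.round(2))
```

The starting values in step 1 place the higher item probabilities in the second class, which in this simulation keeps the class labels aligned with the data-generating model. In general the labels are arbitrary and should be checked before interpreting the step 2 coefficients.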


latent variables, mixture models, structural equation models, pseudo-maximum likelihood estimation

Supplementary material 1 (zip 81 KB)



Copyright information

© The Psychometric Society 2017

Authors and Affiliations

  1. Leiden University, Leiden, The Netherlands
  2. London School of Economics and Political Science, London, UK
