Categorical latent variable modeling utilizing fuzzy clustering generalized structured component analysis as an alternative to latent class analysis

  • Original Paper
Behaviormetrika

Abstract

Latent class analysis (LCA) is becoming popular in education, psychology, the social and behavioral sciences, public health, and medicine. However, when estimated by maximum likelihood (ML), it often suffers from identification issues because of the large number of parameters involved. Increasing the sample size, reducing sparseness, and strengthening the relationship between the observed and latent variables all increase the available information and thus reduce these issues. Even so, identification problems can still affect the validity of ML parameter estimates, and satisfying the definition of identification does not guarantee that an ML solution exists. In this paper, generalized structured component analysis (GSCA), a component-based approach that utilizes optimal scaling and fuzzy clustering, is applied to avoid these identification issues and to obtain more stable solutions for population heterogeneity based on a set of categorical responses. We tested the proposed approach, component-based (CB) LCA, on real-world substance use data from Add Health; it reproduced the features yielded by conventional ML LCA and gave stable estimates without identification issues. Comparing ML LCA results from Mplus and from poLCA in R with CB LCA results from GSCA in R revealed a similar number of latent classes and similar posterior probabilities, with only minor discrepancies in individual class assignments when the posterior probabilities of membership were not distinct.
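
To make the role of fuzzy clustering more concrete, the following is a minimal sketch of the standard fuzzy c-means membership formula. It is not the authors' CB LCA estimator, which also involves optimal scaling and GSCA components. The function names `fuzzy_memberships` and `modal_class`, the toy class centers, and the fuzzifier value m = 2 are all illustrative assumptions. The sketch shows how soft memberships play a role similar to posterior class probabilities, and why a hard class label can be unstable when memberships are not distinct.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Return the fuzzy c-means membership of `point` in each cluster."""
    # Squared Euclidean distance from the point to every cluster center.
    d2 = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    # A point that coincides with a center belongs fully to that cluster.
    if 0.0 in d2:
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    # u_k = 1 / sum_j (d_k / d_j)^(2 / (m - 1)); with squared distances the
    # exponent becomes 1 / (m - 1).
    power = 1.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** power for dj in d2) for dk in d2]


def modal_class(memberships):
    """Hard assignment: index of the largest membership (first one on ties)."""
    return max(range(len(memberships)), key=lambda k: memberships[k])


# Two hypothetical class centers on three binary (0/1) items.
centers = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]

# A response pattern that lies closer to the second center.
clear = fuzzy_memberships((1, 1, 0), centers)
print(clear, modal_class(clear))

# A pattern equidistant from both centers of a two-item toy problem: the
# memberships tie, so the hard label is decided only by the tie-breaking rule.
ambiguous = fuzzy_memberships((1, 0), [(0.0, 0.0), (1.0, 1.0)])
print(ambiguous, modal_class(ambiguous))
```

In this toy setting the memberships always sum to one, and the equidistant pattern illustrates the situation described in the abstract: when memberships are nearly equal, two methods can agree on the soft probabilities and still assign an individual to different classes.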

Author information

Corresponding author

Correspondence to Ji Hoon Ryoo.

Ethics declarations

Conflict of interest

“On behalf of all authors, the corresponding author states that there is no conflict of interest.”

Additional information

Communicated by Heungsun Hwang

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ryoo, J.H., Park, S. & Kim, S. Categorical latent variable modeling utilizing fuzzy clustering generalized structured component analysis as an alternative to latent class analysis. Behaviormetrika 47, 291–306 (2020). https://doi.org/10.1007/s41237-019-00084-6

  • DOI: https://doi.org/10.1007/s41237-019-00084-6
