Abstract
Latent class analysis (LCA) is increasingly popular in education, psychology, the social and behavioral sciences, public health, and medicine. When estimated by maximum likelihood (ML), however, it often suffers from identification problems because of the large number of parameters involved. Increasing the sample size, reducing sparseness, and strengthening the relationship between the observed and latent variables all increase the available information and thereby reduce these problems. Even so, identification still affects the validity of ML parameter estimates, and satisfying the definition of identification does not guarantee that an ML solution exists. In this paper, we apply generalized structured component analysis (GSCA), a component-based approach that uses optimal scaling and fuzzy clustering, to avoid these identification problems and to obtain more stable solutions for describing the heterogeneity of a population from a set of categorical responses. We call the proposed approach component-based (CB) LCA. Applied to real-world substance use data from Add Health, CB LCA reproduced the features yielded by conventional ML LCA and gave stable estimates without identification problems. Comparing ML LCA results from Mplus and from poLCA in R with CB LCA results from GSCA in R showed a similar number of latent classes and similar posterior probabilities, with only minor discrepancies in individual class assignments, which arose when the posterior probabilities of membership were not distinct.
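To make the fuzzy clustering ingredient concrete, the following is a minimal sketch of the standard fuzzy c-means membership and center updates (Bezdek 1981) on a handful of binary item responses. It is not the authors' CB LCA implementation: the optimal scaling and component estimation steps of GSCA are omitted, and the toy responses, the starting centers, the fuzzifier `m = 2`, and the helper names (`memberships`, `update_centers`, `modal_class`, `is_ambiguous`) are all assumptions made for illustration. The memberships play a role loosely analogous to posterior class probabilities, and the last helper flags respondents whose two largest memberships are close, which is the situation in which class assignments from different methods may disagree.

```python
import math

def memberships(points, centers, m=2.0):
    """Fuzzy c-means membership update; each returned row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        d = [math.dist(x, c) for c in centers]
        if any(di == 0.0 for di in d):
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            rows.append([h / sum(hits) for h in hits])
            continue
        rows.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return rows

def update_centers(points, u, m=2.0):
    """Centers are means of the points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append([sum(wk * x[j] for wk, x in zip(w, points)) / total
                        for j in range(dim)])
    return centers

def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return memberships(points, centers, m), centers

def modal_class(row):
    """Index of the cluster with the largest membership."""
    return max(range(len(row)), key=row.__getitem__)

def is_ambiguous(row, margin=0.2):
    """True when the two largest memberships differ by less than margin."""
    top = sorted(row, reverse=True)
    return top[0] - top[1] < margin

# Hypothetical yes/no responses to four substance use items (not Add Health data).
points = [(1, 1, 1, 1), (1, 1, 1, 0), (0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 0, 0)]
start = [(0.9, 0.9, 0.9, 0.9), (0.1, 0.1, 0.1, 0.1)]  # assumed starting centers

u, centers = fuzzy_c_means(points, start)
for x, row in zip(points, u):
    print(x, [round(v, 3) for v in row], modal_class(row), is_ambiguous(row))
```

Hard assignment by the largest membership mirrors modal classification in ML LCA; the `margin` threshold is an arbitrary illustrative choice, not a value taken from the paper.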
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Communicated by Heungsun Hwang
About this article
Cite this article
Ryoo, J.H., Park, S. & Kim, S. Categorical latent variable modeling utilizing fuzzy clustering generalized structured component analysis as an alternative to latent class analysis. Behaviormetrika 47, 291–306 (2020). https://doi.org/10.1007/s41237-019-00084-6
DOI: https://doi.org/10.1007/s41237-019-00084-6