Divisive Latent Class Modeling as a Density Estimation Method for Categorical Data

Journal of Classification

Abstract

Traditionally, latent class (LC) analysis has been used by applied researchers as a tool for identifying substantively meaningful clusters. More recently, LC models have also been used as a density estimation tool for categorical variables. We introduce a divisive LC (DLC) model as a density estimation tool that may offer several advantages over a standard LC model. When an LC model is used for density estimation, a considerable number of increasingly large LC models may have to be estimated before sufficient model fit is achieved. A DLC model, by contrast, consists of a sequence of small LC models. A DLC model can therefore be estimated much faster and can easily utilize multiple processor cores, which makes it more widely applicable and practical. In this study, we describe the algorithm for fitting a DLC model and discuss the various settings that indirectly influence the precision of a DLC model as a density estimation tool. These settings are illustrated using a synthetic-data example, and the best-performing algorithm is applied to a real-data example. The synthetic-data example showed that, using specific decision rules, a DLC model is able to correctly model complex associations among categorical variables.
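
To make the notion of an LC model as a density estimator concrete, the following is a minimal sketch in Python. It assumes the standard LC model, in which the probability of a response pattern is a weighted sum over classes of products of class-conditional category probabilities (local independence), fitted by a plain EM algorithm. The names `fit_lc` and `lc_density`, the smoothing constant, and the toy data are illustrative assumptions, not the authors' implementation. The divisive algorithm itself is not reproduced here, since the abstract does not specify its steps; the sketch only shows the kind of small LC model that such a sequence would be built from.

```python
import random
from itertools import product


def lc_density(pattern, weights, probs):
    """P(y) = sum_k pi_k * prod_j P(y_j | class k), assuming local independence."""
    total = 0.0
    for k, pi_k in enumerate(weights):
        p = pi_k
        for j, cat in enumerate(pattern):
            p *= probs[k][j][cat]
        total += p
    return total


def fit_lc(data, n_cats, n_classes=2, n_iter=200, seed=0):
    """Fit an LC model by EM (hypothetical helper for illustration only)."""
    rng = random.Random(seed)
    n_vars = len(n_cats)
    weights = [1.0 / n_classes] * n_classes
    # Random starting values for the class-conditional response probabilities.
    probs = []
    for _ in range(n_classes):
        cls = []
        for c in n_cats:
            raw = [rng.random() + 0.5 for _ in range(c)]
            s = sum(raw)
            cls.append([r / s for r in raw])
        probs.append(cls)
    for _ in range(n_iter):
        # E-step: posterior class membership probabilities per observation.
        post = []
        for y in data:
            joint = []
            for k in range(n_classes):
                p = weights[k]
                for j, cat in enumerate(y):
                    p *= probs[k][j][cat]
                joint.append(p)
            s = sum(joint)
            post.append([p / s for p in joint])
        # M-step: update class sizes and conditional probabilities.
        for k in range(n_classes):
            weights[k] = sum(r[k] for r in post) / len(data)
            for j in range(n_vars):
                # Tiny constant (an assumption) keeps probabilities above zero.
                counts = [1e-6] * n_cats[j]
                for y, r in zip(data, post):
                    counts[y[j]] += r[k]
                tot = sum(counts)
                probs[k][j] = [c / tot for c in counts]
    return weights, probs


# Toy data: four binary variables generated from two hidden groups.
rng = random.Random(1)
data = []
for _ in range(500):
    p = 0.9 if rng.random() < 0.5 else 0.2
    data.append(tuple(int(rng.random() < p) for _ in range(4)))

weights, probs = fit_lc(data, n_cats=[2, 2, 2, 2])

# The estimated density assigns a probability to every possible pattern.
total = sum(lc_density(y, weights, probs) for y in product(range(2), repeat=4))
print(total)
```

Because the fitted class weights sum to one and each conditional distribution is normalized, the estimated probabilities over all 16 response patterns add up to one (up to floating-point error), which is the property that lets the fitted model serve as a density estimate for the joint distribution of the categorical variables.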

Author information

Corresponding author

Correspondence to Daniël W. van der Palm.

About this article

Cite this article

van der Palm, D.W., van der Ark, L.A. & Vermunt, J.K. Divisive Latent Class Modeling as a Density Estimation Method for Categorical Data. J Classif 33, 52–72 (2016). https://doi.org/10.1007/s00357-016-9195-5

  • DOI: https://doi.org/10.1007/s00357-016-9195-5
