Abstract
Discovering the preferences and behaviour of consumers is a key challenge in marketing. Information on these topics can be gathered through surveys in which respondents assign a score to each of a number of items. A strategy based on several latent class models can be used to analyze such data: it consists of identifying groups of consumers whose response patterns are similar and characterizing them in terms of preferences and covariates. The basic latent class model can be extended by including covariates to model differences in (1) latent class probabilities and (2) conditional probabilities. We propose a strategy for fitting these models and choosing a suitable one among them, taking into account identifiability issues, the identification of potential covariates, and the checking of goodness-of-fit. The tools to perform this analysis are implemented in the R package covLCA, available from CRAN. We illustrate and explain the application of this strategy using data on the preferences of Belgian households for supermarkets.
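As an illustration of the two extensions named in the abstract, the following minimal sketch computes the probability of one response pattern when covariates enter both the latent class probabilities and the conditional (item response) probabilities. It assumes one common parameterization: a multinomial logit for class membership, and item-response logits made of a class-specific intercept plus covariate effects shared across classes. All function names, parameter names, and numeric values are hypothetical and chosen for illustration only; this is not the covLCA interface, and it does not fit a model.

```python
import itertools
import math


def softmax(scores):
    """Turn a list of real scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def class_probs(x, beta):
    """P(class j | x): multinomial logit with one coefficient vector per class."""
    return softmax([dot(b, x) for b in beta])


def item_probs(z, gamma_jm, alpha_m):
    """P(item m = k | class j, z) for all categories k.

    gamma_jm: K intercepts specific to class j and item m.
    alpha_m:  K coefficient vectors for item m, shared across classes (assumption).
    """
    return softmax([g + dot(a, z) for g, a in zip(gamma_jm, alpha_m)])


def pattern_likelihood(y, x, z, beta, gamma, alpha):
    """Marginal probability of response pattern y, summing over latent classes.

    Items are assumed independent given the class (local independence).
    """
    total = 0.0
    for j, eta_j in enumerate(class_probs(x, beta)):
        prob = eta_j
        for m, y_m in enumerate(y):
            prob *= item_probs(z, gamma[j][m], alpha[m])[y_m]
        total += prob
    return total


# Hypothetical setup: 2 classes, 2 items, 3 score categories (coded 0, 1, 2).
x = [1.0, 0.5]                       # intercept + one class-membership covariate
z = [1.0]                            # one covariate acting on the item responses
beta = [[0.0, 0.0], [0.4, -1.0]]     # first class is the reference (all zeros)
gamma = [
    [[0.0, 1.0, 2.0], [0.5, 0.0, -0.5]],   # class 0: items 0 and 1
    [[2.0, 0.0, -1.0], [0.0, 0.0, 1.0]],   # class 1: items 0 and 1
]
alpha = [
    [[0.0], [0.3], [0.6]],           # item 0: first category is the reference
    [[0.0], [-0.2], [0.1]],          # item 1
]

# The probabilities of all 3**2 possible patterns should sum to 1 up to rounding.
total = sum(
    pattern_likelihood(list(y), x, z, beta, gamma, alpha)
    for y in itertools.product(range(3), repeat=2)
)
print(total)
```

Setting one class's coefficients and one category's covariate effects to zero, as done here, is a usual way to fix a reference level; without such constraints the logit parameters are not identified.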
Notes
The category Bachelor is an approximate translation of the Belgian education level graduat, while we use Master to denote the Belgian education levels licence and maîtrise.
p-value of Pearson's chi-squared test: \(< 2.2 \times 10^{-16}\) for Age, Education and Profession; 0.04 for Gender.
In class GB-Carrefour: p-value of the likelihood ratio test is \(< 0.01\).
p-value in latent class Lidl-Aldi: 0.03; p-value in class Delhaize: 0.05.
p-value of Pearson's chi-squared test: \(< 2.2 \times 10^{-16}\).
Acknowledgments
We would like to thank Eric Lecoutre for helpful discussions, and Business and Decision, Brussels, for providing the data. We also would like to thank the editor and two anonymous referees for helpful comments.
Cite this article
Bertrand, A.M.E., Hafner, C.M. On heterogeneous latent class models with applications to the analysis of rating scores. Comput Stat 29, 307–330 (2014). https://doi.org/10.1007/s00180-013-0450-5
DOI: https://doi.org/10.1007/s00180-013-0450-5