Mixture of latent trait analyzers for model-based clustering of categorical data

Abstract

Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary and/or categorical data, but because it assumes a local independence structure, the estimated latent classes may not correspond to groups in the population of interest. The mixture of latent trait analyzers model extends latent class analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable; the discrete latent class accommodates group structure and the continuous latent trait accommodates dependence within these groups. Fitting the mixture of latent trait analyzers model is potentially difficult because the likelihood function involves an integral that cannot be evaluated analytically. We develop a variational approach for fitting the mixture of latent trait analyzers model, which provides an efficient model-fitting strategy. The model is demonstrated on data from the National Long Term Care Survey (NLTCS) and on voting in the U.S. Congress. It is shown to yield intuitive clustering results and to give a much better fit than either latent class analysis or latent trait analysis alone.
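To make the intractable integral concrete, the following is a minimal sketch of the marginal likelihood for binary responses, not the variational method developed in the paper. It assumes a parametrization that the abstract does not spell out: mixing proportions `eta`, group-specific intercepts `b` and slopes `w`, a logistic link, and a one-dimensional standard normal latent trait. The integral over the trait is approximated by Gauss-Hermite quadrature, which is feasible here only because the assumed trait is one-dimensional. All function and variable names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def mlta_marginal_probs(X, eta, b, w, n_nodes=40):
    """Approximate p(x_n) for each row of a binary matrix X.

    Assumed (illustrative) model, with G groups and M binary items:
      group g has probability eta[g];
      trait y ~ N(0, 1), one-dimensional;
      P(x_nm = 1 | y, group g) = sigmoid(b[g, m] + w[g, m] * y).
    Shapes: X (N, M), eta (G,), b (G, M), w (G, M).
    """
    # Quadrature nodes/weights for the weight function exp(-y^2 / 2).
    nodes, weights = hermegauss(n_nodes)
    weights = weights / np.sqrt(2.0 * np.pi)  # now integrates against N(0, 1)

    # logits[g, m, q] = b[g, m] + w[g, m] * nodes[q]
    logits = b[:, :, None] + w[:, :, None] * nodes[None, None, :]
    log_p1 = -np.logaddexp(0.0, -logits)  # log sigmoid(logit)
    log_p0 = -np.logaddexp(0.0, logits)   # log(1 - sigmoid(logit))

    # cond[n, g, q] = log p(x_n | trait = nodes[q], group g); items are
    # conditionally independent given both the group and the trait.
    cond = (np.einsum("nm,gmq->ngq", X, log_p1)
            + np.einsum("nm,gmq->ngq", 1.0 - X, log_p0))

    per_group = np.exp(cond) @ weights  # integrate out the trait: (N, G)
    return per_group @ eta              # mix over groups: (N,)


def mlta_log_likelihood(X, eta, b, w, n_nodes=40):
    """Sum of log marginal probabilities over observations."""
    return float(np.sum(np.log(mlta_marginal_probs(X, eta, b, w, n_nodes))))
```

Under these assumptions, setting every slope in `w` to zero removes the dependence on the trait and the expression reduces to a latent class model, which is one way to see the sense in which the mixture of latent trait analyzers extends latent class analysis. With a higher-dimensional trait the number of quadrature nodes grows exponentially with the dimension, which is the kind of cost a variational approximation avoids.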

Acknowledgements

We would like to thank the editor, associate editor and reviewers for their insightful comments and suggestions, which have greatly improved this paper. This research was supported by a Science Foundation Ireland Research Frontiers Programme Grant (06/RFP/M040) and Strategic Research Cluster Grant (08/SRC/I1407).

Author information

Corresponding author

Correspondence to Thomas Brendan Murphy.

About this article

Cite this article

Gollini, I., Murphy, T.B. Mixture of latent trait analyzers for model-based clustering of categorical data. Stat Comput 24, 569–588 (2014). https://doi.org/10.1007/s11222-013-9389-1

Keywords

  • Model-based clustering
  • Mixture models
  • Latent variables
  • Categorical data
  • Variational EM algorithm