An Overview on the URV Model-Based Approach to Cluster Mixed-Type Data

  • Conference paper
Statistical Learning of Complex Data (CLADAG 2017)

Abstract

In this paper, we provide an overview of the underlying response variable (URV) model-based approach to clustering ordinal and, optionally, continuous variables, with optional simultaneous dimension reduction. We summarize and compare the main features of the approach and discuss some key issues. An application to real data illustrates the approach and compares clustering performance.

Author information

Corresponding author

Correspondence to Monia Ranalli.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ranalli, M., Rocci, R. (2019). An Overview on the URV Model-Based Approach to Cluster Mixed-Type Data. In: Greselin, F., Deldossi, L., Bagnato, L., Vichi, M. (eds) Statistical Learning of Complex Data. CLADAG 2017. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-030-21140-0_5
