
A Probabilistic Clustering Model for Variables of Mixed Type

Quality and Quantity

Abstract

This paper develops a probabilistic clustering model for mixed data. The model allows analysis of variables of mixed type: the variables may be nominal, ordinal and/or quantitative. The model contains the well-known models of latent class analysis as submodels. As in latent class analysis, local independence of the variables is assumed. The parameters of the model are estimated by the EM algorithm. Test statistics and goodness-of-fit measures are proposed for model selection. Two artificial data sets show the usefulness of these tests. An empirical example completes the presentation.
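The idea of EM estimation under local independence can be illustrated with a minimal sketch. It is not the paper's implementation: it handles only one nominal and one quantitative variable, and it assumes a multinomial distribution for the nominal variable and a normal distribution for the quantitative one within each class. The function name `em_mixed`, the random starting values, and the synthetic data are all hypothetical choices made for illustration.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_mixed(data, n_classes, n_categories, n_iter=100, seed=0):
    """Fit a latent class mixture to (category_index, value) pairs by EM.

    Assumption: within a class, the nominal variable is multinomial, the
    quantitative variable is normal, and the two are independent.
    """
    rng = random.Random(seed)
    n = len(data)
    # Random soft class memberships as starting values (illustrative choice).
    post = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(n_classes)]
        total = sum(row)
        post.append([r / total for r in row])

    history = []  # log-likelihood after each iteration
    for _ in range(n_iter):
        # M-step: weighted estimates of class sizes and class parameters.
        weights, cat_probs, means, variances = [], [], [], []
        for k in range(n_classes):
            nk = sum(p[k] for p in post)
            weights.append(nk / n)
            counts = [0.0] * n_categories
            for (c, _), p in zip(data, post):
                counts[c] += p[k]
            total = sum(counts)
            cat_probs.append([v / total for v in counts])
            mean = sum(p[k] * x for (_, x), p in zip(data, post)) / nk
            var = sum(p[k] * (x - mean) ** 2 for (_, x), p in zip(data, post)) / nk
            means.append(mean)
            variances.append(max(var, 1e-6))  # floor guards against collapse
        # E-step: posterior class memberships. Local independence means the
        # class-conditional density is a product over the variables.
        log_lik = 0.0
        post = []
        for c, x in data:
            joint = [
                weights[k] * cat_probs[k][c] * normal_pdf(x, means[k], variances[k]) + 1e-300
                for k in range(n_classes)
            ]
            total = sum(joint)
            log_lik += math.log(total)
            post.append([j / total for j in joint])
        history.append(log_lik)
    return weights, cat_probs, means, variances, history, post

# Synthetic two-class data, for illustration only.
rng = random.Random(1)
data = [(0 if rng.random() < 0.8 else 1, rng.gauss(0.0, 1.0)) for _ in range(100)]
data += [(1 if rng.random() < 0.8 else 0, rng.gauss(5.0, 1.0)) for _ in range(100)]
weights, cat_probs, means, variances, history, post = em_mixed(data, n_classes=2, n_categories=2)
```

The returned `history` holds the log-likelihood after each iteration, which should not decrease under EM. Comparing its final value across different numbers of classes is one simple starting point for the kind of model selection the abstract mentions, although the specific test statistics proposed in the paper are not reproduced here.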




Cite this article

Bacher, J. A Probabilistic Clustering Model for Variables of Mixed Type. Quality & Quantity 34, 223–235 (2000). https://doi.org/10.1023/A:1004759101388


  • DOI: https://doi.org/10.1023/A:1004759101388
