Volume 71, Issue 1, pp 161–171

An Extension of Multiple Correspondence Analysis for Identifying Heterogeneous Subgroups of Respondents

  • Heungsun Hwang
  • Hec Montréal
  • William R. Dillon
  • Yoshio Takane


An extension of multiple correspondence analysis is proposed that takes into account cluster-level heterogeneity in respondents’ preferences/choices. The method involves combining multiple correspondence analysis and k-means in a unified framework. The former is used for uncovering a low-dimensional space of multivariate categorical variables while the latter is used for identifying relatively homogeneous clusters of respondents. The proposed method offers an integrated graphical display that provides information on cluster-based structures inherent in multivariate categorical data as well as the interdependencies among the data. An empirical application is presented which demonstrates the usefulness of the proposed method and how it compares to several extant approaches.
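The abstract does not spell out the estimation procedure, so the following is not the proposed method. It is a minimal sketch of a simpler sequential ("tandem") procedure, in which respondent scores are first obtained from a correspondence analysis of the indicator matrix and only afterwards clustered with k-means. It is meant as a point of reference for the two ingredients that a unified approach would combine. All function names and the toy data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def indicator_matrix(data):
    """One-hot encode an (n, p) array of integer category codes."""
    blocks = []
    for j in range(data.shape[1]):
        cats = np.unique(data[:, j])  # only categories that actually occur
        blocks.append((data[:, j][:, None] == cats[None, :]).astype(float))
    return np.hstack(blocks)

def mca_object_scores(Z, n_dims=2):
    """Respondent principal coordinates from a correspondence analysis of Z."""
    P = Z / Z.sum()                       # correspondence matrix
    r = P.sum(axis=1)                     # row masses
    c = P.sum(axis=0)                     # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :n_dims] * s[:n_dims]) / np.sqrt(r)[:, None]

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-style k-means with random initial centers (illustrative only)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # keep the old center if a cluster happens to be empty
        new_centers = np.array([
            X[labels == g].mean(axis=0) if np.any(labels == g) else centers[g]
            for g in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical toy data: 40 respondents, 4 categorical items,
# generated from two groups with different typical answers.
rng = np.random.default_rng(1)
group_a = rng.choice([0, 1], size=(20, 4), p=[0.9, 0.1])
group_b = rng.choice([1, 2], size=(20, 4), p=[0.1, 0.9])
data = np.vstack([group_a, group_b])

Z = indicator_matrix(data)
scores = mca_object_scores(Z, n_dims=2)   # step 1: low-dimensional space
labels, centers = kmeans(scores, k=2)     # step 2: cluster the respondents
```

In this sequential version the low-dimensional space is computed without reference to the clusters, which is the separation that a combined formulation is designed to avoid.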


multiple correspondence analysis, k-means, cluster-level respondent heterogeneity, alternating least squares





Copyright information

© The Psychometric Society 2006

Authors and Affiliations

  • Heungsun Hwang (1, 3)
  • Hec Montréal (1)
  • William R. Dillon (1)
  • Yoshio Takane (2)

  1. Southern Methodist University, USA
  2. McGill University, Canada
  3. Department of Marketing, HEC Montréal, Montréal, Canada
