
Cluster Correspondence Analysis


Abstract

A method is proposed that combines dimension reduction and cluster analysis for categorical data by simultaneously assigning individuals to clusters and optimal scaling values to categories in such a way that a single between-variance maximization objective is achieved. In a unified framework, we briefly review alternative methods and show that the proposed method is equivalent to GROUPALS applied to categorical data. The performance of the methods is appraised by means of a simulation study. The results of the joint dimension reduction and clustering methods are compared with those of the so-called tandem approach, a sequential analysis of dimension reduction followed by cluster analysis. The tandem approach is conjectured to perform worse than the joint methods when variables unrelated to the cluster structure are added. Our simulation study confirms this conjecture. Moreover, the results of the simulation study indicate that the proposed method also consistently outperforms the alternative joint dimension reduction and clustering methods.
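To make the comparison concrete, the following minimal Python sketch implements the tandem approach referred to above: correspondence analysis of the superindicator (dummy-coded) matrix, followed by K-means on the row coordinates. Function and variable names are ours, for illustration only; the cluster CA method proposed in the paper replaces these two separate steps with a single joint optimization.

```python
# A minimal sketch of the tandem approach (not the paper's joint method):
# correspondence analysis of the superindicator matrix, then K-means on
# the resulting row coordinates. Names and defaults are illustrative;
# assumes every category occurs at least once.
import numpy as np
from sklearn.cluster import KMeans

def tandem_clusters(Z, n_clusters, n_dims=2, seed=0):
    """Z: n x Q superindicator (dummy-coded) matrix; returns cluster labels."""
    P = Z / Z.sum()                                      # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)                  # row and column margins
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
    U, s, _ = np.linalg.svd(S, full_matrices=False)      # centering removes the trivial dimension
    F = (U[:, :n_dims] * s[:n_dims]) / np.sqrt(r)[:, None]  # row principal coordinates
    return KMeans(n_clusters=n_clusters, n_init=10,
                  random_state=seed).fit_predict(F)
```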


Notes

  1. The presented results are based on the solution using 100 random starts. Increasing the number of random starts to 1000 led to small changes in the configuration that had no effect on the interpretation. The congruence indices with the current solution were 0.997 for the attributes and 0.999 for the subjects.

References

  • Bäck, T. (1996). Evolutionary algorithms in theory and practice: Evolution strategies, evolutionary programming, genetic algorithms. Oxford: Oxford University Press.

  • Borg, I., & Groenen, P. J. F. (2005). Modern multidimensional scaling: Theory and applications. New York: Springer.

  • De Soete, G., & Carroll, J. D. (1994). K-means clustering in a low-dimensional Euclidean space. In E. Diday, Y. Lechevallier, M. Schader, P. Bertrand, & B. Burtschy (Eds.), New approaches in classification and data analysis (pp. 212–219). Berlin: Springer.

  • Gifi, A. (1990). Nonlinear multivariate analysis. Chichester: Wiley.

  • Gower, J. C. (1971). A general coefficient of similarity and some of its properties. Biometrics, 27, 623–637.

  • Gower, J. C., Lubbe, S. G., & Le Roux, N. J. (2011). Understanding biplots. New York: Wiley.

  • Gower, J. C., Groenen, P. J. F., & van de Velden, M. (2010). Area biplots. Journal of Computational and Graphical Statistics, 19(1), 46–61.

  • Gower, J. C., & Hand, D. J. (1996). Biplots. London: Chapman and Hall.

  • Greenacre, M. J. (1984). Theory and applications of correspondence analysis. London: Academic Press.

  • Greenacre, M. J. (1993). Biplots in correspondence analysis. Journal of Applied Statistics, 20(2), 251–269.

  • Greenacre, M. J. (2007). Correspondence analysis in practice. Boca Raton: CRC Press.

  • Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218. doi:10.1007/BF01908075

  • Hwang, H., Dillon, W. R., & Takane, Y. (2006). An extension of multiple correspondence analysis for identifying heterogeneous subgroups of respondents. Psychometrika, 71, 161–171.

  • Iodice D’Enza, A., & Palumbo, F. (2013). Iterative factor clustering of binary data. Computational Statistics, 789–807. doi:10.1007/s00180-012-0329-x

  • Iodice D’Enza, A., van de Velden, M., & Palumbo, F. (2014). On joint dimension reduction and clustering of categorical data. In D. Vicari, A. Okada, G. Ragozini, & C. Weihs (Eds.), Analysis and modeling of complex data in behavioral and social sciences. Berlin: Springer.

  • Jolliffe, I. T. (2002). Principal component analysis. New York: Springer.

  • Kroonenberg, P. M., & Lombardo, R. (1999). Nonsymmetric correspondence analysis: A tool for analysing contingency tables with a dependence structure. Multivariate Behavioral Research, 34, 367–396.

  • Lauro, N., & D’Ambra, L. (1984). L’analyse non symétrique des correspondances [Nonsymmetric correspondence analysis]. In E. Diday, L. Lebart, M. Jambu, & Thomassone (Eds.), Data analysis and informatics III (pp. 433–446). Amsterdam: Elsevier.

  • MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In L. Le Cam & J. Neyman (Eds.), Proceedings of the fifth Berkeley symposium on mathematical statistics and probability (Vol. 1, pp. 281–297). Berkeley: University of California Press.

  • Martin, R. A., Puhlik-Doris, P., Larsen, G., Gray, J., & Weir, K. (2003). Individual differences in uses of humor and their relation to psychological well-being: Development of the Humor Styles Questionnaire. Journal of Research in Personality, 37(1), 48–75.

  • Nishisato, S. (1980). Analysis of categorical data: Dual scaling and its applications. Toronto: University of Toronto Press.

  • Nishisato, S. (1994). Elements of dual scaling: An introduction to practical data analysis. Hillsdale, NJ: Lawrence Erlbaum Associates.

  • van de Velden, M., & Bijmolt, T. (2006). Generalized canonical correlation analysis of matrices with missing rows: A simulation study. Psychometrika, 71(2), 323–331.

  • van de Velden, M., & Takane, Y. (2012). Generalized canonical correlation analysis with missing values. Computational Statistics, 27(3), 551–571.

  • Van Buuren, S., & Heiser, W. (1989). Clustering n objects into k groups under optimal scaling of variables. Psychometrika, 54, 699–706.

  • Vichi, M., & Kiers, H. A. L. (2001). Factorial k-means analysis for two-way data. Computational Statistics and Data Analysis, 37, 49–64.

  • Vichi, M., Vicari, D., & Kiers, H. (2009). Clustering and dimensional reduction for mixed variables. Unpublished manuscript.

  • Yamamoto, M., & Hwang, H. (2014). A general formulation of cluster analysis with dimension reduction and subspace separation. Behaviormetrika, 41, 115–129.


Corresponding author

Correspondence to M. van de Velden.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 1126 KB)

Appendix: GROUPALS and Cluster CA

To show the relationship between GROUPALS and cluster CA, we consider the GROUPALS objective function for the case where all variables are categorical:

$$\begin{aligned} \min \phi_{\text{groupals}}\left(\mathbf{B},\mathbf{Z}_K,\mathbf{G}\right)=\frac{1}{p}\sum_{j=1}^{p}\left\| \mathbf{Z}_K\mathbf{G}-\mathbf{Z}_j\mathbf{B}_j\right\|^{2}, \end{aligned}$$

subject to

$$\begin{aligned} \sum_{j=1}^{p}\mathbf{B}_j'\mathbf{Z}_j'\mathbf{Z}_j\mathbf{B}_j=\mathbf{I}_k. \end{aligned}$$
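As a concrete reading of this objective, the following sketch evaluates \(\phi_{\text{groupals}}\) for a given cluster allocation and given category quantifications; the names and data layout are our assumptions, not code from the paper.

```python
# Hedged numpy sketch of the GROUPALS loss above. ZK is the n x K cluster
# membership indicator, G the matrix of cluster means, Zj a list of the p
# indicator matrices Z_j, and Bj the matching quantification matrices B_j.
import numpy as np

def groupals_loss(ZK, G, Zj, Bj):
    Y = ZK @ G  # each subject represented by its cluster mean
    return sum(np.linalg.norm(Y - Z @ B) ** 2 for Z, B in zip(Zj, Bj)) / len(Zj)
```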

We can solve this problem by deriving the first-order conditions, which can then be used to formulate an alternating least-squares algorithm. To this end, we fix \(\mathbf{Z}_K\) and solve for \(\mathbf{B}_j\) and \(\mathbf{G}\) by setting up the Lagrangian:

$$\begin{aligned} \psi &= \frac{1}{p}\sum_{j=1}^{p}\mathrm{trace}\left(\mathbf{Z}_K\mathbf{G}-\mathbf{Z}_j\mathbf{B}_j\right)'\left(\mathbf{Z}_K\mathbf{G}-\mathbf{Z}_j\mathbf{B}_j\right)+\mathrm{trace}\,\mathbf{L}\left(\sum_{j=1}^{p}\mathbf{B}_j'\mathbf{D}_j\mathbf{B}_j-\mathbf{I}_k\right)\\ &= \mathrm{trace}\,\mathbf{G}'\mathbf{Z}_K'\mathbf{Z}_K\mathbf{G}+\frac{1}{p}\sum_{j=1}^{p}\mathrm{trace}\,\mathbf{B}_j'\mathbf{Z}_j'\mathbf{Z}_j\mathbf{B}_j-\frac{2}{p}\sum_{j=1}^{p}\mathrm{trace}\,\mathbf{G}'\mathbf{Z}_K'\mathbf{Z}_j\mathbf{B}_j+\mathrm{trace}\,\mathbf{L}\left(\sum_{j=1}^{p}\mathbf{B}_j'\mathbf{D}_j\mathbf{B}_j-\mathbf{I}_k\right)\\ &= \mathrm{trace}\,\mathbf{G}'\mathbf{Z}_K'\mathbf{Z}_K\mathbf{G}+\frac{k}{p}-\frac{2}{p}\sum_{j=1}^{p}\mathrm{trace}\,\mathbf{G}'\mathbf{Z}_K'\mathbf{Z}_j\mathbf{B}_j+\mathrm{trace}\,\mathbf{L}\left(\sum_{j=1}^{p}\mathbf{B}_j'\mathbf{D}_j\mathbf{B}_j-\mathbf{I}_k\right), \end{aligned}$$

where \(\mathbf {L}\) is the matrix of Lagrange multipliers and \(\mathbf {D}_j = \mathbf {Z}_j^{\prime }\mathbf {Z}_j\). Taking derivatives and equating to zero yields the first-order conditions.

For \(\mathbf{G}\):

$$\begin{aligned} 2\,\mathrm{trace}\,\mathbf{G}'\mathbf{Z}_K'\mathbf{Z}_K\,d\mathbf{G} &= \frac{2}{p}\sum_{j=1}^{p}\mathrm{trace}\,\mathbf{B}_j'\mathbf{Z}_j'\mathbf{Z}_K\,d\mathbf{G}\\ \mathbf{G}'\mathbf{Z}_K'\mathbf{Z}_K &= \frac{1}{p}\sum_{j=1}^{p}\mathbf{B}_j'\mathbf{Z}_j'\mathbf{Z}_K\\ \mathbf{G} &= \frac{1}{p}\left(\mathbf{Z}_K'\mathbf{Z}_K\right)^{-1}\mathbf{Z}_K'\sum_{j=1}^{p}\mathbf{Z}_j\mathbf{B}_j. \end{aligned}$$

For \(\mathbf{B}_j\):

$$\begin{aligned} \frac{2}{p}\,\mathrm{trace}\,\mathbf{G}'\mathbf{Z}_K'\mathbf{Z}_j\,d\mathbf{B}_j &= 2\,\mathrm{trace}\,\mathbf{L}\mathbf{B}_j'\mathbf{D}_j\,d\mathbf{B}_j\\ \frac{1}{p}\mathbf{Z}_j'\mathbf{Z}_K\mathbf{G} &= \mathbf{D}_j\mathbf{B}_j\mathbf{L}. \end{aligned}$$

Inserting the solution for \(\mathbf{G}\), we obtain

$$\begin{aligned} \frac{1}{p^{2}}\mathbf{Z}_j'\mathbf{Z}_K\left(\mathbf{Z}_K'\mathbf{Z}_K\right)^{-1}\mathbf{Z}_K'\sum_{i=1}^{p}\mathbf{Z}_i\mathbf{B}_i=\mathbf{D}_j\mathbf{B}_j\mathbf{L}. \end{aligned}$$

Note that, as the constraints are symmetric, \(\mathbf{L}\) is also symmetric. Furthermore, since \(j=1,\ldots,p\), there are p such equations. Defining \(\mathbf{Z}=\left[\mathbf{Z}_1,\ldots,\mathbf{Z}_p\right]\) and \(\mathbf{B}=\left[\mathbf{B}_1',\ldots,\mathbf{B}_p'\right]'\), the p equations can be expressed as

$$\begin{aligned} \frac{1}{p^{2}}\mathbf{Z}'\mathbf{Z}_K\left(\mathbf{Z}_K'\mathbf{Z}_K\right)^{-1}\mathbf{Z}_K'\mathbf{Z}\mathbf{B}=\mathbf{D}\mathbf{B}\mathbf{L}, \end{aligned}$$

where \(\mathbf{D}\) is a block-diagonal matrix with diagonal blocks \(\mathbf{D}_1,\ldots,\mathbf{D}_p\).
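In code, this stacking step is direct; the sketch below (helper name ours) builds \(\mathbf{Z}\) and \(\mathbf{D}\). For indicator matrices, each block \(\mathbf{D}_j=\mathbf{Z}_j'\mathbf{Z}_j\) is diagonal and holds the category frequencies of variable j, so \(\mathbf{D}\) itself is diagonal.

```python
# Sketch of the stacked quantities: Z = [Z_1, ..., Z_p] side by side and
# the block-diagonal D with blocks D_j = Z_j'Z_j. For indicator matrices
# the blocks are diagonal (category frequencies), so D is diagonal too.
import numpy as np
from scipy.linalg import block_diag

def stack_indicators(Zj):
    Z = np.hstack(Zj)                             # n x Q superindicator matrix
    D = block_diag(*[Zi.T @ Zi for Zi in Zj])     # Q x Q block-diagonal matrix
    return Z, D
```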

Premultiplying both sides by \(\mathbf{D}^{-1/2}\), we get

$$\begin{aligned} \frac{1}{p^{2}}\mathbf{D}^{-1/2}\mathbf{Z}'\mathbf{Z}_K\left(\mathbf{Z}_K'\mathbf{Z}_K\right)^{-1}\mathbf{Z}_K'\mathbf{Z}\mathbf{D}^{-1/2}\mathbf{D}^{1/2}\mathbf{B}=\mathbf{D}^{1/2}\mathbf{B}\mathbf{L}. \end{aligned}$$

Without loss of generality, we can replace \(\mathbf{L}\) by its eigendecomposition \(\mathbf{U}\varvec{\Lambda}\mathbf{U}'\) to get

$$\begin{aligned} \frac{1}{p^{2}}\mathbf{D}^{-1/2}\mathbf{Z}'\mathbf{Z}_K\left(\mathbf{Z}_K'\mathbf{Z}_K\right)^{-1}\mathbf{Z}_K'\mathbf{Z}\mathbf{D}^{-1/2}\mathbf{D}^{1/2}\mathbf{B}=\mathbf{D}^{1/2}\mathbf{B}\mathbf{U}\varvec{\Lambda}\mathbf{U}' \end{aligned}$$

so that

$$\begin{aligned} \frac{1}{p^{2}}\mathbf{D}^{-1/2}\mathbf{Z}'\mathbf{Z}_K\left(\mathbf{Z}_K'\mathbf{Z}_K\right)^{-1}\mathbf{Z}_K'\mathbf{Z}\mathbf{D}^{-1/2}\mathbf{D}^{1/2}\mathbf{B}\mathbf{U}=\mathbf{D}^{1/2}\mathbf{B}\mathbf{U}\varvec{\Lambda}. \end{aligned}$$

Hence, letting

$$\begin{aligned} \mathbf{B}^{*}=\mathbf{D}^{1/2}\mathbf{B}\mathbf{U}, \end{aligned}$$

we see that \(\mathbf {B}^{*}\) can be obtained by taking the first k orthonormal eigenvectors (corresponding to the k largest eigenvalues) of

$$\begin{aligned} \frac{1}{p^{2}}\mathbf{D}^{-1/2}\mathbf{Z}'\mathbf{Z}_K\left(\mathbf{Z}_K'\mathbf{Z}_K\right)^{-1}\mathbf{Z}_K'\mathbf{Z}\mathbf{D}^{-1/2}. \end{aligned}$$
(23)

The appropriately standardized category quantifications become

$$\begin{aligned} \mathbf{B}=\mathbf{D}^{-1/2}\mathbf{B}^{*} \end{aligned}$$
(24)

and \(\mathbf {G}\) is obtained by inserting this into the first-order condition for \(\mathbf {G}\), that is,

$$\begin{aligned} \mathbf{G}=\frac{1}{p}\left(\mathbf{Z}_K'\mathbf{Z}_K\right)^{-1}\mathbf{Z}_K'\mathbf{Z}\mathbf{B}. \end{aligned}$$
(25)
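Computationally, the update in (23)–(25) amounts to one symmetric eigendecomposition and two rescalings. The following sketch is ours, not the paper's code, and assumes the superindicator matrix has been centered (which discards the trivial solution discussed in the note further below) and that no cluster is empty.

```python
# Hedged sketch of the update in (23)-(25). Zc is the centered
# superindicator matrix, D the (uncentered) block-diagonal frequency
# matrix, ZK the current cluster indicator; empty clusters not handled.
import numpy as np

def update_B_G(ZK, Zc, D, k, p):
    Dmh = np.diag(1.0 / np.sqrt(np.diag(D)))           # D^{-1/2} (D is diagonal for indicator data)
    PK = ZK @ np.linalg.inv(ZK.T @ ZK) @ ZK.T          # projector onto the cluster space
    M = Dmh @ Zc.T @ PK @ Zc @ Dmh / p**2              # the matrix in (23)
    vals, vecs = np.linalg.eigh(M)                     # eigenvalues in ascending order
    B_star = vecs[:, np.argsort(vals)[::-1][:k]]       # k leading eigenvectors
    B = Dmh @ B_star                                   # rescaling (24)
    G = np.linalg.inv(ZK.T @ ZK) @ ZK.T @ Zc @ B / p   # first-order condition (25)
    return B, G
```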

To find \(\mathbf {Z}_{K}\), recall the original objective function:

$$\begin{aligned} \min \phi_{\text{groupals}}\left(\mathbf{B},\mathbf{Z}_K,\mathbf{G}\right)=\frac{1}{p}\sum_{j=1}^{p}\left\| \mathbf{Z}_K\mathbf{G}-\mathbf{Z}_j\mathbf{B}_j\right\|^{2}. \end{aligned}$$

For fixed \(\mathbf {B}_{j}\), this is equivalent to considering

$$\begin{aligned} \min \phi'_{\text{groupals}}\left(\mathbf{Z}_K,\mathbf{G}\right)=\left\| \frac{1}{p}\sum_{j=1}^{p}\mathbf{Z}_j\mathbf{B}_j-\mathbf{Z}_K\mathbf{G}\right\|^{2}. \end{aligned}$$

Hence, to find \(\mathbf {Z}_{K}\) we can apply K-means to the “average configuration”: \(\frac{1}{p}\sum _{j=1}^{p}\mathbf {Z}_{j}\mathbf {B}_{j}\).
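A sketch of this step (names ours): K-means on the average configuration, warm-started at the current cluster means. Passing an explicit init array is standard scikit-learn usage, with n_init=1 because the start is given.

```python
# Sketch of the Z_K update: K-means on the average configuration,
# initialized at the current cluster means G; labels are converted
# back to indicator form.
import numpy as np
from sklearn.cluster import KMeans

def update_ZK(Zc, B, G, p):
    Y = Zc @ B / p                                     # average configuration (1/p) sum_j Z_j B_j
    km = KMeans(n_clusters=G.shape[0], init=G, n_init=1).fit(Y)
    ZK = np.eye(G.shape[0])[km.labels_]                # n x K cluster indicator
    return ZK, km.cluster_centers_
```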

Note: It can easily be verified that \(\mathbf {D}^{1/2}\mathbf {1}\) is an eigenvector of (23) corresponding to the eigenvalue 1. Hence, as in CA and MCA, there is a so-called trivial first solution. Discarding this solution can be achieved by centering \(\mathbf {Z}\).

We can summarize the resulting GROUPALS algorithm as follows:

  1. Generate an initial cluster allocation \(\mathbf{Z}_K\) (e.g., by randomly assigning subjects to clusters).

  2. Use (23), (24) and (25) to obtain \(\mathbf{B}\) and \(\mathbf{G}\).

  3. Apply the K-means algorithm to the average configuration \(\frac{1}{p}\sum_{j=1}^{p}\mathbf{Z}_j\mathbf{B}_j\), using \(\mathbf{G}\) for the initial cluster means, to update \(\mathbf{Z}_K\) and \(\mathbf{G}\).

  4. Return to step 2 and repeat until convergence.

Comparing this algorithm to the cluster CA algorithm in Sect. 3 shows that, despite the different objectives, the two approaches lead to the same algorithm when all variables are categorical.
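For completeness, a hedged end-to-end sketch assembling the four steps above; stack_indicators, update_B_G, and update_ZK refer to our earlier sketches, and convergence is declared when the cluster allocation stops changing. In practice, multiple random starts would be used, as in the note on the empirical results.

```python
# Hedged end-to-end sketch of the GROUPALS / cluster CA iteration
# summarized above, built from the earlier helper sketches. Zj is a
# list of p indicator matrices; n_clusters and k are the numbers of
# clusters and dimensions. Empty clusters and ties are not handled.
import numpy as np

def groupals(Zj, n_clusters, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    Z, D = stack_indicators(Zj)
    p, n = len(Zj), Z.shape[0]
    Zc = Z - Z.mean(axis=0)                            # centering discards the trivial solution
    ZK = np.eye(n_clusters)[rng.integers(n_clusters, size=n)]  # step 1: random allocation
    for _ in range(max_iter):
        B, G = update_B_G(ZK, Zc, D, k, p)             # step 2: updates (23)-(25)
        ZK_new, G = update_ZK(Zc, B, G, p)             # step 3: K-means update
        if np.array_equal(ZK_new, ZK):                 # step 4: repeat until convergence
            break
        ZK = ZK_new
    return ZK, B, G
```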


Cite this article

van de Velden, M., D’Enza, A.I. & Palumbo, F. Cluster Correspondence Analysis. Psychometrika 82, 158–185 (2017). https://doi.org/10.1007/s11336-016-9514-0
