Abstract
Many well-known measures for the comparison of distinct partitions of the same set of n objects are based on the structure of class overlap presented in the form of a contingency table (e.g., Pearson's chi-square statistic, Rand's measure, or Goodman-Kruskal's τ_b), but all of them can be rephrased through the use of a simple cross-product index defined between the corresponding entries from two n × n proximity matrices that provide particular a priori (numerical) codings of the within- and between-class relationships for each of the partitions. We consider the task of optimally constructing the proximity matrices characterizing the partitions (under suitable restrictions) so as to maximize the cross-product measure or, equivalently, the Pearson correlation between their entries. The major result presented states that within the broad classes of matrices that are either symmetric, skew-symmetric, or completely arbitrary, optimal representations are already derivable from what is given by a simple one-dimensional correspondence analysis solution. Besides severely limiting the type of structures that might be of interest to consider for representing the proximity matrices, this result also implies that correspondence analysis beyond one dimension must always be justified from logical bases other than the optimization of a single correlational relationship between the matrices representing the two partitions.
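As a minimal illustration of the cross-product rephrasing described above, the following sketch codes two partitions as n × n proximity matrices and recovers Rand's measure from the raw cross-product of their entries. The +1/−1 coding (within-class pairs coded +1, between-class pairs coded −1) is one particular a priori choice made here for illustration, not necessarily the coding used in the article, and the function names (`coding_matrix`, `cross_product`, `rand_index`) and the example labels are hypothetical.

```python
from itertools import combinations


def coding_matrix(labels, within=1.0, between=-1.0):
    """Code a partition as an n x n matrix: `within` for pairs in the
    same class, `between` otherwise (illustrative a priori coding)."""
    n = len(labels)
    return [[within if labels[i] == labels[j] else between
             for j in range(n)] for i in range(n)]


def cross_product(P, Q):
    """Raw cross-product index summed over the object pairs i < j."""
    n = len(P)
    return sum(P[i][j] * Q[i][j] for i, j in combinations(range(n), 2))


def rand_index(a, b):
    """Rand's measure computed directly: proportion of object pairs on
    which the two partitions agree (together in both or apart in both)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Two hypothetical partitions of the same six objects.
a = [0, 0, 0, 1, 1, 1]
b = [0, 0, 1, 1, 2, 2]

P, Q = coding_matrix(a), coding_matrix(b)
n_pairs = len(a) * (len(a) - 1) // 2
gamma = cross_product(P, Q)

# Under the +1/-1 coding each pair contributes +1 on agreement and -1 on
# disagreement, so gamma = agreements - disagreements and Rand's measure
# is a linear function of the cross-product.
rand_from_gamma = (1 + gamma / n_pairs) / 2

print(gamma, rand_from_gamma, rand_index(a, b))
```

Other contingency-table measures correspond to different fixed codings of the two matrices. The optimization task considered in the article instead treats the entries of the matrices as free (subject to restrictions) rather than fixing them in advance as this sketch does.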
Additional information
This research was supported in part by a grant from American Telephone and Telegraph (AT&T) to the Industrial Affiliates Program of the University of Illinois. The acting Editor for this manuscript was Shizuhiko Nishisato.
About this article
Cite this article
Hubert, L., Arabie, P. Correspondence analysis and optimal structural representations. Psychometrika 57, 119–140 (1992). https://doi.org/10.1007/BF02294662