Abstract
When clustering asymmetric proximity data, only the average amounts are often considered, on the assumption that the asymmetry is due to noise. But when the asymmetry is structural, as typically happens for exchange flows, migration data, or confusion data, this may strongly affect the search for groups, because the directions of the exchanges are ignored and not integrated into the clustering process. The clustering model proposed here relies on decomposing the asymmetric dissimilarity matrix into symmetric and skew-symmetric effects, each of which is further decomposed into within-cluster and between-cluster effects. The classification structures used here are generally based on two different partitions of the objects, fitted to the symmetric and the skew-symmetric parts of the data, respectively; the restricted case is also presented in which a single partition fits both parts jointly, yielding clusters of objects that are similar with respect to both the average amounts and the directions of the data. Parsimonious models are presented that allow for simple and effective graphical representations of the results.
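To make the decomposition concrete, the following minimal sketch splits a square proximity matrix into its symmetric part (average amounts) and skew-symmetric part (directions), then averages each part over within-cluster and between-cluster blocks. It is an illustration only, not the estimation procedure of the article: the flow matrix, the partition `labels`, and the helper names `decompose` and `block_means` are all hypothetical, and the partition is taken as given rather than fitted. It corresponds to the restricted case where one partition is used for both parts.

```python
def decompose(d):
    """Split a square matrix into symmetric and skew-symmetric parts."""
    n = len(d)
    sym = [[(d[i][j] + d[j][i]) / 2 for j in range(n)] for i in range(n)]
    skew = [[(d[i][j] - d[j][i]) / 2 for j in range(n)] for i in range(n)]
    return sym, skew


def block_means(m, labels):
    """Average the off-diagonal entries of m over each ordered pair of clusters."""
    sums, counts = {}, {}
    n = len(m)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the diagonal carries no proximity information
            key = (labels[i], labels[j])
            sums[key] = sums.get(key, 0.0) + m[i][j]
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical 4-object matrix: entry [i][j] is the amount going from i to j.
flows = [
    [0, 8, 2, 1],
    [4, 0, 3, 1],
    [6, 5, 0, 9],
    [5, 7, 3, 0],
]
labels = [0, 0, 1, 1]  # hypothetical partition into two clusters (assumed, not fitted)

sym, skew = decompose(flows)
sym_blocks = block_means(sym, labels)    # average amounts within/between clusters
skew_blocks = block_means(skew, labels)  # average imbalances within/between clusters
```

With a single partition, the within-cluster averages of the skew-symmetric part are zero by construction, since every imbalance inside a cluster is counted once with each sign. The directional information therefore shows up only in the between-cluster blocks, where the `(0, 1)` and `(1, 0)` entries are equal in magnitude and opposite in sign.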
Cite this article
Vicari, D. Classification of Asymmetric Proximity Data. J Classif 31, 386–420 (2014). https://doi.org/10.1007/s00357-014-9159-6
DOI: https://doi.org/10.1007/s00357-014-9159-6