Abstract
We propose a new measure of the distance between subjects who express their preferences as rankings, with the aim of segmenting them by hierarchical cluster analysis. The proposed index builds on Spearman's grade correlation coefficient, computed on a copula-based transformation of the positions (ranks) that denote the importance each subject assigns to k objects. In particular, copula functions with tail dependence yield an index that emphasizes agreement on the top ranks, which is appropriate when the top ranks are considered more important than the lower ones. We evaluate the proposal on simulated data, showing that the resulting groups contain subjects whose preferences are more similar on the most important ranks. An application to real data confirms the relevance and importance of the proposal.
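As a point of comparison for the idea summarized above, the following is a minimal sketch of the unweighted baseline only: a distance derived from the classical Spearman's \(\rho\) between complete rankings, fed to a naive average-linkage clustering. It does not implement the copula-based index proposed in the article, whose transformation is not specified in this abstract. The function names, the mapping \((1-\rho)/2\), the choice of average linkage, and the data are illustrative assumptions.

```python
from itertools import combinations


def spearman_rho(r1, r2):
    """Spearman's rho for two complete rankings of the same k objects (no ties)."""
    k = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (k * (k ** 2 - 1))


def rank_distance(r1, r2):
    """Map rho in [-1, 1] to a dissimilarity in [0, 1] (an illustrative choice)."""
    return (1 - spearman_rho(r1, r2)) / 2


def average_linkage(rankings, n_clusters):
    """Naive agglomerative clustering: merge the two closest clusters
    (average pairwise distance) until n_clusters remain."""
    clusters = [[i] for i in range(len(rankings))]

    def linkage(a, b):
        total = sum(rank_distance(rankings[i], rankings[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical data: four subjects ranking k = 5 objects (1 = most important).
subjects = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 5, 4],
    [5, 4, 3, 2, 1],
    [4, 5, 3, 2, 1],
]
groups = average_linkage(subjects, n_clusters=2)
```

With this unweighted distance, the rankings `[1, 2, 3, 5, 4]` and `[2, 1, 3, 4, 5]` are equally far from `[1, 2, 3, 4, 5]`, although the first disagrees only on the bottom ranks and the second on the top ranks. That symmetry is what an index emphasizing agreement on top ranks is meant to break.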
Notes
See Nelsen (2013, pp. 169–170) for the definition of Spearman's grade correlation coefficient for continuous random variables.
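For orientation, the standard copula form of that coefficient is recalled here from general copula theory rather than from the article's own text. The symbols are introduced only for this note: \(X\) and \(Y\) are continuous random variables with distribution functions \(F\) and \(G\), \(U = F(X)\) and \(V = G(Y)\) are their grades, and \(C\) is the copula of \((X, Y)\).

\[
\rho_S \;=\; \operatorname{corr}(U, V) \;=\; 12 \int_0^1\!\!\int_0^1 u\,v \,\mathrm{d}C(u,v) \;-\; 3 \;=\; 12 \int_0^1\!\!\int_0^1 \bigl[\,C(u,v) - u\,v\,\bigr]\,\mathrm{d}u\,\mathrm{d}v .
\]

The constants follow from \(U\) and \(V\) being uniform on \([0,1]\), with mean \(1/2\) and variance \(1/12\).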
References
Alvo, M., Yu, P.L.H.: Statistical Methods for Ranking Data. Springer, New York (2014)
Biernacki, C., Jacques, J.: A generative model for rank data based on insertion sort algorithm. Comput. Stat. Data Anal. 58, 162–176 (2013)
Bonanomi, A., Cantaluppi, G., Nai Ruscone, M., Osmetti, S.A.: A new estimator of Zumbo’s Ordinal Alpha: a copula approach. Qual. Quant. 49, 941–953 (2015)
Brentari, E., Dancelli, L., Manisera, M.: Clustering ranking data in market segmentation: a case study on the Italian McDonald’s customers’ preferences. J. Appl. Stat. 43(11), 1–18 (2016)
Critchlow, D.E., Fligner, M.A., Verducci, J.S.: Probability models on rankings. J. Math. Psychol. 35, 294–318 (1991)
Critchlow, D., Verducci, J.: Detecting a trend in paired rankings. Appl. Stat. 41, 17–29 (1992)
da Costa, P.J., Solares, C.: A weighted rank measure of correlation. Aust. N. Z. J. Stat. 47, 515–529 (2005)
Dancelli, L., Manisera, M., Vezzoli, M.: Weighted Rank Correlation measures in hierarchical cluster analysis. Book of Short Papers JCS-CLADAG 2012. Anacapri (2012)
Dancelli, L., Manisera, M., Vezzoli, M.: On two classes of Weighted Rank Correlation measures deriving from the Spearman’s \(\rho\). In: Statistical Model and Data Analysis, pp. 107–114. Springer, New York (2013)
Diaconis, P.: Group Representations in Probability and Statistics. Lecture Notes-Monograph Series, pp. 1–192. Institute of Mathematical Statistics, Hayward (1988)
Emond, E., Mason, D.: A new rank correlation coefficient with application to the consensus ranking problem. J. Multi-Criteria Decis. Anal. 11, 17–28 (2002)
Feigin, P.D.: Modelling and analysing paired ranking data. In: Fligner, M.A., Verducci, J.S. (eds.) Probability Models and Statistical Analyses for Ranking Data, pp. 75–91. Springer, New York (1993)
Jacques, J., Biernacki, C.: Model-based clustering for multivariate partial ranking data. J. Stat. Plan. Inference 149, 201–217 (2014)
Joe, H.: Multivariate Models and Dependence Concepts. Chapman & Hall, Boca Raton (1997)
Kojadinovic, I.: Hierarchical clustering of continuous variables based on the empirical copula process and permutation linkages. Comput. Stat. Data Anal. 54, 90–108 (2010)
Lee, P.H., Yu, P.L.H.: Mixtures of weighted distance-based models for ranking data with applications in political studies. Comput. Stat. Data Anal. 56, 2486–2500 (2012)
Mallows, C.L.: Non-null ranking models. I. Biometrika 44, 114–130 (1957)
Nelsen, R.B.: An Introduction to Copulas. Springer, New York (2013)
Quade, D., Salama, I.: A survey of weighted rank correlation. In: Order Statistics and Nonparametrics: Theory and Applications, pp. 213–224. Elsevier, Amsterdam (1992)
Shieh, G.S.: A weighted Kendall’s tau statistic. Stat. Probab. Lett. 39, 17–24 (1998)
Sklar, A.W.: Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Stat. Univ. Paris 8, 229–231 (1959)
Spearman, C.: The proof and measurement of association between two things. Am. J. Psychol. 15, 72–101 (1904)
Tarsitano, A.: Weighted rank correlation and hierarchical clustering. In: Book of Short Papers CLADAG. Parma, pp. 517–521 (2005)
Zani, S., Cerioli, A.: Analisi dei dati e data mining per le decisioni aziendali. Giuffrè Editore, Milano (2007)
About this article
Cite this article
Bonanomi, A., Nai Ruscone, M. & Osmetti, S.A. Defining subjects distance in hierarchical cluster analysis by copula approach. Qual Quant 51, 859–872 (2017). https://doi.org/10.1007/s11135-016-0444-9
DOI: https://doi.org/10.1007/s11135-016-0444-9