Defining subjects distance in hierarchical cluster analysis by copula approach

Quality & Quantity

Abstract

We propose a new measure of the distance between subjects who express their preferences as rankings, for use in segmenting them by hierarchical cluster analysis. The proposed index applies Spearman’s grade correlation coefficient to a copula-based transformation of the positions (ranks) that denote the importance each subject under classification assigns to k objects. In particular, using copula functions with tail dependence yields an index that emphasizes agreement on the top ranks, which suits settings where the top ranks are considered more important than the lower ones. We evaluate the proposal on simulated data, showing that the resulting groups contain subjects whose preferences are more similar on the most important ranks. A further application to real data confirms the relevance of the proposal.
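As a point of comparison, the baseline that the abstract builds on can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the copula-based index of the article: it uses the classical Spearman's rho for complete rankings without ties, maps it to a dissimilarity with the assumed rule d = (1 - rho) / 2, and merges subjects by average linkage. The function names (`spearman_rho`, `distance`, `average_linkage`) and the toy data are hypothetical. Note that this baseline weights every rank position equally, which is exactly the limitation a top-weighted index is meant to address.

```python
from itertools import combinations


def spearman_rho(r1, r2):
    """Classical Spearman's rho for two complete rankings of k objects (no ties)."""
    k = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (k * (k * k - 1))


def distance(r1, r2):
    """Assumed mapping of rho in [-1, 1] to a dissimilarity in [0, 1]."""
    return (1 - spearman_rho(r1, r2)) / 2


def average_linkage(rankings, n_clusters):
    """Naive agglomerative clustering of subjects with average linkage."""
    clusters = [[i] for i in range(len(rankings))]

    def link(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(distance(rankings[i], rankings[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (i < j) and merge j into i.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: link(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy data (invented for illustration): each row is one subject's ranking
# of k = 5 objects, where 1 marks the most important object.
rankings = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 5, 4],
    [2, 1, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [5, 4, 3, 1, 2],
]
groups = average_linkage(rankings, 2)
print(groups)
```

In this toy example, subject 1 (a swap at the bottom of the ranking) and subject 2 (a swap at the top) are equally distant from subject 0, which illustrates why an unweighted rho cannot favor agreement on the top ranks.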

Notes

  1. See Nelsen (2013, pp. 169–170) for the definition of the Spearman’s grade correlation coefficient for continuous random variables.
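For orientation, the standard textbook expressions for the quantities named in the abstract are given here, for a bivariate copula \(C\) of two continuous random variables. They are background only: the symbols \(C\), \(u\), \(v\), \(t\), \(\lambda_L\) and \(\lambda_U\) are notation assumed for this sketch, and this page does not show how the article's index combines them or which tail it associates with the top ranks.

```latex
% Spearman's grade correlation expressed through the copula C
\rho_C \;=\; 12 \int_0^1 \!\! \int_0^1 C(u,v)\, du\, dv \;-\; 3
       \;=\; 12 \int_0^1 \!\! \int_0^1 \bigl[\, C(u,v) - uv \,\bigr]\, du\, dv

% Lower and upper tail dependence coefficients (when the limits exist)
\lambda_L \;=\; \lim_{t \to 0^{+}} \frac{C(t,t)}{t},
\qquad
\lambda_U \;=\; \lim_{t \to 1^{-}} \frac{1 - 2t + C(t,t)}{1 - t}
```

A copula with \(\lambda_L > 0\) or \(\lambda_U > 0\) concentrates dependence in the corresponding corner of the unit square, which is the general mechanism by which agreement at one end of a ranking can be given more weight.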

References

  • Alvo, M., Yu, P.L.H.: Statistical Methods for Ranking Data. Springer, New York (2014)

  • Biernacki, C., Jacques, J.: A generative model for rank data based on insertion sort algorithm. Comput. Stat. Data Anal. 58, 162–176 (2013)

  • Bonanomi, A., Cantaluppi, G., Nai Ruscone, M., Osmetti, S.A.: A new estimator of Zumbo’s Ordinal Alpha: a copula approach. Qual. Quant. 49, 941–953 (2015)

  • Brentari, E., Dancelli, L., Manisera, M.: Clustering ranking data in market segmentation: a case study on the Italian McDonald’s customers’ preferences. J. Appl. Stat. 43(11), 1–18 (2016)

  • Critchlow, D.E., Fligner, M.A., Verducci, J.S.: Probability models on rankings. J. Math. Psychol. 35, 294–318 (1991)

  • Critchlow, D., Verducci, J.: Detecting a trend in paired rankings. Appl. Stat. 41, 17–29 (1992)

  • da Costa, P.J., Solares, C.: A weighted rank measure of correlation. Aust. N Z. J. Stat. 47, 515–529 (2005)

  • Dancelli, L., Manisera, M., Vezzoli, M.: Weighted Rank Correlation measures in hierarchical cluster analysis. Book of Short Papers JCS-CLADAG 2012. Anacapri (2012)

  • Dancelli, L., Manisera, M., Vezzoli, M.: On two classes of Weighted Rank Correlation measures deriving from the Spearman’s \(\rho\). In: Statistical Model and Data Analysis, pp. 107–114. Springer, New York (2013)

  • Diaconis, P.: Group Representations in Probability and Statistics. Lecture Notes-Monograph Series, pp. 1–192. Institute of Mathematical Statistics, Hayward (1988)

  • Emond, E., Mason, D.: A new rank correlation coefficient with application to the consensus ranking problem. J. Multi-Criteria Decis. Anal. 11, 17–28 (2002)

  • Feigin, P.D.: Modelling and analysing paired ranking data. In: Fligner, M.A., Verducci, J.S. (eds.) Probability Models and Statistical Analyses for Ranking Data, pp. 75–91. Springer, New York (1993)

  • Jacques, J., Biernacki, C.: Model-based clustering for multivariate partial ranking data. J. Stat. Plan. Inference 149, 201–217 (2014)

  • Joe, H.: Multivariate Models and Dependence Concepts. Chapman & Hall, Boca Raton (1997)

  • Kojadinovic, I.: Hierarchical clustering of continuous variables based on the empirical copula process and permutation linkages. Comput. Stat. Data Anal. 54, 90–108 (2010)

  • Lee, P.H., Yu, P.L.H.: Mixtures of weighted distance-based models for ranking data with applications in political studies. Comput. Stat. Data Anal. 56, 2486–2500 (2012)

  • Mallows, C.L.: Non-null ranking models. I. Biometrika 44, 114–130 (1957)

  • Nelsen, R.B.: An Introduction to Copulas. Springer, New York (2013)

  • Quade, D., Salama, I.: A survey of weighted rank correlation. In: Order Statistics and Nonparametrics: Theory and Applications, pp. 213–224. Elsevier, Amsterdam (1992)

  • Shieh, G.S.: A weighted Kendall’s tau statistic. Stat. Probab. Lett. 39, 17–24 (1998)

  • Sklar, A.W.: Fonctions de répartition à n dimension et leurs marges. Publ. Inst. Stat. Univ. Paris 8, 229–231 (1959)

  • Spearman, C.: The proof and measurement of association between two things. Am. J. Psychol. 15, 72–101 (1904)

  • Tarsitano, A.: Weighted rank correlation and hierarchical clustering. In: Book of Short Papers CLADAG. Parma, pp. 517–521 (2005)

  • Zani, S., Cerioli, A.: Analisi dei dati e data mining per le decisioni aziendali. Giuffrè Editore, Milano (2007)

Correspondence to Marta Nai Ruscone.

About this article

Cite this article

Bonanomi, A., Nai Ruscone, M. & Osmetti, S.A. Defining subjects distance in hierarchical cluster analysis by copula approach. Qual Quant 51, 859–872 (2017). https://doi.org/10.1007/s11135-016-0444-9

  • DOI: https://doi.org/10.1007/s11135-016-0444-9
