Abstract
In this paper, we consider a cluster analysis for complete rankings of N items that aims to identify typical groups of rank choices. The “K-means” procedure based on Lee distance is studied in detail, and several asymptotic results for large values of N are derived. An algorithm for approximating the normalizing constant in the clustering procedure is proposed, using some properties of Lee distance. To compare the clustering method based on Lee distance with those based on other distances on permutations, we apply the presented procedure to a data set obtained from the results of the American Psychological Association presidential election.
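The abstract does not spell out the distance, so the following is only a minimal sketch under the standard definition of Lee distance on permutations, d_L(π, σ) = Σ_i min(|π(i) − σ(i)|, N − |π(i) − σ(i)|). The function names `lee_distance`, `assign_to_centers` and `update_center` are hypothetical and chosen for illustration; this is not the authors' implementation, and it leaves out the normalizing-constant approximation entirely. It shows one assignment step and one brute-force center update of a K-means-style procedure on rankings.

```python
from itertools import permutations


def lee_distance(pi, sigma):
    """Lee distance between two rankings given as equal-length sequences of ranks 1..N."""
    n = len(pi)
    return sum(min(abs(a - b), n - abs(a - b)) for a, b in zip(pi, sigma))


def assign_to_centers(rankings, centers):
    """Assignment step: index of the nearest center for each ranking.

    Ties go to the center with the lowest index.
    """
    return [
        min(range(len(centers)), key=lambda k: lee_distance(r, centers[k]))
        for r in rankings
    ]


def update_center(cluster):
    """Update step: the permutation minimizing the total Lee distance to the cluster.

    Assumption: brute force over all N! permutations, so usable only for small N.
    """
    n = len(cluster[0])
    return min(
        permutations(range(1, n + 1)),
        key=lambda c: sum(lee_distance(r, c) for r in cluster),
    )


rankings = [(1, 2, 3, 4), (2, 1, 3, 4), (3, 4, 1, 2)]
centers = [(1, 2, 3, 4), (3, 4, 1, 2)]
labels = assign_to_centers(rankings, centers)
```

Alternating the two steps until the labels stop changing gives a K-means-type iteration. The exhaustive update is a deliberate simplification for readability; for realistic N a cheaper search for the center would be needed.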
Acknowledgements
The work of N.N. was supported by the National Science Fund of Bulgaria under Grant KP-06-N32/8. The work of E.S. was partially supported by Grant No. BG05M2OP001-1.001-0003, financed by the Science and Education for Smart Growth Operational Program (2014-2020) and co-financed by the European Union through the European Structural and Investment Funds.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nikolov, N.I., Stoimenova, E. (2021). Rank Data Clustering Based on Lee Distance. In: Georgiev, I., Kostadinov, H., Lilkova, E. (eds) Advanced Computing in Industrial Mathematics. BGSIAM 2018. Studies in Computational Intelligence, vol 961. Springer, Cham. https://doi.org/10.1007/978-3-030-71616-5_27
DOI: https://doi.org/10.1007/978-3-030-71616-5_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71615-8
Online ISBN: 978-3-030-71616-5
eBook Packages: Intelligent Technologies and Robotics (R0)