Rank Data Clustering Based on Lee Distance

Nikolov, Nikolay I.; Stoimenova, Eugenia

doi:10.1007/978-3-030-71616-5_27

Part of the book series: Studies in Computational Intelligence (SCI, volume 961)

Included in the following conference series:

Annual Meeting of the Bulgarian Section of SIAM

Abstract

In this paper, we consider a cluster analysis for complete rankings of N items that aims to identify typical groups of rank choices. The “K-means” procedure based on Lee distance is studied in detail, and several asymptotic results for large values of N are derived. An algorithm for approximating the normalizing constant in the clustering procedure is proposed, using some properties of Lee distance. To compare the clustering method based on Lee distance with those based on other distances on permutations, we apply the presented procedure to a data set obtained from the results of the American Psychological Association presidential election.
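
To make the distance named in the abstract concrete, here is a minimal Python sketch of the usual definition of Lee distance between two complete rankings of N items: the sum over items of min(|π(i) − σ(i)|, N − |π(i) − σ(i)|). It also shows the assignment step of a K-means-style procedure. The function names `lee_distance` and `assign_to_nearest` are illustrative assumptions and do not come from the chapter. The chapter's actual clustering procedure, including its normalizing constant, is not reproduced here.

```python
from typing import List, Sequence


def lee_distance(pi: Sequence[int], sigma: Sequence[int]) -> int:
    """Lee distance between two complete rankings of the same N items.

    Each ranking lists the rank (1..N) given to item i at position i.
    Rank differences are measured cyclically, so ranks 1 and N are neighbours.
    """
    n = len(pi)
    if len(sigma) != n:
        raise ValueError("rankings must have the same length")
    total = 0
    for a, b in zip(pi, sigma):
        diff = abs(a - b)
        total += min(diff, n - diff)
    return total


def assign_to_nearest(
    rankings: Sequence[Sequence[int]], centers: Sequence[Sequence[int]]
) -> List[int]:
    """Assignment step (hypothetical helper): index of the closest center
    for each ranking, with ties broken in favour of the lowest index."""
    return [
        min(range(len(centers)), key=lambda k: lee_distance(r, centers[k]))
        for r in rankings
    ]


if __name__ == "__main__":
    print(lee_distance([1, 2, 3, 4], [4, 3, 2, 1]))
    print(assign_to_nearest([[1, 2, 3, 4], [3, 4, 1, 2]],
                            [[1, 2, 3, 4], [3, 4, 1, 2]]))
```

Because differences wrap around, a ranking and its cyclic shift by one position are only N apart, which is one way Lee distance differs from the Spearman footrule.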

Acknowledgements

The work of N.N. was supported by the National Science Fund of Bulgaria under Grant KP-06-N32/8. The work of E.S. was partially supported by Grant No. BG05M2OP001-1.001-0003, financed by the Science and Education for Smart Growth Operational Program (2014-2020) and co-financed by the European Union through the European Structural and Investment Funds.

Author information

Authors and Affiliations

Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Acad. G. Bontchev Street, Block 8, 1113, Sofia, Bulgaria
Nikolay I. Nikolov & Eugenia Stoimenova

Corresponding author

Correspondence to Nikolay I. Nikolov.

Editor information

Editors and Affiliations

Institute of Information and Communication Technologies and Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Sofia, Bulgaria
Ivan Georgiev
Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Sofia, Bulgaria
Hristo Kostadinov
Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria
Elena Lilkova

About this paper

Cite this paper

Nikolov, N.I., Stoimenova, E. (2021). Rank Data Clustering Based on Lee Distance. In: Georgiev, I., Kostadinov, H., Lilkova, E. (eds) Advanced Computing in Industrial Mathematics. BGSIAM 2018. Studies in Computational Intelligence, vol 961. Springer, Cham. https://doi.org/10.1007/978-3-030-71616-5_27

DOI: https://doi.org/10.1007/978-3-030-71616-5_27
Published: 04 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71615-8
Online ISBN: 978-3-030-71616-5
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)