Understanding the Rand Index

Warrens, Matthijs J.; van der Hoef, Hanneke

doi:10.1007/978-981-15-3311-2_24

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

The Rand index continues to be one of the most popular indices for assessing agreement between two partitions. It combines two sources of information: object pairs placed in the same cluster in both partitions, and object pairs assigned to different clusters in both partitions. Via a decomposition of the Rand index into four asymmetric indices, we show that in many situations object pairs that were assigned to different clusters have considerable impact on the value of the overall Rand index.
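
The two sources of information named in the abstract can be made concrete with a small sketch. It counts object pairs by how the two partitions treat them and computes the standard Rand index as the share of pairs on which the partitions agree. The function names (`pair_counts`, `rand_index`) and the toy label vectors are illustrative assumptions, not taken from the chapter, and the sketch does not reproduce the chapter's decomposition into four asymmetric indices.

```python
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Count object pairs by how two partitions treat them.

    Returns (together_both, together_a_only, together_b_only, apart_both).
    Hypothetical helper for illustration only.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    together_both = together_a_only = together_b_only = apart_both = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            together_both += 1
        elif same_a:
            together_a_only += 1
        elif same_b:
            together_b_only += 1
        else:
            apart_both += 1
    return together_both, together_a_only, together_b_only, apart_both


def rand_index(labels_a, labels_b):
    """Rand index: fraction of object pairs on which the partitions agree."""
    a, b, c, d = pair_counts(labels_a, labels_b)
    total = a + b + c + d
    if total == 0:
        raise ValueError("need at least two objects to form a pair")
    return (a + d) / total


# Toy example (made-up labels): six objects, three clusters in each partition.
first = [0, 0, 0, 1, 1, 2]
second = [0, 0, 1, 1, 2, 2]
a, b, c, d = pair_counts(first, second)
print("pairs together in both:", a)
print("pairs apart in both:   ", d)
print("Rand index:            ", rand_index(first, second))
```

In this toy case most of the agreeing pairs are pairs kept apart by both partitions, which is the kind of situation the abstract refers to when it says such pairs can have considerable impact on the overall value.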

Author information

Authors and Affiliations

GION, University of Groningen, Groningen, TG, Netherlands
Matthijs J. Warrens
Faculty of Behavioral and Social Sciences, University of Groningen, Groningen, TS, Netherlands
Hanneke van der Hoef

Corresponding author

Correspondence to Matthijs J. Warrens.

Editor information

Editors and Affiliations

School of Management and Information Sciences, Tama University, Tokyo, Japan
Tadashi Imaizumi
Rikkyo University, Tokyo, Japan
Akinori Okada
University of Tsukuba, Tsukuba, Japan
Sadaaki Miyamoto
Department of Mathematics, Chuo University, Tokyo, Japan
Fumitake Sakaori
Department of Mathematics, Tokai University, Hiratsuka-shi, Japan
Yoshiro Yamamoto
Department of Statistical Sciences, Sapienza University of Rome, Roma, Italy
Maurizio Vichi

About this paper

Cite this paper

Warrens, M.J., van der Hoef, H. (2020). Understanding the Rand Index. In: Imaizumi, T., Okada, A., Miyamoto, S., Sakaori, F., Yamamoto, Y., Vichi, M. (eds) Advanced Studies in Classification and Data Science. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Singapore. https://doi.org/10.1007/978-981-15-3311-2_24

DOI: https://doi.org/10.1007/978-981-15-3311-2_24
Published: 26 September 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3310-5
Online ISBN: 978-981-15-3311-2
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
