Abstract
The Rand index remains one of the most popular indices for assessing agreement between two partitions. It combines two sources of information: object pairs placed in the same cluster in both partitions, and object pairs assigned to different clusters in both partitions. Via a decomposition of the Rand index into four asymmetric indices, we show that in many situations the object pairs assigned to different clusters have considerable impact on the value of the overall Rand index.
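The role of the two sources of information can be illustrated with a small sketch. The abstract does not spell out the four asymmetric indices, so the forms used below are an assumption: two indices condition on pairs placed together in one partition (Wallace-type proportions) and two condition on pairs placed apart in one partition. The function names, the labels `A` and `B`, and the toy data are hypothetical. With these forms, the Rand index is exactly the weighted average of the four indices, which the code checks numerically.

```python
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Classify every object pair by how the two partitions treat it.

    a: together in both, b: together in A only,
    c: together in B only, d: apart in both.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    if len(labels_a) < 2:
        raise ValueError("at least two objects are needed to form a pair")
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            a += 1
        elif same_a:
            b += 1
        elif same_b:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rand_index(a, b, c, d):
    """Proportion of object pairs on which the two partitions agree."""
    return (a + d) / (a + b + c + d)


def asymmetric_indices(a, b, c, d):
    """Return {name: (weight, value)} for four asymmetric indices.

    ASSUMPTION: these forms are illustrative, not quoted from the chapter.
    Each value is a conditional proportion; each weight is the size of the
    conditioning set divided by twice the number of pairs.
    """
    total = 2 * (a + b + c + d)
    terms = {
        "together, given A": (a + b, a),
        "together, given B": (a + c, a),
        "apart, given A": (c + d, d),
        "apart, given B": (b + d, d),
    }
    result = {}
    for name, (denominator, numerator) in terms.items():
        # An empty conditioning set gets weight 0, so its value is irrelevant.
        value = numerator / denominator if denominator else 0.0
        result[name] = (denominator / total, value)
    return result


# Toy data (made up for illustration): six objects, two partitions.
labels_a = [0, 0, 0, 1, 1, 1]
labels_b = [0, 0, 1, 1, 2, 2]

counts = pair_counts(labels_a, labels_b)
rand = rand_index(*counts)
parts = asymmetric_indices(*counts)
weighted_sum = sum(weight * value for weight, value in parts.values())

print("pair counts (a, b, c, d):", counts)
print("Rand index:", rand)
for name, (weight, value) in parts.items():
    print(f"{name}: weight={weight:.3f}, value={value:.3f}")
print("weighted sum of the four indices:", weighted_sum)
```

In this toy example only 2 of the 15 pairs are together in both partitions, while 8 are apart in both, so the two "apart" terms account for 80% of the Rand index value. This is the kind of dominance by pairs assigned to different clusters that the abstract describes.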
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Warrens, M.J., van der Hoef, H. (2020). Understanding the Rand Index. In: Imaizumi, T., Okada, A., Miyamoto, S., Sakaori, F., Yamamoto, Y., Vichi, M. (eds) Advanced Studies in Classification and Data Science. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Singapore. https://doi.org/10.1007/978-981-15-3311-2_24
DOI: https://doi.org/10.1007/978-981-15-3311-2_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3310-5
Online ISBN: 978-981-15-3311-2
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)