On Similarity Indices and Correction for Chance Agreement

Abstract

Similarity indices can be used to compare partitions (clusterings) of a data set. Many such indices have been introduced in the literature over the years. We show that, of the 28 indices we were able to track, 22 are distinct. Even though their values differ when the same clusterings are compared, after correction for agreement attributable to chance alone their values become similar, and some of the indices even become equivalent. Consequently, the choice of index for comparing different clusterings becomes less important.
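
As a rough illustration of correction for chance, the following Python sketch computes the Rand index for two labelings and then applies the usual correction, (index - expected) / (maximum - expected). The abstract does not say which indices or which chance model are meant, so everything here is an assumption made for illustration: the function names are hypothetical, the Rand index stands in for the indices studied, and the expectation is taken under a model with fixed cluster sizes (the model behind the Hubert-Arabie adjusted Rand index).

```python
from collections import Counter
from math import comb


def pair_counts(labels_a, labels_b):
    """Return (pairs together in both, all pairs, pairs together in A, pairs together in B)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    if len(labels_a) < 2:
        raise ValueError("need at least two objects to form a pair")
    joint = Counter(zip(labels_a, labels_b))
    same_both = sum(comb(c, 2) for c in joint.values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    return same_both, comb(len(labels_a), 2), pairs_a, pairs_b


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which the two partitions agree."""
    same_both, total, pairs_a, pairs_b = pair_counts(labels_a, labels_b)
    apart_both = total - pairs_a - pairs_b + same_both
    return (same_both + apart_both) / total


def correct_for_chance(value, expected, maximum=1.0):
    """Generic correction: 0 at the chance level, 1 at the maximum."""
    if maximum == expected:
        return 1.0  # assumption: treat the degenerate case as perfect agreement
    return (value - expected) / (maximum - expected)


def corrected_rand(labels_a, labels_b):
    """Rand index corrected for chance, assuming fixed cluster sizes."""
    same_both, total, pairs_a, pairs_b = pair_counts(labels_a, labels_b)
    # Expected number of pairs together in both partitions under the
    # fixed-cluster-size model (an assumption, not stated in the abstract).
    expected_same = pairs_a * pairs_b / total
    expected_rand = (total - pairs_a - pairs_b + 2 * expected_same) / total
    return correct_for_chance(rand_index(labels_a, labels_b), expected_rand)


def adjusted_rand_index(labels_a, labels_b):
    """Hubert-Arabie form, written directly in terms of pair counts."""
    same_both, total, pairs_a, pairs_b = pair_counts(labels_a, labels_b)
    expected_same = pairs_a * pairs_b / total
    return correct_for_chance(same_both, expected_same, (pairs_a + pairs_b) / 2)
```

For the labelings [0, 0, 0, 1, 1, 1] and [0, 0, 1, 1, 2, 2], the uncorrected Rand index is 10/15 (about 0.667), while both corrected forms give 8/33 (about 0.242). The two corrected functions agree because the Rand index is a linear function of the number of pairs placed together in both partitions once the cluster sizes are fixed, which is one simple way in which differently defined indices can coincide after correction.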

Author information

Corresponding authors

Correspondence to Ahmed N. Albatineh or Magdalena Niewiadomska-Bugaj or Daniel Mihalko.

About this article

Cite this article

Albatineh, A., Niewiadomska-Bugaj, M. & Mihalko, D. On Similarity Indices and Correction for Chance Agreement. Journal of Classification 23, 301–313 (2006). https://doi.org/10.1007/s00357-006-0017-z

Keywords

  • Cluster Size
  • Similarity Index
  • American Statistical Association
  • Chance Agreement
  • Similarity Table