Scientific and Statistical Database Management

Volume 6187 of the series Lecture Notes in Computer Science, pp. 482-500

Can Shared-Neighbor Distances Defeat the Curse of Dimensionality?

  • Michael E. Houle, National Institute of Informatics
  • Hans-Peter Kriegel, Ludwig-Maximilians-Universität München
  • Peer Kröger, Ludwig-Maximilians-Universität München
  • Erich Schubert, Ludwig-Maximilians-Universität München
  • Arthur Zimek, Ludwig-Maximilians-Universität München


The performance of similarity measures for search, indexing, and data mining applications tends to degrade rapidly as the dimensionality of the data increases. The effects of the so-called ‘curse of dimensionality’ have been studied by researchers for data sets generated according to a single data distribution. In this paper, we study the effects of this phenomenon on different similarity measures for multiply-distributed data. In particular, we assess the performance of shared-neighbor similarity measures, which are secondary similarity measures based on the rankings of data objects induced by some primary distance measure. We find that rank-based similarity measures can result in more stable performance than their associated primary distance measures.
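
As an illustration of what a shared-neighbor similarity looks like in practice, the following is a minimal sketch, not the authors' implementation. It assumes Euclidean distance as the primary measure, assumes a point is excluded from its own neighbor list, and uses a normalized overlap (shared count divided by k) as one possible secondary measure. The function names `knn_sets`, `snn_overlap`, and `snn_normalized` are hypothetical and chosen only for this example.

```python
import math


def euclidean(a, b):
    """Primary distance (an assumption here): Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_sets(points, k, dist=euclidean):
    """Return, for each point, the index set of its k nearest neighbors.

    Assumption: a point is not counted as its own neighbor.
    """
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Rank all other points by the primary distance to p.
        ranked = sorted(others, key=lambda j: dist(p, points[j]))
        neighbors.append(set(ranked[:k]))
    return neighbors


def snn_overlap(neighbors, i, j):
    """Secondary similarity: number of neighbors shared by points i and j."""
    return len(neighbors[i] & neighbors[j])


def snn_normalized(neighbors, i, j, k):
    """Shared-neighbor count scaled to [0, 1] by the neighborhood size k."""
    return snn_overlap(neighbors, i, j) / k


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points (made-up data).
    points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    nn = knn_sets(points, k=2)
    print(snn_overlap(nn, 0, 1), snn_overlap(nn, 0, 3))
```

Because the secondary measure depends only on the rankings induced by the primary distance, any distance function can be passed as `dist` without changing the rest of the sketch.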