Can Shared-Neighbor Distances Defeat the Curse of Dimensionality?

  • Michael E. Houle
  • Hans-Peter Kriegel
  • Peer Kröger
  • Erich Schubert
  • Arthur Zimek
Conference paper

DOI: 10.1007/978-3-642-13818-8_34

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6187)
Cite this paper as:
Houle M.E., Kriegel HP., Kröger P., Schubert E., Zimek A. (2010) Can Shared-Neighbor Distances Defeat the Curse of Dimensionality? In: Gertz M., Ludäscher B. (eds) Scientific and Statistical Database Management. SSDBM 2010. Lecture Notes in Computer Science, vol 6187. Springer, Berlin, Heidelberg

Abstract

The performance of similarity measures for search, indexing, and data mining applications tends to degrade rapidly as the dimensionality of the data increases. The effects of the so-called ‘curse of dimensionality’ have been studied by researchers for data sets generated according to a single data distribution. In this paper, we study the effects of this phenomenon on different similarity measures for multiply-distributed data. In particular, we assess the performance of shared-neighbor similarity measures, which are secondary similarity measures based on the rankings of data objects induced by some primary distance measure. We find that rank-based similarity measures can result in more stable performance than their associated primary distance measures.
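
To make the idea of a secondary, rank-based similarity concrete, the following is a minimal sketch rather than the authors' implementation. It assumes Euclidean distance as the primary measure, counts each point as a member of its own neighbor set, and normalizes the overlap by the neighborhood size k. The function names `knn_indices` and `snn_similarity` and the toy data are hypothetical, chosen only for illustration.

```python
import math


def knn_indices(points, i, k):
    """Return the indices of the k points closest to points[i].

    Assumption: Euclidean distance is the primary measure, and the query
    point itself (distance 0) is included in its own neighbor set.
    Ties are broken by index so the result is deterministic.
    """
    order = sorted(
        range(len(points)),
        key=lambda j: (math.dist(points[i], points[j]), j),
    )
    return set(order[:k])


def snn_similarity(points, i, j, k):
    """Shared-neighbor similarity: overlap of the two k-NN sets, divided by k."""
    shared = knn_indices(points, i, k) & knn_indices(points, j, k)
    return len(shared) / k


# Toy data: two well-separated groups of three points each.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
]

same_group = snn_similarity(points, 0, 1, k=3)   # both points in the first group
cross_group = snn_similarity(points, 0, 3, k=3)  # one point from each group
print(same_group, cross_group)
```

In this sketch the similarity depends only on the neighbor rankings, not on the magnitudes of the primary distances, which is the property the abstract refers to when it calls these measures rank-based.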

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Michael E. Houle (1)
  • Hans-Peter Kriegel (2)
  • Peer Kröger (2)
  • Erich Schubert (2)
  • Arthur Zimek (2)

  1. National Institute of Informatics, Tokyo, Japan
  2. Ludwig-Maximilians-Universität München, München, Germany
