The Hubness Phenomenon: Fact or Artifact?
- Thomas Low (Data and Knowledge Engineering Group, Otto-von-Guericke-University of Magdeburg)
- Christian Borgelt (European Centre for Soft Computing)
- Sebastian Stober (Data and Knowledge Engineering Group, Otto-von-Guericke-University of Magdeburg)
- Andreas Nürnberger (Data and Knowledge Engineering Group, Otto-von-Guericke-University of Magdeburg)
The hubness phenomenon, as it was recently described, consists in the observation that for increasing dimensionality of a data set the distribution of the number of times a data point occurs among the k nearest neighbors of other data points becomes increasingly skewed to the right. As a consequence, so-called hubs emerge, that is, data points that appear in the lists of the k nearest neighbors of other data points much more often than others. In this paper we challenge the hypothesis that the hubness phenomenon is an effect of the dimensionality of the data set and provide evidence that it is rather a boundary effect or, more generally, an effect of a density gradient. As such, it may be seen as an artifact that results from the process in which the data is generated that is used to demonstrate this phenomenon. We report experiments showing that the hubness phenomenon need not occur in high-dimensional data and can be made to occur in low-dimensional data.
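The quantity described in the abstract, the number of times a data point occurs among the k nearest neighbors of other data points, can be measured directly. The following is a minimal sketch, not the authors' experimental setup: the function names `k_occurrences` and `skewness`, the sample size, the value of k, and the use of uniformly distributed data in a unit hypercube are all assumptions made for illustration. Note that a hypercube sample has a boundary, which is exactly the kind of data-generating process the abstract argues may be responsible for the effect.

```python
import numpy as np

def k_occurrences(points, k):
    """Count how often each point appears among the k nearest neighbors of the others."""
    n = points.shape[0]
    # Squared Euclidean distances via the Gram-matrix identity.
    sq_norms = np.sum(points ** 2, axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    # Indices of the k smallest distances per row (order within the k is irrelevant).
    neighbors = np.argpartition(dists, k, axis=1)[:, :k]
    return np.bincount(neighbors.ravel(), minlength=n)

def skewness(values):
    """Standardized third moment; a positive value means skewed to the right."""
    centered = values - values.mean()
    std = centered.std()
    return float(np.mean(centered ** 3) / std ** 3) if std > 0 else 0.0

# Illustrative comparison (assumed parameters): same sample size, low vs. high dimension.
rng = np.random.default_rng(0)
for dim in (2, 50):
    data = rng.uniform(size=(500, dim))
    counts = k_occurrences(data, k=5)
    print(f"dim={dim}: max k-occurrence={counts.max()}, skewness={skewness(counts):.2f}")
```

Every point contributes exactly k entries to the neighbor lists, so the counts always sum to n·k; only their distribution across points changes. Swapping the sampling line for data without a boundary or density gradient would be one way to probe the abstract's claim.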
- Book Title: Towards Advanced Data Analysis by Combining Soft Computing and Statistics
- Pages: 267-278
- Series Title: Studies in Fuzziness and Soft Computing
- Publisher: Springer Berlin Heidelberg
- Copyright Holder: Springer-Verlag GmbH Berlin Heidelberg
- Editor Affiliations
- 1. Intelligent Data Analysis & Graphical Models Research Unit, European Centre for Soft Computing
- 2. Departamento de Estadistica e I. O. y, Universidad de Oviedo
- 3. Instituto Superior Técnico, Department of Mechanical Engineering, Technical University Lisbon
- 4. Labo. Microelectronique, Université Catholique de Louvain
- Author Affiliations
- 1. Data and Knowledge Engineering Group, Otto-von-Guericke-University of Magdeburg, Universitätsplatz 2, D-39106, Magdeburg, Germany
- 2. European Centre for Soft Computing, c/ Gonzalo Gutiérrez Quirós s/n, E-33600, Mieres, Asturias, Spain