Providing Diversity in K-Nearest Neighbor Query Results

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3056)

Abstract

Given a point query Q in multi-dimensional space, K-Nearest Neighbor (KNN) queries return the K closest answers in the database with respect to Q. In this scenario, it is possible that a majority of the answers may be very similar to one or more of the other answers, especially when the data has clusters. For a variety of applications, such homogeneous result sets may not add value to the user. In this paper, we consider the problem of providing diversity in the results of KNN queries, that is, to produce the closest result set such that each answer is sufficiently different from the rest. We first propose a user-tunable definition of diversity, and then present an algorithm, called MOTLEY, for producing a diverse result set as per this definition. Through a detailed experimental evaluation we show that MOTLEY can produce diverse result sets by reading only a small fraction of the tuples in the database. Further, it imposes no additional overhead on the evaluation of traditional KNN queries, thereby providing a seamless interface between diversity and distance.
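To make the problem statement concrete, here is a minimal sketch of a naive greedy baseline for diverse KNN. It is not MOTLEY, whose details the abstract does not give. It assumes Euclidean distance both for closeness to Q and for dissimilarity between answers, and it uses a hypothetical user-supplied threshold `min_div` in place of the paper's tunable diversity definition. All names are illustrative.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_diverse_knn(query: Point, points: List[Point], k: int,
                       min_div: float) -> List[Point]:
    """Naive greedy baseline (an assumption, not the paper's algorithm).

    Visits points in increasing distance from `query` and keeps a point
    only if it lies at least `min_div` away from every point already kept.
    Stops once `k` points have been collected.
    """
    ranked = sorted(points, key=lambda p: euclidean(query, p))
    result: List[Point] = []
    for p in ranked:
        if all(euclidean(p, r) >= min_div for r in result):
            result.append(p)
            if len(result) == k:
                break
    return result


# Hypothetical data: three near-duplicates close to the query, two outliers.
data = [(1.0, 0.0), (1.1, 0.0), (1.2, 0.0), (0.0, 3.0), (5.0, 5.0)]
plain = greedy_diverse_knn((0.0, 0.0), data, k=2, min_div=0.0)
diverse = greedy_diverse_knn((0.0, 0.0), data, k=2, min_div=1.0)
```

In this sketch, setting `min_div` to 0 reduces the procedure to an ordinary KNN query, while a larger threshold skips near-duplicate answers in favor of more distant but dissimilar ones. The sketch sorts the entire data set, so it says nothing about the index-based efficiency the abstract reports for MOTLEY.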



Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jain, A., Sarda, P., Haritsa, J.R. (2004). Providing Diversity in K-Nearest Neighbor Query Results. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_49

  • DOI: https://doi.org/10.1007/978-3-540-24775-3_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22064-0

  • Online ISBN: 978-3-540-24775-3

  • eBook Packages: Springer Book Archive
