Using the k-Nearest Neighbor Graph for Proximity Searching in Metric Spaces

  • Rodrigo Paredes
  • Edgar Chávez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3772)

Abstract

Proximity searching consists of retrieving from a database the objects that are close to a query. The most general model for this type of search problem is the metric space, where proximity is defined in terms of a distance function. One solution is to build an index offline so that online queries can be answered quickly. The ultimate goal is to answer queries with as few distance computations as possible, since the distance is considered expensive to compute. Proximity searching is central to several applications, ranging from multimedia indexing and querying to data compression and clustering.
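As a point of reference for the cost model described above, the following is a minimal sketch of the index-free baseline: a linear scan that answers range and nearest neighbor queries while counting distance evaluations. It is not taken from the paper; the names `CountingDistance`, `range_query`, `nn_query`, and the Euclidean example distance are illustrative assumptions.

```python
from typing import Callable, List, Sequence, Tuple


class CountingDistance:
    """Wraps a distance function and counts how often it is evaluated."""

    def __init__(self, dist: Callable[[object, object], float]) -> None:
        self.dist = dist
        self.calls = 0

    def __call__(self, a: object, b: object) -> float:
        self.calls += 1
        return self.dist(a, b)


def range_query(db: Sequence, q: object, r: float, d: Callable) -> List:
    """Return every object within distance r of q (one evaluation per object)."""
    return [u for u in db if d(u, q) <= r]


def nn_query(db: Sequence, q: object, k: int, d: Callable) -> List[Tuple[float, object]]:
    """Return the k objects closest to q as (distance, object) pairs."""
    # Sort on (distance, position) so ties never compare the objects themselves.
    scored = sorted((d(u, q), i) for i, u in enumerate(db))
    return [(dist, db[i]) for dist, i in scored[:k]]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Example metric (an assumption); any metric distance would work here."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


db = [(0, 0), (1, 0), (0, 2), (5, 5)]
d = CountingDistance(euclidean)
in_range = range_query(db, (0, 0), 1.0, d)
nearest = nn_query(db, (0, 0), 2, d)
```

Each query on this baseline costs exactly one distance evaluation per database object, which is the figure an index aims to reduce.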

In this paper we present a new approach to solve the proximity searching problem. Our solution is based on indexing the database with the k-nearest neighbor graph (knng), which is a directed graph connecting each element to its k closest neighbors.
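To make the definition concrete, here is a small brute-force sketch of a knng stored as adjacency lists. It only illustrates the data structure and is not the construction method used by the authors; `build_knng` and the one-dimensional example distance are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence


def build_knng(db: Sequence, k: int, d: Callable[[object, object], float]) -> Dict[int, List[int]]:
    """Map each object's index to the indices of its k closest neighbors.

    Brute force: evaluates the distance for every ordered pair of objects,
    so it is only suitable for small examples.
    """
    n = len(db)
    graph: Dict[int, List[int]] = {}
    for i in range(n):
        # Sort candidates on (distance, index); ties are broken by index.
        candidates = sorted((d(db[i], db[j]), j) for j in range(n) if j != i)
        graph[i] = [j for _, j in candidates[:k]]
    return graph


def abs_diff(a: float, b: float) -> float:
    """Example metric on the real line (an assumption for this sketch)."""
    return abs(a - b)


points = [0, 1, 3, 7]
knng = build_knng(points, 2, abs_diff)
```

In this example the edges are not symmetric: the point 7 lists 3 among its two nearest neighbors, but 3 does not list 7, which is why the graph is directed.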

We present two search algorithms for both range and nearest neighbor queries, which use navigational and metric features of the knng. We show that our approach is competitive with current ones. For instance, in the document metric space our nearest neighbor search algorithms perform 30% more distance evaluations than AESA while using only 0.25% of its space requirement. In that same metric space, the pivot-based technique is completely useless.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Rodrigo Paredes (1)
  • Edgar Chávez (2)
  1. Center for Web Research, Dept. of Computer Science, University of Chile, Santiago, Chile
  2. Escuela de Ciencias Físico-Matemáticas, Univ. Michoacana, Morelia, Mich., México
