Fast Filtering for Nearest Neighbor Search by Sketch Enumeration Without Using Matching

  • Naoya Higuchi (email author)
  • Yasunobu Imamura
  • Tetsuji Kuboyama
  • Kouichi Hirata
  • Takeshi Shinohara
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11919)

Abstract

A sketch is a lossy compression of high-dimensional data into a compact bit string, such as a locality-sensitive hash. In general, k-nearest-neighbor search using sketches consists of two stages. The first stage narrows the database down to the top K candidates, for some \(K \ge k\), using a priority measure on sketches as a filter. The second stage selects the k nearest objects from the K candidates. In this paper, we discuss search algorithms that use fast filtering by sketch enumeration without using matching. Surprisingly, the proposed method improves search performance when narrow sketches, with fewer bits than conventional ones (for example, 16 bits), are used. Furthermore, we compare the search efficiency of sketches of various widths on several databases that differ in the number of objects and in dimensionality. We observe that wider sketches are appropriate for larger databases, while narrower sketches are appropriate for higher-dimensional data.
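The two-stage search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes ball-partitioning sketches, Hamming distance as the priority measure, and a hash table from sketch to object indices. The names `make_sketch`, `build_buckets`, `enumerate_sketches`, and `knn_search` are hypothetical. Instead of comparing the query sketch against every stored sketch, the first stage enumerates sketches in order of increasing Hamming distance from the query sketch and collects the objects stored under each one until K candidates are found.

```python
from collections import defaultdict
from itertools import combinations


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def make_sketch(point, pivots):
    """Ball partitioning (assumed form): bit i is 1 if the point lies outside ball i."""
    bits = 0
    for i, (center, radius) in enumerate(pivots):
        if dist(point, center) > radius:
            bits |= 1 << i
    return bits


def build_buckets(data, pivots):
    """Map each sketch value to the indices of the objects that have it."""
    buckets = defaultdict(list)
    for idx, point in enumerate(data):
        buckets[make_sketch(point, pivots)].append(idx)
    return buckets


def enumerate_sketches(query_sketch, width):
    """Yield every width-bit sketch in nondecreasing Hamming distance from the query sketch."""
    for d in range(width + 1):
        for positions in combinations(range(width), d):
            mask = 0
            for pos in positions:
                mask |= 1 << pos
            yield query_sketch ^ mask


def knn_search(query, data, pivots, buckets, k, K):
    """Stage 1: gather at least K candidates by enumeration. Stage 2: exact top k."""
    candidates = []
    for s in enumerate_sketches(make_sketch(query, pivots), len(pivots)):
        candidates.extend(buckets.get(s, ()))
        if len(candidates) >= K:
            break
    candidates.sort(key=lambda i: dist(query, data[i]))
    return candidates[:k]
```

The enumeration visits up to 2^w sketches for width w, which is one reason this style of filtering is only plausible for narrow sketches. When K is smaller than the database size the result is approximate, because a true neighbor whose sketch is far from the query sketch may never be collected.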

Keywords

Similarity search, Nearest neighbor search, Sketch enumeration, Ball partitioning, Hamming distance, Dimension reduction

Notes

Acknowledgments

This work was partially supported by JSPS KAKENHI Grant Numbers 16H02870, 17H00762, 16H01743, 17H01788, and 18K11443.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Naoya Higuchi (1), email author
  • Yasunobu Imamura (2)
  • Tetsuji Kuboyama (3)
  • Kouichi Hirata (1)
  • Takeshi Shinohara (1)

  1. Kyushu Institute of Technology, Iizuka, Japan
  2. System Studio COLUN, Kurume, Japan
  3. Gakushuin University, Toshima, Japan