Towards a Learned Index Structure for Approximate Nearest Neighbor Search Query Processing

  • Conference paper
Similarity Search and Applications (SISAP 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13058)

Abstract

In this short paper, we outline the idea of applying the concept of a learned index structure to approximate nearest neighbor query processing. We discuss different data partitioning approaches and show how the task of identifying the disc pages of potential hits for a given query can be solved by a predictive machine learning model. In a preliminary experimental case study, we evaluate and discuss the general applicability of different partitioning approaches as well as of different predictive models.
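
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes k-means as the partitioning approach, treats each cluster as one disc page, and uses a plain linear softmax classifier as the predictive model. All function names, parameters, and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random

def make_blobs(centers, n_per_center, spread, rng):
    """Draw Gaussian points around each center (synthetic stand-in data)."""
    return [(cx + rng.gauss(0, spread), cy + rng.gauss(0, spread))
            for cx, cy in centers for _ in range(n_per_center)]

def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-point seeding; returns one label per point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    return labels

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_softmax(points, labels, k, epochs=100, lr=0.05):
    """Fit a linear softmax classifier mapping a point to its partition id."""
    weights = [[0.0, 0.0, 0.0] for _ in range(k)]  # per class: w_x, w_y, bias
    for _ in range(epochs):
        for (x, y), label in zip(points, labels):
            probs = softmax([w[0] * x + w[1] * y + w[2] for w in weights])
            for j, w in enumerate(weights):
                grad = probs[j] - (1.0 if j == label else 0.0)
                w[0] -= lr * grad * x
                w[1] -= lr * grad * y
                w[2] -= lr * grad
    return weights

def predict_partitions(weights, query, n_probe=1):
    """Return the ids of the n_probe partitions the model ranks highest."""
    probs = softmax([w[0] * query[0] + w[1] * query[1] + w[2] for w in weights])
    return sorted(range(len(weights)), key=lambda j: -probs[j])[:n_probe]

def approx_nn(query, points, labels, weights, n_probe=1):
    """Scan only the points stored in the predicted partitions."""
    probe = set(predict_partitions(weights, query, n_probe))
    candidates = [p for p, l in zip(points, labels) if l in probe]
    return min(candidates, key=lambda p: sq_dist(p, query))

# Assumed toy setup: three well-separated 2-D blobs, one partition per blob.
rng = random.Random(0)
data = make_blobs([(0, 0), (10, 0), (0, 10)], 50, 0.5, rng)
labels = kmeans(data, k=3)
model = train_softmax(data, labels, k=3)

query = (9.5, 0.3)
approx = approx_nn(query, data, labels, model)          # scans one partition
exact = min(data, key=lambda p: sq_dist(p, query))      # scans everything
print(approx == exact)
```

In this sketch, `n_probe` controls how many predicted partitions are scanned: a larger value trades more page reads for a lower chance of missing the true nearest neighbor.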

Notes

  1. https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html

  2. https://github.com/huenemoerder/kmean-lis.git

  3. https://pytorch.org/

  4. https://github.com/huenemoerder/kmean-lis.git

Author information

Corresponding author

Correspondence to Maximilian Hünemörder.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Hünemörder, M., Kröger, P., Renz, M. (2021). Towards a Learned Index Structure for Approximate Nearest Neighbor Search Query Processing. In: Reyes, N., et al. Similarity Search and Applications. SISAP 2021. Lecture Notes in Computer Science, vol 13058. Springer, Cham. https://doi.org/10.1007/978-3-030-89657-7_8

  • DOI: https://doi.org/10.1007/978-3-030-89657-7_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89656-0

  • Online ISBN: 978-3-030-89657-7

  • eBook Packages: Computer Science, Computer Science (R0)
