Abstract
In this short paper, we outline the idea of applying the concept of a learned index structure to approximate nearest neighbor query processing. We discuss different data partitioning approaches and show how the task of identifying the disk pages of potential hits for a given query can be solved by a predictive machine learning model. In a preliminary experimental case study, we evaluate and discuss the general applicability of different partitioning approaches as well as of different predictive models.
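The idea outlined in the abstract can be illustrated with a minimal sketch, under several assumptions that do not come from the paper: the data is partitioned with a plain k-means, each partition stands in for one disk page (here just an in-memory list), and the "predictive model" is a simple nearest-centroid ranking. All names (`kmeans`, `predict_pages`, `ann_query`, `n_probe`) are hypothetical, and the partitioning approaches and models actually evaluated by the authors may differ.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10, seed=0):
    """Partition points into k 'pages' with a basic k-means (assumed partitioning)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    pages = [[] for _ in range(k)]
    for _ in range(iters):
        pages = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the page of its closest centroid.
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            pages[j].append(p)
        for j, page in enumerate(pages):
            if page:  # keep the old centroid if a page ended up empty
                centroids[j] = tuple(sum(col) / len(page) for col in zip(*page))
    return centroids, pages

def predict_pages(query, centroids, n_probe):
    """Stand-in predictive model: rank pages by centroid distance, keep the top n_probe."""
    ranked = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    return ranked[:n_probe]

def ann_query(query, centroids, pages, n_probe=1):
    """Scan only the predicted pages and return the closest candidate (or None)."""
    candidates = [p for j in predict_pages(query, centroids, n_probe) for p in pages[j]]
    return min(candidates, key=lambda p: sq_dist(query, p)) if candidates else None

# Toy data: 200 random 2-D points split into 8 pages.
rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(200)]
centroids, pages = kmeans(points, k=8)

query = (0.5, 0.5)
approx = ann_query(query, centroids, pages, n_probe=2)   # reads 2 of 8 pages
exact = min(points, key=lambda p: sq_dist(query, p))     # full scan for comparison
```

In this sketch, `n_probe` controls the trade-off between the number of pages read and the quality of the answer. Probing all pages degenerates to an exact scan, while probing fewer pages gives an approximate result whose accuracy depends on how well the model predicts the relevant pages.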
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Hünemörder, M., Kröger, P., Renz, M. (2021). Towards a Learned Index Structure for Approximate Nearest Neighbor Search Query Processing. In: Reyes, N., et al. Similarity Search and Applications. SISAP 2021. Lecture Notes in Computer Science, vol 13058. Springer, Cham. https://doi.org/10.1007/978-3-030-89657-7_8
DOI: https://doi.org/10.1007/978-3-030-89657-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89656-0
Online ISBN: 978-3-030-89657-7
eBook Packages: Computer Science, Computer Science (R0)