Abstract
Approximate nearest neighbor (ANN) search enables similarity search over massive collections of vectors with reduced memory and computation. Optimized Product Quantization (OPQ) is one of the state-of-the-art methods for ANN, in which data vectors are represented as combinations of codewords chosen to fit the data distribution. However, its accuracy degrades when the database is frequently updated with incoming data whose distribution differs from that of the original data. An existing method, Online OPQ, addresses this problem, but its computational cost is high because it must perform costly singular value decompositions to update the codewords. To address this problem, we propose a method for updating the rotation matrix using SVD-Updating, which dynamically updates the singular matrices via low-rank approximation. With SVD-Updating, instead of performing multiple singular value decompositions on a high-rank matrix, we update the rotation matrix with a single singular value decomposition of a low-rank matrix. Our experiments show that the proposed method achieves a better trade-off between update time and retrieval accuracy than the comparative methods.
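The core idea — folding new data into an existing truncated SVD by decomposing only a small core matrix, rather than re-decomposing the full data matrix — can be sketched as follows. This is a minimal illustration of the generic SVD-Updating row-append step, not the paper's exact procedure; the function name and the choice to project new rows onto the current right singular subspace (discarding the orthogonal residual, which is the low-rank approximation) are our assumptions for the sketch.

```python
import numpy as np

def svd_update(U, s, Vt, B, k):
    """Fold new rows B into an existing rank-k SVD  A ~= U @ diag(s) @ Vt.

    Rather than re-decomposing the full stacked matrix [A; B], only the
    small (k + t) x k core matrix [diag(s); B @ V] is decomposed once,
    which is the essence of SVD-Updating. The component of B orthogonal
    to the columns of V is dropped (the low-rank approximation).
    """
    t = B.shape[0]
    V = Vt.T
    # Small core matrix whose SVD yields the updated factors.
    K = np.vstack([np.diag(s), B @ V])                  # shape (k + t, k)
    Uk, sk, Vkt = np.linalg.svd(K, full_matrices=False)
    # Lift the small left factor back to the stacked (m + t)-row space.
    L = np.block([[U, np.zeros((U.shape[0], t))],
                  [np.zeros((t, U.shape[1])), np.eye(t)]])
    U_new = L @ Uk
    V_new = V @ Vkt.T
    # Truncate back to rank k.
    return U_new[:, :k], sk[:k], V_new[:, :k].T
```

When the retained rank equals the full column dimension, the update reproduces the SVD of the stacked matrix exactly; with a smaller rank, accuracy is traded for the cost of one small decomposition per batch of updates.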
Acknowledgment
This work was partly supported by the Project Commissioned by New Energy and Industrial Technology Development Organization (JPNP20006).
© 2021 Springer Nature Switzerland AG
Cite this paper
Yukawa, K., Amagasa, T. (2021). Online Optimized Product Quantization for Dynamic Database Using SVD-Updating. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science(), vol 12923. Springer, Cham. https://doi.org/10.1007/978-3-030-86472-9_25
Print ISBN: 978-3-030-86471-2
Online ISBN: 978-3-030-86472-9