CCCV 2017: Computer Vision, pp 99-110
Massively Parallel Image Index for Vocabulary Tree Based Image Retrieval
Abstract
Although vocabulary tree based algorithms are highly efficient for image retrieval, they still struggle with large datasets. In this paper, we show that image indexing is the main bottleneck of vocabulary tree based image retrieval, and we propose a way to exploit GPU hardware and the CUDA parallel programming model to solve the image index phase efficiently and, in turn, accelerate the remaining retrieval stage. Our main contributions are tree structure transformation, image package processing, and task parallelism. Our GPU-based image index is up to around thirty times faster than the original method, and the whole GPU-based vocabulary tree algorithm improves in speed by twenty percent.
Keywords
Large-scale image retrieval; Vocabulary tree; Image index; GPU-based model
Notes
Acknowledgments
The authors would like to acknowledge Henrik Stewénius, David Nistér, K. Mikolajczyk, and Noah Snavely et al. for making their datasets and source code publicly available. This work is supported by the National Natural Science Foundation of China (Grants 61772213 and 61371140) and the Special Fund CZY17011 for Basic Scientific Research of Central Colleges, South-Central University for Nationalities, and in part by Grants 2015CFA062, 2015BAA133 and 2017010201010121.