Fast Visual Vocabulary Construction for Image Retrieval Using Skewed-Split k-d Trees
Most current image retrieval approaches are based on the Bag-of-Words (BoW) model, which represents an image efficiently and quickly. The efficiency of the BoW model depends on the efficiency of the visual vocabulary. In general, visual vocabularies are created by clustering all available visual features into specific patterns. The clustering techniques are k-means oriented, and for very large datasets they are replaced by approximate k-means methods. In this work, we propose a faster construction of visual vocabularies than the existing method for SIFT descriptors, based on our observation that the values of the 128-dimensional SIFT descriptors follow an exponential distribution. Applying our method to image retrieval on specific image datasets showed that the mean Average Precision is not reduced by our approximation, even though the visual vocabulary is constructed significantly faster than with state-of-the-art methods.
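As a rough illustration of the baseline pipeline described above (cluster the descriptors into visual words, then represent each image as a histogram over those words), a minimal sketch follows. It uses plain exact k-means on toy 2-dimensional vectors, not the approximate k-means or the skewed-split k-d trees this work proposes, and real SIFT descriptors would be 128-dimensional. The function names `build_vocabulary`, `nearest_word`, and `bow_histogram` are hypothetical and chosen only for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word (centroid) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def build_vocabulary(descriptors, k, iterations=10, seed=0):
    """Exact k-means over all descriptors (illustrative baseline only)."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct descriptors.
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        # Assignment step: group descriptors by their nearest centroid.
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, centroids)].append(d)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[i] = [sum(m[j] for m in members) / len(members)
                                for j in range(dim)]
    return centroids


def bow_histogram(image_descriptors, vocabulary):
    """Count how many of an image's descriptors fall on each visual word."""
    histogram = [0] * len(vocabulary)
    for d in image_descriptors:
        histogram[nearest_word(d, vocabulary)] += 1
    return histogram


# Toy data: two well-separated groups of 2-D "descriptors".
training = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
vocabulary = build_vocabulary(training, k=2)
print(bow_histogram([[0, 0], [10, 10], [11, 11]], vocabulary))
```

The assignment step is the costly part: every descriptor is compared against every centroid, which is the nearest-neighbour search that approximate k-means methods replace with a faster lookup on very large datasets.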
This work was supported by the projects MULTISENSOR (FP7-610411) and KRISTINA (H2020-645012), funded by the European Commission.