Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Indexing

Kastrinakis, Dimitris; Papadopoulos, Symeon; Vakali, Athena

doi:10.1007/978-3-642-40683-6_8

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8133)

Included in the following conference series:

East European Conference on Advances in Databases and Information Systems

Abstract

Multimedia data indexing for content-based retrieval has attracted significant attention in recent years due to the commoditization of multimedia capturing equipment and the widespread adoption of social networking platforms as a means of sharing media content online. Because people produce and share very large amounts of multimedia content online, notably images, multimedia indexing approaches must be efficient in terms of both computation and memory usage. A common approach to supporting query-by-example image search is to extract visual words from images and index them by means of inverted indices, a method proposed and popularized in the field of text retrieval.
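As an illustration of the inverted-index idea mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each image has already been reduced to a list of integer visual word ids; the function names, image ids and word ids are all hypothetical.

```python
from collections import Counter, defaultdict


def build_inverted_index(image_words):
    """Map each visual word id to the set of images that contain it."""
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for word in words:
            index[word].add(image_id)
    return index


def query(index, query_words):
    """Rank images by the number of distinct query words they share."""
    votes = Counter()
    for word in set(query_words):
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    return votes.most_common()


# Toy data (hypothetical): image id -> visual word ids.
image_words = {
    "img_a": [3, 7, 7, 12],
    "img_b": [7, 19],
    "img_c": [1, 2],
}
index = build_inverted_index(image_words)
ranking = query(index, [7, 12, 40])
print(ranking)
```

Only the posting lists of the words present in the query are visited, which is what makes the lookup cheap compared with scanning every image. Real systems typically weight the votes (for example with tf-idf) rather than counting shared words.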

The main challenge that visual word indexing systems currently face is that very large visual vocabularies (hundreds of thousands or even millions of words) are needed to support sufficiently precise search. However, when the visual vocabulary is large, the image indexing process becomes computationally expensive because every local image descriptor (e.g. SIFT) must be quantized to its nearest visual word.
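The cost described above can be made concrete with a small sketch of exhaustive nearest-word quantization. This is an assumed baseline for illustration only: the two-dimensional vocabulary and descriptors are toy values (real SIFT descriptors have 128 dimensions), and practical systems usually replace the linear scan with an approximate nearest-neighbour structure.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(descriptor, vocabulary):
    """Return the index of the nearest visual word using a linear scan."""
    return min(
        range(len(vocabulary)),
        key=lambda i: squared_distance(descriptor, vocabulary[i]),
    )


# Toy vocabulary of 4 words and 3 descriptors (hypothetical values).
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
descriptors = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.9)]

words = [quantize(d, vocabulary) for d in descriptors]

# A linear scan evaluates one distance per (descriptor, word) pair.
distance_evaluations = len(descriptors) * len(vocabulary)
print(words, distance_evaluations)
```

The number of distance evaluations grows linearly with the vocabulary size, so moving from a few thousand words to a million multiplies the per-descriptor work accordingly.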

To this end, this paper proposes a novel method that significantly decreases the time required for the above quantization process. Instead of using hundreds of thousands of visual words for quantization, the proposed method preserves retrieval quality while using a much smaller number of words for indexing. This is achieved through composite words, i.e. assigning multiple words to a local descriptor in ascending order of distance. We evaluate the proposed method on the Oxford and Paris buildings datasets to demonstrate its validity.
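The abstract does not give the details of how composite words are built, so the following is only one possible reading: a composite word is taken to be the ordered tuple of the m nearest base words. The function name, the parameter m and the toy data are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def composite_word(descriptor, vocabulary, m=2):
    """Return the ids of the m nearest words, in ascending order of distance.

    Assumption: the ordered tuple itself serves as the index key.
    """
    ranked = sorted(
        range(len(vocabulary)),
        key=lambda i: squared_distance(descriptor, vocabulary[i]),
    )
    return tuple(ranked[:m])


# Toy vocabulary of 4 base words (hypothetical values).
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

a = composite_word((0.1, 0.3), vocabulary)
b = composite_word((0.3, 0.1), vocabulary)
print(a, b)
```

Both descriptors share the same nearest base word, yet their composite words differ because the second-nearest word differs. Under this reading, k base words can yield up to k(k-1) ordered pairs, which suggests how a compact vocabulary might remain distinctive while each descriptor is compared against far fewer words.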

Author information

Authors and Affiliations

Department of Informatics, Aristotle University, 54124, Thessaloniki, Greece
Dimitris Kastrinakis & Athena Vakali
Information Technologies Institute, CERTH-ITI, 57001, Thessaloniki, Greece
Symeon Papadopoulos

Editor information

Editors and Affiliations

Università di Genova, Italy
Barbara Catania
DIBRIS, Università di Genova, Italy
Giovanna Guerrini
Department of Software Engineering Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, 11800, Prague 1, Czech Republic
Jaroslav Pokorný

About this paper

Cite this paper

Kastrinakis, D., Papadopoulos, S., Vakali, A. (2013). Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Indexing. In: Catania, B., Guerrini, G., Pokorný, J. (eds) Advances in Databases and Information Systems. ADBIS 2013. Lecture Notes in Computer Science, vol 8133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40683-6_8

DOI: https://doi.org/10.1007/978-3-642-40683-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40682-9
Online ISBN: 978-3-642-40683-6
eBook Packages: Computer Science, Computer Science (R0)
