Accelerating Visual Categorization with the GPU

van de Sande, Koen E. A.; Gevers, Theo; Snoek, Cees G. M.

doi:10.1007/978-3-642-35740-4_34

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6554)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Visual categorization is important for managing large collections of digital images and video, where textual metadata is often incomplete or simply unavailable. The bag-of-words model has become the most powerful method for visual categorization of images and video. Despite its high accuracy, a severe drawback of this model is its high computational cost. Because newer CPU and GPU architectures gain computational power mainly by increasing their level of parallelism, exploiting this parallelism becomes an important direction for handling the computational cost of the bag-of-words approach. In this paper, we analyze the bag-of-words model for visual categorization in terms of computational cost and identify two major bottlenecks: the quantization step and the classification step. We address these two bottlenecks by proposing two efficient algorithms, one for quantization and one for classification, that exploit the GPU hardware and the CUDA parallel programming model. The algorithms are designed to keep categorization accuracy intact and give the same numerical results.

Experiments on large-scale datasets show that, with a parallel implementation on the GPU, quantization is 28 times faster and classification is 35 times faster than a single-threaded CPU version, while giving the exact same numerical results. The GPU accelerations are applicable to both the learning phase and the testing phase of visual categorization systems. For software, visit http://www.colordescriptors.com/.
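To make the quantization step named in the abstract concrete, the following is a minimal single-threaded sketch in plain Python. It is not the authors' GPU algorithm, which the abstract does not describe. It assumes a brute-force nearest-neighbor assignment of each descriptor to a codebook entry, and it uses the identity ||d - c||^2 = ||d||^2 + ||c||^2 - 2(d · c), a common way to turn the distance computation into dot products. The names `quantize`, `bag_of_words`, `descriptors`, and `codebook` are hypothetical and chosen only for illustration.

```python
# Sketch only: brute-force vector quantization for a bag-of-words histogram.
# All function and variable names are illustrative assumptions, not taken
# from the paper or its software.

def squared_norms(vectors):
    """Return the squared Euclidean norm of each vector."""
    return [sum(x * x for x in v) for v in vectors]


def quantize(descriptors, codebook):
    """Assign each descriptor to the index of its nearest codebook entry.

    Uses ||d - c||^2 = ||d||^2 + ||c||^2 - 2 * (d . c). The term ||d||^2 is
    the same for every codebook entry, so it is dropped from the comparison.
    """
    code_norms = squared_norms(codebook)
    assignments = []
    for d in descriptors:
        best_index, best_score = 0, float("inf")
        for j, c in enumerate(codebook):
            dot = sum(x * y for x, y in zip(d, c))
            score = code_norms[j] - 2.0 * dot
            if score < best_score:
                best_index, best_score = j, score
        assignments.append(best_index)
    return assignments


def bag_of_words(descriptors, codebook):
    """Count how many descriptors are assigned to each codebook entry."""
    histogram = [0] * len(codebook)
    for index in quantize(descriptors, codebook):
        histogram[index] += 1
    return histogram


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
    descriptors = [[0.1, -0.1], [0.9, 1.2], [3.5, 0.2], [1.1, 0.8]]
    print(bag_of_words(descriptors, codebook))  # prints [1, 2, 1]
```

Each descriptor is processed independently of the others, and the inner loop is dominated by dot products between descriptors and codebook entries. Those two properties are what make this kind of computation a plausible fit for data-parallel hardware, where the dot products can be expressed as one large matrix multiplication.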

Author information

Authors and Affiliations

Intelligent Systems Lab Amsterdam (ISLA), University of Amsterdam, Science Park 904, 1098 XH, Amsterdam, The Netherlands
Koen E. A. van de Sande, Theo Gevers & Cees G. M. Snoek

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 10 King’s College Road, M5S 3G4, Toronto, ON, Canada
Kiriakos N. Kutulakos

About this paper

Cite this paper

van de Sande, K.E.A., Gevers, T., Snoek, C.G.M. (2012). Accelerating Visual Categorization with the GPU. In: Kutulakos, K.N. (ed.) Trends and Topics in Computer Vision. ECCV 2010. Lecture Notes in Computer Science, vol 6554. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35740-4_34

DOI: https://doi.org/10.1007/978-3-642-35740-4_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35739-8
Online ISBN: 978-3-642-35740-4
eBook Packages: Computer Science, Computer Science (R0)
