Improving Image Classification Using Semantic Attributes

Su, Yu; Jurie, Frédéric

doi:10.1007/s11263-012-0529-4


Published: 08 May 2012

Volume 100, pages 59–77, (2012)

International Journal of Computer Vision



Abstract

The Bag-of-Words (BoW) model, commonly used for image classification, has two strong limitations: visual words lack explicit meaning, and they are usually polysemous. This paper addresses both limitations by introducing an intermediate representation based on semantic attributes. Specifically, two different approaches are proposed. Both consist of predicting a set of semantic attributes for entire images as well as for local image regions, and of using these predictions to build intermediate-level features. Experiments on four challenging image databases (PASCAL VOC 2007, Scene-15, MSRCv2 and SUN-397) show that both approaches significantly improve the performance of the BoW model. Moreover, their combination achieves state-of-the-art results on several of these databases.
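The abstract does not specify how the attribute predictions are turned into features, so the following is only a generic illustration of the idea, not the authors' method. It assumes linear attribute classifiers with sigmoid outputs applied to BoW histograms, and max-pooling over regions; the function names, the pooling choice, and the random data are all hypothetical.

```python
import numpy as np

def attribute_scores(features, weights, biases):
    """Sigmoid scores of linear attribute classifiers (assumed classifier form).

    features: (n_samples, n_words) BoW histograms
    weights:  (n_words, n_attributes), biases: (n_attributes,)
    """
    logits = features @ weights + biases
    return 1.0 / (1.0 + np.exp(-logits))

def attribute_feature(image_hist, region_hists, weights, biases):
    """Concatenate image-level scores with max-pooled region-level scores.

    Max-pooling over regions is an assumption made for this sketch.
    """
    global_scores = attribute_scores(image_hist[None, :], weights, biases)[0]
    region_scores = attribute_scores(region_hists, weights, biases)
    return np.concatenate([global_scores, region_scores.max(axis=0)])

# Toy data standing in for real histograms and trained classifiers.
rng = np.random.default_rng(0)
n_words, n_attributes, n_regions = 50, 4, 6
weights = rng.normal(size=(n_words, n_attributes))
biases = np.zeros(n_attributes)
image_hist = rng.random(n_words)
region_hists = rng.random((n_regions, n_words))

feature = attribute_feature(image_hist, region_hists, weights, biases)
```

The resulting vector has one entry per attribute for the whole image and one per attribute for the regions, and could be passed to any standard classifier in place of, or alongside, the raw BoW histogram.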



Acknowledgement

This work was partly realized under the Quaero Programme, funded by OSEO, French State agency for innovation.

Author information

Authors and Affiliations

GREYC—UMR6072 CNRS, University of Caen, Caen, France
Yu Su & Frédéric Jurie


Corresponding author

Correspondence to Yu Su.


About this article

Cite this article

Su, Y., Jurie, F. Improving Image Classification Using Semantic Attributes. Int J Comput Vis 100, 59–77 (2012). https://doi.org/10.1007/s11263-012-0529-4


Received: 22 July 2011
Accepted: 15 April 2012
Published: 08 May 2012
Issue Date: October 2012
DOI: https://doi.org/10.1007/s11263-012-0529-4
