Learning Compact Visual Attributes for Large-Scale Image Classification

  • Yu Su
  • Frédéric Jurie
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7585)


Attribute-based image classification has received a lot of attention recently, as an interesting tool for sharing knowledge across different categories or for producing compact image signatures. However, when high classification performance is expected, state-of-the-art results are typically obtained by combining Fisher Vectors (FV) and Spatial Pyramid Matching (SPM), leading to image signatures with dimensionality up to 262,144 [1]. This is a hindrance to large-scale image classification tasks, for which attribute-based approaches would be more efficient. This paper proposes a new compact way to represent images, based on attributes, which yields image signatures that are typically 10³ times smaller than the FV+SPM combination without significant loss of performance. The main idea lies in the definition of an intermediate-level representation built by learning both image-level and region-level visual attributes. Experiments on three challenging image databases (PASCAL VOC 2007, Caltech-256 and SUN-397) validate our method.
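
To give a sense of scale for the sizes quoted in the abstract, the following is a minimal arithmetic sketch, not the authors' method. It assumes that signatures are stored as 32-bit floats, that the reduction of roughly a factor of one thousand applies to storage size, and that the 262,144 figure decomposes as 2 × K × D × R with K = 256 Gaussians, D = 64 descriptor dimensions and R = 8 spatial regions. None of these details is stated in the abstract; the names `signature_bytes`, `K`, `D` and `R` are hypothetical and used for illustration only.

```python
# Illustrative arithmetic only. The storage format and the decomposition of
# the FV+SPM dimensionality are assumptions, not taken from the abstract.

FV_SPM_DIMS = 262_144        # dimensionality quoted in the abstract
REDUCTION_FACTOR = 10 ** 3   # "typically 10^3 times smaller"
BYTES_PER_FLOAT32 = 4        # assumed: one 32-bit float per dimension


def signature_bytes(dims: int, bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Return the storage size in bytes of a dense signature."""
    return dims * bytes_per_value


# Assumed decomposition: 2 * K * D values per region, over R regions.
K, D, R = 256, 64, 8
assert 2 * K * D * R == FV_SPM_DIMS

fv_bytes = signature_bytes(FV_SPM_DIMS)        # size of one FV+SPM signature
compact_bytes = fv_bytes // REDUCTION_FACTOR   # size after a 1000x reduction

n_images = 1_000_000                           # hypothetical collection size
print(f"FV+SPM:  {fv_bytes} bytes per image, "
      f"{fv_bytes * n_images / 2**30:.1f} GiB for {n_images} images")
print(f"Compact: {compact_bytes} bytes per image, "
      f"{compact_bytes * n_images / 2**30:.2f} GiB for {n_images} images")
```

Under these assumptions a single FV+SPM signature occupies about one mebibyte, so a collection of a million images would not fit in the memory of a typical machine, whereas signatures a thousand times smaller would.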


Gaussian Mixture Model, Training Image, Spectral Cluster, Image Signature, Visual Attribute
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Perronnin, F., Sánchez, J., Mensink, T.: Improving the Fisher Kernel for Large-Scale Image Classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 143–156. Springer, Heidelberg (2010)
  2. Li, L., Su, H., Xing, E., Fei-Fei, L.: Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification. In: NIPS (2010)
  3. Su, Y., Jurie, F.: Visual word disambiguation by semantic contexts. In: ICCV (2011)
  4. Vogel, J., Schiele, B.: Semantic modeling of natural scenes for content-based image retrieval. International Journal on Computer Vision 72, 133–157 (2007)
  5. Torresani, L., Szummer, M., Fitzgibbon, A.: Efficient Object Category Recognition Using Classemes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 776–789. Springer, Heidelberg (2010)
  6. Parikh, D., Grauman, K.: Interactively building a discriminative vocabulary of nameable attributes. In: CVPR (2011)
  7. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR (2006)
  8. Cao, Y., Wang, C., Li, Z., Zhang, L., Zhang, L.: Spatial-bag-of-features. In: CVPR (2010)
  9. Sharma, G., Jurie, F.: Learning discriminative spatial representation for image classification. In: BMVC (2011)
  10. Harada, T., Ushiku, Y., Yamashita, Y., Kuniyoshi, Y.: Discriminative spatial pyramid. In: CVPR (2011)
  11. Jégou, H., Douze, M., Schmid, C., Pérez, P.: Aggregating local descriptors into a compact image representation. In: CVPR (2010)
  12. Perronnin, F., Liu, Y., Sánchez, J., Poirier, H.: Large-scale image retrieval with compressed fisher vectors. In: CVPR (2010)
  13. Sanchez, J., Perronnin, F.: High-dimensional signature compression for large-scale image classification. In: CVPR (2011)
  14. Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: Analysis and an algorithm. In: NIPS (2001)
  15. van Gemert, J., Veenman, C., Smeulders, A., Geusebroek, J.M.: Visual word ambiguity. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 1271–1283 (2010)
  16. Peng, H., Long, F., Ding, C.: Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1226–1238 (2005)
  17. Charikar, M.: Similarity estimation techniques from rounding algorithms. In: ACM Symposium on Theory of Computing, pp. 380–388 (2002)
  18. Everingham, M., Van Gool, L., Williams, C., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 results (2007)
  19. Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology (2007)
  20. Xiao, J., Hays, J., Ehinger, K., Oliva, A., Torralba, A.: Sun database: Large-scale scene recognition from abbey to zoo. In: CVPR (2010)
  21. Bergamo, A., Torresani, L., Fitzgibbon, A.: Picodes: Learning a compact code for novel-category recognition. In: NIPS (2011)
  22. Deng, J., Berg, A.C., Li, K., Fei-Fei, L.: What Does Classifying More Than 10,000 Image Categories Tell Us? In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 71–84. Springer, Heidelberg (2010)
  23. Jegou, H., Douze, M., Schmid, C.: Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 304–317. Springer, Heidelberg (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Yu Su (1)
  • Frédéric Jurie (1)
  1. GREYC – CNRS UMR 6072, University of Caen, Caen, France
