Classifying Images at Scene Level: Comparing Global and Local Descriptors

  • Christian Hentschel
  • Sebastian Gerke
  • Eugene Mbanya
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7836)


In this paper we compare two state-of-the-art approaches for image classification. The first approach follows the Bag-of-Keypoints method, which classifies images based on the frequency distribution of local image patterns. The second approach computes the gist of an image from global image statistics. Both approaches are explained in detail, and their performance is compared using a subset of images taken from the ImageCLEF 2011 Photo Annotation task. The images were selected based on the assumption that they could be better described using global features. Results show that, while Bag-of-Keypoints-like classification performs better even for global concepts, the classification accuracy of the global descriptor remains acceptable at a much smaller computational footprint.
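To make the idea of a "local image pattern frequency distribution" concrete, the following is a minimal sketch of the histogram step in a Bag-of-Keypoints pipeline. It is an illustration under stated assumptions, not the authors' implementation: the function name `bag_of_keypoints`, the toy 2-D descriptors, and the three-word vocabulary are all hypothetical, and the vocabulary is assumed to have been computed beforehand (for example by clustering descriptors from training images).

```python
from collections import Counter
from math import dist


def bag_of_keypoints(descriptors, vocabulary):
    """Return a normalized histogram of visual-word frequencies.

    descriptors: local feature vectors from one image (hypothetical input).
    vocabulary: codebook vectors ("visual words"), assumed precomputed.
    """
    if not descriptors:
        return [0.0] * len(vocabulary)
    # Assign each local descriptor to its nearest visual word (Euclidean distance).
    counts = Counter(
        min(range(len(vocabulary)), key=lambda k: dist(d, vocabulary[k]))
        for d in descriptors
    )
    total = len(descriptors)
    # Normalize so that images with different numbers of keypoints are comparable.
    return [counts[k] / total for k in range(len(vocabulary))]


# Toy example: 2-D "descriptors" and a three-word vocabulary (illustrative only).
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bag_of_keypoints(descriptors, vocabulary)
```

In a full pipeline, a histogram of this kind would serve as the feature vector passed to a classifier; a global descriptor instead summarizes the whole image directly, without a keypoint-to-vocabulary assignment step.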


Keywords (added by machine, not by the authors): Support Vector Machine, Feature Vector, Local Descriptor, Mean Average Precision, Video Retrieval





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Christian Hentschel (1)
  • Sebastian Gerke (1)
  • Eugene Mbanya (1)
  1. Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany
