Enhancing Semantic Features with Compositional Analysis for Scene Recognition

  • Miriam Redi
  • Bernard Merialdo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7585)


Scene recognition systems are generally based on features that represent image semantics by modeling the content depicted in a given image. In this paper we propose a framework for scene recognition that goes beyond visual content analysis alone by exploiting a new cue for categorization: the image composition, namely its photographic style and layout. We extract information about the image composition by storing the values of affective, aesthetic and artistic features in a compositional vector. We verify the discriminative power of this compositional vector for scene categorization by using it to classify images from several diverse, large-scale scene understanding datasets. We then combine the compositional features with traditional semantic features in a complete scene recognition framework. Results show that, because compositional and semantic features are complementary, scene categorization systems do benefit from incorporating descriptors that represent the photographic layout of the image (+13-15% over semantic-only categorization).
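The combination step described above can be illustrated with a minimal sketch. The abstract does not specify which features make up the compositional vector or how the two descriptors are fused, so everything here is an assumption made for illustration: the three toy descriptors (Michelson contrast, mean brightness, a top/bottom brightness difference), the function names `compositional_vector` and `fuse`, and the choice of early fusion by concatenating L2-normalised vectors. It is not the authors' implementation.

```python
import numpy as np


def compositional_vector(gray):
    """Toy compositional descriptors (hypothetical, for illustration only).

    `gray` is a 2-D grayscale image with values in [0, 1] and at least 2 rows.
    """
    gray = np.asarray(gray, dtype=float)
    lo, hi = gray.min(), gray.max()
    # Michelson contrast: (max - min) / (max + min); set to 0 for an all-black image.
    contrast = (hi - lo) / (hi + lo) if (hi + lo) > 0 else 0.0
    # Global brightness of the image.
    brightness = gray.mean()
    # Crude layout cue: mean brightness of the upper half minus the lower half.
    half = gray.shape[0] // 2
    vertical_balance = gray[:half].mean() - gray[half:].mean()
    return np.array([contrast, brightness, vertical_balance])


def fuse(semantic, compositional):
    """Early fusion (assumed): L2-normalise each descriptor, then concatenate."""

    def l2(v):
        v = np.asarray(v, dtype=float)
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v

    return np.concatenate([l2(semantic), l2(compositional)])


# Usage: a 4x4 image with a bright upper half and a stand-in semantic descriptor.
image = np.zeros((4, 4))
image[:2] = 1.0
semantic = np.array([3.0, 4.0])  # placeholder for e.g. a bag-of-words histogram
fused = fuse(semantic, compositional_vector(image))
```

The fused vector could then be passed to any standard classifier. Normalising each descriptor before concatenation is one simple way to keep either feature family from dominating purely because of its scale; other fusion schemes are equally plausible.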


Semantic Feature, Compositional Feature, Scene Categorization, Outdoor Scene, Scene Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Miriam Redi (1)
  • Bernard Merialdo (1)
  1. EURECOM, Sophia Antipolis, France
