Object-Based Representation for Scene Classification

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9673)

Abstract

How to encode and represent a scene remains a critical problem in both human and computer vision. Traditional local and global features are useful and have had some success; however, many observations on human scene perception point to an object-based representation. In this paper, we propose a high-level representation for scene categorization. First, we use semantic segmentation to obtain semantic regions and build an object histogram representation of a scene by summation pooling over all regions. Second, we build spatial and geometrical priors for each object and each pair of co-occurring objects from training scenes, and integrate the spatial and geometrical information of objects into the scene representation. Experimental results on two datasets demonstrate that the proposed representation is effective and competitive.
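The object histogram step can be illustrated with a minimal sketch. It assumes each semantic region comes with one score per object class (for example a one-hot label or a class-probability vector); the function name `object_histogram`, the `normalize` option, and the three example classes are hypothetical and are not taken from the paper. The spatial and geometrical priors are not covered here.

```python
from typing import List, Sequence


def object_histogram(region_scores: Sequence[Sequence[float]],
                     normalize: bool = True) -> List[float]:
    """Sum-pool per-region object scores into one scene-level histogram.

    region_scores holds one score vector per semantic region, with one
    entry per object class (an assumed input format).
    """
    if not region_scores:
        return []
    num_classes = len(region_scores[0])
    hist = [0.0] * num_classes
    for scores in region_scores:
        if len(scores) != num_classes:
            raise ValueError("every region needs one score per object class")
        # Summation pooling: accumulate each class score over all regions.
        for k, s in enumerate(scores):
            hist[k] += s
    if normalize:
        # L1 normalization is an assumption, added so that scenes with
        # different numbers of regions remain comparable.
        total = sum(hist)
        if total > 0:
            hist = [h / total for h in hist]
    return hist


# Three hypothetical regions scored over the classes (bed, lamp, window).
regions = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
]
print(object_histogram(regions, normalize=False))
```

With one-hot region labels the unnormalized histogram simply counts regions per object class; with soft scores it accumulates the per-class evidence across the scene.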

Keywords

Gaussian Mixture Model; Convolutional Neural Network; Scene Categorization; High Level Representation; Indoor Scene
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Project 61175116, the Science and Technology Commission of Shanghai Municipality under research grant no. 14DZ2260800, and the Shanghai Knowledge Service Platform for Trustworthy Internet of Things (No. ZF1213).

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Shanghai Key Laboratory of Multidimensional Information Processing, Department of Computer Science and Technology, East China Normal University, Shanghai, China
