Latent Pyramidal Regions for Recognizing Scenes
Abstract
In this paper we propose a simple but efficient image representation for solving the scene classification problem. Our new representation combines the benefits of spatial pyramid representation using nonlinear feature coding and latent Support Vector Machine (LSVM) to train a set of Latent Pyramidal Regions (LPR). Each of our LPRs captures a discriminative characteristic of the scenes and is trained by searching over all possible sub-windows of the images in a latent SVM training procedure. Each LPR is represented in a spatial pyramid and uses non-linear locality constraint coding for learning both shape and texture patterns of the scene. The final response of the LPRs form a single feature vector which we call the LPR representation and can be used for the classification task. We tested our proposed scene representation model in three datasets which contain a variety of scene categories (15-Scenes, UIUC-Sports and MIT-indoor). Our LPR representation obtains state-of-the-art results on all these datasets which shows that it can simultaneously model the global and local scene characteristics in a single framework and is general enough to be used for both indoor and outdoor scene classification.
Keywords
Feature Vector Sparse Code Region Detector Image Descriptor Scene RecognitionReferences
- 1.Lowe, D.G.: Distinctive image features from scale-invariant keypoints. IJCV 60(2) (2004)Google Scholar
- 2.Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, pp. 886–893 (2005)Google Scholar
- 3.Lee, H., Battle, A., Raina, R., Ng, A.Y.: Efficient sparse coding algorithms. In: NIPS, pp. 801–808 (2007)Google Scholar
- 4.Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: CVPR, pp. 3360–3367 (2010)Google Scholar
- 5.Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR, pp. 1794–1801 (2009)Google Scholar
- 6.Gemert, J.C., Geusebroek, J.M., Veenman, C.J., Smeulders, A.W.: Kernel Codebooks for Scene Categorization. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part III. LNCS, vol. 5304, pp. 696–709. Springer, Heidelberg (2008)CrossRefGoogle Scholar
- 7.Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR, pp. 2169–2178 (2006)Google Scholar
- 8.Pandey, M., Lazebnik, S.: Scene recognition and weakly supervised object localization with deformable part-based models. In: ICCV (2011)Google Scholar
- 9.Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. PAMI 32(9), 1627–1645 (2010)CrossRefGoogle Scholar
- 10.Yu, C.N.J., Joachims, T.: Learning structural svms with latent variables. In: ICML, pp. 1169–1176 (2009)Google Scholar
- 11.Blaschko, M.B., Lampert, C.H.: Learning to Localize Objects with Structured Output Regression. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 2–15. Springer, Heidelberg (2008)CrossRefGoogle Scholar
- 12.Do, T.M.T., Artières, T.: Large margin training for hidden markov models with partially observed states. In: ICML, pp. 265–272 (2009)Google Scholar
- 13.Teo, C.H., Smola, A., Vishwanathan, S.V., Le, Q.V.: A scalable modular convex solver for regularized risk minimization. In: ACM SIGKDD, pp. 727–736 (2007)Google Scholar
- 14.Teo, C.H., Vishwanthan, S., Smola, A.J., Le, Q.V.: Bundle methods for regularized risk minimization. JMLR, 311–365 (2010)Google Scholar
- 15.Quattoni, A., Torralba, A.: Recognizing indoor scenes. In: CVPR, pp. 413–420 (2009)Google Scholar
- 16.Li, L.-J., HaoSu, E.P.X., Fei-Fei, L.: Object bank: A high-level image representation for scene classification & semantic feature sparsification. In: NIPS (2010)Google Scholar
- 17.Li, L.-J., Hao Su, Y.L., Fei-Fei, L.: Objects as attributes for scene classification. In: ECCV (2010)Google Scholar
- 18.Li, L.J., Fei-Fei, L.: What, where and who? classifying event by scene and object recognition. In: ICCV (2007)Google Scholar
- 19.Xiao, J., Hays, J., Ehinger, K.A., Oliva, A., Torralba, A.: Sun database: Large-scale scene recognition from abbey to zoo. In: CVPR, pp. 3485–3492 (2010)Google Scholar
- 20.Han, Y., Liu, G.: Efficient learning of sample-specific discriminative features for scene classification. SPLetters 18(11), 683–686 (2011)Google Scholar
- 21.Oliva, A., Torralba, A.: Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV 42, 145–175 (2001)zbMATHCrossRefGoogle Scholar
- 22.Zhu, J., Li, L.J., Fei-Fei, L., Xing, E.P.: Large margin learning of upstream scene understanding models. In: NIPS (2010)Google Scholar
- 23.Wu, J., Rehg, J.: Centrist: A visual descriptor for scene categorization. IEEE Trans. PAMI 33(8), 1489–1501 (2011)CrossRefGoogle Scholar