ECCV 2012: Computer Vision – ECCV 2012, pp. 1-15
Object-Centric Spatial Pooling for Image Classification
Abstract
Spatial pyramid matching (SPM) based pooling has been the dominant choice for state-of-the-art image classification systems. In contrast, we propose a novel object-centric spatial pooling (OCP) approach, following the intuition that knowing the location of the object of interest can be useful for image classification. OCP consists of two steps: (1) inferring the location of the objects, and (2) using the location information to pool foreground and background features separately to form the image-level representation. Step (1) is particularly challenging in a typical classification setting where precise object location annotations are not available during training. To address this challenge, we propose a framework that learns object detectors using only image-level class labels, or so-called weak labels. We validate our approach on the challenging PASCAL07 dataset. Our learned detectors are comparable in accuracy to state-of-the-art weakly supervised detection methods. More importantly, the resulting OCP approach significantly outperforms SPM-based pooling in image classification.
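Step (2) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `object_centric_pool`, the use of max pooling, the concatenation of the two pooled vectors, and the assumption that a bounding box is already available from step (1) are all illustrative choices that the abstract does not specify.

```python
import numpy as np

def object_centric_pool(codes, xy, box):
    """Pool coded local features inside and outside a box, then concatenate.

    codes: (N, D) array of coded local descriptors (hypothetical input).
    xy:    (N, 2) array of descriptor centers as (x, y).
    box:   (x_min, y_min, x_max, y_max), assumed to come from step (1).
    """
    codes = np.asarray(codes, dtype=float)
    xy = np.asarray(xy, dtype=float)
    x_min, y_min, x_max, y_max = box

    # Boolean mask: True for descriptors whose center lies inside the box.
    inside = (
        (xy[:, 0] >= x_min) & (xy[:, 0] <= x_max)
        & (xy[:, 1] >= y_min) & (xy[:, 1] <= y_max)
    )

    def max_pool(rows):
        # An empty region pools to zeros so the output length stays fixed.
        if rows.shape[0] == 0:
            return np.zeros(codes.shape[1])
        return rows.max(axis=0)

    foreground = max_pool(codes[inside])   # assumed pooling operator: max
    background = max_pool(codes[~inside])
    # Assumed image-level representation: [foreground | background].
    return np.concatenate([foreground, background])
```

With three 2-dimensional codes, two of which fall inside the box `(0, 0, 10, 10)`, the call `object_centric_pool(codes, xy, (0, 0, 10, 10))` returns a 4-dimensional vector whose first half summarizes the foreground and whose second half summarizes the background. An SPM-style representation would instead pool over a fixed grid of cells regardless of where the object is.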
Keywords
Object Detector, Outer Loop, Foreground Region, Spatial Pyramid Match, Simple Dataset