Patch-Level Spatial Layout for Classification and Weakly Supervised Localization

Zadrija, Valentina; Krapac, Josip; Verbeek, Jakob; Šegvić, Siniša

doi:10.1007/978-3-319-24947-6_41

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9358)

Included in the following conference series:

German Conference on Pattern Recognition

Abstract

We propose a discriminative patch-level model that combines appearance and spatial layout cues. We start from a block-sparse model of patch appearance based on the normalized Fisher vector representation. The appearance model is responsible for (i) selecting a discriminative subset of visual words, and (ii) identifying distinctive patches assigned to the selected subset. These patches are further filtered by a sparse spatial model operating on a novel representation of pairwise patch layout. We have evaluated the proposed pipeline in image classification and weakly supervised localization experiments on a public traffic sign dataset. The results show a significant advantage of the combined model over state-of-the-art appearance models.
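
The block-sparse selection of visual words mentioned in the abstract can be illustrated with a generic group-lasso proximal step (block soft-thresholding). This is only a minimal sketch under stated assumptions, not the authors' implementation: the weight vector is assumed to be stored as one contiguous block per visual word, and the names `block_soft_threshold`, `selected_words`, `n_words` and `lam` are hypothetical.

```python
import numpy as np

def block_soft_threshold(w, n_words, lam):
    """Generic group-lasso proximal step (assumed layout: one block per word).

    Each block is shrunk towards zero by `lam` in Euclidean norm; blocks
    whose norm is below `lam` become exactly zero.
    """
    blocks = w.reshape(n_words, -1)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    # Shrinkage factor per block, clipped at zero; the floor avoids 0/0.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return (blocks * scale).reshape(w.shape)

def selected_words(w, n_words):
    """Indices of visual words whose weight block is not entirely zero."""
    blocks = w.reshape(n_words, -1)
    return np.flatnonzero(np.linalg.norm(blocks, axis=1) > 0)

# Toy weights for 3 hypothetical visual words with 2 dimensions each.
w = np.array([3.0, 4.0, 0.1, 0.1, 0.0, 2.0])
w_sparse = block_soft_threshold(w, n_words=3, lam=1.0)
print(selected_words(w_sparse, n_words=3))
```

In this toy example the second block has a norm below `lam`, so it is zeroed out as a whole and the corresponding word is discarded, which is the behaviour that makes a block-sparse penalty act as a visual word selector.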

Notes

1. For the sake of simplicity, we assume the global \(\ell_2\) normalization \(n(X)\). We later show that the proposed reasoning also holds in the case of the intra-\(\ell_2\) normalization.
2. These results are worse than those in [21] since here we do not use additional negative images for training, i.e. the training dataset is the same as in the other experiments.
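
Note 1 contrasts the global and the intra \(\ell_2\) normalization of a Fisher vector. The difference can be sketched as follows, assuming (for illustration only) that the vector is laid out as one contiguous block per visual word; the function names and the toy vector are hypothetical and do not come from the paper.

```python
import numpy as np

def global_l2(fv, eps=1e-12):
    """Scale the whole vector to unit Euclidean norm."""
    return fv / max(np.linalg.norm(fv), eps)

def intra_l2(fv, n_words, eps=1e-12):
    """Scale each visual word's block to unit Euclidean norm separately.

    Assumes one contiguous block per visual word; all-zero blocks stay zero.
    """
    blocks = fv.reshape(n_words, -1)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    return (blocks / np.maximum(norms, eps)).reshape(fv.shape)

# Toy vector: 2 hypothetical visual words, 2 dimensions each.
fv = np.array([3.0, 4.0, 0.0, 0.5])
fv_global = global_l2(fv)
fv_intra = intra_l2(fv, n_words=2)
```

With the global variant the second block stays small relative to the first, whereas the intra variant gives every non-empty block the same norm, so a single dominant visual word cannot swamp the others.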

References

1. Alexe, B., Deselaers, T., Ferrari, V.: Measuring the objectness of image windows. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2189–2202 (2012)
2. Andrews, S., Tsochantaridis, I., Hofmann, T.: Support vector machines for multiple-instance learning. In: NIPS (2003)
3. Arandjelović, R., Zisserman, A.: All about VLAD. In: CVPR (2013)
4. Baecchi, C., Turchini, F., Seidenari, L., Bagdanov, A.D., Bimbo, A.D.: Fisher vectors over random density forests for object recognition. In: ICPR (2014)
5. Brkić, K., Pinz, A., Šegvić, S., Kalafatić, Z.: Histogram-based description of local space-time appearance. In: Heyden, A., Kahl, F. (eds.) SCIA 2011. LNCS, vol. 6688, pp. 206–217. Springer, Heidelberg (2011)
6. Cinbis, R., Verbeek, J., Schmid, C.: Segmentation driven object detection with Fisher vectors. In: ICCV (2013)
7. Cinbis, R., Verbeek, J., Schmid, C.: Multi-fold MIL training for weakly supervised object localization. In: CVPR (2014)
8. Crowley, E.J., Zisserman, A.: Of gods and goats: weakly supervised learning of figurative art. In: BMVC (2013)
9. Csurka, G., Bray, C., Dance, C., Fan, L.: Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision, ECCV (2004)
10. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
11. Deselaers, T., Alexe, B., Ferrari, V.: Weakly supervised localization and learning with generic knowledge. Int. J. Comput. Vis. 100(3), 275–293 (2012)
12. Dollár, P., Belongie, S., Perona, P.: The fastest pedestrian detector in the west. In: BMVC (2010)
13. Douze, M., Jégou, H.: The Yael library. In: Proceedings of the ACM International Conference on Multimedia (2014)
14. Everingham, M., Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The Pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)
15. Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)
16. Fernando, B., Fromont, E., Tuytelaars, T.: Mining mid-level features for image classification. Int. J. Comput. Vis. 108(3), 186–203 (2014)
17. Galleguillos, C., Babenko, B., Rabinovich, A., Belongie, S.: Weakly supervised object localization with stable segmentations. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 193–207. Springer, Heidelberg (2008)
18. Jégou, H., Douze, M., Schmid, C.: On the burstiness of visual elements. In: CVPR (2009)
19. Jenatton, R., Mairal, J., Obozinski, G., Bach, F.R.: Proximal methods for hierarchical sparse coding. J. Mach. Learn. Res. 12, 2297–2334 (2011)
20. Krapac, J., Šegvić, S.: Fast approximate GMM soft-assign for fine-grained image classification with large Fisher vectors. In: GCPR (2015)
21. Krapac, J., Šegvić, S.: Weakly supervised object localization with large Fisher vectors. In: VISAPP (2015)
22. Krapac, J., Verbeek, J., Jurie, F.: Modeling spatial layout with Fisher vectors for image categorization. In: ICCV (2011)
23. Lampert, C.H., Blaschko, M.B., Hofmann, T.: Efficient subwindow search: a branch and bound framework for object localization. IEEE Trans. Pattern Anal. Mach. Intell. 31, 2129–2142 (2009)
24. Liu, D., Hua, G., Viola, P.A., Chen, T.: Integrated feature selection and higher-order spatial feature extraction for object categorization. In: CVPR (2008)
25. Mairal, J., Bach, F., Ponce, J., Sapiro, G.: Online learning for matrix factorization and sparse coding. J. Mach. Learn. Res. 11, 19–60 (2010)
26. Maron, O., Lozano-Pérez, T.: A framework for multiple-instance learning. In: NIPS, pp. 570–576 (1997)
27. Mathias, M., Timofte, R., Benenson, R., Gool, L.J.V.: Traffic sign recognition - how far are we from the solution? In: IJCNN, pp. 1–8 (2013)
28. Mobileye: Traffic Sign Detection. http://www.mobileye.com. Accessed 22 July 2015
29. Murphy, K.: Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge (2012)
30. Nguyen, M.H., Torresani, L., De la Torre, F., Rother, C.: Learning discriminative localization from weakly labeled data. Pattern Recogn. 47(3), 1523–1534 (2014)
31. Perronnin, F., Sánchez, J., Mensink, T.: Improving the Fisher Kernel for large-scale image classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 143–156. Springer, Heidelberg (2010)
32. Sánchez, J., Perronnin, F., Mensink, T., Verbeek, J.: Image classification with the Fisher vector: theory and practice. Int. J. Comput. Vis. 105(3), 222–245 (2013)
33. Simonyan, K., Vedaldi, A., Zisserman, A.: Deep Fisher networks for large-scale image classification. In: NIPS, pp. 163–171 (2013)
34. Singh, S., Gupta, A., Efros, A.A.: Unsupervised discovery of mid-level discriminative patches. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 73–86. Springer, Heidelberg (2012)
35. Siva, P., Xiang, T.: Weakly supervised object detector learning with model drift detection. In: ICCV (2011)
36. Viola, P.A., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57(2), 137–154 (2004). http://dx.doi.org/10.1023/B:VISI.0000013087.49260.fb
37. Voravuthikunchai, W., Cremilleux, B., Jurie, F.: Histograms of pattern sets for image classification and object recognition. In: CVPR (2014)
38. Šegvić, S., Brkić, K., Kalafatić, Z., Pinz, A.: Exploiting temporal and spatial constraints in traffic sign detection from a moving vehicle. Mach. Vis. Appl. 25(3), 649–665 (2014)
39. Weng, C., Yuan, J.: Efficient mining of optimal AND/OR patterns for visual recognition. IEEE Trans. Multimedia 17(5), 626–635 (2015)
40. Yang, Y., Newsam, S.: Spatial pyramid co-occurrence for image classification. In: ICCV (2011)
41. Yuan, J., Wu, Y., Yang, M.: Discovery of collocation patterns: from visual words to visual phrases. In: CVPR (2007)
42. Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.) 68(1), 49–67 (2006)

Acknowledgement

This work has been fully supported by the Croatian Science Foundation under the project I-2433-2014.

Author information

Authors and Affiliations

Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Valentina Zadrija, Josip Krapac & Siniša Šegvić
INRIA Rhone-Alpes, Grenoble, France
Jakob Verbeek

Corresponding author

Correspondence to Valentina Zadrija.

Editor information

Editors and Affiliations

Institute of Computer Science III, University of Bonn, Bonn, Germany
Juergen Gall
MPI for Intelligent Systems, University of Tübingen, Tübingen, Germany
Peter Gehler
Computer Vision Group, Visual Computing Institute, RWTH Aachen, Aachen, Germany
Bastian Leibe

Cite this paper

Zadrija, V., Krapac, J., Verbeek, J., Šegvić, S. (2015). Patch-Level Spatial Layout for Classification and Weakly Supervised Localization. In: Gall, J., Gehler, P., Leibe, B. (eds) Pattern Recognition. DAGM 2015. Lecture Notes in Computer Science, vol 9358. Springer, Cham. https://doi.org/10.1007/978-3-319-24947-6_41

DOI: https://doi.org/10.1007/978-3-319-24947-6_41
Published: 03 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24946-9
Online ISBN: 978-3-319-24947-6
eBook Packages: Computer Science, Computer Science (R0)
