Patch-Level Spatial Layout for Classification and Weakly Supervised Localization

  • Conference paper
Pattern Recognition (DAGM 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9358)

Abstract

We propose a discriminative patch-level model that combines appearance and spatial layout cues. We start from a block-sparse model of patch appearance based on the normalized Fisher vector representation. The appearance model is responsible for (i) selecting a discriminative subset of visual words, and (ii) identifying distinctive patches assigned to the selected subset. These patches are further filtered by a sparse spatial model operating on a novel representation of pairwise patch layout. We evaluate the proposed pipeline in image classification and weakly supervised localization experiments on a public traffic sign dataset. The results show a significant advantage of the combined model over state-of-the-art appearance models.
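The abstract does not say how block sparsity is obtained. One common route, assumed here purely for illustration, is a group-lasso-style penalty [42] whose proximal operator [19] shrinks each block of classifier weights and zeroes out weak blocks. In the sketch below each block stands for the Fisher vector components of one visual word; the function names, the block size, the threshold `lam`, and the toy weights are all hypothetical and are not taken from the paper.

```python
import math


def block_norms(w, block_size):
    """Return the l2 norm of each consecutive block of `w`."""
    return [
        math.sqrt(sum(v * v for v in w[i:i + block_size]))
        for i in range(0, len(w), block_size)
    ]


def block_soft_threshold(w, block_size, lam):
    """Proximal operator of lam * (sum of block l2 norms).

    Every block is shrunk towards zero; a block whose norm is <= lam
    becomes exactly zero, which deselects the corresponding visual word.
    """
    out = []
    for i in range(0, len(w), block_size):
        block = w[i:i + block_size]
        norm = math.sqrt(sum(v * v for v in block))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0.0 else 0.0
        out.extend(scale * v for v in block)
    return out


def selected_words(w, block_size):
    """Indices of visual words whose weight block is non-zero."""
    return [k for k, n in enumerate(block_norms(w, block_size)) if n > 0.0]


# Toy weights (assumption): 3 hypothetical visual words, 2 dimensions each.
weights = [3.0, 4.0, 0.1, 0.2, -6.0, 8.0]
shrunk = block_soft_threshold(weights, block_size=2, lam=1.0)
print(selected_words(shrunk, block_size=2))  # prints [0, 2]
```

In this toy run the middle block has norm below the threshold and is removed as a whole, which is the sense in which a block-sparse model selects a subset of visual words rather than individual weight components.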

Notes

  1. For the sake of simplicity, we assume the global \(\ell_2\) normalization n(X). We later show the proposed reasoning also holds in the case of the intra-\(\ell_2\) normalization.

  2. These results are worse than [21] since here we do not use additional negative images for training, i.e. the training dataset is the same as in other experiments.

References

  1. Alexe, B., Deselaers, T., Ferrari, V.: Measuring the objectness of image windows. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2189–2202 (2012)

  2. Andrews, S., Tsochantaridis, I., Hofmann, T.: Support vector machines for multiple-instance learning. In: NIPS (2003)

  3. Arandjelović, R., Zisserman, A.: All about VLAD. In: CVPR (2013)

  4. Baecchi, C., Turchini, F., Seidenari, L., Bagdanov, A.D., Bimbo, A.D.: Fisher vectors over random density forests for object recognition. In: ICPR (2014)

  5. Brkić, K., Pinz, A., Šegvić, S., Kalafatić, Z.: Histogram-based description of local space-time appearance. In: Heyden, A., Kahl, F. (eds.) SCIA 2011. LNCS, vol. 6688, pp. 206–217. Springer, Heidelberg (2011)

  6. Cinbis, R., Verbeek, J., Schmid, C.: Segmentation driven object detection with Fisher vectors. In: ICCV (2013)

  7. Cinbis, R., Verbeek, J., Schmid, C.: Multi-fold MIL training for weakly supervised object localization. In: CVPR (2014)

  8. Crowley, E.J., Zisserman, A.: Of gods and goats: weakly supervised learning of figurative art. In: BMVC (2013)

  9. Csurka, G., Bray, C., Dance, C., Fan, L.: Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision, ECCV (2004)

  10. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)

  11. Deselaers, T., Alexe, B., Ferrari, V.: Weakly supervised localization and learning with generic knowledge. Int. J. Comput. Vis. 100(3), 275–293 (2012)

  12. Dollár, P., Belongie, S., Perona, P.: The fastest pedestrian detector in the west. In: BMVC (2010)

  13. Douze, M., Jégou, H.: The Yael library. In: Proceedings of the ACM International Conference on Multimedia (2014)

  14. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The Pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)

  15. Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)

  16. Fernando, B., Fromont, E., Tuytelaars, T.: Mining mid-level features for image classification. Int. J. Comput. Vis. 108(3), 186–203 (2014)

  17. Galleguillos, C., Babenko, B., Rabinovich, A., Belongie, S.: Weakly supervised object localization with stable segmentations. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 193–207. Springer, Heidelberg (2008)

  18. Jégou, H., Douze, M., Schmid, C.: On the burstiness of visual elements. In: CVPR (2009)

  19. Jenatton, R., Mairal, J., Obozinski, G., Bach, F.R.: Proximal methods for hierarchical sparse coding. J. Mach. Learn. Res. 12, 2297–2334 (2011)

  20. Krapac, J., Šegvić, S.: Fast approximate GMM soft-assign for fine-grained image classification with large Fisher vectors. In: GCPR (2015)

  21. Krapac, J., Šegvić, S.: Weakly supervised object localization with large Fisher vectors. In: VISAPP (2015)

  22. Krapac, J., Verbeek, J., Jurie, F.: Modeling spatial layout with Fisher vectors for image categorization. In: ICCV (2011)

  23. Lampert, C.H., Blaschko, M.B., Hofmann, T.: Efficient subwindow search: a branch and bound framework for object localization. IEEE Trans. Pattern Anal. Mach. Intell. 31, 2129–2142 (2009)

  24. Liu, D., Hua, G., Viola, P.A., Chen, T.: Integrated feature selection and higher-order spatial feature extraction for object categorization. In: CVPR (2008)

  25. Mairal, J., Bach, F., Ponce, J., Sapiro, G.: Online learning for matrix factorization and sparse coding. J. Mach. Learn. Res. 11, 19–60 (2010)

  26. Maron, O., Lozano-Pérez, T.: A framework for multiple-instance learning. In: NIPS, pp. 570–576 (1997)

  27. Mathias, M., Timofte, R., Benenson, R., Gool, L.J.V.: Traffic sign recognition - how far are we from the solution? In: IJCNN, pp. 1–8 (2013)

  28. Mobileye: Traffic Sign Detection. http://www.mobileye.com. Accessed 22 July 2015

  29. Murphy, K.: Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge (2012)

  30. Nguyen, M.H., Torresani, L., De la Torre, F., Rother, C.: Learning discriminative localization from weakly labeled data. Pattern Recogn. 47(3), 1523–1534 (2014)

  31. Perronnin, F., Sánchez, J., Mensink, T.: Improving the Fisher Kernel for large-scale image classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 143–156. Springer, Heidelberg (2010)

  32. Sánchez, J., Perronnin, F., Mensink, T., Verbeek, J.: Image classification with the Fisher vector: theory and practice. Int. J. Comput. Vis. 105(3), 222–245 (2013)

  33. Simonyan, K., Vedaldi, A., Zisserman, A.: Deep Fisher networks for large-scale image classification. In: NIPS, pp. 163–171 (2013)

  34. Singh, S., Gupta, A., Efros, A.A.: Unsupervised discovery of mid-level discriminative patches. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 73–86. Springer, Heidelberg (2012)

  35. Siva, P., Xiang, T.: Weakly supervised object detector learning with model drift detection. In: ICCV (2011)

  36. Viola, P.A., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57(2), 137–154 (2004). http://dx.doi.org/10.1023/B:VISI.0000013087.49260.fb

  37. Voravuthikunchai, W., Cremilleux, B., Jurie, F.: Histograms of pattern sets for image classification and object recognition. In: CVPR (2014)

  38. Šegvić, S., Brkić, K., Kalafatić, Z., Pinz, A.: Exploiting temporal and spatial constraints in traffic sign detection from a moving vehicle. Mach. Vis. Appl. 25(3), 649–665 (2014)

  39. Weng, C., Yuan, J.: Efficient mining of optimal AND/OR patterns for visual recognition. IEEE Trans. Multimedia 17(5), 626–635 (2015)

  40. Yang, Y., Newsam, S.: Spatial pyramid co-occurrence for image classification. In: ICCV (2011)

  41. Yuan, J., Wu, Y., Yang, M.: Discovery of collocation patterns: from visual words to visual phrases. In: CVPR (2007)

  42. Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.) 68(1), 49–67 (2006)

Acknowledgement

This work has been fully supported by the Croatian Science Foundation under the project I-2433-2014.

Author information

Corresponding author

Correspondence to Valentina Zadrija.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zadrija, V., Krapac, J., Verbeek, J., Šegvić, S. (2015). Patch-Level Spatial Layout for Classification and Weakly Supervised Localization. In: Gall, J., Gehler, P., Leibe, B. (eds) Pattern Recognition. DAGM 2015. Lecture Notes in Computer Science, vol 9358. Springer, Cham. https://doi.org/10.1007/978-3-319-24947-6_41

  • DOI: https://doi.org/10.1007/978-3-319-24947-6_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24946-9

  • Online ISBN: 978-3-319-24947-6

  • eBook Packages: Computer Science, Computer Science (R0)
