Ex Paucis Plura: Learning Affordance Segmentation from Very Few Examples

Sawatzky, Johann; Garbade, Martin; Gall, Juergen

doi:10.1007/978-3-030-12939-2_13

Conference paper
First Online: 14 February 2019

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11269)

Abstract

While annotating objects in images is already time-consuming, annotating finer details such as object parts or affordances is even more tedious. Since large datasets with object annotations already exist, we address the question of whether such information can be leveraged to train a convolutional neural network that segments affordances or object parts from very few examples with finer annotations. To achieve this, we use a semantic alignment network to transfer the annotations from the small set of annotated examples to a large set of images that have only coarse, object-level annotations. We then train a convolutional neural network in a weakly supervised manner on the small annotated training set together with the additional images with transferred labels. We evaluate our approach on the IIT-AFF and Pascal Parts datasets, where it outperforms other weakly supervised approaches.
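
The label-transfer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the alignment step yields a 2x3 affine map from target pixel coordinates to source pixel coordinates, whereas the alignment network used in the paper may predict a different kind of transformation. The function name `transfer_labels` and all parameter names are hypothetical.

```python
import numpy as np


def transfer_labels(src_labels, affine, tgt_object_mask, background=0):
    """Warp a fine-grained label map from an annotated source image onto a target image.

    Assumptions (illustrative, not taken from the paper):
    - `affine` is a 2x3 matrix mapping target (x, y, 1) to source (x, y).
    - `tgt_object_mask` is the coarse object-level annotation of the target.
    """
    mask = np.asarray(tgt_object_mask, dtype=bool)
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous coordinates of every target pixel, shape (3, h*w).
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=0)
    src_xy = np.asarray(affine, dtype=float) @ coords
    # Nearest-neighbour sampling keeps labels discrete.
    sx = np.rint(src_xy[0]).astype(int)
    sy = np.rint(src_xy[1]).astype(int)
    sh, sw = src_labels.shape
    valid = (sx >= 0) & (sx < sw) & (sy >= 0) & (sy < sh)
    out = np.full(h * w, background, dtype=src_labels.dtype)
    out[valid] = src_labels[sy[valid], sx[valid]]
    out = out.reshape(h, w)
    # Transferred labels are only kept inside the coarse object mask.
    out[~mask] = background
    return out


# Toy example: two "part" labels, target shifted one pixel to the right.
src = np.array([[1, 1, 2, 2]] * 4)
shift = np.array([[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
object_mask = np.ones((4, 4), dtype=bool)
object_mask[0, :] = False
pseudo_labels = transfer_labels(src, shift, object_mask)
```

In a pipeline of the kind the abstract outlines, pseudo-labels such as `pseudo_labels` would be produced for the many coarsely annotated images and then combined with the few fully annotated examples to train the segmentation network.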

Acknowledgement

The work has been financially supported by the DFG project GA 1927/5-1 (DFG Research Unit FOR 2535 Anticipating Human Behavior).

Author information

Authors and Affiliations

University of Bonn, Bonn, Germany
Johann Sawatzky, Martin Garbade & Juergen Gall

Corresponding author

Correspondence to Johann Sawatzky.

Editor information

Editors and Affiliations

University of Freiburg, Freiburg im Breisgau, Baden-Württemberg, Germany
Thomas Brox
University of Stuttgart, Stuttgart, Baden-Württemberg, Germany
Andrés Bruhn
CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
Mario Fritz

About this paper

Cite this paper

Sawatzky, J., Garbade, M., Gall, J. (2019). Ex Paucis Plura: Learning Affordance Segmentation from Very Few Examples. In: Brox, T., Bruhn, A., Fritz, M. (eds) Pattern Recognition. GCPR 2018. Lecture Notes in Computer Science, vol 11269. Springer, Cham. https://doi.org/10.1007/978-3-030-12939-2_13

DOI: https://doi.org/10.1007/978-3-030-12939-2_13
Published: 14 February 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12938-5
Online ISBN: 978-3-030-12939-2
eBook Packages: Computer Science, Computer Science (R0)
