
Ex Paucis Plura: Learning Affordance Segmentation from Very Few Examples

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11269)

Abstract

While annotating objects in images is already time-consuming, annotating finer details such as object parts or affordances is even more tedious. Since large datasets with object annotations already exist, we address the question of whether such information can be leveraged to train a convolutional neural network for segmenting affordances or object parts from very few examples with finer annotations. To this end, we use a semantic alignment network to transfer the annotations from the small set of annotated examples to a large set of images with only coarse annotations at the object level. We then train a convolutional neural network weakly supervised on the small annotated training set and the additional images with transferred labels. We evaluate our approach on the IIT-AFF and Pascal Parts datasets, where it outperforms other weakly supervised approaches.
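The abstract describes a two-stage idea: transfer the few dense affordance masks onto additional, coarsely annotated images via a semantic alignment network, then train a segmentation CNN on both the original examples and the transferred pseudo-labels. The following is a minimal, hypothetical PyTorch sketch of that idea, not the authors' implementation: the alignment network is replaced by an assumed affine transform, and `TinySegNet`, the class count, and all hyperparameters are placeholders for illustration.

```python
# Minimal sketch (assumptions throughout): warp few-shot affordance masks onto
# extra images with an assumed affine transform standing in for the semantic
# alignment network, then train a small segmentation CNN on real + pseudo labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_AFFORDANCES = 5   # assumption: affordance classes incl. background
IGNORE_LABEL = 255    # assumption: label value for unreliable pixels

class TinySegNet(nn.Module):
    """Stand-in for the segmentation CNN (architecture is an assumption)."""
    def __init__(self, num_classes=NUM_AFFORDANCES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Conv2d(32, num_classes, 1)

    def forward(self, x):
        return self.classifier(self.features(x))

def transfer_labels(src_mask, theta, out_size):
    """Warp source affordance masks into the target image frame.

    `theta` is a batch of 2x3 affine transforms, standing in for the geometric
    transformation an alignment network would estimate.
    """
    grid = F.affine_grid(theta, out_size, align_corners=False)
    # one-hot -> warp with nearest sampling -> back to hard labels
    onehot = F.one_hot(src_mask, NUM_AFFORDANCES).permute(0, 3, 1, 2).float()
    warped = F.grid_sample(onehot, grid, mode='nearest', align_corners=False)
    return warped.argmax(dim=1)

# Toy tensors standing in for IIT-AFF / Pascal Parts data.
few_shot_imgs  = torch.rand(2, 3, 64, 64)                        # densely annotated examples
few_shot_masks = torch.randint(0, NUM_AFFORDANCES, (2, 64, 64))  # their affordance masks
coarse_imgs    = torch.rand(2, 3, 64, 64)                        # object-level annotations only

# Assumption: the alignment step produced these transforms (identity here).
theta = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]]).repeat(2, 1, 1)
pseudo_masks = transfer_labels(few_shot_masks, theta, (2, NUM_AFFORDANCES, 64, 64))

# Weakly supervised training on annotated examples + transferred labels.
model = TinySegNet()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss(ignore_index=IGNORE_LABEL)

images = torch.cat([few_shot_imgs, coarse_imgs])
labels = torch.cat([few_shot_masks, pseudo_masks])
for _ in range(3):  # a few toy iterations
    opt.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    opt.step()
```

In the paper the transformation comes from a trained semantic alignment network and the transferred labels are used only as weak supervision; the nearest-neighbor warping and identity transform above merely make the data flow concrete.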



Acknowledgement

The work has been financially supported by the DFG project GA 1927/5-1 (DFG Research Unit FOR 2535 Anticipating Human Behavior).

Author information


Corresponding author

Correspondence to Johann Sawatzky.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Sawatzky, J., Garbade, M., Gall, J. (2019). Ex Paucis Plura: Learning Affordance Segmentation from Very Few Examples. In: Brox, T., Bruhn, A., Fritz, M. (eds) Pattern Recognition. GCPR 2018. Lecture Notes in Computer Science, vol 11269. Springer, Cham. https://doi.org/10.1007/978-3-030-12939-2_13


  • DOI: https://doi.org/10.1007/978-3-030-12939-2_13


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-12938-5

  • Online ISBN: 978-3-030-12939-2

  • eBook Packages: Computer Science, Computer Science (R0)
