Ex Paucis Plura: Learning Affordance Segmentation from Very Few Examples

  • Johann Sawatzky
  • Martin Garbade
  • Juergen Gall
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11269)

Abstract

While annotating objects in images is already time-consuming, annotating finer details such as object parts or affordances is even more tedious. Given that large datasets with object annotations already exist, we address the question of whether such information can be leveraged to train a convolutional neural network for segmenting affordances or object parts from very few examples with finer annotations. To this end, we use a semantic alignment network to transfer the annotations from the small set of finely annotated examples to a large set of images with only coarse annotations at the object level. We then train a convolutional neural network in a weakly supervised fashion on the small annotated training set together with the additional images with transferred labels. We evaluate our approach on the IIT-AFF and Pascal Parts datasets, where it outperforms other weakly supervised approaches.
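The transfer step described in the abstract can be outlined as follows. This is an illustrative sketch only, not the authors' implementation: the function `transfer_labels`, the `align` callable, and the dictionary keys are all hypothetical placeholders standing in for a semantic alignment network and its data structures.

```python
def transfer_labels(annotated_set, coarse_set, align):
    """Warp fine annotations (e.g. affordance masks) from a few finely
    labeled examples onto many coarsely labeled images.

    annotated_set: list of dicts with keys "cls", "img", "mask"
                   (object class, image, fine annotation mask)
    coarse_set:    list of dicts with keys "cls", "img"
                   (object class known, no fine annotation)
    align:         callable align(src_img, dst_img, src_mask) -> dst_mask,
                   standing in for a semantic alignment network that
                   warps the source mask into the target image frame
    """
    transferred = []
    for target in coarse_set:
        # Only transfer between images showing the same object class.
        candidates = [ex for ex in annotated_set if ex["cls"] == target["cls"]]
        if not candidates:
            continue  # no finely annotated example of this class
        source = candidates[0]
        # Warp the fine annotation from the source into the target image.
        warped_mask = align(source["img"], target["img"], source["mask"])
        transferred.append({"img": target["img"], "mask": warped_mask})
    return transferred
```

The resulting transferred masks, being only approximate, would then serve as weak supervision alongside the small finely annotated set when training the segmentation network.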

Notes

Acknowledgement

The work has been financially supported by the DFG project GA 1927/5-1 (DFG Research Unit FOR 2535 Anticipating Human Behavior).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Bonn, Bonn, Germany