Towards affordance detection for robot manipulation using affordance for parts and parts for affordance

  • Safoura Rezapour Lakani
  • Antonio J. Rodríguez-Sánchez
  • Justus Piater


As robots start to interact with their environments, they need to reason about the affordances of objects in those environments. In most cases, affordances can be inferred only from parts of objects, such as the blade of a knife for cutting or the head of a hammer for pounding. We propose a part-based affordance detection method for RGB-D data in which the parts themselves are obtained based on the affordances. We show that affordance detection benefits from a part-based object representation, since parts are distinctive and generalize to novel objects. We compare our method with other state-of-the-art affordance detection methods on a benchmark dataset (Myers et al. 2015), outperforming them by an average of 14% on novel object instances. Furthermore, we apply our affordance detection method to a robotic grasping scenario to demonstrate that the robot is able to perform grasps after detecting the affordances.


Keywords: Affordances, Part segmentation, RGB-D perception, Supervised learning


Supplementary material

Supplementary material 1: 10514_2018_9787_MOESM1_ESM.mp4 (MP4, 31.6 MB)


  1. Aldoma, A., Tombari, F., & Vincze, M. (2012). Supervised learning of hidden and non-hidden 0-order affordances and detection in real scenes. In 2012 IEEE International Conference on Robotics and Automation (ICRA), IEEE, pp. 1732–1739.
  2. Bo, L., Ren, X., & Fox, D. (2013). Unsupervised feature learning for RGB-D based object recognition. In J. P. Desai, G. Dudek, O. Khatib, & V. Kumar (Eds.), Experimental robotics (pp. 387–402). Springer.
  3. Desai, C., & Ramanan, D. (2013). Predicting functional regions on objects. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.
  4. Fei-Fei, L., & Perona, P. (2005). A Bayesian hierarchical model for learning natural scene categories. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), IEEE, vol. 2, pp. 524–531.
  5. Felzenszwalb, P. F., Girshick, R. B., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 32(9), 1627–1645.
  6. Fidler, S., & Leonardis, A. (2007). Towards scalable representations of object categories: Learning a hierarchy of parts. In IEEE Conference on Computer Vision and Pattern Recognition, IEEE, pp. 1–8.
  7. Fu, H., Cohen-Or, D., Dror, G., & Sheffer, A. (2008). Upright orientation of man-made objects. In ACM Transactions on Graphics (TOG) (Vol. 27, p. 42). ACM.
  8. Gibson, J. J. (1977). The theory of affordances. Perceiving, Acting, and Knowing: Toward an Ecological Psychology, 67–82.
  9. Gibson, J. J. (1979). The ecological approach to visual perception. Hove: Psychology Press.
  10. Hart, S., Dinh, P., & Hambuchen, K. (2014). Affordance templates for shared robot control. In Artificial Intelligence and Human-Robot Interaction, AAAI Fall Symposium Series, Arlington, VA, USA.
  11. Hart, S., Dinh, P., & Hambuchen, K. (2015). The affordance template ROS package for robot task programming. In IEEE International Conference on Robotics and Automation (ICRA), 2015, IEEE, pp. 6227–6234.
  12. Hermans, T., Rehg, J. M., & Bobick, A. (2011). Affordance prediction via learned object attributes. In IEEE International Conference on Robotics and Automation (ICRA): Workshop on Semantic Perception, Mapping, and Exploration, pp. 181–184.
  13. Katz, D., Venkatraman, A., Kazemi, M., Bagnell, J. A., & Stentz, A. (2014). Perceiving, learning, and exploiting object affordances for autonomous pile manipulation. Autonomous Robots, 37(4), 369–382.
  14. Koppula, H. S., & Saxena, A. (2014). Physically grounded spatio-temporal object affordances. In European Conference on Computer Vision, Springer, pp. 831–847.
  15. Laga, H., Mortara, M., & Spagnuolo, M. (2013). Geometry and context for semantic correspondences and functionality recognition in man-made 3D shapes. ACM Transactions on Graphics (TOG), 32(5), 150.
  16. Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006, IEEE, vol. 2, pp. 2169–2178.
  17. Leung, T., & Malik, J. (2001). Representing and recognizing the visual appearance of materials using three-dimensional textons. International Journal of Computer Vision, 43(1), 29–44.
  18. Margolin, R., Zelnik-Manor, L., & Tal, A. (2014). How to evaluate foreground maps? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255.
  19. Myers, A., Teo, C. L., Fermüller, C., & Aloimonos, Y. (2015). Affordance detection of tool parts from geometric features. In International Conference on Robotics and Automation (ICRA).
  20. Nguyen, A., Kanoulas, D., Caldwell, D. G., & Tsagarakis, N. G. (2016). Detecting object affordances with convolutional neural networks. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, pp. 2765–2770.
  21. Norman, D. A. (1988). The psychology of everyday things. New York: Basic Books.
  22. Omrčen, D., Böge, C., Asfour, T., Ude, A., & Dillmann, R. (2009). Autonomous acquisition of pushing actions to support object grasping with a humanoid robot. In 9th IEEE-RAS International Conference on Humanoid Robots, IEEE, pp. 277–283.
  23. Platt, J., et al. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3), 61–74.
  24. Rabbani, T., Van Den Heuvel, F., & Vosselmann, G. (2006). Segmentation of point clouds using smoothness constraint. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36(5), 248–253.
  25. Rezapour Lakani, S., Rodríguez-Sánchez, A., & Piater, J. (2017). Can affordances guide object decomposition into semantically meaningful parts? In IEEE Winter Conference on Applications of Computer Vision (WACV).
  26. Richtsfeld, A., Mörwald, T., Prankl, J., Zillich, M., & Vincze, M. (2014). Learning of perceptual grouping for object segmentation on RGB-D data. Journal of Visual Communication and Image Representation, 25(1), 64–73.
  27. Rivlin, E., Dickinson, S. J., & Rosenfeld, A. (1995). Recognition by functional parts. Computer Vision and Image Understanding, 62(2), 164–176.
  28. Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1985). Learning internal representations by error propagation. Tech. rep., DTIC Document.
  29. Rusu, R. B., & Cousins, S. (2011). 3D is here: Point cloud library (PCL). In IEEE International Conference on Robotics and Automation (ICRA), IEEE, pp. 1–4.
  30. Sawatzky, J., Srikantha, A., & Gall, J. (2017). Weakly supervised affordance detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  31. Schmidt, M. (2007). UGM: A Matlab toolbox for probabilistic undirected graphical models.
  32. Stark, L., & Bowyer, K. (1991). Achieving generalized object recognition through reasoning about association of function to structure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(10), 1097–1104.
  33. Stark, M., Lies, P., Zillich, M., Wyatt, J., & Schiele, B. (2008). Functional object class detection based on learned affordance cues. Computer Vision Systems, 5008, 435–444.
  34. Stein, C. S., Schoeler, M., Papon, J., & Wörgötter, F. (2014). Object partitioning using local convexity. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  35. Varadarajan, K. M., & Vincze, M. (2011). Affordance based part recognition for grasping and manipulation. In Workshop on Autonomous Grasping, ICRA.
  36. Wang, J., & Yuille, A. L. (2015). Semantic part segmentation using compositional model combining shape and appearance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1788–1797.
  37. Yao, B., Ma, J., & Fei-Fei, L. (2013). Discovering object functionality. In The IEEE International Conference on Computer Vision (ICCV).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Universität Innsbruck, Innsbruck, Austria
