Perceiving, learning, and exploiting object affordances for autonomous pile manipulation

Published in: Autonomous Robots

Abstract

Autonomous manipulation in unstructured environments will enable a large variety of exciting and important applications. Despite its promise, autonomous manipulation remains largely unsolved. Even the most rudimentary manipulation task—such as removing objects from a pile—remains challenging for robots. We identify three major challenges that must be addressed to enable autonomous manipulation: object segmentation, action selection, and motion generation. These challenges become more pronounced when unknown man-made or natural objects are cluttered together in a pile. We present a system capable of manipulating unknown objects in such an environment. Our robot is tasked with clearing a table by removing objects from a pile and placing them into a bin. To that end, we address the three aforementioned challenges. Our robot perceives the environment with an RGB-D sensor, segmenting the pile into object hypotheses using non-parametric surface models. Our system then computes the affordances of each object, and selects the best affordance and its associated action to execute. Finally, our robot instantiates the proper compliant motion primitive to safely execute the desired action. For efficient and reliable action selection, we developed a framework for supervised learning of manipulation expertise. To verify the performance of our system, we conducted dozens of trials and report on several hours of experiments involving more than 1,500 interactions. The results show that our learning-based approach for pile manipulation outperforms a common sense heuristic as well as a random strategy, and is on par with human action selection.


(Figures 1–13 of the article are not reproduced here.)

Notes

  1. We scale each feature \(f_i\) using its mean \(E(f_i)\) and variance \(V(f_i)\): \(f_i^{scaled} = (f_i-E(f_i))/\sqrt{V(f_i)}\). If a scaled feature is more than two standard deviations away from the mean, we cap \(f_i^{scaled}\) at either \(-2\) or \(2\). Finally, we divide \(f_i^{scaled}\) by \(2\) to guarantee that all features are in the range \([-1,1]\).
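The scaling described in the note above can be sketched as follows (a minimal NumPy sketch; the function name and column-wise layout are our assumptions, not the authors' implementation):

```python
import numpy as np

def scale_features(F):
    """Scale each feature column to [-1, 1] per the note:
    standardize by mean and standard deviation, clip to two
    standard deviations, then halve. Assumes no constant columns
    (a zero-variance column would divide by zero)."""
    F = np.asarray(F, dtype=float)
    mean = F.mean(axis=0)
    std = np.sqrt(F.var(axis=0))
    scaled = (F - mean) / std            # zero mean, unit variance
    scaled = np.clip(scaled, -2.0, 2.0)  # cap at +/- 2 standard deviations
    return scaled / 2.0                  # final range is [-1, 1]
```

Any feature more than two standard deviations from its mean saturates at the boundary of \([-1,1]\), so outliers cannot dominate the learned action-selection model.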


Acknowledgments

This work was conducted in part through collaborative participation in the Robotics Consortium sponsored by the U.S. Army Research Laboratory under the Collaborative Technology Alliance Program, Cooperative Agreement W911NF-10-2-0016, and in part by Intel (Embedded Technology Intel Science and Technology Center). The authors also gratefully acknowledge funding under the DARPA Autonomous Robotic Manipulation Software Track (ARM-S) program.

Author information

Correspondence to Dov Katz.

About this article

Cite this article

Katz, D., Venkatraman, A., Kazemi, M. et al. Perceiving, learning, and exploiting object affordances for autonomous pile manipulation. Auton Robot 37, 369–382 (2014). https://doi.org/10.1007/s10514-014-9407-y
