
Learning Dynamic Robot-to-Human Object Handover from Human Feedback

  • Andras Kupcsik
  • David Hsu
  • Wee Sun Lee
Chapter
Part of the Springer Proceedings in Advanced Robotics book series (SPAR, volume 2)

Abstract

Object handover is a basic but essential capability for robots interacting with humans in many applications, e.g., caring for the elderly and assisting workers in manufacturing workshops. It appears deceptively simple, as humans perform object handover almost flawlessly. The success of humans, however, belies the complexity of object handover as a collaborative physical interaction between two agents with limited communication. This paper presents a learning algorithm for dynamic object handover, for example, when a robot hands over water bottles to marathon runners passing by the water station. We formulate the problem as contextual policy search, in which the robot learns object handover by interacting with the human. A key challenge here is to learn the latent reward of the handover task under noisy human feedback. Preliminary experiments show that the robot learns to hand over a water bottle naturally and that it adapts to the dynamics of human motion. One challenge for the future is to combine the model-free learning algorithm with a model-based planning approach, enabling the robot to adapt to human preferences and object characteristics such as shape, weight, and surface texture.
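To make the learning loop described above concrete, the following is a minimal sketch of contextual policy search on a toy handover problem. It assumes a one-dimensional context (the runner's speed), a single policy parameter (a handover trigger time) drawn from a linear-Gaussian upper-level policy, and a simulated noisy scalar score standing in for human feedback; all names and numbers are illustrative. The update shown is a simple reward-weighted regression, a stand-in for the paper's actual algorithm, which learns a latent reward model from the feedback rather than using the raw scores directly.

    # Toy contextual policy search sketch (illustrative, not the authors' method).
    # Context s: runner speed. Parameter w: handover trigger time.
    # Upper-level policy: pi(w | s) = N(a + b*s, sigma^2).
    import numpy as np

    rng = np.random.default_rng(0)
    a, b, sigma = 0.0, 0.0, 1.0  # initial policy: uninformed, wide exploration

    def latent_reward(s, w):
        # Hidden ground truth: the best trigger time shifts with runner speed.
        return -(w - (0.5 + 2.0 * s)) ** 2

    for it in range(50):
        # Roll out a batch of handovers in random contexts.
        s = rng.uniform(0.5, 2.0, size=20)                 # sampled contexts
        w = a + b * s + sigma * rng.standard_normal(20)    # sampled parameters
        r = latent_reward(s, w) + 0.3 * rng.standard_normal(20)  # noisy feedback

        # Reward-weighted regression update: soft-max weights favor
        # rollouts that received better feedback.
        d = np.exp(5.0 * (r - r.max()))
        X = np.column_stack([np.ones_like(s), s])
        XtW = X.T * d                                      # weight each rollout
        coef = np.linalg.solve(XtW @ X, XtW @ w)           # weighted least squares
        a, b = coef
        resid = w - X @ coef
        sigma = max(np.sqrt(np.sum(d * resid ** 2) / d.sum()), 0.05)

    print(f"learned policy: w = {a:.2f} + {b:.2f} * s   (true optimum: 0.5 + 2.0 * s)")

The soft-max weighting trades off exploration against exploitation, and the floor on sigma keeps some exploration alive so that noisy feedback cannot collapse the policy prematurely; both are standard design choices in this family of methods.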

Acknowledgements

This research was supported in part by an A*STAR Industrial Robotics Program grant (R-252-506-001-305) and a SMART Phase-2 Pilot grant (R-252-000-571-592).

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. School of Computing, National University of Singapore, Singapore
