Teaching a Robot to Perform Task through Imitation and On-line Feedback

  • Adrián León
  • Eduardo F. Morales
  • Leopoldo Altamirano
  • Jaime R. Ruiz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7042)


Service robots are becoming increasingly available, and they are expected to take part in many human activities in the near future. It is desirable for these robots to adapt to the user’s needs, so non-expert users will have to teach them how to perform new tasks in natural ways. This paper describes a new teaching-by-demonstration algorithm. It uses a Kinect® sensor to track the user’s movements, eliminating the need for special sensors or controlled environment conditions; it represents tasks with a relational representation, which eases the correspondence problem between the user and the robot arm and yields more general task descriptions; it uses reinforcement learning to improve over the initial sequences provided by the user; and it incorporates on-line feedback from the user during the learning process, creating a novel dynamic reward shaping mechanism that converges faster to an optimal policy. We demonstrate the approach by teaching a robot arm simple manipulation tasks and show its superiority over more traditional reinforcement learning algorithms.


robot learning · reinforcement learning · programming by demonstration · reward shaping



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Adrián León (1)
  • Eduardo F. Morales (1)
  • Leopoldo Altamirano (1)
  • Jaime R. Ruiz (1)

  1. National Institute of Astrophysics, Optics and Electronics, Tonantzintla, México
