Coaching Robots: Online Behavior Learning from Human Subjective Feedback

  • Chapter
Innovations in Intelligent Machines -3

Part of the book series: Studies in Computational Intelligence (SCI, volume 442)

Abstract

This chapter describes Coaching, a novel methodology for agent behavior learning. Coaching is an interactive, iterative learning method in which a human trainer gives the robotic agent subjective evaluations in real time, and the agent dynamically updates its reward function based on those evaluations as they arrive. We demonstrate that the agent can learn the desired behavior from simple, subjective instructions such as "positive" and "negative". The approach is also effective when a suitable reward function for the learning situation is difficult to determine in advance. Experiments with both a simulated and a real robot arm system verify the advantage of the proposed method.
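
The loop described in the abstract, in which trainer feedback reshapes the reward function while the agent keeps learning, can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the tabular Q-learner, the 1-D track task, the names `CoachedAgent`, `simulated_trainer`, and `run_coaching_session`, and the rates `alpha`, `beta`, `gamma`, and `epsilon` are all assumptions chosen for illustration, and a scripted function stands in for the human trainer.

```python
import random


class CoachedAgent:
    """Tabular Q-learner whose reward table is shaped online by trainer feedback.

    Illustrative only: the class, its parameters, and the update rules are
    assumptions, not the chapter's actual formulation.
    """

    def __init__(self, n_states, n_actions, alpha=0.5, beta=0.5,
                 gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # Q-value learning rate
        self.beta = beta        # reward-function learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # Both tables start at zero: no reward function is specified in advance.
        self.reward = [[0.0] * n_actions for _ in range(n_states)]
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def act(self, state):
        """Epsilon-greedy action selection; ties are broken at random."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        best = max(self.q[state])
        return self.rng.choice(
            [a for a, v in enumerate(self.q[state]) if v == best])

    def coach(self, state, action, feedback):
        """Move the reward estimate toward the trainer's feedback (+1 or -1)."""
        r = self.reward[state][action]
        self.reward[state][action] = r + self.beta * (feedback - r)

    def learn(self, state, action, next_state):
        """One Q-learning step that uses the coached reward estimate."""
        target = (self.reward[state][action]
                  + self.gamma * max(self.q[next_state]))
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def simulated_trainer(state, action, goal):
    """Stand-in for the human: +1 if the move approaches the goal, else -1."""
    step = 1 if action == 1 else -1
    return 1 if abs(state + step - goal) < abs(state - goal) else -1


def run_coaching_session(n_states=7, goal=5, episodes=200, steps=20, seed=0):
    """Coach an agent on a 1-D track (action 0 moves left, action 1 right)."""
    agent = CoachedAgent(n_states, n_actions=2, seed=seed)
    for _ in range(episodes):
        state = agent.rng.randrange(n_states)
        for _ in range(steps):
            action = agent.act(state)
            step = 1 if action == 1 else -1
            next_state = min(max(state + step, 0), n_states - 1)
            # Feedback arrives during the interaction and updates the reward
            # estimate before the value update uses it.
            agent.coach(state, action, simulated_trainer(state, action, goal))
            agent.learn(state, action, next_state)
            state = next_state
    return agent


if __name__ == "__main__":
    agent = run_coaching_session()
    for s, row in enumerate(agent.reward):
        print(s, [round(r, 2) for r in row])
```

Running the script prints the learned reward table, one row per state. The sketch keeps the reward estimate separate from the Q-values so that the trainer's positive or negative signal changes what the agent is rewarded for, rather than being consumed as a one-off reward.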

Author information

Corresponding author

Correspondence to Masakazu Hirokawa.

Copyright information

© 2013 Springer Berlin Heidelberg

About this chapter

Cite this chapter

Hirokawa, M., Suzuki, K. (2013). Coaching Robots: Online Behavior Learning from Human Subjective Feedback. In: Jordanov, I., Jain, L.C. (eds) Innovations in Intelligent Machines -3. Studies in Computational Intelligence, vol 442. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32177-1_3

  • DOI: https://doi.org/10.1007/978-3-642-32177-1_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32176-4

  • Online ISBN: 978-3-642-32177-1

  • eBook Packages: Engineering, Engineering (R0)
