Coaching Robots: Online Behavior Learning from Human Subjective Feedback

Hirokawa, Masakazu; Suzuki, Kenji

doi:10.1007/978-3-642-32177-1_3

Part of the book series: Studies in Computational Intelligence (SCI, volume 442)

Abstract

This chapter describes Coaching, a novel methodology for agent behavior learning. The method is interactive and iterative: a human trainer gives the robotic agent a subjective evaluation in real time, and the agent dynamically updates its reward function based on that evaluation while it continues to learn. We demonstrate that the agent can learn the desired behavior from simple subjective instructions such as "positive" and "negative". The approach is also effective when a suitable reward function for the learning situation is difficult to determine in advance. We conducted several experiments with a simulated and a real robot arm system, and the results verify the advantage of the proposed method.
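
The abstract does not state the update rule, so the following is only a minimal sketch of the general idea, not the chapter's algorithm. It assumes tabular Q-learning on a toy one-dimensional task, a learned reward table that starts at zero, and a simulated trainer in place of a human. All names and parameter values (`trainer_feedback`, `TARGET`, `beta`, and so on) are hypothetical and chosen for illustration.

```python
import random

N_STATES = 7          # positions 0..6 on a line (hypothetical toy task)
TARGET = 5            # position the simulated trainer wants the agent to reach
ACTIONS = (-1, +1)    # index 0 moves left, index 1 moves right


def trainer_feedback(prev_state, state):
    """Simulated trainer (assumption): +1 if the agent got closer to TARGET, else -1."""
    return 1 if abs(state - TARGET) < abs(prev_state - TARGET) else -1


def train(episodes=200, steps=20, alpha=0.5, gamma=0.5, beta=0.3,
          epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Reward estimate R(s, a): starts at zero and is shaped only by feedback.
    reward = [[0.0, 0.0] for _ in range(N_STATES)]
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(steps):
            # Epsilon-greedy action selection; ties resolve to "right".
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt = min(max(state + ACTIONS[a], 0), N_STATES - 1)
            # The trainer evaluates the move; the reward estimate is updated online.
            fb = trainer_feedback(state, nxt)
            reward[state][a] += beta * (fb - reward[state][a])
            # Standard Q-learning step, using the current learned reward.
            target = reward[state][a] + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q, reward


def greedy_policy(q):
    """Return the preferred move (-1 or +1) for each state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q]
```

In this sketch no reward function is specified in advance: the agent sees only positive or negative evaluations, and calling `greedy_policy(train()[0])` gives the move it prefers in each position after training.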

Author information

Authors and Affiliations

Dept. of Intelligent Interaction Technologies, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8573, Japan
Masakazu Hirokawa
Faculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8573, Japan
Kenji Suzuki

Corresponding author

Correspondence to Masakazu Hirokawa.

Editor information

Editors and Affiliations

School of Computing, University of Portsmouth, PO1 3HE, Portsmouth, United Kingdom
Ivan Jordanov
School of Electrical and Information Eng, University of South Australia, Adelaide, SA 5095, Adelaide, South Australia, Australia
Lakhmi C. Jain

About this chapter

Cite this chapter

Hirokawa, M., Suzuki, K. (2013). Coaching Robots: Online Behavior Learning from Human Subjective Feedback. In: Jordanov, I., Jain, L.C. (eds) Innovations in Intelligent Machines -3. Studies in Computational Intelligence, vol 442. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32177-1_3

DOI: https://doi.org/10.1007/978-3-642-32177-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32176-4
Online ISBN: 978-3-642-32177-1
eBook Packages: Engineering, Engineering (R0)
