Learning the Selection of Actions for an Autonomous Social Robot by Reinforcement Learning Based on Motivations


Abstract

Autonomy is a key issue in robotics, and it is closely related to decision making. Recent research on decision making for social robots has focused on biologically inspired mechanisms. Following this approach, we propose a motivational system for decision making that uses internal stimuli (drives) and external stimuli to learn to choose the right action. Actions are selected from a finite set of skills in order to keep the robot's needs within an acceptable range. The robot uses reinforcement learning to estimate the suitability of every action in each state. The state of the robot is determined by its dominant motivation and by its relation to the objects present in its environment.

The reinforcement learning method exploits a new algorithm called Object Q-Learning. The proposed reduction of the state space, together with the new algorithm's treatment of collateral effects (the relationships between different objects), makes the approach suitable for robots living in real environments.

In this paper, a first implementation of the decision making system and the learning process is presented on a social robot, showing an improvement in the robot's performance. The quality of this performance is assessed by observing the evolution of the robot's wellbeing.
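
As an illustration of the state representation described above, the following is a minimal, generic Q-learning sketch in which the state combines a dominant motivation with an object-related external stimulus. All names, values, and the epsilon-greedy policy are hypothetical, and the sketch does not reproduce the paper's Object Q-Learning handling of collateral effects.

```python
# Minimal, generic Q-learning sketch (hypothetical names and values); the paper's
# Object Q-Learning additionally accounts for collateral effects between objects,
# which is not reproduced here.
from collections import defaultdict
import random

ALPHA, GAMMA, EPSILON = 0.3, 0.9, 0.1   # learning rate, discount factor, exploration rate

# Q[(state, action)] -> estimated value; state = (dominant_motivation, object_relation)
Q = defaultdict(float)

def choose_action(state, skills):
    """Epsilon-greedy selection over the robot's finite set of skills."""
    if random.random() < EPSILON:
        return random.choice(skills)
    return max(skills, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, skills):
    """Standard one-step Q-learning update of the state-action value."""
    best_next = max(Q[(next_state, a)] for a in skills)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Hypothetical example: dominant motivation "energy" while near a docking station.
state = ("energy", "near_docking_station")
skills = ["dock_and_charge", "wander", "interact"]
action = choose_action(state, skills)
# A plausible reward signal would be the change in the robot's wellbeing
# observed after executing the selected skill.
update(state, action, reward=1.0, next_state=("play", "near_person"), skills=skills)
```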


Author information

Corresponding author

Correspondence to Álvaro Castro-González.

About this article

Cite this article

Castro-González, Á., Malfaz, M. & Salichs, M.A. Learning the Selection of Actions for an Autonomous Social Robot by Reinforcement Learning Based on Motivations. Int J of Soc Robotics 3, 427–441 (2011). https://doi.org/10.1007/s12369-011-0113-z
