Abstract
Autonomy is a central issue in robotics, and it is closely related to decision making. Recent research on decision making for social robots has focused on biologically inspired mechanisms. Following this approach, we propose a motivational decision-making system that uses internal stimuli (drives) and external stimuli to learn to choose the right action. Actions are selected from a finite set of skills so as to keep the robot’s needs within an acceptable range. The robot uses reinforcement learning to estimate the suitability of every action in each state. The state of the robot is determined by its dominant motivation and its relation to the objects present in its environment.
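As a rough illustration of how such a state could be formed, the sketch below picks a dominant motivation from drives and external stimuli and pairs it with an object relation. The additive combination rule, the activation threshold, and all names (`dominant_motivation`, `robot_state`, `energy`, `social`, `near_charger`) are assumptions made for this example, not details given in the abstract.

```python
from typing import Dict, Tuple


def dominant_motivation(drives: Dict[str, float],
                        stimuli: Dict[str, float],
                        threshold: float = 0.0) -> str:
    """Return the motivation with the highest intensity, or "none".

    Assumption: a motivation is active only when its drive exceeds
    `threshold`, and an external stimulus simply adds to the drive.
    """
    best_name, best_value = "none", 0.0
    for name, drive in drives.items():
        intensity = drive + stimuli.get(name, 0.0) if drive > threshold else 0.0
        if intensity > best_value:
            best_name, best_value = name, intensity
    return best_name


def robot_state(drives: Dict[str, float],
                stimuli: Dict[str, float],
                object_relation: str) -> Tuple[str, str]:
    """Build a state from the dominant motivation and an object relation."""
    return (dominant_motivation(drives, stimuli), object_relation)


# Hypothetical values: the stimulus for "energy" makes it the dominant motivation.
drives = {"energy": 0.2, "social": 0.6}
stimuli = {"energy": 0.5}
state = robot_state(drives, stimuli, "near_charger")
```

Keeping only the dominant motivation in the state, rather than the full vector of drive values, is one way the state stays small enough for tabular learning.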
The reinforcement learning method relies on a new algorithm called Object Q-Learning. The proposed reduction of the state space, together with the algorithm’s treatment of collateral effects (relationships between different objects), yields an algorithm suitable for robots living in real environments.
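For orientation, the following is a minimal sketch of standard tabular Q-learning over states of the form (dominant motivation, object relation). It is not the Object Q-Learning algorithm itself, which additionally accounts for collateral effects between objects. The action names, the hyperparameter values, and the idea of using a wellbeing change as the reward are assumptions made for this example.

```python
import random
from collections import defaultdict

# Hypothetical hyperparameters and skill set, chosen only for illustration.
ALPHA, GAMMA, EPSILON = 0.3, 0.8, 0.1
ACTIONS = ["charge", "play", "idle"]

# Q-values keyed by (state, action); unseen pairs default to 0.0.
q_table = defaultdict(float)


def choose_action(state, rng):
    """Epsilon-greedy selection over the finite set of actions."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def update(state, action, reward, next_state):
    """Standard Q-learning update (plain Q-learning, not Object Q-Learning)."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])


# One learning step; the reward is assumed to be a change in wellbeing.
s = ("energy", "near_charger")
s_next = ("social", "near_charger")
update(s, "charge", 1.0, s_next)
action = choose_action(s, random.Random(0))
```

A lookup table is workable here only because the state space is small; a state built from every raw drive value and every object at once would grow too quickly for this representation.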
In this paper, a first version of the decision-making system and the learning process is implemented on a social robot, showing an improvement in the robot’s performance. The quality of its performance is assessed by observing the evolution of the robot’s wellbeing.
Cite this article
Castro-González, Á., Malfaz, M. & Salichs, M.A. Learning the Selection of Actions for an Autonomous Social Robot by Reinforcement Learning Based on Motivations. Int J of Soc Robotics 3, 427–441 (2011). https://doi.org/10.1007/s12369-011-0113-z
DOI: https://doi.org/10.1007/s12369-011-0113-z