Abstract
In recent years, considerable research has been carried out to develop robots that can improve our quality of life during tedious and challenging tasks. In these contexts, robots operating without human supervision open many possibilities to assist people in their daily activities. When autonomous robots collaborate with humans, social skills are necessary for adequate communication and cooperation. Considering these facts, endowing autonomous and social robots with decision-making and control models is critical for appropriately fulfilling their initial goals. This manuscript presents a systematic review of the evolution of decision-making systems and control architectures for autonomous and social robots in the last three decades. These architectures have been incorporating new methods based on biologically inspired models and Machine Learning to enhance what these systems can offer to developed societies. The review explores the most novel advances in each application area, comparing their most essential features. Additionally, we describe the current challenges of software architecture devoted to action selection, an analysis not provided in similar reviews of behavioural models for autonomous and social robots. Finally, we present the directions that these systems can take in the future.
1 Introduction
The unceasing technological development of the last decades has brought many advances to our society. Among these new advances, the development of autonomous robots that operate without human supervision opens a wide range of possibilities in tasks that can be dangerous for humans, are repetitive, or where the workforce is scarce (e.g. nursing). In these scenarios, autonomous robots usually have to assist people and interact with them, so endowing these machines with social behaviour is essential.
According to [1], social behaviour can be defined as “all behaviour that influences, or is influenced by, other members of the same species”. Therefore, since social robots are intended to interact with and assist people, emulating human behaviour and decision-making so that these systems fulfil their tasks autonomously enables better cooperation between social robots and their users [2]. Nevertheless, emulating biological functions in robots is not easy, as many concepts intertwine to shape human behaviour. The artificial life community has typically addressed this challenge using ethological (the study of animal behaviour) approaches in which terms like perception, cognition, emotion and affect, homeostasis, motivation, learning, or social interaction are widely used. Next, we define these terms to help frame and understand the importance of these concepts for the autonomous and social behaviour of social robots, the topic of this review.
- Perception: Human perception can be defined as our primary form of cognitive contact with the world around us [3]. Therefore, in robotics, it refers to the capacity to perceive the external environment.
- Cognition: This term refers to the human ability to know, learn, and understand things [4]. Consequently, designing cognitive systems implies making them capable of reasoning about their actions.
- Emotion and affect: Emotions are mental states derived from the situations that we experience that are sometimes translated into physical responses [5]. Thus, emotions and affect are typically used in robotics to emulate how the robot feels due to its experiences.
- Homeostasis: The regulation by an organism of all aspects of its internal environment [6]. In robotics, it means emulating animal functions such as heart rate to regulate internal functions.
- Motivation: Motivation is what urges and drives behaviour [7]. It is closely related to perception and physiological needs as the basis of behaviour selection and execution.
- Learning: According to [6], learning implies gaining knowledge from study and experience. In robotics, it refers to improving the robot’s behaviour using past experiences after interacting with the environment. In social robots, the typical approach is Reinforcement Learning (RL).
- Social interaction: Social interaction can be defined as any process involving reciprocal stimulation or response between two or more individuals [6].
Since the late 1990s, many social robots with autonomous behaviour have been designed in areas such as education [8, 9], healthcare [10], companionship [11], or social interaction [12,13,14] emulating many of the previous ideas. In social scenarios, the interaction dynamics are typically unknown and unpredictable, so robots working in these environments must have appropriate decision-making capabilities to autonomously select their actions and successfully fulfil the task for which they are intended [15].
Considering these facts, since the early 1990s, many researchers have focused on investigating how to endow robots with decision-making capabilities and have designed many models, typically emulating animal behaviour [16]. Nowadays, autonomous and social robots are deployed in many scenarios as promising systems aiming to improve our quality of life. Nevertheless, to continue enhancing these systems’ capabilities, we believe that a deeper analysis of the current situation of decision-making and control architectures is necessary, assessing their evolution over the years and framing their challenges and future goals. Previously, Cao et al. [10] described the state of the art in behavioural models for social robots in healthcare. However, we have not found any review that addresses the evolution of decision-making systems (DMSs) for autonomous and social robots. For this reason, we propose this contribution, which fills this gap in the literature by providing a comprehensive overview of control architectures for autonomous and social robots.
This manuscript reviews the evolution and trends of DMSs and control architectures for autonomous and social robots in the last three decades. Moreover, we analyse how these systems have evolved in their application to specific areas, the duration of their operation, the included learning methods and the use of biologically inspired models that emulate animal (human) decision-making. From this analysis, we evaluate some of the principal challenges of DMSs and control architectures to envision future work that may help overcome some of their main limitations.
This manuscript is organised as follows. Section 2 presents the materials and methods followed during our study. Section 3 reviews the state of the art of DMSs and control architectures for autonomous and social robots by area of application. Next, Sect. 4 analyses the results of our survey of the last three decades, studying in depth the trends these systems have followed across decades. Then, in Sect. 5, we examine the challenges that autonomous and social robots have to tackle, emphasising those aspects that we have identified in our prior study. Section 6 addresses the future of DMSs for artificial embodied agents, drawing on our own experience as designers. Finally, we provide extensive comparative tables of the work reviewed in this manuscript for each area in the “Appendix” section.
2 Materials and Methods
This section presents the methodology we followed during our study, based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), to select the most appropriate contributions in control architectures and DMSs for autonomous and social robots.
2.1 Study Selection Procedures
This manuscript analyses empirical studies from the last three decades, as the bulk of the contributions in the area were carried out within this period. We built the bibliographic database used for the analysis by searching the Google Scholar, Scopus, and Web of Science electronic databases. These databases were selected due to the ample number of publications they contain and because they are used worldwide. Table 1 contains the queries employed for building the database used for conducting our examination.
The use of these keywords resulted in 182 hits in Google Scholar, 18 in Scopus, and 8 in Web of Science. The search was first conducted in Google Scholar, then Scopus, and finally Web of Science, obtaining 208 hits without duplicates. Unfortunately, we could not obtain the full text of 5 works from this list, leading to a final number of 203 works to be screened. After reading the title and abstract of these works, we excluded 31 papers because (i) they were not written in English or (ii) the architectures presented were for fully teleoperated robots. Consequently, 172 full-text articles were assessed for eligibility.
Finally, we selected 148 of the 172 works because they fulfilled our final requirements. These requirements were (i) describing the action selection or decision-making method for generating autonomous behaviour, (ii) involving humans in the decision-making process, (iii) indicating if they have been applied in real robots or just in simulation, and (iv) describing the system application. Figure 1 shows the process of identification, screening, eligibility, and inclusion in the analysis.
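The selection counts reported in this section can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative, and the number of works excluded at the eligibility stage is derived here from the reported totals rather than stated in the manuscript.

```python
# Counts reported in Sect. 2.1; all variable names are illustrative.
hits_per_database = {"Google Scholar": 182, "Scopus": 18, "Web of Science": 8}

identified = sum(hits_per_database.values())       # works identified by the queries
full_text_unavailable = 5                          # full text could not be obtained
screened = identified - full_text_unavailable      # works screened by title and abstract
excluded_at_screening = 31                         # not in English or fully teleoperated
assessed = screened - excluded_at_screening        # full-text articles assessed
included = 148                                     # works included in the review

# Derived value (not stated in the text): works rejected at the eligibility stage.
excluded_at_eligibility = assessed - included

for stage, count in [("identified", identified), ("screened", screened),
                     ("assessed", assessed), ("included", included)]:
    print(f"{stage:>10}: {count}")
```

Under these figures, the identification, screening, and eligibility stages are mutually consistent.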
3 Review
The following review describes the most relevant decision-making and control systems for autonomous and social agents in the last three decades. After carrying out an extensive review of contributions describing DMSs and exploring the areas where these systems are applied, we opted to present the works in the following categories:
- Research: In this category, we classify those publications that present decision-making and control systems as conceptual models not applied to any specific domain but purely designed for research.
- Manufacturing: This category clusters contributions that present decision systems used in manufacturing and production environments, such as factories.
- Healthcare: We classified the publications where a robot with autonomous decision-making improves people’s healthcare.
- Education: This category includes contributions that promote people’s learning by using autonomous and social robots.
- Entertainment: The contributions where decision-making and control architectures are used for the users’ entertainment are in this category.
- Companionship: This category contains publications where DMSs are integrated into robots that provide companionship to vulnerable sectors of society, such as older adults.
- Assistance and service: In this area, we present those publications concerned with assisting people and providing them with essential services to facilitate their tasks.
Then, Sect. 4 studies the number of works per decade and area, the evolution of the action selection and learning methods, whether these works use bioinspiration, the HRI duration of the experiment where the architecture is integrated, and whether they were applied in real scenarios and on a real robot. These classes were selected to provide an accurate vision of the evolution and challenges of these systems. Besides, we use them in the comparative tables included in this manuscript in the “Appendix” section.
3.1 Decision-Making Systems in Research
In the last thirty years, many contributions have described decision-making and control systems for robots. Since the term social robot was not coined until the 1990s, as Fig. 2 shows, our review starts with action selection architectures intended for autonomous robots. Then, with the rise of social robots, we provide a more detailed vision of architectures designed explicitly for social contexts and, more specifically, for Human–Robot Interaction (HRI). A comparative analysis of the works described in this section is in “Appendix A”.
3.1.1 The 1990s: Initial Research Models
We begin this survey with the work carried out by Meyer and Wilson [18] at the beginning of the 1990s, who presented a book about making robots intelligent and autonomous, providing insights about how to replicate human behaviour in robots using ideas previously published by Lorenz [19] and Tinbergen [20].
One year later, Maja [21] designed a framework for the autonomous navigation of mobile robots. The robot uses a compass and sonars as well as if–then rules to accomplish its navigation goals. Similarly, Mahadevan and Connell [22] proposed the autonomous control of a robot, but this time, using Q-learning combined with statistical clustering to select actions. They were the first to use Reinforcement Learning (RL) in autonomous robots, a widespread technique nowadays. In the same year, Elliot [23] designed a multi-agent virtual world to simulate the emotional behaviour of autonomous agents. Each agent presented their personality generated from the model of emotion due to Ortony et al. [24]. The behaviour of the agent depends on the agent’s personality. The study aimed to analyse the role of emotion in decision-making and behaviour.
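To make the Q-learning rule mentioned above concrete, the following is a minimal tabular sketch. It is not the method of Mahadevan and Connell [22]: the statistical clustering they combined with Q-learning is omitted, and the state names, action names, and the values of the learning rate `alpha` and discount factor `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    q maps (state, action) pairs to value estimates; unseen pairs default to 0.
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

# Hypothetical usage: a robot rewarded for moving forward in state "s0".
q_table = defaultdict(float)
actions = ["forward", "turn"]
first = q_update(q_table, "s0", "forward", 1.0, "s1", actions)
second = q_update(q_table, "s0", "forward", 1.0, "s1", actions)
```

Repeating the same rewarded transition moves the estimate progressively closer to the reward, which is the behaviour the update rule is meant to produce.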
Dorigo and Schnepf [25] designed a conceptual robot controller that can adapt to a dynamic environment. The robot incorporates Genetic Learning [26] to update its behaviour depending on the state of the environment as perceived by the robot’s sensors. Then, using an arbitration system, a set of rules selects an action. Like the previous paper, Hayes and Demiris [27] presented a model based on learning by imitation, where a robot selected its behaviour by perceiving the actions of a teacher robot. The novelty resides in knowing when to carry out learning depending on the usefulness of the teacher’s action. Continuing in the design of autonomous mobile robots, Nolfi et al. [28] analysed in 1994 how to conceptually design autonomous mobile robots using evolutionary approaches, providing different neural controllers to evaluate the behaviour exhibited by the robot in each situation and obtain the best solution depending on the robot’s situation. In similar scenarios, García et al. [29] explored in 1995 how to make autonomous robots work in navigation tasks, focusing on obtaining a scalable and modular model based on rules organised in decision trees.
By the mid-1990s, the tendency started to change, with the development of models inspired by nature. This does not mean that researchers abandoned probabilistic and rule-based models, but the number of publications emulating biological functions in robots grew notably. In this line, Steels [30] explored how to address autonomy and intelligence in artificial agents from a biologically inspired perspective. The author stated that biologically inspired decision-making is essential for the agents’ survival to make more capable robots. Developing this concept further, Webb [31] presented in 1995 a publication on emulating the behaviour of crickets in a robot. The study’s goal was to better understand animal ethology and the sensorimotor problems of animal robots. The decision-making consisted of selecting the best action depending on the robot’s state.
During those years, many authors started their research careers in autonomous robots. Some of these authors considered the work of Velásquez [32] in 1996 as a model for representing emotion and motivation in autonomous artificial agents. The architecture includes many essential biological aspects of humans, characterizing how we perceive the environment to make decisions. The dissertation presents great insights about making emotional and intelligent agents. In addition, the system was tested in the social robot Simon to work in HRI.
A couple of years later, in 1998, Velásquez [33] developed a new model for the autonomous decision-making of artificial emotional agents. The model simulates internal deficits that originate with the emotional responses to perceptions. Then, emotions, perceptions, and deficits influence the robot’s decision-making to produce appropriate actuation commands. The system was tested in different robots to explore the role of emotions in selecting actions. Using some of Velásquez’s ideas, Webster [34] introduced in 1997 the basics of emotional computing and intelligent processing to attain autonomous behaviour, positing that autonomous agents require reasoning and emotion to adapt to dynamic and complex situations. Like Webster, Arkin [35] studied how to endow a robot with autonomous behaviour, but from a more motivational perspective, addressing important aspects of human behaviour such as socialization, adaptation, and perception, from the perspective of both deliberative and reactive processes.
Cañamero presented in 1997 one of her first publications [36] in autonomous artificial agents with biologically inspired behaviour, describing a newborn living in a virtual world, whose primary goal is to survive. The model shapes essential functions like physiological deficits, motivations, and emotion, allowing the agent to exhibit a fully autonomous behaviour. In addition, the agent needs to interact socially with other virtual agents to reduce social needs and obtain the resources to incorporate learning mechanisms. The action selection consists of reducing the deficit associated with the motivational state with the highest level of intensity. As we will present later, this author updated this initial model on many occasions, applying it to HRI scenarios.
Moving back to fuzzy control and mobile robots, Tunstel et al. [37] presented in 1997 a DMS based on fuzzy rules for autonomous navigation. The fuzzy rules evaluate the robot’s goal and the sensory information to generate appropriate behaviours and fulfil the predefined task. Similarly, El-Nasr and Skubic [38] proposed a DMS based on fuzzy control and emotion for autonomous mobile robots. The system evaluates the robot’s internal and external state, allowing it to react to unexpected situations. The model explored the significant role of emotion in decision-making, paying particular attention to negative emotions such as fear, pain, or anger. Continuing in this line, Arsene and Zalzala [39] designed in 1999 a fuzzy controller for autonomous navigation in complex environments. The robot’s decision-making combines a deliberative task planner based on fuzzy rules with a reactive layer for collision avoidance.
As did the previous authors, Matarić [40] explored in 1998 how mobile robots should produce autonomous behaviour based on biologically inspired concepts such as learning and adaptation. The paper identifies coordinating multiple behaviours and working in multi-agent environments as significant challenges of mobile robots. In this line, Reif and Wang [41] presented one year later a DMS for the distributed motion control of mobile robots working in groups. The decisions of each robot are based on potential fields, so the action taken by a robot depends on the position and potential field of the other robots. The field could be attractive or repulsive, depending on the social relations between the robots. In addition to the previous literature, Ishiguro et al. [42] also presented in 1999 a robot controller for autonomous mobile robots. In this case, the action selection depended on a planner situated module that generated appropriate paths for navigating safely using the information provided by the sensors.
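The idea of attractive and repulsive fields between robots, as described for Reif and Wang [41], can be sketched as follows. This is an illustrative assumption rather than their formulation: the inverse-square force law, the encoding of the social relation as +1 (attractive) or -1 (repulsive), and the function name `social_force` are all hypothetical.

```python
import math

def social_force(position, others):
    """Sum the forces exerted on one robot by the other robots in the group.

    position: (x, y) of the robot being controlled.
    others: list of ((x, y), relation) pairs, where relation is +1 for an
            attractive field and -1 for a repulsive one (assumed encoding).
    Returns the resulting (fx, fy) vector.
    """
    fx = fy = 0.0
    for (ox, oy), relation in others:
        dx, dy = ox - position[0], oy - position[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # coincident robots exert no well-defined force
        # Assumed inverse-square magnitude, signed by the social relation.
        magnitude = relation / dist ** 2
        # Project the magnitude onto the unit vector pointing at the other robot.
        fx += magnitude * dx / dist
        fy += magnitude * dy / dist
    return fx, fy

# Hypothetical usage: one friendly robot to the right, one unfriendly robot above.
attract = social_force((0.0, 0.0), [((2.0, 0.0), +1)])
repel = social_force((0.0, 0.0), [((0.0, 2.0), -1)])
```

A robot following the resulting vector moves towards robots with which it has an attractive relation and away from those with a repulsive one.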
In 1999, the term autonomous social robot started to appear in the literature frequently. Billard and Dautenhahn [43] designed a DMS for robots to imitate the behaviour of other robots in social tasks. Similar to and contemporary with the previous two publications, Rooney et al. [44] developed a decision-maker for social robots working in HRI. The architecture contains a deliberative layer and a reactive layer. While the deliberative layer makes long-term plans, the reactive layer produces fast reflex behaviour reacting to stimuli. Like the previous publication, the architecture proposed by Gadanho in her PhD thesis [45] also combined deliberative and reactive processes supported by learning, adaptation, and emotion. All processes were based on biological animal functions. The system was tested in simulation in long-term trials and is considered a noteworthy advance in including RL with autonomous decision-making. Some years later, in 2003, Gadanho [46] updated her architecture, including perceptions, needs, emotions, and RL, to select the most appropriate behaviour to maintain optimal well-being. The emotion system modulates action selection and learning, including a cognitive system that depends on the robot’s goals and internal state.
3.1.2 The 2000s: The Rise of Social Robots
In the 2000s, Webb [47] proposed a theoretical overview of how robots with biologically inspired behaviour can improve our understanding of animal behaviour. In line with Webb, Bryson [48] also reviewed bioinspired theories for endowing artificial agents with autonomous and intelligent behaviour. Their survey describes decision-making architectures based on modular systems, deliberation and reactiveness, and evolutionary theories. Both addressed decision-making from a conceptual point of view but outlined biologically inspired action selection methods for artificial systems.
Moving back to architectures applied in real agents, Estlin et al. [49] presented in 2001 a novel two-layered DMS for controlling robots. The top level generates plans, and the low level works as an interface to command the robot’s actuators. Decisions are based on a set of rules that evaluate the robot’s state and goals. The model allowed action blending and continuous operation in lengthy tasks. Scheutz [50] designed in 2002 an action selection architecture for autonomous robots. The architecture was developed so as to be integrated into different robots. The decision-making process considers the robot’s emotional state and an arbitration method to select the most appropriate behaviour. Also, in real applications, Nakauchi and Simmons [51] presented in 2002 a system for social robots acting in crowded scenarios. The system allows a robot to successfully obtain resources by recognising people’s social behaviour and navigation. Then, it generates appropriate behaviours using visual information and probabilistic estimations.
The work of Cañamero over the last decades is ample and provides an accurate representation of action selection methods emulating human biological functions. In 2003, she [52] explored how to simulate emotions in artificial agents for action selection, providing notions about the essentials of modelling emotion and how they affect decisions. One year later, Cañamero worked with Ávila-García [53] on how to modulate action selection using hormones. Their model contemplated essential aspects of human behaviour to endow robots with autonomy, emulating key processes such as homeostasis (autonomous control of internal body functions) and motivation [54]. Their Action Selection Architecture (ASA) computes the robot’s motivational states and, using a winner-take-all approach [55], selects the behaviour linked to the motivation with the highest intensity. The behaviour selection aims to maintain the robot’s internal milieu in the best possible condition. In 2005, they continued [56] their previous research in adaptive systems by studying the role of artificial hormones in motivated behaviour, investigating how autonomous artificial agents’ social behaviour varies by modulating two hormones, influencing how we perceive necessary resources (stimuli) for surviving.
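The winner-take-all selection described for the ASA can be illustrated with a short sketch. The motivation names, intensity values, and behaviour mapping below are hypothetical, and the way intensities are computed in [53] (for instance, from deficits and hormone levels) is not modelled here.

```python
def select_behaviour(motivations, behaviour_for):
    """Return the behaviour linked to the motivation with the highest intensity.

    motivations: dict mapping motivation name to its current intensity.
    behaviour_for: dict mapping motivation name to the behaviour that satisfies it.
    """
    # Winner-take-all: only the single most intense motivation drives behaviour.
    winner = max(motivations, key=motivations.get)
    return behaviour_for[winner]

# Hypothetical motivational state of a robot at one time step.
motivations = {"hunger": 0.7, "fatigue": 0.4, "cold": 0.1}
behaviour_for = {"hunger": "feed", "fatigue": "rest", "cold": "warm_up"}
chosen = select_behaviour(motivations, behaviour_for)
```

Because only the winning motivation is served, the remaining deficits keep growing until one of them becomes the most intense and takes over.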
Unlike the previous literature, the framework of Duffy et al. [57] in 2005 was intended both for Human–Robot and Robot–Robot interaction in humanoid and navigation domains. The DMS contains deliberative, reactive, and social components to produce the most appropriate decision based on the robot’s goal, the information gathered from the environment, and structured rules. Konidaris and Barto [58] designed in 2006 an action selection method based on RL. The model emulates physiological functions in the robot and its deficits (drives). Then, a priority system determines the most urgent drive to define motivated behaviour. Consequently, the robot aims to learn a behaviour policy to maintain an optimal internal state.
Continuing with the use of RL in decision-making, Malfaz and Salichs [59] proposed in 2006 a system for autonomous social robots. The model simulates physiological functions like thirst or hunger, and deficits arise in these variables over time. The robot’s goal is to maximize its well-being by learning which behaviour to execute depending on its internal and external situation. Moreover, the model incorporates the emotions of happiness, fear, and sadness to represent the well-being state of the robot and reinforce the learning. In 2010, they [60] extended their previous work by designing a DMS for autonomous social agents. The model grounds biological functions such as drives, motivation, and learning (RL) to allow agents to survive in a virtual world. Finally, they moved in [61] to a system more focused on the emotional component of decision-making and expressiveness.
Since 2005, several architectures have been developed for HRI. In this line, Michalowski et al. [62] introduced in 2006 a model for representing the engagement of people interacting with a social robot. Depending on the user’s spatial position and head pose, the robot generates a profile that determines their level of engagement. Then, the robot chooses its subsequent behaviour based on that level to continue engaging the user in the interaction. In 2008, Walters [63] presented his PhD thesis about generating behaviour in non-verbal human–robot communicative scenarios. The study contains a large set of HRI experiments where the robot chooses its actions based on the non-verbal information provided by the user during the interaction, so as to attain a well-defined social behaviour. Mohammad and Nishida [64] designed in 2009 a robotic architecture for social robots working in HRI. The system draws on neuropsychology to create complex action selection mechanisms that provide autonomous behaviour, selecting the optimal action considering sensory information and specifically selected plans. Balkenius et al. [65] studied in 2009 the interaction between motivation, emotion, and attention in social robots. They designed a control model to learn how to behave autonomously using the influence of stimuli such as objects of attention, emotion, and motivated behaviour. The model emulates cortical brain functions to represent essential aspects of human decision-making in a robotic head to learn how to map specific situations to actions.
3.1.3 The 2010s and Present: Cognitive Models for HRI
By the beginning of the 2010s, the main goal of these architectures continued to be improving HRI. Scheutz and Schermerhorn [66] developed an emotional architecture for the autonomous control of social robots. The selection of an action is grounded on each action’s utility value defined from the evaluation of environmental cues. Thus, the goal and action selection of the robot depend on evaluating the benefits/drawbacks of executing a specific action in each situation using RL. Those authors [67] presented in 2015 a new decision-maker that acknowledges the violation of social norms by including predefined rules about how social robots should behave.
Like these two publications, the architecture developed by Shi et al. [68] proposes a method for creating dialogues that allow a social robot to communicate verbally during HRI. The system uses a tag-based method to generate appropriate sentences and coherent dialogue. Castro et al. [69] also designed in 2010 a DMS for the social robot Maggie [70]. The system uses biologically inspired functions to represent the robot’s internal deficits and external state by perceiving the environment. Then, the robot’s motivational states grow to urge behaviour using Boltzmann’s equation and RL.
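The text does not detail how Boltzmann’s equation is used in [69]. A common reading in RL, assumed here, is Boltzmann (softmax) action selection, in which actions with higher estimated value are chosen with higher probability and a temperature parameter controls the degree of exploration. The action names and values below are hypothetical.

```python
import math
import random

def boltzmann_probabilities(values, temperature=1.0):
    """Map action values to selection probabilities with a Boltzmann distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum value before exponentiating for numerical stability.
    highest = max(values.values())
    weights = {a: math.exp((v - highest) / temperature) for a, v in values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

def sample_action(values, temperature=1.0, rng=random):
    """Draw one action according to its Boltzmann probability."""
    probs = boltzmann_probabilities(values, temperature)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]

# Hypothetical action values learned by a social robot.
action_values = {"play": 2.0, "talk": 1.0, "rest": 0.0}
probabilities = boltzmann_probabilities(action_values, temperature=1.0)
action = sample_action(action_values, temperature=1.0)
```

A high temperature makes the choice nearly uniform, whereas a temperature close to zero approaches greedy selection of the highest-valued action.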
Floreano and Keller [71] researched Darwinian Evolutionary Methods to endow robots with autonomous behaviour. Their idea was to build more and more capable robots by prioritizing the information of those agents that perform better. This proposal is applied to collision avoidance in navigation tasks. Action selection occurs with random mutation based on neural networks. Also, for mobile robots, Buendía et al. [72] presented in 2012 a controller for the task of following a person. The engine combines selecting a strategy for the navigation and a perception system that uses objects to generate the strategy.
Arkin et al. [73] developed in 2011 a DMS based on ethical and moral judgments for social robots. The behaviour selection module evaluates at every moment the agent’s perceptions and an interface that stores responsibilities and constraints to avoid unethical behaviour. Leite [74] addressed in 2015 how to maintain positive feelings in users during HRI. The robot’s decision-making is based on the inferred emotion of the user and on adaptive mechanisms to promote positive social behaviours in lengthy interactions. The decision-maker proposed by Scheidler et al. [75] in 2015 was intended to allow swarms of robots to operate in navigation tasks successfully. The model uses Monte Carlo RL methods [76] to produce the most accurate performance and fast execution time and feedback. The model presented by Qureshi et al. [77] in 2016 allowed a social robot to exhibit autonomous behaviour by learning social skills during HRI. The method uses Deep RL to obtain feedback about the robot’s action and, using trial and error, learn the best combination of actions in each situation.
By the mid-2010s, there was a new tendency in the models: biologically inspired methods represent more complex cognitive functions than previously. The CAIO architecture developed by Adam et al. [78] is a clear example of this trend. This architecture consists of a deliberative loop that generates emotions and plans and a sensorimotor loop that evaluates external information and produces appropriate reactions. Cervantes et al. [79] proposed decision-making based on ethical behaviour. The selection of an action depends on the agent’s preferences, good and bad experiences, ethical rules, and current emotional state, drawing on studies in neuroscience and psychology. Vallverdú et al. [80] expanded the Lövheim model [81] to a more complex system in which emotional states influence the agent’s behaviour. The neurotransmitters dopamine, serotonin, and norepinephrine affect important brain regions involved in emotion and the selection of actions, varying the emotional behaviour of the agent [81].
Following the ideas previously presented by Cañamero, Cos et al. [82] designed in 2013 a homeostatic adaptive mechanism based on RL to modulate the internal deficits of a social agent. The model simulates physiological functions that evolve over time and adapt to the situation. The robot’s goal is to maintain its good physiological condition by learning which action produces the best result in each situation. Three years later, in 2016, Lewis and Cañamero [83] investigated the role of pleasure in decision-making. Using their previously presented architecture, they modelled a pleasure hormone that modulates the agent’s internal needs based on perceived stimuli. Then, action selection occurs using a winner-take-all approach [55]. Like the previous works, Lones et al. [84] studied in 2017 the role of epigenetic mechanisms in endowing an autonomous robot with adaptive behaviour. The model shapes the influence of different artificial physiological processes that control the energy or temperature of the robot. The errors of such variables translate into motivated behaviour.
Influenced by Cañamero’s ideas, Maroto-Gómez et al. [7] proposed in 2018 an RL model to allow autonomous social robots to learn how to behave in a dynamic environment. The robot had to learn how to maintain an optimal internal state by reducing its internal needs. The decision-making is grounded on the robot’s motivations, psychological states that represent the robot’s needs. Then, three years later, in [85], they updated their previous model with Dyna-Q+, an RL algorithm that allows autonomous agents to speed up the learning process by representing a model of the environment. In this case, the robot’s goal is to exhibit motivated behaviour while maintaining its internal deficits in good condition. The decision-making process uses autonomous action selection to reduce the most prominent deficit.
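To illustrate how Dyna-Q+ uses a model of the environment, the sketch below shows a single planning step in the standard textbook form of the algorithm, in which simulated rewards receive an exploration bonus that grows with the time since a transition was last tried. It is not the implementation of [85]; the deterministic model, the state and action names, and the parameter values are assumptions.

```python
import math
import random
from collections import defaultdict

def planning_step(q, model, last_tried, now, actions,
                  alpha=0.1, gamma=0.95, kappa=0.001, rng=random):
    """Perform one Dyna-Q+ planning update using a learned deterministic model.

    model: dict mapping (state, action) to the remembered (reward, next_state).
    last_tried: dict mapping (state, action) to the time step it was last executed.
    now: current time step, used to compute the exploration bonus.
    """
    # Replay a previously experienced transition chosen at random.
    (state, action), (reward, next_state) = rng.choice(list(model.items()))
    # Exploration bonus: transitions untried for longer look more rewarding.
    elapsed = now - last_tried[(state, action)]
    boosted_reward = reward + kappa * math.sqrt(elapsed)
    # Standard Q-learning update applied to the simulated experience.
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (boosted_reward + gamma * best_next - q[(state, action)])
    return state, action

# Hypothetical usage: one remembered transition, last executed 100 steps ago.
q_values = defaultdict(float)
model = {("hungry", "feed"): (0.0, "sated")}
last_tried = {("hungry", "feed"): 0}
planning_step(q_values, model, last_tried, now=100, actions=["feed", "rest"],
              alpha=0.5, kappa=0.1)
```

Running several such planning steps between real interactions is what lets the agent learn from fewer actual experiences.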
Kowalczuk and Czubenko have presented several decision-making architectures for general-purpose social robots. In an initial contribution, they [86] designed in 2011 a robot controller modelling biological functions such as emotion, personality, needs, and motivation. The decisions are made using fuzzy rules that also reflect the agent’s emotional state, considering the effect of external stimuli. Then, in 2018, they [87] presented ISD (Intelligent System for Decision-making), a cognitive architecture for autonomous robots. Decisions depend on the perception of stimuli, past experiences stored in long-term memory, and the robot’s artificial needs. In addition, the model includes emotional factors, such as emotion and mood, influencing how the robot perceives objects. Finally, Kowalczuk et al. [88] developed in 2020 a fuzzy control system for autonomous, emotional social robots. Emotions and mood arise from the stimuli the robot perceives, defining its emotional state. Then, the emotional state modulates the selection of actions using the ISD cognitive architecture applied to driving scenarios.
Moving back to specific HRI domains, Romero et al. [89] used utility functions based on probabilistic rules to generate the appropriate plans of a social robot. The utility model builds upon a motivational model that represents the cognitive functions of the robot. During HRI, robots should explain their actions proactively. Stange et al. [90] addressed this issue in 2019. They presented an architecture that allows social robots to explain themselves during HRI scenarios. The robot uses verbal communication to proactively let the user know its needs and intentions. The robot’s explanations arise by considering the user’s behaviour and the robot’s needs derived from motivational processes.
Esteban and Insua [91] presented a decision-maker for social robots based on emotion generation. The robot’s emotional state depends on the interaction with people, modulating at the same time the scores associated with a set of actions. In the final step, the action with the highest score is selected to improve the robot’s performance in HRI scenarios. Cunningham et al. [92] presented in 2019 a multi-policy decision-making architecture for allowing a social robot to navigate autonomously in dynamic, multi-agent environments. The novelty of the work lies in planning the trajectory by selecting from a predefined set of closed-loop behaviours whose utility is calculated beforehand using a simulation process that considers complex interactions among the possible actions of the robot. Martins et al. describe in [93] a DMS based on partially observable Markov decision processes (POMDP), reward shaping, and RL. The POMDP deals with the fact that some information is not available when making decisions, using transition probabilities to select the best alternative. The reward function considers the impact of the robot’s action on the user on the fly, a technique that had not been used previously. Lastly, the RL system lets the robot know the best action to execute considering its state. Decisions are planned considering a user model and a context model that situates the robot in the environment. Compared to many other algorithms, the system provides good results in HRI tasks with different levels of complexity.
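As a hedged illustration of how a POMDP copes with missing information, the sketch below shows the standard Bayesian belief update: the robot keeps a probability distribution over hidden states and revises it after each action and observation. It is not the system of [93]; the two user states, the action, the observation, and every probability are hypothetical values chosen for the example.

```python
def belief_update(belief, transition, observation_prob, action, observation):
    """Return the new belief b'(s') proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    states = list(belief)
    unnormalised = {}
    for s_next in states:
        # Prediction step: where could the hidden state have moved to?
        predicted = sum(transition[action][s][s_next] * belief[s] for s in states)
        # Correction step: weight by how well s_next explains the observation.
        unnormalised[s_next] = observation_prob[action][s_next][observation] * predicted
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalised.items()}


# Hypothetical user model: the user is either engaged or bored (not directly observable).
transition = {
    "greet": {
        "engaged": {"engaged": 0.9, "bored": 0.1},
        "bored": {"engaged": 0.4, "bored": 0.6},
    }
}
observation_prob = {
    "greet": {
        "engaged": {"smile": 0.8, "silence": 0.2},
        "bored": {"smile": 0.2, "silence": 0.8},
    }
}
prior = {"engaged": 0.5, "bored": 0.5}
posterior = belief_update(prior, transition, observation_prob, "greet", "smile")
```

A planner or an RL policy can then act on the belief instead of the unknown true state, which is the sense in which the POMDP handles unavailable information.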
Various contributions have been presented in HRI and in modelling cognitive development in the last three years. In this line, Man and Damasio [94] studied in 2019 the role of homeostasis [54] in the self-regulation of artificial functions in robots. Their study proposes a biological model where the robot is built using soft materials, and the way it selects its actions is oriented towards self-regulating its internal body and consciously feeling the consequences of these actions. Consequently, the selection of an action incorporates biological mechanisms based on the model of its mind to produce natural behaviour.
Then, Augello et al. [95] worked in 2020 on modelling a somatosensory system for cognitive robots, emulating how humans perceive stimuli and how these stimuli affect our selection of an action. The model uses an RL algorithm to learn the optimal behaviour to maintain the best possible internal state during HRI. The LIDA architecture developed by McCall et al. [96] in 2020 is based on motivated behaviour for the control of autonomous robots, proposing a well-defined internal system that allows the robot to behave emotionally by combining planned activities with reactive behaviours. The model uses Machine Learning to map the robot’s state to specific actions to maintain an optimal internal state without forgetting the robot’s goals.
In 2020, Hong et al. [97] treated the problem of engaging people in human–robot scenarios, introducing a model that estimates the user’s emotional state and uses visual and auditory cues to create the robot’s emotional state. Then, using a predefined set of rules and learning based on Bayesian computing, the robot decides how to sustain engagement in bidirectional conversations. The cognitive architecture developed in 2020 by Martín-Rico et al. [98] promotes the learning of a person’s face during HRI. The action selection evaluates the situation and matches it with knowledge stored in the robot’s memory that defines its behaviour. Finally, Kim and Bodunkov [99] designed a robot architecture that makes autonomous decisions in situations where the information is not sufficient: to overcome the lack of information, the robot’s decisions are based on estimations from the robot’s situation. These estimations consider the probabilities of executing specific actions for attaining the goal during navigation tasks, using entropy as the selection criterion.
3.2 Manufacturing
In the manufacturing sector, autonomous systems are essential in several tasks, such as logistics or production lines. However, when talking about autonomous and social robots, the literature is not as extensive as for other areas, as Fig. 3 shows. A comparative analysis of the works described in this section is in “Appendix B”.
In these scenarios, the first publication we found was by Agrawal et al. [100], who in 1991 presented a decision-making architecture for robots working in factories. The architecture addressed the problem of making decisions using a finite set of alternatives and different configuration attributes that affect how the task is performed. The system was implemented in a real application to make industrial robots work autonomously. In a similar application, Wang et al. [101] proposed in 1996 a behaviour-based model for controlling robots in factories. Its novelty resided in a DMS that considers the actions of other robots to execute a predefined task cooperatively.
From the end of the 1990s, it is possible to find many models for the autonomous control of robots that act jointly with humans. Kalenka and Jennings [102] presented in 1999 a mathematical model for the autonomous social decision-making of robots working in a warehouse, including social norms and attributes in multi-agent domains, unlike previous work. Shah et al. [103] proposed in 2002 a task controller for the autonomous and intelligent movement of vehicles and robots. The DMS combines planned actions based on a heuristic search and a database representing the world’s dynamics and reactive responses generated from the perception system.
Clodic et al. [104] presented in 2007 a DMS for human–robot collaborative scenarios. The framework is used to synchronise the communication between a social robot and a human worker during a fetch-and-carry task. The DMS uses predefined rules that evaluate the robot’s situation and the human’s speech. Czubenko et al. [105] applied the ISD architecture mentioned in the previous section to autonomous driving scenarios. The architecture emulates essential aspects of the road by replicating human drivers’ needs and motivations. To conclude with the manufacturing sector, in the context of autonomous robots and lengthy interactions, O’Brien and Arkin [106] developed a circadian system to work in agricultural tasks. The circadian functions evolve as timers to represent the system’s daily needs. Then, the action selection method uses such fluctuations in the circadian needs to execute actions using a kind of winner-take-all approach [55].
3.3 Healthcare
Among the many areas where autonomous and social robots have been applied, the healthcare sector contains decision-making architectures in real scenarios, as depicted in Fig. 4. A comparative analysis of the works described in this section is in “Appendix C”.
In healthcare, most of the work has been concentrated over the last twenty years, mainly applied to children, older adults, and assisting caregivers during therapies.
Working with children, Dautenhahn and Billard [107] studied the effect of an autonomous social robot in healthcare applications. The robot works with children with autism in gaming and educational sessions. In related work, Feil-Seifer and Matarić [108] presented in 2008 a robot architecture for engaging children with autism disorder. Behaviour selection combines predefined behaviours with perceptions of the child’s behaviour. Also, in robot–children interactions, Senft et al. [109] introduced in 2015 a new model for social robots working in child therapy. The robot uses a set of rules and a homeostatic signal [54] representing the children’s engagement and previous interactions to select actions that serve the therapist during exercises. The same authors [110] later updated their previous work with a DMS to assist therapists during sessions with autistic children. The method of selecting an action evaluates external stimuli and the context of the interaction to produce autonomous behaviour under the therapist’s supervision.
There has been much work on assisting caregivers to conduct therapy. For example, Hiolle et al. [111] in 2014 presented a ‘baby’ robot that adapts its emotional behaviour depending on its needs. The aim of the study was to investigate the responsiveness of a caregiver to these needs. The robot explored and learned from the environment during its life using neural networks. The robot’s selection of an action uses the perceptions and needs to define the arousal/comfort system that determines which action to take to maintain its comfort. Another example is Lones et al. [112], who presented in 2014 a hormonal system for the adaptive behaviour of social robots in HRI with a caregiver, proposing an adaptive mechanism to modulate the robot’s selection of an action depending on the stimuli perceived, the valence value defined by the impact and type of stimuli, and biological functions. The model accurately represents essential biological functions behind the behaviour, providing a robust biological basis for autonomous behaviour.
Following this line of research, Cañamero and Lewis [113] designed an adaptive framework for social robots assisting in healthcare. The robot Robin (NAO) can teach children to manage their diabetes using different activities while presenting their internal needs. The selection of an action is based on a winner-take-all approach [55] where the robot’s motivations compete to urge specific behaviours. In 2019, Lewis and Cañamero [114] presented a research model for how stress leads to compulsive behaviour. The model emulates physiological functions that are modulated by an artificial stress hormone. The deficits of these functions and the perception of resources define the robot’s motivation. Finally, these motivations urge the selection of behaviour. The robot’s stress is a function of the other hormones, which evolve depending on the robot’s deficits. Consequently, the study explores behavioural changes depending on the robot’s stress levels. The architecture designed by González et al. [115] in 2017 used a three-level hierarchical decision to build personalised therapies in rehabilitation scenarios. In the first place, the robot generates a personalised therapy. Then, it modulates the activity using online perceptions. Finally, it translates abstract actions into specific motor commands.
Cao et al. [116] introduced in 2017 a collaborative architecture to support children and caregivers during therapy. The behaviour selection combines a hierarchical approach with parallel execution. The model generates its emotional state based on its internal needs and stimuli using a valence-arousal space. Then, each emotion is tied to a specific behaviour triggered when the corresponding emotion has the highest intensity. The architecture designed by Lazzeri et al. [117] in 2018 attempted to replicate human minds in social robots. The concept of decision-making consisted of perceiving the environment, evaluating the situation, and deciding on the most suitable action. The model was tested with children with autism disorders, conducting sessions oriented to provide entertainment and companionship. Park et al. [118] presented in 2019 a model-free emotional architecture for social robots working in education. The system uses verbal and non-verbal cues to learn engagement, promoting lengthy interactions. Using its learning capabilities, the robot selects the most relevant stories for each child, personalising the interaction.
Many social robots with decision-making capabilities have been designed in the last few years in healthcare applications. The social robot Pepper has been used in healthcare to autonomously retrieve information from patients in a hospital [119]. During this task, the robot was guided by nurses to improve the questions that the robot asked the patients. The dialogue with the patient included questions about the patients’ home situation, general health, use of medicines, smoking, alcohol use, dental issues, weight, defecation, activities for daily living, sleep, cognition, possible stress due to recently experienced severe life events, potential problems at home or work due to their admission, and religion or belief. The idea of deploying the robot in this scenario was to make the process more interactive and easier to follow than questionnaires, and less demanding of nurses’ time. The results showed favourable acceptance rates of the robot by both men and women (the study did not yield significant statistical differences between genders).
The social robot Mario [120] was created in 2020 to work in residential care, assisting elderly people with dementia. The robot includes a software architecture that allows it to perform autonomous social behaviour while engaging such adults with different activities. As in previous work, it combines deliberative and reactive layers to develop plans and reactions to unexpected situations. The social robot Mini [121] was created to assist caregivers during cognitive stimulation therapies. This robot has a fully autonomous DMS for generating personalised therapies for each user. The action selection combines RL with predefined rules that assign priorities to different possibilities, such as executing planned events or reacting to stimuli.
For the iCub robot, Tanevska et al. [122] designed in 2020 a framework to maximise the pleasantness during HRI. The robot can personalise its behaviour while assisting caregivers by learning the effects of its actions on users using their social signals and its internal needs defined as motivational urges. In a recent study, Foster et al. [123] developed, in 2020, a social robot designed to alleviate children’s pain during medical assistance. The system’s goal was to decide on the appropriate behaviour with which to distract the child from the intervention and avoid painful and panic-inducing situations. The action selection method employs the user’s state and action to decide on the best action.
To conclude our review of the healthcare sector, we present two up-and-coming applications to more specialised scenarios. Robinson et al. [124] used in 2020 a social robot to reduce the caloric intake of people and promote a healthy diet. The robot could perceive the consumption of snacks and analyse if it was desirable to avoid binge eating. Asprino et al. [125] designed in 2022 a software architecture for the autonomous control of the social robot Mario. This robot works in healthcare applications with people with dementia. The behaviour selection evaluates the perception of the robot and a knowledge database containing information about objects and their influence, modulating behaviour execution and learning to personalise HRI.
3.4 Education
As Fig. 5 shows, many DMSs have been realised in the educational environment in the last decades. A comparative analysis of the works described in this section is in “Appendix D”.
We begin our review with Dautenhahn [126], who studied in 1999 the influence of a social robot on autistic children. The robot teaches the children to perform specific activities, supervising them so that each task is fulfilled.
Breazeal [127, 128] in 2003 presented the social robot Kismet, an expressive anthropomorphic robot head intended for HRI. The robot includes mechanisms to improve social abilities and cope with complex social environments. Its decision-making involves evaluating its goals and the people’s speech to build a coherent dialogue based on predefined rules that favour learning in educational contexts. Kismet can express emotional cues.
Another platform applied to education is the iCat robot [129]. It was designed in 2005 as an autonomous robot that works in education and HRI. It incorporates a DMS that merges the information generated by an animation engine with a series of predefined scripts that contain gestures and activities that the robot executes. Similarly, the software of the social robot PaPeRo [130], presented in 2006, contains a DMS that allows it to execute autonomous behaviour in educational scenarios with children. The selection of actions merges planned activities personalised to the audience using the robot with reactive behaviours elicited from the perception of stimuli.
In 2008, Mitnik [131] presented a line of research for deploying autonomous social robots in educational sessions. Unlike the previous literature, the robot can teach students different subjects, such as maths or geography, by performing a set of activities together. The sessions are predefined and involve the children by promoting their participation. Ushida [132] introduced in 2010 a mind model based on emotional responses for the autonomous control of social robots. The model was intended for HRI in educational environments, containing deliberative and reactive actions to build a natural behaviour using fuzzy logic. Also using a mental model, Strohkorb and Scassellati [133] developed in 2016 a reasoning architecture for human–robot collaboration in educational settings. The model focuses on maintaining a collaborative strategy while updating and optimising it by gathering information from the environment. In addition, the action selection alternates the exploitation of the best alternative with exploring new strategies.
The following publications are examples of the impressive effects of using social robots in educational scenarios. Coninx et al. [134] presented an adaptive model for engaging children during educational sessions. Behaviour selection consists of adapting the behaviour by creating a specific profile for each child. This profile is built from feedback obtained during the execution of the exercises. Egido-García et al. [135] presented in 2020 the use of NAO robots in educational sessions with children. The model fuses the needs of the children, the caregiver, and the robot itself, to produce autonomous and personalised activities to improve the children’s logopedic skills. Mascarenhas et al. [136] designed in 2021 a new function for the FAtiMA toolkit, a model for the autonomous behaviour of socio-emotional robots in educational settings about bullying. The model makes decisions based on the exercise to be executed, the child’s emotional state, and a knowledge-based memory that stores rules linking situations to actions.
Ahmad et al. [137] introduced in 2021 an RL model for improving the engagement and vocabulary learning of children. The decision-making uses social signals and memory-based knowledge to determine the best action to execute during the session. Kaptein et al. [138] addressed in 2021 the design of a DMS for lengthy interactions for educating children about healthy lifestyles using games. The action selection occurs in two stages, using an ontology-based system and evaluating the best action according to the robot’s current situation. In addition, the system includes learning methods to personalise each child’s activities to improve performance.
3.5 Entertainment
Some work has used social robots with autonomous decision-making in entertainment, as shown in Fig. 6. A comparative analysis of the works described in this section is in “Appendix E”.
Gu et al. [139] proposed in 2003 a DMS based on fuzzy logic for humanoid mobile robots in entertainment scenarios like the RoboCup. Kok et al. [140] presented in 2003 a DMS based on coordination graphs for robots working in multi-agent entertainment domains. The decision of each robot depends on the decisions and actions of the others, producing a coordinated sequence of behaviours. Using a biologically inspired model, Manzotti and Tagliasco [141] developed in 2005 a decision-maker based on motivations for robots. Unlike the previous literature, the motivations do not emerge from purely biological functions, but from the robot’s goals. The robot, intended for entertainment, generates motivated behaviour from the stimuli perceived from the environment and rules stored in a memory.
Four engaging autonomous platforms were designed for children’s entertainment. Kozima et al. [142] presented Keepon in 2009 as a social robot for research, entertainment, and therapy. The robot includes a decision-making module that evaluates its situation and the actions of people to produce autonomous decisions adapted to the interaction procedure. The social robot Pleo [143] was conceived in 2010 by Fernaeus et al. as a toy robot for children’s entertainment. It presents an autonomous action selection mechanism based on predefined rules adapted to external stimuli. In a similar scenario, the social robot Maggie [144] was also used as a gaming platform with children. The social robot MiRo [145] appeared in 2015 as a research platform for entertainment. The robot incorporates multiple sensors to navigate the environment, executing various expressive behaviours autonomously. Its action selection consists of predefined rules that map external stimuli to specific behaviours.
Kaupp et al. [146] introduced in 2010 a decision-making framework for human–robot interactive collaboration. The goal of the model is to appropriately decide what and when to communicate with the human operator to complete semi-teleoperated navigation tasks successfully. After assessing environmental stimuli and the operator’s commands, decisions are made using probabilistic formulae. Bicho et al. [147] presented in 2011 a DMS for HRI based on neural networks. The system is intended for entertainment activities like building a toy in a cooperative task. The action selection uses the robot’s perceptions and goals to decide on the HRI flow. Schneider et al. [148] proposed in 2017 a controller for a social robot working in HRI. The system encourages people to exercise for more extended periods by promoting motivational behaviour. The action selection depends on a set of rules that combine the features of the people and the inputs of the perception system.
Bagheri et al. [149] presented in 2021 a framework based on RL to motivate users during human–robot entertainment activities. The robot’s action selection depended on the emotional state inferred from the participant, so that the robot could be empathic and improve the users’ confidence and satisfaction. Saunderson and Nejat [150] presented in 2022 a hybrid hierarchical decision-maker to persuade people to do their daily exercises. The robot uses different RL algorithms and user identification to personalise the exercises of each user during lengthy HRIs depending on their emotional state. Maroto-Gómez et al. [151] presented in 2022 a DMS for the social robot Mini working in entertainment. The model uses estimations based on Preference Learning so that the robot proposes the user’s favourite activities. Action selection uses the Boltzmann equation, which balances selecting the user’s favourite activities with exploring new alternatives.
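The balance between favourite activities and exploration obtained with Boltzmann selection can be illustrated with a short sketch. It is a generic softmax over preference estimates, not the implementation of [151]; the activity names, the preference values, and the temperature parameter are hypothetical.

```python
import math
import random


def boltzmann_probabilities(preferences, temperature=1.0):
    """Map preference estimates to selection probabilities (softmax with temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(preferences.values())  # subtracting the maximum avoids overflow
    weights = {a: math.exp((v - top) / temperature) for a, v in preferences.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def select_activity(preferences, temperature=1.0, rng=random):
    # Sample one activity according to its Boltzmann probability.
    probs = boltzmann_probabilities(preferences, temperature)
    activities = list(probs)
    return rng.choices(activities, weights=[probs[a] for a in activities], k=1)[0]


# Hypothetical preference estimates for one user.
preferences = {"quiz": 2.0, "music": 1.0, "photos": 0.0}
probs = boltzmann_probabilities(preferences, temperature=1.0)
choice = select_activity(preferences, temperature=1.0, rng=random.Random(0))
```

A low temperature makes the robot almost always propose the top-rated activity, while a high temperature makes the choice nearly uniform, which favours exploring alternatives the user has not yet rated highly.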
3.6 Companionship
The use of autonomous social robots to provide older adults with companionship has been explored since the beginning of the century, as Fig. 7 shows. A comparative analysis of the works described in this section is in “Appendix F”.
Undoubtedly the most famous robot in this application is PaRo [152], a robot for physical and emotional interaction with people with mental impairments. The robot’s decision-making works as a reactive system that produces actions after evaluating the perceptions sensed from the environment. Similarly to PaRo, Arkin et al. [153] presented in 2003 a promising model for the intelligent decision-making of the social robot Aibo. The model emulates the physiological and emotional processes occurring in a dog, thus providing the robot with an intelligent and autonomous behaviour to provide companionship working as a cybernetic pet. The robot’s goal is to maintain homeostasis and regulate its internal deficits [54] to survive in a changing environment. In addition, the robot incorporates learning mechanisms to associate objects with certain biological variables (e.g. food with hunger) and identify people’s faces. Saldien presented in 2009 the social robot Probo [154], an autonomous agent for entertaining people in hospitals and providing them companionship. Its behaviour can be manually tuned using a friendly interface with spontaneous reactions.
Turning now to decision-making architectures developed for companion robots, Samani and Saadatian [155] developed an action selection architecture for social robots based on the Probabilistic Love Assembly (PLA) emotional model. The selection of an action is based on the evolution of artificial hormones, yielding different emotional states. These hormones evolve depending on social interaction with the user, making the robot adapt its emotion and establish a social relationship with the user based on love. Grigore et al. [156] designed in 2015 a motivational model for the adaptive autonomous behaviour of social robots working as companions. The action selection mechanism is based on RL and chooses appropriate actions depending on a user model representing daily goals.
3.7 Assistance and Service
The literature review presented in this manuscript has shown that there is a wide range of applications where autonomous social robots assist humans in different tasks. A comparative analysis of the works described in this section is in “Appendix G”.
Most of the previous work describes DMSs that facilitate humans’ execution of different tasks to a lesser or greater extent. Now we will focus on systems providing a purely assistive behaviour working as tour guides, bartenders, or office assistants. Figure 8 shows the evolution of the most important work in assistance and service in the last three decades.
The social robot Minerva [157] was created in 1999 by Thrun et al. as a robot tour guide. It exhibited autonomous social behaviour combined with a user interface where visitors could indicate to the robot what to do (e.g. visiting a specific location). Jung and Zelinsky [158] proposed in 1999 an action selection method for two cooperative robots executing cleaning tasks. The action selection uses dynamic generation of paths depending on the previous action and a set of rules that actively inhibit the robot’s possible alternatives. That same year, Van der Loos et al. [159] developed a controller for a manipulator assisting people with a physical disability. Using probabilistic rules, the controller enables a robotic arm to help the user autonomously.
Lisetti et al. [160] designed in 2004 a decision-making architecture for HRI. The system was integrated in the service robot Cherry, which could express different emotions to improve its social abilities with people while assisting them by working as an office assistant. The decision-making evaluates the robot’s emotional state to make the most appropriate decision. Similarly, the social robot Maggie [70] was designed in 2006 to work in multiple domains, such as entertainment, assistance, and education. In addition, it served as a research platform to study HRI. Its DMS combines a deliberative layer that plans based on the robot’s goal and a reactive layer to respond to environmental stimuli. The robot also employs learning, adaptive, and emotional mechanisms to improve its performance and engage users.
Hollinger et al. [161] proposed in 2006 a decision-maker for mobile social robots based on emotion. The robot was designed to work in conference assistance using predefined functions that mapped stimuli to emotional actions. The goal of this system was to improve HRI by including reactive behaviour to engage users and improve people’s acceptance. Like Minerva [157], the robot Urbano [162] was born in 2008 to work as a museum tour guide. Its decision-making consists of three heuristic search algorithms combined with fuzzy rules to produce the best possible presentation to the audience. Shiomi et al. [163] designed in 2009 a DMS to control the action of a group of robots assisting users in a shopping mall. The system generates appropriate instructions based on predefined rules for each robot to provide information about routes and recommendations by estimating the users’ behaviours. Therefore, the decision-maker coordinates each robot’s HRI and navigation to approach different users.
Alili et al. [164] introduced in 2009 a decision planner for human–robot collaborative scenarios. The action selection is based on a probabilistic model that evaluates the robot’s goal and the perception system (including the human intention) to make appropriate decisions in different assistive tasks. In 2014, Foster et al. [165] showed how a bartender robot could autonomously work in complex scenarios with customers. This time the action selection consisted of learning an RL policy to meet the customer’s needs. Petrick and Foster [166] presented in 2016 their work about autonomous planned action selection in HRI. The robot acts as a bartender requesting the user’s petitions. Actions are selected based on perceiving such petitions and comparing them with a predefined set of rules that indicate how the robot behaves.
Liu et al. [167] showed in 2018 how a robot could learn proactive behaviour using neural networks and user feedback during HRI. The robot controller generates both motion and speech actions using associations created by the learning model. Similarly, Malviya et al. [168] developed in 2020 a navigation system for social robots based on a finite state social machine. The robot operates as a tour guide, using its embodied sensors to produce suitable state transitions to exhibit a fully autonomous behaviour. To conclude our review, in 2021, Hedblom et al. [169] presented an action selection method based on image schema. The system evaluates logical rules that allow the agent to decide how to behave to attain a specific goal. The architecture allows autonomous social robots to work in everyday housework activities.
4 Analysis
In this review of DMSs and control architectures for autonomous and social robots, we evaluated 148 (from 208 screened) publications derived from our study depicted in Fig. 1. We are aware that there is more literature in this area, but we believe that these publications accurately represent the evolution of these software architectures over the last thirty years. As Fig. 9 shows, the number of publications has increased over the years, constituting an unceasing growth of autonomous and social robots in our society.
In the upcoming sections, we investigate this evolution regarding:
- Area of application.
- Action selection method.
- Duration of the HRI experiments (if specified).
- Biologically inspired models included (if any).
- Learning method used to produce decisions (if included).
- Number of publications concerning real scenarios.
- Number of publications concerning a specific robotic platform.
In all of these analyses, we first study the global distribution of the items in each category over the last three decades and then deepen our analysis, providing a detailed vision decade by decade.
4.1 Area of Application
In this study, we assessed the area where each work was applied. We classified each publication into one of the seven categories presented in Sect. 3. Although some works could belong to several categories, we assigned each paper to its most relevant category to analyse whether the areas of application have varied over the last three decades.
Figure 10 shows that of the 148 publications studied, 77 concerned applications for purely research purposes (\(\sim 52\%\)), 19 healthcare (\(\sim 13\%\)), 14 each entertainment and assistance/services (\(\sim 9\%\) each), 12 education (\(\sim 8\%\)), 7 the manufacturing sector (\(\sim 5\%\)), and 5 robots that provide companionship (\(\sim 3\%\)).
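The shares quoted for Fig. 10 follow directly from the counts. The following is a minimal sketch of that arithmetic, assuming that the figure of 14 applies to entertainment and to assistance/services separately (an assumption that makes the seven categories add up to 148). The variable names are illustrative only.

```python
# Publication counts per application area, as quoted for Fig. 10.
# Assumption: entertainment and assistance/services have 14 publications each.
area_counts = {
    "research": 77,
    "healthcare": 19,
    "entertainment": 14,
    "assistance/services": 14,
    "education": 12,
    "manufacturing": 7,
    "companionship": 5,
}

total = sum(area_counts.values())

# Share of each area as a percentage of all reviewed publications.
area_shares = {area: 100 * n / total for area, n in area_counts.items()}

for area, share in area_shares.items():
    print(f"{area:20s} {area_counts[area]:3d}  {share:5.1f}%")
```

Under this assumption the counts total 148 and each rounded share matches the percentage given in the text.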
If we deepen our analysis and review the last three decades (the 1990s, 2000s, and 2010s to present) in detail, a clear shift emerges. As Fig. 11 shows, in the 1990s, most work was not applied to a particular area, although some of it was applied in manufacturing, production, or assistance. However, with the rise of social robots, work has taken a more specific turn, especially towards healthcare, entertainment, and education. This does not mean a lack of work in research, since many systems are still applied in this context.
4.2 Action Selection Method
It is important to emphasise that this review covers decision-making and control systems for autonomous and social robots over the last three decades. These systems are characterised by the action selection methods they use to produce autonomous decisions. In our study, we recognise four main types of action selection:
- Biologically inspired methods: We include in this category those methods that take inspiration from biology by using emotions, homeostasis, or motivation to influence action selection.
- Probabilistic and classical algorithms: We include in this category those methods that base their action selection on probabilistic algorithms and classical approaches that do not imply learning (e.g. heuristics, genetic algorithms, or support vector machines).
- Learning methods: Methods that involve some kind of learning, such as RL, Deep Learning, Neural Networks, or learning by demonstration/imitation, fall into this category.
- Fuzzy control and predefined rules: We classified in this category systems that use fuzzy logic and predefined rules (e.g. if–then rules) to make autonomous decisions to fulfil the system’s task.
Figure 12 shows the distribution of the publications included in our review by action selection method. It is noteworthy that every publication uses at least one of the action selection methods studied, and some incorporate more than one. The most used approach is biologically inspired models (70 of 148), followed by learning methods (63). Almost one-third of the publications (47) include probabilistic and classical approaches to decide which action the social robot should execute, and 36 base this decision on fuzzy control or predefined rules.
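Because a publication can combine several methods, the per-method counts do not have to add up to 148. The short sketch below, with illustrative variable names, computes each method's share and how many method assignments go beyond one per publication.

```python
TOTAL_PUBLICATIONS = 148

# Publications per action selection method (Fig. 12); a publication
# may appear under more than one method.
method_counts = {
    "biologically inspired": 70,
    "learning": 63,
    "probabilistic/classical": 47,
    "fuzzy/rules": 36,
}

# Share of the 148 reviewed publications that use each method.
method_shares = {m: 100 * n / TOTAL_PUBLICATIONS for m, n in method_counts.items()}

# Every publication has at least one method, so anything above 148
# assignments comes from publications that combine several methods.
label_total = sum(method_counts.values())
extra_labels = label_total - TOTAL_PUBLICATIONS

for method, share in method_shares.items():
    print(f"{method:25s} {share:5.1f}%")
print("assignments beyond one per publication:", extra_labels)
```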
The analysis by decade shown in Fig. 13 reveals some valuable dynamics. On the one hand, the number of publications basing action selection on learning methods has increased significantly in recent years, probably due to the expansion and development of Machine Learning and Artificial Intelligence. On the other hand, since the 1990s, biologically inspired methods have been a powerful source of inspiration for developers. Modelling animal (human) biological functions remains a widespread technique, especially when the robot works in HRI. Lastly, the figures for more classical approaches, such as probabilistic or fuzzy/rule-based control, barely vary over time, so these approaches have consistently remained a good alternative for developers.
4.3 HRI Experiment Duration
In addition to the previous analysis, we investigated the duration of the HRIs occurring in the experiments included in these publications. It is worth noting that some publications did not include experiments or specify their duration. In this regard, our analysis finds that 114 of 148 publications specified the duration of their HRIs, while 34 of 148 did not validate the model in HRI or did not indicate the experiment duration. Based on this, we divided them into three types:
- Short: Experiments with a single interaction which lasted less than an hour.
- Moderate: Experiments with interactions that lasted more than an hour but occurred on the same day.
- Long: Experiments that included HRIs on different days with an average duration greater than one hour.
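The three duration types can be read as a simple classification rule. The sketch below is one possible encoding, not the procedure used in the review. The function name and its inputs are hypothetical, and it assumes that "more than an hour" for the moderate type refers to the total interaction time on a single day. Cases that the three definitions do not cover return `None`.

```python
from enum import Enum
from typing import List, Optional


class Duration(Enum):
    SHORT = "short"
    MODERATE = "moderate"
    LONG = "long"


def classify_hri_duration(session_hours: List[float], distinct_days: int) -> Optional[Duration]:
    """Return the duration type of an experiment, or None if no type applies.

    session_hours: length in hours of each interaction (hypothetical input).
    distinct_days: number of different days on which interactions took place.
    """
    if not session_hours or distinct_days < 1:
        return None  # duration not reported
    mean_hours = sum(session_hours) / len(session_hours)
    if distinct_days > 1 and mean_hours > 1.0:
        return Duration.LONG
    # Assumption: "more than an hour" refers to the total time on that single day.
    if distinct_days == 1 and sum(session_hours) > 1.0:
        return Duration.MODERATE
    if distinct_days == 1 and len(session_hours) == 1 and session_hours[0] < 1.0:
        return Duration.SHORT
    return None  # e.g. several short sessions spread over different days
```

For instance, `classify_hri_duration([0.5], 1)` yields `Duration.SHORT`, whereas three sessions of 1.5, 2.0, and 1.0 hours on three different days yield `Duration.LONG`.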
In this assessment, represented in Fig. 14, we observed that most publications only reference short HRIs (81 of 119) and only some involve moderate scenarios (25). It stands out that there is a lack of systems (7) working in real scenarios where lengthy HRIs are required. We believe that if autonomous and social robots are to be deployed in real environments, exhibiting autonomous behaviour for long periods is essential, since otherwise the investment and development that these systems require are not worthwhile.
Analysing the previous results by decade (Fig. 15), short interactions clearly predominate. Although some publications are oriented to moderately long interactions and a few to lengthy interactions, no trend suggests that DMSs and control architectures are extending their usability in this regard.
4.4 Use of Biologically Inspired Models
In Sect. 4.2, we saw that many publications (a total of 70) employ biologically inspired models to shape decision-making. This section explores these biologically inspired models, further investigating how animals’ biological functions are emulated in autonomous and social robots. The literature review presented in Sect. 3 allows us to recognize four different kinds of biologically inspired models:
- Homeostatic model: the emulation of animal (human) biological functions to influence decision-making, such as neuroendocrine responses, homeostatic and allostatic control [54], or physiological variables (e.g. heart rate).
- Motivational model: the use of motivations as psychological states that impel the agent’s behaviour.
- Affective model: emotional models based on emotion, mood, and personality which influence how the autonomous robot makes decisions.
- Cognitive model: modelling cognitive and mental functions in robots.
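To make the homeostatic and motivational categories more concrete, the following minimal sketch combines them in a winner-take-all action selection. It is a generic illustration and not the method of any specific reviewed system. The class, the function, and the rule motivation = deficit × (1 + stimulus) are assumptions chosen for simplicity.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HomeostaticVariable:
    name: str
    value: float
    ideal: float

    @property
    def deficit(self) -> float:
        # Distance from the ideal value; zero when the variable is satisfied.
        return abs(self.ideal - self.value)


def select_behaviour(
    variables: List[HomeostaticVariable],
    stimuli: Dict[str, float],
    behaviours: Dict[str, str],
) -> Tuple[str, Dict[str, float]]:
    """Winner-take-all selection over motivations (assumed rule, see lead-in)."""
    # Each motivation grows with the internal deficit and is amplified
    # by an external stimulus related to the same variable, if any.
    motivations = {
        v.name: v.deficit * (1.0 + stimuli.get(v.name, 0.0)) for v in variables
    }
    dominant = max(motivations, key=motivations.get)
    return behaviours[dominant], motivations


variables = [
    HomeostaticVariable("energy", value=40.0, ideal=100.0),
    HomeostaticVariable("social", value=70.0, ideal=100.0),
]
behaviours = {"energy": "go_to_charger", "social": "greet_user"}

# Without stimuli the largest deficit (energy) wins; a nearby user
# (hypothetical stimulus of 1.5) makes the social motivation dominant.
alone, _ = select_behaviour(variables, {}, behaviours)
with_user, _ = select_behaviour(variables, {"social": 1.5}, behaviours)
print(alone, with_user)
```

The design choice here is that external stimuli modulate, rather than replace, the internal deficit, so a fully satisfied variable never drives behaviour regardless of the stimulus.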
Figure 16 shows the distribution of publications presenting biologically inspired models to select actions. Of the 70 publications that included at least one of these models, 39 (\(\sim 26\%\) of the 148 reviewed) include a homeostatic internal model that emulates physiological functions, another 39 (\(\sim 26\%\)) include an affective model where decisions depend on the emotional state of the autonomous agent, 32 (\(\sim 22\%\)) use motivations to urge behaviour selection, and 23 (\(\sim 16\%\)) model cognitive functions typically inspired by the functions of the brain and mental models. This suggests that most systems implement one model depending on the goal they want to reach, but just a few publications attempt to study the relations between cognitive and emotional functions.
If we analyse the previous results by decade, we observe a marked increase in the use of cognitive models in the last ten years compared with the two previous decades. The homeostatic, motivational, and affective models all present a homogeneous pattern in which none stands out above the others.
4.5 Use of Learning
Many of the publications (a total of 63) included in this review present learning models to improve the system’s decisions. This section explores the methods most used to gain experience and improve performance during autonomous action selection. Since some systems may combine different learning methods, each system can be classified into more than one of the following categories.
- Reinforcement learning (RL): the decision-making is affected by learning from trial and error and past experiences.
- Neural networks (NNs): systems that use neural networks to learn action selection, including Deep Learning, convolutional networks, and similar techniques.
- Learning by imitation/demonstration: the systems gain knowledge for action selection by imitating other agents or after seeing a demonstration.
- Other techniques: systems that improve action selection through learning methods other than the above, such as genetic programming or heuristic search.
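As an illustration of the RL category, the sketch below shows tabular Q-learning with epsilon-greedy action selection, one of the simplest forms of learning from trial and error. It is a generic textbook formulation and not the implementation of any reviewed architecture. The class name, the state and action labels, and the parameter values are hypothetical.

```python
import random
from collections import defaultdict


class QLearningSelector:
    """Tabular Q-learning for action selection (generic sketch)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def select(self, state):
        # Epsilon-greedy: explore occasionally, otherwise pick the best-valued action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update towards reward plus discounted best next value.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Hypothetical usage: the robot is rewarded when greeting a nearby user.
agent = QLearningSelector(["greet", "wait"], epsilon=0.0)
agent.update("user_near", "greet", reward=1.0, next_state="user_engaged")
print(agent.select("user_near"))
```

With exploration disabled (`epsilon=0.0`), a single rewarded experience is enough for the greedy choice in that state to switch to the rewarded action.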
The distribution of publications by learning method, represented in Fig. 18, shows that RL is by far the most used. 38 of the 148 publications (\(\sim 26\%\)) include RL to improve decision-making, something we believe is closely related to the spread of social robots in the last two decades. Regarding the other alternatives, neural networks are also widely used, appearing in 13 of 148 works (\(\sim 9\%\)), followed by other approaches, such as genetic programming (\(\sim 8\%\)). Lastly, a few publications use learning by imitation and demonstration (5 of 148, \(\sim 3\%\)), two techniques that are not very common in social robotics.
Our previous hypothesis about the spread of social robots and RL techniques is reinforced if we analyse the distribution by decade. As Fig. 19 shows, the number of publications using RL in the last decade has increased significantly. In this graph, it is also possible to see an increase in the number of publications using some kind of learning, probably due to the expansion of Machine Learning and Artificial Intelligence in the last two decades.
4.6 Application to Real Scenarios
This section analyses whether the work reviewed in this manuscript has been tested in real scenarios or only as a conceptual design or in simulation.
Figure 20 shows that \(\sim 58\%\) of the publications have presented tests in real scenarios in the last three decades. Deepening the analysis, Fig. 21 shows that the initial trend has reversed. In the 1990s, most architectures were used in simulation or as conceptual designs (18, vs 15 tested in real scenarios). In the 2000s, however, slightly more architectures were tested in real scenarios than in simulation (20 vs 18). Finally, from the 2010s to the present, there have been twice as many architectures tested in real scenarios as in simulations (52 vs 26). This suggests that most decision-making and control architectures are currently tested in real scenarios where people participate. These results align with the fact that current systems are applied to more specific tasks.
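The reversal can be checked from the counts quoted for each decade. The sketch below assumes that, for the 2000s, 18 is the simulation count and 20 the real-scenario count, as the accompanying sentence implies. Note that the per-decade counts as quoted add up to 149 rather than 148, so the overall share is computed over the quoted counts.

```python
# (simulation or conceptual design, real scenario) counts per decade.
# Assumption: in the 2000s, 18 refers to simulation and 20 to real scenarios.
tests_by_decade = {
    "1990s": (18, 15),
    "2000s": (18, 20),
    "2010s-present": (26, 52),
}

# Percentage of architectures tested in real scenarios, decade by decade.
real_share_by_decade = {
    decade: 100 * real / (sim + real)
    for decade, (sim, real) in tests_by_decade.items()
}

real_total = sum(real for _, real in tests_by_decade.values())
all_total = sum(sim + real for sim, real in tests_by_decade.values())
overall_real_share = 100 * real_total / all_total

for decade, share in real_share_by_decade.items():
    print(f"{decade:15s} {share:5.1f}% tested in real scenarios")
print(f"overall         {overall_real_share:5.1f}%")
```

The per-decade share rises steadily across the three periods, which is the reversal described in the text.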
4.7 Systems Designed for Specific Platforms
This section investigates if the decision-making and control architectures reviewed in this survey have been specifically designed for a robot or, on the contrary, are general architectures designed to work in multi-system domains.
As Fig. 22 shows, most of the architectures are designed to be integrated into general platforms (\(\sim 63\%\)) rather than specific ones (\(\sim 37\%\)). The analysis per decade supports the general results: in every decade, DMSs and control architectures designed for general platforms outnumber those developed for specific robotic platforms.
5 Challenges to Autonomous and Social Robots
The literature review and previous analysis have provided a concise vision of the benefits of social robots in different domains. However, these architectures also face important challenges that should be addressed to continue developing more and more capable systems.
5.1 Engagement in Lengthy Interactions
Our previous analysis shows a clear tendency to use autonomous and social robots in short interactions: systems designed for lengthy interactions have been scarce in the last three decades. Although it seems that social robots are starting to work in real scenarios assisting people in many services [15], our results indicate that most research only focuses on testing these systems in controlled environments where the HRI lasts only a few minutes. In this regard, most recent work addresses how to engage users in the interaction, principally during cognitive stimulation, physical activities, and educational exercises, to avoid fatigue.
We believe that researchers in Artificial Intelligence who are working on designing robust control architectures for social robots should be aware of this issue and concentrate on designing novel action selection architectures for extended periods rather than for customised sessions. Thus, testing these architectures in real and unpredictable environments over long periods is essential to measure their real applicability and usability.
5.2 Multi-applicability
In line with the previous challenge, our study suggests that the application of DMSs has become more specific in the last decades. Initially, most proposals concerned control architectures with conceptual designs that were not applied to any specific area. Nonetheless, the number of publications that describe applications of autonomous behaviour in healthcare or education has significantly increased in recent years. Although we can see this applicability as a positive fact, we are still far from developing robust systems that can be used in multiple and diverse domains. Consequently, a significant challenge that researchers will face in the upcoming years is to design intelligent machines with autonomous action selection methods that can be used both in specific tasks and in a wide repertoire of activities to assist people. We believe that these robots will attract more potential customers if they are oriented to a broader target population.
5.3 Adaptation and Learning
Reinforcement learning has become the most used learning technique in the last decade [170]. The possibility of learning how to behave from trial and error opens a wide range of possibilities to build capable machines. In this context, more and more publications present control architectures that incorporate some kind of learning or adaptive system. However, most of them only focus on adapting to those users for whom the system has predefined information. Adaptation is therefore not fulfilled when the robot faces unknown users or requires lengthy training times.
In these situations, we propose generalization methods based on predictions that, after dynamically obtaining the necessary information from the user using HRI, can estimate essential features and attributes to start the assistance with some degree of adaptivity and not from scratch. Then, during subsequent interactions, the system can make an autonomous action selection combining the initial estimates with new adjustments that accurately represent the user preferences to improve the quality of the HRI and meet the initial goals. In this sense, the challenge of DMSs is integrating recent and adaptive learning methods to improve the robot’s behaviour selection.
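One simple way to picture this idea is a preference estimate that starts from a prediction and is then refined with feedback. The sketch below is only an assumption-laden illustration, not the generalization method proposed here. The function, the class, the user groups, and the weighting scheme (a prior treated as a number of pseudo-observations) are all hypothetical.

```python
def initial_estimate(group, known_users, default=0.5):
    """Predict a new user's preference from known users (hypothetical rule).

    Uses the mean preference of known users in the same group, falls back
    to all known users, and finally to a neutral default.
    """
    same_group = [u["preference"] for u in known_users if u["group"] == group]
    pool = same_group or [u["preference"] for u in known_users]
    if not pool:
        return default
    return sum(pool) / len(pool)


class PreferenceEstimate:
    """Running estimate that blends an initial prediction with new feedback."""

    def __init__(self, prior, prior_weight=1.0):
        self.value = prior
        # Number of pseudo-observations the initial prediction counts for.
        self.weight = prior_weight

    def update(self, observed):
        # Incremental weighted mean: each interaction shifts the estimate
        # towards the observed feedback, less so as evidence accumulates.
        self.weight += 1.0
        self.value += (observed - self.value) / self.weight
        return self.value


# Hypothetical data: preference for a given activity, in [0, 1].
known_users = [
    {"group": "older_adult", "preference": 0.25},
    {"group": "older_adult", "preference": 0.75},
    {"group": "child", "preference": 1.0},
]
estimate = PreferenceEstimate(initial_estimate("older_adult", known_users))
estimate.update(1.0)  # feedback from the first interaction with the new user
print(estimate.value)
```

The prior weight controls how quickly the estimate moves away from the initial prediction: a larger weight trusts the prediction for longer, a smaller one adapts faster to the new user.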
5.4 Lack of General Models of Shared Knowledge
Lastly, we would like to highlight a significant problem that is typically overlooked but affects these systems’ evolution. As we saw in Sect. 3, a large number of publications present DMSs for autonomous and social robots executing a wide range of behaviour. However, in most cases, researchers design their approaches without considering how to integrate the models of other researchers into their own and thus take advantage of previous research. This issue is probably due to a significant lack of knowledge sharing and of publicly released code. Although there seems to be a tendency for new researchers to share their DMSs to improve scalability and modularity, we believe we are still far from developing software solutions (in this case, DMSs for social and autonomous robots) that can be generally and easily implemented in different platforms to speed up technological growth.
6 Conclusion: The Future of DMSs
This manuscript started with a thorough review of the evolution of DMSs and control architectures for autonomous and social robots over the last thirty years. Then, we analysed the most important trends of this work to provide a concise representation of the fundamental challenges on which we still need to continue working to deploy these systems in real and lengthy applications.
We believe that social robots can provide multiple benefits to society, alleviating the workload and facilitating the execution of tasks by the most vulnerable sectors of society, such as older adults, children, or disabled people. Additionally, the ageing of the population in developed countries highlights the need to develop intelligent and autonomous machines with robust behaviour to overcome the possible lack of workforce in specific positions related to healthcare or education.
These challenges push us to continue investigating along these research lines to provide solutions that can improve people’s quality of life.
References
Kastin A (2013) Handbook of biologically active peptides. Academic press, Cambridge
Mitri S, Wischmann S, Floreano D, Keller L (2013) Using robots to understand social behaviour. Biol Rev 88(1):31–39
Efron R (1969) What is perception? In: Proceedings of the Boston Colloquium for the Philosophy of Science 1966/1968, pp 137–173. Springer
Allen C (2017) On (not) defining cognition. Synthese 194(11):4233–4249
Plass JL, Kalyuga S (2019) Four ways of considering emotion in cognitive load theory. Educ Psychol Rev 31(2):339–359
VandenBos GR (2007) APA dictionary of psychology. American Psychological Association
Maroto-Gómez M, Castro-González Á, Castillo JC, Malfaz M, Salichs MA (2018) A bio-inspired motivational decision making system for social robots based on the perception of the user. Sensors 18(8):2691–2710
Mubin O, Stevens CJ, Shahid S, Al Mahmud A, Dong J-J (2013) A review of the applicability of robots in education. J Technol Educ Learn 1(209–0015):1–7
Belpaeme T, Kennedy J, Ramachandran A, Scassellati B, Tanaka F (2018) Social robots for education: a review. Sci Robot 3(21):1–9
Cao H-L, Esteban PG, De Beir A, Simut R, van de Perre G, Lefeber D, Vanderborght B (2017) A survey on behavior control architectures for social robots in healthcare interventions. Int J Hum Robot 14(04):1–24
Broekens J, Heerink M, Rosendal H et al (2009) Assistive social robots in elderly care: a review. Gerontechnology 8(2):94–103
Fong T, Nourbakhsh I, Dautenhahn K (2003) A survey of socially interactive robots. Robot Auton Syst 42(3–4):143–166
Leite I, Martinho C, Paiva A (2013) Social robots for long-term interaction: a survey. Int J Soc Robot 5(2):291–308
Fernández-Rodicio E, Maroto-Gómez M, Castro-González Á, Malfaz M, Salichs MÁ (2022) Emotion and mood blending in embodied artificial agents: expressing affective states in the mini social robot. Int J Soc Robot 14(8):1841–1864
Kunze L, Hawes N, Duckett T, Hanheide M, Krajník T (2018) Artificial intelligence for long-term robot autonomy: a survey. IEEE Robot Autom Lett 3(4):4023–4030
Nocentini O, Fiorini L, Acerbi G, Sorrentino A, Mancioppi G, Cavallo F (2019) A survey of behavioral models for social robots. Robotics 8(3):1–35
Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med 151(4):264–269
Meyer J-A, Wilson SW (1991) Robot ethology: a proposal for the research into intelligent autonomous systems, pp 465–474
Lorenz K (1981) The foundations of ethology
Tinbergen N (1963) On aims and methods of ethology. Z Tierpsychol 20(4):410–433
Matarić MJ (1992) Integration of representation into goal-driven behavior-based robots. IEEE Trans Robot Autom 13:304–312
Mahadevan S, Connell J (1992) Automatic programming of behavior-based robots using reinforcement learning. Artif Intell 55(2–3):311–365
Elliott CD (1992) The affective reasoner: a process model of emotions in a multiagent system. PhD thesis, Northwestern University
Ortony A, Clore GL, Collins A (1990) The cognitive structure of emotions
Dorigo M, Schnepf U (1993) Genetics-based machine learning and behavior-based robotics: a new synthesis. IEEE Trans Syst Man Cybern 23(1):141–154
Lambora A, Gupta K, Chopra K (2019) Genetic algorithm-a literature review. In: 2019 international conference on machine learning, big data, cloud and parallel computing (COMITCon), pp 380–384. IEEE
Hayes GM, Demiris J (1994) A robot controller using learning by imitation
Nolfi S, Floreano D, Miglino O, Mondada F (1994) How to evolve autonomous robots: different approaches in evolutionary robotics. In: Artificial life iv: Proceedings of the fourth international workshop on the synthesis and simulation of living systems, pp 190–197. MIT press
Garcia-Alegre MC, Bustos P, Guinea D (1995) Complex behaviour generation on autonomous robots: a case study. In: 1995 IEEE international conference on systems, man and cybernetics. intelligent systems for the 21st century, vol 2, pp 1729–1734. IEEE
Steels L (1995) When are robots intelligent autonomous agents? Robot Auton Syst 15(1–2):3–9
Webb B (1995) Using robots to model animals: a cricket test. Robot Auton Syst 16(2–4):117–134
Velásquez JD (1996) Cathexis–a computational model for the generation of emotions and their influence in the behavior of autonomous agents. PhD thesis, Massachusetts Institute of Technology
Velásquez J (1998) Modeling emotion-based decision-making. Emotional and intelligent: the tangled knot of cognition, pp 164–169
Webster C (1997) Adaptive depression, affective computing, and intelligent processing. In: 1997 IEEE international conference on intelligent processing systems (Cat. No. 97TH8335), vol 2, pp 1181–1184. IEEE
Arkin RC (1998) Behavior-based robotics
Canamero D (1997) Modeling motivations and emotions as a basis for intelligent behavior. In: Proceedings of the first international conference on autonomous agents, pp 148–155
Tunstel E Jr, Lippincott T, Jamshidi M (1997) Behavior hierarchy for autonomous mobile robots: Fuzzy-behavior modulation and evolution. Intell Autom Soft Comput 3(1):37–49
El-Nasr MS, Skubic M (1998) A fuzzy emotional agent for decision-making in a mobile robot. In: 1998 IEEE international conference on fuzzy systems proceedings. IEEE world congress on computational intelligence (Cat. No. 98CH36228), vol 1, pp 135–140. IEEE
Arsene CTC, Zalzala AMS (1999) Control of autonomous robots using fuzzy logic controllers tuned by genetic algorithms. In: Proceedings of the 1999 congress on evolutionary computation-CEC99 (Cat. No. 99TH8406), vol 1, pp 428–435. IEEE
Matarić MJ (1998) Behavior-based robotics as a tool for synthesis of artificial behavior and analysis of natural behavior. Trends Cognit Sci 2(3):82–86
Reif JH, Wang H (1999) Social potential fields: a distributed behavioral control for autonomous robots. Robot Autonom Syst 27(3):171–194
Ishiguro H, Kanda T, Kimoto K, Ishida T (1999) A robot architecture based on situated modules. In: Proceedings 1999 IEEE/RSJ international conference on intelligent robots and systems. human and environment friendly robots with high intelligence and emotional quotients (Cat. No. 99CH36289), vol 3, pp 1617–1624. IEEE
Billard A, Dautenhahn K (1999) Experiments in learning by imitation-grounding and use of communication in robotic agents. Adapt Behav 7(3–4):415–438
Rooney C, O’Donoghue R, Duffy BR, O’Hare GMP, Collier R (1999) The social robot architecture: towards sociality in a real world domain. In: Towards intelligent mobile robots (TIMR99), Bristol, UK, 1999
Gadanho SC (1999) Reinforcement learning in autonomous robots: an empirical investigation of the role of emotions. PhD thesis, University of Edinburgh. College of Science and Engineering
Gadanho SC (2003) Learning behavior-selection by emotions and cognition in a multi-goal robot task. J Mach Learn Res 4(Jul):385–412
Webb B (2000) What does robotics offer animal behaviour? Anim Behav 60(5):545–558
Bryson J (2000) Cross-paradigm analysis of autonomous agent architecture. J Exp Theor Artif Intell 12(2):165–189
Estlin TA, Volpe R, Nesnas I, Mutz D, Fisher F, Engelhardt B, Chien S (2001) Decision-making in a robotic architecture for autonomy. In: 6th international symposium on artificial intelligence, robotics and automation in space
Scheutz M (2002) Affective action selection and behavior arbitration for autonomous robots. In: IC-AI, pp 334–340
Nakauchi Y, Simmons R (2002) A social robot that stands in line. Auton Robot 12(3):313–324
Cañamero L (2003) Designing emotions for activity selection in autonomous agents. Emot Hum Artifacts 115:148–168
Avila-Garcia O, Cañamero L (2004) Using hormonal feedback to modulate action selection in a competitive scenario. In: From animals to animats 8: proceedings of the seventh [ie Eighth] international conference on simulation of adaptive behavior, vol 8, pp 243–253. MIT Press
Ramsay DS, Woods SC (2014) Clarifying the roles of homeostasis and allostasis in physiological regulation. Psychol Rev 121(2):225–247
Weisbuch M (2019) Winner-take-all as a collective action problem. JL Pol 35:67–93
Avila-Garcia O, Canamero L (2005) Hormonal modulation of perception in motivation-based action selection architectures. In: Proceedings of the symposium on agents that want and like, SSAISB
Duffy BR, Dragone M, O’Hare GMP (2005) Social robot architecture: a framework for explicit social interaction. In: Android science: towards social mechanisms, CogSci 2005 Workshop, Stresa, Italy, pp 3–4
Konidaris G, Barto A (2006) An adaptive robot motivational system. In: International conference on simulation of adaptive behavior, pp 346–356. Springer
Malfaz M, Salichs MA (2006) Using emotions for behaviour-selection learning. Front Artif Intell Appl 141:697–699
Malfaz M, Salichs M (2010) Using muds as an experimental platform for testing a decision making system for self-motivated autonomous agents. Artif Intell Simul Behav J 2(1):21–44
Malfaz M, Castro-González Á, Barber R, Salichs MA (2011) A biologically inspired architecture for an autonomous and social robot. IEEE Trans Autonom Mental Dev 3(3):232–246
Michalowski MP, Sabanovic S, Simmons R (2006) A spatial model of engagement for a social robot. In: 9th IEEE international workshop on advanced motion control, 2006, pp 762–767. IEEE
Walters ML (2008) The design space for robot appearance and behaviour for social robot companions. PhD thesis, University of Hertfordshire
Mohammad Y, Nishida T (2009) Toward combining autonomy and interactivity for social robots. AI Soc 24(1):35–49
Balkenius C, Morén J, Winberg S (2009) Interactions between motivation, emotion and attention: from biology to robotics. In: Proceedings of the ninth international conference on epigenetic robotics
Scheutz M, Schermerhorn P (2010) Affective goal and task selection for social robots. In: Social computing: concepts, methodologies, tools, and applications, pp 2150–2163. Springer
Scheutz M, Malle B, Briggs G (2015) Towards morally sensitive action selection for autonomous social robots. In: 2015 24th IEEE international symposium on robot and human interactive communication (RO-MAN), pp 492–497. IEEE
Shi C, Kanda T, Shimada M, Yamaoka F, Ishiguro H, Hagita N (2010) Easy development of communicative behaviors in social robots. In: 2010 IEEE/RSJ international conference on intelligent robots and systems, pp 5302–5309. IEEE
Castro-González Á, Malfaz M, Salichs MA (2010) Selection of actions for an autonomous social robot. In: International conference on social robotics, pp 110–119. Springer
Salichs MA, Barber R, Khamis AM, Malfaz M, Gorostiza JF, Pacheco R, Rivas R, Corrales A, Delgado E, Garcia D (2006) Maggie: a robotic platform for human-robot social interaction. In: 2006 IEEE conference on robotics, automation and mechatronics, pp 1–7. IEEE
Floreano D, Keller L (2010) Evolution of adaptive behaviour in robots by means of darwinian selection. PLoS Biol 8(1):1–8
Buendia A, Granata C, Bidaud P (2011) Interactive person following for social robots. In: CLAWAR 2011, 11th international conference on climbing and walking robots and the support technologies for mobile machines, pp 11–26
Arkin RC, Ulam P, Wagner AR (2011) Moral decision making in autonomous systems: enforcement, moral emotions, dignity, trust, and deception. Proc IEEE 100(3):571–589
Leite I (2015) Long-term interactions with empathic social robots. AI Matters 1(3):13–15
Scheidler A, Brutschy A, Ferrante E, Dorigo M (2015) The k-unanimity rule for self-organized decision-making in swarms of robots. IEEE Trans Cybern 46(5):1175–1188
Hammersley J (2013) Monte Carlo methods. Springer Science and Business Media, Berlin
Qureshi AH, Nakamura Y, Yoshikawa Y, Ishiguro H (2016) Robot gains social intelligence through multimodal deep reinforcement learning. In: 2016 IEEE-RAS 16th international conference on humanoid robots (Humanoids), pp 745–751. IEEE
Adam C, Johal W, Pellier D, Fiorino H, Pesty S (2016) Social human-robot interaction: a new cognitive and affective interaction-oriented architecture. In: International conference on social robotics, pp 253–263. Springer
Cervantes J-A, Rodríguez L-F, López S, Ramos F, Robles F (2016) Autonomous agents and ethical decision-making. Cogn Comput 8(2):278–296
Vallverdú J, Talanov M, Distefano S, Mazzara M, Tchitchigin A, Nurgaliev I (2016) A cognitive architecture for the implementation of emotions in computing systems. Biol Inspir Cognit Archit 15:34–40
Lövheim H (2012) A new three-dimensional model for emotions and monoamine neurotransmitters. Med Hypotheses 78(2):341–348
Cos I, Canamero L, Hayes GM, Gillies A (2013) Hedonic value: enhancing adaptation for motivated agents. Adapt Behav 21(6):465–483
Lewis M, Canamero L (2016) Hedonic quality or reward? a study of basic pleasure in homeostasis and decision making of a motivated autonomous robot. Adapt Behav 24(5):267–291
Lones J, Lewis M, Cañamero L (2017) A hormone-driven epigenetic mechanism for adaptation in autonomous robots. IEEE Trans Cognit Dev Syst 10(2):445–454
Maroto-Gómez M, González R, Castro-González Á, Malfaz M, Salichs MÁ (2021) Speeding-up action learning in a social robot with dyna-q+: a bioinspired probabilistic model approach. IEEE Access 9:98381–98397
Kowalczuk Z, Czubenko M (2011) Intelligent decision-making system for autonomous robots. Int J Appl Math Comput Sci 21(4):671–684
Kowalczuk Z, Czubenko M (2018) An intelligent decision-making system for autonomous units based on the mind model. In: 2018 23rd international conference on methods and models in automation and robotics (MMAR), pp 1–6. IEEE
Kowalczuk Z, Czubenko M, Merta T (2020) Interpretation and modeling of emotions in the management of autonomous robots using a control paradigm based on a scheduling variable. Eng Appl Artif Intell 91
Romero A, Bellas F, Prieto A, Duro RJ (2018) Utility model re-description within a motivational system for cognitive robotics. In: 2018 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 2324–2329. IEEE
Stange S, Buschmeier H, Hassan T, Ritter C, Kopp S (2019) Towards self-explaining social robots. verbal explanation strategies for a needs-based architecture. In: AAMAS 2019 workshop on cognitive architectures for HRI: embodied models of situated natural language interactions (MM-Cog)
Esteban PG, Insua DR (2019) A model for an affective non-expensive utility-based decision agent. IEEE Trans Affect Comput 10(4):498–509
Cunningham AG, Galceran E, Mehta D, Ferrer G, Eustice RM, Olson E (2019) Mpdm: multi-policy decision-making from autonomous driving to social robot navigation. In: Control strategies for advanced driver assistance systems and autonomous driving functions, pp 201–223
Martins GS, Al Tair H, Santos L, Dias J (2019) αPOMDP: POMDP-based user-adaptive decision-making for social robots. Pattern Recogn Lett 118:94–103
Man K, Damasio A (2019) Homeostasis and soft robotics in the design of feeling machines. Nat Mach Intell 1(10):446–452
Augello A, Infantino I, Gaglio S, Maniscalco U, Pilato G, Vella F (2020) An artificial soft somatosensory system for a cognitive robot. In: 2020 Fourth IEEE international conference on robotic computing (IRC), pp 319–326. IEEE
McCall RJ, Franklin S, Faghihi U, Snaider J, Kugele S (2020) Artificial motivation for cognitive software agents. J Artif Gener Intell 11(1):38–69
Hong A, Lunscher N, Hu T, Tsuboi Y, Zhang X, Franco dos Reis Alves S, Nejat G, Benhabib B (2021) A multimodal emotional human-robot interaction architecture for social robots engaged in bidirectional communication. IEEE Trans Cybern 51(12):5954–5968
Martin-Rico F, Gomez-Donoso F, Escalona F, Garcia-Rodriguez J, Cazorla M (2020) Semantic visual recognition in a cognitive architecture for social robots. Integr Computer-Aided Eng 27(3):301–316
Kim NV, Bodunkov NE (2021) The autonomous social robot control based on the situation analysis. J Phys Conf Ser 1958:012022
Agrawal VP, Kohli V, Gupta S (1991) Computer aided robot selection: the ‘multiple attribute decision making’ approach. Int J Prod Res 29(8):1629–1644
Wang Z-D, Nakano E, Matsukawa T (1996) A new approach to multiple robots’ behavior design for cooperative object manipulation. Distrib Autonom Robot Syst 2:350–361
Kalenka S, Jennings NR (1999) Socially responsible decision making by autonomous agents. In: Cognition, Agency and Rationality, pp 135–149
Shah HK, Bahl V, Martin J, Flann NS, Moore KL (2002) Intelligent behavior generator for autonomous mobile robots using planning-based ai decision making and supervisory control logic. In: Unmanned ground vehicle technology IV, vol 4715, pp 161–177. International Society for Optics and Photonics
Clodic A, Alami R, Montreuil V, Li S, Wrede B, Swadzba A (2007) A study of interaction between dialog and decision for human-robot collaborative task achievement. In: RO-MAN 2007-The 16th IEEE international symposium on robot and human interactive communication, pp 913–918. IEEE
Czubenko M, Kowalczuk Z, Ordys A (2015) Autonomous driver based on an intelligent system of decision-making. Cogn Comput 7(5):569–581
O’Brien MJ, Arkin RC (2020) Adapting to environmental dynamics with an artificial circadian system. Adapt Behav 28(3):165–179
Dautenhahn K, Billard A (2002) Games children with autism can play with robota, a humanoid robotic doll. In: Universal access and assistive technology, pp 179–190. Springer
Feil-Seifer D, Mataric MJ (2008) B3IA: a control architecture for autonomous robot-assisted behavior intervention for children with autism spectrum disorders. In: RO-MAN 2008-the 17th IEEE international symposium on robot and human interactive communication, pp 328–333. IEEE
Senft E, Baxter P, Kennedy J, Belpaeme T (2015) When is it better to give up? towards autonomous action selection for robot assisted asd therapy. In: Proceedings of the tenth annual ACM/IEEE international conference on human-robot interaction extended abstracts, pp 197–198
Senft E, Baxter P, Belpaeme T (2015) Human-guided learning of social action selection for robot-assisted therapy. In: Machine learning for interactive systems, pp 15–20. PMLR
Hiolle A, Lewis M, Cañamero L (2014) Arousal regulation and affective adaptation to human responsiveness by a robot that explores and learns a novel environment. Front Neurorobot 8:1–17
Lones J, Lewis M, Cañamero L (2014) Hormonal modulation of development and behaviour permits a robot to adapt to novel interactions. In: ALIFE 14: the fourteenth international conference on the synthesis and simulation of living systems, pp 184–191. MIT Press
Cañamero L, Lewis M (2016) Making new “new AI” friends: designing a social robot for diabetic children from an embodied AI perspective. Int J Soc Robot 8(4):523–537
Lewis M, Cañamero L (2019) A robot model of stress-induced compulsive behavior. In: 2019 8th international conference on affective computing and intelligent interaction (ACII), pp 559–565. IEEE
González JC, Pulido JC, Fernández F (2017) A three-layer planning architecture for the autonomous control of rehabilitation therapies based on social robots. Cognit Syst Res 43:232–249
Cao H-L, Gómez EP, De Albert B, Simut R, Van de Perre G, Lefeber D, Vanderborght B (2017) A collaborative homeostatic-based behavior controller for social robots in human-robot interaction experiments. Int J Soc Robot 9(5):675–690
Lazzeri N, Mazzei D, Cominelli L, Cisternino A, Rossi DED (2018) Designing the mind of a social robot. Appl Sci 8(2):302–320
Park HW, Grover I, Spaulding S, Gomez L, Breazeal C (2019) A model-free affective reinforcement learning approach to personalization of an autonomous social robot companion for early literacy education. In: Proceedings of the AAAI conference on artificial intelligence, vol 33, pp 687–694
Van der Putte D, Boumans R, Neerincx M, Rikkert MO, De Mul M (2019) A social robot for autonomous health data acquisition among hospitalized patients: an exploratory field study. In: 2019 14th ACM/IEEE international conference on human-robot interaction (HRI), pp 658–659. IEEE
Mannion A, Summerville S, Barrett E, Burke M, Santorelli A, Kruschke C, Felzmann H, Kovacic T, Murphy K, Casey D et al (2020) Introducing the social robot mario to people living with dementia in long term residential care: Reflections. Int J Soc Robot 12(2):535–547
Salichs MA, Castro-González Á, Salichs E, Fernández-Rodicio E, Maroto-Gómez M, Gamboa-Montero JJ, Marques-Villarroya S, Castillo JC, Alonso-Martín F, Malfaz M (2020) Mini: a new social robot for the elderly. Int J Soc Robot 12(6):1231–1249
Tanevska A, Rea F, Sandini G, Cañamero L, Sciutti A (2020) A socially adaptable framework for human-robot interaction. Front Robot AI, 121–137
Foster ME, Ali S, Litwin S, Parker J, Petrick R, Smith DH, Stinson J, Zeller F (2020) Using AI-enhanced social robots to improve children’s healthcare experiences. In: International conference on social robotics, pp 542–553. Springer
Robinson NL, Connolly J, Hides L, Kavanagh DJ (2020) Social robots as treatment agents: pilot randomized controlled trial to deliver a behavior change intervention. Internet Interv 21:100320
Asprino L, Ciancarini P, Nuzzolese AG, Presutti V, Russo A (2022) A reference architecture for social robots. J Web Semant 72
Dautenhahn K (1999) Robots as social actors: aurora and the case of autism. In: Proceedings of the CT99, the third international cognitive technology conference, August, San Francisco, vol 359, pp 374–390. Citeseer
Breazeal C, Scassellati B (1999) A context-dependent attention system for a social robot. In: Proceedings of the 16th international joint conference on artificial intelligence, 255(3)
Breazeal C (2003) Toward sociable robots. Robot Auton Syst 42(3–4):167–175
van Breemen A, Yan X, Meerbeek B (2005) ICAT: an animated user-interface robot with personality. In: Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems, pp 143–144
Osada J, Ohnaka S, Sato M (2006) The scenario and design process of childcare robot, PaPeRo. In: Proceedings of the 2006 ACM SIGCHI international conference on Advances in computer entertainment technology, pp 80–es
Mitnik R, Nussbaum M, Soto A (2008) An autonomous educational mobile robot mediator. Auton Robots 25(4):367–382
Ushida H (2010) Effect of social robot’s behavior in collaborative learning. In: 2010 5th ACM/IEEE international conference on human-robot interaction (HRI), pp 195–196. IEEE
Strohkorb S, Scassellati B (2016) Promoting collaboration with social robots. In: 2016 11th ACM/IEEE international conference on human-robot interaction (HRI), pp 639–640. IEEE
Coninx A, Baxter P, Oleari E, Bellini S, Bierman B, Henkemans O, Cañamero L, Cosi P, Enescu V, Espinoza R et al (2016) Towards long-term social child-robot interaction: using multi-activity switching to engage young users. J Human-Robot Interact
Egido-García V, Estévez D, Corrales-Paredes A, Terrón-López M-J, Velasco-Quintana P-J (2020) Integration of a social robot in a pedagogical and logopedic intervention with children: a case study. Sensors 20(22):6483–6502
Mascarenhas S, Guimarães M, Prada R, Santos PA, Dias J, Paiva A (2022) Fatima toolkit - toward an accessible tool for the development of socio-emotional agents. ACM Trans Interact Intell Syst 12(1):1–30
Ahmad MI, Gao Y, Alnajjar F, Shahid S, Mubin O (2021) Emotion and memory model for social robots: a reinforcement learning based behaviour selection. Behav Inform Technol, 1–27
Kaptein F, Kiefer B, Cully A, Celiktutan O, Bierman B, Rijgersberg-peters R, Broekens J, Van Vught W, Van Bekkum M, Demiris Y et al (2021) A cloud-based robot system for long-term interaction: principles, implementation, lessons learned. ACM Trans Human-Robot Interact (THRI) 11(1):1–27
Gu D, Hu H, Reynolds J, Tsang E (2003) Ga-based learning in behaviour based robotics. In: Proceedings 2003 IEEE international symposium on computational intelligence in robotics and automation. computational intelligence in robotics and automation for the New Millennium (Cat. No. 03EX694), vol 3, pp 1521–1526. IEEE
Kok JR, Spaan MTJ, Vlassis N et al (2003) Multi-robot decision making using coordination graphs. In: Proceedings of the 11th international conference on advanced robotics, ICAR, vol 3, pp 1124–1129
Manzotti R, Tagliasco V (2005) From behaviour-based robots to motivation-based robots. Robot Auton Syst 51(2–3):175–190
Kozima H, Michalowski MP, Nakagawa C (2009) Keepon: A playful robot for research, therapy, and entertainment. Int J Soc Robot 1:3–18
Fernaeus Y, Håkansson M, Jacobsson M, Ljungblad S (2010) How do you play with a robotic toy animal? a long-term study of Pleo. In: Proceedings of the 9th international conference on interaction design and children, pp 39–48
Gonzalez-Pacheco V, Ramey A, Alonso-Martín F, Gonzalez AC, Salichs MA (2011) Maggie: A social robot as a gaming platform. Int J Soc Robot 3(4):371–381
Collins EC, Prescott TJ, Mitchinson B, Conran S (2015) MIRO: a versatile biomimetic edutainment robot. In: Proceedings of the 12th international conference on advances in computer entertainment technology, pp 1–4
Kaupp T, Makarenko A, Durrant-Whyte H (2010) Human-robot communication for collaborative decision making-a probabilistic approach. Robot Auton Syst 58(5):444–456
Bicho E, Erlhagen W, Louro L, Silva EC (2011) Neuro-cognitive mechanisms of decision making in joint action: a human-robot interaction study. Hum Mov Sci 30(5):846–868
Schneider S, Goerlich M, Kummert F (2017) A framework for designing socially assistive robot interactions. Cogn Syst Res 43:301–312
Bagheri E, Roesler O, Cao H-L, Vanderborght B (2021) A reinforcement learning based cognitive empathy framework for social robots. Int J Soc Robot 13(5):1079–1093
Saunderson S, Nejat G (2022) Hybrid hierarchical learning for adaptive persuasion in human-robot interaction. IEEE Robot Autom Lett 7(2):5520–5527
Maroto-Gómez M, Castro-González Á, Castillo JC, Malfaz M, Salichs MA (2022) An adaptive decision-making system supported on user preference predictions for human-robot interactive communication. User Model User Adapt Interact
Shibata T, Tanie K (2001) Physical and affective interaction between human and mental commit robot. In: Proceedings 2001 ICRA. IEEE international conference on robotics and automation (Cat. No. 01CH37164), vol 3, pp 2572–2577. IEEE
Arkin RC, Fujita M, Takagi T, Hasegawa R (2003) An ethological and emotional basis for human-robot interaction. Robot Auton Syst 42(3–4):191–201
Saldien J, Goris K, Yilmazyildiz S, Verhelst W, Lefeber D (2008) On the design of the huggable robot Probo. J Phys Agents 2(2):3–11
Samani HA, Saadatian E (2012) A multidisciplinary artificial intelligence model of an affective robot. Int J Adv Rob Syst 9(1):1–6
Grigore EC, Pereira A, Scassellati B (2015) Modeling motivational states for adaptive robot companions. In: 2015 AAAI fall symposium series
Thrun S, Bennewitz M, Burgard W, Cremers AB, Dellaert F, Fox D, Hahnel D, Rosenberg C, Roy N, Schulte J et al (1999) Minerva: a second-generation museum tour-guide robot. In: Proceedings 1999 IEEE international conference on robotics and automation, vol 3. IEEE
Jung D, Zelinsky A (1999) An architecture for distributed cooperative planning in a behaviour-based multi-robot system. Robot Auton Syst 26(2–3):149–174
Van der Loos HFM, Wagner JJ, Smaby N, Chang K, Madrigal O, Leifer LJ, Khatib O (1999) Provar assistive robot system architecture. In: Proceedings 1999 IEEE international conference on robotics and automation (Cat. No. 99CH36288C), vol 1, pp 741–746. IEEE
Lisetti CL, Brown SM, Alvarez K, Marpaung AH (2004) A social informatics approach to human-robot interaction with a service social robot. IEEE Trans Syst Man Cybern Part C (Appl Rev) 34(2):195–209
Hollinger GA, Georgiev Y, Manfredi A, Maxwell BA, Pezzementi ZA, Mitchell B (2006) Design of a social mobile robot using emotion-based decision mechanisms. In: 2006 IEEE/RSJ international conference on intelligent robots and systems, pp 3093–3098. IEEE
Rodriguez-Losada D, Matia F, Galan R, Hernando M, Montero JM, Lucas JM (2008) Urbano, an interactive mobile tour-guide robot. IntechOpen, Rijeka
Shiomi M, Kanda T, Glas DF, Satake S, Ishiguro H, Hagita N (2009) Field trial of networked social robots in a shopping mall. In: 2009 IEEE/RSJ international conference on intelligent robots and systems, pp 2846–2853. IEEE
Alili S, Alami R, Montreuil V (2009) A task planner for an autonomous social robot. In: Distributed autonomous robotic systems, vol 8, pp 335–344. Springer
Foster ME, Keizer S, Lemon O (2014) Towards action selection under uncertainty for a socially aware robot bartender. In: Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction, pp 158–159
Petrick RPA, Foster ME (2016) Using general-purpose planning for action selection in human-robot interaction. In: 2016 AAAI fall symposium series
Liu P, Glas DF, Kanda T, Ishiguro H (2018) Learning proactive behavior for interactive social robots. Auton Robots 42(5):1067–1085
Malviya V, Reddy AK, Kala R (2020) Autonomous social robot navigation using a behavioral finite state social machine. Robotica 38(12):2266–2289
Hedblom MM, Pomarlan M, Porzel R, Malaka R, Beetz M (2021) Dynamic action selection using image schema-based reasoning for robots. In: Proceedings of the joint ontology workshops
Akalin N, Loutfi A (2021) Reinforcement learning approaches in social robotics. Sensors 21(4)
Velásquez JD (1998) When robots weep: emotional memories and decision-making. In: AAAI/IAAI, pp 70–75
Arbib MA, Fellous J-M (2004) Emotions: from brain to robot. Trends Cognit Sci 8(12):554–561
Tanevska A, Rea F, Sandini G, Cañamero L, Sciutti A (2019) A cognitive architecture for socially adaptable robots. In: 2019 joint IEEE 9th international conference on development and learning and epigenetic robotics (ICDL-EpiRob), pp 195–200. IEEE
Acknowledgements
The research leading to these results has received funding from the projects: Robots Sociales para Estimulación Física, Cognitiva y Afectiva de Mayores (ROSES), RTI2018-096338-B-I00, funded by the Ministerio de Ciencia, Innovación y Universidades; Robots sociales para mitigar la soledad y el aislamiento en mayores (SOROLI), PID2021-123941OA-I00, funded by Agencia Estatal de Investigación (AEI), Spanish Ministerio de Ciencia e Innovación. This publication is part of the R&D&I project PLEC2021-007819 funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.
Funding
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Data availability
All data generated or analysed during this study are included in this published article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Maroto-Gómez, M., Alonso-Martín, F., Malfaz, M. et al. A Systematic Literature Review of Decision-Making and Control Systems for Autonomous and Social Robots. Int J of Soc Robotics 15, 745–789 (2023). https://doi.org/10.1007/s12369-023-00977-3
DOI: https://doi.org/10.1007/s12369-023-00977-3