Abstract
Research studies on social robotics and human-robot interaction have gained insights into factors that influence people’s perceptions and behaviors towards robots. However, adults’ perceptions of robots may differ significantly from those of infants. Consequently, extending this knowledge to infants’ attitudes toward robots is a growing field of research. Indeed, infant-robot interaction (IRI) is emerging as a critical and necessary area of research as robots are increasingly used in social environments, such as caring for infants with all types of disabilities, companionship, and education. Although studies have been conducted on the ability of robots to positively engage infants, little is known about infants’ affective states when interacting with a robot. In this systematic review, technologies for infant affective state recognition relevant to IRI applications are presented and surveyed. Indeed, adapting techniques currently employed for infant emotion recognition to the field of IRI proves to be a complex task, since it requires a timely response while not interfering with the infant’s behavior. These aspects have a crucial impact on the selection of the emotion recognition techniques and the related metrics to be used for this purpose. Therefore, this review is intended to shed light on the advantages and the current research challenges of infant affective state recognition approaches in the IRI field, elucidate a roadmap for their use in forthcoming studies, and potentially provide support for future developments of emotion-aware robots.
1 Introduction
Robots are currently being studied and expected to be used in a wide range of social applications [1]. Zviel-Girshin et al. envisaged a rapid increase in robots in modern society to the point that most children will potentially be surrounded by a robotic environment [2]. According to Beran et al., infants and children are increasingly playing with robotic technologies during their playtime [3]. Consequently, investigating infant-robot interactions (IRI) as well as robots’ influence on young children’s cognition, learning, language, social, and moral development, is crucially important [4]. Although studies have been conducted to investigate these aspects in children [5,6,7], still little is known about the robot’s influence on infants.
Recent studies used robots as a tool for delivering early interventions to address developmental disability in infants [8, 9]. These studies corroborated the ability of robots to inspire and encourage infants to imitate desired patterns of movements [8, 9]. The widespread literature on mirror neurons [10] further suggested that interactions with an infant-sized humanoid robot may lead infants to imitate and practice key motor skills such as standing and walking. These findings highlighted the importance of promoting research on infants’ robotic technologies aimed at providing complementary support to human-administered therapy.
A key component of robot interactions with infants is the use of visual stimuli presented via robot behaviors [11]. Robots must be capable of reliably capturing infants’ visual attention to be able to teach and reinforce infants’ actions. To this end, robots need a method to gain infants’ attention and, once gained, must be able to maintain it. This includes the robot’s understanding of the infant’s current psychophysiological and emotional states, as well as the determination of the optimal course of action depending on the infant’s current situation. Indeed, a seamless infant-robot interaction requires the infant to be motivated to follow the robot’s actions (such as leg movements or eye gaze) [1, 12, 13]. Therefore, it is recommended that the robot positively impact the infant’s curiosity [3] and first engage the infant in some form of social interaction or socially intelligent behavior [12].
The quality of the interaction between infants and robots, and the analysis of the robots’ influence on infants, can be investigated by detecting infants’ affective cues during such interaction. For this reason, the social robotics field is supported by affective computing, a branch of the data mining field aimed at providing effective and spontaneous interaction between humans and devices [14]. One of its primary goals is to enable systems to understand the emotional states expressed by human subjects so that personalized responses can be delivered accordingly [15]. Specifically, affective computing focuses on the study and development of systems and devices that can identify, interpret, process, and simulate human affects [16]. The machine should be able to detect human emotional states and modify its behavior to respond appropriately to those emotions. Therefore, reliable robotic emotion detection remains the cornerstone of affective computing [17].
By contrast, although affective computing during infants’ interaction with robotic systems is a fundamental task in the IRI field, researchers in this area still struggle to address it. This is mainly because emotions in infants have been principally studied by analyzing vocal or facial expressions. In fact, over the first few years of life, children develop patterns of facial, vocal, and behavioral (i.e., bodily) expressions that allow them to communicate their feelings and adjust those communications according to the situation [18, 19]. However, the models underlying infants’ emotional phenomena in facial or vocal expression are built on varying theoretical assumptions that include anatomical and biological aspects as well as different theories about the cause and purpose of emotions. Transforming these theories into an emotional computational model is a difficult endeavor in and of itself.
Other than vocal or facial expressions, emotions can be detected from physiological signals. Efforts for emotion recognition through physiological markers are evident in many studies using electrocardiography (ECG) [20], electroencephalography (EEG) [21], functional near-infrared spectroscopy [22, 23], skin conductance [24], and thermography [25]. The drawback of working with physiological readings is the personal access required to measure them. Direct contact with the infant’s body is needed to acquire most readings, but contact sensors used for measurements may perturb the infant’s body, potentially biasing the results. To favor the ecological dimension of infant-robot interaction, it would be desirable to assess the psychophysiological and emotional states non-invasively. To this end, contactless technology such as thermal infrared (IR) imaging, which enables monitoring human autonomic activity and inferring psychological and affective states without constraining the subject [26,27,28,29], has recently been introduced in the IRI field. Finally, since infants’ emotions and drives play essential roles in generating meaningful interactions [30], the study of infants’ affective states during robot interactions can lead to new knowledge in the field of developmental psychology and inspire new insights for developmental robotics improvement [31]. Indeed, the emerging area of developmental robotics is oriented toward the advancement of robotics by attempting to reproduce infant-like behavior and learning.
1.1 Structure and Aim of the Study
The study’s purpose is to evaluate and assess the technologies and procedures used for infant affective state recognition that are most significant to the IRI field and to further investigate the main research areas of affective computing in IRI applications. The current infant affective state recognition technologies have been surveyed in terms of the accuracy achieved, their suitability in the IRI field, and their potentialities and limits. In this sense, the study is intended to offer a potential solution to overcome the existing difficulties in this area and provide new perspectives towards a successful interaction between infants and robots. Furthermore, the study provides a perspective on future developments of emotion-aware robots.
This work is structured as follows. Section 2 describes the methodological and search strategy structure. In detail, the search was based upon two research questions (RQs): (1) the infants’ affective states recognition techniques currently available, relevant to IRI studies and (2) the IRI main area of applications where affective computing is necessary. Indeed, answering those two RQs would provide a comprehensive overview of the potentialities and limits of affective computing in the IRI field. Section 3 presents the results obtained from the literature search and the outcomes of each RQ are explained in a dedicated subheading. Such results are then discussed in Sect. 4.
2 Methods
This study was conducted as a systematic literature review based on the original guidelines as proposed by Kitchenham [32]. The literature survey was organized into two sections addressing different RQs.
RQ1
Examining the infants’ affective states recognition techniques and their relevance to IRI studies.
To promote an affective computing system suitable for IRI research areas, it is necessary to have a comprehensive understanding of the infants’ affective computing approaches currently available in the literature. Since infants cannot vocally express their emotions, understanding their affect has been considered by many to present a significant challenge. Indeed, emotion research in infants has received a great deal of attention over the years, and psychologists have developed various modalities for infant emotion recognition. However, questions have been raised about the soundness and applicability of these approaches in IRI applications. Therefore, this RQ aims to investigate existing emotion detection techniques and highlight relevant features for IRI applications. Moreover, understanding the infant’s emotional states is critical in many robotic applications and is considered fundamental to effective social relationships and psychological adjustment. Indeed, the design of robots that respond autonomously and emotionally may provide an alternative for assistive therapy. Nonetheless, before robotic systems can be endowed with affective computing abilities, the efficacy of the approaches utilized for infant emotion identification and categorization needs to be examined.
RQ2
Assessing the main research areas of affective computing in IRI applications.
Recently, there has been a growing interest in designing interactive robots that humans can spontaneously and intuitively interact with. To allow spontaneous interaction, robotic applications frequently use technologies designed to recognize the affective state of the human interlocutor. However, IRI may be fundamentally different from adult human-robot interaction (HRI) in that infants are not simply small adults. Their physical, neurophysical, and mental growth is ongoing, which may result in conditions and operational circumstances that differ significantly from HRI [33, 34]. Furthermore, conducting studies involving infant participants places significant constraints on the experimenter and raises possible ethical issues. Therefore, the present RQ seeks to identify and describe the research fields in which IRI and infant affective state recognition are valued. Indeed, the answer to this question will provide a clear picture of the potential use of robots with infants on the one hand, while also pointing out the technological challenges that must be solved in order to develop an affective computing system that satisfies the demands of IRI on the other.
The databases searched were Scopus and Google Scholar. All papers published in conferences and journals between 1990 and 2022 were considered. The year 1990 was chosen as a starting point since, from that period onward, developments in intelligent technology made robotics applications valuable and profitable.
Concerning RQ1 the search was based on the words “Infant” OR “Babies” OR “Toddler” OR “Pediatric” AND “emotion” AND “Recognition” OR “Detection” OR “Computing”. In the Scopus database, those keywords were surveyed in fields such as article title, abstract, and keywords. In the Scholar database, on the other hand, the advanced search can be performed either by searching (i) the entire text or (ii) the title only. Therefore, the advanced survey was carried out by searching for “Infant” OR “Babies” OR “Toddler” OR “Pediatric” AND “Emotion” with at least one of these words: “Recognition” OR “Detection” OR “Computing”, and the field searched was the entire text. The overall search generated 771 results in Scopus and 739 in Scholar.
With respect to RQ2 the search was based on the words “Infant” OR “Babies” OR “Toddler” OR “Pediatric” AND “Robot” AND “Interaction”. In the Scopus database, the survey was set up by searching for those words within the following fields: article title, abstract, and keywords. The basic search generated 375 results. Whereas in Scholar, the search was based on “Infant-Robot interaction” within the entire text. A total of 260 results were obtained from the Scholar survey. The search was performed independently by two researchers.
The initial analysis consisted of filtering out papers related to subject areas such as arts, agricultural, biochemistry, environmental science, medicine, physics and astronomy, pharmacology, chemical, and economics. This procedure reduced the considered pool to 255 papers in Scopus and 106 papers in Scholar related to RQ2, whereas for RQ1 the pool was reduced to 262 papers in Scopus and 208 papers in Scholar. Therefore, the total numbers of papers from all the RQs resulting from the first screening analysis were 517 and 314 for the Scopus and Scholar databases, respectively. Results from each source were carefully examined for duplications. Document types of Google Scholar search outcomes were examined closely to ensure that they were related to reliable scientific sources. Review papers and all papers that did not refer to a user study or that did not relate to IRI or infant emotion recognition were excluded, which reduced the considered pool to 153 papers. Within those papers, 142 were related to the Scopus search, and 89 were from Scholar, with an overlap of 78 papers. A manual review process was adopted for the final exclusion by scanning the papers’ abstracts. Exclusion criteria regarded all results that were not conference or journal papers actually related to the specific RQ. The age of the study population considered in this work ranged from 0 to 36 months. After the review process, performed in accordance with [35], 98 papers were included in the present study (Fig. 1). The resulting papers were analyzed and grouped based on their experimental applications and operative RQs. A separate paragraph was dedicated to each RQ to ensure that the appropriate literature was accurately covered and suitable to respond to the specific RQ.
3 Results
Over the last several years, there has been increasing interest in the field of human-robot interaction (HRI) due to the growing usage of robots not only in industrial fields, but also in other areas such as schools [36], homes [37], hospitals [38], and rehabilitation centers [39]. Therefore, research in HRI has begun exploring people’s perceptions of, and attitudes toward, robot systems, including the kinds of applications and tasks for which they might be useful [40]; the attribution of competencies on the basis of their physical appearance [41]; and the relationship between the robot’s physical appearance and its behavior, or the effect of its human-likeness [42]. Within HRI, IRI holds a peculiar place. Indeed, the interaction between infants and robots can be very different from the interaction between adults and robots due to the infant’s continuing neuro-physical and mental development [33, 43]. A fascinating challenge in IRI is to implement a human-like interaction in which the choice of the robot’s actions is based on the infant’s behavior. To this purpose, robots should be endowed with the ability to assess the infant’s affective state and determine whether or not to take action based on the infant’s emotional behavior. Besides, making robots respond spontaneously and sociably to humans implies that robots should have a degree of sensibility to human emotions [44]. The methodologies for infant affective state recognition relevant to IRI studies are detailed in Sect. 3.1, while the primary research areas of affective computing in IRI applications are discussed in Sect. 3.2. Specifically, the following paragraphs describe the results obtained for each RQ.
3.1 Technologies used to Evaluate Infants’ Affective State and their Relevance to IRI Studies (RQ1)
The strategies adopted for infant affective state recognition relevant for IRI studies are mostly focused on non-invasive techniques that can automatically detect emotions. Behavioral analysis, such as facial expression evaluation, and non-invasive physiological signal analysis are the most commonly used approaches in this regard.
3.1.1 Infants Affective Computing through Behavioral Analysis
Understanding infants’ affective state is important to researchers, pediatric professionals, and parents alike. In fact, because infants cannot verbally report on their emotions, understanding their affect has been considered by many to present a significant problem [45]. Infants communicate with the outside world through their facial expressions, hand gestures, body action, sounds [46], and signed speech [47]. Therefore, these are important information sources to detect their emotions and needs. To study infant emotion, researchers have traditionally relied upon observable features. Observable, behavioral indicators of emotion regulation strategies in infants include facial expressions, attention shifting (e.g., gazing at or away from a stimulus), engagement with objects, and self-soothing (e.g., rubbing face or thumb-sucking) [48]. Indeed, research on emotional development in infancy has been mostly based on facial expression analysis and had greatly benefited from the use of video analysis and coding systems [49]. Literature research on the most widely used coding systems designed to analyze infants’ affective state led to three results, namely the Monadic Phases Coding System [50], the Maximally Discriminative Facial Movement Coding System (MAX) [51], and the Facial Action Coding System (FACS) [52]. Their applications result in similar affect codes, but the three systems are quite different procedurally. The Monadic Phases System assesses infant affect by combining information about facial and vocal affective expression, gaze, posture, and type of action [53]. The affect categories derived from MAX, and FACS, on the other hand, are based on facial expression only. Both these coding systems rely on facial anatomy. In detail, FACS operates under the assumption that emotions activate micro-expressions, resulting in subtle changes in facial muscles activity [54]. For this purpose, it defines individual components of muscle movement, i.e. the Action Units (AU) [55]. 
Specifically, it distinguishes 30 AUs. AUs are reliably associated with distinct emotions [54], based on the six universal facial expressions (happiness, anger, disgust, sadness, fear, surprise). By contrast, MAX differentiates three anatomical facial regions (forehead and brows, midface, and mouth) and discriminates among types of movement within each region [50]. Combination rules convert facial movement codes into expressions of affect. These three systems are widely used to investigate similar research problems, for example, the relation between affective expression and infants’ age, gender, and birth status. Interesting results on gender differences in infants’ emotional expression showed that boys have higher levels of arousal than girls in infancy, and boys show less language ability and inhibitory control than girls [56, 57]. Also, a large number of studies have focused on the type and direction of influence between mothers’ and babies’ affective expression [58,59,60,61] and on infant affective responses to experimental conditions [62,63,64]. Furthermore, facial expression analysis was also used to differentiate discrete infant emotions [65]. This latter analysis was commonly based on the MAX and FACS systems rather than the Monadic Phases System. Figure 2 shows examples of infants’ facial expressions associated with different discrete emotions such as surprise (a), fear (b), sadness (c), and happiness (d).
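The combination rules described above, which map coded facial movements to discrete emotions, can be illustrated as a simple prototype-matching step. The sketch below is our own illustration, not part of any cited coding system: the AU sets follow commonly cited EMFACS-style prototypes for the six basic expressions, but exact AU combinations vary across coding manuals.

```python
# Illustrative sketch: matching observed FACS Action Units (AUs) to prototype
# AU sets for the six basic emotions. The prototypes below are commonly cited
# EMFACS-style approximations, NOT an official mapping from the FACS manual.
AU_PROTOTYPES = {
    "happiness": {6, 12},        # cheek raiser + lip corner puller
    "sadness":   {1, 4, 15},     # inner brow raiser + brow lowerer + lip corner depressor
    "surprise":  {1, 2, 5, 26},  # brow raisers + upper lid raiser + jaw drop
    "fear":      {1, 2, 4, 5, 20, 26},
    "anger":     {4, 5, 7, 23},
    "disgust":   {9, 15, 16},
}

def classify_emotion(active_aus):
    """Return the emotion whose prototype AU set best overlaps the observed AUs."""
    active = set(active_aus)
    best, best_score = None, 0.0
    for emotion, prototype in AU_PROTOTYPES.items():
        # Jaccard similarity between observed and prototypical AU sets
        score = len(active & prototype) / len(active | prototype)
        if score > best_score:
            best, best_score = emotion, score
    return best
```

In practice, human coders (and automated AU detectors, discussed in Sect. 3.1.3) produce the list of active AUs per frame, and a rule set of this kind converts them into an affect label.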
Expressions, particularly when coupled with vocal and postural behaviors, provide valuable clues to the motivational state of infants who are otherwise unable to report what they feel.
Other than facial expression, a behavioral measure that is increasingly being investigated is connected to body gestures and infant motion. Many applications for analyzing humans and their movements have been developed with the advance of commodity depth sensors such as the Microsoft Kinect and its body tracking capabilities. However, Kinect body tracking is limited to persons taller than 1 m [67]. Therefore, systems aiming to automate infant motion analysis have made use of sensors attached to the infant’s body. Other methods overcome these limitations by fitting a simplified body model to the whole body [68] or lower limbs [69] of infants captured by RGB-D devices. The features employed for emotion recognition based on body motion can include absolute or reciprocal positions and orientations of limbs, as well as movement information such as speed or acceleration.
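The speed and acceleration features mentioned above can be sketched as simple finite differences over tracked limb positions. This is a minimal illustration under our own assumptions (positions as a (T, 3) array from an RGB-D tracker sampled at a fixed frame rate); the feature names are hypothetical and not taken from any cited system.

```python
import numpy as np

# Illustrative sketch: kinematic features for motion-based emotion analysis.
# Assumes `positions` is a (T, 3) array of one tracked joint in meters,
# uniformly sampled at `fps` frames per second.
def motion_features(positions, fps=30.0):
    positions = np.asarray(positions, dtype=float)
    dt = 1.0 / fps
    velocity = np.diff(positions, axis=0) / dt       # (T-1, 3) in m/s
    speed = np.linalg.norm(velocity, axis=1)         # scalar speed per frame
    acceleration = np.diff(velocity, axis=0) / dt    # (T-2, 3) in m/s^2
    accel_mag = np.linalg.norm(acceleration, axis=1)
    return {
        "mean_speed": speed.mean(),
        "peak_speed": speed.max(),
        "mean_accel": accel_mag.mean() if len(accel_mag) else 0.0,
    }
```

Such summary statistics, computed per limb and per time window, are the kind of inputs a motion-based classifier could use alongside positional and orientation features.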
3.1.2 Infants Affective Computing through Physiological Signal Analysis
Emotion recognition using physiological signals is one of the branches of affective computing, and several researchers employ bio-signals to estimate people’s affect. Indeed, albeit to a lesser extent, infants’ emotions have also been investigated through the analysis of physiological signals. This is because the ability to control emotional reactions to environmental stimuli develops during the first years of life [48]. The behavioral and cognitive constructs of emotion regulation have been extensively examined in the developmental psychology literature, which claims that personality, social competence, and problematic behavior have their origins in (or are influenced by) early emotional control [70]. Emotional reactivity (i.e., arousal) and the regulation of that reactivity are the two processes that constitute an infant’s emotional experiences [48]. The latency to respond to a stimulus, the intensity of the reaction, and the stimulus threshold required to elicit a response are all terms used to describe reactivity [71]. Internal physiological markers of arousal and regulation are present from early childhood [72]. For this reason, physiological measurements are often needed to better understand emotion regulation in response to environmental challenges. Such measures are mostly focused on autonomic nervous system (ANS) activity. Indeed, the ANS, together with the hypothalamus, regulates pulse, blood pressure, breathing, and arousal in response to emotional cues [73].
In addition, studies have indicated that specific patterns of infant physiology, including vagal withdrawal, are predictive of emotion regulation due to their relationship with affective emotional experiences. For example, when infants experienced negative affect in contexts involving interactions with another person, they demonstrated evidence of regulation via vagal withdrawal (i.e., a lower respiratory rate that indicates regulation [74]). One of the most frequently employed paradigms to investigate emotion regulation through physiological signal analysis in infancy is the Still-Face Paradigm (SFP). This paradigm was designed to investigate the role of infants in social interactions as well as infant reactions to simulated depression. The SFP is typically composed of three episodes of face-to-face interaction between an adult and an infant: (i) a normal parent-infant interaction; (ii) the ‘still-face’ episode, when the parent takes on a neutral expression and is no longer responsive to the infant; and (iii) the episode when the parent resumes the interaction with the infant [75]. Physiological changes associated with arousal and regulation can also be examined during the SFP. Many studies on infants’ arousal have relied solely on infant heart rate recorded through electrocardiogram sensors [76, 77]. However, other studies have used thermal infrared imaging techniques to assess infants’ arousal based on skin temperature variation. Such studies relied on research showing that thermal variation can be measured at 6 months of age and autonomic changes can be inferred [78]. Indeed, the skin temperature profile is subject to various influences, including cutaneous blood flow, local tissue metabolism, and the sudomotor response, all of them in turn controlled by the ANS [79]. Consequently, advances in thermal IR imaging technology have allowed monitoring infant autonomic functions and inferring psychological and affective states.
Moreover, since thermal infrared imaging is a non-invasive and contactless technique, it permits ecologically valid settings in which infant participants are free to move without restriction. For this reason, it is especially valuable for observing emotions in infancy research, since infants are difficult to engage in strictly controlled experimental settings. Aureli et al. assessed the nose-tip temperature variation in 3- to 4-month-old infants in order to explore the natural human process of attachment between baby and mother and the effects of the SFP [80]. Behavioral data were also collected. The findings confirmed a parallelism between physiological and behavioral responses: infants exhibited no signs of stress or discomfort during the still-face episode and no drop in facial temperature, which is considered a sign of stress or anxiety. In contrast, a temperature increase was recorded, supporting parasympathetic system activation as a result of infants’ greater interest in the surrounding environment. This was also confirmed by the behavioral evidence, which revealed that children directed their attention outward because of the interruption of the interaction with their mothers. Moreover, researchers used thermal IR imaging to measure facial skin temperature as an index of mental stress in 8- to 15-week-old infants when they were separated from their mothers [81]. Nakanishi and Matsumura investigated changes in facial skin temperature in 2- to 8-month-old infants when they were laughing, as a typical behavior of pleasant and joyful emotion [82].
Besides skin temperature variations, physiological changes in response to emotion also occur in parameters controlled by the ANS, such as blood pressure, heart rate, electrodermal activity, pupil dilation, and respiration rate, which are not directly recognizable by human observers. Cirelli et al. focused on emotional arousal and emotional regulation in infancy by employing skin conductance measurement [83]. Bainbridge et al., by measuring heart rate and pupil dilation, demonstrated that infants relaxed in response to unfamiliar foreign lullabies [84]. Heart rate is widely used for infant arousal detection through a comparison of the sympathetic and parasympathetic frequency bands of the signal. However, it is highly dependent on the body position during monitoring [85]. Minagawa-Kawai et al. studied the emotional attachment between mothers and infants [22], and Parsons et al. between parents and infants [23], both using functional near-infrared spectroscopy (fNIRS). This technique enables the measurement of the localized hemodynamic response in infants.
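The band comparison mentioned above for heart-rate-based arousal detection can be sketched as a ratio of spectral power in the low-frequency (LF, sympathetically influenced) and high-frequency (HF, parasympathetic/vagal) bands. This is a minimal illustration under our own assumptions: the band limits are the conventional adult HRV values (infant-specific bands differ), and the input is a uniformly resampled heart-rate signal rather than raw RR intervals.

```python
import numpy as np

# Illustrative sketch: LF/HF band-power ratio of a heart-rate signal as a
# crude sympathovagal index. Assumes `hr` is uniformly sampled at `fs` Hz.
# Band limits follow standard adult HRV conventions, used here for illustration.
def lf_hf_ratio(hr, fs=4.0, lf=(0.04, 0.15), hf=(0.15, 0.40)):
    hr = np.asarray(hr, dtype=float)
    hr = hr - hr.mean()                            # remove the DC component
    freqs = np.fft.rfftfreq(len(hr), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(hr)) ** 2 / len(hr)   # simple periodogram
    lf_power = psd[(freqs >= lf[0]) & (freqs < lf[1])].sum()
    hf_power = psd[(freqs >= hf[0]) & (freqs < hf[1])].sum()
    return lf_power / hf_power if hf_power > 0 else float("inf")
```

A higher ratio is typically read as a shift toward sympathetic dominance (arousal), a lower one toward parasympathetic (vagal) activity, though any real analysis would require artifact rejection and infant-appropriate band limits.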
3.1.3 Relevance of the Affective Computing Techniques Based on Behavioral and Physiological Data in the IRI Field
The ultimate goal of IRI and HRI, in general, is to have robots interact socially, which also means having the capacity of generating coordinated and timely behaviors predicated on their social surroundings. Therefore, the recognition of the emotional reaction of the human interlocutors also needs to be performed in a timely manner. This aspect has a crucial impact on the selection of the emotion recognition technique to use for this purpose.
Researchers have usually focused on observable characteristics such as facial expressions or gaze analysis to assess infants’ emotions. Indeed, expressions, especially when combined with vocal and postural clues, provide valuable information on the affective state of infants who are unable to convey their feelings otherwise. As a result, emotion recognition based on infants’ facial expression analysis has become highly influential [86]. However, some limitations to applying this technique in the IRI field have been identified. For instance, the processing may be very time-consuming. Indeed, in most of the studies analyzed, emotions or facial expressions were encoded using stop-frame video [51]. Moreover, behavioral analysis needs to be performed by more than one observer in order to ensure inter-observer reliability, thus making measurement accuracy a critical concern [52]. By contrast, daily-life scenarios such as those of IRI applications require real-time responses from the sensors of interest. Automatic data recording and processing are therefore preferred. Besides, conventional recognition methods using facial images may lack recognition accuracy since they are not universal and depend on culture, gender, and age [87]. Automated approaches to assessing facial action are a potential solution to the difficulties of manual coding. Nowadays, these automated approaches are an active research topic in the field of computer vision and machine learning and often involve collaborations between computer scientists and psychologists [88]. Hammal et al. reported that automatic coding of AUs showed moderate to strong reliability with manual coding [89]. Yet, lighting conditions and auditory noise also make these techniques challenging to implement in a real-world environment [90].
Even if this automated approach is at an early research stage for infants, the introduction of advanced image processing techniques, such as convolutional neural networks, into the IRI field encourages and fosters its improvement [88, 89].
On the other hand, physiological channels can deliver reliable, crucial, and timely information about infants’ emotional symptoms and emotional responses to environmental stimuli. They are, however, mostly obtained through contact sensors [91]. Direct contact with the person’s skin requires the ability and willingness to properly wear the device. Moreover, the time required for attaching sensors to infants would not be negligible, and this could be a source of distress for the infants, further complicating the conduct of IRI studies. Even if easily wearable sensors were developed, they would still have to meet various technological requirements (e.g., reliability, robustness, availability, and quality of data), which are often very difficult to satisfy. Therefore, although emotion recognition using physiological signals has gained momentum over the last decade, this approach is still hardly applicable in the IRI field. Indeed, for IRI applications it is important that these measurements do not interfere with the infants’ activity. For this reason, nonintrusive measurements are required, which can be performed without additional cooperation from the infant or even without the subject’s awareness of the measurements. A suitable technique that fits this requirement well is thermal IR imaging, since it records physiological signals remotely. However, advancements in hardware and signal processing would be required for this technique to overcome the barriers related to its use with infants in real-life circumstances [92].
Another important aspect to consider when selecting sensors suitable for IRI is their camera angle. Indeed, IRI applications primarily require an ecological environment in which the child is free to move without the restriction of staying within the camera angle of the sensors. The field of view (FOV) of an optical sensor (i.e., the maximum area of a sample that a camera can image) is related to the focal length of the lens and the sensor size. The sensor size is determined by both the number of pixels on the sensor and the size of the pixels. Whereas some visible-light cameras can cover an overhead camera angle, thermal imaging cameras’ FOV values range from 7° to a maximum of 80°. Although these values are suitable for a wide range of applications, in studies that require a wider FOV (e.g., an overhead camera angle), the combined use of multiple sensors is preferable.
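For a rectilinear lens, the angular FOV along one sensor dimension follows directly from the focal length f and that dimension d: FOV = 2·arctan(d / 2f). A minimal sketch of this relation is given below; the sensor width and focal length are illustrative values, not taken from any specific camera model.

```python
import math

def fov_deg(sensor_dim_mm, focal_length_mm):
    """Angular field of view (degrees) along one sensor dimension,
    assuming a rectilinear lens: FOV = 2 * atan(d / (2 * f))."""
    return math.degrees(2 * math.atan(sensor_dim_mm / (2 * focal_length_mm)))

# Illustrative (hypothetical) values: a 10.88 mm-wide thermal sensor
# behind a 13 mm lens yields a horizontal FOV of roughly 45 degrees
print(round(fov_deg(10.88, 13.0), 1))
```

The same relation shows why wide overhead coverage is hard for thermal cameras: shortening the focal length widens the FOV but trades off spatial resolution on the infant’s face, which motivates the multi-sensor setups mentioned above.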
3.2 Assessing the Main Research Areas of Affective Computing in IRI Applications (RQ2)
As robots move into more infant-centric environments, methods to develop robots that can spontaneously interact with infants are required. Robots must be capable of coordinated, timely behavior in response to the social context and their interlocutor’s affective states in order to interact effectively with human users. This would require testing in the real world and addressing multidisciplinary challenges. Moreover, based on the analysis conducted on the papers reported in this section, to achieve a successful interaction with infants, robots need to pursue two main goals: (1) capturing and (2) maintaining infants’ engagement. To this end, researchers have attempted to identify modes through which robots can recognize infants’ emotional states in order to socially and spontaneously interact with them. In fact, studies revealed that the success of robot acceptance lies in its capability to act as a social entity as well as in its adaptability to differentiate behavior within appropriate response times and tasks [93]. At the same time, emotion recognition is a challenging task, especially when performed in a real-life IRI situation, where the scenario may differ significantly from the controlled environment in which most recognition experiments are conducted [94]. Based on the literature survey’s outcome, the IRI applications that adopt emotion recognition techniques can be divided into two research fields: (1) the use of robots for infant rehabilitation or skill improvement, also known as assistive robotics for infants, and (2) the use of social robots for infant interaction, which aims to investigate infants’ perception of the robotic systems. Each of the two major topics is covered in the following sections and summarized in Table 1.
3.2.1 Assistive Robotics for Infants
Robotic technologies, and especially assistive robotics for infants, are a growing area of research and can prove to be fundamental for infants with all kinds of disabilities. Indeed, it has been found that infants are often attracted to robotic devices, and that such technology may enable children with physical disabilities to play and exercise, and facilitate learning in those who have cognitive challenges [95]. For instance, motion demonstrations from humanoid robots have several unique advantages for studying infant motion adaptation compared to classical techniques [96]. Since infants prefer face-to-face interactions [97], interactive humanoid robots may capture and maintain the attention of infants longer than inanimate toys do. Furthermore, small humanoid robots can produce motions similar to those of infants. This ability may help the robot to inspire infants to imitate desired patterns of motion. This assumption was tested and proven effective by Kokkoni et al., who developed a pediatric learning environment using two socially assistive robots aimed at delivering motor interventions. The results of the study revealed that the robots were able to facilitate and encourage mobility in young children through play-based interaction [98]. Fitter et al. demonstrated that 6-to-8-month-old infants imitate robot motion and that robot rewards motivate infants to move in particular ways [8]. This suggests the potential role of the robot in teaching and reinforcing infants’ motion. These findings are indeed grounded in past infant behavior research that highlighted imitation and contingency learning as two infant behaviors that could be exploited to encourage robot-based motor interventions [99]. Likewise, Pulido et al. showed that the robot was able to encourage the infant to reach higher acceleration in their movement in order to get better rewards from the robot [100].
This finding demonstrated that the robot’s physical embodiment and ability to provide various reward types helped it motivate infants and keep their attention longer than other therapeutic tools. The infant-like size and humanoid anatomy of the robot also allowed it to fit in the infant’s visual field and demonstrate motions that an infant can imitate. Galloway et al. employed a mobile robot to provide the first experiences of self-generated long-distance mobility to infants with special needs. The authors showed that even without training, both typically developing infants and infants diagnosed with Down Syndrome were able to independently move themselves using a mobile robot [101]. Chen et al. furthered this result by training 26–34-month-old infants with special needs, sitting on a mobile robot, to navigate and avoid obstacles [102,103,104]. These observations may provide new insights and practical guidance for future assistive robotics interventions for infants. Moreover, researchers outlined social robots as a tool to support the diagnosis, treatment, and understanding of developmental disorders such as autism [105].
3.2.2 Social Robotics for Infants’ Interaction
This second research area in IRI focuses on the nature of the interaction between infants and robots by posing different questions, such as: (i) how do infants perceive a robotic system, and (ii) how does this relate to the forms of contingent interaction they are able to undertake with the system [106, 107]? These studies demonstrated that infants’ perception and categorization of a robot emerged and changed step by step depending on the form of interaction [106]. Infants’ attitude toward robots appeared to change based on the robot’s interaction as well. In fact, Funke et al. demonstrated that infants were more likely to be alert and engaged when the robot pursued an active interaction than when the robot was not active [96]. Furthermore, the robot’s social-communicative interaction capability also plays a key role in mediating infants’ behavioral state and their perception of the robot. Indeed, Meltzoff et al., working with a group of 64 18-month-old infants, showed that it is not just the appearance of the robot, its physical features, or even how it moves, but how it interacts with and reacts to others that matters to infants and drives their perception of the robot [12]. These findings can have implications for the future design of humanoid robots and the field of social robotics in general. In addition, Michaud et al. designed a spherical robot, Roball, intending to study the impact of robotic interaction on infants and to investigate the potential role of the robot in contributing to the development of their language, affective, motor, intellectual, and social skills [95, 108]. The robot was able to move autonomously and generate various interplay situations. Although the purposes and design of the study were interesting, the authors concluded that conducting trials with infants and robots is highly challenging. Inconclusive results can occur, especially because infants’ unpredictable mood can influence the interaction [108].
This underscores the need to further investigate the affective state of the infant while interacting with a robot. Finally, two studies that emerged from the literature survey were related to an innovative system called RAVE (Robot AVatar thermal Enhanced language learning tool). This is a dual-agent system that uses a virtual human and a physical robot to engage 6-to-12-month-old deaf infants in linguistic interactions (Fig. 3) [78, 109]. The tool was endowed with a perception system that could estimate infant attention and engagement through thermal IR imaging and eye tracking. It was intended to be an augmentative learning tool to facilitate language learning, in particular visual language, during one widely recognized critical developmental period for language (ages 6–12 months [110]). To this end, thermal IR imaging was used to determine the infants’ emotional arousal and attentional valence during the interaction with the artificial agents. In detail, the authors differentiated five discrete values of infant engagement: very negative (sustained decrease in attention), negative (non-sustained decrease in attention), very positive (sustained increase in attention), positive (non-sustained increase in attention), and None, indicating that no reliable signal was detected from the infant [78]. The robotic platforms used for infant interaction purposes are listed in Table 1.
3.2.3 Metrics Used for Affective Computing
By reviewing the studies reported in Sects. 3.2.1 and 3.2.2, the metrics adopted to verify whether the robots have succeeded in their purpose of identifying and eliciting infant engagement can be summarized into three types of measures, related to both behavioral and physiological analysis. In detail, emotion recognition based on behavioral analysis relies on the eye gaze metric and on infant imitation occurrence.
Concerning eye gaze, the studies analyzed revealed that robot eye gaze has been successful in acquiring visual attention across various settings. Furthermore, most of the research in IRI has employed eye gaze and looking time as indices to evaluate infants’ engagement with and attention to the robot. In addition, since infants may direct their attention more quickly to stimuli they previously found interesting [111], more complex or surprising stimuli may be more successful at acquiring and maintaining infant visual attention. Therefore, the experimental protocols of the analyzed studies often incorporate some sort of surprising action.
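Operationally, looking time can be estimated from eye-tracker samples as the fraction of gaze points falling inside an area of interest (AOI) around the robot. The sketch below is a simplified illustration of this metric; the gaze coordinates, AOI bounds, and sampling rate are hypothetical.

```python
def looking_time(gaze_points, aoi, fps=30):
    """Return (fraction, seconds) of gaze samples inside the robot's AOI.
    `aoi` is an axis-aligned box (x_min, y_min, x_max, y_max) in pixels."""
    x0, y0, x1, y1 = aoi
    hits = sum(1 for x, y in gaze_points if x0 <= x <= x1 and y0 <= y <= y1)
    return hits / len(gaze_points), hits / fps

# Hypothetical gaze samples (pixel coordinates) and a robot AOI
samples = [(320, 240), (330, 250), (620, 100), (315, 245), (50, 400)]
ratio, seconds = looking_time(samples, aoi=(300, 220, 360, 270))
print(ratio)  # 3 of 5 samples fall inside the AOI -> 0.6
```

Real protocols typically add a minimum-fixation-duration filter before counting a sample as a “look”, but the proportion-of-samples form above is the core of the looking-time index.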
Infant imitation occurrence is another commonly employed metric to evaluate robot performance. It mainly concerns the child’s ability to imitate the action that the robot performs. In fact, since motion demonstrations from a humanoid robot have several unique advantages for studying infant motion adaptation, this metric is widely used in IRI. Accelerometer data, as well as motion analysis of the infant’s action inferred through video analysis, are the measures used to evaluate the infant’s completion of the action.
On the other hand, emotion recognition based on physiological signals in the IRI field currently relies mostly on facial skin temperature modulation. Such a metric is used to evaluate infants’ engagement with the robot [78]. Thermal sensing has recently been used to identify distinct thermal patterns related to subtle changes in the infant’s internal state. For instance, thermal feedback indicating an increased level of infant distress was found to be consistent with a temperature decrease in particular regions of interest; conversely, an increase in temperature was linked with infants’ interest and social engagement [109]. Specifically, Gilani et al. used the nose tip’s average temperature, extracted in real time from each frame, to derive information about the infant’s affective state [78]. The study revealed, for the first time, insights into infants’ psychophysiological responses to artificial agents such as robots. The metrics used in the analyzed studies are listed in Table 1.
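To make the five discrete engagement values concrete, the sketch below maps a window of nose-tip temperature readings onto labels by the sign of the overall change (increase linked to engagement, decrease to distress, as in the studies above) and by whether the change is sustained. The thresholds, window length, and function name are illustrative assumptions, not the actual RAVE implementation.

```python
def classify_engagement(temps, delta=0.1, sustain=5):
    """Map a window of nose-tip temperatures (deg C) onto the five discrete
    engagement values described in [78]. ASSUMED thresholds: `delta` is the
    minimum overall change treated as a reliable signal; `sustain` is how
    many trailing steps must all move the same way to count as sustained."""
    if len(temps) < sustain + 1:
        return None  # too few samples: no reliable signal
    change = temps[-1] - temps[0]
    if abs(change) < delta:
        return None  # change too small: no reliable signal
    # Consecutive differences over the last `sustain` steps
    diffs = [b - a for a, b in zip(temps[-sustain - 1:], temps[-sustain:])]
    sustained = all(d > 0 for d in diffs) or all(d < 0 for d in diffs)
    if change > 0:  # warming: interest / engagement
        return "very positive" if sustained else "positive"
    return "very negative" if sustained else "negative"  # cooling: distress

window = [34.0, 34.1, 34.2, 34.3, 34.4, 34.5]  # hypothetical readings
print(classify_engagement(window))  # steady warming -> "very positive"
```

A robot controller could poll such a classifier once per frame window and adjust its behavior whenever the label turns negative, which is the closed-loop use case envisioned for IRI.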
4 Discussion
An intriguing challenge in the field of IRI is the possibility of providing robots with emotional intelligence in order to make the interaction more genuine and spontaneous. A crucial aspect in achieving this is the robots’ capacity to infer and interpret infants’ emotions. Emotion recognition has been widely investigated in the broader fields of HRI and affective computing. This review reported on emotion recognition techniques designed for infant studies, with particular regard to the IRI context. Our aim was to review currently adopted emotion recognition and robot interaction modalities for the infant population and to offer our point of view on future developments and critical issues. The following paragraphs summarize considerations on the research questions addressed in this study and provide suggestions for future improvement. Finally, ethical concerns of IRI are also discussed.
4.1 Discussions on the RQs Addressed
RQ1 (i.e., examining infants’ affective state recognition techniques and their relevance to IRI studies) helped us to identify relevant aspects of the affective computing techniques used in the literature that are useful for IRI applications. In detail, the most common and natural way to observe and recognize emotions in infants is the analysis of facial expressions. Quantitative evaluation of observational data typically consists of manual coding on a second-by-second basis, so that statistical techniques can be applied to data gathered from a test population. However, spontaneous robotic interaction would benefit from on-line emotion recognition, making automated methods preferable. Automated approaches for infants’ facial expression identification are currently being developed. In particular, two of the studies reviewed in this paper used automated methods, achieving average accuracies of 80% and 81% in recognizing facial action units [88, 89]. Although such automated approaches have not yet been employed in IRI studies, their use is encouraged. In addition, the contribution of thermal imaging in this area should not be underestimated: it provides information about physiological parameters associated with the infant’s affective state in real time and without contact.
From RQ2 (i.e., assessing the main research areas of affective computing in IRI applications) it is possible to highlight that robots have been designed to interact with infants in a manner consistent with human psychology and following the guidelines and rules of social interaction. The studies reviewed demonstrated that the interaction between robots and infants can prove highly effective in healthcare, especially in robot-assisted therapies. Besides, it has been shown that infants have the ability to engage with robots and follow their gaze, based not so much on the robot’s appearance as on its capacity to interact with others [12]. However, still little is known about infants’ affective states during the interaction with a robot or a social agent.
The metrics used to detect infants’ engagement and affective states in IRI applications turned out to be based on behavioral indices, such as eye gaze or motion analysis, and on physiological cues, such as the skin temperature modulation that can be revealed by thermal IR imaging. Indeed, compared to other techniques for emotion recognition through physiological signals, thermal IR imaging has the advantage of collecting thermal signals remotely. Such a peculiarity would enable it to be integrated into a wide range of IRI settings and applications. As for the behavioral index, eye gaze is reliable and quick to process, and therefore suitable for real-time assessment. Moreover, such a metric can be easily integrated with other technologies used to recognize infants’ affective states. Combining different modalities would promote the future development of affective computing in the IRI field. Indeed, whereas the existing literature tends to evaluate each of these algorithms in its own metric space, emotion recognition quality would benefit from global-level markers. Table 2 summarizes the focal points deduced from both RQs as well as the suggested future directions.
5 Research Challenges and Open Problems
The present study aimed to provide new perspectives towards a successful interaction between infants and robots. In this regard, the future vision is to endow the robot with the capability of assessing the infant’s affective state based on real-time emotion recognition techniques. Indeed, affective state recognition capability would provide robots with a degree of “emotional intelligence” that would permit more meaningful and spontaneous IRI. However, some open problems were identified in the reviewed studies, which highlighted that emotion recognition in IRI is currently still a challenge for many robotic applications.
Those problems mostly apply to the real-world IRI scenario, which is quite different from the laboratory setting. Previous studies used an experimental methodology and paradigm within which affective computing was performed without being embedded into a realistic IRI environment. While this approach can provide a baseline metric, the drawback is that it may not be representative of real-world IRI. The development of reliable and accurate metrics to be used as a gold standard for affective computing in IRI could potentially allow overcoming such issues.
A second open problem is the accuracy achieved in infants’ emotion recognition and the time affective algorithms take to produce the emotional outcome. Real-world applications indeed require timely and accurate analysis. To increase accuracy, a multimodal information fusion system, rather than a single emotion recognition technique, might be suggested as a future perspective. Indeed, advanced robots will benefit from integrating capabilities to detect and interpret infants’ emotions, motions, gestures, and sounds in order to accomplish tasks in synergy with infants, or humans in general. Data fusion for visual tracking has already been used for robotic interactions with adults, involving the development of a real-time system for face/hand tracking and hand gesture identification within the particle filtering framework [112]. Moreover, combining data from infrared cameras with data from visible cameras, in order to integrate behavioral and physiological cues of the human interlocutor’s emotions, was also proven effective in child-robot interaction applications [29]. A depth camera, in addition to infrared and visible cameras, can also be integrated into a multimodal data fusion system to improve the accuracy of motion and gesture analysis [113] and provide accurate cues for real-time infant emotion recognition by the robotic system.
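One simple form such multimodal fusion can take is decision-level fusion: each modality produces its own engagement score, and the scores are combined with reliability weights, skipping any modality whose sensor dropped out. The sketch below is an illustrative minimal version under that assumption; the modality names, scores, and weights are hypothetical.

```python
def fuse_modalities(scores, weights):
    """Decision-level fusion: confidence-weighted average of per-modality
    engagement scores in [0, 1]. Modalities reporting None (e.g., sensor
    dropout or the infant out of frame) are skipped."""
    total, weight_sum = 0.0, 0.0
    for name, score in scores.items():
        if score is None:
            continue
        w = weights.get(name, 1.0)
        total += w * score
        weight_sum += w
    return total / weight_sum if weight_sum else None

# Hypothetical per-modality outputs and reliability weights
scores = {"gaze": 0.8, "thermal": 0.6, "motion": None}
weights = {"gaze": 2.0, "thermal": 1.0, "motion": 1.0}
print(round(fuse_modalities(scores, weights), 3))  # (2*0.8 + 1*0.6)/3 = 0.733
```

Graceful degradation is the point of this design: the fused estimate stays defined as long as at least one modality delivers a reliable signal, which matters in the unconstrained settings IRI requires.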
In addition, when dealing with infants, a successful emotion-aware robot should be able to recognize different emotional states and be prepared to adjust its behavior based on the infant’s needs. As a result, it would be ideal, although challenging, to develop a database of the infants so that the robot can keep track of prior contacts and moods, as these aspects might influence social interactions with the infant. Moreover, a still open problem is the limited amount of infant data currently available for analysis. The database thus developed might potentially also be used to train powerful machine learning algorithms that require a significant amount of data.
Finally, recognizing and classifying infant emotions may be a viable step toward synthetic reproduction of human emotions by the robotic system, which is a current fascinating challenge for most researchers in IRI and developmental robotic fields.
5.1 Future Perspectives on the Developmental Robotics Field
A reliable infant emotion recognition technique would also greatly benefit the developmental robotics (DR) field. Indeed, the DR field aims to develop sensorimotor and cognitive capabilities in robots by drawing inspiration from child psychology and by modeling developmental changes [114]. Inspired by infants’ cognitive processes, some researchers have applied developmental theories to robotics. From these theories, researchers can understand the way infants build their structures of knowledge and develop their behaviors, language, and other complex skills [115], and then let robots learn like human infants. DR is a rapidly growing research area of increasing interest, which constitutes an interdisciplinary approach to robotics located at the intersection of developmental psychology and robotics [116]. It is, indeed, promoted by different driving forces: engineers seeking novel robotics advancements as well as neuroscientists interested in gaining new insight from trying to embed their models into robots [117]. Until now, some advances in neuroscience research have been successfully used for robotic development, such as sensorimotor architectures, motion control strategies, and behavioral models [118, 119]. However, understanding and reproducing the mechanisms of human learning, emotions, and curiosity is still an open debate [120]. At the same time, for advanced robot development, it is essential to endow robots with anthropomorphic and diversified emotions in order to achieve efficient communication with the environment and humans [118]. To this end, endowing the robot with the capability of on-line assessment of infants’ affective state could potentially pave the way for innovative applications in this field.
5.2 Ethical Consideration on IRI
Evaluating the ethical aspects of IRI is crucial when referring to sensitive populations such as infants. Robots for infants are likely to be designed so that their appearance, movements, and interactions encourage infants to form a relationship with them. The main concern in the literature is whether this is a form of deception and whether it is ethically acceptable [121]. Since infants have a strong social drive and lack technology expertise, they are more likely to overestimate the capabilities of robots that have features comparable to those of humans or animals [122, 123]. Although this aspect can positively affect applications such as rehabilitation or pediatric care, it can become worrisome if the interaction between the child and the robot is long-lasting over time. Indeed, there is a risk that such overestimation might result in infants spending too much time with robots. This could reduce the amount of time they spend in the company of a sensitive human caregiver and hamper the development of their understanding of how to interact with friends or other human beings [124]. Similarly, a robot is not an adequate replacement for a parent in terms of emotional relationships [125].
6 Conclusion
This review is intended to provide insight into the benefits and current research challenges of infants’ emotional state recognition methodologies in the IRI field. The robotic system’s ability to recognize infants’ emotions is the key to providing effective and spontaneous infant-robot interaction. Indeed, the studies reported in this review demonstrated that the interaction between robots and infants can be greatly useful in healthcare, especially in robot-assisted therapies [8, 100, 107]. This further highlights the importance of investigating the robot’s influence on infants and the infant’s affective state during robotic interaction. According to the general guidelines derived from the studies reviewed, infant affective state recognition techniques are appropriate for use in the IRI field as long as they provide real-time responses and the sensors used neither interfere with the infant’s activity nor constitute a source of stress for him/her. In this perspective, contactless techniques should be favored. These include eye gaze, facial expression analysis, and thermal IR imaging. All the considered modalities represent promising information sources for future developments. Yet, infant emotion recognition remains a challenge for robots due to the need for accurate and timely outcomes. Future directions on this course include the employment of innovative and accessible technologies, such as depth cameras and smart devices, along with advances in computer vision and machine learning that could lead to rapid development of automated emotion recognition modalities and emotion-aware robots. Furthermore, the development of a robotic platform endowed with a multimodal emotion recognition system that integrates physiological signal analysis, such as thermal imaging, with visible-domain analysis would make a significant contribution to science.
To conclude, this review encourages and outlines guidelines for the use of affective computing in IRI applications and potentially provides support for future developments of emotion-aware robots.
Data Availability
The data presented in this study are available on request from the corresponding author.
References
Arita A, Hiraki K, Kanda T, Ishiguro H (2005) Can we talk to Robots? Ten-Month-Old Infants Expected Interactive Humanoid Robots to be talked to by persons. Cognition 95:B49–B57. https://doi.org/10.1016/j.cognition.2004.08.001
Zviel-Girshin R, Rosenberg N (2018) Child Friendly Robotics. In: Proceedings of the World Congress on Engineering, Vol. 1
Beran TN, Ramirez-Serrano A, Kuzyk R, Fior M, Nugent S (2011) Understanding how Children Understand Robots: Perceived Animism in Child–Robot Interaction. Int J Hum Comput Stud 69:539–550
Toh LPE, Causo A, Tzuo P-W, Chen I-M, Yeo SH (2016) A review on the Use of Robots in Education and Young Children. J Educational Technol Soc 19:148–163
Kahn Jr PH, Kanda T, Ishiguro H, Freier NG, Severson RL, Gill BT, Ruckert JH, Shen S (2012) “Robovie, you’ll have to go into the Closet Now”: children’s Social and Moral Relationships with a Humanoid Robot. Dev Psychol 48:303
Wei C-W, Hung IA, Joyful Classroom (2011) Learning System with Robot Learning Companion for Children to learn Mathematics Multiplication. Turkish Online Journal of Educational Technology-TOJET 10:11–23
Shimada M, Kanda T, Koizumi S (2012) How Can a Social Robot Facilitate Children’s Collaboration? In Proceedings of the International Conference on Social Robotics; Springer, ; pp. 98–107
Fitter NT, Funke R, Pulido JC, Eisenman LE, Deng W, Rosales MR, Bradley NS, Sargent B, Smith BA, Mataric MJ (2019) Socially Assistive Infant-Robot Interaction: using Robots to encourage Infant Leg-Motion Training. IEEE Rob Autom Magazine 26:12–23
Boyd J, Harris J, Smith M, García-Vergara S, Chen Y-P, Howard A (2017) An Infant Smart-Mobile System to Encourage Kicking Movements in Infants at-Risk of Cerebral Palsy. In Proceedings of the 2017 IEEE Workshop on Advanced Robotics and Its Social Impacts (ARSO); IEEE, ; pp. 1–5
Falck-Ytter T, Gredebäck G, von Hofsten C (2006) Infants Predict Other People’s action goals. Nat Neurosci 9:878–879. https://doi.org/10.1038/nn1729
Klein L, Itti L, Smith BA, Rosales M, Nikolaidis S, Mataric MJ (2019) Surprise! Predicting Infant Visual Attention in a Socially Assistive Robot Contingent Learning Paradigm. In Proceedings of the 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN); IEEE: New Delhi, India, October ; pp. 1–7
Meltzoff AN, Brooks R, Shon AP, Rao RP (2010) “Social” Robots are Psychological Agents for Infants: a test of Gaze following. Neural Netw 23:966–972
Brooks R, Meltzoff AN (2008) Infant Gaze following and pointing Predict Accelerated Vocabulary Growth through two years of age: a longitudinal, growth curve modeling Study*. J Child Lang 35:207–220. https://doi.org/10.1017/S030500090700829X
Cavallo F, Semeraro F, Fiorini L, Magyar G, Sinčák P, Dario P (2018) Emotion modelling for Social Robotics Applications: a review. J Bionic Eng 15:185–203
Cen L, Wu F, Yu ZL, Hu F (2016) A Real-Time Speech Emotion Recognition System and its application in Online Learning. In: Emotions, Technology, Design, and Learning. Elsevier, pp 27–46
Kapoor A, Picard RW (2005) Multimodal Affect Recognition in Learning Environments. In: Proceedings of the 13th Annual ACM International Conference on Multimedia; ACM: New York, NY, USA, pp 677–682
Garcia-Garcia JM, Penichet VMR, Lozano MD, Garrido JE, Law EL-C (2020) Multimodal Affective Computing to Enhance the User Experience of Educational Software Applications Available online: https://www.hindawi.com/journals/misy/2018/8751426/ (accessed on 9
Malatesta CZ, Wilson A (1988) Emotion Cognition Interaction in Personality Development: a discrete emotions, Functionalist Analysis. Br J Soc Psychol 27:91–112
Petitto LA, Katerelos M, Levy BG, Gauna K, Tétreault K, Ferraro V (2001) Bilingual signed and spoken Language Acquisition from Birth: implications for the Mechanisms underlying early bilingual Language Acquisition. J Child Lang 28:453–496. https://doi.org/10.1017/S0305000901004718
Valenza G, Lanata A, Scilingo EP (2011) The role of Nonlinear Dynamics in Affective Valence and Arousal Recognition. IEEE Trans Affect Comput 3:237–249
Wu S, Xu X, Shu L, Hu B (2017) Estimation of Valence of Emotion Using Two Frontal EEG Channels. In Proceedings of the IEEE international conference on bioinformatics and biomedicine (BIBM); IEEE, 2017; pp. 1127–1130
Parsons CE, Young KS, Murray L, Stein A, Kringelbach ML (2010) The functional neuroanatomy of the Evolving parent–infant relationship. Prog Neurobiol 91:220–241. https://doi.org/10.1016/j.pneurobio.2010.03.001
Minagawa-Kawai Y, Matsuoka S, Dan I, Naoi N, Nakamura K, Kojima S (2009) Prefrontal Activation Associated with Social attachment: facial-emotion recognition in mothers and infants. Cereb Cortex 19:284–292
Wen W, Liu G, Cheng N, Wei J, Shangguan P, Huang W (2014) Emotion Recognition based on multi-variant correlation of physiological signals. IEEE Trans Affect Comput 5:126–140
Filippini C, Perpetuini D, Cardone D, Chiarelli AM, Merla A (2020) Thermal infrared imaging-based Affective Computing and its application to Facilitate Human Robot Interaction: a review. Appl Sci 10:2924
Cruz-Albarran IA, Benitez-Rangel JP, Osornio-Rios RA, Morales-Hernandez LA (2017) Human Emotions Detection based on a smart-thermal system of thermographic images. Infrared Phys Technol 81:250–261
Topalidou A, Ali N (2017) “Infrared Emotions and Behaviours”: Thermal Imaging in Psychology. International Journal of Prenatal and Life Sciences 1:65–70. https://doi.org/10.24946/IJPLS
Puri C, Olson L, Pavlidis I, Levine J, Starren J (2005) StressCam: Non-Contact Measurement of Users’ Emotional States through Thermal Imaging. In: CHI’05 Extended Abstracts on Human Factors in Computing Systems, pp 1725–1728
Filippini C, Spadolini E, Cardone D, Bianchi D, Preziuso M, Sciarretta C, del Cimmuto V, Lisciani D, Merla A (2020) Facilitating the Child–Robot Interaction by endowing the Robot with the capability of understanding the Child Engagement: the case of Mio Amico Robot. International Journal of Social Robotics, 1–13
Belin P, Grosbras M-H (2010) Before Speech: cerebral Voice Processing in Infants. Neuron 65:733–735. https://doi.org/10.1016/j.neuron.2010.03.018
Breazeal C, Scassellati B (2000) Infant-like social interactions between a Robot and a human caregiver. Adapt Behav 8:49–74
Kitchenham B. Procedures for Performing Systematic Reviews
Belpaeme T, Baxter P, De Greeff J, Kennedy J, Read R, Looije R, Neerincx M, Baroni I, Zelati MC (2013) Child-Robot Interaction: Perspectives and Challenges. In Proceedings of the International Conference on Social Robotics; Springer, ; pp. 452–459
Kokkoni E, Arnold AJ, Baxevani K, Tanner HG (2021) Infants Respond to Robot’s Need for Assistance in Pursuing Action-Based Goals. In Proceedings of the Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction; Association for Computing Machinery: New York, NY, USA, March 8 ; pp. 47–51
Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JP, Clarke M, Devereaux PJ, Kleijnen J, Moher D (2009) The PRISMA Statement for reporting systematic reviews and Meta-analyses of studies that evaluate Health Care Interventions: explanation and elaboration. J Clin Epidemiol 62:e1–e34
Tanaka F, Isshiki K, Takahashi F, Uekusa M, Sei R, Hayashi K (2015) Pepper Learns Together with Children: Development of an Educational Application. In Proceedings of the 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids); IEEE, Seoul, South Korea, pp 270–275
Cavallo F, Aquilano M, Bonaccorsi M, Limosani R, Manzi A, Carrozza MC, Dario P (2014) Improving Domiciliary Robotic Services by integrating the ASTRO Robot in an AmI infrastructure. In Gearing up and accelerating cross-fertilization between academic and industrial robotics research in Europe. Springer, pp 267–282
Sancarlo D, D’Onofrio G, Oscar J, Ricciardi F, Casey D, Murphy K, Giuliani F, Greco A (2016) Mario Project: A Multicenter Survey about Companion Robot Acceptability in Caregivers of Patients with Dementia. In Proceedings of the Italian Forum of Ambient Assisted Living; Springer, pp 311–336
Banks MR, Willoughby LM, Banks WA (2008) Animal-assisted therapy and loneliness in nursing Homes: Use of Robotic versus Living Dogs. J Am Med Dir Assoc 9:173–177. https://doi.org/10.1016/j.jamda.2007.11.007
Dautenhahn K, Woods S, Kaouri C, Walters ML, Koay KL, Werry I (2005) What Is a Robot Companion-Friend, Assistant or Butler? In Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems; IEEE, pp 1192–1197
Hegel F, Lohse M, Swadzba A, Wachsmuth S, Rohlfing K, Wrede B (2007) Classes of Applications for Social Robots: A User Study. In Proceedings of the RO-MAN 2007-The 16th IEEE International Symposium on Robot and Human Interactive Communication; IEEE, pp 938–943
MacDorman KF (2006) Subjective Ratings of Robot Video Clips for Human Likeness, Familiarity, and Eeriness: An Exploration of the Uncanny Valley. In Proceedings of the ICCS/CogSci-2006 long symposium: Toward social mechanisms of android science, pp 26–29
Belpaeme T, Baxter P, Read R, Wood R, Cuayáhuitl H, Kiefer B, Racioppa S, Kruijff-Korbayová I, Athanasopoulos G, Enescu V (2012) Multimodal Child-Robot Interaction: Building Social Bonds. Journal of Human-Robot Interaction 1
Rani P, Liu C, Sarkar N, Vanman E (2006) An empirical study of machine learning techniques for Affect Recognition in Human–Robot Interaction. Pattern Anal Applic 9:58–69. https://doi.org/10.1007/s10044-006-0025-y
Camras LA, Shutter JM (2010) Emotional facial expressions in infancy. Emot Rev 2:120–129. https://doi.org/10.1177/1754073909352529
Xuan Z, Hongzhi H, Lina W, Dai JW, Huajuan M (2016) Recognition of Infant’s Emotions and Needs from Speech Signals. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp 4620–4625
Petitto LA, Holowka S, Sergio LE, Ostry D (2001) Language Rhythms in Baby Hand Movements. Nature 413:35–36. https://doi.org/10.1038/35092613
Pruitt MM (2018) The Physiological Mechanisms of Emotion Regulation During Infancy: The Role of Maternal and Paternal Behaviors. Texas Christian University. ISBN 0-355-95982-8
Sullivan MW, Lewis M Emotional Expressions of Young Infants and Children: A Practitioner’s Primer
Matias R, Cohn JF, Ross SA (1989) Comparison of two Systems that code infant affective expression. Dev Psychol 25:483
Izard CE, Weiss M (1979) Maximally Discriminative Facial Movement Coding System. University of Delaware, Instructional Resources Center
Sayette MA, Cohn JF, Wertz JM, Perrott MA, Parrott DJ (2001) A psychometric evaluation of the Facial Action Coding System for assessing spontaneous expression. J Nonverbal Behav 25:167–185
Cohn JF, Tronick EZ (1987) Mother–infant face-to-face Interaction: the sequence of Dyadic States at 3, 6, and 9 months. Dev Psychol 23:68
Tracy JL, Robins RW, Schriber RA (2009) Development of a FACS-Verified Set of Basic and Self-Conscious emotion expressions. Emotion 9:554
Kaiser S, Wehrle T (1992) Automated coding of facial behavior in human-computer interactions with FACS. J Nonverbal Behav 16:67–84
Chaplin TM, Aldao A (2013) Gender differences in emotion expression in children: a Meta-Analytic Review. Psychol Bull 139:735–765. https://doi.org/10.1037/a0030737
Brody LR (1999) Gender, Emotion, and the Family. Harvard University Press, Cambridge, MA
Feldman R (2003) Infant–mother and Infant–Father Synchrony: the coregulation of positive Arousal. Infant Mental Health Journal: Official Publication of The World Association for Infant Mental Health 24:1–23
Beebe B, Jaffe J, Markese S, Buck K, Chen H, Cohen P, Bahrick L, Andrews H, Feldstein S (2010) The Origins of 12-Month attachment: a Microanalysis of 4-Month Mother–Infant Interaction. Attach Hum Dev 12:3–141
Schmidt KL, Cohn JF (2001) Human facial expressions as Adaptations: evolutionary questions in facial expression research. Am J Phys Anthropology: Official Publication Am Association Phys Anthropologists 116:3–24
Feldman R (2007) Parent–infant synchrony: Biological Foundations and Developmental Outcomes. Curr Dir Psychol Sci 16:340–345
Moore GA, Cohn JF, Campbell SB (2001) Infant affective responses to Mother’s still face at 6 months differentially predict externalizing and internalizing behaviors at 18 months. Dev Psychol 37:706
Hart SL, Carrington HA, Tronick EZ, Carroll SR (2004) When Infants Lose Exclusive Maternal Attention: Is It Jealousy? Infancy 6:57–78
Breau LM, McGrath PJ, Craig KD, Santor D, Cassidy K-L, Reid GJ (2001) Facial expression of children receiving Immunizations: a principal components analysis of the child facial coding system. Clin J Pain 17:178–186
Sullivan MW, Lewis M (2003) Emotional expressions of Young Infants and Children: a practitioner’s primer. Infants & Young Children 16:120–142
Field TM, Woodson R, Greenberg R, Cohen D (1982) Discrimination and imitation of facial expression by neonates. Science 218:179–181
Hesse N, Schroder AS, Muller-Felber W, Bodensteiner C, Arens M, Hofmann UG (2017) Body Pose Estimation in Depth Images for Infant Motion Analysis. Annu Int Conf IEEE Eng Med Biol Soc 2017, 1909–1912, https://doi.org/10.1109/EMBC.2017.8037221
Olsen MD, Herskind A, Nielsen JB, Paulsen RR (2014) Model-Based Motion Tracking of Infants. In Proceedings of the European Conference on Computer Vision; Springer, pp 673–685
Serrano MM, Chen Y-P, Howard A, Vela PA (2016) Lower Limb Pose Estimation for Monitoring the Kicking Patterns of Infants. In Proceedings of the 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC); IEEE, pp 2157–2160
Ladd GW (1999) Peer Relationships and Social competence during early and middle childhood. Ann Rev Psychol 50:333–359
Fox NA, Kirwan M, Reeb-Sutherland B (2012) Physiological measures of emotion from a developmental perspective: state of the Science: measuring the physiology of emotion and emotion regulation—timing is everything. Monographs of the Society for Research in Child Development
Suurland J, van der Heijden KB, Smaling HJ, Huijbregts SC, van Goozen SH, Swaab H (2017) Infant autonomic nervous system response and recovery: Associations with maternal risk status and infant emotion regulation. Dev Psychopathol 29:759–773
Morris SS, Musser ED, Tenenbaum RB, Ward AR, Martinez J, Raiker JS, Coles EK, Riopelle C (2020) Emotion regulation via the autonomic nervous system in children with Attention-Deficit/Hyperactivity disorder (ADHD): replication and extension. J Abnorm Child Psychol 48:361–373
Bazhenova OV, Plonskaia O, Porges SW (2001) Respiratory sinus arrhythmia. Child Dev 72:1314–1326
Conradt E, Ablow J (2010) Infant physiological response to the still-face paradigm: contributions of maternal sensitivity and infants’ early Regulatory Behavior. Infant Behav Dev 33:251–265
Shahrestani S, Stewart EM, Quintana DS, Hickie IB, Guastella AJ (2014) Heart rate variability during social interactions in children with and without psychopathology: a Meta-analysis. J Child Psychol Psychiatry 55:981–989
Gunning M, Halligan SL, Murray L (2013) Contributions of maternal and infant factors to infant responding to the still face paradigm: a longitudinal study. Infant Behav Dev 36:319–328
Nasihati Gilani S, Traum D, Merla A, Hee E, Walker Z, Manini B, Gallagher G, Petitto L-A (2018) In Proceedings of the International Conference on Multimodal Interaction (ICMI ’18); ACM Press, Boulder, CO, USA, pp 5–13
Filippini C, Perpetuini D, Cardone D, Chiarelli AM, Merla A Thermal Infrared Imaging and Artificial Intelligence Techniques Can Support Mild Alzheimer Disease Diagnosis. 9
Aureli T, Grazia A, Cardone D, Merla A (2015) Behavioral and facial thermal variations in 3-to 4-Month-Old Infants during the still-face paradigm. Front Psychol 6. https://doi.org/10.3389/fpsyg.2015.01586
Mizukami K, Kobayashi N, Ishii T, Iwata H (1990) First selective attachment begins in early infancy: a study using Telethermography. Infant Behav Dev 13:257–271
Nakanishi R, Imai-Matsumura K (2008) Facial skin temperature decreases in infants with joyful expression. Infant Behav Dev 31:137–144. https://doi.org/10.1016/j.infbeh.2007.09.001
Cirelli LK, Jurewicz ZB, Trehub SE Effects of Maternal Singing Style on Mother–Infant Arousal and Behavior. 9
Bainbridge C, Youngers J, Bertolo M, Atwood S, Lopez K, Xing F, Martin A, Mehr S (2020) Infants Relax in Response to Unfamiliar Foreign Lullabies.
Egger M, Ley M, Hanke S (2019) Emotion recognition from physiological Signal analysis: a review. Electron Notes Theor Comput Sci 343:35–55
Walker-Andrews AS, Krogh-Jespersen S, Mayhew EMY, Coffield CN (2011) Young infants’ generalization of emotional expressions: Effects of Familiarity. Emotion 11:842–851. https://doi.org/10.1037/a0024435
Elfenbein H, Ambady N (2002) On the Universality and Cultural specificity of emotion recognition: a Meta-analysis. Psychol Bull 128:203–235. https://doi.org/10.1037/0033-2909.128.2.203
Tang C, Zheng W, Zong Y, Cui Z, Qiu N, Yan S, Ke X (2018) Automatic Smile Detection of Infants in Mother-Infant Interaction via CNN-Based Feature Learning. In Proceedings of the Joint Workshop of the 4th Workshop on Affective Social Multimedia Computing and first Multi-Modal Affective Computing of Large-Scale Multimedia Data, pp 35–40
Hammal Z, Chu W-S, Cohn JF, Heike C, Speltz ML (2017) Automatic Action Unit Detection in Infants Using Convolutional Neural Network. Int Conf Affect Comput Intell Interact Workshops 2017, 216–221, https://doi.org/10.1109/ACII.2017.8273603
Jafri R, Arabnia HR (2009) A Survey of Face Recognition Techniques. JIPS 5:41–68. https://doi.org/10.3745/JIPS.2009.5.2.041
Jerritta S, Murugappan M, Nagarajan R, Wan K (2011) Physiological Signals Based Human Emotion Recognition: A Review. In Proceedings of the 2011 IEEE 7th International Colloquium on Signal Processing and its Applications; IEEE, pp 410–415
Goulart C, Valadão C, Delisle-Rodriguez D, Funayama D, Favarato A, Baldo G, Binotte V, Caldeira E, Bastos-Filho T (2019) Visual and thermal image Processing for Facial Specific Landmark detection to Infer Emotions in a Child-Robot Interaction. Sensors 19:2844. https://doi.org/10.3390/s19132844
Szabóová M, Sarnovský M, Maslej Krešňáková V, Machová K (2020) Emotion analysis in Human–Robot Interaction. Electronics 9:1761. https://doi.org/10.3390/electronics9111761
Spezialetti M, Placidi G, Rossi S (2020) Emotion Recognition for Human-Robot Interaction: recent advances and future perspectives. Front Robot AI 7. https://doi.org/10.3389/frobt.2020.532279
Michaud F, Salter T, Duquette A, Laplante J-F (2007) Perspectives on Mobile Robots as Tools for Child Development and Pediatric Rehabilitation. Assist Technol 19:21–36
Funke R, Fitter NT, de Armendi JT, Bradley NS, Sargent B, Mataric MJ, Smith BA (2018) A Data Collection of Infants’ Visual, Physical, and Behavioral Reactions to a Small Humanoid Robot. In Proceedings of the 2018 IEEE Workshop on Advanced Robotics and its Social Impacts (ARSO); IEEE, Genova, Italy, pp 99–104
Frank MC, Vul E, Johnson SP (2009) Development of Infants’ attention to faces during the First Year. Cognition 110:160–170
Kokkoni E, Mavroudi E, Zehfroosh A, Galloway JC, Vidal R, Heinz J, Tanner HG (2020) GEARing Smart environments for Pediatric Motor Rehabilitation. J Neuroeng Rehabil 17:16
Mahzoon H, Yoshikawa Y, Ishiguro H (2019) Ostensive-cue sensitive learning and exclusive evaluation of policies: a solution for Measuring Contingency of Experiences for Social Developmental Robot. Front Rob AI 6:2
Pulido JC, Funke R, García J, Smith BA, Matarić M (2018) Adaptation of the Difficulty Level in an Infant-Robot Movement Contingency Study. In Proceedings of the Workshop of Physical Agents; Springer, pp 70–83
Galloway JCC, Ryu J-C, Agrawal SK (2008) Babies driving Robots: self-generated mobility in very young infants. Intel Serv Robot 1:123–134
Chen X, Ragonesi C, Galloway JC, Agrawal SK (2011) Training toddlers seated on Mobile Robots to drive indoors amidst obstacles. IEEE Trans Neural Syst Rehabil Eng 19:271–279
Agrawal SK, Chen X, Ragonesi C, Galloway JC (2011) Training toddlers seated on Mobile Robots to steer using Force-Feedback Joystick. IEEE Trans Haptics 5:376–383
Chen X, Ragonesi C, Agrawal SK, Galloway JC (2010) Training Toddlers Seated on Mobile Robots to Drive Indoors amidst Obstacles, pp 576–581
Scassellati B (2007) How Social Robots Will help us to diagnose, treat, and Understand Autism. Robotics research. Springer, pp 552–563
Pitsch K, Koch B (2010) How Infants Perceive the Toy Robot Pleo. An Exploratory Case Study on Infant-Robot-Interaction. In Proceedings of the Second International Symposium on New Frontiers in Human-Robot-Interaction (AISB)
Kahn PH, Friedman B, Perez-Granados DR, Freier NG (2006) Robotic Pets in the lives of Preschool Children. Interact Stud 7:405–436
Michaud F, Laplante J-F, Larouche H, Duquette A, Caron S, Létourneau D, Masson P (2005) Autonomous Spherical Mobile Robot for Child-Development Studies. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 35, 471–480
Scassellati B, Shapiro A, Traum D, Petitto L-A, Brawer J, Tsui K, Nasihati Gilani S, Malzkuhn M, Manini B, Stone A et al (2018) In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’18); ACM Press, Montreal, QC, Canada, pp 1–13
Petitto LA, Berens MS, Kovelman I, Dubins MH, Jasinska K, Shalinsky M (2012) The “Perceptual Wedge Hypothesis” as the basis for bilingual babies’ phonetic Processing advantage: New Insights from FNIRS Brain Imaging. Brain Lang 121:130–143. https://doi.org/10.1016/j.bandl.2011.05.003
Cohen LB (1973) A two process model of infant visual attention. Merrill-Palmer Q Behav Dev 19:157–180
Brethes L, Lerasle F, Danes P (2005) Data Fusion for Visual Tracking Dedicated to Human-Robot Interaction. In Proceedings of the IEEE International Conference on Robotics and Automation; IEEE, pp 2075–2080
Gao Q, Liu J, Ju Z (2021) Hand Gesture Recognition using Multimodal Data Fusion and Multiscale parallel convolutional neural network for Human–Robot Interaction. Expert Syst 38:e12490
Cangelosi A, Schlesinger M (2018) From babies to Robots: the contribution of Developmental Robotics to Developmental psychology. Child Dev Perspect 12:183–188
Huang K, Ma X, Tian G, Li Y (2017) Autonomous Cognitive Developmental Models of Robots-a Survey. In Proceedings of the 2017 Chinese Automation Congress (CAC); IEEE, pp 2048–2053
Li K, Meng MQ-H (2015) Learn like Infants: a strategy for Developmental Learning of Symbolic Skills using Humanoid Robots. Int J Social Robot 7:439–450
Lungarella M, Metta G (2003) Beyond Gazing, Pointing, and Reaching: A Survey of Developmental Robotics.
Li J, Li Z, Chen F, Bicchi A, Sun Y, Fukuda T (2019) Combined sensing, Cognition, Learning, and control for developing Future Neuro-Robotics Systems: a Survey. IEEE Trans Cogn Dev Syst 11:148–161
Luo D, Hu F, Deng Y, Liu W, Wu X (2016) An Infant-Inspired Model for Robot Developing Its Reaching Ability. In Proceedings of the 2016 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob); IEEE, pp 310–317
Dietrich MK (2019) Doing the Robot: Theorizations of Transduction & Emotional Data Collection.
Sharkey A, Sharkey N (2011) Children, the Elderly, and Interactive Robots. IEEE Rob Autom Magazine 18:32–38
Pitti A (2018) Ideas from Developmental Robotics and Embodied AI on the Questions of Ethics in Robots. arXiv preprint arXiv:1803.07506
Ros R, Nalin M, Wood R, Baxter P, Looije R, Demiris Y, Belpaeme T, Giusti A, Pozzi C (2011) Child-Robot Interaction in the Wild: Advice to the Aspiring Experimenter. In Proceedings of the 13th international conference on multimodal interfaces, pp 335–342
Sharkey N, Sharkey A (2010) The crying shame of Robot Nannies: an ethical Appraisal. Interact Stud 11:161–190
Weber J (2005) Helpless Machines and True Loving Care Givers: A Feminist Critique of Recent Trends in Human-Robot Interaction. Journal of Information, Communication and Ethics in Society
Funding
This research was funded by PON MIUR SI-ROBOTICS, ARS01_01120.
Open access funding provided by Università degli Studi G. D’Annunzio Chieti Pescara within the CRUI-CARE Agreement.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Filippini, C., Merla, A. Systematic Review of Affective Computing Techniques for Infant Robot Interaction. Int J of Soc Robotics 15, 393–409 (2023). https://doi.org/10.1007/s12369-023-00985-3