1 Introduction

Emotion detection has recently become an important research topic. In the last ten years, up to 300,000 papers about emotion detection have been published, according to Google Scholar [22]. Although a part of this research effort is focused on creating emotion detectors, there is also a big effort dedicated to the integration of these detectors into final products in order to improve the user experience. Being able to know how users feel while using a product and, more importantly, being able to change the product’s behavior so that the user experience is the best possible for each specific user, is a powerful tool that was not previously available in the field of human–computer interaction research. Moreover, this information about a person’s emotions is valuable not just for the researchers studying a product’s user experience or users’ behavior, but for the users themselves. Emotional awareness, or the ability to be aware of, and identify, internal emotional states [47], leads us to a better understanding of our own emotions and to being able to better regulate the affect within ourselves and others, which contributes to improving our well-being [45]. Developing our emotional awareness also helps us in building our emotional intelligence, which has been shown to benefit individuals in several dimensions of their lives, including their academic and professional life [34].

Emotional awareness, i.e., the ability to recognize one’s own emotions, and emotional intelligence in general, is an ability that humans learn and develop throughout their lives. However, not every individual is equally able to cultivate this skill. For instance, people with autism spectrum disorder (ASD) find it hard to recognize emotions in others, as well as to understand and handle their own emotions [4]. This impairment (external emotion recognition) is related to their problems with paying attention and distinguishing faces. Fortunately, these emotional-intelligence-related skills can be learnt and trained, as we can see in the existing literature. In [12], Dawson et al. reviewed studies covering a long period of time which show how children with ASD can overcome one of the challenges that people with this disorder confront: facial recognition. Daou et al. [11] also reviewed existing literature to collect studies about teaching emotion expression and recognition to children with ASD, and most of these studies reported positive results. Since ASD is mostly correctly diagnosed during early childhood, the sooner this emotional intelligence education starts, the easier it will be for children to apply this knowledge in adulthood [12]. Technology has proven to be a powerful ally in this learning process. Yeni et al. [56] reviewed studies which used educational mobile applications (that is, applications running on mobile phones or tablets) to teach different abilities to people with intellectual disabilities, showing that not only do they accept technology very easily, but also that they can become fluent using the portable device on which the application is running and learn the skill they are working on with constant practice. Nisiforou et al. [36] took a step forward in this type of literature review by examining works that use technology to teach skills specifically to children with disabilities, showing the popularity of works involving games or educational applications and robots.

Games are, in fact, an effective tool to teach new things to children [37]. Nowadays, when games are used with a purpose that is not simply entertainment, they are referred to as serious games. Serious games aim to promote learning through entertainment, exploiting the cognitive benefits of games to ease the learning process [14]. Games can catch children’s attention, and this also includes children with ASD [24]. However, a game must meet certain criteria to keep this attention, i.e., not every game is equally appealing to every person. Even when a serious game is used in a traditionally serious environment (e.g., a classroom), it must keep its fun component and it must keep the players engaged with it, introducing variability and challenges, ensuring that it is difficult enough to prevent the players from getting bored, but easy enough to avoid frustration. According to the existing literature, serious games have previously been used to teach emotional-intelligence-related skills with positive results [33], and we have based our proposal on this knowledge.

Considering the context described above, we have developed a serious game for Android devices combining affective computing and tangible user interfaces (TUI) [30]. Emotion detection technology is used to detect what emotion the player is expressing, in order to help them develop their emotional awareness, and tangible user interfaces are used because of their well-known beneficial influence on learning [30][43], and to overcome the difficulties that children with ASD experience when using computers.

The game consists of three parts, each one with its own goal and involving a different interaction style in order to use this variability to attract the player’s attention. These different parts are, in turn, three different games:

  • Game 1—Recognition of emotions using TUI. This game starts with players being prompted with a picture depicting an emotion in the app. While keeping this emotion in mind, they must go through a set of physical cards, each of them used as tangible objects by using NFC technology, looking for the card which best represents the emotion prompted. After the player has decided which card best expresses this emotion, they must bring it close to the NFC reader, i.e., the device where the app is running. If the card chosen is the right one, the player is prompted with the next emotion. If the chosen card does not represent the requested emotion, an error message is shown;

  • Game 2—Depiction of emotion. In the second game, the players are prompted again with a picture depicting an emotion. However, in this case they must express this emotion themselves. Using the device’s camera, emotion detection services are used to recognize what emotion the player is expressing with his/her face;

  • Game 3—Recognition of emotions in the wild using TUI. In this last phase, players have to recognize emotions again. However, instead of being prompted with a picture, they are shown a piece of video in which a specific emotion is being displayed. Using the cards from the first game (tangible user interfaces), the players must indicate what emotion is being shown in the video they are watching.
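The card-matching flow shared by Games 1 and 3 can be sketched as follows. This is a minimal illustration written in Python, not the Android code of the application: the class name `CardRound`, its fields and the feedback strings are hypothetical, and the case-insensitive comparison between the prompted emotion and the value read from the card is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CardRound:
    """One session of Game 1 or Game 3 (hypothetical structure)."""

    prompts: List[str]   # emotions to ask for, in order
    position: int = 0    # index of the emotion currently prompted
    errors: int = 0      # number of wrong cards presented so far

    @property
    def finished(self) -> bool:
        return self.position >= len(self.prompts)

    def present_card(self, card_emotion: str) -> str:
        """Handle a card brought close to the reader and return the feedback."""
        if self.finished:
            return "finished"
        target = self.prompts[self.position]
        # Assumption: the value stored on the card is compared
        # case-insensitively with the prompted emotion.
        if card_emotion.strip().lower() == target.strip().lower():
            self.position += 1
            return "finished" if self.finished else "next"
        # Wrong card: stay on the same prompt and report an error.
        self.errors += 1
        return "error"


if __name__ == "__main__":
    game = CardRound(prompts=["Happy", "Sad"])
    for card in ["Sad", "Happy", "Sad"]:
        print(card, "->", game.present_card(card))
```

Keeping the player on the same prompt after a wrong card mirrors the behavior described in Game 1, where an error message is shown and the next emotion is only prompted after a correct answer.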

As was specified by the user requirements, in order to represent emotions in games 1 and 3 we have used pictograms as well as pictures of real people, so players have different references to learn about feelings and how to identify them. We turned these pictures into tangible user interfaces by using NFC tags hidden within them. Previously, these tags were programmed with their corresponding emotion name, e.g., the NFC tag attached to the picture of a woman smiling had the value “Happy” loaded onto it. For the part of the game entailing emotion detection, after reviewing several technologies [19] we chose Affectiva, which offers facial-expression-based emotion detection via a straightforward SDK, and was easy to integrate into the main app. As part of the application usage flow, players have to log in using their credentials (previously, they should have signed up in the app) and then choose the game they want to play.
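How an emotion name such as “Happy” could be stored on, and read back from, a tag can be illustrated with a short sketch. The text above does not state the storage format, so the example assumes that each tag holds an NFC Forum (NDEF) Text record, whose payload consists of a status byte, a language code and the text itself. The function names are hypothetical and the code is plain Python rather than the Android code of the app.

```python
def encode_ndef_text(text: str, language: str = "en") -> bytes:
    """Build the payload of an NDEF Text record using UTF-8 (assumed format)."""
    lang = language.encode("ascii")
    if len(lang) > 0x3F:
        raise ValueError("language code too long")
    # Status byte: bit 7 clear means UTF-8; bits 5..0 hold the language length.
    return bytes([len(lang)]) + lang + text.encode("utf-8")


def decode_ndef_text(payload: bytes) -> str:
    """Extract the text (here, the emotion name) from an NDEF Text payload."""
    if not payload:
        raise ValueError("empty payload")
    status = payload[0]
    lang_len = status & 0x3F
    text = payload[1 + lang_len:]
    if status & 0x80:
        # UTF-16 text: honor a byte order mark if present,
        # otherwise assume big-endian.
        has_bom = text[:2] in (b"\xff\xfe", b"\xfe\xff")
        return text.decode("utf-16" if has_bom else "utf-16-be")
    return text.decode("utf-8")


if __name__ == "__main__":
    payload = encode_ndef_text("Happy")
    print(payload, "->", decode_ndef_text(payload))
```

The decoded string is what the game would compare with the emotion currently prompted.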

The target audience of the application is children aged between 6 and 12 years old with ASD, so it is an audience mature enough to use electronic devices safely but still at an early phase in their education, so the learning of new emotional skills has a greater impact.

Regarding the evaluation of this tool, we carried out a preliminary evaluation of the system with the assistance of experts in the problem domain from the Association “Autism Development” to assess both the teaching capabilities of the tool and its usability. Two specialists from the Association, as well as three children with ASD, participated in this assessment, whose goal was to assess the acceptance of the system as a suitable tool to be used to teach emotions to children with ASD and to test the usability of the first prototype of this application.

It is important to point out that the key contribution of this article lies in the integration of automatic emotion recognition technologies and tangible user interfaces in a serious game in order to obtain a more natural interaction, which is an essential feature especially when working with children with ASD. While the use of games and software applications to teach skills to children with ASD is not something new, tools of this kind usually require the assistance of a therapist, a parent or a caretaker. By using automatic emotion recognition and a more natural interaction mechanism (TUI), we have achieved a very user-friendly application, as the evaluation with experts has shown.

This paper is divided into six sections, including the current one. Section 2 introduces some key concepts that support the decisions taken during the development of the system. We provide a detailed description of the system developed in Sect. 3. Section 4 presents the evaluation process carried out with children with ASD and specialists from the Association. Section 5 reviews the outcomes of the evaluation process. Finally, Sect. 6 presents the main conclusions and lines for future work.

2 Background concepts and related work

2.1 Autism spectrum disorder

According to the Diagnostic and Statistical Manual of Mental Disorders, autism spectrum disorder (ASD), commonly referred to as autism, is a neurodevelopmental disorder characterized by persistent deficits in social communication and social interaction across multiple contexts and restricted, repetitive patterns of behavior, interests, or activities, with these symptoms being shown in the early developmental period [4]. Every case of autism is unique since autism encompasses a whole spectrum: some cases may be mild, while others may be severe with regard to the symptoms. In fact, in 2013 the term ASD became an umbrella term for a set of behavior disorders, namely early infantile autism, childhood autism, Kanner’s autism, high-functioning autism, atypical autism, pervasive developmental disorder, childhood disintegrative disorder, and Asperger’s disorder. Furthermore, since there is no cure [7], early diagnosis is very important, since the sooner this disorder is detected, the sooner the treatment can begin. Treatment includes occupational therapy, applied behavioral analysis, sensory integration therapy, etc. [51]. Again, although autism is not a curable disorder, the aforementioned treatments can help decrease the social deficits associated with ASD.

As part of the diagnosis process, the person must be assessed to establish the presence and/or severity of the symptoms. For instance, in the case of impairments related to communication and social interaction, these symptoms are pervasive in all kinds of social interaction and sustained in time. In order to obtain the most reliable and valid assessment of these impairments, we must gather all the information available: clinicians’ observations, the caregiver’s history, colleagues’ impressions and, when possible, a self-report. Not only do we need to assess impairments in communication, but also deficits in social-emotional reciprocity. Even though people with ASD may be able to communicate correctly from a formal point of view (correct grammar, good vocabulary, etc.), they may still struggle to engage in social interaction due to not knowing what tone or attitude to adopt on each occasion, not understanding how the other person is feeling, avoiding eye contact, etc. Lack of reciprocity is what characterizes social interaction with a person suffering from ASD [4].

Another characteristic symptom of ASD is restricted, repetitive patterns of behavior, activity or movement. Examples of this repetition are repeating movements over and over, aligning or ordering objects in a specific way, the parroting of heard words (echolalia), etc. This repetition also manifests itself through the adoption of routines, the ritualization of certain patterns (doing something by always following the same sequence of tasks), something that in the end evolves into a huge resistance to change. These routines can sometimes be the result of hypo- or hyperreactivity to certain stimuli, that is, an excessive fascination for, or rejection of, something involving taste, smell, texture or appearance, or rituals involving these senses.

In short, poor social skills and emotional instability are inherent in people with ASD, with the severity of these deficits varying to a great extent depending upon the type of ASD the person has. Even though autism (ASD) does not have, strictly speaking, a cure, therapy can help people with this disorder become more independent and improve their social communication and interaction skills. One of the methods applied in therapy is social skill groups. This type of therapy takes place once a week over 12 weeks or more and entails a group of between two and six individuals with ASD being led by one to three therapists. During these sessions, which last from 60 to 90 min, the therapists give a lesson about a specific social skill, including role playing, to practice this skill and promote a discussion about the whole lesson [44]. This form of therapy affects a person’s social functioning by providing a learning environment that allows immediate rehearsal and practice.

In contrast to this form of therapy, other types of therapy focus on improving communication in general, and for this purpose different protocols are deployed. For instance, when the goal of a therapy is speech production, speech imitation protocols are used. However, this approach presents some drawbacks. For example, since the subjects learn to imitate the teacher’s speech in a formal environment, they fail to generalize this new skill to new environments and social interactions [7].

As alternatives to speech imitation, other communication protocols have been used, such as sign language or picture-based communication systems. Nevertheless, like speech imitation, these protocols present several drawbacks, such as the difficulty of learning sign language, or the inaccuracy of picture-point systems. These systems usually fail because they do not consider the point of view of the child. For instance, they assume that once the child knows the word to name something or someone, he or she will be able to use it in all contexts [8], and this is simply not true.

The Picture Exchange Communication System (PECS) proposed in 1994 [8] represented an approach that corrected the flaws of previous communication systems. This system suggests several phases, all of them with their own prompting, reinforcement, and error correction strategies, based on the principles of applied behavioral analysis, to teach spontaneous, functional communication to children. In the course of these phases, children are taught how to communicate using pictograms, to go through their pictograms to find the most suitable image for some answer, how to prompt a social interaction, how to comment on something, etc. [7].

Apart from PECS, other proposals based on showing pictures, particularly pictures of faces, to children with ASD have been made. In [21], Golan et al. developed a children’s animation series called “The Transporters,” which was about eight characters who were vehicles with human faces, to teach children with ASD about facial expressions and emotions expressed through this mechanism. According to the systemizing theory of autism, individuals with ASD have intact, or even enhanced, systemizing skills, which help them understand and analyze rule-based systems, find patterns, and so on. A good example of a rule-based system is vehicles such as trains or cable cars, which only move back and forth along linear tracks, making them a predictable system. This series was built upon the following idea: since the vehicles in the show, which have faces of several actors and actresses expressing emotions attached to them, make up a “safe space,” children will pay more attention to them (even without realizing they are doing so) instead of avoiding the faces, helping them learn about emotional expressions [21]. Faces on vehicles are considered a “safe space” by children with ASD because vehicles are rule-based systems: they are predictable, in contrast to human bodies, which move in unexpected ways. This study has been replicated and reviewed [5, 11, 55], and the results, as well as their generalization, appear to be valid. In our game, we take advantage of this concept, namely the idea of attaching faces to predictable elements, in the form of tangible user interfaces, through PECS, using pictograms together with real photographs, to teach the different emotions to children and to help them generalize this knowledge. In our setup, each physical image becomes a tangible user interface that children will use to tell the app what emotion they have been requested to recognize during the different games.

2.2 Emotions

While they are pervasive in every aspect of our lives and are receiving more and more attention every day, emotions are still difficult to define and classify. If we start tracing the definitions of emotions back in time, we find endless debates and endless definitions. Aristotle proposed his own taxonomy of emotions in 400 B.C. The catalogue of proposals is so extensive that, in 1981, the authors of [28] gathered 92 different definitions of emotion, each one considering a different aspect of the same topic. For now, and following the trends of affective computing over the last few years, we will take an emotion to be a physical reaction of the body, caused by the limbic system, to some event or circumstance. This reaction can be either perceptible for external observers (changes in the tone of voice, facial expressions, body gestures) or imperceptible (heartbeat, electrical brain activity, etc.). We will look at this more closely in the following subsection.

Two of the most popular proposals regarding emotions and their classification were made by Robert Plutchik and Paul Ekman. Robert Plutchik proposed a model based on a 2D/3D “flower” of emotions. In Plutchik’s model, called the wheel of emotions (Fig. 1), every human emotion is a combination of several primary emotions, namely ecstasy, admiration, terror, amazement, grief, loathing, rage and vigilance [42]. Each primary emotion can lead to others, depending on the degree of intensity with which someone feels it. The rest of the emotions are combinations of these primary emotions.

Fig. 1 Wheel of emotions [53]

The other proposal was made by the psychologist Paul Ekman. One of the topics Ekman studied was the universality of emotions. Following the Darwinian view of emotions, Ekman wanted to prove that emotions, or at least a subset of them, were inherent in every human [15]. As part of this study, he developed the Facial Action Coding System (FACS), a system which identifies 42 points on the face, the eyes and the head and uses them to identify an emotion [16, 17]. In this way, a facial expression can be defined by the position of a set of 42 points on the face. By codifying the human face as a set of numbers, FACS opened the door to the creation of emotion detectors based on facial expression using machine learning and automatic classifiers.

In other studies, Ekman found that six emotions were universal to every human being, regardless of culture or education, since they were hardcoded into our DNA, following the Darwinian explanation for the origin of emotions [15]. These universal emotions are joy, sadness, anger, surprise, fear and disgust. However, since this study was published, some researchers have reviewed it and found flaws and gaps in this universality [52]. Some of these studies have even proposed a different number of universal emotions [26]. Despite the imperfections in Ekman’s theory, the six-basic-emotions approach is widely adopted in the affective computing field, where it is the de facto classification system used by emotion detectors to report the emotions they have found.

With regard to children with ASD, in order to get them to express an emotion, they must first learn how to identify it, and this is what games or practice exercises are for. Therefore, the game we have developed has three well-distinguished parts: a first part in which the children learn to identify an emotion, a second part in which they learn how to express it themselves, and, finally, a third part in which they learn to identify these emotions in the wild, in a spontaneous situation.

2.3 Affective computing

Affective computing (AC), as it was defined in 1995, is any form of computing that relates to, arises from, or influences emotions [40]. Although affective computing presents several lines of work, one of the most popular is automatic emotion detection, i.e., the use of automatic classifiers to detect emotions in a voice, in a face, etc. As we mentioned in the previous subsection, in the field of automatic emotion detection, we understand an emotion to be its physical manifestation in the body. Hence, emotion detection encompasses the detection of those physical manifestations and the subsequent analysis of those signals.

In order to read these data, we need different types of sensors, depending on what type of information is going to be collected [19]. For instance, to read someone’s facial expression or body language, we need a camera or some device such as Kinect that allows us to capture images and to track the human body. Along these lines, there are also devices that allow us to track the user’s eyes, which, in the end, means that we can know what the user is looking at, for how long, etc. If we wish to analyze someone’s voice, we need a microphone; to read things such as someone’s heartbeat or muscle activity, we need more invasive tools, such as a wristband with sensors or electrodes placed on the part whose electrical activity we are going to measure, with all this connected to a controller, such as a computer or a Raspberry Pi.

Regarding emotion detection, a new trend has emerged in the field of HCI research that is based on the following idea: what if we could detect how users are feeling while using an application or system and use this information to change the behavior of that system in order to make the users’ experience as good as possible? With this idea in mind, much research work has been carried out over the past few years, and indeed our work forms part of this new trend for combining HCI and AC. However, marketing and user experience (UX) are not the only fields which are applying these AC-related technologies. In [31] the authors review how affective computing technologies and strategies have been applied to improve the lives of children with ASD. For this purpose, computer software that detects behavioral signs of emotions and models emotional functioning was used. Facial expressions, vocalizations, electrodermal activity and affective intelligent tutoring systems have been used to try to help children with ASD overcome their social deficits, and these studies obtained acceptable results, although there is a consensus about the need to continue replicating these experiments in order to actually systematize the application of AC technologies in ASD-related therapies [31, 35].

With regard to the affective computing component included in this work, we have focused on emotion detection based on facial expressions, and we have integrated this kind of detection technology in our software application as part of one of the games. We chose this form of emotion detection over the other ones after reviewing the different emotional channels from which we can read affective information from the users, the different tools needed to gather that information, the availability of these tools and, also, after considering our previous experience with other emotion detectors [20].

The decision to choose facial expression over any other emotional channel was taken on the basis of the “universality” of facial expressions to express emotions, and the relative maturity of this type of automatic emotion detection. In the following subsection, the technologies used in the application developed are described.

2.4 Related work

Prior to the development of the app, we reviewed several off-the-shelf applications of a similar nature. A list of applications developed for children with ASD to help them in their daily activities and to learn about emotions can be found in Table 1. By taking a quick look at the table, we can see that, in this sample, all the reviewed systems are used on a touch-based device, usually a tablet or a smartphone. It is also noteworthy that only one application, Emotionalyser, uses facial recognition, and just a few of them use physical images as part of the game.

Table 1 Comparison of related works

We also reviewed the existing literature regarding serious games applied in the education of children with ASD to find possible gaps in this research field. Reviews such as [33] and [57] reveal the state of the art regarding serious games applied to therapy for children with ASD. For instance, most current proposals are designed to be used on a desktop or a laptop computer. In [57], 40 papers are reviewed and 70 per cent employ serious games running on computers, disregarding more usual or natural interaction mechanisms such as touch screens or tangible user interfaces. With regard to automatic emotion recognition, the use of facial-expression-based emotion detectors is quite scarce, but it is present in some approaches using serious games [18, 25, 27, 32, 39, 46, 49]. It is worth highlighting that most of the reviewed games using emotion recognition use a “homemade” emotion detector instead of resorting to off-the-shelf solutions or open-source software, with the impact in time and cost that this fact has on the process of developing a game.

For the sake of completeness, we decided to also review other studies focused on teaching skills to children with ASD. Artoni et al. [3] developed a web application, namely ABCD SW, consisting of several types of exercises, in such a way that it can be used by the children while a tutor or a parent is monitoring their progress from another device. Studies such as [21] and [57] have used the animated series “The Transporters,” which shows different vehicles such as trains and cable cars with human faces, to teach children with ASD to recognize faces and emotions. These studies try to use the children’s usually intact systemizing abilities to their benefit to help them learn how to look at faces and recognize expressions on them. Another example of a study using multimedia resources is [6], which uses the Mindreading DVD to teach children how to recognize emotions. This DVD is essentially a set of 412 emotions, with each emotion being expressed by 6 different actors and actresses of different ages and cultures.

Our proposal covers some of the gaps found in these applications since it includes a mechanism to automatically check the progress of the children with immediate feedback thanks to NFC and Affectiva technologies. The system proposed in this paper is a serious game which has been developed for Android devices and uses both NFC and emotion detection (Affectiva) technologies. Furthermore, the picture exchange communication system (PECS) has been embedded in the activities of each game implemented in our proposal. As we indicated above, the key contribution of this proposal lies in the integration of different technologies, namely automatic emotion detection and tangible user interfaces, in order to provide an easy-to-use serious game so that therapists working with children with ASD can teach them to recognize emotions with a tool that keeps them engaged and entertained. In the next section, we present a description of the software application developed.

3 System description

EmoTEA is a serious game developed as a mobile application designed to help children with ASD to improve and develop their emotional intelligence, especially emotional skills regarding emotion recognition, whether their own emotions or the ones expressed by someone else. The system has two separate and well-defined parts: firstly, a user management section aimed at the person in charge of those children with ASD (a psychotherapist, their parents or legal tutor, etc.), and secondly, a section which contains the actual games, which is the part the users access after their carer has registered them in the application.

Regarding the multimedia resources used in the development of EmoTEA and shown in the figures used throughout this paper, all the images and videos were taken from [2, 38, 41] and [50], with license of use.

EmoTEA can be defined via three main elements:

  • Target population. The application is aimed at children with ASD aged between 6 and 12 years old, and this age range was set in agreement with the Association “Autism Development.” It is also worth mentioning that the idea of EmoTEA itself arose from a collaboration with said association;

  • Technology. EmoTEA has been developed using tangible user interfaces as the main interaction mechanism, and automatic face-based emotion detection. Besides using Android as the development platform, Affectiva’s SDK and NFC tags were used to enable EmoTEA to recognize emotions on the basis of facial expressions, and to turn physical objects into interfaces the app can interact with, respectively. By including TUIs in the application we take advantage of their beneficial effects in learning settings, while Affectiva gives EmoTEA the power to automatically evaluate users in emotion mimicking games;

  • Target Skills. As stated above, one of the difficulties people with ASD face is the inability to express and/or understand emotions, and to recognize their own ones or the ones expressed by other people. EmoTEA’s main purpose is to tackle this problem, offering exercises to learn how to identify and express basic emotions [15]. These exercises, based on emotion identification and mimicking, seek to help people with ASD to develop their emotional intelligence skills.

The idea of developing this application arose from the collaboration established with the local Association called “Autism Development” and their need to improve the emotional skills of children with autism spectrum disorder. In the course of several meetings, the main requirements of EmoTEA were established, such as the target population, for instance. By performing a critical review of studies on autism, it was observed that the authors in [10] concluded that autism spectrum disorders were usually diagnosed at ages which range from 3 to 10 years old, approximately, with that range shrinking by three years, from 3 to 7 years old, for children with autistic disorders. Furthermore, in [29] the authors state that ASD diagnoses are usually quite stable, i.e., diagnoses made in the early years of the child are rarely mistaken. The authors in [23] analyzed the progress of several children with ASD from their early childhood to their early adolescence, concluding that even when most of the participants did not experience any improvement, there was a small group of them who did. Based on this information, the target population was chosen to be children with ASD aged between 6 and 12 years old, since it is a range of ages in which most of the diagnoses have been made but children have not entered adolescence yet, so it is easier for them to grasp new emotional skills [7, 12]. However, it is important to recall that ASD is a spectrum disorder, and the differences between each patient (intellectual ability, associated symptoms, etc.) can be huge, which may lead us to modify the limits of this range for certain situations. For instance, a child within this age range with extreme symptoms and low cognitive skills may need dedicated help from a psychologist or assistant, and would not be able to use EmoTEA. On the other hand, a child younger than 6 or older than 12 with good intellectual skills may be suitable for the app. This is why the application was developed for children with autism spectrum disorder aged between 6 and 12 years old, but this range of ages may vary.

Regarding the technical development of EmoTEA, tangible user interfaces (TUIs) are the main interaction mechanism with the system, making it easier to use for children with special needs. This choice is based on existing evidence in the literature that the naturalness of TUIs allows children to be more explorative and expressive [43, 48]. The TUIs were implemented as physical cards with images representing the different emotions, both as pictograms and as pictures of real faces portraying an emotion. Children have to handle the cards manually to find the one required by the system. In this way, they can learn to differentiate emotions in a more playful way. In addition, affective computing technology is included in the system to support the identification of facial expressions and to help children learn how to express emotions. In this case, they have to "imitate" the emotion depicted by a picture on the screen, and in this way they learn to express it. Finally, building on what they have learned in the first activities, the children have the chance to apply this knowledge by watching videos and identifying the emotions shown in them.

Introducing this type of application into the teaching process of children with ASD is a challenge because it is difficult to alter their routine: when the activities they usually perform, or the environment in which they perform them, change, they get nervous and their behavior changes. Therefore, this application also aims to observe how children adapt to changes in their environment.

The technical requirements for this application to run properly are minimal. It only requires a touch-based device (mobile phone or tablet), Android O.S. version 5.1 (Lollipop) or higher, an NFC reader (available on most smartphones and tablets), an Internet connection and a camera with a minimum of 3 megapixels.
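As an illustration only, the requirements listed above can be expressed as a simple pre-flight check. The sketch below is written in Python for readability rather than as Android code; the `DeviceProfile` type and the `unmet_requirements` function are hypothetical names and are not part of EmoTEA. The only fact added here is that Android 5.1 corresponds to API level 22.

```python
from dataclasses import dataclass

# Minimums taken from the requirements paragraph above.
MIN_ANDROID_API = 22          # Android 5.1 (Lollipop) is API level 22
MIN_CAMERA_MEGAPIXELS = 3.0


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical description of a candidate device."""
    android_api: int
    has_touchscreen: bool
    has_nfc: bool
    has_internet: bool
    camera_megapixels: float


def unmet_requirements(device: DeviceProfile) -> list[str]:
    """Return a human-readable list of requirements the device fails."""
    problems = []
    if device.android_api < MIN_ANDROID_API:
        problems.append("Android 5.1 (API level 22) or higher is required")
    if not device.has_touchscreen:
        problems.append("a touch-based device is required")
    if not device.has_nfc:
        problems.append("an NFC reader is required")
    if not device.has_internet:
        problems.append("an Internet connection is required")
    if device.camera_megapixels < MIN_CAMERA_MEGAPIXELS:
        problems.append("a camera of at least 3 megapixels is required")
    return problems
```

An empty list means that the device satisfies every listed requirement; otherwise each entry names one missing capability.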

Additionally, and based on the experience we gained while assessing the system, we established the following ergonomic guidelines:

  • There must be good lighting so that the users can see themselves. It is recommended to place the camera in front of the user and facing away from the light source. The user's face must be unobstructed (glasses can be worn). The users must face the camera and keep their hands away from their faces, since the application will not detect a face if the head is turned or partially covered by the hands;

  • The device’s camera must be placed in front of the user and focused on the face of the user, thus avoiding possible shadows that could be created if the camera is placed at a different angle. The recommended distance to place the camera at is about 30 cm from the user. If the device with the integrated camera is resting on a table, you should find a suitable chair so that the camera is placed in the correct position. The cards depicting the different emotions must be brought close to the NFC reader, to within a distance of less than 10 cm for the reader to detect them properly and allow the application to work correctly.

It is also important to point out that even though the application was developed and assessed within the context of the “Autism Development” Association, the application domain of our system is much bigger, and EmoTEA can be used in more general settings, e.g., at home, in regular classes at school, etc.

In the following subsections, we describe the three games that make up the application, each with specific features and goals.

3.1 Game 1

In the first game, the user interacts with the mobile device and the cards representing the basic emotions as tangible objects. The app shows a picture of a face, and the user has to choose from among the different cards the one which represents the emotion corresponding to that picture, bringing the chosen card close to the NFC reader built into the mobile device. When they get 4 or more out of the 6 emotions right, they pass the level and, therefore, move on to the next one. If they fail, they keep playing at the same level. The application provides different types of feedback depending on whether the user succeeds or fails, always encouraging them to try again.
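The pass rule described above (at least 4 correct answers out of the 6 emotions) can be sketched as follows. This is a minimal illustration and not the EmoTEA source code: the names `score_round` and `next_level` are hypothetical, and we assume that each round presents each of the six basic emotions once and that the game has two levels.

```python
from dataclasses import dataclass

EMOTIONS = ("joy", "sadness", "anger", "surprise", "fear", "disgust")
PASS_THRESHOLD = 4  # "4 or more out of the 6 emotions" passes the level


@dataclass(frozen=True)
class RoundResult:
    correct: int
    total: int
    passed: bool


def score_round(answers: dict[str, str]) -> RoundResult:
    """Score one round.

    `answers` maps each requested emotion to the emotion printed on the
    card that the child brought close to the NFC reader.
    """
    correct = sum(1 for target, chosen in answers.items() if target == chosen)
    return RoundResult(correct, len(answers), correct >= PASS_THRESHOLD)


def next_level(current_level: int, result: RoundResult, max_level: int = 2) -> int:
    """Advance on a pass; stay on the same level otherwise."""
    if result.passed:
        return min(current_level + 1, max_level)
    return current_level
```

Keeping the threshold in a single constant makes it easy for a therapist-facing configuration to relax or tighten the rule for a particular child.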

In the first level of difficulty, different images of real faces representing the basic emotions are shown and the user has to choose the card which corresponds to the emotion requested, as shown in Fig. 2. In this level, the cards available to the users are pictures of real faces.

Fig. 2

Game 1—level 1

Once the first level is passed, the second level is unlocked. In this second level, real images are shown on the device, as in the previous level, but in this case the cards offered to the users show pictograms representing the different emotions. In this way, the user has to choose from among the different cards which pictogram corresponds to the real image shown on the device. Therefore, they learn how to identify different representations of the same emotion, both in real pictures and in pictograms.

3.2 Game 2

Once the children have learned and identified the different emotions, linking emotions with real (faces) and conceptual (pictograms) representations of them, they are now ready to learn how to express them themselves. This is the main purpose of the second game: teaching children how to express emotions with their faces by mimicking. When the game starts, the picture of an emotion is shown, and the name of the emotion is displayed. Then, the user has to express it with their face, and their facial expressions are analyzed to know whether they are expressing the emotion correctly or not. The feedback provided in this game is the same as in the previous one. The emotion detection through facial expressions was implemented by integrating Affectiva technology [1] in the solution proposed.
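To make the mimicking check more concrete, the sketch below shows one plausible way of deciding whether the child is expressing the requested emotion. It does not use the Affectiva API: we simply assume that some detector supplies a score per emotion on a 0 to 100 scale, and the function name `expression_matches` and the threshold of 50 are hypothetical choices made for illustration.

```python
def expression_matches(
    scores: dict[str, float],
    target: str,
    threshold: float = 50.0,
) -> bool:
    """Return True when the target emotion is strong enough and dominant.

    `scores` is assumed to map emotion names to detector confidences in
    the range 0 to 100 for a single camera frame.
    """
    if target not in scores:
        return False
    target_score = scores[target]
    strongest = max(scores.values())
    # The requested emotion must exceed the threshold and must not be
    # outscored by any other emotion in the same frame.
    return target_score >= threshold and target_score == strongest
```

In practice a decision would probably be taken over several consecutive frames rather than a single one, so that a brief expression or a noisy reading does not trigger the feedback prematurely.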

In the first level, pictures of emotions expressed by real faces are shown. The user has to imitate the facial expression according to the expression depicted in the picture, as shown in Fig. 3.

Fig. 3

Game 2—level 1. Imitating joy and surprise

In the second level, the goal is the same, but this time the system shows pictograms instead of real pictures to ask the user to express an emotion. It is not so much a question of “imitating" what the pictogram shows as a question of recognizing what emotion a pictogram is portraying and knowing how to express it themselves. Figure 4 shows an example of this level in which emotions of joy and anger had to be expressed.

Fig. 4

Game 2—level 2. Expressing joy and anger

With games 1 and 2, we aim at teaching children with ASD how to recognize and express the facial expression of the six basic emotions. As for the TUIs used in these first two games, the user requirements stated that the cards used as TUIs shall include both pictograms and real pictures, so that players could have different references to learn about feelings and how to identify them.

3.3 Game 3

In this last game, the system displays fragments of the Pixar film “Inside Out” showing situations in which the emotions learned in the previous games can be identified. In this game, users again interact with the cards, used as TUIs. The aim of this game is to help users recognize emotions in contextual situations. To this end, the user first watches the video fragment for a few seconds and then has to choose the corresponding emotion from among the cards available for this purpose, bringing the correct card close to the mobile device with the NFC reader, as in game 1.

This game has been divided into two levels of difficulty according to the difficulty of recognizing emotions. In the first level, we set joy, sadness and anger as the possible emotions to be recognized, and in the second level, we set surprise, fear and disgust, since they are more difficult to recognize than the first ones. Figure 5 shows a video fragment representing joy in a contextual situation.

Fig. 5

Game 3—level 1
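The split of emotions between the two levels of Game 3 can be written down as a small lookup table. The following sketch is illustrative only; the names `GAME3_LEVELS` and `level_for_emotion` are hypothetical and do not come from the application.

```python
# Emotion pools per difficulty level, as described for Game 3.
GAME3_LEVELS: dict[int, tuple[str, ...]] = {
    1: ("joy", "sadness", "anger"),        # easier to recognize
    2: ("surprise", "fear", "disgust"),    # harder to recognize
}


def level_for_emotion(emotion: str) -> int:
    """Return the Game 3 level in which a clip showing `emotion` belongs."""
    for level, emotions in GAME3_LEVELS.items():
        if emotion in emotions:
            return level
    raise ValueError(f"unknown emotion: {emotion!r}")
```

With such a table, adding a new video fragment only requires labelling it with its emotion; the level in which it appears follows from the lookup.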

This third game perfectly complements the first two games since it allows the users to put their emotion recognition skills to the test. In this game, emotion is not portrayed by a fixed image, but by a whole set of features: the faces of the characters shown in the video, their surroundings, the color palette, the theme of the scene, etc. By playing this game, users start linking emotions, not only to faces and words but to situations, which helps them to generalize their emotion recognition knowledge to new settings.

4 Preliminary assessment of the system

As was mentioned above, we carried out an evaluation with ASD specialists from the Association in order to assess the teaching capabilities of the application in learning emotions, and its acceptance by the specialists who work with children with ASD. The idea was to assess the use of our system as a good alternative or complement to the traditional therapies they usually apply for this purpose, as well as its usability. The system was assessed on the premises of the Autism Development Association, which is registered by the Local Council under Organizations for Disabled People, with number 25.2255/03. This association works at helping children with ASD develop their personal skills and facilitate their integration into society. Both the children and the psychotherapists belonging to this association participated in the evaluation, and in this way the educators could assist the children with ASD while using the application in a real usage scenario.

The study was ethically approved by both the executive team and the professional staff of the Autism Development Association and was in accordance with the Declaration of Helsinki [54]. Before the beginning of the assessment sessions, informed consent was obtained from all the participants, including the parents or legal guardians of the children participating in the assessment. The children also had their parents’ permission to use images or videos of the evaluation sessions.

To limit its scope, the evaluation was restricted so that the participants only played the first level of each game, since the corresponding second levels use the same mechanics. Below we describe the key aspects considered during the assessment:

  • Functionality. The application provides what is necessary to successfully meet the objectives for which it was designed;

  • Usefulness. Educators believe that the application is useful to improve the emotional skills of children in a playful and entertaining way;

  • Ease of use. The application has a simple interface both for those who have been previously informed about how it is used and for those who do not have prior information.

In summary, the preliminary assessment performed was mainly focused on the acceptance of the system by the specialists who work with children with ASD, as a good alternative or complement to the traditional therapies they usually apply to teach children emotion-related concepts. This preliminary assessment helped us gather their opinions on the usefulness of the system and identify specific usability problems.

4.1 Method

In this section, we present the method used during the assessment, including the description of participants, the context in which the evaluation was carried out, the tasks to be performed, as well as the place, the device used to run the application and the tools used by the evaluators. We also describe the process and the usability metrics applied. Finally, we present the results of the evaluation.

4.1.1 Evaluation techniques

The system evaluation was designed considering traditional evaluation techniques [13] and the current context of this project. As mentioned above, the main goal of this preliminary evaluation was to assess the suitability of the proposed tool for its objective, that is, to teach children with ASD to recognize emotions. To this end, we designed the evaluation by combining the techniques of Cognitive Walkthrough, Thinking Aloud and Cooperative Evaluation. Each session was carried out by specialists from the Association with one child at a time, so the psychotherapists could check in real time how the children reacted to the application, whether they understood how it worked, whether the content was appropriate for the skill they should learn from it, whether they were getting frustrated without us noticing, etc. During the whole process, the specialists made comments to the evaluator about the children’s reactions, the application’s usability, etc. One of the benefits of this evaluation technique is that it does not require many children to assess the tool, but just a set of archetypical participants who are representative of the different kinds of children that might use the system.

4.1.2 Participants

The evaluation was carried out with two psychotherapists and three children of different ages and different degrees of autism. The children, aged between 8 and 10 years, had some previous experience of using similar applications and devices such as tablets and/or mobile phones. One of these children presented a mild degree of autism, while the other two presented a more severe degree. This sample allowed us to observe differences between users with different levels of ASD. Although the number of children we could recruit was small, the main goal was to assess the acceptance of the system by the psychotherapists, so we designed the evaluation as a cognitive walkthrough, where it is more important to have representative participants than to have many children with similar characteristics repeating the evaluation tasks. At the end of the evaluation, the psychotherapists provided important feedback, as experts in the field, on relevant aspects observed during the sessions. Both the educators and the children, with the educators’ help, completed the SUS (System Usability Scale) questionnaire to measure the users’ satisfaction [9].

4.1.3 Context of use

We defined two tasks to be performed during the evaluation: navigate between the different parts of the application and play the first level of each game. Although they are presented here as two different tasks, the navigation task was transversal to the playing task. The navigation task consists in guiding the user through the different screens and levels of the application. It is important to note that the evaluation was not focused on evaluating the children’s knowledge of emotions, but their ability to interact with the application, that is, whether they were able to interact properly with the cards as TUIs and express emotions in front of the mobile camera in a natural way. The tasks that the participants had to perform were the same as those that a user willing to complete the entire game should carry out. The navigation task was considered finished when the "Thank you for playing" screen popped up, while the task of playing the first level of each game was considered finished when the "Level passed" screen popped up.

As mentioned at the beginning of the section, the evaluation was carried out on the premises of the Autism Development Association of Albacete. The evaluation was performed individually, i.e., there was one child with the psychotherapists in each evaluation session. The evaluator was in charge of writing down the most important aspects of the evaluation, promoting the Thinking Aloud technique to gather all comments and suggestions from the educators, as well as helping children in the case of technical problems. We used a stopwatch to calculate the time spent performing each task, and a camera to take pictures and videos during the evaluation (see Fig. 6). Finally, all the participants completed the SUS questionnaire at the end of the evaluation.

Fig. 6

Child interacting with EmoTEA during the evaluation process

4.1.4 Experimental design

Before starting the evaluation, the parents of the participants were required to sign an authorization to allow their children to participate in it. When children arrived at the assessment session, they were informed about the evaluation process for testing the application. They were also informed that this evaluation would only be used for testing the software application, not for assessing their personal capabilities. Initially, children were informed that the evaluation consisted in playing the first level of the three games, and involved interacting with the cards (tangible objects) and the device’s camera. The interaction mechanisms with both the cards and the camera were also explained in detail. Once all the above were completed, we began with the individual evaluation of each participant together with their psychotherapist.

At the end of the evaluation, all participants were asked to complete an adapted SUS questionnaire. Before completing it, they were instructed about how to do so. The educators also completed the SUS questionnaire to give feedback from their point of view as specialists in the field.

The tasks performed in the evaluation were the following:

  • Task 1: Browse the application. This task was performed throughout the evaluation and consisted in browsing the application to test that the navigability and options provided were easily understood. In this task, the user was guided through the different screens of the application, and this represented the flow a user would follow while using the application, except for the fact that the participants of this evaluation only played one level of each game;

  • Task 2: Performing the first level of the three games. To simplify the evaluation, children only had to tackle the first difficulty level of each game. This is enough for them to interact with the tangible objects and the device’s camera to test the practicality and ease of use of these interaction mechanisms incorporated in the system to play the different games.

4.1.5 Usability metrics

The usability metrics applied in the evaluation were the following:

  • Effectiveness

    • Completion rate. Percentage of tasks completed with (assisted completion rate) and without (unassisted completion rate) the help of the person in charge of the evaluation;

    • Errors. Number of actions that do not lead to completing the task, or number of attempts the child needs in order to complete it;

    • Assistance. The number of times that help is offered by the person in charge of the evaluation so that a task can be carried out and finished;

  • Efficiency

    • Task time. Amount of time, in minutes and seconds, needed for a user to complete a task;

  • Satisfaction. This metric was measured with the System Usability Scale Test (SUS test) that all participants completed at the end of the assessment session. The questionnaire consists of 10 questions that assess various aspects related to the usability of the application [9].
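As a worked illustration of two of the metrics above, the sketch below computes the completion rates and a SUS score. It assumes the standard SUS scoring procedure (ten items answered on a 1 to 5 scale, odd-numbered items positively worded, even-numbered items negatively worded, total scaled to 0 to 100); since an adapted questionnaire was used in the evaluation, its exact scoring may have differed. The function names and the outcome labels are hypothetical.

```python
def completion_rates(outcomes: list[str]) -> dict[str, float]:
    """Compute completion percentages from one label per attempted task.

    Each label is 'unassisted', 'assisted' or 'failed'.
    """
    total = len(outcomes)
    if total == 0:
        raise ValueError("at least one task outcome is required")
    return {
        "unassisted": 100.0 * outcomes.count("unassisted") / total,
        "assisted": 100.0 * outcomes.count("assisted") / total,
    }


def sus_score(responses: list[int]) -> float:
    """Standard SUS score (0 to 100) from ten answers on a 1 to 5 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9: contribution is the answer minus 1.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10: contribution is 5 minus the answer.
            total += 5 - answer
    return total * 2.5
```

For example, a participant who answers 5 to every odd-numbered item and 1 to every even-numbered item obtains the maximum score of 100, whereas answering 3 throughout yields 50.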

4.2 Assessment results

Table 2 shows the outcomes of the first task, browsing the application, whereas Table 3 shows a statistical summary of these data. All participants but one were able to complete the task on their own.

Table 2 Results of task 1 (browsing the app)
Table 3 Statistical data for Task 1

The outcomes of the second task show that the participants required more help in this case. Table 4 shows the outcomes of this task, and Table 5 shows the statistical summary of these data.

Table 4 Results of Task 2 (playing games)
Table 5 Statistical data for Task 2

Finally, the average value obtained from the SUS test was 90.625 out of 100. Apart from the answers to the SUS test, it was important to know the opinion and suggestions of the educators in order to obtain information that might be valuable for the further improvement of the system, and assess the acceptance of the system, so we wrote down all the comments they made on different aspects that arose during the assessment.

5 Assessment discussion and conclusions

The evaluation process was carried out without major problems, and the results obtained provided very valuable information. The browsing activities presented no problems for the participants since, as the educators informed us, the children had previous experience with computing devices. However, the activities involving playing the different games posed a bigger challenge for our participants. For instance, the children with a more severe degree of autism had some problems when expressing emotions using their facial expression (game 2). These problems are reflected in the higher assisted completion rate, number of errors, and amount of assistance during the playing task (Table 5). The use of NFC tags also posed a challenge at the beginning of the evaluation: one of the children tried to place the cards over the screen, since he did not know where the NFC reader was. Surprisingly, the children very quickly learned the interaction mechanism of grasping the cards and bringing them close to the mobile device. Despite these initial difficulties, the children became totally engaged with the application and enjoyed its interaction mechanisms (TUIs), since they were new and fun for them. As a matter of fact, they wanted to continue playing after having finished the session. The psychotherapists highlighted this fact, as children usually get tired very soon when doing any kind of training, but in this case, their attitude was totally different. Thus, one of the main findings of the assessment was the engagement of the children when using the system.

As for the SUS test results, they are consistent with what the educators told us about the application through the Thinking Aloud technique. Once the children learned how to use the cards (as TUIs), they started to have fun, even though some of the participants could not finish the second game. We will consider all these data when preparing a future version of the application.

In addition, it is important to acknowledge the limitations of the evaluation performed. It was carried out in a very controlled environment, with each child being assisted by their psychotherapist. A more complete, long-term evaluation with a higher number of participants is still necessary to really assess the educative value of the system proposed and to detect further usability issues.

In conclusion, the psychotherapists who participated in the system evaluation gave their approval and considered the system a good tool to teach emotion-related concepts to children with ASD. One of the main features that contributed to the success of the system and that differs from previous works is the combination of automatic emotion recognition and tangible objects as the main interaction mechanism. According to the psychotherapists, the system not only has the necessary contents to teach the different aspects of emotion recognition to children, but also has an innovative interaction mechanism, based on tangible and graspable cards, which appeals to children and keeps them engaged for longer while they are also learning. According to the existing literature, the lack of such engaging interaction mechanisms is one of the most common deficiencies in applications of this nature, since researchers and developers usually overlook the interaction mechanisms integrated in their systems in favor of creating more types of exercises, more complex control panels, etc.

Some improvements are proposed as future work based on the collaboration established with experts in this field. Firstly, the order in which the pictures currently appear is always the same. It seems that a random order would be a much better idea to prevent children from learning the order in which the pictures appear, and thus learning the emotion by heart, not by a correct identification. On the basis of a similar concept, we also plan to increase the number of real pictures, as looking at a wider range of different people will help them identify the different expressions of an emotion, since not all people express emotions in the same way. Secondly, we also plan to include collaborative activities. Collaboration in reaching different levels would encourage them to practice other important abilities. Lastly, secondary emotions could be added so that they could learn to identify more sophisticated and complex types of emotions.
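The first of these improvements, presenting the pictures in a random order, is simple to sketch. The snippet below is only an illustration of the idea and not a planned implementation; the function name `picture_order` is hypothetical, and the optional `rng` parameter exists solely so that the order can be made reproducible when testing.

```python
import random
from typing import Optional, Sequence


def picture_order(
    emotions: Sequence[str],
    rng: Optional[random.Random] = None,
) -> list[str]:
    """Return the emotions in a freshly shuffled order for one round."""
    generator = rng if rng is not None else random.Random()
    order = list(emotions)      # copy, so the caller's sequence is untouched
    generator.shuffle(order)
    return order
```

Drawing a new order at the start of every round prevents children from memorizing the sequence of answers instead of identifying each emotion.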

6 Conclusions

Autism spectrum disorder is a neurodevelopmental disorder which impairs the social skills of a person, especially those relating to emotional awareness and emotion recognition. However, these emotion-related skills can be learnt, especially if this learning process starts in early childhood.

In this paper, we present a novel system based on tangible user interfaces implemented with NFC technology and face-based emotion recognition software to help children with ASD recognize and express emotions, grounding our proposal in the existing literature on specialized therapies for children with this disorder. Our application is mainly based on novel interaction mechanisms, namely automatic emotion recognition through the device’s built-in camera, and tangible user interfaces (TUIs). NFC (near field communication) technology has also been used to implement natural interfaces for children to handle the objects needed for playing the different games. TUIs provide a familiar, simple, fun and intuitive way for children to interact with the game. According to the specialists, we have developed a serious game that avoids disruptive elements so that the attention of children can be focused on learning how to identify emotions in different situations as well as how to express such emotions themselves. The software application has been assessed with children with ASD and their psychotherapists in a real setting, obtaining very good results and the specialists’ acceptance of the system as a useful tool for teaching emotion-related concepts. In addition, we have gathered important feedback that will help us improve the application. Once these improvements have been implemented, we plan to carry out a long-term evaluation of our tool to assess its impact on learning.