1 Introduction

The potential of socially assistive robots as educational tools for children with special needs has been widely examined ([1, 2]), and studies with positive outcomes have been reported in robotic therapy. One explanation is that knowledge can be gained by observation; however, deeper learning occurs when learners perform the actions themselves [3, 4]. Recently, robots have appeared on theatre stages to educate and to promote social changes in behaviour [5], and to evaluate their use in public and private settings by studying interaction with care robots [6]. The related term “robot puppeteering” describes the control of a robot’s head, arms and legs by one or several puppeteers, e.g. Loren the Robot Butler in the movie “Teach Me How to Dougie”. However, such results cannot be applied directly to education, because puppeteer-driven human-robot interactions take place in the theatre, which is unaffordable for the repetitive needs of teaching children with Special Educational Needs (SEN).

Within the H2020 CybSPEED project for pedagogical rehabilitation of children with Special Educational Needs [7], the roboticists met actors from the “Tsvete” Educational Theatre [8]. The actors work in a European cultural tradition in which the expression of emotions is encouraged. The Theatre provides children with a safe environment and uses creative tools to address critical problems in a nuanced way, including very emotional, strange or unusual situations. The actors have long experience in puppet therapy, and we have combined forces to transfer the actors’ emotional and social talents to socially assistive robots (SARs), exploiting their methods for capturing attention through the non-verbal expression of emotions and social behaviour. To the best of our knowledge, this marks the first step towards transferring the advantages of educational theatre into social robotics and augmenting a socially assistive robot with the emotional and social talent of an actor. Our aim is to develop child-robot interactions that quickly catch the child’s attention and enhance emotional intelligence: by implementing the actors’ way of performing, understanding and responding to emotions in the robot intervention, the child can experience the emotion-generative process together with the robot, which in turn supports the development of the child’s episodic memory.

Emotions play a central role in day-to-day living, and emotional regulation affects the intensity, duration and expression of emotional reactions [9]. An emotional reaction is usually caused by our thoughts, yet it can also be triggered unconsciously, without our understanding why. Recent neuroimaging findings indicate that emotions not only have a significant influence on cognitive processes in humans, including perception, attention, learning, memory and reasoning, but can also motivate action and behaviour [10, 11]. However, children with Autism Spectrum Condition (ASC) have difficulties in understanding and correctly reacting to the actions and expressions of emotions when interacting with people. In this case, not only is the emotion reading incorrect, but the timing of understanding is improper as well, which may create an unpleasant social situation perceived as a lack of empathy. Our goal is to augment robotic therapy with the advantages of educational theatre, and especially to teach children how to experience and understand another person’s thoughts, feelings and conditions. Puppeteering robots controlled by actors, however, would make for a very expensive therapy, because, according to neuroscience, effective learning requires repeated doses of practice over time. Having scientists design an artistic interactive robotic scenario is not a solution either, because expressing emotions in a nuanced way requires many sensorimotor modalities, and there is a lack of knowledge about how to centre a child’s attention system on strong emotional inputs. Therefore, we decided to record the artistic performance of basic emotions and to transfer the actors’ emotional and social talents to socially assistive robots, in order to teach children to understand, recognize and regulate emotions by imitation. We expect to influence joint attention and to establish emotional contact and teamwork with children with SEN. However, we faced both technical and artistic challenges in tracking and translating movements that express emotions from an actor to a robot.

Emotions can be expressed by verbal messages, intonation, body postures or facial expressions. They can be captured by internal or external observation, such as tracking body movements or facial expressions. Livingstone and Palmer [12] found that when people talk, the way they move their head reveals the emotion they are expressing, and that observers are remarkably accurate at identifying a person’s emotion just by seeing their head movements; they used motion-capture equipment to track head movements in three dimensions. Emotions can also be captured physiologically. Papers such as [13] have opened up a whole range of possibilities for research on emotion evaluation using portable electroencephalogram (EEG) devices, and for training emotional reactions in virtual reality (VR) environments or serious games to improve the timing of emotional responses. Here, by timing we mean the choice, judgement or control of doing something at exactly the right time. In the “modal model” of emotion described by Gross and Thompson [9], the generation of emotions is a sequential “situation-attention-appraisal-response” process that unfolds over time. Any case of bad timing, i.e. a delay or deficiency in these stages, is a sort of mental dysfunction and requires emotional regulation strategies. The timing of emotion experience might be improved by practising the process of modulating the emotional feeling or response. That is why we propose a system that gives children a level of awareness of problems with basic emotions and emotional responses by making the human way of expressing emotions visible on a semi-humanoid robot. The robot performs a recorded actor’s movements that are typical of several simple emotions within storytelling narrated either by a therapist or by the NAO humanoid robot. We have designed and developed a low-cost, emotionally expressive robot head called “EmoSan” [14]. The EmoSan robot movements are realized by a parallel mechanism based on the Gough-Stewart platform (Fig. 1). The Gough-Stewart platform has many applications, such as flight simulators, machine tools and various robots. To our knowledge, this is the first time the platform has been used in an emotion-expressive robot, representing the motion of the neck in addition to the commonly used eyebrow, eye and mouth motions. In our ongoing research we use the EMOTIV EPOC EEG brain-tracking device [15] to teach children with ASC basic emotions through play, which involves a human therapist so that the child can experience emotions in the safe world of robots. To augment the interactive and autonomous emotionally expressive robot EmoSan with natural and expressive movements, we combined techniques from motion capture and brain-inspired control.

The technical challenge was how to design a robot capable of imitating the movements of the human head to a great extent, and how to capture the movements and expressions of the human head. We chose the EMOTIV EPOC brain-tracking device because, besides inertial sensors, it has a built-in classifier that detects many facial expressions, including blink, left wink, right wink, raised eyebrows (detecting surprise), furrowed brows (detecting frown), smile and clenched teeth. Since we have access to the raw EEG data stream and the motion sensor data stream, we can augment not only the motion capture but also the facial expressions and mental intentions during emotion expression. The EMOTIV EPOC (Fig. 2) has 14 EEG sensors, of which 8 are positioned around the prefrontal and frontal lobes. These sensors pick up signals from the facial muscles and eyes, which induce noise in the EEG signals; most EEG systems filter out or ignore such artefacts when interpreting the signals. The EMOTIV detection system also filters out these signals before interpreting those of the brain, but in addition it uses them to classify which muscle groups caused them. The EMOTIV Facial Expression detection is therefore based on the so-called “Smart Artefacts” [16].

The artistic challenge is how to create movements and facial expressions representing emotions within a time window long enough for the child to experience the emotion-generative process together with the robot. By analogy with puppet therapy in education, we have experimented with the robot to break traditional thinking through metamorphoses and to reduce the emotional tension these children face when experiencing emotions. Thus, the robot easily catches the child’s attention and serves as a medium between the therapist and the child for gathering knowledge by observation and by imitating the correct timing in interpersonal communication. It is shown in [17] that, in the context of human-robot interaction, the interactive social cues displayed by persuasive robots elicit positive social responses in humans. Social robots should adapt their behaviour to the social context [18]. Applications of social robots include robot-assisted psychotherapy, facilitation of communication and interaction with children, therapeutic interventions for children with Autism Spectrum Disorder, interactive social companions [19], robot-assisted language learning [20], storytelling [21], etc.

In the present work, we propose a framework for capturing the motions of an actor’s head and facial muscles and transferring them to the robot. The robot should be capable of covering the desired motion of the human head, eyes and lips. A new algorithm for processing the data from the gyroscope and accelerometer, based on the mathematical apparatus of geometric algebra, is proposed. A new IoT framework for creating human-robot interaction applications, based on Node-RED [22] “wiring” of the EMOTIV brain-listening headset [23] and a socially assistive robot, is designed, developed and tested with data obtained from two actors expressing six basic emotions.

2 Non-Humanoid Emotionally-Expressive Robot and the Applied Mathematical Apparatus

This section first describes the non-humanoid emotionally-expressive robot and then briefly presents the mathematical apparatus used for the control of the robot and for the motion capture, namely geometric algebra.

The human face is very complex and displays human expressions; its motion and expressions support communication with other people and make one’s behaviour understandable to others. Social robotics has emerged as a new research area in the past decade due to the rapid improvement of sensors, actuators and the processing capabilities of modern hardware, enabling robots to interact with humans more effectively [24]. While unsophisticated robots can express basic emotions such as happiness, anger, sadness and surprise, the most advanced social robots can express a greater variety of emotions. For example, the iCub robot ([25, 26]) is a versatile humanoid designed by the RobotCub Consortium of several European universities. It was built by the Italian Institute of Technology (IIT) as part of the EU project RobotCub and subsequently adopted by more than 20 laboratories worldwide. This robot can perform many tasks and is also capable of expressing emotions through facial expressions. Many researchers have been developing social robots that can express emotions; among them are KASPAR [27], Furhat [28], Socibot [29] and FloBi [30]. A comprehensive review of robotic head design, systematised into non-expressive and expressive face robot heads, is provided in [31].

2.1 Non-Humanoid Emotionally-Expressive Robot

We have designed and developed a robot for head movement, named “EmoSan”. It is a semi-humanoid robot that can either play pre-loaded movements and emotions autonomously or follow the real-time output data received from the EMOTIV device. Nowadays, communication between children using emoticons is widespread; for this reason, the design of the robot resembles the well-known “emoticon” symbol, with the goal of being as nice and friendly as possible. The robot’s kinematic design is based on the Gough-Stewart platform. This novel application of the Gough-Stewart platform makes the emotion-expressive robot compact and easy to control. The computer simulation model and the arrangement of the coordinate frames are shown in Fig. 1. The motion capabilities of the robot have been analysed and presented in [14], where we concluded that the robot can fully reproduce the motion of the human head. The robot has six degrees of freedom and can cover the bending of the human neck, which means that it can reproduce the movements of the human head more realistically. What is more, the robot is endowed with a mouth, eyes and eyebrows, so that the emoticon-like robot head can express some emotions. The base and moving platforms of the robot are connected by six identical legs, each with an SPU architecture (S stands for spherical, P for prismatic and U for universal joint, respectively) (Fig. 1). The six prismatic joints are driven by linear actuators, so the moving platform can be translated along three axes and rotated around them.

Fig. 1 The emotionally-expressive robot “Emosan”

2.2 Geometric Algebra Basics

The mathematical apparatus of geometric algebra is applied for the purpose of this study. A brief introduction of the geometric algebra is presented in this subsection.

What is now called Clifford algebra was introduced by William Kingdon Clifford (1845–1879) in the 19th century and was further developed into a unified language named “geometric algebra” by Hestenes [32], Lasenby and Doran [33], Dorst, Fontijne and Mann [34], and other authors in the second half of the 20th century. In geometric algebra, a single basic kind of multiplication between two vectors, called the geometric product, is defined. The geometric product of two vectors a and b can be decomposed into symmetric and antisymmetric parts [32], i.e.

$$\begin{aligned} {{\textbf {a}}}{{\textbf {b}}}={{\textbf {a}}}\cdot {{\textbf {b}}}+{{\textbf {a}}}\wedge {{\textbf {b}}}, \end{aligned}$$
(1)

where \( {{\textbf {a}}}\cdot {{\textbf {b}}} \) is the inner product and \( {{\textbf {a}}}\wedge {{\textbf {b}}} \) is the outer product of the two vectors.

The inner product \( {{\textbf {a}}}\cdot {{\textbf {b}}} \) is scalar-valued (grade 0), while the result of the outer product \( {{\textbf {a}}}\wedge {{\textbf {b}}} \) is called a bivector (grade 2). Higher-grade elements can be constructed by introducing more vectors into the outer product. Thus, the outer product of k vectors \( {{\textbf {a}}}_{1},{{\textbf {a}}}_{2},...,{{\textbf {a}}}_{k} \) generates a new entity \( {{\textbf {a}}}_{1}\wedge {{\textbf {a}}}_{2}\wedge ...\wedge {{\textbf {a}}}_{k} \) called a k-blade; the integer k is called the grade. Blades are the basic algebraic elements of geometric algebra. A linear combination of k-blades is called a k-vector, and a linear combination of blades of different grades is called a multivector. The geometric algebra \( G_{n} \) contains nonzero blades of maximum grade n, which are called pseudoscalars of \( G_{n} \). The unit pseudoscalar of \( G_{3} \), the geometric algebra of 3-D Euclidean space with the standard orthonormal basis \( \{{{\textbf {e}}}_{1},{{\textbf {e}}}_{2},{{\textbf {e}}}_{3}\} \), can be written as

$$\begin{aligned} I_{3}={{\textbf {e}}}_{1}\wedge {{\textbf {e}}}_{2}\wedge {{\textbf {e}}}_{3}={{\textbf {e}}}_{1}{{\textbf {e}}}_{2}{{\textbf {e}}}_{3}. \end{aligned}$$
(2)

The inverse of the unit pseudoscalar of \( G_{3} \) is

$$\begin{aligned} \textit{I}_3^{-1}={{\textbf {e}}}_{3}\wedge {{\textbf {e}}}_{2}\wedge {{\textbf {e}}}_{1}=-{{\textbf {e}}}_{1}\wedge {{\textbf {e}}}_{2}\wedge {{\textbf {e}}}_{3}. \end{aligned}$$
(3)

In an n-dimensional geometric algebra there are \( 2^n \) basis blades; e.g., there are 8 basis blades in the 3D Euclidean geometric algebra \( G_{3} \).
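
For concreteness, the 8 basis blades of \( G_{3} \) are the scalar 1, the three basis vectors, the three basis bivectors and the unit pseudoscalar:

$$\begin{aligned} \{1,\; {{\textbf {e}}}_{1},\; {{\textbf {e}}}_{2},\; {{\textbf {e}}}_{3},\; {{\textbf {e}}}_{1}\wedge {{\textbf {e}}}_{2},\; {{\textbf {e}}}_{2}\wedge {{\textbf {e}}}_{3},\; {{\textbf {e}}}_{3}\wedge {{\textbf {e}}}_{1},\; {{\textbf {e}}}_{1}\wedge {{\textbf {e}}}_{2}\wedge {{\textbf {e}}}_{3}\}. \end{aligned}$$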

Rotations in the geometric algebra can be performed using rotors. A rotor in geometric algebra is defined as

$$\begin{aligned} {R}=e^{-\left( \frac{\theta }{2}\right) {B}}=\cos \left( \frac{\theta }{2}\right) -{B}\,\sin \left( \frac{\theta }{2}\right) , \end{aligned}$$
(4)

where B is a normalized bivector specifying the plane of rotation.
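
To make these definitions concrete, the following minimal Python sketch (illustrative only; the storage of a rotor by the four coefficients of \( (1, {{\textbf {e}}}_{2}\wedge {{\textbf {e}}}_{3}, {{\textbf {e}}}_{3}\wedge {{\textbf {e}}}_{1}, {{\textbf {e}}}_{1}\wedge {{\textbf {e}}}_{2}) \) and the helper names are our own choices) builds the rotor of Eq. (4) for a rotation about a given axis, with \( B=I_{3}\,\hat{{\textbf {n}}} \), and applies it to a vector through the sandwich product \( R\,{{\textbf {x}}}\,{\tilde{R}} \), evaluated via the quaternion isomorphism of the even subalgebra.

```python
import numpy as np

def rotor_from_axis_angle(axis, theta):
    """Rotor of Eq. (4): R = cos(theta/2) - sin(theta/2) B, with B = I3 * unit axis.
    Stored as the coefficients of (1, e23, e31, e12)."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    return np.concatenate(([np.cos(theta / 2.0)], -np.sin(theta / 2.0) * n))

def geometric_product(r1, r2):
    """Geometric product of two rotors (even multivectors of G3) in the same storage."""
    s1, b1 = r1[0], np.asarray(r1[1:])
    s2, b2 = r2[0], np.asarray(r2[1:])
    return np.concatenate(([s1 * s2 - b1 @ b2], s1 * b2 + s2 * b1 - np.cross(b1, b2)))

def rotate(rotor, x):
    """Apply x -> R x R~ to a vector x, using the isomorphism
    (scalar, e23, e31, e12) <-> quaternion (w, -x, -y, -z)."""
    s, q = rotor[0], -np.asarray(rotor[1:])
    x = np.asarray(x, dtype=float)
    return x + 2.0 * s * np.cross(q, x) + 2.0 * np.cross(q, np.cross(q, x))

# Example: two successive 45-degree rotations about e3 compose into a
# 90-degree rotation, which maps e1 to e2.
R45 = rotor_from_axis_angle([0.0, 0.0, 1.0], np.pi / 4.0)
R90 = geometric_product(R45, R45)
print(rotate(R90, [1.0, 0.0, 0.0]))   # ~ [0, 1, 0]
```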

3 Motion and Basic Expression Capture of an Actor’s Head

The EMOTIV EPOC+ EEG portable tracking device (Fig. 2) [15] is used together with its “Performance Metrics” and “Facial Expressions” detection algorithms. A magnetic and inertial measurement unit (MIMU) is embedded in this device, which makes the EMOTIV suitable for both head motion and expression capture. Since the experiments were carried out in an indoor laboratory environment with many electronic devices, only the data from the gyroscope and accelerometer have been considered, because the accuracy of the magnetometer data may be compromised by environmental noise. Gyroscopes and accelerometers allow the tracking of rotational and translational movements. Some of the newest versions of the EMOTIV devices provide a quaternion output computed from the embedded MIMU; our EMOTIV device provides only raw MIMU data, which is why processing is needed in order to obtain the exact motion. This section introduces an approach to orientation estimation that employs the mathematical apparatus of geometric algebra. After feature extraction and classification of the raw EEG signals and processing of the gyroscope and accelerometer measurements, the facial expression, emotional state outputs and movements are mapped into robot coordinates.

Fig. 2 The EMOTIV EPOC+

3.1 Geometric Algebra Approach to Motion Capture

In this paper we introduce a geometric algebra approach to capturing the motion of the human head. Motion capture involves the transformation of vectors between frames. The most commonly used representations are Euler angles and direction cosine matrices. A 3D orientation can be represented by Euler angles, i.e. by a combination of three rotations about different axes. It is a well-known fact in kinematics that Euler angles are simple to use but suffer from singularities. Quaternions are an alternative to Euler angles and have the advantage of being computationally efficient. In this paper, the mathematical apparatus of geometric algebra is used to represent the orientation. Rotors, which are elements of the geometric algebra, have the advantages of quaternions and are furthermore more general, i.e. they can be applied in any dimension. Last but not least, we build on our previous experience in applying geometric algebra in robotics [35,36,37].

3.1.1 Orientation from Gyroscope Data

The gyroscope provides data for the instantaneous angular velocity, which is measured about the three axes of the sensor frame. In geometric algebra, the orientation of a rigid body can be tracked using the time-dependent rotor \({R}\left( t\right) \). According to [32], the following kinematic equation can be written,

$$\begin{aligned} {\dot{R}}=- \frac{1}{2} {R} \varOmega , \end{aligned}$$
(5)

where \( \varOmega \) is a bivector and represents the angular velocity of the body given in the body frame.

The angular velocity bivector can be defined as

$$\begin{aligned} \varOmega =I_{3}\varvec{\omega }=\omega _{x}{{\textbf {e}}}_{2}\wedge {{\textbf {e}}}_{3} + \omega _{y}{{\textbf {e}}}_{3}\wedge {{\textbf {e}}}_{1} + \omega _{z}{{\textbf {e}}}_{1}\wedge {{\textbf {e}}}_{2}, \end{aligned}$$
(6)

where \( \varvec{\omega }= (\omega _{x},\omega _{y},\omega _{z}) \) is the angular velocity vector.

In order to obtain the rotor from the gyroscope readings, the approach based on rotating bivectors described by Candy and Lasenby in [38] is used. The rotation bivector can be written as

$$\begin{aligned} \varPhi =\theta B, \end{aligned}$$
(7)

where \( \theta \) and B are the angle and the unit bivector from Eq. (4).

Then, according to [38], the kinematic equation is given by

$$\begin{aligned} {\dot{\varPhi }}=\varOmega - \frac{\left\langle \varPhi \varOmega \right\rangle _{2} }{2}+\left( \frac{\arrowvert \varPhi \arrowvert }{2} \cot \frac{\arrowvert \varPhi \arrowvert }{2} - 1 \right) \left[ \varOmega + \frac{\left( \varOmega \cdot \varPhi \right) \varPhi }{\arrowvert \varPhi \arrowvert ^{2}}\right] ,\nonumber \\ \end{aligned}$$
(8)

where \( \left\langle \varPhi \varOmega \right\rangle _{2} \) denotes the grade-2 part of the multivector \( \varPhi \varOmega \), which results from the geometric product of \( \varPhi \) and \( \varOmega \). In general, the notation \( \left\langle A \right\rangle _{k} \) gives the grade-k part of a multivector A.

An extended proof of Eq. (8) is provided in [38]. Since the component \( \left( \frac{\arrowvert \varPhi \arrowvert }{2} \cot \frac{\arrowvert \varPhi \arrowvert }{2} - 1 \right) \left[ \varOmega + \frac{\left( \varOmega \cdot \varPhi \right) \varPhi }{\arrowvert \varPhi \arrowvert ^{2}}\right] \) is small, it can be neglected for practical purposes. Thus, the following final simplified form can be written [38]:

$$\begin{aligned} {\dot{\varPhi }}=\varOmega - \frac{\left\langle \varPhi \varOmega \right\rangle _{2} }{2}. \end{aligned}$$
(9)

In order to obtain the rotation from the gyroscope data, an integration algorithm needs to be applied to Eq. (9). Several standard integration algorithms exist, but owing to the interpolation properties of bivectors an elegant formula is proposed in [38], i.e.,

$$\begin{aligned} \begin{aligned} {\varPhi }\left( 2T\right) =&{T} \left( \frac{1}{3} \varOmega _{i-2} + \frac{4}{3} \varOmega _{i-1}+ \frac{1}{3} \varOmega _{i}\right) +\\&\frac{T^{2}}{3}\left( \varOmega _{i-2} \times \varOmega _{i} - 4\varOmega _{i-2} \times \varOmega _{i-1} \right) , \end{aligned} \end{aligned}$$
(10)

where i is the sample number and T is the sampling period.

The commutator product (from Eq. 10) for two bivectors A and B is defined as

$$\begin{aligned} A \times B = \frac{1}{2}\left( AB-BA \right) = \left\langle A B \right\rangle _{2}. \end{aligned}$$
(11)

It must be pointed out that Eq. (10) needs to be evaluated only on every second sample.

Then, having obtained the rotating bivector \( \varPhi \), we can construct the rotor R using Eq. (4) by substituting \( B=\frac{\varPhi }{\arrowvert \varPhi \arrowvert } \) and \( \theta = \arrowvert \varPhi \arrowvert \). The finite rotation is then given by the geometric product of the rotor obtained at the current step and the previous finite rotor, i.e.

$$\begin{aligned} Q_{j} = R \, Q_{j-1}. \end{aligned}$$
(12)

3.1.2 Orientation from Accelerometer Data and Fusion Process

In this subsection we introduce an approach for deriving the orientation from the accelerometer readings, followed by a fusion of the two orientation estimates. This approach is a novel contribution in the field: a known mathematical apparatus (geometric algebra with its operators, components and rules) is used to obtain the orientation (rotor) from the accelerometer readings, which is then fused with the orientation obtained from the gyroscope in a single, unified algorithm.

We assume that the direction of gravity is defined along the vertical (z) axis of the global (earth) frame, i.e.,

$$\begin{aligned} {{\textbf {g}}}=-{{\textbf {e}}}_{3}. \end{aligned}$$
(13)

The accelerometer provides data for the measured acceleration \({{\textbf {a}}}_{m} \); this acceleration vector is normalized during the processing of the raw data. Because of the rotation of the sensor frame, the coordinates of the gravity vector with respect to the sensor frame differ from those in Eq. (13). Thus, using the rotor obtained from the gyroscope data in the previous subsection, we can predict these coordinates, i.e.,

$$\begin{aligned} {{\textbf {a}}}_{p}={Q} \;\;{\mathbf {g}}\;\; {{\tilde{Q}}}={Q} \;\left( -{{\textbf {e}}}_{3}\right) \; {{\tilde{Q}}}, \end{aligned}$$
(14)

where \( {{\textbf {a}}}_{p} \) is the predicted acceleration vector, Q is the rotor obtained by Eq. (12) and \( {{\tilde{Q}}} \) is the reverse of Q.

In reality, the measured acceleration vector \({{\textbf {a}}}_{m} \) and the predicted acceleration vector \( {{\textbf {a}}}_{p} \) differ. We need to find the rotor which rotates the predicted acceleration vector so that it coincides with the measured one. The solution to this problem is not unique, i.e., an infinite number of rotors can rotate a given vector into another one. One approach to the solution is to apply optimisation algorithms; e.g., in the quaternion representation of the orientation, the gradient descent algorithm is used in [39]. Another approach is used in [40], where a rotation in the XZ plane is chosen and a system of equations is then solved in order to find the quaternion.

Here, we utilize the properties of geometric algebra in order to find a unique solution. If we consider rotating a unit vector \( {{\textbf {a}}} \) into another unit vector \( {{\textbf {b}}} \), i.e. the rotation \( {{\textbf {b}}}= {R} \;{\mathbf {a}}\; {{\tilde{R}}}\), then the rotor which performs a simple rotation in the plane \( {{\textbf {a}}} \wedge {{\textbf {b}}} \) can be obtained [41]:

$$\begin{aligned} R=\frac{1+{{\textbf {b}}}{{\textbf {a}}}}{\sqrt{2\left( 1+{{\textbf {b}}}\cdot {{\textbf {a}}}\right) }}. \end{aligned}$$
(15)

The rotor obtained by Eq. (15) could be considered to a certain extent an “optimum” one because the rotation is through the smallest angle. Thus, using Eq. (15), the correction rotor which rotates the predicted acceleration vector \( {{\textbf {a}}}_{p} \) into the measured acceleration vector \({{\textbf {a}}}_{m} \), can be written as

$$\begin{aligned} R_{c}=\frac{1+ {{\textbf {a}}}_{m} {{\textbf {a}}}_{p}}{\sqrt{2\left( 1+ {{\textbf {a}}}_{m} \cdot {{\textbf {a}}}_{p}\right) }}. \end{aligned}$$
(16)

Then, the finite rotor after the correction at the j-th step will be

$$\begin{aligned} Q_{c}=R_{c} \, Q_{j}. \end{aligned}$$
(17)

Thus, two rotors representing the rotational position of the device have been found: the first from the gyroscope data (Eq. 12) and the second from the accelerometer data (Eq. 17). Accurate estimation of the angular position requires fusion of the measurements provided by the different sensor modules. Several approaches for attitude estimation exist; the Kalman filter and the complementary filter are the most commonly used algorithms for data fusion. Here, because the problem is addressed from a geometrical point of view, a kind of complementary filter is proposed. Rotor interpolation ([34]) can be used to fuse the two rotors, i.e. to interpolate from the initial rotor obtained from the gyroscope data (Eq. 12) to the final rotor obtained from the accelerometer data (Eq. 17). The rotation from the initial rotor to the final one can then be represented by the following rotor

$$\begin{aligned} Q_{r} = \frac{Q_{c}}{Q_{j}}=\frac{R_{c} \, Q_{j}}{Q_{j}}={R}_{c}. \end{aligned}$$
(18)

The rotor \( Q_{r} \) can be written as

$$\begin{aligned} {Q}_{r}\equiv {R}_{c}=e^{-\left( \frac{\phi }{2}\right) {A}}=e^{- \frac{\varPsi }{2}}. \end{aligned}$$
(19)

Thus, the fusion process needs only a fraction of the correction rotor \( R_{c} \) obtained in Eq. (16). SLERP (spherical linear interpolation) can be employed here (e.g. [34]). The interpolation is then obtained by a rotation through an angle fraction \( \alpha \frac{\phi }{2} \), where the scalar \( \alpha \) varies within the range \( [0,1] \). Then, the fusion rotor can be written as

$$\begin{aligned} {Q}_{f}=\left( \cos \left( \frac{\alpha \phi }{2}\right) - {A} \sin \left( \frac{\alpha \phi }{2}\right) \right) {Q}_{j} , \end{aligned}$$
(20)

where \( {A}=-\frac{\left\langle R_{c}\right\rangle _{2}}{\left| {\left\langle R_{c}\right\rangle _{2}}\right| } \) and \( \phi =2\arctan \left( \frac{\left| \left\langle R_{c}\right\rangle _{2}\right| }{\left| {\left\langle R_{c}\right\rangle _{0}}\right| }\right) \).

It is clear that there is no need to calculate the rotor \( Q_{c} \): only the correction rotor \( R_{c} \) and the rotor \( Q_{j} \) from the gyroscope data (Eq. 12) are involved in the fusion process. In the case of a zero coefficient (\( \alpha =0 \)), the fusion rotor coincides with the rotor \( Q_{j} \) from the gyroscope data, i.e., there is no correction from the accelerometer data. On the other hand, in the case of \( \alpha =1 \), the fusion rotor is equal to the corrected rotor \( Q_{c} \), i.e., the “full” correction imposed by the accelerometer data is applied. These two cases are boundary ones; in practice, good results are obtained with \( \alpha \) in the range 0.08–0.2.

3.1.3 Summary of the Motion Capture Algorithm

The approach for processing data from the gyroscope and accelerometer described in the previous subsections is summarized in the following steps:

  1. Get the input values: \( \varvec{\omega }= (\omega _{x},\omega _{y},\omega _{z}) \), the angular velocity vector measured about the three axes of the sensor frame, and the measured acceleration \({{\textbf {a}}}_{m} \);
  2. Calculate the rotating bivector \( \varPhi \) using Eq. (10);
  3. Construct the rotor R from the obtained rotating bivector \( \varPhi \);
  4. Calculate the predicted acceleration vector \( {{\textbf {a}}}_{p} \) using Eq. (14);
  5. Calculate the correction rotor \( R_{c} \) using Eq. (16);
  6. Calculate the fusion rotor \( {Q}_{f} \) using Eq. (20).

This algorithm implies that if data are available only from the gyroscope, the rotation can be obtained by following the first three steps; if data from both the gyroscope and the accelerometer are available, the fusion is performed using all six steps.
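
To make the six steps concrete, a minimal NumPy sketch is given below. It is illustrative only: it assumes rotors stored as the four components of Eq. (21), bivectors stored by their dual vectors as in Eq. (6) (so that the commutator of two bivectors corresponds to the negative cross product of their dual vectors), gyroscope data in rad/s, and it repeats the rotor helpers from the sketch in Sect. 2.2 so that the listing is self-contained; all function names are our own.

```python
import numpy as np

def geometric_product(r1, r2):
    """Geometric product of two even multivectors stored as [scalar, e23, e31, e12]."""
    s1, b1 = r1[0], np.asarray(r1[1:])
    s2, b2 = r2[0], np.asarray(r2[1:])
    return np.concatenate(([s1 * s2 - b1 @ b2], s1 * b2 + s2 * b1 - np.cross(b1, b2)))

def rotate(rotor, v):
    """Sandwich product R v R~ applied to a vector v (via the quaternion isomorphism)."""
    s, q = rotor[0], -np.asarray(rotor[1:])
    v = np.asarray(v, dtype=float)
    return v + 2.0 * s * np.cross(q, v) + 2.0 * np.cross(q, np.cross(q, v))

def rotor_from_bivector(phi):
    """Step 3: Eq. (4) with theta = |Phi| and B = Phi / |Phi|."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(theta / 2.0)], -np.sin(theta / 2.0) * phi / theta))

def correction_rotor(a_m, a_p):
    """Step 5: Eq. (16); the bivector part of a_m a_p is a_m ^ a_p = I3 (a_m x a_p)."""
    s = 1.0 + a_m @ a_p
    return np.concatenate(([s], np.cross(a_m, a_p))) / np.sqrt(2.0 * s)

def fuse(R_c, Q_j, alpha):
    """Step 6: Eq. (20), rotating Q_j through the fraction alpha of the correction angle."""
    s_c, b_c = R_c[0], np.asarray(R_c[1:])
    nb = np.linalg.norm(b_c)
    if nb < 1e-12:                               # prediction already matches measurement
        return Q_j
    half_phi = np.arctan2(nb, abs(s_c))          # phi/2, cf. the expression below Eq. (20)
    frac = np.concatenate(([np.cos(alpha * half_phi)],
                           np.sin(alpha * half_phi) * b_c / nb))
    return geometric_product(frac, Q_j)

def track_orientation(omega, acc, T, alpha=0.1):
    """omega: (N, 3) gyroscope samples (assumed in rad/s); acc: (N, 3) accelerometer
    samples; T: sampling period. Returns the final fusion rotor Q_f."""
    Q = np.array([1.0, 0.0, 0.0, 0.0])
    for i in range(2, len(omega), 2):            # Eq. (10) is evaluated every second sample
        w0, w1, w2 = omega[i - 2], omega[i - 1], omega[i]
        # Step 2: Eq. (10) in dual-vector form (commutator -> negative cross product).
        phi = T * (w0 / 3.0 + 4.0 * w1 / 3.0 + w2 / 3.0) \
            + (T ** 2 / 3.0) * (-np.cross(w0, w2) + 4.0 * np.cross(w0, w1))
        Q = geometric_product(rotor_from_bivector(phi), Q)       # Step 3 and Eq. (12)
        a_m = acc[i] / np.linalg.norm(acc[i])                    # normalized measurement
        a_p = rotate(Q, np.array([0.0, 0.0, -1.0]))              # Step 4: Eq. (14), g = -e3
        Q = fuse(correction_rotor(a_m, a_p), Q, alpha)           # Steps 5 and 6
        Q = Q / np.linalg.norm(Q)                                # guard against numerical drift
    return Q
```

The loop advances by two samples because Eq. (10) is evaluated on every second sample; the final normalization of Q is a numerical safeguard that is not part of the six steps themselves.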

The results from the above algorithm are illustrated by the next figures, which present a captured simple motion (nodding - rotating head down and back) of a human head. Figure 3 shows the output data from the EMOTIV’s gyroscope and Fig. 4 presents the output data from the EMOTIV’s accelerometer.

Fig. 3 Output data from the EMOTIV’s gyroscope for the nodding motion of the human head

Fig. 4 Output data from the EMOTIV’s accelerometer for the nodding motion of the human head

Then, the obtained result using the input data from the gyroscope and accelerometer is shown in Fig. 5, where the components \( q_{i}, (i=1,2,3,4) \) are graphically presented. For this purpose, the rotor from Eq. (4) after expansion is written as

$$\begin{aligned} {Q}=q_{1} + q_{2}\,{{\textbf {e}}}_{2}\wedge {{\textbf {e}}}_{3}+q_{3}\,{{\textbf {e}}}_{3}\wedge {{\textbf {e}}}_{1}+ q_{4}\,{{\textbf {e}}}_{1}\wedge {{\textbf {e}}}_{2}. \end{aligned}$$
(21)

In this case the fusion is obtained with \( \alpha =0.1 \).

Fig. 5 Components of the fusion rotor resulting from the nodding motion of the human head

Although the fusion rotor is used directly to control the robot, it has additionally been converted into angles about the three coordinate axes (XYZ) purely for the sake of illustration; these angles of rotation are shown in Fig. 6. The proposed algorithm is general and can be applied to process raw data from other inertial measurement units.
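
For such an illustration, one possible conversion from the rotor components of Eq. (21) to angles about the X, Y and Z axes is sketched below; the Tait-Bryan (roll-pitch-yaw) convention and the rotor-to-quaternion sign mapping are assumptions of this sketch and may differ from the convention used to produce Fig. 6.

```python
import numpy as np

def rotor_to_xyz_angles(q):
    """Convert rotor components (q1, q2, q3, q4) of Eq. (21) to rotation angles
    about the X, Y and Z axes (roll, pitch, yaw), for illustration only.
    Assumes the rotor maps to the unit quaternion (w, x, y, z) = (q1, -q2, -q3, -q4)."""
    w, x, y, z = q[0], -q[1], -q[2], -q[3]
    roll = np.arctan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))   # about X
    pitch = np.arcsin(np.clip(2 * (w * y - z * x), -1.0, 1.0))        # about Y
    yaw = np.arctan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))    # about Z
    return np.degrees([roll, pitch, yaw])
```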

Fig. 6 Angles of rotation around three axes resulting from the nodding motion of the human head

3.2 Basic Expressions Capture

Emotion recognition has been explored in many papers; some recent methods utilize a multimodal approach [42] or deep neural models [18, 43]. Tracking the facial expression is used to transfer the actor’s emotional features onto the robot. Showing expressions is important because, for example, children with Autism Spectrum Condition (ASC) frequently misinterpret happy faces as neutral and confuse neutral faces with negative facial expressions (sadness and anger) [44]. The EMOTIV device allows the recording of facial expressions and movements. At the time of the experiments, the EMOTIV SDK (Software Development Kit) measured facial expressions with the following detection outputs: smileExtent (a low level indicates that no expression has been detected, a high level indicates a maximum level of expression), clenchExtent, upperFacePower, upperFaceAction, lowerFacePower, lowerFaceAction and eyebrowExtent. For the neutral, happy and sad states we use the outputs for lowerFaceAction (Fig. 7).

Fig. 7 Output from the Emotiv SDK for the lower face detection for the recorded “surprise” emotion

The sad state is not present in the “surprise” emotion (Fig. 7), i.e., all clenchExtent values are zero. In order to illustrate the sad state, we use the data for the recorded “sadness” emotion; the state of sadness can be detected in Fig. 8, which shows the data for clenchExtent. In the latest SDK version, two new detections for the brows are available: RAISE BROW and FURROW BROW. Thus, the EMOTIV outputs for the eyebrow movements can be transferred to the robot’s eyebrows.

The robot can express three basic emotional states: happiness, neutral and sadness (anger). In order to determine these states, the following algorithm based on the output data from the Emotiv SDK is applied:

\(if \quad lowerFaceAction = FE\_NEUTRAL\)

\(then \quad expression = Neutral\)

\(else \quad if \quad lowerFaceAction = FE\_LAUGH \quad or \quad lowerFaceAction = FE\_SMILE\)

\(then \quad expression = Happy\)

\(else \quad if \quad lowerFaceAction = FE\_CLENCH\)

\(then \quad expression = Sad.\)

For this purpose, the robot is endowed with a mouth, eyes and eyebrows so that the emoticon-like robot head can express some emotions. At this stage of the development of the robot, three main facial expressions are transferred to the robot: happy, neutral and sad (angry), shown in Fig. 9. These three expressions are extracted from the lowerFaceAction and clenchExtent data obtained by the EMOTIV device, applying the above algorithm.
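
A direct transcription of this mapping into Python might look as follows; the string constants mirror the SDK labels quoted above, while the function name and the fallback branch for unlisted actions are our own assumptions.

```python
def classify_expression(lower_face_action: str) -> str:
    """Map the EMOTIV lowerFaceAction output to one of the three robot expressions."""
    if lower_face_action == "FE_NEUTRAL":
        return "Neutral"
    if lower_face_action in ("FE_LAUGH", "FE_SMILE"):
        return "Happy"
    if lower_face_action == "FE_CLENCH":
        return "Sad"
    return "Neutral"   # assumption: fall back to neutral for any other action
```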

4 Transferring Captured Motion and Basic Emotion to the Robot

The actor’s movements expressing emotions include rotations and translations. For the purpose of this study, only rotations are considered because, first of all, the main feature of head motion is rotation. Another reason is that including translations would reduce the rotational workspace of the robot, i.e., the rotational capability of the robot decreases as it translates further from a given “central” position. This “central” position of the robot corresponds to \( X=0, Y=0 \), with Z calculated at half of the leg extension (in this case 25 mm). This conclusion is drawn by analysing the robot workspace, and the situation is illustrated by the computed range of orientations shown in Figs. 10 and 11. The workspace volume in the horizontal XY plane and the orientation angle (around the X, Y or Z axis) is discretized by a \(200\times 200\times 50 \) grid. The workspace for the combined orientations given in Fig. 11b is discretized by a \( 200\times 200\) grid for the XY plane and a \( 50\times 50\times 50 \) grid for the three angles. Each node of the horizontal plane (XY) grid stores the number of possible poses of the robot. The orientation angles vary within the range \([-\pi /3, \pi /3] \) and are discretized with an identical step for each node of the XY grid. For every orientation at a given node, the calculated leg length extensions are checked as to whether they fall within the workspace of the robot, i.e., whether the condition on the length extension range of all legs is fulfilled. The orientations which fulfil this condition are counted as the number of orientations for that node of the XY grid. It is clear from Figs. 10 and 11 that the orientation range is maximal at the (0,0) position and decreases towards the workspace boundary, which is represented as a black curve in the figures.

Fig. 8 Output from the Emotiv SDK for the clench extent for the recorded “sadness” emotion

Fig. 9 Facial expressions of the laboratory social robot

Fig. 10 Range of orientation (number of poses) in the XY plane around X and Y axes

Fig. 11 Range of orientation (number of poses) in the XY plane around Z axis and combined orientation around X, Y and Z axes

Therefore, for the purpose of the current work we fix the position of the moving platform of the robot at this “central” position and control only the orientation of the robot (head). The head motion is represented by the fusion rotor \( {Q}_{f} \) obtained in Eq. (20). The coordinate systems of the head (EMOTIV device) and the moving platform of the robot are aligned (Fig. 12), i.e., the coordinate system, attached to the moving platform of the robot, coincides with the body coordinate system of the EMOTIV headset. Thus, this fusion rotor \( {Q}_{f} \) can be directly used to calculate the control parameters of the robot.

The process of motion capture results in the derivation of the fusion rotor \( {Q}_{f} \). Next, the path of the motion of the head is to be transferred to the robot as a set of orientations, i.e., establishing the robot control algorithm. The dimensions of the EmoSan robot are similar to the size of the human head and neck. The motion of the moving platform of the robot is realized by the length variation of the six legs which connect the two platforms (the base and the moving platforms) of the robot. In order to control the robot, the lengths of the six legs have to be obtained. At every given time moment, the orientation and the position of the moving platform correspond to a particular set of leg lengths. Since the orientation of the robot head should correspond to the orientation of the human head (EMOTIV headset), the orientation is given by the fusion rotor \( {Q}_{f} \). The joints of the base and the moving platforms are arranged at points \( A (A_{1},...,A_{6}) \) and \( B (B_{1},...,B_{6}) \). Joints \( A_{i} \) are attached to the base and joints \( B_{i} \) are attached to the moving platform of the robot (Fig. 1). In order to obtain the leg length, the coordinates of the points \( B_{i}, (i=1,2,...,6) \) with respect to the base coordinate system need to be derived, i.e.,

$$\begin{aligned} {{{\textbf {B}}}_{i}} = Q_{f}\; ^{M}{{\textbf {B}}}_{i}\; {\tilde{Q}}_{f} + {{\textbf {O}}}{{\textbf {O}}}_{1}, \end{aligned}$$
(22)

where \( {{{\textbf {B}}}_{i}}\) is the position vector of points \( B_{i} , (i=1,...,6)\) with respect to the base coordinate system; \( ^{M}{{\textbf {B}}}_{i} \) is the position vector of point \( B_{i} , (i=1,...,6)\) given in the coordinate system \( O_{1}X_{1}Y_{1}Z_{1} \) attached to the moving platform; \( Q_{f} \) is the fusion rotor, obtained in Eq. (20) and \( {\tilde{Q}}_{f} \) is its reverse; \( {{\textbf {O}}}{{\textbf {O}}}_{1} \) is the vector connecting the origins of the coordinate system attached to the base and the coordinate system attached to the moving platform, respectively.

Then, the leg lengths are as follows:

$$\begin{aligned} L_{i}=\left| {{\textbf {B}}}_{i}-{{\textbf {A}}}_{i}\right| , (i=1,2,...,6), \end{aligned}$$
(23)

where \( {{\textbf {B}}}_{i}={O}{B}_{i} \) and \( {{\textbf {A}}}_{i}={O}{A}_{i} \) are vectors expressed in the base coordinate system OXYZ.

Linear actuators are used to vary the lengths of the legs. Thus, the desired lengths of the actuators which correspond to the given position and orientation of the moving platform are achieved by controlling the six motors. Following this algorithm, the robot head will reproduce the captured motion of the human (actor) head.
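
A compact sketch of this inverse-kinematics step is shown below (illustrative only): the joint coordinates \( A_{i} \), \( ^{M}B_{i} \) and the offset \( {{\textbf {O}}}{{\textbf {O}}}_{1} \) are placeholders for the actual EmoSan geometry, and the rotate helper is the sandwich product used in the earlier sketches.

```python
import numpy as np

def rotate(rotor, v):
    """Sandwich product Q v Q~ via the quaternion isomorphism (see Sect. 3.1)."""
    s, q = rotor[0], -np.asarray(rotor[1:])
    v = np.asarray(v, dtype=float)
    return v + 2.0 * s * np.cross(q, v) + 2.0 * np.cross(q, np.cross(q, v))

def leg_lengths(Q_f, A, B_moving, OO1):
    """Eqs. (22)-(23): leg lengths for a given orientation of the moving platform.

    A         -- (6, 3) joint coordinates A_i in the base frame
    B_moving  -- (6, 3) joint coordinates B_i in the moving-platform frame
    OO1       -- (3,)   vector from the base origin to the platform origin
    Q_f       -- fusion rotor [q1, q2, q3, q4]
    """
    B = np.array([rotate(Q_f, b) for b in B_moving]) + OO1   # Eq. (22)
    return np.linalg.norm(B - A, axis=1)                     # Eq. (23)

# Hypothetical usage (geometry values are placeholders, not the real EmoSan dimensions):
# L = leg_lengths(Q_f, A, B_moving, OO1)
# The resulting extensions would then be checked against the 50 mm actuator stroke (Sect. 4.1).
```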

The captured basic emotions are transferred to the robot by controlling the motors for the eyebrows and the screen for the mouth.

4.1 Experimental Protocol and Illustration of Captured Emotion Motion

The experimental protocol for classifying movements expressing emotions, extracted from data records taken from motion and expression tracking devices, has been approved by the Ethics Committee for Scientific Research, and informed consent forms have been signed. The experiments for testing the communication protocol based on the EMOTIV headset for tracking the movements of actors began with a calibration of the EMOTIV sensors. Then, the actor played a sequence of head and face postures for six emotions: fear, joy, surprise, sadness, disgust and anger. We recorded data from two actors, one female and one male (Fig. 12), during the performance of the six above-mentioned emotions. Each emotion was repeated 3 times, with a duration of approximately 30 seconds. Since the EMOTIV EPOC sampling rate is 64 samples per second, we collected 36 files with approximately 1950 records in each file. The recorded facial movements were: isBlink, isLeftWink, isRightWink, isEyesOpen, isLookingUp, isLookingDown, leftEye, rightEye, eyebrowExtent, smileExtent, clenchExtent.

The data for all six emotions performed by the two actors have been collected and processed in order to synthesise the control signals for the robot. In this section, the obtained results are presented for the “surprise” emotion since the data from the remaining five emotions have been processed in a similar way. The output data from the inertial measurement unit of the EMOTIV device for the “surprise” emotion are shown in Figs. 13 and 14.

Fig. 12 Arrangements of the coordinate systems of the actor (device) and the robot

Fig. 13 Output data from the EMOTIV’s gyroscope for “surprise” emotion

Fig. 14 Output data from the EMOTIV’s accelerometer for “surprise” emotion

The data from the EMOTIV device are processed using the algorithm presented in Sect. 3. The components of the obtained fusion rotor for the “surprise” emotion are shown in Fig. 15. Then, as mentioned above, only for the sake of illustration and better understanding of the movements, the angles of rotation around the XYZ axes are shown in Fig. 16.

Fig. 15 Components of the fusion rotor resulting from the “surprise” emotion of the human head

Fig. 16 Angles of rotation around three axes resulting from the “surprise” emotion of the human head

During the motion capture the actors were not restrained in their motions; only general guidance was given to keep the movements within certain boundaries. For this reason, the captured motion sometimes exceeds the capability of the robot, i.e., the calculated leg extensions are larger than the real leg design allows. In our case, the maximal extension of all six legs is 50 mm. Figure 17 illustrates the leg extensions for the case of realizing the captured “surprise” emotion. It is clear that in some parts of the motion the calculated leg extensions exceed the boundaries of the real actuator. This happens only at some extreme rotations and in practice does not alter the captured emotion. We have also transformed the captured motion into an animated one using the computer image of the robot (Fig. 1a) as an alternative to the real robot motion; in this case, the limitation of the leg extensions does not apply.

Fig. 17 Robot leg extensions for the “surprise” emotion of the human head

Fig. 18 A sequence of the captured movement of the actor, the corresponding angles of rotation and the corresponding robot poses (computer model) for the first five seconds of the “surprise” emotion

A sequence of the captured “surprise” emotion, the corresponding angles of rotation and the respective robot poses are shown in Fig. 18. The sequence covers the first five seconds of the “surprise” motion and corresponds to the angles of rotation presented in Fig. 16 (or, respectively, the fusion rotor from Fig. 15). The positions of the actor’s head and the robot poses are illustrated at the initial position and at the \( 1\mathrm{st}, 2\mathrm{nd},..., 5\mathrm{th} \) second.

The same sequence of the captured movement, illustrating the poses of the computer model of the robot and the real laboratory prototype of the social robot, is shown in Fig. 19.

Fig. 19 A sequence of the captured movement for the computer model of the robot, the corresponding angles of rotation and the corresponding poses of the laboratory robot prototype for the first five seconds of the “surprise” emotion

4.2 IoT Framework for Creating Human-Robot Control Based on Node-RED “Wiring” of the EMOTIV BCI and the Social Robot

Although the EmoSan robot could be controlled as a stand-alone device, we have developed a control framework which includes Node-RED, the EMOTIV BCI and the social robot EmoSan. Enhancing the robot performance with the actor’s talent imposes technical challenges concerning the infrastructure: how to access a large amount of processing power and data to support the operations in accordance with the notion of ubiquity. We exploit the idea behind the Internet of Things (IoT), which combines people, processes, devices and technologies with sensors and actuators. Thus, all sensing, computation and memory are integrated into a single stand-alone system and can aid socially assistive robots. Node-RED [22] is an open source development tool built by IBM, which allows IoT devices to be wired up as nodes in flows. Node-RED is built on Node.js and can run anywhere capable of hosting Node.js, such as small single-board computers (e.g. Raspberry Pi), personal laptops or cloud environments (e.g. IBM Cloud). The Node-RED connectivity allows nodes to collect and exchange data ubiquitously, and its flow-based programming is an ideal solution for wiring biological behavioural or emotional intelligence to robots anytime and anywhere. Based on the idea behind IoT that uniquely addressable “things” communicate with each other and transfer data over existing network protocols, an approach for using the information channel between the human brain and external devices is proposed. It can be applied to IoT brain-to-robot control through information extracted from the inertial or EEG sensors and used to control the robot motion. The control tasks aim to translate specific behavioural activity interaction patterns into robot commands. In this study, we illustrate a non-traditional control method where the head or facial muscle activity captured by an EMOTIV brain-listening headset is used to control the emotionally-expressive robot EmoSan. Using Node-RED within this framework also allows playing sound files through the built-in robot loudspeakers. An example of the Node-RED flow designed for EMOTIV motion capture, calculations and transfer of a signal to the robot is shown in Fig. 20. Thus, the captured sound of the emotions can be played simultaneously with the robot motion. Therefore, the robot can express emotions through body postures, verbal messages, intonation and facial expression.

Fig. 20 Node-RED flow for EMOTIV motion capture and transferring a signal to the robot

Using the “Facial Expressions” node provided by the EMOTIV Node-RED toolbox [23], it is also possible to detect facial expressions within the Node-RED platform (Fig. 21).

Fig. 21 Facial Expressions detection within the Node-RED platform

5 Discussion

The EmoSan robot can fully reproduce the motion of the human head. The robot has six degrees of freedom and simulates the bending of the human neck, i.e., it can reproduce the movements of the human head. The proposed motion capture algorithm allows the robot to provide a faithful replication of the captured motion, which is verified by the computer animation and by comparing the position of the actor’s head with the pose of the robot (or of the animated robot). Certainly, there are movement limitations of the robot prototype resulting from the mechanical restrictions of the robot components, e.g. limitations of the actuators’ speed and of the range of the actuators’ extensions; some of these limitations are shown in Sect. 4. However, these boundaries do not apply to the computer-animated robot.

In this paper we proposed a first step toward incorporating the benefits of educational theatre into social robots, in support of emotional intelligence and learning skills for children. We were guided by findings from neuroscience on how emotional impact and embodied learning enhance memory, cognition and behaviour. When the body is involved in robotic intervention scenarios, the process of “learning by experience” is transformed into more stable memory and cognitive representations, because the notion of the body includes not only the body itself but also the senses, the mind and the brain [45]. Neuroscience explains the mental processes involved in development, plasticity, learning, memory, cognition and behaviour [46]. Learning means that one has created a memory trace strong enough to be retained, and adding more modalities, such as touch, audio, manipulation of 3D objects and movements, to the robotic intervention might strengthen the learners’ memory traces. The educational theatre and the actor’s talent aid, neurologically, the interaction between the emotion and attention systems in the brain, resulting in the emotional association of information from short-term memory to long-term memory. A neuroscientific explanation of the influences of emotion on learning and memory [10] states that the amygdala and the prefrontal cortex cooperate with the medial temporal lobe in an integrated manner, allowing the amygdala to modulate memory association and the prefrontal cortex to mediate memory encoding and formation. Thus, emotions can be used for modulating the selectivity of attention, as well as for motivating action and behaviour. A more detailed explanation of how cognition and emotion are integrated in the brain, how emotions enhance perception and attention, and the anatomical basis for cognitive-emotional interactions can be found in [47]. A new explanation of how emotional events easily capture our attention is given in [11]: a novel pathway from the amygdala targets the thalamic reticular nucleus, a key node in the brain’s attentional network located on the upper surface of the temporal lobe. This amygdalar pathway forms unusual synapses with larger and more efficient terminals than the pathways from the orbitofrontal cortex. Recent studies report the role of amygdala-frontal connectivity in the regulation of emotions [13, 48]. Successful control of affect depends on the capacity to modulate negative emotional responses through cognitive strategies such as cognitive reappraisal. These strategies involve frontal cortical regions in the modulation of amygdala reactivity, which is important for the child’s regulation of emotions.

In the frame of the ongoing CybSPEED project, we have been conducting longitudinal experiments on speech and language therapy assisted by the proposed EmoSan and the social NAO robots. The study has been conducted at the Centre of Logopaedics, part of South-West University “Neofit Rilski”, Bulgaria. Fifteen children with neurodevelopmental disorders, aged 3 to 10 (M=5.06, SD=2.25), have taken part in the study. The therapy of the children with neurodevelopmental disorders has been performed via play-learning activities within a robot-assisted speech and language session. Every game begins with an introductory sentence, “EmoSan does not know X (e.g. fruits) and it makes him unhappy.”, and the emotion-expressive robot EmoSan performs a sad face and head movements. During the game the EmoSan robot stays neutral; however, when the child’s choice in a task is correct, EmoSan performs a happy emotion. At the end of the game NAO concludes positively, “Well done!”, and then the emotion-expressive robot EmoSan performs the happy emotion and says, “I’m happy!”. The robot-assisted speech and language sessions have been observed by the children’s parents. Descriptive statistics of the data collected from a twenty-question survey have revealed that children seem impressed, motivated and engaged when the robot EmoSan assists the therapy session. Both parents and children have a positive attitude towards the robot. The parents claim that the children have been emotionally impressed by EmoSan and spoke a lot about it at home, repeating its faces and movements and pronouncing the robot’s words in an artistic way. The second question of the multiple-choice survey concerns the child’s preference for a communication partner: 93% of the children chose the speech and language therapist as a communication partner, 53% preferred the emotion-expressive robot EmoSan and 47% voted for the social robot NAO.

In our future work it is envisaged that the special educator will explain how the child should change his/her emotional response and, by monitoring the child’s behaviour physiologically or visually, will assess the progress in reinterpreting the meaning of the situation. Another, more intuitive form of emotional regulation is neurofeedback: a type of biofeedback that uses real-time displays of brain activity (most commonly EEG). Neurofeedback rehabilitation is effective for training attention or emotional self-regulation of brain function. In the context of the current project, we place the child’s neurofeedback in the loop of the play-like interventions and implement the “modal model” of emotion in the safe world of robots. The child’s neurofeedback is displayed on humanoid or non-humanoid robots. Under the supervision of the special educator and by means of the mediating robot, the child pays attention to his/her own emotions and understands what he/she feels and how to respond.

6 Conclusions

A framework for capturing an actor’s head motion and facial expressions representing six different emotions and transferring them to the robot has been proposed. A novel algorithm for processing the data from the gyroscope and accelerometer, based on geometric algebra, has been introduced. The obtained data are graphically illustrated, and the excess captured motion, which is beyond the robot design capabilities, is analysed and indicated. It turns out that this excess motion is small and can be neglected for practical purposes. A control framework including Node-RED, the EMOTIV BCI and the social robot has been developed. The results show that the robot can be successfully driven by the captured motion and facial expressions in order to control head, eye and lip movements for expressing emotions. Experiments in speech and language therapy involving children with neurodevelopmental disorders have been conducted; they show that both parents and children have a positive attitude towards the emotion-expressive robot EmoSan and its performance of emotions.