Words of encouragement: how praise delivered by a social robot changes children’s mindset for learning

This paper describes a longitudinal study in which children could interact unsupervised and at their own initiative with a fully autonomous computer aided learning (CAL) system situated in their classroom. The focus of this study was to investigate how the mindset of children is affected when effort-related praise is delivered through a social robot. We deployed two versions: a CAL system that delivered praise through headphones only, and an otherwise identical CAL system that was extended with a social robot to deliver the praise. A total of 44 children interacted repeatedly with the CAL system in two consecutive learning tasks over the course of approximately four months. Overall, the results show that the participating children experienced a significant change in mindset. The effort-related praise that was delivered by a social robot seemed to have had a positive effect on children’s mindset, compared to the regular CAL system where we did not see a significant effect.


Introduction
Since the emergence of personal computers, researchers and educators have recognised the potential of Computer Aided Learning (CAL) systems to support children in their education [35]. Although early CAL systems relied mainly on text-based interactions, modern learning environments have offered advanced graphical or physical interfaces which offer richer and more elaborate forms of interacting with the learning system. Various forms of such advanced CAL systems have been investigated, each focusing on supporting distinct aspects of students' learning processes. For instance, some CAL systems aim to support mastery of knowledge and skill through drill-and-practice style content, or focus on offering direct tutoring and instruction, while others adopt a constructivist approach using inquiry based learning techniques. It is this latter category of CAL systems that we further investigate in this paper and in the context of the European Commission funded EASEL project [9,52].
CAL systems are increasingly being extended with social capabilities. In such cases, the system often utilises a social agent (either virtual or robotic) to support the learning process. Through social interactions with the learner, the agent can offer different and richer forms of support, which would be difficult to achieve with a strictly non-social CAL system. We are interested in exploring ways in which extending a typical inquiry based CAL system with a social robot can have a meaningful impact on children's education: How can we leverage the robot's inherently social nature to support a better learning experience?
In a previous study we investigated verbalisation behaviour of children who were prompted to explain their thought process to an interactive CAL system that was extended with a social robot [62]. There, the role of the robot was to prompt for more detailed (verbal) explanations while participants worked on an inquiry learning task. We considered such prompting for explanations to be social acts of the system and argued that in these situations a social entity like a robot is more capable at eliciting a favourable response. Results from that study showed that children gave more detailed and more relevant explanations when the CAL system was extended with a social robot through which to deliver the prompts.
Inspired by the findings from our previous study we investigated other social aspects of learning where robots might have opportunities to enhance the delivery of the social acts of the CAL system. The social act of giving praise is one such instance where we saw opportunities for a social robot to play a meaningful role. Praise has long been recognised as an important social mechanism that can be used to support a learner. Praising the learner's process, abilities, and achievements has been shown to influence their motivation, performance, and self-esteem, among others [23,28,29,45]. In a later section of this paper we give a more detailed account of the relationships between learning and praise.
We conducted a long-term, unsupervised, in-the-wild study where we investigated effects of praise by a CAL system, delivered through a social robot, on children's attitudes towards learning. The primary results that focus on the effects of praise delivered by a robot are reported in the paper at hand, while the process of designing a long-term interaction and the results of deploying such a setup in classes over an extended period of time are discussed in [15]. This paper is structured as follows. We introduce the concept of a social robot and discuss the related pedagogic theories on praise and mindset in Sects. 2 and 3, respectively, followed by the aims and objectives in Sect. 4. In Sect. 5 we present the technical setup and discuss the design of the system's multimodal interactions. The study methodology is discussed in Sect. 6, which covers the experimental design, measures, and procedures. Results are presented in Sect. 8 and further discussed in Sect. 9. Finally, in Sect. 10 we draw the main conclusions.

Computer aided learning and social robots
The goal of instruction and teaching is to promote learning. Since we invest much time, money, and effort in good education it is worthwhile to understand what learning is: "Learning refers to lasting changes in the learner's knowledge where such changes are due to experience. Thus, learning is defined as a relatively permanent change in someone's knowledge based on the person's experience." [43, p. 7]. Among many other tools, educators may use Computer Aided Learning (CAL) systems to foster this change in knowledge by offering the learner an interactive learning environment.
To support the learner, CAL systems may for example present background information about the topic at hand, provide the learner with templates or step-by-step instructions, or constrain the learner's interactions with the learning environment to reduce variables in a problem space [36,58,59]. Additionally, such systems may, for example, monitor and structure the learning process to offer adequate advice and feedback [63]. Modern CAL systems may further personalise and adapt the learning experience to match an individual's characteristics, performance, and personal development [51].
CAL systems are sometimes extended with pedagogical agents to add a social dimension to the learning experience [22,27]. These are social agents that perceive and act in a social context, and by communicating with the user they aim to support the learning process. Among other things, pedagogical agents can improve learners' self-efficacy (e.g. [1]), reduce anxiety (e.g. [2]), and offer motivational scaffolds (e.g. [57]).
Social robots are increasingly used as pedagogical agents to support learning, and are applied in many situations where social interactions play a role [4,38]. Being physically embodied and physically present bestows robots with multimodal interaction capabilities that set them apart from virtual agents, playing a role in how users interact with them socially [41,60]. For example, through their physically embodied nature robots have been shown to improve social presence (e.g. [37]), turn-taking (e.g. [34]), and learning gains (e.g. [40]).
Aspects of robots' social capabilities are often used to enhance the learning of children in various educational domains, showing promising results throughout. For instance, robots are becoming popular tools in language learning [8]; the social nature of the robot can be used in storytelling situations (e.g. [31,33]) where it may help modulate the child's affective state [21]. Additionally, robots are used to help children with their handwriting; these robots evoke the learning-by-teaching paradigm (e.g. [24,39]) where they achieve success by posing as a social peer that is capable of learning [7]. Robots have also been shown to impact children's problem-solving skills (e.g. [10]) and robot peers have supported children's self-regulation of medical conditions such as diabetes (e.g. [3,13]), where the robot can be provided with a relatable background story to enable more natural colearning paradigms.
Zeno, the robot used in this study, is a small humanoid robot with an expressive face (see Fig. 1). Besides applications in typical primary education, the Zeno robot has also been used in therapeutic settings involving children with autism spectrum disorder. In those settings, the robot is able to elicit spontaneous exploratory child-robot interactions and facilitate child-adult interactions [55]. Furthermore, this robot was used to design interactive experiences for children with autism where they can practice facial expressions [11,42]. Although it can sometimes be hard to recognise which emotion is expressed by robots, work from Chevalier et al. [12] has shown that the availability of Zeno's facial features plays an important role in successfully expressing its emotions. In general, they showed that individuals with autism, as well as typically developing individuals, could more easily recognise emotions that were expressed through a combination of body posture and facial features. Schadenberg et al. [54] suggest that in some cases the recognition of the robot's emotions can be further improved through multimodal non-verbal affect bursts, such as laughter or sobbing, although, obviously, this can only be done when such affect bursts are appropriate to the interaction.
Fig. 1 The robot used in this study: Robokind's Zeno R25

Praise and mindset
In our work we use the robot's social capabilities in combination with offering praise to influence how children learn. One of the factors that plays an important role in the motivation for learning and thinking in school settings is a learner's mindset [6]. Dweck [19] describes two forms of mindset: (1) a fixed mindset is characterised by the belief that you are born with a certain capacity and that you cannot influence your capacity very much; (2) a growth mindset is characterised by the belief that you can improve your capabilities and expertise through perseverance and effort, and that failure is an inherent part of learning.
On the one hand, Mueller and Dweck [45] and Dweck [19] show that people with a fixed mindset tend to focus on proving their intelligence: they want to look smart. Because of this, they are reluctant to take on tasks that are challenging or hard, because there is a chance of failure, which conflicts with their goal of looking smart. In their belief, failing at something is an indication that you are not smart enough or lack the necessary capabilities; when you fail it is an indication that you will not be able to complete a certain challenge, so it is better to give up and try something easier. Furthermore, people with a fixed mindset see effort as something negative. If you have talent or are gifted then it isn't necessary to struggle with a task. Viewed from such a perspective effort is for people who lack talent.
On the other hand, people with a growth mindset tend to focus on learning. They are motivated to do new and complicated tasks because it provides opportunities for learning. Their goal is to learn and to develop themselves by working hard and by putting in a lot of effort. They see failure as something that is necessary for learning. So after failing, people with a growth mindset tend to work harder and try out new strategies in order to complete a challenge or master a skill [19,20,45]. A growth mindset is seen as a favourable trait when it comes to exploring new learning domains and developing new skills. Therefore, we focused on promoting such a growth mindset.
Praise and criticism have an important influence on the development of growth and fixed mindsets [23,28,45]. For praise to have a positive impact it is important that it is perceived as contingent, specific, sincere and credible [46,48]. Praise for high ability and personal traits is a common response when someone did a job well. Whether it is in the classroom, when playing sport, or during artistic endeavours, praise for ability is seen as a popular tool to stimulate learners' motivation [29,45]. However, focusing on primarily praising high ability may have an undesired impact. It can make children feel pressured to perform well in future situations, which stimulates a fixed mindset. An alternative for praising ability is praising effort. Instead of praising one's goal to seem smart, effort-related praise focuses on the process of learning or mastering of a certain skill [47]. This form of praise stimulates a growth mindset, since the emphasis is on the process of learning instead of the end result [45].
In related work involving robots and mindset, Park et al. [49] have shown that a peer-like robot can promote a growth mindset in children. The robot and child took turns solving puzzle tasks, during which the robot either exhibited neutral behaviours or role model behaviours associated with a growth mindset. Their robot used a multimodal behaviour repertoire consisting of speech, nonverbal expressions, and gaze. Depending on the condition, the robot would use neutral factual statements or mindset-related statements accompanied by an appropriate body posture and facial expression (e.g. engagement, interest, excitement, or frustration). Afterwards, children who had worked with the role model robot exhibited a stronger growth mindset themselves. In our study, in contrast with Park et al. [49], we were interested in investigating the role of praise instead of role modelling behaviour.

Aims and objectives
This paper aims to gain deeper insights into how children respond to effort-related praise while working on a learning task. Offering praise to a learner can be seen as a social act of the CAL system. We are interested in exploring ways in which a social robot may be used to extend a traditional CAL system to deliver such social acts in a natural and convincing way. In situations where a child works together with a peer learner robot, we consider praise to be an appropriate and positive way of supporting learning, more so than criticism (even if constructive). Therefore, we chose to primarily focus on promoting attitudes associated with a growth mindset as opposed to dissuading attitudes related to a fixed mindset.
This leads to the following research question: what are the effects on the mindset of children when extending an autonomous CAL system with a social robot to deliver effort-related praise?
We expect that the social act of giving effort-related praise has a greater potential impact when it is delivered by a robot as opposed to a traditional CAL system, since by its very nature a robot can be more convincingly presented as a social entity. A robot has an elaborate repertoire of social cues to engage with the learner, such as focus of attention, facial expressions, and deictic gaze. Therefore the hypothesis in this study was: participants who work with a CAL system that delivers praise through a social robot will display a stronger growth in their mindset than participants who work with a CAL system that delivers such praise without a social robot.
Additionally, children from a different school worked with a baseline version of the interactive CAL system, without robot, which offered no such effort-related praise. Although for these children we found no significant effects on their mindset, for the sake of completeness we also report these results of the baseline CAL system as part of this paper.

Design of system and multimodal interactions
The CAL system was based on a technical architecture that was developed as part of the EASEL project [52]. This base architecture was adapted and further extended to support fully autonomous, unsupervised, long-term interactions in the wild. An early prototype of this system using interactive embodied learning materials has been previously described in [14]. The system consisted of the following main components: embodied learning materials, sensors, headphones, a tablet, a robot, and a control computer. The embodied learning materials were based on inquiry learning instruments originally developed by Inhelder and Piaget [26] to support the exploration of several phenomena from the physics domain. Firstly, in the balance scale task children explored the 'moment of force' by placing differently weighted pots on a scale at various distances to the left and to the right of the central pivot point. The balance task is shown in Fig. 2. Secondly, in the ramp task children explored 'potential energy' and 'rolling resistance' by racing various balls of different materials, sizes, and weights from two adjustable sloped ramps. The ramp task is shown in Fig. 3.
Several different types of sensors were used to enable fully autonomous unsupervised interactions. Firstly, embedded sensors in the learning tasks recorded the state of the materials and the child's actions. The balance scale task used a potentiometer in the central pivot to measure the tilt of the balance, it used different resistors in each pot to measure their placement on the various locations, and it used reed switches to detect whether the stabilising blocks were present under each side of the balance. The ramp task used potentiometers to measure the angle of each slope, it had a physical button for releasing the balls, and it used a photoresistor to measure when a ball had reached the end of the track. Secondly, an external Microsoft Kinect 1 depth-camera was used with the SceneAnalyser software [65] to detect the presence of a child and the location of their face. Finally, an RFID scanner was used to recognise individual children; we handed out unique RFID badges that children scanned at the start of each session.
All control software ran on a small desktop computer, which was hidden out of sight of the children. This computer was responsible for collecting and interpreting the sensor values and generating responses from the system. An early version of Flipper 2.0 [61] was used to define the CAL system's dialogue models, and to manage the flow of the interaction. Behaviours of the CAL system were specified in Behaviour Markup Language (BML) [30] and executed by ASAPRealizer [53]. ASAPRealizer is an engine for choreographing synchronised multimodal behaviours across devices, modalities, and platforms. It ensures that verbal utterances, tablet interface updates, and robot movements remain synchronised throughout the interaction. As shown in Fig. 7, the core BML specification was extended with robot-specific and tablet-specific behaviours that could not be expressed in standard BML: gazing towards an (x,y) location and showing text, images, and buttons on the screen.
The system supported several forms of multimodal inputs and outputs. A Samsung Galaxy Tab2 tablet was used as the primary means of direct input to the CAL system: children could input their responses to questions by pressing buttons on the tablet interface. The various sensors in the task provided another means of input to the system, enabling it to respond dynamically while children worked with the task.
Task instructions were delivered verbally and displayed on the tablet as written text and illustrations (shown in Fig. 4). Since the study took place in a real classroom during school time, teachers had requested that interacting with the system should not interrupt or disturb regular lessons. Such requests are common when doing a study in class [32]. Therefore, all verbal utterances produced by the system and robot were played through headphones. The system used the Fluency 2 text-to-speech engine to generate Dutch speech.
Robokind's Zeno R25 robot was used in the Robot condition to deliver the CAL system's verbal feedback and praise. This small humanoid robot has a face that can express basic emotions. Furthermore, it has several degrees of freedom in its eyes, neck, and torso with which it can perform approximate gaze shifts. Unfortunately, the servos in the robot's arms and legs are relatively noisy, which disqualified them from being used in the classroom.
The robot's multimodal behaviours in this study were informed by design guidelines emerging from an extensive contextual analysis of inquiry learning tasks with our target user group [16]. The robot used expressions for smiling when addressing the child and delivering praise, and amazement when the child performed the experimentation step of the task (for example, see Fig. 5). Furthermore, it gazed dynamically towards the user, the tablet, and relevant areas of the task depending on the child's actions and progress (for example, see Fig. 6). The robot used lip synchronisation to match the verbal utterances produced by the text-to-speech engine. Minimal idle behaviour was added through periodic eye blinking. An example of a multimodal BML behaviour script is shown in Fig. 7. The timing of speech, gaze, facial expressions, and content displayed on the tablet is synchronised to produce a coherent behaviour sequence. Placeholder variables, like the assignment details and the location of the
Fig. 4 Assignment instructions displayed on the tablet during the preparation phase for the balance and ramp tasks. Each assignment had both a textual and a visual description. Children could press the bottom left button to have the assignment text read out loud. By pressing the bottom right button, children continued to the next assignment phase: prediction. Translated text: "First, put the blocks back under the balance. Then, place a yellow pot on pin 1 and a red pot on pin 6. What do you think will happen with the balance?"
(a) Gazing towards the user. (b) Gazing towards a part of the task. The gaze direction and the appropriate moment for gazing was determined using embedded sensors in the task (here, the balance had just tilted to one side).

Study design
The main study was a between-group design with one independent variable: the presence or absence of a social robot to deliver the system's effort-related praise. The dependent variable was the child's mindset, measured through a pretest and posttest questionnaire.

Conditions
We manipulated whether the CAL system's praise was delivered by a robot or not. This resulted in the following two conditions:
- No-Robot: a traditional CAL system that offered effort-related praise
- Robot: a CAL system extended with a robot to deliver the effort-related praise
Over the course of several months children worked individually on two consecutive learning tasks (see Figs. 2, 3) that consisted of assignments of increasing difficulty. In both conditions, the CAL system offered identical task-related instructions, help, feedback, and effort-related praise.

Participants
Both conditions were run in parallel, in one classroom at each of two district locations of the same Montessori school. Both district locations were located in similar suburbs of the same city. Participants were 44 children aged 6 to 10 years, as described in Table 1. In parallel, the baseline CAL system without robot and without praise was tested with 17 children in a classroom of a Freinet school in the same city.
Ethical approval was obtained from the EEMCS ethical board of the University of Twente and parents signed an informed consent letter prior to the start of the study.

Manipulations: delivery of praise
Both conditions used the same state-of-the-art CAL system which offered verbal task-related instructions, task-related help, task-related feedback, and effort-related praise using a computer-generated voice. To avoid disturbing or distracting other children in the classroom, participants always used headphones to listen to the system's verbal utterances.
In the No-Robot condition, children worked with the CAL system without a robot, and the praise was delivered only through the headphones. In the Robot condition, the CAL system was extended with a social robot to deliver the same effort-related praise. The robot would gaze towards the user and show a smiling facial expression while verbally delivering the praise. Both the No-Robot and Robot conditions offered identical praise using the same computer-generated voice, played through the same headphones.
The system offered such praise at several moments during the assignments. Firstly, after a child completed the experimentation phase of an assignment and had entered their observation the system would offer a compliment on their progress, such as "I think you put in a lot of effort!" or "I see you tried your best!".
Secondly, after the conclusion phase, the system would ask them how they felt about whether their hypothesis was correct or incorrect. For example, if their hypothesis was incorrect, they could select either "I think I am not good at this" or "I think I can learn this" on the tablet. After selecting the former, the system would respond with "We can learn from a mistake!". After selecting the latter, the system would respond with "I think so too!".
Finally, after each completed assignment the system asked the child to rate how difficult they found the task, after which they could choose the difficulty of the next task. At this point, the system would generate appropriate feedback and praise to promote a growth mindset. The praise that was given by the system depended on three aspects: (1) whether or not the child gave a correct hypothesis; (2) the self-reported assignment difficulty; and (3) the subsequent selected level of difficulty. Based on these aspects the system labelled the child's attitude at that point in time as either performance-driven or mastery-driven.
On the one hand, a performance-driven attitude is characterised by wanting to demonstrate competence by avoiding mistakes, thus often shunning difficult or unknown challenges. Individuals with a fixed mindset often exhibit performance-driven attitudes. When a child exhibited such performance-driven behaviours the system offered feedback to help promote a growth mindset. For example, when a child made a correct prediction, indicated that they found the assignment easy, yet still chose an easier next assignment, this was labelled as performance-driven. In this case, the system would highlight the importance of seeking an adequate challenge by giving the feedback "Gosh, I would have expected you to choose a more difficult task, because then you can learn more" or "It is fine if you want to practice more, but if you want to learn something new we can try a more difficult task".
On the other hand, learners with a mastery-driven attitude focus on improving their skills and learning process through practice, thus often embracing more difficult challenges. Such attitudes are often associated with individuals who lean towards a growth mindset. When a child exhibited such mastery-driven behaviours the system offered feedback to further strengthen their growth mindset. For example, when a child made an incorrect prediction, indicated that the difficulty was okay, and chose the same difficulty for their next assignment, the system labelled this as a mastery-driven attitude. In this case, the system offered encouragement to emphasise the importance of practising: "We didn't get it yet this time, let's practice some more and we can learn!". Similarly, if a child made a correct prediction, indicated that the assignment was easy, and chose a harder next exercise, the system would praise the child to emphasise the importance of seeking a challenge: "Great, you are choosing a challenge, I like that!" or "Great, during a more difficult task we might learn something new!"
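The labelling step described above can be illustrated with a minimal sketch. It encodes only the three example combinations given in the text; the complete rule set used by the system is not reported here, so every other combination returns None. The names AssignmentOutcome, DOCUMENTED_CASES, and label_attitude, as well as the string labels for difficulty and next choice, are assumptions made for illustration and are not taken from the system's implementation.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class AssignmentOutcome(NamedTuple):
    """The three aspects the praise depended on (field names are hypothetical)."""
    correct_prediction: bool   # aspect 1: was the child's hypothesis correct?
    reported_difficulty: str   # aspect 2: "easy", "okay", or "hard" (assumed labels)
    next_choice: str           # aspect 3: "easier", "same", or "harder" (assumed labels)


# Only the three combinations spelled out in the text are encoded here.
# Each maps to (attitude label, one example utterance quoted from the text).
DOCUMENTED_CASES: Dict[AssignmentOutcome, Tuple[str, str]] = {
    AssignmentOutcome(True, "easy", "easier"): (
        "performance-driven",
        "Gosh, I would have expected you to choose a more difficult task, "
        "because then you can learn more",
    ),
    AssignmentOutcome(False, "okay", "same"): (
        "mastery-driven",
        "We didn't get it yet this time, let's practice some more and we can learn!",
    ),
    AssignmentOutcome(True, "easy", "harder"): (
        "mastery-driven",
        "Great, you are choosing a challenge, I like that!",
    ),
}


def label_attitude(outcome: AssignmentOutcome) -> Optional[Tuple[str, str]]:
    """Return (label, feedback) for a documented case, or None if the text gives no rule."""
    return DOCUMENTED_CASES.get(outcome)


if __name__ == "__main__":
    print(label_attitude(AssignmentOutcome(True, "easy", "easier")))
```

Returning None for undocumented combinations keeps the sketch from guessing at rules that are not stated in the text.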

Measures
The main research question in this study focused on affecting the mindset of children. In particular, we were interested in promoting a growth mindset through effort-related praise. Similar to Park et al. [49], pretest and posttest questionnaires were used to measure a change in the children's mindset as a result of the intervention. The 18-item questionnaire used in this study was inspired by the questionnaire designed by De Castella and Byrne [17] who revised the implicit theories of intelligence scale designed by Dweck [18]. Colleagues from the ELAN group of the faculty of Behavioural, Management and Social Sciences of the University of Twente used a part of the questionnaire from De Castella and Byrne [17] to design a version that is suitable for young children.
In contrast with the questionnaire presented by De Castella and Byrne [17], the concept "intelligence" was replaced by "smart", as pilot tests showed that this was better understood by very young children. The following (translated) definition of smart was given to the children: "Smart means that you are well able to consider, think up, and thresh out/figure out." The questionnaire consisted of items that fall under two main constructs: items measuring a growth mindset and items measuring a fixed mindset. Furthermore, additional questions regarding effort were added to the questionnaire used in this study, since beliefs about effort are related to mindset.
Children could provide their answers according to a 4-point Likert scale with the following options: strongly agree, somewhat agree, somewhat disagree, and strongly disagree. However, Likert scales are often difficult for very young children because they tend to think more dichotomously and have a tendency to endorse responses at the extreme end of the presented scales. This can especially be the case if the statements are related to more 'fuzzy' subjects such as feelings, beliefs or attitudes [44]. Park et al. [49] addressed this by offering children sets of bipolar statements from which to choose. However, we were interested in capturing nuanced responses of children that were not necessarily on either extreme end of the spectrum.
In several pilot test iterations with our target user group, we explored different techniques for administering the questionnaire. Participants in these pilot tests were from several primary schools visiting the university during school trips, for whom signed parental consent was available. Firstly, we presented the questionnaire in the traditional fashion as a self-administered test, while giving children the option to ask the experimenter for help or additional explanation. Only some of the older children were able to complete this version of the questionnaire without any issues, as the younger children had difficulties reading and understanding the questions. Secondly, we had the experimenter read each statement of the questionnaire out loud, after which children were asked whether they strongly agreed, somewhat agreed, somewhat disagreed, or strongly disagreed with the statement. In this case, some children seemed to be unable to distinguish between the answer options or were hesitant to commit to a choice of answer. Finally, the technique which was best understood by the target group was to first have the experimenter read each statement aloud and then ask: "do you agree or disagree?" We observed that children could quite naturally answer this dichotomous question. After a child made an initial choice, the experimenter would subsequently ask: "do you strongly (dis)agree or somewhat (dis)agree?" The resulting mindset questionnaire was implemented as an interactive web form and used as a pretest and posttest. The full questionnaire is included in Appendix A.
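The two-step questioning technique can be sketched as a simple mapping from the two dichotomous answers onto the four response options. This is an illustrative sketch only: the function name two_step_response and the numeric coding in ASSUMED_SCORES (1 to 4, with 4 for strongly agree) are assumptions, since the scoring scheme is not specified in this section.

```python
def two_step_response(agrees: bool, strongly: bool) -> str:
    """Combine the answers to 'do you agree or disagree?' and
    'do you strongly or somewhat (dis)agree?' into one of four options."""
    strength = "strongly" if strongly else "somewhat"
    direction = "agree" if agrees else "disagree"
    return f"{strength} {direction}"


# Assumed numeric coding for illustration; the actual coding is not given here.
ASSUMED_SCORES = {
    "strongly disagree": 1,
    "somewhat disagree": 2,
    "somewhat agree": 3,
    "strongly agree": 4,
}

if __name__ == "__main__":
    answer = two_step_response(agrees=True, strongly=False)
    print(answer, ASSUMED_SCORES[answer])
```

Splitting the choice into two binary questions preserves the four-point resolution while only ever asking the child to choose between two alternatives.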

Procedures
Due to the long-term nature of this study, we worked together with the teachers and school management to fit our activities in with their regular school schedule as best as possible. In some cases, this meant that activities at individual schools were moved forward or delayed and that schools followed a slightly different timeline throughout the study. Additionally, the total duration of the experiment varied per school due to holidays and other events. Table 2 shows an overview of the stages of this study for each school, highlighting the timing of questionnaires and tasks.
The study took place between December 2016 and May 2017. Around two weeks before the start of the study two experimenters would do the mindset pretest with the children. The mindset posttest was done in the week after the second task ended. The experimenters would call each child one by one to a separate room in the school. The experimenter then read aloud the statements of the mindset questionnaire and noted down the answers of the children. It took children approximately 10 minutes to complete the pretest and posttest. Following the pretest, the first learning task was placed in the classrooms for approximately 6-7 weeks. During this period children initiated a total of 260 sessions and completed a total of 756 assignments. Then, the second task was placed in the classrooms for approximately 8-10 weeks, during which a total of 195 sessions were initiated and children completed a total of 550 assignments. In all three classes children progressed through the various levels of difficulty without many issues, with the majority of children achieving the highest level in each of the two tasks at some point.
More detailed procedures for each task are discussed in the following sections.

Task 1: balance
The balance task (see Fig. 2) was placed in the classroom after school hours. The next school day an experimenter gave a short explanation to all children, introducing the various components of the system: the tablet, headphones, the learning materials, and the robot. Children were instructed that the voice of the system/robot was speaking through the headphones. The experimenter then handed out the personal RFID badges and showed children how to scan their badge to initiate the interaction. The children were not instructed how often or how long they should interact with the learning task, and there was no set schedule. Instead, children were entirely free to work with the system on their own initiative, as long as it fit within the lesson schedule of the teacher.
To initiate a learning session the child would put on the headphones and scan their RFID tag. The system would then greet the child by speaking through the headphones. If it was the very first interaction, the system would greet the child by saying: "Hi [NAME], nice to meet you! Shall we play together?" If the child had already interacted with the system before, the system would say: "Hi [NAME], it's nice to see you again. How did it go last time? Do you think it was hard, easy, or was it okay?" [CHILD SELECTS ANSWER] "All right, last time we finished assignment [DIFFICULTY LEVEL] let's move on from here." After every assignment, the child would choose whether they wanted to do another assignment and whether the next one should be easier, harder, or equally difficult.
If the child indicated that they did not want to play anymore, the CAL system said goodbye and logged out the current user: "Goodbye [NAME], it was nice playing with you! See you next time!" If the child indicated that they wanted to continue, the system started a new assignment according to the chosen difficulty level. A child could complete a maximum of four assignments during each interaction. After four assignments the system said goodbye to the child and logged them out automatically. It took children approximately 10 minutes to complete four assignments.
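The session flow described above can be illustrated with a minimal sketch. It is not the software used in the study: the function names, the choice labels and the level bounds (1 to 5) are hypothetical, since the text does not state how many difficulty levels existed. Only the cap of four assignments per session is taken from the description.

```python
MAX_ASSIGNMENTS = 4  # from the text: at most four assignments per session


def next_level(level, choice, lowest=1, highest=5):
    """Return the next difficulty level for the child's choice.

    The bounds `lowest` and `highest` are hypothetical placeholders.
    """
    step = {"easier": -1, "same": 0, "harder": 1}[choice]
    return max(lowest, min(highest, level + step))


def run_session(start_level, choices):
    """Simulate one session and return the levels of all assignments done.

    `choices` lists the child's answer after each assignment:
    "easier", "same", "harder", or "stop" to end the session.
    """
    completed = [start_level]  # the first assignment resumes at the stored level
    level = start_level
    for choice in choices:
        # The system logs the child out after four assignments or on request.
        if len(completed) >= MAX_ASSIGNMENTS or choice == "stop":
            break
        level = next_level(level, choice)
        completed.append(level)
    return completed


if __name__ == "__main__":
    print(run_session(2, ["harder", "harder", "same", "harder"]))
```

In this sketch a fifth request to continue is simply ignored, which mirrors the automatic logout after four assignments.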

Task 2: ramp
Similar to the first task, the ramp (see Fig. 3) was placed in the classroom after school hours. The next school day, a researcher gave an explanation of the new task. Since at this point children were familiar with the tablet, headphones, and robot, the explanation focused on introducing the new task. In contrast with the highly predictable deterministic nature of the balance assignments, the ramp task would sometimes give unpredictable results: balls would occasionally bounce off the sides of the ramp while rolling, which would cause them to slow down. Especially when racing two otherwise identical balls, this occasionally resulted in an unexpected difference in their finishing times. Children were therefore reminded and encouraged that they could repeat the same race as many times as they wanted, in order to collect additional evidence to confirm or refute their initial observations. Children again used their RFID badge to start sessions with the system on their own initiative. The procedure, the system utterances, and praise mechanism were the same as in the first task. Depending on the difficulty level and how often children would repeat the same experiment, they spent between 5 and 15 minutes on completing four assignments.
Table 2 note: The duration mentioned for each task excludes weekends and holidays. In the No-Robot condition children worked with a CAL system that offered praise. In the Robot condition, this praise was delivered by the robot. Additionally, we tested a Baseline CAL system without a robot and without praise.

Analysis
In the scope of this paper we were interested in the effects on the mindset of children when extending an autonomous CAL system with a social robot to deliver the system's effort-related praise. A full analysis of other qualitative and quantitative results from this study will therefore be reported elsewhere; here we focus on the analysis of the mindset questionnaires.
The mindset questionnaire was used to gather pretest and posttest scores for participating children. Answers were marked according to a 4-point Likert scale with the following options: strongly disagree, somewhat disagree, somewhat agree, and strongly agree. Answers were then converted to a numerical representation by assigning the respective scores 1, 2, 3 and 4, such that a low score corresponded with disagreement and a high score corresponded with agreement.
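As an illustration of this conversion, the mapping from answer labels to scores could be sketched as follows. The names `LIKERT_SCORES` and `to_scores` are hypothetical and not part of the study's web form.

```python
# Answer labels and their numerical scores, as described in the text.
LIKERT_SCORES = {
    "strongly disagree": 1,
    "somewhat disagree": 2,
    "somewhat agree": 3,
    "strongly agree": 4,
}


def to_scores(answers):
    """Convert a list of answer labels to scores from 1 to 4.

    Labels are matched case-insensitively; an unknown label raises KeyError
    rather than being silently scored.
    """
    return [LIKERT_SCORES[answer.strip().lower()] for answer in answers]


if __name__ == "__main__":
    print(to_scores(["Strongly agree", "somewhat disagree"]))
```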
To investigate the presence of underlying constructs in the questionnaire items an Exploratory Factor Analysis (EFA) was performed using all pretest questionnaire scores from the three participating classes. Bartlett's test of sphericity shows that the data is appropriate for EFA, χ²(153) = 291, p < 0.001. We used an oblimin factor rotation and a factor loadings cutoff value of 0.4. Remaining items with a loading below 0.4 on all factors have been dropped from further analysis. Following this approach, three mutually exclusive constructs emerged from the data:
Growth. The 'growth' construct consists of items 1, 4, 8, 13 and 14 which explain 41% of the variance, with loadings ranging from 0.46 to 0.72. These items are formulated in such a way that they align with a mindset oriented towards growth. They address that one can change how smart they are, and that one can learn and do better by working hard on difficult assignments. When a participant shows high agreement with these items this corresponds with a growth mindset. A Cronbach's Alpha of 0.73 shows an acceptable internal consistency for this construct.
Fixed. The 'fixed' construct consists of items 2, 3, 5, 7 and 9 which explain 31% of the variance, with loadings ranging from 0.41 to 0.77. These items highlight one's inability to influence how smart they are. The items cite innate causes for this lack of influence. This is an attitude that is characteristic for individuals with a fixed mindset. Accordingly, when a participant shows a high agreement with these items this corresponds with a fixed mindset. A Cronbach's Alpha of 0.68 shows a questionable internal consistency for this construct.
Effort. Finally, the 'effort' construct consists of items 16 and 18 which explain the remaining 28% of the variance, with respective loadings of 0.99 and 0.43. These two items describe a preference for spending less effort, either by working less hard on difficult assignments or by choosing easier assignments, both of which are telling of a fixed mindset. When a participant shows a high agreement with these items this corresponds with a fixed mindset. A Cronbach's Alpha of 0.59 shows a poor internal consistency for this construct. Therefore, this has been dropped from further analysis. Future iterations of this questionnaire should expand the number of relevant items, and may focus on further validation of the construct as a whole.
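For readers unfamiliar with the internal consistency measure reported above, the following is a minimal sketch of the standard Cronbach's Alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It is not the analysis script of the study. The function names are hypothetical, and the verbal labels follow a commonly cited rule of thumb that is assumed here because it agrees with the labels used in the text (0.73 acceptable, 0.68 questionable, 0.59 poor).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Compute Cronbach's Alpha.

    `rows` holds one list of item scores per participant, with the items
    in the same order for every participant.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    columns = list(zip(*rows))  # one tuple of scores per item
    item_variance = sum(variance(column) for column in columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance / total_variance)


def interpret_alpha(alpha):
    """Map an alpha value to a verbal label (assumed rule of thumb)."""
    thresholds = [
        (0.9, "excellent"),
        (0.8, "good"),
        (0.7, "acceptable"),
        (0.6, "questionable"),
        (0.5, "poor"),
    ]
    for cutoff, label in thresholds:
        if alpha >= cutoff:
            return label
    return "unacceptable"


if __name__ == "__main__":
    # Made-up scores for four participants on two items, for illustration only.
    example = [[1, 2], [2, 1], [3, 4], [4, 3]]
    print(cronbach_alpha(example), interpret_alpha(0.73))
```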
Only participants who completed both the pretest and posttest were included in the data set. A score was computed for each construct by taking the average of the individual items belonging to that construct, such that a low average score corresponds to a low agreement with the statements in that construct and a high score corresponds to a high agreement.
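The scoring step can be sketched as follows, assuming that each participant's answers are stored as a mapping from item number to a score between 1 and 4. The data layout and the function names are hypothetical; the item numbers per construct are those listed above, with the 'effort' construct left out because it was dropped from further analysis.

```python
from statistics import mean

# Item numbers per construct, as reported for the factor analysis.
CONSTRUCT_ITEMS = {
    "growth": [1, 4, 8, 13, 14],
    "fixed": [2, 3, 5, 7, 9],
}


def construct_scores(item_scores):
    """Average the item scores belonging to each construct.

    `item_scores` maps an item number to a score from 1 to 4.
    """
    return {
        name: mean([item_scores[item] for item in items])
        for name, items in CONSTRUCT_ITEMS.items()
    }


def complete_cases(pretest, posttest):
    """Keep only participants who completed both questionnaires.

    Both arguments map a participant id to that participant's item scores.
    Returns a mapping from participant id to (pretest, posttest) scores.
    """
    shared_ids = sorted(set(pretest) & set(posttest))
    return {
        pid: (construct_scores(pretest[pid]), construct_scores(posttest[pid]))
        for pid in shared_ids
    }


if __name__ == "__main__":
    # Made-up answers, for illustration only.
    pre = {"c1": {i: 3 for i in range(1, 19)}, "c2": {i: 2 for i in range(1, 19)}}
    post = {"c1": {i: 4 for i in range(1, 19)}}
    print(complete_cases(pre, post))
```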

Results
A total of 17 children worked with the baseline CAL system, which we tested in a separate school in parallel with the main study. For these children we found no significant differences between their pretest (M = 3.5, SD = 0.4) and posttest (M = 3.4, SD = 0.5) scores.
Regarding our main study, a total of 24 and 20 children completed both the pretest and posttest in the No-Robot and Robot conditions, respectively. Pretest and posttest results for the growth and fixed constructs are shown in Fig. 8. We found no significant differences between conditions for the fixed construct. For the growth construct, a repeated measures ANOVA showed no significant interaction effects, but did show significant main effects of the between-subjects variable (the source of the praise; F(1, 42) = 7.23, p = 0.01) and the within-subjects variable (the pretest versus posttest scores; F(1, 42) = 4.7, p = 0.036). These results show that there was a difference between the conditions and that during the course of the study the feedback and praise led to an overall increase in growth mindset. However, these results do not necessarily show that this increase was mediated by the robot.
To further investigate the growth construct main effects, a post hoc analysis was performed with Bonferroni corrected pairwise tests (adjusted Alpha level = 0.0125). Between subjects, we found no significant difference on the pretest scores for the No-Robot (M = 3.2, SD = 0.7) and Robot (M = 3.4, SD = 0.5) conditions. However, we did find a significant difference on the posttest scores between the No-Robot and Robot conditions. These results show that the group of children who received praise from the robot saw a significant benefit to their growth mindset.
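The Bonferroni adjustment used in the post hoc analysis can be sketched as follows. The adjusted Alpha level of 0.0125 is consistent with dividing a conventional level of 0.05 over four pairwise tests; this count is inferred from the reported level and is not stated explicitly. The function names and the p-values in the example are hypothetical and are not results from the study.

```python
def bonferroni_alpha(alpha, n_tests):
    """Return the per-test significance level under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant(p_values, alpha=0.05):
    """Flag which p-values fall below the Bonferroni-adjusted level."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


if __name__ == "__main__":
    # Four made-up p-values, for illustration only.
    print(bonferroni_alpha(0.05, 4))
    print(significant([0.001, 0.02, 0.2, 0.3]))
```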

Discussion
Although the results do not necessarily show that an increase in growth mindset scores was mediated by the presence of the robot, they do show that the CAL system as a whole has had a positive effect. Furthermore, we found a significant improvement in the growth mindset in the Robot condition, where no significant difference was found in the No-Robot condition. Based on these results we interpret that working with and receiving praise from the robot had a positive effect on children, although this study was unable to uncover exactly what caused this.
Working with a robot may have all kinds of impact on how children work with an interactive learning system. In our previous work we looked at effects on explanation behaviour [62] and in addition to the mindset results reported here, we looked at how children's interactions with the system developed as they progressed from initial novelty effects towards sustained use [15]. Results from the study presented in this paper and related work from Park et al. [49] suggest that robots who offer implicit or explicit remarks, feedback, and praise may promote a growth mindset. Promoting such a growth mindset in young learners has been shown to improve academic achievement later in life [5], which leads us to speculate that having a robot in class can potentially have a positive impact on learning in the long run. This line of research suggests that robots can be promising tools for education.
Other research has shown benefits of promoting a growth mindset with older learners. First-year high school students, for instance, have been found to significantly improve their grades after participating in a single-session mindset intervention (Yeager et al. [64]). Similar interventions may continue to help underachieving students throughout high school, increasing their performance and grades, potentially lowering the chances of them dropping out (Paunesku et al. [50]). Besides learners, teachers may also benefit from mindset interventions. Seaton [56] found that teachers not only improved their own mindset, but could also more confidently apply it in practice in their own teaching. It could be an interesting line of future research to investigate how a robot's mindset support may adapt, grow, and evolve with the needs of learners of all ages.
In pursuit of ecological validity we conducted the longitudinal study unsupervised and in the wild. By doing so, we have shown that it is feasible to conduct comparative HRI studies in real classrooms, capturing real changes in children's learning process. As a consequence, however, we identified two main limitations related to this study. Firstly, since this study spanned a relatively long period of time we were unable to follow identical procedures and timelines in all participating schools, despite our best efforts. As a consequence, special events such as holidays, sports days, and school musicals took place at different moments in the experiment timeline for the different schools. While we do not expect this to have impacted their mindset directly, it may have had an influence on the number of sessions children initiated throughout the study. Secondly, although we took care to select similar schools from a similar region, the results of this study may have been impacted by differences in educational methods between schools, lesson plans from individual teachers, or other external factors. That being said, the schools' curricula did not explicitly cover the topic of mindset, and in discussions with the teachers, we found no indications that would suggest major differences between classes.
To prevent distractions in the classroom, teachers had requested that the robot would not make too much noise. Therefore we did not use the robot's rich full-body behaviour animation repertoire, resulting in a somewhat static experience. When talking with children after the study, some were disappointed in the robot's limited movements (e.g. "he didn't move his arms or legs" or "he can't walk"). Additionally, we played all of the robot's speech through headphones instead of a speaker, an approach that is not uncommon when conducting research in classrooms (e.g. see [21,25,32]). We tried to lessen the potential effects of disembodied speech by using lip-synchronisation and by explaining beforehand that the robot talked through the headphones. Although we do not know how this may have affected the perceived embodiment, children did seem to consistently ascribe the voice to the robot. They often mentioned that it was the robot who spoke to them, gave instruction, and offered feedback. They also speculated about aspects of the robot's voice (e.g. "he sounds like a boy/girl" or "his voice is like someone my age"). Some stated explicitly that the robot talked to them through the headphones. Nobody mentioned that they had found this to be strange or unpleasant.
For this study we designed a tool for assessing the mindset of young children. The 'effort' construct, which emerged from an exploratory factor analysis, showed low internal consistency and was composed of only two items. It was therefore dropped from further analysis in this study. A future version of the questionnaire should focus on further expanding and validating this construct.
Other mindset assessment tools for young children (e.g. [49]) measure mindset as a singular bipolar dimension that ranges between a fixed-oriented and growth-oriented mindset. However, we find indications in the data that mindset may instead be considered as two separate unipolar dimensions that range between a less-fixed to more-fixed mindset and a less-growth to more-growth mindset. For example, a single child may score high on growth mindset and at the same time also score high on fixed mindset. Our questionnaire was designed to capture such nuanced situations.
Although the mindset questionnaire has not been rigorously validated as of yet, we have used it in several prior pilot tests with target users involving a wider range of schools. Results of these pilot tests were not inconsistent with teachers' expectations of those children. Therefore, we are fairly confident that the tool is sufficiently sensitive to capture differences between individuals with respect to their fixed and growth mindsets. The relatively high scores on the growth construct in the pretest may suggest that the sample from the target user group may have been skewed towards individuals who already exhibit a strong growth mindset. Although mindset is not an explicit part of the school curriculum, the Montessori and Freinet teaching methods of the schools that participated in this study typically have a focus on promoting learning attitudes associated with a growth mindset, which may explain why participating children scored so well on the pretest. In retrospect, these participants may not have been an accurate representation of the user group who potentially has the most to gain from an intervention such as ours. It might therefore be interesting to repeat this study with participants who initially score lower on growth mindset.

Conclusion
This paper describes a longitudinal, in the wild, unsupervised study in which children could interact at their own initiative with a fully autonomous Computer Aided Learning (CAL) system situated in their classroom. The system offered effort-related praise while children worked on the learning task. To measure changes in children's mindset before and after the intervention we constructed a questionnaire which was administered as a pretest and posttest interview. The questionnaire consisted of three constructs related to mindset: (1) growth-related items; (2) fixed-related items; and (3) effort-related items.
We deployed two versions of the CAL system: a system that delivered the praise through headphones only, and an otherwise identical CAL system that was extended with a social robot to deliver the praise. A total of 44 children interacted with two consecutive learning tasks over the course of approximately four months. Additionally, we tested a baseline version of this CAL system with 17 children, over the same time span, where there was no robot and where children received no effort-related praise. Children who worked with the latter version showed no significant change in their mindset.
The main research question that guided this work was: What are the effects on the mindset of children when extending an autonomous CAL system with a social robot to deliver effort-related praise? Overall, results showed significant differences on the growth construct between the No-Robot and Robot conditions. The effort-related praise that was delivered by the social robot had a positive effect on the children's growth mindset, whereas the same praise offered by the otherwise identical regular CAL system did not result in a significant effect.
The results from this paper offer an interesting insight for social roboticists and educational psychologists working on creating real-world learning interventions. Firstly, we make a methodological contribution to the field of educational HRI by showing the feasibility of conducting comparative studies in real world, longitudinal, unsupervised settings. Secondly, this study makes an empirical contribution by showing the potential benefits of using a robot to more effectively accomplish the social act of delivering supportive praise to promote a growth mindset. In a previous study we saw similar results where a CAL system extended with a social robot performed better on the social act of eliciting longer and more detailed verbal explanations [62]. Results from both of these studies lead us to speculate whether the value of robots could extend to other social acts in learning. We suggest that future research explores additional instances of social acts where social robots may potentially play a key role.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix A: Mindset questionnaire
The full questionnaire that was used to measure the mindset of participants during the pretest and posttest is listed below. Items were answered on a 4-point Likert scale: strongly disagree, somewhat disagree, somewhat agree, strongly agree. An exploratory factor analysis revealed three mutually exclusive factors present in the data:
Growth (Items 1, 4, 8, 13 and 14). The 'growth' construct consists of items that are formulated in such a way that they align with a mindset oriented towards growth. These items address that one can change how smart they are, and that one can learn and do better by working hard on difficult assignments. When a participant shows high agreement with these items this corresponds with a growth mindset. A Cronbach's Alpha of 0.73 shows an acceptable internal consistency for this construct.
Fixed (Items 2, 3, 5, 7 and 9). The 'fixed' construct consists of items that highlight one's inability to influence how smart they are. The items cite innate causes for this lack of influence. This is an attitude that is characteristic for individuals with a fixed mindset. Accordingly, when a participant shows a high agreement with these items this corresponds with a fixed mindset. A Cronbach's Alpha of 0.68 shows a questionable internal consistency for this construct.
Effort (Items 16 and 18). Finally, the 'effort' construct consists of two items that reveal a preference for spending less effort, either by working less hard on difficult assignments or by choosing easier assignments, both of which are telling of a fixed mindset. When a participant shows a high agreement with these items this corresponds with a fixed mindset. A Cronbach's Alpha of 0.59 shows a poor internal consistency for this construct. Future iterations of this questionnaire should expand the number of relevant items, and may focus on further validation of the construct as a whole.
1. I think I can change how smart I am
2. I think I can't change how smart I am, because I am born like this
3. I think I will always stay this smart, because I can't change that
4. I think I can become smarter step-by-step
5. I think I will always stay this smart, because that is fixed in my brain
6. I think I can change how smart I am by practising with assignments of increasing difficulty
7. I think it is fixed how smart I am and there is nothing I can do to change that
8. I think I can change how smart I am, by doing my best
9. I think that my smartness is fixed in my brain, and I can't change it
10. I think I can always change how smart I am
11. I work harder on difficult assignments because then I learn the most
12. I feel dumb when I have to think really hard for an assignment
13. I do my best for difficult assignments, because then I learn the most
14. I do my best for difficult assignments, because then I will be able to do them better
15. I work much less hard for difficult assignments, because I can't do them anyway
16. I work less hard for difficult assignments because I prefer not to do much effort
17. I prefer to choose more difficult assignments, because then I can learn something new
18. I prefer to choose easier assignments, because then I have to spend less effort