Introduction

In the United States, students whose home language is not English make up about 21% of the current K-12 school-age population (National Center for Educational Statistics, 2016). These students tend to score lower than their native English-speaking peers in all subject areas measured nationally (Kena et al., 2015). The early school years are an especially critical period when children are first exposed to academic learning and start developing their learner identities. Unfortunately, low expectations, negative stereotypes, and racialized learning pathways (Nasir & Vakil, 2017; Valencia, 2020) prevalent in schools and classrooms can take a toll on both the learning and the identity of these students. It is well understood that children from diverse backgrounds should be socially and culturally integrated in schooling to succeed academically (Darling-Aduana & Heinrich, 2018; Marx & Larson, 2012; Vasquez et al., 2011).

The resources to assist young children with their cultural and social integration, however, are not readily available for many children in public school. Our research project, which designed a humanoid social robot and introduced it into an active kindergarten classroom, was conducted to help address these challenges. For young children, physically embodied robots can act like playmates (Bers, 2008) and support more highly developed social and emotional relationships among children, compared to other mobile devices (Kim & Smith, 2017; Martínez-Miranda et al., 2018). Building on this research, we designed a social robot to facilitate positive, collaborative interactions among culturally and linguistically diverse children with the goal of assuaging some of the negative experiences diverse children can have in school.

The development of advanced technology to help resolve real-world problems is a complex, systemic process in which a multitude of theoretical perspectives and variables (e.g., technological features, contextual factors, curricular factors, and learner characteristics) are closely interwoven to influence designs and outcomes. This process typically starts with initial designs grounded in relevant theories and involves ongoing, dialogic decision-making between designers and users through repeated trials in situ (Bielaczyc, 2013; O’Neill, 2016; Tabak, 2004; Wang & Hannafin, 2005). Detailed discussions of design problems and solutions can show how various designs do (and do not) appeal to children, and how children's reactions lead to ongoing changes in design. This paper presents data gathered during the first year of an exploratory multiyear design research project in which we designed and tested robot-mediated interaction activities for children iteratively. Our findings focus on a series of design challenges and the solutions we implemented to solve them while supporting collaborative interactions among the children.

A multidisciplinary conceptual framework

In designing robot-mediated interaction activities, we reviewed research in the field of social robotics as well as other relevant theories and pedagogies including child development, intercultural communication, and culturally sustaining pedagogy. This helped us craft a multidisciplinary conceptual framework that enabled the design of a robot that would be useful in real-world, child-centered interactions in a kindergarten classroom.

Social robots in education

Social (or sociable) robots are physically embodied, life-like robots that interact in a human-like way (Breazeal, 2003). They are distinct from more conventional, autonomous robotic systems that perform mundane or hazardous tasks for humans in industry, agriculture, and other arenas. As a subset of service robots, social robots interact and do things together with people. Social robots influence the health, business, and education sectors and also can be used at home. Research and development in social robotics have focused on the robots’ social and emotional behaviors, such as their facial expressions, gaze, gestures, and bodily movements (Park et al., 2017). Research to date indicates that users, young and old, respond socially to the robots and develop a sense of companionship with them over time.

Conventionally, educational robotics refers to a domain that educates young students about robotic engineering concepts through hands-on experience (Barker & Ansorge, 2007; Bers et al., 2014). Such robotic activities have been recognized as an effective tool for science, technology, engineering, and mathematics (STEM) education. In this domain, students can manipulate and/or assemble mechanical components to understand robotics itself while learning relevant concepts in STEM. Social robotics expands the capacities of conventional educational robotics and can be used flexibly in a wider range of subject domains, such as language (Berghe et al., 2019) and social skills (Spolaôr & Benitti, 2017). Social robots may tutor and assist students personally (Fridin, 2014), act like novice peers to engage students in a particular task (Tanaka et al., 2015), and address intellectual, affective, and motor skills development in young children (Barreto & Benitti, 2012; Cheng et al., 2018). Equipped with embodied and expressive features, such as mobility, sensors, gestures, and emotional expressions, social robots can afford unique relational dynamics with children while they play and learn together. In a recent review study, Belpaeme et al. (2018) assert that social robots in well-defined domains can assist learners as effectively as human tutors.

The educational use of social robots is still emerging, and research and development in real-world educational contexts are just beginning. Over the past decade, social robotics research has been led largely by researchers from computer science and engineering. While developing robots for young learners, the researchers acknowledge the criticality of theory-guided interaction design (Spolaôr & Benitti, 2017). When the design is grounded in relevant literatures, theories, and pedagogies, social robotic activities have the potential for substantial, sustainable impact. In this study, we referred to theories of child development, intercultural communication, and culturally sustaining pedagogy to design robot-mediated positive, collaborative interactions among culturally and linguistically diverse young children.

Child development

According to the stages of child development, when children start kindergarten, they are typically at the borderline between Early Childhood (spanning two to six years old) and Middle Childhood (six to ten years old) (McDevitt & Ormrod, 2015). At the kindergarten age, children develop and learn while they play. Four notable developmental characteristics typify this phase. First, children quickly improve in fine and gross motor skills. They are rarely able to sit quietly for long periods and actively move their bodies as they learn and play. Second, children like to do things with others and desire companionship (Gregory & Chapman, 2013). While they play together, they negotiate with others and construct shared meaning (Carpendale & Muller, 2004; Vasquez et al., 2011); they become aware of themselves in relation to peers and begin comparing their performance to that of their peers. They also start to recognize that the needs of others are often different from their own. Third, children engage in fantasy play generated from their imaginations (Lindsey & Colwell, 2013). They are most engaged behaviorally and emotionally when the play embraces their interests (Jang et al., 2010). Fourth, their family and cultural backgrounds have a great influence on their developmental characteristics. When children come to school, they bring their prior intellectual, social, and cultural experiences with them (Donovan & Bransford, 2005). These resources, or funds of knowledge (González et al., 2009), help them to not only navigate through the new system but also transfer knowledge from home to school. By doing so, children can develop positive learner identities.

Intercultural communication

Communication is a process through which individuals share information to come to understand each other and the world in which they live (Barnett & Kincaid, 1983). Through communication, we disclose information about ourselves, share personal experiences, and bond with one another (Griffin, 2009). Communication with others also enables individuals to learn to tolerate disagreement and develop common ground (Bakhtin, 1987). Intercultural communication theory (Nishida, 2005), in particular, highlights newcomers’ gradual adaptation to the target language and culture through prolonged exposure. When those fluent in the target language and culture interact with newcomers in nonjudgmental, supportive ways, a positive climate for open communication is developed. This is instrumental to developing children’s identities and helping them construct meaningful relationships (Littlejohn & Foss, 2009).

Culturally sustaining pedagogy

Culturally sustaining pedagogy (CSP) is a theoretical approach to teaching that respects, maintains, and builds on children's diverse languages, cultures, and identities in curricular materials, activities, and teaching strategies (Paris, 2012). CSP builds on the earlier concepts of culturally responsive and relevant approaches to teaching (Gay, 2010; Ladson-Billings, 2009; Nasir & Vakil, 2017). CSP expands the concepts to supporting students in “sustaining the cultural and linguistic competence of their communities while simultaneously offering access to dominant cultural competence” (Paris, 2012, p. 95). In a culturally sustaining learning environment, children are acknowledged as cultural beings and their diverse languages and cultures are considered assets beneficial to the children, the classroom, and the school at large. Rather than seeking to assimilate diverse children into a mainstream school culture, CSP seeks to help children maintain their languages, cultures, and identities as valuable assets in a diverse nation such as the United States (Paris & Alim, 2017). Key components of CSP are valuing and using children’s native languages and making use of their home lives in the classroom.

Synthesizing theory into designing for real-world contexts

The guiding principles for our design of robot-mediated interaction activities required synthesis of all components of the multidisciplinary conceptual framework discussed above. Considering child development, we designed loosely structured fantasy play centered on a robot imagined to be from another planet. The robot asked pairs of children about their personal experiences, which prompted the children to then tell their own expanded stories. Children were allowed to physically move around with the social robot to perform activities. We implemented small-group settings of two children with one robot to support kindergarteners’ peer play, recognizing that small group activity is the most effective way of using technology with children (Liu et al., 2014). Regarding communication, the robot invited all children repeatedly to participate by calling on them by name. The robot responded to the children’s comments and questions with positive feedback and prompted mutual conversation between the children. This approach to communication and interaction aimed to provide all participating children with equal opportunities for interaction. In order to be culturally sustaining, communication was bilingual in Spanish and English, the native languages of the children in the study, and activities were designed around children’s home lives and school experiences.

Research question

This qualitative study was conducted in the first year of a multiyear research project. Our research question was: What are the major design challenges and solutions for developing robot-mediated, collaborative interaction activities for culturally and linguistically diverse children?

Robotic system design

The robotic system included four components: the robot, robot controller, main controller, and server. Figure 1 illustrates the system architecture.

Fig. 1. The system and the robot Skusie

Main controller

The main controller was implemented as a mobile app running on a separate Android tablet. Its main purpose was to allow the researcher to control Skusie’s speech and movement through communication with the robot controller using the Wizard-of-Oz method (Riek, 2012), where a researcher controls the robot from a distance. By using the main controller, the researcher “wizard” controlled what the robot said and when it said it by manually entering speech utterances for Skusie or selecting them from a list of canned utterances in Skusie’s ever-developing lexicon. The researcher also controlled Skusie’s movements. The main controller connected to the server to update its software, download the content of interaction sessions, and store interaction logs with timestamps.
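
As a rough illustration of the wizard workflow described above, the following Python sketch shows how a controller might offer canned utterances, accept manually typed ones, and keep a timestamped interaction log. The class name `WizardConsole`, its methods, and the log fields are all hypothetical. The actual main controller was an Android app whose internals are not described here, so this is a sketch under stated assumptions rather than the project's implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class WizardConsole:
    """Hypothetical stand-in for the main controller used by the researcher wizard."""

    lexicon: list[str] = field(default_factory=list)  # canned utterances
    log: list[dict] = field(default_factory=list)     # timestamped interaction log

    def _record(self, kind: str, text: str) -> dict:
        # Every utterance is logged with a UTC timestamp and how it was produced.
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "kind": kind,
            "text": text,
        }
        self.log.append(entry)
        return entry

    def say_canned(self, index: int) -> dict:
        """Select an utterance from the canned list."""
        return self._record("canned", self.lexicon[index])

    def say_typed(self, text: str, remember: bool = True) -> dict:
        """Enter an utterance manually; optionally add it to the growing lexicon."""
        if remember and text not in self.lexicon:
            self.lexicon.append(text)
        return self._record("typed", text)


console = WizardConsole(lexicon=["Tell me more.", "Can you say it again?"])
console.say_canned(0)
console.say_typed("Can you two choose together?")
```

Keeping typed utterances in the lexicon mirrors the idea of an "ever-developing" list: a phrase improvised in one session becomes a one-tap option in the next.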

Robot controller

The robot controller component was implemented as a mobile app running on the Android phone embedded in Skusie’s head. The app (1) controlled Skusie's speech and movement, (2) communicated with the main controller held by the researcher wizard, and (3) communicated with the main server module to store log data and update software and scenarios. The robot controller also served as Skusie's main display screen. In interaction sessions, we often used the phone to show relevant photos and pictures while Skusie was talking to children. For example, Skusie displayed a photo of a penguin and asked the children, “Is this an animal?” The robot controller also produced video streams for logging and observation purposes. A video stream generated by the smartphone in the robot’s head could capture children’s facial expressions and movements directly, thus providing useful visual feedback on children’s engagement for the researcher wizard controlling Skusie.

Server

The server was implemented as an FTP-based file server. The server stored the executable code of the main and robot controllers, scripts and accompanying image data, and interaction logs such as video feeds and records of controlling commands and speech utterances.

Design research and ethnographic observations

For this project, we relied on qualitative methodology, combining design research and ethnographic observations. In our approach to design research, we “aimed to improve educational practices through iterative analysis, design, development, and implementation…in real-world settings” (Wang & Hannafin, 2005, p. 6). Ethnographic observations over the ten-week period of the study, field notes of classroom and research meetings, and a detailed researcher journal (Patton, 2002) allowed the research team to make careful records of the design process. In this section, we share details of the “interactive, iterative, and flexible” (Wang & Hannafin, 2005, p. 9) design process we followed.

Participants and context

Participants in this study were twenty-four kindergarten children in a public elementary school located in the Intermountain West region of the United States. The school had a high rate of families living near or below the poverty line. School children were predominantly white English-speaking and Latinx Spanish- and English-speaking. All participants were identified as low performing by the school and attended a supplemental class that provided additional practice with language and academic skills for an hour around lunchtime. We worked with children during this class period. For the study, children were divided into twelve pairs, with the intent to form cross-cultural, cross-linguistic (English and Spanish) partnerships. However, the class did not have an even number of Spanish and English speakers. Thus, while all twenty-four children participated in the robot-mediated interaction activities, the research team studied nine cross-cultural pairs for the duration of this project.

Design and observation procedures

Our interdisciplinary research team had expertise in social robotics, instructional design, computer science, qualitative research, and multicultural teacher education. We combined our expertise to craft four 15-minute activities with small groups of two children interacting with the robot, Skusie, to help it learn about life on earth. Interaction topics were designed to be highly relevant to the interests and experiences of the children in the study, as well as appropriate to their developmental level. Examining children’s literature, we found four popular themes with which all children would likely have experiences: animals, family, birthdays, and school. In developing activities around these themes, we incorporated fantasy play involving an alien robot asking for help to solve problems on its planet and learn about life on earth. The robot invited children into the activities repeatedly and appreciated their input. The robot was bilingual in Spanish and English and would always speak equally in Spanish and English with the cross-linguistic pairs of children.

In the initial design, researchers spent two weeks developing scripted activities to test with the children while we reviewed children’s literature and ethnographically observed the kindergarten classroom several times each week to get a sense of the culture of the class (Patton, 2002). When a teacher’s aide was absent or busy, we volunteered to fill in and work directly with the children as an act of goodwill and reciprocity (Glesne, 2016). For the next four weeks, twice a week, we implemented scripted activities directly with pairs of children without a robot, iteratively designing, testing, and improving each activity with all pairs. For these sessions, bilingual research assistants took turns acting as the robot and moderating the activity in English and/or Spanish as necessary. In these activities, we asked children to think of our assistants as new friends who had just arrived from another planet and did not know much about life on earth. These assistants needed their help to learn about everything, including human language and culture.

Following the initial six weeks of design and development, we launched the actual robot-mediated activities with the children during another four-week session. By this time, we had moved the activities to a closed conference room in the school to enhance recording quality and minimize classroom distractions. Activities took place on the floor. By the time the robot was ready for mediating children’s interactions, the activities had been polished through several iterations of practice and improvement. In this phase of the design process, we relied on the Wizard-of-Oz method (Riek, 2012), with a researcher wizard controlling the robot from a discreet location in the room. Our goal was not to make Skusie completely autonomous; rather, the researcher wizard modified the robot’s mediation as needed to facilitate interactions between the children in the pair. With the help of the wizard and a bilingual student assistant, Skusie spoke both Spanish and English; the researchers had previously created and continually improved Skusie’s utterances, but they were limited and imperfect. For all activities, children were asked to work together to teach Skusie. For the first several robot-mediated activities, a researcher sat behind Skusie to help the children communicate with it as necessary. As the interactions improved, the human moderator left the group and the robot worked directly with the children. Interactions with the robot took place twice a week for four weeks.

The robot activities were conducted in fifteen-minute sessions for a total of one hour (usually four sessions), twice a week (Tuesdays and Thursdays) for the duration of the study. Every activity was conducted with each pair of children before a new activity was tested. All human- and robot-mediated interactions with children were digitally recorded and then transcribed verbatim. A researcher also took ethnographic field notes of the activities to triangulate data collection methods (Glesne, 2016; Patton, 2002). After the Tuesday set of activities, the research team would meet, view the digital recording, and then immediately set about improving the activity to make it more engaging and collaborative; adding to Skusie’s lexicon of vocabulary, phrases, questions, and sentences; and incorporating this information into the electronic systems. The improved version was then conducted on Thursday, allowing for constant refinement (see Appendix for an example of the iterations). Figure 2 presents sample screenshots of the activity sessions with and without a robot.

Fig. 2. Sample interaction sessions

Data analysis

A total of 49 sessions with either a research assistant or a robot interacting with pairs of children were digitally recorded. Six sessions were discarded due to poor audio quality or other technical errors. Thus, 43 sessions were transformed into verbatim transcriptions and analyzed for this study, along with researcher journal and field notes from all classroom interactions and weekly research team meetings. To generate the themes in the Findings section, we coded the transcriptions and researcher journal data thoroughly using “first cycle” and “second cycle” coding practices (Saldaña, 2009, pp. 45, 149). This method allowed us to group codes into larger themes. Our conceptual framework of child development, intercultural communication, and culturally sustaining pedagogy guided our interpretation of the data. In addition, we examined the improvement of each of the four activities over time as a key component of our design research goals (Wang & Hannafin, 2005). Over ten weeks, we iteratively designed, implemented, and improved each activity in an effort to engage cross-cultural pairs of kindergarteners in positive and collaborative interaction activities.

Findings

Our research question examined the major design challenges and solutions for developing robot-mediated collaborative interactions for culturally and linguistically diverse children. Through careful qualitative analysis, we found the following four themes that characterized challenges and solutions in the project: (1) anticipating children’s communication styles with flexible design, (2) inviting children to participate with personalized, friend-like communication, (3) enhancing engagement with familiar contexts, and (4) embracing language diversity with a bilingual robot. These themes correspond to the aspects of our theoretical framework that concern children’s social and linguistic development. In this section, we discuss both the design challenges and the solutions to better facilitate the activities.

Anticipating children’s communication styles with flexible design

Our first design challenge was that 5- to 6-year-old kindergarteners are still developing in language and behavior, as understood in child development theory. This challenge required researchers to be flexible with design and implementation of interaction activities. Specifically, the language of the children was not always clearly articulated. Their language skills were still developing, so they often used words that approximated the meaning they intended, rather than exact, accurate words. Their word order was often different from that of adults; their verbs were often incorrectly conjugated; and their speech was often unclear. Automatic speech recognition software that could readily understand and respond to kindergarten language did not yet exist when this study was conducted. In addition, children often did not follow the conversation track that the designers expected. Rather, their reactions were difficult to anticipate as they were often playful and imaginative.

To address this situation, we used technical and pedagogical solution strategies. Technically, we relied on a human wizard who controlled the robot’s actions. The wizard sat unobtrusively in the back of the room and discreetly controlled the robot’s speech and movements. The strength of this arrangement was that the controller could hear what children said and input appropriate replies. Limitations included an occasional delay between the controller’s input and the robot’s utterances. This limitation often resulted in Skusie not responding for several seconds, and then responding with too many utterances at once. When this happened, children were interrupted. While some children simply laughed at Skusie’s hiccups, other, shyer children often became quiet, as the following example of two girls, one Latinx and one white, illustrates. Children are identified with codes that indicate their gender (G = girl; B = boy), their race/ethnicity (W = white; L = Latinx), and name initials. Thus, in the following dialogue GWAV is a white girl and GLGL is a Latinx girl.

GWAV: And-

ROBOT: Tell me more about animals. What do you do with animals?

GWAV: [Starts to say something.]

ROBOT: Explícame mas sobre los animales. (Tell me more about animals.)

GWAV: [Starts to say something again.]

ROBOT: Que haces con los animales? (What do you do with animals?)

ROBOT: Tell me more about animals. What do you do with animals? Explica más sobre los animales.

GLGL: You take-

ROBOT: Que haces con los animales? (What do you do with animals?)

GLGL: You take them on walks.

ROBOT: Interesting. Interesante.

ROBOT: Tell me more about animals. What do you do with animals? Explícame más sobre los animales.

GLGL: You- [Whispers to GWAV.]

ROBOT: Que haces con los animales? (What do you do with animals?)

GLGL: [Whispers something to GWAV.]

GWAV: [Whispers to GLGL.] You.

GLGL: No.

Moderator: Do you have some ideas you want to tell Skusie?

[Pause]

GLGL: Nu-uh.

One important step toward a solution was for the human wizard to send a prompt and wait a sufficient time until the prompt was delivered to children and the children started to respond. As the research team became more familiar with the children’s language and as Skusie’s lexicon grew, this aspect of the design improved.
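
The pacing practice above was carried out by the human wizard. As a hedged illustration only, the following Python sketch shows one way similar pacing could be enforced in software: at most one prompt is in flight at a time, and a newer held prompt replaces an older one so utterances do not pile up and arrive all at once. The class `PacedPromptSender` and its methods are hypothetical, the source does not report that such a gate was built, and the sketch covers only the delivery step, not waiting for the children to begin responding.

```python
class PacedPromptSender:
    """Hypothetical gate: one prompt in flight; a newer held prompt replaces an older one."""

    def __init__(self):
        self.in_flight = None  # prompt sent to the robot but not yet spoken
        self.pending = None    # most recent prompt waiting its turn
        self.spoken = []       # prompts the robot has finished saying

    def send(self, prompt: str) -> bool:
        """Return True if the prompt goes out now, False if it is held back."""
        if self.in_flight is None:
            self.in_flight = prompt
            return True
        self.pending = prompt  # keep only the latest held prompt
        return False

    def delivered(self) -> None:
        """Call when the robot reports it has finished speaking the in-flight prompt."""
        if self.in_flight is not None:
            self.spoken.append(self.in_flight)
        # Promote the held prompt (if any) and clear the holding slot.
        self.in_flight, self.pending = self.pending, None


sender = PacedPromptSender()
sender.send("Tell me more about animals.")
sender.send("What do you do with animals?")       # held back
sender.send("Explícame más sobre los animales.")  # replaces the held prompt
sender.delivered()
sender.delivered()
```

With this gate, the burst of repeated prompts seen in the dialogue above would collapse to two utterances, each spoken only after the previous one finished.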

The following example also highlights the adult designers’ challenge in anticipating what children would say and do. In this session, the children could not agree when asked to choose a birthday present for Skusie’s friend. Skusie showed pictures of different toys on its smartphone brain and asked the children to choose a toy for its friend. Designers initially assumed the children would work together to agree on a toy, but this did not happen. The children, a Latinx boy (BLJE) and a white girl (GWVI), consistently repeated their own gendered choices and were not able to reach an agreement by the end of the session.

ROBOT: Will you help me choose a birthday present for my friend? [both children lean forward to look at the picture on Skusie’s smartphone brain]

BLJE: Un biciclo, un coche, unos jugetes- (A bike, a car, some toys-)

ROBOT: ¿BLJE- Cual debo darle a mi amigo? (BLJE, which should I give my friend?)

BLJE: Si es de tu tamaño, escoge un coche. (If it’s your size, choose the car.)

GWVI: You could get her a doll. The Barbie, with the dress-

BLJE: Que? (What?)

GWVI: With the dress.

…[The children went on and on, repeating their different choices]

ROBOT: Thank you. GWVI. Can you two pick one present for my friend? ¿Los dos pueden escoger un regalo para mi amigo?

BLJE: Si es tu tamaño escoge- (If it’s your size, choose-)

GWVI: The Barbie.

BLJE: No- porque es niño. Es niño, es niño. (No, because it’s a boy. It’s a boy, it’s a boy.)

ROBOT: No tenemos eso en mi planeta BLJE. (We don’t have those on my planet BLJE.)

BLJE: No que si tu amigo es niño, es niño. (No. If your friend is a boy, it’s a boy)

GWVI: You could give it-

ROBOT: ¿Puedes decir eso otra vez? (Can you say that again?)

GWVI: You could get her a Barbie. If it’s a girl.

ROBOT: ¿Pueden escoger solo un regalo? (Can you choose just one gift?) Can you choose just one?

GWVI: Uh, the Barbie. (points to the picture and looks at BLJE)

BLJE: [shakes head and points at the picture] Yo prefiero el carro. (I prefer the car.)

ROBOT: Se nos acaba el tiempo. Escojan un regalo para mí por favor. (Time is running out. Choose a gift for me please.) We don’t have much time. Choose one for me please.

To solve this problem, the design team added questions to Skusie’s utterances to promote cooperation between children, including “Can you two talk first and choose one for me?” and “Can you two choose together?” The following example shows that these prompts helped another pair of children, a white girl (GWLO) and a Latinx boy (BLEX), to agree on a toy for Skusie’s friend.

ROBOT: Which one should I give to my friend?

GWLO: Um, that one. [GWLO points at the screen.]

BLEX: Optimus Prime!

GWLO: The princess!

BLEX: Nope!

GWLO: The princess!

ROBOT: Can you two talk first and choose one for me?

BLEX: Me!

GWLO: Hey, me.

BLEX: Optimus Prime.

GWLO: No! Princess!

ROBOT: Can you choose just one?

BLEX: Optimus Prime.

GWLO: Optimus Prime.

Through our design research process that deepened our understanding of individual variations in child development, we were able to improve the robot to listen more than it spoke and to encourage children to tell their stories. With this improvement, children came to share more and more of their personal stories upon the robot’s prompt. While listening to the stories, the robot simply expressed appreciation and affirmation through backchanneling (e.g., “thanks,” “wow,” “that’s funny,” “I like that,” etc.). We added such robot utterances as we observed children’s reactions and continually improved them throughout the study.

Inviting children to participate with personalized, friend-like communication

Our second design challenge was collaborative communication: engaging children equally in the sessions so they would have positive collaborative experiences. To address this challenge, we used two main strategies. First, we adopted an invitation-to-peer-play approach, programming the robot to communicate directly with children by calling on them personally in a friendly, inviting voice, like a friend who invites them to play. The following examples illustrate this solution.

ROBOT: Hello GNSA and BWTY [Robot used their real names], I’m Skusie.

BWTY: Whoa.

ROBOT: This is my first time on earth. Can you help me?

BWTY: Uh I say yes?

ROBOT: Can you help me?

BWTY: Yes.

---

ROBOT: Hello BLED and BWLA. Good to see you again.

BLED and BWLA: [Laugh and sit down.]

[robot moves closer to them.]

BLED: Uh oh.

ROBOT: Hello BLED and BWLA.

BLED and BWLA: [Laugh]

Children seemed surprised and delighted to hear Skusie use their own names. At first, they could not believe the robot was talking directly to them. After repeated interactions, many children responded to Skusie as they would respond to a friend.

GLNI: You can’t go shopping! [at a birthday party] He’s crazy!

ROBOT: GLNI.

BWOL: He’s crazy.

ROBOT: GLNI.

GLNI: Huh?

ROBOT: Why not?

GLNI: Because that’s what—not how you play with friends. You go in a fun place-

BWOL: Like a jumpy house.

GLNI: Or a jump [zone]. Maybe.

One thing to note is that while the English-language names were easy to enter into Skusie’s lexicon, some Spanish-language names did not work in our system. In these cases, children did not recognize Skusie’s efforts to call on them. Eventually, designers entered phonetic versions of these names into the software so Skusie could say them correctly (for example “hosay” instead of José).
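
The phonetic workaround can be pictured as a small substitution step applied before text reaches the speech synthesizer. The Python sketch below is a minimal illustration under assumptions: the table `PHONETIC_NAMES` and the function `speakable` are hypothetical names, and only the “José” to “hosay” respelling comes from the text.

```python
# Hypothetical respelling table; only "José" -> "hosay" comes from the text.
PHONETIC_NAMES = {"José": "hosay"}


def speakable(utterance: str, respellings: dict[str, str] = PHONETIC_NAMES) -> str:
    """Swap each listed name for its phonetic respelling before speech synthesis."""
    for name, spoken_form in respellings.items():
        utterance = utterance.replace(name, spoken_form)
    return utterance


print(speakable("Hello José, I'm Skusie."))  # prints: Hello hosay, I'm Skusie.
```

Keeping the respellings in a separate table means the transcripts and on-screen text can retain the correct spelling while only the spoken form changes.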

Second, referring to the multimodal and multisensory development of children, we used robotic utterances and pictures to facilitate children’s engagement in the interaction activities. It took many iterations to improve Skusie’s utterances for appropriate flow and focus. To help a stalled conversation start up again, we added prompts to Skusie’s repertoire such as, “Tell me more” and “Can you say it again?” We added the statement, “I’m confused” to help children get back on topic if they digressed or spoke in a manner the wizard could not understand. We also showed pictures on Skusie’s smartphone brain to catch their attention when necessary.

These features worked with BWLA and BLED, two friends who loved to engage in playful, silly behaviors when they were together. These boys, one white and one Latinx, liked to conspire with one another to disrupt the activity and, in an earlier session with a research-assistant mediator, successfully derailed the activity by talking about their favorite made-up item: chocolate weezeberries. When they tried to bring up this topic with Skusie in a session about birthdays, the robot was able to quickly return them to the activity by expressing its confusion and presenting an image on the smartphone. The examples below are from two different interactions.

ROBOT: On my home planet, there are no animals. What are animals? En mi planeta no hay animales. ¿Qué son los animales?

BWLA: Shark.

BLED: Choc-

BWLA: Choc- [Laughs]

BLED: Chocolate weezeberries.

ROBOT: Can you say it again? ¿Puedes repetir eso?

BWLA and BLED: [They laugh and drop the subject of chocolate weezeberries.]

----

BLED: [Whispers something to BWLA]

ROBOT: I don’t understand.

BWLA: Chocolate weezeberry. [Laughs]

ROBOT: What is this? [shows picture of laundry]

BWLA: Laundry. Laundry.

ROBOT: Terrific. Thank you. Can we do this on my friend’s birthday?

BWLA: No!

BLED: No. Oh!

ROBOT: BLED. ¿Podemos hacer esto para el cumpleaños de mi amigo? (Can we do this for my friend’s birthday?)

BWLA: I want to say it.

BLED: Yup.

BWLA: That’s a birthday. Yup. Yup. That’s what you do for a birthday.

Having Skusie call the children by name and adding phrases to its repertoire that kept children's attention on the activity made the interactions more natural and engaging. In particular, calling children by name invited even shy children to participate equally in activities. As the robot’s utterances improved, interactions became more friend-like, with children talking to Skusie as they would to a friend.

Enhancing engagement with familiar contexts

Our third design challenge was crafting fantasy storylines in which children would engage actively with one another and the robot. To boost their confidence, we anchored activities in children’s prior knowledge and familiar experiences. For example, we initially programmed Skusie to ask children to build an imaginary school, but the children were not very interested in this activity and were easily distracted. In a subsequent round of design, we included contexts and photos that were personally familiar to the children. Researchers took pictures of classrooms, hallways, the cafeteria, the gym, and the playground in the children’s own school. Skusie then asked the children to create a school floor plan using these pictures. Every child was immediately engaged when they saw pictures of their own school. Once they completed a school floor plan, Skusie asked them for directions to each location. As Skusie moved to the location, children moved along, excitedly using words and gestures to guide Skusie. The children’s enthusiasm for this familiar context is evident in the interaction below.

ROBOT: I saw lots of things on my way here. [The robot rolls toward GLAL and shows an image of the children’s school.]

GLAL: That’s the gym!

ROBOT: What is this place?

GLAL: A gym!

BWOL: That’s - that’s just like our gym!

ROBOT: Amazing. Thank you. Do you learn here?

GLAL and BWOL: Yes.

The children were eager to help the robot understand their directions, so they worked together to talk, compromise, and agree on directing Skusie to the right location. This activity prompted the children to participate equally, even if they were typically shy, like GLAL, or used to being disengaged from activities, like BLJE, who spoke only Spanish in his English-speaking school. Notice that in the excerpt below, the robot speaks only in the first line and the children lead the interaction thereafter.

ROBOT: Thank you. [GWMA picks up a few pictures.]

BLJE: Ya tomo estas en orden. (I already put these in order.)

GWMA: This is right here. [GWMA puts a picture on the left of the robot.]

BLJE: Hey! Aqui estan las tres fotos. (Hey! Here are the three pictures.)

GWMA: Right here. [places another picture]

BLJE: Primero. (First.) [places a picture right in front of the robot]

GWMA: This goes right there. [touches the picture that BLJE placed in front of the robot]

BLJE: Poco. (Few.) [re-adjusts the same picture that GWMA just touched]

In a subsequent session, BLJE was very excited to see not only a picture of his own classroom on Skusie’s smartphone brain, but also his own backpack in the picture.

GWVI: This is our classroom.

BLJE: Ehh. [he tries to pick up Skusie]

ROBOT: Let me go. Let me do it by myself please.

BLJE: De aqui se ve mi mochila. (From here I see my backpack.)

ROBOT: Am I on the right direction?

GWVI: No.

BLJE: Oye. (Listen.)

ROBOT: ¿Voy en la dirección correcta? (Am I going in the right direction?)

BLJE: Pues desde allí se ve mi mochila. (Well from there I can see my backpack.)

Designing this activity around the familiar context of the school appeared to excite the children; they were eager to share their knowledge of the school with Skusie and help it learn what they already knew. Although this activity does not specifically draw on children’s cultural backgrounds, it draws on their everyday experiences and knowledge about their school. Thus, this kind of activity is relevant to and sustaining of children’s personal experiences and expertise. Their enthusiasm for this activity was a clear indicator of its success.

Embracing language diversity with a bilingual robot

Our fourth design challenge was making the activities equitable and culturally sustaining for all children. Referring to tenets of culturally sustaining pedagogy, we designed Skusie to be bilingual in Spanish and English. In this diverse classroom, eight Latinx children were fairly fluent in both English and Spanish, but one child, BLJE, was a recent immigrant with fluency in Spanish and just a few words in English. This child spent his school day immersed in English and was often socially and academically isolated in the classroom, despite the kindness of his teachers and classmates. In the following session, his first with the robot, BLJE showed that he had a lot to say.

ROBOT: How fast is a polar bear? ¿Qué tan rápido es un oso polar?

GWVI: Maybe one hundred miles?

BLJE: Un poco rapidito. (A little fast.)

GWVI: Maybe one hundred miles fast?

BLJE: Porque si algo que sale del agua, como tiene dientes asi, corr- ellos corren asi porque ellos tienen alas que ella no- ellos no pueden nadar pero puede correr lo más rápido que pueda pero no se puede hundir, no se puede cayer al agua porque esta frio. (Because if something comes out of the water, like the one that has teeth like this [walrus], they run - they run like this because they have claws that [the walrus] doesn’t. They [polar bears] can’t swim but they can run as fast as they can but they can’t sink, they can’t fall in the water because it’s cold.)

In later sessions with the robot, BLJE disclosed his difficult experience with American schooling. He was often bored and anxious about school. BLJE’s isolation was evident to researchers; even though he was excited to meet with Skusie, he often disengaged from the session, responding only when Skusie called him by name. It seemed that he was used to not being integrated into ordinary classroom activities.

ROBOT: BLJE, dígame más sobre por qué tus amigos vienen a la escuela. (BLJE, tell me more about why your friends come to school.) [BLJE comes closer and leans forward toward the robot.]

BLJE: Yo ni vengo casi. (I don’t even come [very often].) A veces no vengo. (Sometimes I don’t come.) A veces no puedo venir porque mi mama tiene una cita. (Sometimes I can’t come because my mom has an appointment.)

ROBOT: ¿BLJE, te gusta venir a la escuela? (BLJE, do you like to come to school?)

BLJE: No.

ROBOT: Do you like to come to school? ¿Por qué no? (Why not?)

BLJE: Porque esta ab- (Because it’s-) [BLJE sits forward and looks at the robot]

ROBOT: Why not?

BLJE: Porque es aburrida. (Because it’s boring.) En esta noche no quise venir porque anoche no pude dormir tanto. Estuve muevenme, muevenme cuando estaba maldiciendo mueveme, muéveme. (Last night I didn’t want to come- I couldn’t sleep very much. I was tossing and turning.)

For BLJE, the opportunity to engage with the robot in Spanish was a much-needed chance to be a fully integrated member of the classroom community. Another Latinx child often denied that he could speak Spanish, although he seemed to understand when Skusie spoke Spanish. In his final interaction with Skusie, he spoke Spanish in the session. In addition, although most white, English-speaking children said at first that they could not speak Spanish, many did speak some Spanish in the bilingual activities with Skusie. The bilingual component of the robot was essential for integrating Spanish and English speakers equally into the learning environment.

Discussion

In this first phase of our multi-year research project, we qualitatively examined the design of social robot mediation to enhance children’s interactions across cultures and languages. The mediation was designed, tested, and improved iteratively. In this process, we gave specific attention to the design challenges we encountered and the solutions we used to address them. Synthesizing the literature, theories, and pedagogies of social robotics, child development, intercultural communication, and culturally sustaining pedagogy, we aimed to create robot-mediated interactions that were developmentally appropriate, optimal for open and positive communication, and supportive of diverse cultural and linguistic experiences. Careful qualitative analysis showed that our design iterations enabled viable solutions for facilitating positive interactions among kindergarteners in the study.

We created playful learning activities where children were encouraged to use their imaginations and engage in fantasy play with the robot, Skusie. These activities were grounded in child development theory (Gregory & Chapman, 2013; Jang et al., 2010; Lindsey & Colwell, 2013), which prompted designers to ensure that all activities would be collaborative as well as fun for the children. By considering the tenets of intercultural communication, we ensured that children were personally, warmly, and repeatedly invited into the activities and that their input was always regarded positively (Barnett & Kincaid, 1983). This approach seemed to have a synergistic effect where children then treated Skusie with kindness and patience when the robot occasionally sputtered, interrupted them, or said something confusing. There were many examples of the children defending Skusie’s limitations to one another, such as when GLAL said, “Skusie, move!” and BWOL responded, “It’s okay. Skusie’s a robot. Skusie doesn’t even know about eating yet ‘cause he’s from a different planet, not earth.”

Adhering to key aspects of culturally sustaining pedagogy (CSP), we developed activities that centered children’s personal and familiar experiences and ensured Skusie spoke equitably in Spanish and English, the native languages of the children (Gay, 2010; Ladson-Billings, 2009; Nasir & Vakil, 2017; Paris, 2012; Paris & Alim, 2017). With Latinx children, whose home languages and cultures are typically marginalized in their schooling experience, CSP played a crucial role in successfully engaging the children in the study. The activities we created were based on aspects of culture that all children have in common: animals, family, birthdays, and school. These topics allowed children to share important parts of their lives and homes across their cultures and become fully integrated members of the classroom community.

Over the ten weeks of this design research, Skusie’s skills improved through several iterations per activity. The collaborative activity sessions ran progressively more smoothly and naturally, which lessened the researcher wizard’s burden at the main controller. Through this incremental refinement, the children’s engagement likewise grew. Toward the end of the project, children excitedly inquired about Skusie when researchers met them in the classroom; they wanted to know more about its friends and home planet and often used their imaginations to offer their own answers. It was common to hear children say that they loved the robot.

From this experience, we inferred that the multidisciplinary framework grounding the real-world design challenges and the iterative refinement of our designs through testing in situ together constituted a robust approach to applying advanced technology. Importantly, classrooms that serve young children who come from diverse cultural and linguistic backgrounds can benefit from our design principles of (1) flexibility in allowing room for children’s exploration, (2) friend-like communication, (3) tasks relying on familiar experiences but stimulating imagination, and (4) use of children’s home languages.

Yet this study had technological and methodological limitations that reflect the current state of knowledge in the relevant fields. Natural dialogue between the robot and the children was not possible because natural language processing and automatic speech recognition for children are still developing. Ongoing advances in these technologies may help overcome this limitation in the future. Also, as a qualitative study, this work relied on rich, holistic accounts of children’s speech, facial expressions, and bodily movements. Although the rich qualitative data were beneficial for addressing real-world design challenges and solutions in the study, researchers should use discretion when extrapolating from the study’s implications, given the small sample of participants.

Conclusion

This qualitative design study explored using a humanoid social robot to moderate interactions among culturally and linguistically diverse young children, with a focus on the design challenges and solutions for facilitating positive peer interactions. Such interactions among the children, we hoped, would help them become more integrated across their different cultures and languages. The children interacted with each other in an equitable manner, had fun in the activities, and frequently expressed their affection for Skusie. Despite the achievements of this study, there is still a long way to go technologically to be able to design for natural child-robot interaction. Importantly, the study was not meant to achieve this goal but to offer insights for designing child-robot interaction that is theoretically and pedagogically sound.