Robotics education (RE) is an emerging, interdisciplinary branch of engineering and science education that deals with the design, construction, operation, and use of robots and computer systems. RE has been suggested as a promising way to motivate children to engage in all aspects of science, technology, engineering, and mathematics (STEM), where “T” (technology) and “E” (engineering) have often been missing from conventional curricula (Bers et al., 2014; Leonard et al., 2016). With the realization that children’s engagement with STEM should start at an early age (Papert, 1980), RE has attracted significant attention, especially in the last decade. With parental support, afterschool programs in RE have become increasingly popular (Eguchi, 2014). A growing number of studies indicate the importance of parental buy-in for children’s STEM education, including parents’ support of students’ informal education, such as afterschool programs, clubs, and activities (Milner-Bolotin & Marotto, 2018).

RE has become a new and rapidly growing subject area across the world and has been recently included in the curriculum of middle and secondary schools in some countries, including Canada, the UK, and the US (Benitti, 2012; Chen et al., 2017; Kucuk & Sisman, 2017; Xia & Zhong, 2018). However, the current literature on RE shows a need for more empirical evidence to understand how RE may benefit young children’s STEM learning using more rigorous measurement tools to evaluate children’s learning outcomes (Alimisis & Moro, 2016). Thus, it is essential to examine how RE can benefit young children’s learning before governments make large-scale investment in infrastructure (e.g., digital and robotics equipment and lab space), teacher education, and professional development.

The present study aims to provide insights into what children can gain from RE, particularly children aged 6–8 years. We adopted different measurement strategies, including cognitive assessments, eye-tracking, and interviews, to investigate how RE affected children’s cognitive development. Eye-tracking has been used to study people’s cognitive abilities and processes (Liu et al., 2011; Meißner & Oll, 2019; Odic & Halberda, 2015; Tien et al., 2014), but it has not been used in RE research. The use of this technique allows us to study children’s cognitive processes during their RE activities.

Rapid Development of RE

Researchers and policymakers have been making efforts to promote STEM education in grades K-12 (e.g., Government of Canada, 2021; The White House, 2009; U.S. Department of Education, 2018) in the hope of changing the under-representation of women in STEM. Popular interest in robotics has increased rapidly in the past several years (Benitti, 2012; Eguchi, 2014). Eguchi (2014) pointed out that some recent curriculum movements and afterschool robotics programs promoted the advent of RE in the USA. For example, the National Science Foundation in the USA funded the WaterBotics program, an underwater robotics curriculum that has been implemented with thousands of high school students nationwide (McGrath et al., 2009). Some researchers suggested that RE may provide a playful and novel learning environment for children to engage in all aspects of STEM and could alleviate the gender gap in STEM fields (Sullivan & Bers, 2015).

RE is also an emergent subject area in Canada. Educators and policymakers have started integrating RE into K-12 education. In 2018, the British Columbia and Ontario Ministries of Education announced that robotics would be included in their new curricula for grades 10–12, starting in 2019–2020 (British Columbia Ministry of Education, 2018; Ontario Ministry of Education, 2018). However, it is still unclear how RE can benefit children’s learning. Several systematic reviews showed that the majority of RE research was conducted in the USA (Benitti, 2012; Spolaôr & Benitti, 2017; Xia & Zhong, 2018). These existing studies were conducted in a limited number of geographic locations, and their findings may not apply to other regions or countries because population structures, socio-economic conditions, and education systems are not homogeneous. Thus, there is a need to collect more empirical evidence to help researchers, educators, and policymakers understand how RE can benefit children from different regions and/or countries before promoting RE on a large scale globally. Major findings and some controversial issues in the literature are summarized below.

Effects of RE on Cognitive Abilities

A number of studies have examined how RE might affect children’s cognitive abilities, specifically computational thinking skills, which encompass a broad scope of cognitive abilities, including problem-solving skills, number sense, visual memory, spatial processing abilities, and abstract and logical thinking (Barr & Stephenson, 2011; Lee et al., 2011; Nugent et al., 2008). Yet it remains unclear exactly how RE can enhance children’s cognitive abilities because findings in the literature are inconsistent. Several studies found that RE had the potential to foster computational thinking and programming skills (Barker & Ansorge, 2007; Bers et al., 2014; Chen et al., 2017; Govind & Bers, 2021). However, other studies found that RE did not result in any significant improvement in children’s cognitive abilities (e.g., Hussain et al., 2006).

The inconsistent results may be partly due to the challenges of measuring children’s cognitive abilities solely with traditional methods that are either intrusive or inefficient. For example, think-aloud protocols (i.e., whereby children describe their mental activity while completing a task) may interrupt children’s learning and overload their cognitive capacity; self-report surveys are subjective; and interviews may miss important information because of children’s short memory span or inability to express themselves.

In their systematic reviews, Benitti (2012) and Alimisis and Moro (2016) found a lack of empirical evidence and rigorous assessments to evaluate the effectiveness of RE. In another systematic review, Xia and Zhong (2018) identified 22 peer-reviewed journal papers that studied K-12 RE and found that the prevalent measurement approaches were traditional methods, including observations, questionnaires, computer coding tests, and verbal interviews. Given the limitations of these traditional methods discussed above, there is a need to develop rigorous measurement tools for RE research.

To fill this gap, the present study explored how eye-tracking, used in addition to conventional methods (assessments and interviews), can help researchers study children’s cognitive processes during their robotics learning, so that both objective and subjective evidence is available. Eye-tracking methods, including the examination of fixations (i.e., a fixed state of the gaze) and saccadic eye movements (i.e., rapid eye movements between fixations) in response to cognitive effort, have been used successfully for understanding children’s cognitive processes and learning strategies in reading, science, and mathematics (e.g., Zhu & Feng, 2015), but they have never been used in RE research. This technique may help instructors identify patterns in students’ cognitive processes and understand how RE can help develop students’ cognitive abilities.

Eye-Tracking Research in Cognitive Processes

Eye-tracking is an experimental method of recording eye movement and gaze location while participants are working on assigned tasks. Just and Carpenter (1980) formulated the influential strong eye-mind hypothesis. This school of research assumes that the eyes reflect mental processes and that eye-tracking can provide moment-by-moment insight into cognitive processes rather than simply capturing the final outcome. However, the cost and limited availability of eye-trackers restricted researchers’ use of this approach to study cognitive processes. Over the last decade, ever-increasing access to a variety of eye-tracking technologies has prompted the growth of eye-tracking research (Carter & Luke, 2020).

Most modern eye-trackers use near-infrared technology along with a high-resolution camera (or other optical sensors) to track gaze direction. Two commonly used types of eye-trackers are screen-based (computer-mounted) and glasses-based (head-mounted). Screen-based eye-trackers usually require participants to sit still in front of a computer and allow researchers to observe eye movements on a computer screen with well-designed stimuli, such as pictures or text. As the name implies, glasses-based eye-trackers look like either a glasses frame or a pair of sunglasses. An example can be found on the Pupil Labs website, https://pupil-labs.com/products/. This type of eye-tracker is relatively new and allows participants to move around and look at any object in the experimental area, so the experiment is not limited to the computer screen. Recently, glasses-based eye-trackers have been used to study cognitive load (John et al., 2022), visual attention (Gunther et al., 2022), and language learning (Schroer & Yu, 2021). Given that the goal of the present study is to observe students’ cognitive processes during their RE learning activities, we adopted glasses-based eye-tracker technology.

Most prior eye-tracking research focused on visual perception, memory, language, and decision making, and some publications explored mathematics and science learning (Alemdag & Cagiltay, 2018; Carter & Luke, 2020; Murray et al., 2013; Reichle et al., 2010; Schuster et al., 2020). As far as we know, eye-tracking has not been applied to RE studies. This is the first study to explore what cognitive processes can be observed via glasses-based eye-tracking for young children who engage in robotics learning. More specifically, we aimed to address the following research questions.

Research Questions

  • RQ1: For children in the RE program, how do their computational thinking skills (i.e., abstract and logical reasoning, spatial understanding, visuospatial working memory, and number sense) change over time?

  • RQ2: What can we learn about children’s cognitive processes while they are participating in RE activities through eye-tracking?

  • RQ3: How do children and parents perceive effects of robotics learning activities?

Methodology

Sample

A total of 31 children aged 6–8 (age: mean = 7.06, SD = 0.93; 19.3% girls) were recruited from a robotics learning school, which provides an afterschool RE program. These children were volunteers from two locations of this school (across the Lower Mainland region in British Columbia, Canada), which provide the same curricula, learning modules, and instructor training. Originally, a separate group of 31 children was recruited as a matched control group from a psychology lab at the University of British Columbia, but the data collection was interrupted by the COVID-19 pandemic. Hence, we only included the 31 children from this robotics learning school in the data analysis. In terms of the language spoken at home, nine children spoke only English, nine spoke both English and Chinese, and 13 spoke only Chinese. One participant did not report their parents’ education levels; 25 out of 30 (83.6%) parents had a bachelor’s degree or above; two parents had a 2-year college education; and three parents held vocational or technical certificates.

Research Design and Procedure

A mixed methods approach was adopted in this research (Creswell, 2017) with a repeated measures design, which included quantitative data collected from cognitive assessments and eye-tracking and qualitative data from interviews. Eye-tracking and assessment data were collected three times over 6 months; interviews with parents and children were conducted in the last month of the project. All 31 children responded to four assessments at Time-1 with eight children not responding at Time-2 and Time-3. Research assistants (RAs), who were trained graduate students, assisted with the administration of the assessments with a one RA-to-one child format at a scheduled time.

Fifteen children volunteered for the eye-tracking research at Time-1, with three children missing at Time-2 and Time-3. RAs went to the regular class, helped the child put on the head-mounted eye-tracker, conducted eye-tracking calibration, and then asked the child to start his/her regular class activities while wearing the eye-tracker. RAs usually did not interfere with the class activities unless the child requested help with eye-tracking related issues. In an RE lesson, the instructor usually told the child the learning goal at the beginning and showed the robot model on the screen. Then, the child followed the module step by step to build the robot with Lego pieces. Sometimes, when the child was stuck on choosing the right Lego piece or was not able to put the Lego pieces together, the instructor gave some hints or guided the child through the steps. After finishing the robot building, the child connected the robot to the software program via a computer to make the robot perform certain movements. With the instructor’s guidance, the child used software, such as Scratch (http://scratch.mit.edu), to do some simple programming to make the robot move. Appendix 1 provides an example of how a child worked on his robot building task.

By the end of the project, nine children and seven parents were interviewed by two trained RAs. The interviews for children and parents were separated to avoid the influence of parents’ responses on their children. We offered both Mandarin and English language choices for the interview. All the children spoke English during the interviews; two parents were English native speakers and five spoke Mandarin.

Measures

We delivered a survey to parents at the beginning of the project and administered four assessments to children three times over a period of 6 months. The survey collected demographic information about the children and their parents (Appendix 2). The assessments measured different aspects of student computational thinking skills and were presented as computerized tests using MATLAB software program. The proportion of correct answers over all questions for each assessment was used as the outcome variable. The detailed information about the assessments is provided as follows.

Raven Progressive Matrices Tests

Raven Progressive Matrices tests (Raven & Raven, 2003) were designed to assess general intelligence and abstract reasoning. In these tests, children were asked to look at a geometric pattern with a missing piece and fill in this missing piece by choosing one out of six options (Fig. 1). A total of 24 images were presented to children in an increasing order of difficulty. Previous studies demonstrated satisfactory reliability and validity, with a test–retest reliability of 0.85 and moderately high correlations ranging from 0.54 to 0.86 between the Raven tests and Binet and Wechsler Intelligence scales (Raven & Raven, 2003).

Fig. 1 An example of Raven Progressive Matrices Tests

Corsi Blocks Tasks (Frogs Matrices Tasks)

To measure children’s visuospatial working memory, we used the Frogs Matrices Tasks (Morales et al., 2013), which were adapted from the Corsi blocks tasks (Corsi, 1972). Test–retest reliability for children (11–16 years old) is moderately high (r = 0.70–0.79; Orsini, 1994). In this task, a 3 × 3 matrix represented nine ponds to which a frog could jump (Fig. 2). When a trial began, a frog jumped from one pond to another, and children were asked to remember the sequence of the frog’s jumps between ponds. To calculate children’s working memory span, the number of jumps started at two and increased by one after every two trials, up to a maximum of six jumps. Hence, there were a total of 10 trials.
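The trial schedule described above can be checked with a short sketch. The function name and parameters here are hypothetical, chosen only for illustration; they are not taken from the original MATLAB implementation of the task.

```python
def corsi_trial_lengths(min_jumps=2, max_jumps=6, trials_per_length=2):
    """Return the number of frog jumps in each trial, in presentation order.

    Assumes the schedule stated in the text: start at two jumps and add
    one jump after every two trials, up to a maximum of six jumps.
    """
    lengths = []
    for n_jumps in range(min_jumps, max_jumps + 1):
        lengths.extend([n_jumps] * trials_per_length)
    return lengths


lengths = corsi_trial_lengths()
print(lengths)       # [2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
print(len(lengths))  # 10
```

Five sequence lengths (two to six jumps) with two trials each give the 10 trials reported.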

Fig. 2 An example of Frogs Matrices Tasks

Mental Rotation Tasks

Mental Rotation Tasks (Vandenberg & Kuse, 1978) were adopted in this study to assess children’s general intelligence and spatial cognitive abilities. More specifically, they examine whether children are capable of transforming a visual image in a 3-dimensional (3D) space. Figure 3 shows an example of a mental rotation task. Children were first shown a reference image of a 3D object and were then asked to identify which of the four options best matched a rotated version of the object pictured in the reference image. A total of 15 trials were included in the current study. Vandenberg and Kuse (1978) summarized the literature and found that the test–retest reliability estimates ranged from 0.70 to 0.83 and that validity evidence was supported by moderate correlations with other tests of spatial visualization, such as the Identical Blocks Test (r = 0.54), the Chair-Window Test (r = 0.45), and the Spatial Relations subtest of the Differential Abilities Test (r = 0.50).

Fig. 3 An example of Mental Rotation Tasks

Panamath Test

The Panamath test (Halberda et al., 2008), available at the Panamath website (https://panamath.org/index.php), was designed to evaluate number sense and approximate number system aptitude (Halberda et al., 2012). In this task, a set of blue and yellow dots appears on the screen for 2 s, which is too brief to allow for counting (Fig. 4), and children have to indicate which side shows more dots. In the present study, we included two teaching trials, followed by eight practice trials and 64 experimental trials.

Fig. 4 An example of Panamath Test

Interview on Perceptions of RE Effects

We designed a semi-structured interview to understand how children and parents perceived the effects of RE on children’s learning and how RE benefited children in terms of their cognitive development and possibly other skills. The interview protocol was developed by researchers on this project. Two RAs received training and conducted the interviews. Eleven children and their parents volunteered to participate in the interview.

Eye-Tracking

A glasses-based eye-tracker, Pupil Labs (Kassner & Patera, 2012), was adopted in the present study. More specifically, Pupil Core was used, which has a high-speed world camera and 120-Hz binocular eye cameras with open-source software. The RA helped the participant to conduct eye-tracking calibration before the class. Student eye-movements were tracked and recorded during their regular class. The recording time ranged from 30 to 50 min. An example of eye-tracking research in a robotics class is provided in Appendix 1.

We set up five areas of interest (AOIs): computer screen (looking at the robot model on a computer screen), resource (mainly referring to Lego pieces for building robots), robot (building a robot), instructor (instruction time), and peer (peer distractions, e.g., talking to or looking at other children). However, we excluded the instructor AOI because it was hard to separate the instruction time from other tasks. For example, children might talk to the instructor while building a robot and looking at the robot model on the screen. We also excluded the peer AOI because the distraction time was only a small fraction of the total time in each recording. Three eye-tracking measures were chosen for the final data analysis: proportion of fixation count (the ratio of the fixation count in each AOI to the total fixation count), average fixation duration (the mean length of all fixations in milliseconds), and average saccade duration (the mean length of all saccades in milliseconds).
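As a minimal sketch of how these three measures could be computed, the following uses made-up events and a hypothetical record layout (a dictionary with `aoi` and `duration_ms` keys). It does not reproduce the export format of the coding software used in the study, and it assumes that the total fixation count is taken over the three retained AOIs only.

```python
from collections import Counter
from statistics import mean

# The three AOIs retained for analysis (labels are illustrative).
AOIS = ("screen", "resource", "robot")


def fixation_proportions(fixations):
    """Ratio of the fixation count in each AOI to the total fixation count."""
    counts = Counter(f["aoi"] for f in fixations if f["aoi"] in AOIS)
    total = sum(counts.values())
    if total == 0:
        return {aoi: 0.0 for aoi in AOIS}
    return {aoi: counts[aoi] / total for aoi in AOIS}


def average_duration(events):
    """Mean duration in milliseconds of a list of fixations or saccades."""
    return mean(e["duration_ms"] for e in events)


# Made-up events for one recording (not study data).
fixations = [
    {"aoi": "robot", "duration_ms": 220},
    {"aoi": "robot", "duration_ms": 180},
    {"aoi": "screen", "duration_ms": 300},
    {"aoi": "resource", "duration_ms": 260},
]
saccades = [{"duration_ms": 40}, {"duration_ms": 60}, {"duration_ms": 50}]

props = fixation_proportions(fixations)
avg_fix = average_duration(fixations)
avg_sac = average_duration(saccades)
print(props, avg_fix, avg_sac)
```

Computing the same quantities separately for each AOI, location, and time point would give the kind of summary values that can be plotted over time.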

Data Analysis

We fitted linear growth models (Singer & Willett, 2003) to address RQ1. More specifically, we examined whether children’s assessment scores on the Raven, Corsi, mental rotation, and Panamath tasks changed across three time points over 6 months. Given the small sample size, we only used unconditional growth models without covariates. The HLM 7.03 software program (Raudenbush & Bryk, 2002; Raudenbush & Congdon, 2021) was used to fit the linear growth models. The growth model is defined as follows:

$$\text{Level-1 (intra-individual level): } {Y}_{ti}={\pi }_{0i}+{\pi }_{1i}\,\mathrm{Time}+{e}_{ti}$$
$$\text{Level-2 (inter-individual level): } {\pi }_{0i}={\beta }_{00}+{r}_{0i};\quad {\pi }_{1i}={\beta }_{10}+{r}_{1i}$$

where \({Y}_{ti}\) denotes the outcome at time t for child i; \({\pi }_{0i}\) is the intercept of the growth trajectory for child i; \({\pi }_{1i}\) is the growth rate for child i; Time denotes the indicator of data collection waves (Time = 1, 2, 3); \({\beta }_{00}\) is the average of the level-1 intercepts; \({\beta }_{10}\) is the average of the level-1 growth rates; \({e}_{ti}\) is the level-1 residual; and \({r}_{0i}\) and \({r}_{1i}\) are the level-2 residuals for the intercept and growth rate, respectively.
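For readers who prefer a single equation, substituting the Level-2 equations into the Level-1 equation gives the combined form of the same model. This is a standard algebraic rearrangement rather than an additional model:

$${Y}_{ti}=\underbrace{{\beta }_{00}+{\beta }_{10}\,\mathrm{Time}}_{\text{fixed part}}+\underbrace{{r}_{0i}+{r}_{1i}\,\mathrm{Time}+{e}_{ti}}_{\text{random part}}$$

Under the usual assumption that the residuals have mean zero, the expected score at a given wave is \({\beta }_{00}+{\beta }_{10}\,\mathrm{Time}\), so \({\beta }_{10}\) is the average change in the proportion of correct answers from one wave to the next.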

Thematic analysis (Braun & Clarke, 2013) was used to analyze the interview data and address RQ3. We followed a six-step process: familiarization, coding, generating themes, reviewing themes, defining and naming themes, and writing up. The audio interview data were first transcribed to text and then analyzed with the NVivo 12 software program (QSR International Pty Ltd., 2020). For the interviews conducted in Mandarin, the Chinese transcript was first translated into English by transcription software (https://otter.ai/); two RAs independently checked and corrected errors in the translation and then went over all the translations with a senior researcher, who had English translation training, to discuss the consistency and accuracy of the translations. If there were any conflicts, the senior researcher resolved the issue together with the two RAs.

Eye-tracking data were analyzed to address RQ2. The data were coded with iMotions software (https://imotions.com/eye-tracking/) and analyzed via data visualization using the ggplot2 R package (Wickham, 2016). Given the small sample size of this subset of data, we did not conduct any statistical analysis. We used line charts to visualize the eye-tracking measures over time and compared all the measures between the two locations.

Results

Descriptive Statistics

Table 1 provides the descriptive statistics of the proportion of correct answers over all questions obtained from each assessment. The means and medians of all four assessments increased over time, but only the Corsi and Raven tests showed relatively large increases. The distributions of each outcome variable at each time point were approximately normal except for the Raven test at Time-3. Figure 5 shows the distribution of Raven scores across the three time points. The distribution of Raven scores at Time-3 was negatively skewed with relatively large kurtosis.

Table 1 Descriptive statistics for four assessments: proportions of correct answers

Fig. 5 The distribution of proportions of correct answers in Raven Progressive Matrices Tests across time

The small sample size in this study may have affected the distribution of the scores. Given that the violation of the normality assumption was not serious, we proceeded with further analysis of the assessment data. In the following sections, we report the results for the three research questions.

Changes in Cognitive Abilities with Four Assessments

The cognitive assessment data were used to address RQ1. The results of the linear growth models showed that children’s scores on the Corsi and Raven tests increased significantly over time, but no significant change was found for mental rotation and Panamath test scores (see Table 2). The slope of the time factor (i.e., growth rate) showed an average increase of 0.04 (4 percentage points) in the proportion correct on the Corsi test (\({\beta }_{10}\) = 0.04, p < 0.01) and of 0.05 (5 percentage points) on the Raven test (\({\beta }_{10}\) = 0.05, p < 0.001) every 2 months. The results suggested that children’s visuospatial working memory and logical and abstract reasoning skills improved over time, but their number sense and spatial processing abilities remained at around the same level over 6 months.
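As a worked illustration of these slopes, and assuming the linear model holds across the three waves, the model-implied average change from Time-1 to Time-3 is the growth rate multiplied by the two intervals between waves. These are derived values, not figures reported in Table 2:

$$\Delta_{\text{Corsi}}={\beta }_{10}\times (3-1)=0.04\times 2=0.08,\qquad \Delta_{\text{Raven}}={\beta }_{10}\times (3-1)=0.05\times 2=0.10$$

That is, the average proportion of correct answers is expected to rise by about 8 percentage points on the Corsi test and about 10 percentage points on the Raven test between the first and last assessments.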

Table 2 Linear growth model results for four assessments

Cognitive Processes via Eye-Tracking

The eye-tracking data were used to address RQ2. It is worth noting that all the children focused on their learning activities and were rarely distracted. We did observe differences between the two locations (even though the instructors received the same standard training and used the same teaching materials in class). The instructors in one location had more teaching experience than those in the other location, which might explain some of the differences. Additionally, for most of the eye-tracking measures, we found an obvious change at Time-2. This might simply reflect the curriculum design: the modules were designed to increase in difficulty after 2–3 months (around the Time-2 data collection) based on the individual student’s progress.

Figure 6 shows that the proportion of fixation counts started to change by Time-2 for most AOIs. For building a robot, the proportion of fixation counts increased over time at both locations, indicating that children put more effort into this task, whereas the proportion of fixation counts decreased over time for the other two AOIs, i.e., resources and screen, indicating that children did not need to check the robot model on the screen or look for Lego pieces as frequently.

Fig. 6 Proportion of fixation counts by three AOIs and two locations

Figure 7 shows the average fixation duration for three AOIs by two locations. In general, a shorter fixation duration indicates that the student can process the information faster. In our study, the average fixation duration decreased for all AOIs from Time-1 to Time-2, indicating that children became faster at processing the information needed for their tasks. However, it increased or remained at the same level from Time-2 to Time-3, suggesting that children needed more time to process the information. This likely reflects the curriculum design discussed earlier; that is, the difficulty level of the modules increased after 2–3 months.

Fig. 7 Average fixation duration by three AOIs and two locations

Figure 8 presents the average saccade duration for three AOIs by two locations. In general, a shorter saccade duration indicates that the student is more focused on and engaged in the task rather than scanning around to look for cues. In our study, the average saccade duration decreased for all AOIs, except that it increased for the screen AOI at Location-2 from Time-1 to Time-2. Additionally, it remained the same over time for the robot AOI at Location-1. The increasing difficulty level of the modules complicated our interpretation of the results here. The average saccade duration at Location-1 was in general shorter than that at Location-2.

Fig. 8 Average saccade duration by three AOIs and two locations

In summary, eye-tracking data revealed that in general children became more focused on the tasks and processed the information faster over time. Additionally, children were found to spend more time on tackling the robot building task than looking at the robot model on the computer screen.

Interview on Perceptions of RE Effects

The interview data were used to address RQ3. Children and their parents were interviewed separately to avoid the influence of parents’ responses on their children. Among the nine children, three were girls, but we did not find any differences between girls and boys. The results of the interviews were summarized into three themes.

Motivation to Attend RE Classes

Most children indicated that they were initially told by their parents to attend robotics learning classes. However, after attending a few classes, the children enjoyed them and chose to stay. Two children saw their friends attending robotics classes, so they asked their parents to enroll them. Half of the parents enrolled their children in the class because of their friends’ recommendations. Most parents wanted their children to have fun and to develop a hobby. Half of the parents also indicated that they wanted their children to develop logical reasoning and programming skills.

Perceived Learning Environment

All the children enjoyed their learning activities and liked the frequent interactions with the instructor. Since Lego pieces were used for building robots, children perceived Lego as a toy and felt that their activities in robotics classes were more fun than their regular schooling. Building robots and programming to make robots move were more like games to them. Half of the children indicated that they enjoyed the frequent interactions with the instructor and would prefer to stay with this robotics school if they continued robotics learning. Only two children said that they sometimes got frustrated when they were not able to complete the tasks.

All the parents liked the small class and student-centered learning environment, which allowed each student to have a lot of individual interactions and hands-on activities with the instructor and to receive timely feedback. They appreciated the systematic learning modules that were designed to help children to build up their skills on both building robots and programming. Only one parent mentioned that it would be better if the instructor could assign some homework to children, so that children could have more practice at home.

Perceived Learning Outcomes

Children indicated that they felt excited when they were able to put Lego pieces together, build a robot, and make it move. Compared to 6 months earlier, children felt that they were able to build more complex robots and to program the robot to move with less help from the instructor. All the parents reported that their children concentrated better in general and were able to focus on a task for a longer time than 6 months earlier. The parents also found that their children were able to reason with them in a logical way and tried to provide rationales for things they did. Half of the parents noticed that their children also improved their communication skills in addition to robot building and programming skills, possibly because the instructor frequently asked children to present their work.

Overall, the interview results showed that children enjoyed robotics classes, improved their logical reasoning and organization abilities, and were able to focus on their learning activities for a long time.

Discussion and Conclusion

The emerging field of RE is a new and rapidly growing subject area in many countries. However, the current literature on RE shows a need for more empirical evidence and rigorous measurement tools to help educators understand how RE can benefit young children’s learning. With a mixed methods approach, our study investigated how RE benefits young children’s cognitive abilities and cognitive processes via a combination of eye-tracking technology, cognitive assessments, and interviews. The results of the cognitive assessments showed that some of children’s computational thinking skills improved over time, including visuospatial working memory and logical and abstract reasoning skills. Using eye-tracking data, we found that in general children became more focused and processed information faster over time. The interview data revealed that children perceived RE activities as playing games; parents found their children to be more focused on activities compared to 6 months earlier.

One of the interesting findings in our eye-tracking research was that all the children focused on their learning activities in RE classes and were rarely distracted by their peers or others. This echoed one of the important findings from the interviews with the parents and children; that is, RE activities were fun and engaging for the children. In contrast to formal school learning, children perceived RE activities as playing games, which provided a playful and more informal learning environment for children to engage in their studies. This may be good news for schools that plan to include RE in their curriculum.

Our eye-tracking results also showed that in general, children became faster at processing the information they needed for robot building; for example, they spent less time looking for Lego pieces and interpreting the robot model on the computer screen while putting more effort into building robots. This finding was consistent with the results we obtained from the cognitive assessments. The results of the Raven Progressive Matrices tests and Corsi Blocks Tasks showed that children’s abstract and logical reasoning as well as visuospatial working memory improved over time. Correspondingly, children became faster at finding the correct Lego pieces and spent less time recognizing the geometric pattern of the robot model when building their own robot.

However, we did not find statistically significant improvement in children’s number sense and mental rotation abilities, though there were small increases in the scores on these assessments. Most RE learning activities for this age group only required very basic numeric skills, which might explain why children’s number sense did not improve by a large magnitude. It should be noted that the mental rotation tasks, which assess spatial cognitive abilities involving rotations of three-dimensional objects, were the most challenging of the four assessments. These tasks are challenging even for adults, so we did not expect children to be able to improve their scores in such a short time.

One strength of our research is that it observed the changes in children’s cognitive abilities and processes in a natural class setting, which reflects real-world conditions. However, this is also a limitation because we did not have a highly controlled experimental design. Hence, the changes observed in the study may reflect the effect of RE learning activities and possibly other unknown factors that we did not control for. Another limitation of this study is that the quality of some eye-tracking recordings was lower than desired because either the calibration just met the minimum requirement or the eye-tracker was touched or moved by children during the data collection. To make causal inferences, future research can use either a randomized experimental design or a matched case-control design. Researchers can improve the quality of eye-tracking data by training RAs to pay attention to the participant and adjust the eye-tracker position when needed.

To the best of our knowledge, the present study is the first to investigate children’s cognitive abilities and processes in RE via a combination of eye-tracking technology, assessments, and interviews. The research findings provide some insight into how RE can benefit children’s cognitive abilities and cognitive processes, which may inform classroom teachers’ practice and help policymakers consider the potential of integrating RE into the curriculum for young children. RE is a new and rapidly growing subject area and has been included in the mandated school curricula in some countries. However, there is a need to collect more empirical evidence to help educators and policymakers understand how RE can benefit children’s learning before governments make large-scale investments in infrastructure and teacher professional development. We hope this study will motivate more research that uses rigorous and diverse measurement tools to understand the impact of robotics education on children’s learning and cognitive development.