Introduction

Artificial intelligence (AI) was defined in 1956 as “the science and engineering of creating intelligent machines” (McCarthy, 2004, p.2). AI education is considered a driver of economic growth, future workforce development, and global competitiveness (Cetindamar et al., 2022; Sestino & De Mauro, 2022). Researchers’ interest in equipping students with AI knowledge, skills, and attitudes to thrive in an AI-rich future (Miao et al., 2021; Rina et al., 2022; Wang & Cheng, 2021) has given rise to the term “AI literacy”, which concerns the design and implementation of AI learning activities, learning tools and applications, and pedagogical models. Some educators focus on demonstrating machine learning through activities for mastering coding skills and AI concepts (Marques et al., 2020), while others suggest focusing on computational thinking and engagement in deductive and logical reasoning practices (Wong, 2020). In this paper, it is argued that AI education should be extended beyond universities to K-12 students.

There have been a number of recent studies of AI in the context of kindergartens (Su & Yang, 2022; Williams et al., 2019a, 2019b), primary schools (Ali et al., 2019; Shamir & Levin, 2021), and secondary schools (Norouzi et al., 2020; Yoder et al., 2020). However, little is known about what and how AI should be taught (Su et al., 2023a; Ng et al., 2023; Van Brummelen et al., 2021). One challenge is delivering AI content in an age-appropriate and effective manner (Su et al., 2023b; Su & Yang, 2023). Despite the numerous AI learning tools available in K-12 contexts (Rizvi et al., 2023; Van Brummelen et al., 2021), such as Turtle Robot (Papert & Solomon, 1971), PopBots (Williams et al., 2019a) and LearningML applications (Rodríguez-García et al., 2020), many educators are concerned about the suitability of these tools (Chiu & Chai, 2020; Su & Yang, 2023).

With the development of age-appropriate learning tools, AI concepts can be simplified via visual representation, such as block-based programming (Estevez et al., 2019). For example, Scratch, a high-level block-based programming language, allows students with limited reading ability to create computer programs by using illustrations and visual elements (such as icons and shapes) without having to rely on traditional written instructions (Park & Shin, 2021). AI tools and platforms, including Zhorai (Lin et al., 2020), LearningML (Rodríguez-García et al., 2021), Machine Learning for Kids (Sabuncuoglu, 2020), and Scratch (Li & Song, 2019), have been reported to have a positive impact on students’ AI knowledge and skills. Chen et al. (2020) noted that despite the introduction of various learning tools to teach AI, there has not been enough discussion on how AI content should be taught and how tools should be used to support pedagogical strategies and related educational outcomes.

Theoretical model

The technology-based learning model of Hsu et al. (2012) is adopted and modified in this study (Fig. 1); it has been widely used by other researchers conducting similar systematic reviews (Chang et al., 2018, 2022; Darmawansah et al., 2023; Tu & Hwang, 2020). Hsu et al. (2012) suggested cross-analyzing academic research trends by examining the associations among three categories: research methods, research issues, and application domains. They argued, for example, that exploring how the topic of a study may affect the selection of its sample and participants allows a more thorough and comprehensive analysis. Their model has helped frame the research questions of the present study.

Fig. 1 Modified technology-based learning model by the researchers of this review (adapted from Hsu et al., 2012)

According to Hsu et al. (2012), “research methods”, “research issues”, and “application domains” are the three main categories to be considered in the development of a coding scheme to gauge research trends in the field of technology-based learning and education. In terms of research methods, the coding scheme for this review distinguishes quantitative, qualitative, and mixed approaches (McMillan & Schumacher, 2010). In terms of research issues, with reference to Chang et al. (2018), learning outcomes are categorized as cognitive, affective, behavioral, and skills acquisition outcomes. Finally, two application domains are considered in this paper: (1) pedagogical strategies commonly used in science courses, as employed by Lai and Hwang (2015), which include constructive, reflective, didactic, and unplugged pedagogies (Cope & Kalantzis, 2016), and (2) learning tools, namely, hardware, software, intelligent agents, and unplugged strategies, which are coded as suggested by Ng and Chu (2021).
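The coding scheme described above can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names, and the example record are hypothetical and are not taken from the reviewed studies or from the authors' actual coding instrument; only the category labels come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class ResearchMethod(Enum):
    QUANTITATIVE = "quantitative"
    QUALITATIVE = "qualitative"
    MIXED = "mixed"


class LearningOutcome(Enum):
    COGNITIVE = "cognitive"
    AFFECTIVE = "affective"
    BEHAVIORAL = "behavioral"
    SKILLS_ACQUISITION = "skills acquisition"


class PedagogyOrientation(Enum):
    CONSTRUCTIVE = "constructive"
    REFLECTIVE = "reflective"
    DIDACTIC = "didactic"
    UNPLUGGED = "unplugged"


class LearningTool(Enum):
    HARDWARE = "hardware"
    SOFTWARE = "software"
    INTELLIGENT_AGENT = "intelligent agent"
    UNPLUGGED = "unplugged"


@dataclass(frozen=True)
class CodedStudy:
    """One reviewed article coded along the three categories (hypothetical layout)."""
    citation: str
    method: ResearchMethod       # research methods
    outcomes: frozenset          # research issues (learning outcomes)
    pedagogies: frozenset        # application domain 1
    tools: frozenset             # application domain 2


# A made-up record, used only to show how a study could be coded.
example = CodedStudy(
    citation="Hypothetical study (not in the review)",
    method=ResearchMethod.MIXED,
    outcomes=frozenset({LearningOutcome.COGNITIVE, LearningOutcome.AFFECTIVE}),
    pedagogies=frozenset({PedagogyOrientation.CONSTRUCTIVE}),
    tools=frozenset({LearningTool.INTELLIGENT_AGENT, LearningTool.SOFTWARE}),
)
```

Using sets for outcomes, pedagogies, and tools reflects the fact that a single study may be coded under more than one label in each category.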

Research objectives

In this study, the literature on pedagogical strategies, assessment methods, learning tools, and learning outcomes in AI K-12 settings is studied. Four research questions are formulated.

  • RQ1: What are the potential learning tools identified in AI K-12 education?

  • RQ2: What pedagogical strategies are commonly proposed by studies on AI K-12 learning tools?

  • RQ3: What learning outcomes have been demonstrated in studies on AI K-12 learning tools?

  • RQ4: What are the research and assessment methods used in studies on AI K-12 learning tools?

Methods

This study follows the same four steps employed in other studies on AI literacy in K-12 (e.g., Ng et al., 2022; Su et al., 2022): (1) identifying relevant studies, (2) selecting eligible studies and excluding ineligible ones, (3) analyzing the data, and (4) reporting the findings. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2015) are followed.

Identifying relevant studies

The electronic databases used for the literature search were ACM, EBSCO, Web of Science, and Scopus. The aim of this review is to provide a comprehensive overview of learning tools in K-12 education, encompassing early childhood education and primary and secondary education. As education systems differ across countries, the K-12 search terms cover students from kindergarten to secondary school. In addition, learning tools are defined as the variety of learning platforms and systems, educational applications, and activities that can enhance the teaching process and support students in AI literacy learning. The search strings therefore reflect these definitions of K-12 and learning tools, as shown in Table 1.

Table 1 Search string

Study selection and exclusion

To ensure the generalizability of the findings and to avoid biases in article selection, specific inclusion and exclusion criteria are employed in this study (Table 2).

Table 2 Article selection—inclusion and exclusion criteria

As shown in Fig. 2, a total of 326 articles were identified: 105 from EBSCO, 81 from Web of Science, 110 from Scopus, and 30 from ACM. Articles were excluded for the following reasons: (1) studies that were irrelevant to the research topic (N = 251), (2) duplicate studies (N = 10), (3) studies that were not written in English (N = 4), (4) non-research studies (N = 10), and (5) other types of articles (N = 8). For example, under the first criterion, Bai and Yang (2019) was excluded because it applied a deep learning technology recommendation system to improve teachers’ information technology ability and was therefore conducted in a context other than AI literacy education, learning, and instruction. Mahon et al. (2022) presented the design of an online machine learning and artificial intelligence course for secondary school students; however, they did not discuss in detail what type of learning tools can be used and how to support students’ AI literacy learning. A discussion paper by Karalekas et al. (2023), a theoretical paper by Leitner et al. (2023), and a scoping review by Marques et al. (2020) were also removed because they were not empirical studies and did not involve any practical experiment. Finally, 46 studies were selected, as shown in Appendix 1.

Fig. 2 PRISMA diagram of included articles in the scoping review

The snowball method

To enhance the systematic search for relevant literature, the snowballing method as outlined by Sayers (2008) was employed. This involved tracing references in previously selected articles. The focus was on the references cited in the earlier selected articles as discovered through Google Scholar. Utilizing the snowballing method led to the identification of three additional articles that met the eligibility criteria described above.
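The selection counts reported above can be checked with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative, and it assumes (this is a reading of the text, not something the source states explicitly) that the three snowballed articles are added to the articles remaining after exclusion to reach the final total.

```python
# Articles identified per database, as reported in the text.
identified = {"EBSCO": 105, "Web of Science": 81, "Scopus": 110, "ACM": 30}

# Articles excluded per criterion, as reported in the text.
excluded = {
    "irrelevant to the research topic": 251,
    "duplicate": 10,
    "not written in English": 4,
    "non-research study": 10,
    "other article type": 8,
}

# Additional eligible articles found through snowballing.
snowballed = 3

total_identified = sum(identified.values())
total_excluded = sum(excluded.values())
retained = total_identified - total_excluded

# Assumption: the snowballed articles are added to the retained set.
final_included = retained + snowballed

print(total_identified, total_excluded, retained, final_included)
# prints: 326 283 43 46
```

Under this reading, the database search leaves 43 articles after exclusion, and the three snowballed articles bring the total to the 46 studies reviewed.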

Overview of selected studies

Table 3 presents an overview of the 46 selected studies, including the type of articles, year of publication, and educational level.

Table 3 The characteristics of the reviewed articles

Publication trends

Forty-six articles were identified: 28 conference papers and 18 journal articles. The first article was published in 1995, and 39 articles have been published in the past 5 years, with a peak in 2021 (Fig. 3).

Fig. 3 The trend of AI literacy education in K-12 contexts

Countries

Most research took place in the USA (N = 8), China (N = 7), Finland (N = 3), Hong Kong (N = 3), Israel (N = 3), Spain (N = 3), Australia (N = 2) and Japan (N = 2). Others were conducted in Brazil, Denmark, Greece, Indonesia, New Zealand, Norway, Sweden, Thailand, and the UK. The locations of the remaining six articles are unknown.

Educational levels

Primary and secondary schools are the most researched educational levels, each accounting for 44% of the selected articles, followed by kindergartens (11%) and K-12 education (2%).

The selected studies generally include samples of students of both genders and a wide range of ages, from 3-year-old kindergarten students (Vartiainen et al., 2020) to 20-year-old Danish high school students (Kaspersen et al., 2021). They also encompass participants in science, technology, engineering, and mathematics (STEM) classes (Ho et al., 2019), high-performing students of the Scientists in School program (Heinze et al., 2010), students with and without an AI background (Yoder et al., 2020), and students from varying socioeconomic backgrounds (Kaspersen et al., 2021).

There were three AI-related research studies between 1995 and 2017, mostly adopting unplugged activities and games for AI teaching, which are different from research conducted after 2017. The first article was published by Scherz and Haberman (1995), who designed a special AI curriculum with the use of abstract data types and instructional models (e.g., graphs and decision trees) to teach AI concepts such as logic programming and AI systems to high school students in Israel. In another two studies, the use of programming robots (Heinze et al., 2010) and computer science unplugged activities (Lucas, 2009) were explored with Australian and New Zealand K-6 students, respectively. Since then, a greater variety of learning tools have been employed and expanded to European and Asian countries across all educational levels in K-12 settings. Appendix 1 provides an overview of the selected articles.

Findings

RQ1: What are the potential learning tools identified in AI K-12 education?

The potential learning tools identified in K-12 contexts were intelligent agents (N = 20), software-focused devices (N = 19), hardware-focused devices (N = 10), and unplugged activities (N = 6) (Fig. 4 and Table 4). In this section, intelligent agents, software devices, and hardware devices are discussed.

Fig. 4 Summary of learning tools used in AI K-12 education

Table 4 AI learning tools adopted in K-12 educational setting

Intelligent agents

Intelligent agents, such as Google Teachable Machine, LearningML, and Machine Learning for Kids, which make decisions based on environmental inputs by using their sensors and actuators, are the most popular learning tools for enhancing students’ computational thinking skills within K-12 contexts. Teachable Machine is a web-based tool developed by Google and has been found to be more effective than unplugged activities in kindergarten settings (Lucas, 2009; Vartiainen et al., 2020). In Vartiainen et al. (2020), children aged between 3 and 9 autonomously explored the input‒output relationship with Google Teachable Machine, which fostered their intellectual curiosity, developed their computational thinking, and enhanced their understanding of machine learning. In both primary (Toivonen et al., 2020; Melsión et al., 2021) and secondary schools (Kilhoffer et al., 2023; Martins et al., 2023), Google Teachable Machine has been employed, allowing students to use their webcams, images, or sounds without coding to develop their own machine learning classification models.

In addition, LearningML has been employed in primary schools to create AI-driven solutions and models, for example, to teach the principle of supervised machine learning (Voulgari et al., 2021; Rodríguez-García et al., 2021), which simplifies abstract AI algorithms for primary school students. Machine Learning for Kids, which draws on the IBM Watson engine for AI modelling (Fernández-Martínez et al., 2021); Cognimates (Sabuncuoglu, 2020; Fernández-Martínez et al., 2021), which allows students to practice coding; and Ecraft2Learn, which contains a deep learning functionality (Kahn et al., 2018), have also been used in secondary school classrooms. Intelligent agents often offer students hands-on experience in developing datasets and building customized machine learning systems.

Software devices

Software devices are adopted mostly to enable primary and secondary school students to learn about computational thinking, including programming for sequences, rule-based and conditional mechanisms, as well as data science and machine learning using visual language. For example, Scratch, a block-based programming software, is frequently used in both primary (Dai et al., 2023; Li & Song, 2019; Shamir & Levin, 2021) and secondary schools (Estevez et al., 2019; Fernández-Martínez et al., 2021). Other software is used for visualizing and scaffolding abstract AI concepts through online games and experiences, such as Quick, Draw! (Martins et al., 2023) and Music Box (Han et al., 2018). In primary schools, Kitten is used to teach block-based programming (Li & Song, 2019), whereas C++ and JavaScript are used for logical thinking and simulation (Gong et al., 2020). In secondary schools, researchers have often employed free online software and tools, such as Snap (Yoder et al., 2020) and Python (Gong et al., 2018; Norouzi et al., 2020), for algorithm automation, as well as RapidMiner for no-code data science learning (Sakulkueakulsuk et al., 2018). To introduce machine learning concepts to secondary school students, other researchers have focused on developing online games such as the Rock Paper Game (Kajiwara et al., 2023) and the 3D role-playing video game ML Quest (Priya et al., 2022).

Hardware devices

Hardware, such as robots and physical artifacts, has also been used with built-in software to supplement students’ understanding of AI concepts. Williams et al. (2019a, 2019b) introduced a preschool-oriented programming platform consisting of a social robot (PopBot) and a block-based programming interface. In Williams et al. (2019a), 80 prekindergarten to second-grade children (aged four to seven) were asked to build their own LEGO robot characters by using DUPLO block programming. PopBot is used as a learning companion to demonstrate human-like behavior and to demystify AI concepts for younger students.

The lawn bowling robot (Ho et al., 2019), the Zhorai conversational robot (Lin et al., 2020), micro:bit (Lin et al., 2021), and plush toys (Tseng et al., 2021) have been used in primary schools, while CUHKiCar (Chiu et al., 2021), the Alpha robot dog (Chai et al., 2020), and Raspberry Pi Raspbian with a four-wheel drive chassis (Gong et al., 2018) have been used in secondary schools. For example, in Ho et al. (2019), grade six students built lawn-bowling robots for games and competitions while learning about the binary search and optimization algorithms of machine learning. Chiu et al. (2021) introduced the robotic CUHKiCar to secondary school students so that they could perform face-tracking and line-following tasks.

RQ2: What pedagogical strategies are commonly proposed by studies on AI K-12 learning tools?

As shown in Fig. 5, the four orientations of pedagogy are summarized as authentic/constructive, reflective, didactic, and unplugged. While a total of 17 potential pedagogical strategies were identified within the four orientations in K-12 contexts (Table 5), authentic/constructive methodologies with project-based learning (N = 27) were the most popular pedagogy used across kindergartens (Williams et al., 2019a, 2019b), primary schools (Toivonen et al., 2020; Rodríguez-García et al., 2021), and secondary schools (Gong et al., 2018; Kilhoffer et al., 2023; Sakulkueakulsuk et al., 2018). When teaching AI to students with a diverse range of needs, the evidence demonstrates the positive impact of combining multiple pedagogical approaches in K-12 studies (Heinze et al., 2010; Lee et al., 2021; Williams et al., 2019a, 2019b).

Fig. 5 Four orientations of pedagogical strategies commonly used in AI K-12 education

Table 5 Four orientations of the pedagogical strategies used in AI K-12 learning tools studies

First, among authentic and constructive methodologies, project-based (N = 27), human-computer interaction (N = 7), and play-based active learning (N = 5) approaches have been commonly used in K-12 education. Offering hands-on opportunities to students to learn about real-world applications of AI is an example of project-based learning (Fernández-Martínez et al., 2021; Han et al., 2018; Williams et al., 2019a). Other researchers have examined whether students can acquire AI knowledge through human-computer interactive experiences with tools such as Zhorai (Melsión et al., 2021) and Google Teachable Machine (Lin et al., 2020; Vartiainen et al., 2020), and have found that this does not require any prior knowledge of AI models. In addition, child-centered play-based learning can effectively engage students and encourage them to take the initiative to construct knowledge during the process of imaginative play (Heinze et al., 2010), which involves students adopting the roles of AI developer, tester, and AI robot (Henry et al., 2021).

Pedagogical strategies in kindergartens

Researchers have often used project-based approaches (N = 3), human-computer interactions (N = 3), play-based learning (N = 1), and unplugged activities (N = 1) to teach younger students AI concepts. In a project-based learning approach, students learn by actively engaging in real-world projects. Williams et al. (2019a, 2019b) used a hands-on project allowing prekindergarten and kindergarten students to acquire AI concepts, including knowledge-based systems, supervised machine learning, and AI generative music. Alternatively, Vartiainen et al. (2020) studied human-computer interactions that allowed students to freely explore the input‒output relationship with Google Teachable Machine to identify and to evaluate a problem and find a solution to it. Heinze et al. (2010) focused on imaginative play, which is relevant to young students, as play is associated with various levels of autonomy and provides an engaging introduction to AI and the formation of scientific concepts. Lucas (2009) used unplugged activities to teach the key concepts of computing, including data encoding, data compression, and error detection.

Pedagogical strategies in primary schools

Project-based learning is more frequently used in primary schools than in kindergartens: It has been reported as a learning approach in 14 of the 18 studies of primary school settings, compared to only three of the five studies in the kindergarten setting. Similarly, in primary school settings, studies have revealed a strong dependence on play/game-based (N = 5) and human-computer interaction learning approaches (N = 3).

Projects that demonstrate students’ improved AI knowledge have been conducted. Machine learning projects (Toivonen et al., 2020), LearningML projects (Rodríguez-García et al., 2021), and “AI+” projects (Han et al., 2018) have been designed to demystify AI knowledge. Henry et al. (2021) integrated machine learning in role-playing games, while Shamir and Levin (2021) allowed students to play with AI chatbots to develop AI models and to construct a rule-based machine-learning system. Some researchers have designed learning programs that offer human-computer interaction activities to educate students about gender bias (Melsión et al., 2021) and the social impact of mistakes made by AI models in training datasets (Lin et al., 2020).

Pedagogical strategies in secondary schools

The project-based learning approach (N = 10) is also the most common in secondary schools, followed by collaborative learning (N = 5). First, project-based learning is used to engage students by having them apply their AI knowledge to solve real-world problems. Teachers have reported that AI projects and hands-on activities are effective in keeping students focused on tasks (Kilhoffer et al., 2023). For example, a smart car-themed AI project (Gong et al., 2018), the Redesign YouTube project (Fernández-Martínez et al., 2021), and the agriculture-based AI Challenge project (Sakulkueakulsuk et al., 2018) have been introduced to provide hands-on experience for students to connect their knowledge to their day-to-day lives. Through active exploration, such projects prompt secondary school students to contemplate the personal, social, economic, and ethical consequences of AI technologies (Kaspersen et al., 2021).

Second, collaborative learning allows students to work in groups to promote cognitive knowledge, as it engages them in scientific inquiry with the help of smart devices (Wan et al., 2020). Kaspersen et al. (2021) designed a collaborative learning tool, VotestratesML, together with a voting project allowing students to build machine learning models based on real-world voting data to predict results.

RQ3: What learning outcomes have been demonstrated in studies on AI K-12 learning tools?

Of the 46 articles, 31 reported potential learning outcomes: (1) cognitive outcomes, (2) affective and behavioral outcomes, and (3) the level of course satisfaction and soft skills acquisition.

Cognitive outcomes

Thirty-one studies documented various degrees of positive cognitive outcomes. Students generally showed a basic understanding of AI, including AI rule-based systems (Ho et al., 2019), machine learning principles and applications (Han et al., 2018; Shamir & Levin, 2021), AI ethics (Melsión et al., 2021), and AI limitations (Lin et al., 2020). In Williams et al. (2019a), 70% of prekindergarten and kindergarten students understood knowledge-based systems, whereas Vartiainen et al. (2020) found that, through AI learning tools, younger students developed their computational thinking and their understanding of machine-learning principles and applications. Dai et al. (2023) reported that primary school students taught with analogy-based pedagogy (i.e., using humans as a reference to teach and learn AI) significantly outperformed primary school students taught with the conventional direct instructional approach in terms of developing their conceptual understanding and increasing their AI technical knowledge proficiency as well as their ethical awareness of AI. Other researchers have reported that primary school students demonstrated their understanding of AI by constructing and applying machine-learning algorithms with the help of digital role-playing games (Voulgari et al., 2021) and project-based pedagogy (Shamir & Levin, 2021). Through designing and programming a robot, students increased their understanding of AI biases (Melsión et al., 2021). In secondary schools, researchers have also reported an increase in students’ knowledge of AI algorithms (Yoder et al., 2020) and machine learning concepts (Sakulkueakulsuk et al., 2018), as well as their recognition of AI patterns (Wan et al., 2020). For example, students came to understand fundamental neural network and machine learning concepts by developing a classification model of recycling images (Martins et al., 2023).

Affective and behavioral outcomes

Affective and behavioral outcomes have been identified in AI learning tool studies within K-12 contexts. In general, students’ motivation to learn AI (Han et al., 2018; Shamir & Levin, 2021, 2022) and their interest in the course (Mariescu-Istodor & Jormanainen, 2019; Martins et al., 2023) were enhanced as a result of AI learning activities. Students’ perceptions of the relevance of AI to their life also increased (Kajiwara et al., 2023; Lin et al., 2021). Students scored high on self-efficacy (Kajiwara et al., 2023; Shamir & Levin, 2022) and confidence (Shamir & Levin, 2021) in training and validating an AI system. In Martins et al. (2023), over 45% of 108 secondary school student participants in the introductory course “Machine Learning for all” reported that they perceived AI learning as an enjoyable experience, and 63% of them hoped to learn more about machine learning in the future.

Moreover, students reported that they were highly motivated to explore the Teachable Machine (Vartiainen et al., 2020), to design the robotic arm and computer source codes (Ho et al., 2019), to draw animals and sea creatures for the machine learning project (Mariescu-Istodor & Jormanainen, 2019), and to predict the sweetness of mangoes by using machine learning models (Sakulkueakulsuk et al., 2018).

From the behavioral perspective, high student engagement was reported in project-based (Kaspersen et al., 2021; Shamir & Levin, 2021; Wan et al., 2020) and play/game-based (Heinze et al., 2010; Voulgari et al., 2021) settings. Primary students attended all sessions and expressed a desire to join an upcoming AI contingency course (Shamir & Levin, 2021), while secondary students were actively engaged in scientific inquiry (Wan et al., 2020). Students were also keen on recommending AI games to their friends (Voulgari et al., 2021). Therefore, a combination of play/game-based and project-based approaches may consolidate AI concepts through gameplay while enhancing students’ engagement in AI projects (Han et al., 2018).

Level of satisfaction and soft skills acquisition

Students’ level of satisfaction was found to be positively influenced by constructivist (e.g., project-based) and reflective (e.g., learning by design and learning by teaching) pedagogies (Ho et al., 2019; Shamir & Levin, 2021, 2022). In Lin et al. (2020), students reported a high satisfaction level upon acquiring AI knowledge. Their computational thinking and subsequent project performance were also enhanced. All students completed the course and their AI tasks without any previous learning experience (Toivonen et al., 2020).

The findings from the selected articles reveal that a deep understanding of AI promotes the acquisition of various soft skills. Ali et al. (2019) found that students’ intellectual curiosity increased after engaging in the construction of an AI neuron. By using bulletin boards shared electronically and online chats for feedback, their collaboration and communication skills were also enhanced (Shamir & Levin, 2021). Moreover, students reported gaining problem solving and technical skills when working with AI systems, including coding, designing simple algorithms, and debugging in Scratch learning activities (Dai et al., 2023).

RQ4: What are the research and assessment methods used in studies on AI K-12 learning tools?

In this section, an overview is presented of research methods and data collection procedures within K-12 contexts. Overall, researchers adopted mixed (N = 19), qualitative (N = 15), and quantitative (N = 12) methods in research on AI learning tools in K-12. Mixed methods are predominantly used in both primary school (e.g., Dai et al., 2023; Martins et al., 2023; Shamir & Levin, 2021; Toivonen et al., 2020) and secondary school contexts (e.g., Chiu et al., 2021; Estevez et al., 2019), whereas qualitative methods are commonly used in kindergartens (e.g., Heinze et al., 2010; Vartiainen et al., 2020), as shown in Table 6.

Table 6 An overview of the research methods within K-12 contexts

A variety of assessment methods were used: questionnaires and surveys (N = 30), artifacts/performance-based evaluation (N = 15), interviews (N = 14), observations (N = 5), games assessment (N = 1), and field visits (N = 1) (Table 7). The two most commonly used data collection methods, namely questionnaires and surveys and artifacts/performance-based evaluation, are discussed in this section.

Table 7 An overview of the data collection procedures within K-12 contexts

In terms of assessment methods, questionnaires and surveys (N = 30) and artifacts/performance-based evaluation (N = 17) are the two most commonly used data collection methods across K-12 contexts (Table 7).

In kindergartens, questionnaires and surveys were used within a quantitative methodology to assess children’s perception of robotics and theory of mind (e.g., knowledge access, content false belief, and explicit false belief) (Williams et al., 2019a, 2019b).

Surveys were used to evaluate primary school students’ motivation (Lin et al., 2021), self-efficacy in AI learning (Shamir & Levin, 2022), and perceived knowledge and competence (Dai et al., 2023; Mariescu-Istodor & Jormanainen, 2019; Ng et al., 2022). In addition to Ali et al. (2019), who used the Torrance test for assessment, researchers also utilized pre- and posttests (Tseng et al., 2021) to compare the AI learning outcomes of control and treatment groups in primary school settings (Melsión et al., 2021). Others provided AI educational experience without stating the assessment method (Ho et al., 2019; Lee et al., 2020; Tseng et al., 2021). Heinze et al. (2010) conducted AI learning activities without assessing learning outcomes. Shamir and Levin (2022) designed a questionnaire based on “constructionist validated robotics learning” for machine learning construction (the questionnaire included statements such as "I can make a ML system", "I can propose ideas for using ML to solve problems."). Dai et al. (2023) used multiple choice questions (e.g., "Which of the following devices or systems is an intelligent agent?") to evaluate the AI knowledge of primary school students according to Bloom’s Taxonomy.

In secondary schools, surveys are used to measure students’ information knowledge acquisition (Priya et al., 2022), perceived abilities (Chiu et al., 2021; Ng & Chu, 2021) and futuristic thinking, engagement, interactivity, and interdisciplinary thinking skills (Sakulkueakulsuk et al., 2018). For example, in Priya et al. (2022), surveys were used in the first phase of their study to test the knowledge gained by students in three AI areas, namely, supervised learning (e.g., "What is the underlying idea behind supervised learning?"), gradient descent (e.g., "In gradient descent how do we reach optimum point?"), and KNN classifications (e.g., "Using underlying principle of KNN classification classify a fruit which is surrounded by 2 apples and 1 mango in its nearest neighbors."). In the second phase of the study, surveys were used to evaluate students’ satisfaction with the design of the game “ML Quest”, which introduced machine learning concepts based on the quality factors of the technological acceptance model (e.g., “Visualizations displayed by ML-Game are relevant to the concept taught at each level”).
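The KNN survey item quoted above (a fruit whose nearest neighbors are two apples and one mango) can be worked through with a short sketch. This is only an illustration of the majority-vote step of k-nearest-neighbor classification; the function name is hypothetical, and the distance computation that would normally select the neighbors is assumed to have already been done.

```python
from collections import Counter


def knn_vote(neighbor_labels):
    """Return the most frequent label among the k nearest neighbors.

    Ties are resolved in favor of the label encountered first.
    """
    return Counter(neighbor_labels).most_common(1)[0][0]


# The three nearest neighbors from the survey question: 2 apples, 1 mango.
prediction = knn_vote(["apple", "apple", "mango"])
print(prediction)  # prints: apple
```

Because apples form the majority among the three neighbors, the unknown fruit is classified as an apple.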

Artifact-based/performance-based assessments are embedded in a large number of studies to evaluate learning outcomes. Through artifacts (e.g., PopBots), Williams et al. (2019a, 2019b) evaluated kindergarteners’ knowledge and understanding of supervised machine learning. Ho et al. (2019) used a performance-based assessment to assess primary students’ understanding of optimal data training and its AI applications. The artifact analysis of Shamir and Levin (2021) involved the construction of a rule-based AI system, which included designing, understanding, and creating the AI neural network agent. Dai et al. (2023) used a drawing assessment to evaluate primary school students’ understanding of AI and its impact on their cognitive development using prompt questions (e.g., "What AI can do? What would you like to use AI for?") to stimulate their thinking.

Moreover, Yoder et al. (2020) focused on secondary school students’ block-based programming artifacts to examine their knowledge of AI search algorithms and breadth-first search (BFS), as well as their understanding of the possibility of gender bias when using AI screening tools in recruitment. In Martins et al. (2023), machine learning model artifacts created by students were used as evidence to demonstrate their learning outcomes. A performance-based assessment was used to evaluate students’ ability to correctly label the recycling trash images in the classification process.

Discussion and conclusion

The results of this study are consistent with Kandlhofer et al. (2016), who found that a variety of learning tools have been designed to support various learning objectives for students from kindergarten to university. The previous literature also indicates that many learning tools, such as intelligent agents and software, are effective in facilitating adolescents’ and university students’ acquisition of computational thinking skills (Çakiroğlu et al., 2018; Van Brummelen et al., 2021), whereas the availability of such tools for kindergarten and primary students is often overlooked. Few researchers have investigated whether AI learning tools can bridge the learning gap of younger students (Zhou et al., 2020). This study revealed that learning tools such as PopBots, Teachable Machine, and Scratch can help address the diverse needs of younger students who have no prior programming experience across K-12 educational levels (Resnick et al., 2005), leading to a richer visual learning experience and improving instructional quality (Kaspersen et al., 2021; Long & Magerko, 2020).

Previous reviews have indicated that many pedagogies are suitable for AI education, although without reference to students’ learning outcomes (Sanusi & Oyelere, 2020). The findings of this study enrich existing knowledge of the positive effects of authentic and constructivist pedagogies in affective, behavioral, and cognitive aspects, as well as students’ level of satisfaction in AI learning. This study reveals that multiple pedagogies, such as project-based learning, experiential learning, game-based learning, collaborative learning, and human–computer interaction, are widely used in K-12 educational settings. An emerging form of analogy-based pedagogy, which evaluates the AI knowledge of primary school students by assessing their drawings, is also identified. The focus of this analogy-based pedagogical strategy is the comparison of humans and AI, in which humans are gradually moved from an analogy to a contrast to highlight the characteristics, mechanisms, and learning procedures of AI. It demonstrates and reflects the dialogic quality of the relationship, with shared enquiry and shared thinking among students and AI learning tools. This is significant given the new cognitive demand of the AI era, as it provokes a shift in the role of the students toward thinking together and learning to learn together (Wegerif, 2011). In future studies, exploration of additional emerging pedagogies (Yim, 2023), the co-creation of arts-based possibility spaces (Burnard et al., 2022), and dialogic learning spaces (Wegerif, 2007) in AI literacy education can be considered.

In addition, educational tools and applications not only contribute new ways of knowing and doing but are also placed at the center of AI literacy activities and programs, rather than playing a supporting role in the primary purpose of education. This central role is expanding to serve the human need for education. The use of multiple educational learning tools and pedagogical strategies may be influenced by various factors in the teaching process, including students’ gender, background knowledge, and educational setting, all of which may affect their learning styles and motivation to learn AI. These factors and issues can be explored in future studies.

In this review, it was found that some studies assessed students’ performance by using the Torrance test for creativity (Ali et al., 2019), an AI knowledge test (Ng et al., 2022; Wan et al., 2020), pre- and postsurveys (Chiu et al., 2021; Estevez et al., 2019), and comparisons between control and treatment groups (Dai et al., 2023; Melsión et al., 2021), while others used subjective measures, including self-report surveys. Although artifact-based and performance-based approaches have been increasingly adopted in data collection procedures, some researchers used them as evidence of learning without scoring them against established marking criteria. There is room for introducing objective, rubric-based evaluation mechanisms to assess the quality of the suggested methodologies. Moreover, the lack of agreement on assessment criteria and instructional feedback shows that further research is needed to support the wide application of AI teaching in K-12 classrooms.

Research implications

Based on this study, the use of intelligent agents is recommended, including Teachable Machine, Machine Learning for Kids, and LearningML. Kindergarten students can benefit from learning tools such as PopBots, while software such as Scratch and Python can be introduced to demystify core AI principles for primary school students and to help secondary school students create AI-driven solutions and models. Although hardware such as robotics and physical artifacts is generally effective, it may be too costly to deploy at scale.

This review reveals that constructivism, constructionism, and computational thinking are instrumental in addressing AI literacy education. Unfortunately, little research has adopted theoretical frameworks or conceptual models of reference for AI curricula, educational activities, or the design of AI learning tools and applications. Theoretical frameworks for AI literacy learning are needed to guide the teaching of kindergarten, primary, and secondary school students and the effective use of AI learning tools within AI literacy education. Usability, AI ethics, and transparency must be addressed in tool design to ensure that issues pertaining to data privacy and security will not arise. Moreover, there is currently insufficient theory-based, rigorous research on the effectiveness of AI educational tools in meeting the diverse learning needs of students. Children may be invited to codesign with application designers. Researchers may also conduct theory-based and outcome-oriented quantitative and qualitative research on AI educational tools, which may be significantly beneficial to students.

More evaluation and documented analysis regarding the effectiveness of learning tools should be conducted to inform stakeholders of the existing trends in the field, pedagogical strategies, and instructional methods for teacher professional development.

More research, analysis, and evidence are needed to determine the effectiveness of AI learning tools before they are scaled up based on a risk-benefit analysis. Researchers should also clearly define the educational settings in which specific AI learning tools are appropriate to support the effective delivery of AI content in the classroom.

Recommendations

For educators

Aside from providing students with AI knowledge and skills that the market demands (Burgsteiner et al., 2016) and encouraging all citizens to be AI literate (Goel, 2017; Pedro et al., 2019), educators may promote holistic AI literacy education by considering humans, nonhumans (e.g., animals and machines) (Yim, 2023) and environmental elements (Miao & Shiohira, 2022) in their teaching content. Ethical questions should also be considered, including inclusivity, fairness, responsibility, transparency, data justice, and social responsibility (Crawford, 2021; Benjamin, 2019). To provide a roadmap for sustainable AI education implementation and development, it is essential to involve teachers in the design of learning tools and understand their perceptions regarding AI literacy education, as well as provide pedagogical strategies, resource development, and needs-based professional training for both preservice and in-service teachers.

For teachers

Children learn best at a certain stage of cognitive development (Ghazi & Ullah, 2015). It is recommended that the content of instruction be consistent with students’ cognitive developmental level, as it influences their readiness and ability to learn (Piaget, 2000). Accordingly, the technical and content depth of the educational learning tools should align with students’ age and the teaching objectives, and teachers should understand students’ cognitive development to plan age-appropriate activities with suitable learning tools. More collaboration among teachers with various pedagogical experiences across various educational levels may lead to more innovative and efficient teaching processes.

For researchers

Researchers should report evidence of the reliability and validity of their findings where applicable, since such data are crucial to evaluating the quality of their recommended learning tools or pedagogies. This can also aid other academics in updating their research on existing and developing pedagogical strategies. Researchers may consider designing and developing a standardized AI assessment mechanism that can be used across different grade levels to compare students’ AI literacy. This approach permits the standardization of assessment criteria and instructional feedback and thus better supports the wider application of AI teaching in K-12 classrooms.