1 Introduction

Computer science (CS) education in primary and secondary schools (commonly referred to as K-12 in the education literature) has become increasingly prevalent in recent years, with more countries adopting, developing, and implementing standards of practice. Artificial intelligence (AI) and machine learning (ML) are essential additions to K-12 computing education, as students are increasingly interacting with these technologies in their daily lives. For instance, the Irish Leaving Certificate Computer Science specification mentions both artificial intelligence and machine learning, albeit in a single learning outcome: explain when and what machine learning and AI algorithms might be used in certain contexts.

While there is considerable literature on general computing education [14] such as programming at the K-12 level, research on AI/ML in K-12 education is much less common. Although “AI education has begun to progressively trickle down to the K-12 range” [43], research on AI/ML in K-12 education is still in its infancy [35].

This paper describes a systematic review of the existing literature on AI/ML school-level education. The aim of this review is to broadly analyse the research that has been conducted in K-12 computing education that relates to AI/ML. Specifically, we are interested in the following questions:

  1. RQ1: Who has been the focus of the research on AI and ML education at school level?

  2. RQ2: What course content appears in this research?

  3. RQ3: Where has this research taken place?

In answering these questions we also make several important observations with implications for teaching AI/ML to school students.

2 Method

Selecting search terms for a broad review of this literature proved challenging. Terms that are too general result in an unwieldy set of papers (a very large percentage of which are irrelevant), while terms that are too specific are likely to miss relevant papers. After some trial and error with a range of databases, we selected a combined search string that captures the area of interest: (artificial intelligence OR machine learning) AND (school OR k-12). To check that this search string was appropriate, we inspected the results for relevance to our research questions and to the overall subject matter, and confirmed adequate coverage of a small test set of papers that together served as a model of the area we intended to survey. The search terms were then applied to the titles and abstracts of the ACM Full-text Collection. The search was conducted on 10 August 2022 and identified 197 papers.
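The logic of the search string can be sketched in a few lines of Python. This is only an illustration of the boolean structure, not the query mechanism of the ACM Digital Library: the function names are hypothetical, the sample records are invented for the example, and plain case-insensitive substring matching is an assumption (so "school" also matches "schools").

```python
# Term groups taken from the search string in the text:
# (artificial intelligence OR machine learning) AND (school OR k-12)
TOPIC_TERMS = ("artificial intelligence", "machine learning")
SETTING_TERMS = ("school", "k-12")


def contains_any(text, terms):
    """Return True if at least one term occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def matches_search_string(title, abstract):
    """Apply the boolean search string to a record's title and abstract."""
    searchable = f"{title} {abstract}"
    return contains_any(searchable, TOPIC_TERMS) and contains_any(searchable, SETTING_TERMS)


# Invented records, used only to exercise the filter.
records = [
    {"title": "Teaching machine learning in high school", "abstract": "A pilot course."},
    {"title": "Machine learning for compiler optimisation", "abstract": "A benchmark study."},
    {"title": "Robotics clubs", "abstract": "Artificial intelligence activities for K-12 students."},
]

hits = [r["title"] for r in records if matches_search_string(r["title"], r["abstract"])]
print(hits)
```

A filter of this kind only narrows the candidate set. As described above, the retrieved papers still have to be inspected by hand for relevance.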

We selected the ACM Digital Library as our initial database because it contains the most computer science (or computing) education papers of any one publisher by a significant margin. A search of DBLP for computer science education returns 1,249 matches. Of these, 614 matches (40%) are from the top 10 venues, of which 6 are ACM venues accounting for 74% of the matches from the top 10 venues. Searching for computing education returns 1,328 matches. Of these, 431 (32%) are from the top 10 venues, of which 9 are ACM venues accounting for 93% of the matches from the top 10 venues.

Figure 1 shows the number of papers included in the originally retrieved set for each year since 1973. It clearly shows a marked increase in recent years in the appearance of the terms artificial intelligence and machine learning alongside the terms K-12 and school in the ACM Full-text Collection. Only 10.7% of the 197 search results are from the 35-year period from 1973 to 2007, with the rest from the almost 15-year period from 2008 to August 2022. Over one quarter of the search results are from the year 2021.

Fig. 1. Search results from the ACM Full-text Collection, Aug 10, 2022. 197 results were returned from a keyword search and 33 papers met the inclusion criteria.

All 197 results were written in English, but only 33 papers (16.75%) met the inclusion criteria for this review, in that they expressly related to the teaching of AI and/or ML in K-12 education (see Table 1). A further 62 papers (31.47%) were excluded because they were fewer than 3 pages in length. Some of these shorter papers described interesting preliminary work [17, 28, 29, 38], but none reported findings. The remaining 102 results (51.78%) discussed AI/ML in the context of unrelated fields, or involved older students.
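The screening breakdown can be checked with a short calculation. The sketch below uses only the counts reported above; the dictionary labels and the `share` helper are names chosen for this illustration.

```python
TOTAL_RESULTS = 197

# Counts as reported in the text; the labels are shorthand for this sketch.
screening_outcomes = {
    "met inclusion criteria": 33,
    "fewer than 3 pages": 62,
    "unrelated field or older students": 102,
}

# The three groups should partition the full result set.
assert sum(screening_outcomes.values()) == TOTAL_RESULTS


def share(count, total=TOTAL_RESULTS):
    """Percentage of the total, rounded to two decimal places."""
    return round(100 * count / total, 2)


percentages = {reason: share(count) for reason, count in screening_outcomes.items()}

for reason, pct in percentages.items():
    print(f"{reason}: {pct}%")
```

The three groups sum to 197, and the rounded shares reproduce the 16.75%, 31.47% and 51.78% quoted in the text.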

Table 1. The 33 papers of the 197 search results that met the inclusion criteria.

3 Results

3.1 RQ1: Who Has Been the Focus of the Research?

Focus on Students: The majority of the papers focus upon students’ teaching and learning (30 of 33 papers, approx. 91%), although 7 of these papers also address the role of the teacher (see Table 2). The age ranges of students vary widely. For example, one paper describes teaching basic ML concepts (image recognition, supervised learning, training data, model, feature, classifying, accuracy) to children aged 5 years and older [36]. Another describes how ML fundamentals can be taught to children aged 10–16 years through hands-on activities on an educational web platform [26].

Table 2. The research focus (students only, teachers only, or students & teachers) of the studies that are included in the review.

Number of Student Participants: Participant numbers also varied widely (see Table 3). For example, in Weng et al. [42] two elementary school math teachers, a parent, and a primary school student were involved in the development of robot maths quiz games.

In contrast, a 2012 paper [31] sets out a software engineering curriculum for students that includes subjects in AI and ML. Students were given a choice between a 3-unit or a 5-unit programme in their school, which held over 1,000 students aged 16–18 years.

Focus on Teachers: Three papers had a specific focus on teachers (see Table 2). Polak et al. [25] detailed an initial qualitative study with 14 teachers, school psychologists, and education managers from schools in four European countries. A survey targeting a larger group of European teachers was also designed to collect teachers’ needs and expectations, with the aim of creating a supportive online educational platform that aids teachers in AI education. The second study [22] introduced a new instrument to measure teachers’ trust in AI-based educational technology, and used it to portray secondary-level school teachers’ attitudes towards AI. The final teacher-focused paper, by Lin & Van Brummelen [18], described workshops co-designed with 15 K-12 teachers, where teachers and researchers co-created lesson plans using AI tools and embedding AI concepts into various core subjects.

Table 3. The number of student participants that are in each study included in the review. * Exact number not provided.

Targeted Groups: A number of the studies included in the review described work focused on specific, typically underrepresented groups. For example, a 2008 study using video games [9] specifically focused on girls, as did the Stanford Artificial Intelligence Laboratory’s Outreach Summer (SAILORS) program [40]. In another study [7] each teacher chosen for the workshops was asked to recruit 5 or 6 of their students, with at least 50% of the students identifying as non-male. The researchers in this study also targeted particular schools to provide opportunities for enrichment to low-income families. A 2021 paper by Druga & Ko [12] recounts how they specifically chose many different locations for their workshops to include a diverse population of students. A 2008 study [4] describes how fifteen blind high school students created and personalized instant messaging chatbots using C# and were guided by both blind and sighted mentors. A 2021 study [18] that explores teachers’ perspectives on the adoption of AI curricula and learning tools prioritised teachers who taught non-STEM classrooms. In 2019, a storybook was produced as part of a workshop; it was written in a colloquial style to make the workshop content accessible to multiple people in a student’s household [30]. Finally, the Curated Pathways to Innovation (CPI) web tool [19] was launched in 2020. It gathered existing online resources for CS and specifically focused on K-12 girls and historically underrepresented groups, so as to help students navigate their career journeys in STEM, particularly in computing.

Table 4. The topics referenced or covered in the courses that are described in the papers included in the review.

3.2 RQ2: What Course Content Appears in the Research?

A broad range of course topics was covered or referenced in the search results (see Table 4). The most important are listed below.

Ethics/Social Good: Sinha & George [30] describe a course that introduced students to the basic idea of intelligence, examining analytical, emotional, moral and social intelligence. In a 2020 course [27] students designed interactions to make an interface more human-friendly and ethical. Bilstrup et al. [5] highlighted criticisms from the literature with regard to ethical issues of ML.

Data Collection/Analysis: Srikant & Aggarwal [32] described a half-day data science tutorial that was designed to expose students to the full cycle of a typical supervised learning approach, while Zimmermann-Niefield et al. [45] used wearable sensors to allow students to leverage their domain knowledge to collect data, build models, and test and evaluate them. Mike et al. [21] reported on the pilot implementation of their data science curriculum, while Perach and Alexandron [24] discussed an ML and Deep Learning blended learning programme that used MOOCs. Bilstrup et al. [6] enabled students to explore different data types and sources in their card-based workshop.

Classifiers: A 2022 study explored children’s interactions with a simple image classification tool that used two features to classify the children’s own image data [36]. In Vachovsky et al. [40] students programmed a linear classifier. They also built their own Naive Bayes classifier, and some students implemented K-nearest neighbors classification. Sinha & George [30] describe how students wrote R code to create a simple model to classify flowers.

Natural Language Processing (NLP): A 2021 pilot study by Hjorth [15] had students use the Natural Language Processing 4 All (NLP4All) tool to learn about the policy views and communication styles of political parties by classifying tweets. In Vachovsky et al. [40] students learned how to use NLP to determine which areas of the world needed which kind of disaster relief (e.g. water or medical care). During a co-design workshop [18] teachers made connections by starting with a core subject concept and relating it to AI. NLP was identified as having a potential connection to English.

Computational Biology: In Vachovsky et al. [40] students were taught some of the technical methods that are used in computational biology, focusing on gene expression from different types of cancer. Computational Biology also appears in the ‘Introduction to Artificial Intelligence’ topic in Sperling and Lickerman’s 2012 curriculum [31]. Tedre et al. [34] highlighted biological computing as an emerging technology.

Problem Solving Techniques: In a 1985 paper [23] Ourusoff discusses the nature of intelligence, and describes techniques that can be used in the classroom to model problem solving behaviour. In 2018, students were given a basic course in Python programming using an interactive tool with mathematical reasoning problems [2]. Researchers analysed how students improved their approach to problems.

Explainable AI (XAI)/Black-Box Solutions: The NLP tool NLP4All was designed to support students through an XAI interface. It scaffolds students without coding skills, helping “students find relationships between categories and features without explicit a priori knowledge” [15]. Meanwhile, Wen et al. [41] contend that their study on face glyph data visualization demystifies ML by looking inside the black-box.

Various Topics: A number of additional topics appeared in our results. For example, Korf [16] examined circuits in their syllabus, Carmichael [9] looked at game design, Sperling & Lickerman [31] included algorithms and graph theory, and Vachovsky et al. [40] explored the topic of self-driving cars. Sinha & George [30] covered the historical development of machines, and in Lin & Van Brummelen [18] teacher groups identified AI project ideas such as “endangered plant identification” or a “mobile app for home automation”. Chittora and Baynes [10] demonstrated a regression problem and Mike et al. [21] explored algorithms such as the K-nearest neighbors algorithm. Sabuncuoglu [27] developed a 36-week open-source AI curriculum that included the history of human and computer interaction, prototyping, and soundwaves. They also provided students with the opportunity to complete a project to address a United Nations Sustainable Development Goal (SDG) [1] problem. Bilstrup et al. [5] mention the environmental cost of training ML models.

Coding: A coding element appeared in 14 of the 33 papers identified (42.4%); see Table 5. Python is the most commonly used language in our results. A CS syllabus that was designed in 1983 included programming segments in LISP [16]. In a 1994 paper [33] three students used C to develop software to play the game Connect 4 and to control a robotic arm making the moves on a vertical grid of six rows and seven columns. Bigham et al. [4] describe how students created their own chatbots. A small subset of C# was taught to students in order to prevent them from being overwhelmed by the syntax. Sinha & George [30] outline how students were introduced to very basic programming using the R programming language. They were taught how to write code and to use the iris dataset to create a simple model to classify flowers. In another study [20] an ML method for object recognition was developed, and students were asked to use a language familiar to them (typically C, C++, Python or JavaScript). The teacher, with the help of the students, built a web app (HTML and JavaScript) that learned to recognize objects when shown to a camera. A 2020 study used Zenbo as the development tool to write Zenbo Scratch for a robotic quiz game system for primary school students [42]. A software engineering curriculum for high school students used functional programming in the DrRacket environment [31], while a 2022 course that taught reinforcement learning (RL) through virtual robotics used Swift [11]. A blended learning program in 2022 used a series of Massive Open Online Courses (MOOCs) by Professor Andrew Ng. While not identified in the literature, these courses typically use Octave (MATLAB) [24].

Table 5. The programming languages used in the courses that are described in the papers included in the review.

3.3 RQ3: Where Has the Research Taken Place?

The United States (US) was the most represented country in the research, with 14 of the 33 search results (42.4%) being based there, or being the principal location of the work/authors—see Table 6.

Table 6. Location of Research in 33 papers selected from 197 search results.

4 Discussion

In addition to illustrating the landscape of AI/ML education at the K-12 level, this review gives rise to a number of important questions which we discuss here.

Which groups do not appear in the research? Aside from Eaton et al. [13] there is little evidence of qualitative research that involved experts, the school community or parents in the search results (there are some minor references to these groups [12, 18, 25, 27, 30, 36, 42]). It may be helpful in future work to gain these broader perspectives.

What do we know about the students who have been the focus of the AI/ML K-12 educational research? Overall, in terms of targeted approaches towards disadvantaged groups, we can glean very little from these search results. It is difficult to determine who is progressing and who is getting left behind. This is reflected in a 2020 study by Upadhyaya et al. [39] in the US, who analyzed seven years of K-12 computing education research data. They identified that “while it is clear that computing has entered the K-12 space, what is still not clear is how equitable the access is to the computing due to data that is either not being collected or analyzed or is being under-reported”. Bryant et al. [8] highlight how the stereotypes about “who does computer science” can preclude interest in the field, with many perceiving computing as “irrelevant” and “asocial”. They state that the “underrepresentation in computer science of women, domestic students of color, and students of lower socioeconomic status” is a national issue. A number of results in the search employed robots, smart speakers or specially designed web-based environments. Dietz et al. [11] highlighted issues such as “costly specialized equipment and ample physical space” as “barriers that limit access”.

What content has not been addressed in the search results? A wide range of content appeared in the search results, but as Tedre et al. [34] have highlighted, there are other topics that are “working their way toward us”. These include quantum computing and neuromorphic computing. Auccahuasi et al. [2] maintain that Python “is being widely applied” in AI and quantum computing, as there are “multiple libraries such as numpy and scipy for scientific data processing, scikit-learn and TensorFlow for artificial intelligence and QISKit for quantum computing, which are constantly used, reviewed and improved by a large community of programmers”. There is little reference to frameworks or curricula in the literature. Both Mike et al. [21] and Sperling & Lickerman [31] set out curricula that are linked to the Israeli high school CS curriculum. Hjorth [15] presents a learning unit that has been aligned with the Danish national standards for Social Studies. Sabuncuoglu has set out a proposed long-term curriculum [27], based upon Touretzky’s [37] ‘Five Big Ideas in AI Education’. Polak et al. [25] used the ‘Will, Skill, Tool’ model as a theoretical lens to guide the design of educational content and online platforms, so as to enable teachers to integrate AI education into their classrooms.

5 Conclusion

Based on these results we have found a recent, marked increase in AI/ML K-12 computing education research, which has mainly taken place in the United States. There are wide variations in the age ranges of students involved and the number of student participants in each study. There is very little research that specifically focuses on teachers teaching AI/ML (although there is no shortage of use-cases for AI and ML aiding teachers [3], this was not a focus of this research). Reference to experts, parents, or the wider school community is also minimal. Further, a very small proportion of the research is focused on girls or those from historically underrepresented groups. We found a lack of clarity around equity of access to AI/ML K-12 courses and, overall, we are unsure as to how successful AI/ML K-12 courses have been at recruiting girls to CS and/or ultimately helping to retain women in the field. We have identified evidence of the emergence of Python as the dominant programming language in K-12 courses. Finally, there are wide variations in course content, and little alignment to CS frameworks or curricula. We have identified a number of open questions in the research work on K-12 AI/ML education and these will be addressed in future work.