1 Introduction

Countries in Europe are implementing national strategies for AI; however, doubts arise about how to educate children in AI and the right age at which to introduce its concepts [1]. Different trends have been observed across nations. Chinese publishers have already introduced AI textbooks for high schools and have presented AI books for preschoolers [2], which points to the need to teach AI concepts earlier, at the middle school level. MIT has also developed an AI ethics curriculum for this educational level [3]. AI may be taught differently across nations because of resistance to curriculum modification and the many competing topics that are more likely to enter school syllabuses (e.g. gender, environment or financial education). In this paper, we compare the effectiveness of introducing AI notions through already integral parts of the Grade 8 Technology course (e.g. ML examples using Scratch) with that of a practical out-of-hours AI workshop.

Madrid was one of the first regions in Europe to introduce a compulsory course on Technology, Programming and Robotics in middle school [4]. Students from 1st to 3rd-year ESO (Spanish Secondary Education) have 2 h of Technology per week. These topics are not obligatory in the national curriculum, though. So far, AI has been taught only within the Robotics subarea, either as an after-school activity in Primary Education or as an integrated part of the Technology curriculum. AI activities and curriculum modification initiatives often receive private corporate funding. In 2019, Google Australia sponsored MOOCs launched in cooperation with the University of Adelaide to introduce these contents in Primary and Secondary classes (see footnote 1). This is our breeding ground to begin formally introducing AI in all its multiple dimensions.

This work describes the practical experience, the AI curriculum for middle school and early high school, the pre- and post-survey, and the motivational results. It ends with a discussion of the curriculum’s success and the research question: Are the educational system and the middle school ready for AI?

The rest of the paper is organised as follows. Sections 2 and 3 describe the practical experience of introducing AI in Spanish middle schools and the participants’ background. In Sect. 4 we describe in depth the content, general overview and timing of a theoretical AI didactic unit for middle school. Section 5 gathers the survey results and the study’s details for grade 8 (middle school) and grade 10 (1st-course high school). It also describes the motivational study we carried out as a tool to measure the didactic unit’s success. Finally, we draw conclusions in Sect. 6, describing our impressions after the experience and discussing the best age to introduce AI concepts: late middle school or early high school.

2 Experience

The aim of this experience is to assess whether middle school students are prepared to take up Artificial Intelligence. Explaining AI beyond university courses and knowledgeable audiences is a challenge for AI researchers. Universities in Spain are ready for a bachelor’s degree in AI. It is easy to cover all AI concepts at the university level, but middle school and high school students do not know what AI stands for, although they do have intuitive ideas about it. McCarthy coined the term ‘AI’ for the 1956 Dartmouth conference, which is now considered the field’s founding event [5], calling it ‘the science and engineering of making intelligent machines’. In our view, a hands-on workshop should include practical experience with models; students themselves should be able to train and test AI models for speech and image recognition.

Researchers and middle school Technology teachers put the workshop into practice during the fall of the 2019/2020 academic year. The action plan was first to introduce AI and ethics as a general schoolwide theme and then to proceed with three AI project-based activities: two for year 8 (a theoretical/hands-on workshop for Group G and a hands-on workshop for Group E) and a shorter one for year 10. The Technology teacher replaced the regular curriculum with this proposal: a 2-week AI workshop for year 8 and a 1 1/2-week workshop for the older students in year 10. In total, 84 students took part in at least one program session: 29 in Group G, 28 in Group E and 27 in year 10. The students took a motivation test right before and after the AI activities. Only participants without absences in the evaluation sessions were considered for the study. We gathered motivational data from 75 students: 21 in Group E, 27 in Group G and 27 in year 10. The remaining 9 of the 84 students took part in the AI activities but did not complete the motivational test. The didactic unit addresses AI’s core concepts, e.g. definitions, history, areas and uses of AI, and the differences between supervised and unsupervised learning. Finally, AI applications using Scratch and the ethical considerations that AI entails were presented. The hands-on experiences include activities in which students themselves can build and use AI, e.g. the game Survival of the Best Fit, debates, and training models using AI extensions for Scratch.

We engaged 84 students, their Technology teacher and an assistant teacher in the AI activities, as reflected in Table 1. The participants were between 12 and 16 years old: 57 students are between 12 and 13 years old, and the 27 older students are between 14 and 15. There were 41 girls and 43 boys, so the groups are gender-balanced. Recruiting was not necessary because the participants were registered students of Technology, Programming and Robotics. We conducted the workshop on site, and the students completed a SIMS (Situational Motivation Scale) test as a pre-test activity and another one at the end of the sessions. Motivation is described as ‘reasons underlying behaviour’ [6]. Motivation assessment in education very often uses the SIMS test [6, 7]. We describe the concrete application and results in Sect. 5.2.

The parents or guardians of all participants signed a consent form at the beginning of the academic year covering participation and the sharing of images and video for all subjects, including Technology, Programming and Robotics.

Table 1 Gender distribution of participants in the AI activities

3 Participants

We conducted this activity during the fall semester of 2019/2020 with students from one integrated middle/high school. The children come from average-income families and attend a centre with around 1500 students. All have access to class material, and there are no disadvantaged students.

Concerning the participants’ personal background, every student has some prior experience with programming. They have studied programming for at least one semester, mostly in Scratch. The activities were carried out during the weekly Technology, Programming and Robotics (TPR) class and supervised by three people: the TPR teacher, a teacher in training/researcher and a North American language assistant.

Based on the participants’ background, we designed three different AI project-based activities:

  • A hands-on-only workshop for Grade 8, Group E. Duration: 2 weeks. 4 h in total.

  • A theoretical/hands-on workshop for Grade 8, Group G. Duration: 2 weeks. 4 h in total.

  • A shorter workshop for older students, Grade 10. Duration: 1 1/2 weeks. 3 h in total.

4 Didactic unit overview

The didactic unit comprises three subunits: a theoretical Introduction to AI, AI Ethics, and Scratch and AI. Each activity/subsession lasted at least 60 min. Most of the assessments and activities are designed as individual work. We did observe collaboration in small informal groups, though.

The first subunit is designed to provide a straightforward Introduction to AI aimed at middle school and first-year high school students. Our materials were initially aimed at university students and based on the well-known book by Russell and Norvig [8]. One of the first challenges was adapting the materials so that the course had no prerequisites. The theoretical introduction covers the history of AI and its areas and subareas, such as Artificial Vision, Robotics, Knowledge Engineering, Machine Learning and Learning Algorithms, to name but a few. We continue by presenting the differences between humans and machines (e.g. understanding machine language, how a photo is encoded in a computer, and a brief description of the Turing Test). This leads into the debate on the Future of Technology (e.g. Will robots replace us all?). Afterwards, we describe examples of AI (e.g. mobility aids, assistance for blind people, health applications, translation tools and content recommendations). The students continue their path to learning AI with key concepts, such as Machine Learning techniques (classifiers, neural networks, supervised vs unsupervised learning, clustering). Finally, the introduction deals with well-known areas like Natural Language Processing (NLP) and ends with an Introduction to AI Ethics. A set of open questions reinforces the classroom debate, for example on the logic and algorithms used in self-driving cars, industrial robots and the future of AI.

In the second subunit, AI Ethics, we proposed three activities: (1) playing a game called ‘Survival of the Best Fit’, (2) debating the newspaper of the future and (3) redesigning the YouTube or Google website. The game, Survival of the Best Fit (see footnote 2), was designed by NYU alumni and targets the new use of AI in recruiting. It involves complex concepts like algorithmic bias and requires a short introduction to the class. First, the instructor plays the game in front of the class as a speak-up activity. Next, the students themselves play to understand how AI can impact lives and decisions, and to grasp the concept of algorithmic bias. In the beginning, hiring choices are conscious decisions, but after model training and automation, the user loses track of why the game hires one candidate rather than another. This activity shows the real-world effects of automation and opens a debate in which students reflect on the consequences of bias and negative uses of AI.

Newspaper of the future (Fig. 1) is a collaborative activity in which students debate the current state of the art in different uses of AI. We chose this activity to foster critical thinking among students as future users and designers of technology.

Fig. 1 Screenshot of newspaper of the future collaborative activity

In the third activity, redesigning the Google or YouTube website (see Fig. 2), children participated in a round table about the considerable impact that companies like Google have on the software we use in our everyday lives, and about the pros and cons we encounter when using it. Previous attempts to introduce AI in the US seem to be biased towards Google products. Indeed, students may already have a formed opinion about software products they use, and this familiarity can act as a motivating factor in the activities. To help students, we provided some guiding questions, such as ‘What do you like the most about YouTube?’ and ‘What would you change?’. This activity is modelled on the YouTube redesign activity of the MIT AI curriculum [3], adapted to Google AI in general terms. Students enumerate all the Google products they know, e.g. Google Maps, Google Translate, Google Videos and Calendar, to name but a few, and identify the AI systems in them. They mention what they like and what they dislike, e.g. notifications, ads, copyright. This leads to a deeper debate and the possibility of introducing complex ideas like copyright and the reasons behind it. Students feel like real actors and active designers/stakeholders when instructors give them the possibility of coming up with a new AI for Google, YouTube, etc.

Fig. 2 Screenshot of Google redesign round table activity

For the activities of the third subunit, Scratch and AI, we recommend software/apps like Cognimates (see footnote 3) and Machine Learning for Kids (see footnote 4) to teach students to train models for speech and image recognition. These are online platforms where students can experiment with AI and train their own models or play with pre-trained models. Young learners are familiar with the Scratch interface, and the platforms additionally enable them to explore AI concepts and build their own AI Scratch programs using the AI block extensions. This activity aims to integrate AI concepts within the existing Technology, Programming and Robotics curriculum, which covers Scratch in depth. The Grade 8 curriculum already covered several Scratch practices over the course.

With Machine Learning for Kids, the students follow the process of building, training and evaluating their own model for speech recognition and sentiment analysis. To create models, the application provides prerecorded noises and claps associated with the positive category, and likewise for the negative category. Students select three different avatars for the positive, negative and waiting states, e.g. a happy/sad ghost. Once the students have enough samples to train the model, they can go on to build and test it. The avatar then reacts to the words. In this way, the students complete a model for the sentiment analysis of sounds or spoken language. Cognimates offers an extension for the sentiment analysis of words, and it enables direct translation of dialogue into different languages. It is focused on NLP features. In this way, the students are able to discover the potential of AI and develop the skills to fully integrate AI in their Scratch programs.
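The build, train and evaluate cycle described above can be illustrated outside Scratch with a minimal sketch. It is only an illustration under stated assumptions: the sample phrases, the function names and the word-counting rule are hypothetical, and this is not how Machine Learning for Kids or Cognimates work internally. It shows labelled samples going in, a model coming out, and a ‘waiting’ answer when the model cannot decide, mirroring the three avatars.

```python
from collections import Counter

# Hypothetical labelled samples, standing in for what students would type or record.
TRAINING_SAMPLES = [
    ("i love this game", "positive"),
    ("what a great day", "positive"),
    ("you are so kind", "positive"),
    ("i hate this game", "negative"),
    ("what a terrible day", "negative"),
    ("you are so rude", "negative"),
]


def train(samples):
    """Build the model: count how often each word appears under each label."""
    word_counts = {}
    for text, label in samples:
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts


def classify(model, text):
    """Return the label whose training words overlap most with the text.

    Returns 'waiting' (the neutral avatar) when the top two labels tie.
    """
    words = text.lower().split()
    scores = {label: sum(counts[w] for w in words) for label, counts in model.items()}
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) > 1 and ranked[0] == ranked[1]:
        return "waiting"
    return max(scores, key=scores.get)


def accuracy(model, test_samples):
    """Evaluate the model: fraction of held-out samples labelled correctly."""
    correct = sum(classify(model, text) == label for text, label in test_samples)
    return correct / len(test_samples)


model = train(TRAINING_SAMPLES)
print(classify(model, "i love you"))   # words seen mostly in positive samples
print(classify(model, "hello there"))  # unseen words, so the avatar keeps waiting
```

Adding more samples per label and re-running `train` is the code-level counterpart of students collecting more examples before testing their avatar again.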

5 Results

We developed a quiz with open and multiple-choice questions. These are intended to check whether the students grasped the AI concepts and to gauge their interest in and curiosity about AI. In addition, students took a formal motivation test before and after the AI activities. We gathered motivational data and answers from 75 participants in total.

5.1 Survey Results

Questionnaire results show that about 45% of the students across the three groups (34 of 75) want to study AI further (see Fig. 3). In Grade 8 Group E, 13 out of 21 students want to study AI next year, 5 answered ‘no’, and 3 ‘don’t know’. These results could mean that the participants involved in the most practical workshop welcomed the AI activities and feel motivated to study AI further. In Grade 8 Group G, which followed a more traditional theoretical-practical workshop, results are good but less favourable: only 10 out of 27 students feel encouraged to study AI further during the next academic year. Among the first-year high school students, Grade 10, only 11 out of 27 wish to continue studying AI in the next academic year. The differences in approach and the shorter workshop could explain the results.
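The shares behind these counts can be checked with a short calculation. The sketch below uses only the counts reported above; the variable and function names are illustrative. It assumes that the overall figure refers to the three groups pooled together, which the text does not state explicitly but which is consistent with the numbers.

```python
# Counts reported in the survey: (wants to study AI further, respondents).
responses = {
    "Grade 8, Group E": (13, 21),
    "Grade 8, Group G": (10, 27),
    "Grade 10": (11, 27),
}


def share(yes, total):
    """Percentage of respondents who answered 'yes'."""
    return 100 * yes / total


# Share per group, rounded to one decimal place.
per_group = {group: round(share(yes, n), 1) for group, (yes, n) in responses.items()}

# Pooled share across the three groups.
total_yes = sum(yes for yes, _ in responses.values())
total_n = sum(n for _, n in responses.values())
overall = round(share(total_yes, total_n), 1)

for group, pct in per_group.items():
    print(f"{group}: {pct}%")
print(f"All groups: {total_yes}/{total_n} = {overall}%")
```

Per group, the hands-on Group E is the only one above one half; the pooled share is what sits near 45%.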

Considering their previous background, 16 out of 21 students in Grade 8 Group E like programming, as do 18 out of 27 in Grade 8 Group G. In Grade 10, 15 out of 27 students dislike programming.

More broadly, concerning STEM vocations, Grade 8 Group E has 9 prospective STEM vocations in a group of 21 students. Grade 8 Group G has 12 prospective STEM vocations, 4 girls and 8 boys. In high school, the number of STEM vocations is lower: 10 students in a class of 27.

It is worth noting that, after the activities, children refined their definitions of AI. One question was: ‘What does AI mean to you?’ We found several blank responses to this question in the pre-test and none in the post-test. Students show their own perception of AI. We asked whether they had interacted with different AI systems before and found positive answers. They do not know how to describe the meaning of ‘agent’, yet most of them carry agents in their own pockets, such as Siri or Alexa. There is previous work on how children perceive conversational agents [9] and other types of AI in their everyday lives.

Fig. 3 Opinions about AI by class

5.2 Formal Motivation Study

One of the essential variables to consider in curriculum design is motivation, which is crucial in learning processes. According to the Spanish RAE dictionary, ‘motivation’ means the ‘set of internal and external factors that partially determine human actions’. Cambridge defines motivation as ‘the need or reason for doing something’. We carried out SIMS, a motivational test frequently used in education, to measure engagement with the activities and how well they were received [6]. It is based on self-determination theory [6, 10]. According to this theory, different types of motivation underlie human behaviour: intrinsic motivation, extrinsic motivation and amotivation. More specifically, the test treats motivation as composed of four internally consistent factors: Intrinsic Motivation, Identified Regulation, External Regulation and Amotivation.

As mentioned above, we decided to use a motivation test that is frequently used in schools and adapted to each country. In Spain’s case, researchers from the University of Las Palmas de Gran Canaria presented a statistical validation of the EMSI test (Escala de Motivación Situacional) [11]. There are various adaptations for specific subjects, from Physical Education to Medicine. Norwegian researchers have recently applied the SIMS test to the field of Physical Education [7]. It is used to evaluate motivation in different areas and to test the reception of educational software [12]. One purpose of the SIMS/EMSI motivation questionnaire is to compute the degree of global motivation in terms of its four constituent dimensions.

In the initial test proposed by Guay et al. [6], students rate each reason why they are engaged in the activity on a scale where 1 is ‘not at all’, 2 is ‘a very little’, 3 is ‘a little’, 4 is ‘moderately’, 5 is ‘enough’, 6 is ‘a lot’ and 7 is ‘completely’.

Fig. 4 Initial test EMSI proposed by Guay et al. extracted from [6]

The students completed a 14-item test (see Fig. 4) following the Spanish EMSI version, which removes 2 of the initial 16 items as redundant. To safeguard the accuracy of the results, we discarded questionnaires in which every answer was 7 or every answer was 1. The tests are anonymous. Global motivation is computed according to Eq. (1) below and ranges from −18 to 18. An average motivation score near zero means the student is neither forced to carry out the activity (negative values) nor enthusiastic about it (positive values).

Global motivation is calculated considering four dimensions, intrinsic motivation (IM), identified regulation (EMI), external regulation (EME) and amotivation (AM).

$$\begin{aligned} M = 2 \cdot IM + EMI - EME - 2 \cdot AM \end{aligned} \qquad (1)$$
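As a worked illustration of Eq. (1), the following sketch computes the global motivation of one questionnaire and applies the validity rule described above. It rests on assumptions that are not stated in the text: each dimension score is taken as the mean of its items on the 1 to 7 scale (which reproduces the −18 to 18 range), and the number of items passed per dimension is arbitrary. The function names are hypothetical.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 7  # answer scale of the SIMS/EMSI items


def is_valid(answers):
    """Reject questionnaires answered with all 1s or all 7s."""
    all_min = all(a == SCALE_MIN for a in answers)
    all_max = all(a == SCALE_MAX for a in answers)
    return not (all_min or all_max)


def global_motivation(im, emi, eme, am):
    """Eq. (1): M = 2*IM + EMI - EME - 2*AM.

    Each argument is the list of item answers for one dimension; the
    dimension score is assumed to be the mean of those answers.
    """
    return 2 * mean(im) + mean(emi) - mean(eme) - 2 * mean(am)


# Extreme cases reproduce the ends of the scale.
most_motivated = global_motivation(im=[7, 7, 7], emi=[7, 7, 7], eme=[1, 1, 1], am=[1, 1, 1])
least_motivated = global_motivation(im=[1, 1, 1], emi=[1, 1, 1], eme=[7, 7, 7], am=[7, 7, 7])
neutral = global_motivation(im=[4, 4, 4], emi=[4, 4, 4], eme=[4, 4, 4], am=[4, 4, 4])
print(most_motivated, least_motivated, neutral)
```

The weights make the two poles of self-determination (intrinsic motivation and amotivation) count twice as much as the two intermediate regulations, so a student answering ‘moderately’ throughout lands exactly at zero.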

Table 2 gathers the global motivation obtained during the three activities, and Fig. 5 shows the same results graphically as a comparative assessment by class. All classrooms obtained average results around zero on the −18 to 18 scale mentioned above; we do not observe extreme values of motivation or lack of motivation. Grade 8 students show slightly better results than Grade 10. From these results, we conclude that an early introduction of AI in Spanish Compulsory Education and a longer workshop are advisable.

Table 2 Formal motivational study AI
Fig. 5 Student’s final performance on the motivational assessment by classroom

If we had to measure the students’ degree of interest in our AI activities, we could start by checking their average amotivation. A group whose average amotivation is higher than that of the other groups may indicate that the unit will not be well received or achieve the expected result because the students are unmotivated. Table 3 shows which groups score high in amotivation.

Table 3 Formal motivational study AI

When it comes to interpreting the results, we did not observe extreme values in amotivation; no group scores either very low (1) or excessively high (7) in this dimension. We reviewed the results with the IT/Technology teachers who collaborated in the research, and they are in line with the expectations the teachers had of groups they already knew. The groups in which AI was introduced at an earlier age show slightly lower values of amotivation. The third year of ESO, ages 14–15, had a very different workshop experience, since we designed a shorter and less practical workshop for them. That may be the reason why their degree of amotivation is slightly higher; they were not particularly motivated.

6 Conclusion

The introduction of this didactic unit on Artificial Intelligence leads us to extend the experience towards designing a complete subject. The tutors of the courses received the implementation of this didactic unit in the classroom well. However, the results cannot be considered successful. Student motivation did not increase on average, but it is relevant that students completed the final questionnaire and demonstrated significant learning of the essential AI content. Through building and testing models, children grasped the basic AI concepts. The groups with more practical and interactive activities performed better on the motivational assessment and the final quiz. Better results were not expected, though. Post-test motivation results were slightly worse than pre-test results. These results lead us to think that the plan for introducing AI may need to be rethought. The workshop should be less theoretical and more practical, reducing the number of concepts explained and expanding the work with applications, which facilitates the acquisition of intuitive ideas about AI. This is the first formal study of the effects of introducing the teaching of AI at an early age in Madrid.

Concerning the interpretation of the motivation results, we should consider the nature of the EMSI test. Global motivation follows a scale of −18 to 18, so a final decrease of 0.5 in global terms is not remarkable relative to that scale. Group diversity could also account for the worse post-test motivation results. It is a mixed group with different interests and different professional vocations that are not yet significantly developed at age 12. A vocational test carried out in parallel showed that some students are not keen on STEM or even Computer Science. As we confirmed with the teachers, the results are a real reflection of the group’s diversity.

The students’ answers bring us closer to identifying the best moment to introduce AI in the curriculum of Madrid students. A priori, a topic such as Artificial Intelligence might seem better suited to students with a higher degree of maturity, for example, students in the last year of the first cycle of ESO. However, the results of the motivational pre-test show a higher global index in the two groups of younger students (see Table 2 and Fig. 5). The motivational results serve to refute this assumption and support the subject’s early introduction, preferably in the first year. Younger students did have a good understanding of the unit. We observe that the motivation to study AI is slightly higher in the first year than in the third. The lower motivation of the older students may be related to their having less interest in or curiosity about new subjects than is usual in younger children. We also observed that the students who took part in hands-on workshops hold much better opinions about AI.

A brief comment on the ideal length of the workshop is in order. The younger students (middle school, grade 8) took a longer version of the workshop, 4 h instead of 3. This approach makes students more likely to score higher in global motivation, immerse themselves in AI and work independently at home. Gender differences in the results are not notable; girls seem to be interested, but more effort is needed to foster interest in STEM for both genders.

What concepts should a complete AI course cover? We believe it is advisable to start from basic textbooks in the field of Informatics and AI, such as Russell and Norvig [8], for the introduction of key concepts and terminology. The ideas of graphs and automata could be explained through games. Other fundamental concepts, such as search trees and decision trees, could also be adapted to the students’ level.

We should add that further work is needed to prepare supporting material and plan a new AI subject. We need to extend the experience to more students and different workshops in order to correlate the time dedicated to the activities with the motivation to carry them out.