1 Introduction

In recent years, researchers have put considerable effort into investigating the effects and potential of the relatively new augmented reality (AR) technology for teaching and learning. In particular, cognitive, affective and motivational variables, such as learning achievement, the motivational impact of AR learning environments and the attitudes of different populations towards the AR technology used, have been studied (Arici et al. 2019; Garzón et al. 2019; Garzón and Acevedo 2019; Pellas et al. 2019). Regarding academic learning outcomes, a recent meta-analysis by Garzón and Acevedo (2019) showed an intermediate effect of learning with AR. However, the authors noted that this result must be considered in light of the methodological characteristics of the studies included in the meta-analysis (Garzón and Acevedo 2019, p. 255). Most of the studies analyzed could be categorized as media comparison studies, which compare the learning outcomes of the same content presented through two different media (e.g., video or AR) (Mayer 2019a). Such studies have long been criticized because their results do not indicate whether the AR technology itself is effective for learning. Instead, the method used and the coding of information are investigated as the aspects that contribute to learning success and/or failure (Clark 1994; Kozma 1994). This line of research finally led to the development of what is probably the best-known theory of learning with multimedia: the cognitive theory of multimedia learning (CTML) (Mayer 2002, 2014a). In it, multimedia learning is defined as learning with combined presentations of text and pictures, where text can be spoken or written and pictures can be static or animated. For effective learning with multimedia, three cognitive principles have to be considered. First, people process verbal and visual information in separate channels. Second, working memory capacity is limited; hence, overload during learning must be prevented. Third, meaningful learning with multimedia occurs through active processing by selecting, organizing and integrating (SOI model) the new information with prior knowledge (Mayer 2017). Consequently, when designing multimedia learning, educators can help learners by reducing extraneous processing, managing essential processing and fostering generative processing. Extraneous processing can be reduced by eliminating unnecessary material (coherence principle) or by highlighting the most relevant information (signaling principle). Essential processing can be managed by splitting a lesson into smaller parts (segmenting principle) or by presenting words in spoken rather than written form (modality principle) (Mayer 2019a). Fostering generative processing helps learners to make sense of the material, for example by using human-like gestures (embodiment principle) and conversational language (personalization principle) or through learning strategies like summarizing, self-testing and self-explaining (Fiorella and Mayer 2016).

In addition to these cognitive factors, affective factors, such as the interest or positive attitudes of learners during the learning process, are now also considered to play an important role in meaningful and effective multimedia learning (Mayer 2014b; Plass and Kaplan 2016; Wong and Adesope 2020). Additionally, regarding the affective and motivational effects of AR, a large number of primary studies and secondary studies, such as systematic reviews and other literature reviews, have already been conducted (Arici et al. 2019, p. 7). The results of the reviews suggest that learners, regardless of age, generally have a positive attitude towards learning with AR and often perceive it as “playful” (Akçayır and Akçayır 2017), motivating and satisfying (Radu 2014; Sırakaya and Alsancak Sırakaya 2020) and as an authentic and situated learning experience that can have a positive influence on learning (Dunleavy and Dede 2014; Wu et al. 2013).

Studies on affective and motivational factors have also primarily been media comparison studies. This is not surprising since these variables are usually evaluated together with cognitive learning outcomes (Ibáñez and Delgado-Kloos 2018; Sırakaya and Alsancak Sırakaya 2020).

Another methodological approach to examine the effectiveness and enjoyment of multimedia learning is through value-added studies, which compare two groups using the same medium, e.g., the same computer game, but add an additional feature, like a learning strategy/activity, in the experimental group (Mayer 2019b).

Fiorella and Mayer (2012) added paper-based metacognitive prompts to a computer game for learning about electronic circuits. The results showed that the learners in the computer game group with added metacognitive prompts achieved better learning performance than the control group. In a similar study by Pilegard and Mayer (2016), learners in a control condition only played a computer game, whereas learners in the experimental group additionally worked on a worksheet while playing. Again, the experimental group with the additional learning strategy showed advantages over the control group, especially in transfer tasks for problem solving. The effectiveness of generative learning strategies has also been empirically shown for learning with videos. For example, making summaries, explaining the information presented in a video, and drawing the information presented in a video support the learning process and lead to better learning performance than watching videos without additional generative learning strategies (Fiorella et al. 2019a, b).

Parong and Mayer (2018) also demonstrated the positive effect of learning strategies on learning achievement for immersive virtual reality (IVR). In their study, they first conducted a classical media comparison study and compared an IVR simulation of the human body with a slideshow containing the same images and text as the IVR application. The slideshow group scored significantly better than the IVR group in terms of learning achievement, but the IVR group scored significantly better for affective and motivational factors. The IVR group was significantly more satisfied, motivated and less bored than the slideshow group (Parong and Mayer 2018, p. 792). In the authors’ second study, learning with immersive virtual reality simulation was divided into segments and supplemented by summarizing as a generative strategy. After each segment, the participants took off the VR glasses and summarized the content they had learned. The group that used the summarizing strategy outperformed the control group that also learned with the IVR simulation in cognitive learning outcomes. However, even more interestingly, the IVR group with the added learning strategy did not differ significantly in affective and motivational factors from the IVR group that did not use the learning strategy. The authors concluded that enriching an IVR simulation with generative learning activities does not reduce learners’ positive attitudes towards IVR learning (Parong and Mayer 2018, p. 793).

The above studies had in common a consideration of learning with multimedia materials from not only a cognitive point of view but also a constructivist point of view. To this end, they referred to generative learning theory, which proposes the active participation of learners during the learning process as the central condition for the successful integration of new information into long-term memory (Wittrock 1974, 1992, 2010). The basic assumptions of generative learning theory can also be found in Mayer’s (2014a) aforementioned selection, organization and integration model (SOI model), e.g., regarding the summary of information. To summarize information, learners must first select the most relevant information, then organize the context of the information, and finally record the new information in their own words and link it to their prior knowledge (Fiorella and Mayer 2016). The question of the effects of generative learning strategies on affective and motivational factors in learning with new media, such as AR and/or IVR, which was addressed by Parong and Mayer (2018), is a new issue that merits further investigation (Parong and Mayer 2018, p. 789; 795). Nistor (2020, p. 540) came to the same conclusion, noting that the characteristics of the learning environment and its influence on attitudes towards and acceptance of educational technologies have been studied very little (see also Nistor 2018). In addition, Mayer et al. (2020) observed that the list of evidence-based principles for learning with different multimedia applications must continue to be developed, which applies in particular to technologies such as AR and IVR (Mayer et al. 2020, p. 850).

While some studies that examined affective learning outcomes in combination with generative learning strategies are now available for IVR, such studies are still missing or very rare for learning with AR. For example, Wu et al. (2018) examined two AR systems in science education and had the experimental group additionally fill in a so-called repertory grid. They found that the experimental group with AR and the learning strategy did not report any loss of motivational and affective experience. On the contrary, these factors were even higher in the experimental group than in the control group.

With this study, we want to contribute to increasing the empirical evidence for the positive effect of learning strategies on affective learning outcomes when learning with AR.

Therefore, we compare two mobile vision-based AR learning arrangements and add self-explaining and self-testing as generative learning strategies in the experimental group. Furthermore, we investigate possible differences between boys and girls, which were not included in the analyses in the cited studies (e.g., Wu et al. 2018). As shown in previous studies on educational technology in general, gender might be a moderating variable that can influence attitudes towards new technologies. More research on this topic is therefore particularly needed (Scherer et al. 2019; Schumacher and Morahan-Martin 2001; Siddiq and Scherer 2019).

The following main research question is examined:

What are the effects of the design of an AR learning environment (traditional vs. generative learning) and gender (female vs. male) on attitudes towards AR as an educational technology among primary students?

Based on previously conducted research, we propose the following hypotheses, which we test empirically:

  • If learning strategies based on generative learning theory are added to an AR learning environment, students will show positive attitudes towards AR as an educational technology, even in comparison to a control group that does not use these learning strategies.

  • Both female and male students will show a positive attitude towards AR as an educational technology.

  • There will be no gender differences in the effect of the addition of generative learning strategies on attitudes towards AR as an educational technology.

2 Method

2.1 Sample and research design

The current study followed a quasi-experimental design with a posttest-only approach. Two primary school classes with a total of 56 students, namely, 25 girls and 31 boys, participated in the study, which was conducted in a real classroom setting. The mean age of the primary students at the time of the intervention was 9.68 years (SD = 1.21). Of the 56 students, 34 (14 girls, average age of 9.56, SD = 1.46) were included in the experimental group, in which the generative learning strategies of self-explaining and self-testing were added to the generative AR learning environment (GenAR). Self-explaining means that the learners summarize what they have learned in their own words on a sheet of paper that we prepared. Since students often do not begin to self-explain on their own, it is recommended that teachers initiate this process through suggestions. Therefore, we prepared a rough structure for this (Fiorella & Mayer, p. 728). Self-testing, or retrieval practice, is used to repeat what has been learned and allows this new information to be linked to previous knowledge. It is recommended to use self-testing directly after the learning process (Fiorella & Mayer, p. 727). In the control group (ConAR), 22 students (11 girls, average age of 9.86, SD = 0.64) learned with the prepared AR materials and no additional learning strategies. None of the students had prior experience with AR learning materials or AR technology in general, and the students in the experimental group did not differ significantly in age from the students in the control group (Mann-Whitney U-test, z = −1.43, p = 0.15). An overview of the research design is provided in Fig. 1. According to Mayer’s (2019b) definition, such a design can be called a value-added research type, because the same technology is investigated, but additional features are added in the experimental group.

Fig. 1 Overview of the research design applied

2.2 Learning environments

The generative AR learning environment consisted of six self-made 3D paper-based learning materials that acted as markers (Fig. 2). Markers are pictures or objects that trigger the display of AR content when the camera of a mobile device, such as a tablet computer, is pointed at them. All AR materials were made by the teachers themselves with the Xpanda application (Xtend interactive 2020) and included written and audio-visual information. Initially, the teachers made the 3D paper-based learning aids together with students in class. Afterwards, various multimedia artifacts were created. Some of these were videos explaining, for example, the way of life of a dinosaur. Others were simple texts that were later to be placed over the learning material as AR elements. To augment the paper-based learning materials, the teachers worked with the AR studio of Xpanda (Xtend interactive 2020). In this studio, photographs are first uploaded, which later function as trigger images. Then the multimedia content to be displayed over the trigger images is uploaded. For example, to display a video over a trigger image, the video is uploaded to the AR studio, linked to the corresponding trigger image and then saved. Only the Xpanda application is needed to display the AR content. The designed contents were part of a science lesson, and the living habits of dinosaurs were chosen as the topic. Each corner of the learning material represented a different dinosaur, and information about its eating behavior, natural enemies, weight and size, etc., became visible via the digital AR overlay (Fig. 3).

Fig. 2 Example of a paper-based 3D learning material

Fig. 3 Digital content superimposed onto the paper-based learning material

For the strategy of self-explanation, we created a worksheet with central questions and tasks about the dinosaurs. After each interaction with an AR element, the students worked on the worksheet and wrote down answers to the questions in their own words. One task, for example, was to arrange the dinosaurs according to their size (“create a ranking list”).

For the learning strategy of self-testing, the participating teachers designed a multiple-choice quiz with nine questions for their students. The quiz questions were formulated on the basis of the information from the AR elements. This allowed the students to evaluate whether they had understood the content of the learning environment. Open questions or problems of understanding were taken up and dealt with again in subsequent lessons. The students completed the multiple-choice quiz on paper at the end of the lesson (e.g., “The dinosaurs had similarities with...a.) crocodiles, b.) birds, c.) lizards”). Empirical studies have shown that using quizzes to initiate retrieval practice is effective, and their use is therefore recommended in the literature (Moreira et al. 2019).

For the control learning environment, we used professional AR materials that were also marker based. To create a highly contrasting learning environment, the students in this condition interacted with the prepared materials in a self-directed way without having to perform additional tasks. Figure 4 shows an example of the AR material used.

Fig. 4 Example of AR material used in the control condition (Buchner and Jeghiazaryan 2020)

Students in both conditions explored the AR content for approximately one hour in pairs with a tablet computer.

2.3 Data collection

To evaluate learners’ attitudes towards AR as a learning technology, we used validated items from the Technology Usage Inventory (TUI) (Kothgassner et al. 2013a). This survey instrument was developed especially for the evaluation of contemporary digital technologies like AR and VR and covers both technological factors (e.g., usability) and psychological factors (e.g., interest). Because the instrument is available in German, the metric and comprehension-related problems that translations often cause could be reduced considerably. The instrument performed convincingly in a study on its validity and reliability (Kothgassner et al. 2013a) and has since been used in other studies as well. For example, Schmidt et al. (2013) used it to determine the extent to which different IVR environments influence psychological and technological factors positively or negatively. Furthermore, Kothgassner et al. (2013b) used the TUI questionnaire to investigate the extent to which collaborative work in virtual environments affects affective experience.

Data on the following factors were collected: (1) Interest in technology, consisting of three items, e.g., “I want to know more about new technologies like augmented reality”, Cronbach’s alpha = 0.70; (2) Usability, consisting of three items, e.g., “The application of augmented reality was easy overall”, Cronbach’s alpha = 0.70; (3) Usefulness, consisting of four items, e.g., “Augmented reality can support me in learning”, Cronbach’s alpha = 0.70; (4) Skepticism, consisting of three items, e.g., “Learning with augmented reality brings me more disadvantages than advantages”, Cronbach’s alpha = 0.70; and (5) Accessibility, consisting of three items, e.g., “I think augmented reality can be used at home”, Cronbach’s alpha = 0.70. The students answered the items on a Likert scale from 1 = disagree to 7 = fully agree.
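For readers less familiar with the reliability coefficient reported above, Cronbach's alpha for a scale with k items is k/(k − 1) · (1 − sum of item variances / variance of the sum score). The following minimal Python sketch implements this textbook formula. The function name cronbach_alpha and the response values are hypothetical illustrations; they are not the study's data and not the SPSS routine used for the reported values.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of per-item response lists (one inner list per item,
    one entry per respondent, all lists of equal length).
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sum score of each respondent across all items of the scale
    totals = [sum(answers) for answers in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 7-point Likert answers of five respondents to three items
demo_items = [
    [7, 6, 5, 3, 2],
    [6, 6, 4, 3, 1],
    [7, 5, 5, 2, 2],
]
demo_alpha = cronbach_alpha(demo_items)
```

Values of about 0.70, as reported for the five subscales, are commonly treated as the lower bound of acceptable internal consistency.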

The overall attitude data included the average values for all the above subscales.

The questionnaire was administered to all children as an online survey after they learned in the respective learning environment.

3 Results

3.1 Scoring and data analysis

To analyze the data, the items were first assigned to their respective scales and averaged. For example, the Interest scale consists of three items whose values were averaged to form an overall mean value for the scale. The overall attitude is calculated from the averages of all five subscales. All calculations were performed in SPSS 26. Due to the small sample size and the unequal distribution of participants in the experimental (N = 34) and control groups (N = 22), nonparametric test procedures such as the Mann-Whitney U-test and the Kruskal-Wallis test were used to test for differences between the groups (Sedlmeier and Renkewitz 2013).
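The scoring and the two-group comparison described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SPSS procedure itself: the function names, item names and ratings are hypothetical, and the z value uses the tie-corrected normal approximation without continuity correction, which may differ slightly from the output of a given statistics package.

```python
import math
from statistics import mean


def scale_scores(responses, scales):
    """Average items into subscales, then average the subscales.

    responses: dict item name -> rating of one student
    scales: dict subscale name -> list of item names (hypothetical names)
    """
    scores = {name: mean(responses[i] for i in items)
              for name, items in scales.items()}
    scores["overall"] = mean(scores[name] for name in scales)
    return scores


def average_ranks(values):
    """Ranks 1..n; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, z): the smaller U and its normal-approximation z."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    # Tie correction of the variance of U
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    sd = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    return u, (u - n1 * n2 / 2) / sd


# Hypothetical example: one student's ratings on two two-item subscales
demo = scale_scores(
    {"int1": 6, "int2": 4, "use1": 2, "use2": 4},
    {"Interest": ["int1", "int2"], "Usability": ["use1", "use2"]},
)
```

Because the smaller of the two U values is used, the resulting z is never positive, which corresponds to the negative z values reported in the tables.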

3.2 Effects of the learning environment

Table 1 shows the means and standard deviations of the five factors, i.e., Interest, Usability, Usefulness, Skepticism and Accessibility, as well as the overall attitude for both the experimental (GenAR) and control (ConAR) conditions.

Table 1 Means and standard deviations of the scores of the five subscales and the overall attitude scale for each condition

To assess the influence of the learning environment on each of the attitude factors, a Mann-Whitney U-test was performed. The results with U, z and p values as well as the effect size (Cohen’s d) are shown in Table 2. For the factors Interest (z = −1.32, p = 0.19), Usefulness (z = −0.83, p = 0.41) and Accessibility (z = −0.20, p = 0.84), no significant differences were found between the experimental and the control group. The overall attitude also did not differ significantly between the two groups, z = −0.23, p = 0.82.

Table 2 Mann-Whitney U-test results of the comparison of the mean values of each factor between the experimental and control groups

Significant differences were found for the factors Usability (z = −3.10, p = 0.002) and Skepticism (z = −2.42, p = 0.016). These results indicate that the experimental group (GenAR) rated their AR materials as less user-friendly than the control group (ConAR). According to Cohen (1992), the effect size, d = 0.91, corresponded to a strong effect. In addition, the GenAR group was more skeptical about the use of AR as a learning technology. According to Cohen (1992), the effect size, d = 0.68, corresponded to a medium effect.
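The text does not state how Cohen's d was obtained from the U-test. One common conversion, assumed here for illustration, first computes r = |z| / √N and then d = 2r / √(1 − r²). With the total sample of N = 56, this conversion yields the two effect sizes reported above; the function name d_from_z is hypothetical.

```python
import math


def d_from_z(z, n_total):
    """Convert a z statistic to Cohen's d via r = |z| / sqrt(N).

    Assumed conversion; the original analysis may have used another route.
    """
    r = abs(z) / math.sqrt(n_total)
    return 2 * r / math.sqrt(1 - r ** 2)


usability_d = round(d_from_z(-3.10, 56), 2)   # 0.91
skepticism_d = round(d_from_z(-2.42, 56), 2)  # 0.68
```

Note that d obtained in this way is an approximation derived from ranks and is not identical to a d computed from group means and standard deviations.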

3.3 Effects of gender

Table 3 presents the means and standard deviations for the five factors, i.e., Interest, Usability, Usefulness, Skepticism and Accessibility, as well as the overall attitude for female and male students regardless of the treatment condition.

Table 3 Means and standard deviations of the scores of the five subscales and the overall attitude scale divided by gender

To investigate whether attitudes towards AR as a learning technology depended on gender, the mean values were compared using the Mann-Whitney U-test. The results with U, z and p values as well as the effect size (Cohen’s d) are shown in Table 4. For the factors Interest (z = −1.34, p = 0.18), Usability (z = −0.69, p = 0.49), Usefulness (z = −0.42, p = 0.67) and Skepticism (z = −0.84, p = 0.84), no significant differences were found between female and male students. The overall attitude also did not differ significantly between the two groups, z = −1.85, p = 0.065. However, this result was close to the significance level; hence, we keep it in mind when interpreting the results.

Table 4 Results of the Mann-Whitney U-test of each factor with comparison by gender

A significant difference was found for the Accessibility factor, z = −2.12, p = 0.034, indicating that male students assessed the use of AR for their own purposes to be more accessible than female students. The effect size, d = 0.59, corresponded to a medium effect (Cohen 1992).

3.4 Effects of gender and the learning environment

Table 5 shows the means and standard deviations of the five factors, i.e., Interest, Usability, Usefulness, Skepticism and Accessibility, as well as the overall attitude for female and male students in the experimental and control groups.

Table 5 Means and standard deviations of the scores of the five subscales and the overall attitude scale divided by gender and learning environment

A Kruskal-Wallis test revealed significant differences by gender and learning environment for the Usability factor (Kruskal-Wallis H = 11.05, p = 0.011). All other factors and the overall attitude did not differ significantly.
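As a rough illustration of the omnibus test, the following Python sketch computes the tie-corrected Kruskal-Wallis H statistic and, for the special case of four groups (gender × learning environment, i.e., three degrees of freedom), the p value from the chi-square approximation. The function names and the example data are hypothetical, and the closed-form survival function is valid for three degrees of freedom only; SPSS may compute the p value differently.

```python
import math


def average_ranks(values):
    """Ranks 1..n; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of scores."""
    combined = [v for g in groups for v in g]
    n = len(combined)
    ranks = average_ranks(combined)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Hypothetical scores of three small groups without ties
demo_h = kruskal_wallis_h([[1, 2], [3, 4], [5, 6]])

# p value for the reported H = 11.05 with four groups (3 df)
reported_p = chi2_sf_df3(11.05)
```

Under these assumptions, the approximation gives a p value of roughly 0.011 for H = 11.05, in line with the value reported above.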

A post hoc test (Dunn-Bonferroni test) showed a significant difference between the girls in the GenAR group and the boys in the ConAR group (z = −2.87, p = 0.025) as well as between the boys in the GenAR group and the boys in the ConAR group (z = 2.81, p = 0.005). The results suggest that boys in the ConAR condition rated the usability of AR significantly higher than girls and boys in the GenAR experimental group. Notably, the girls in the ConAR condition perceived AR to be more user-friendly than both the boys and girls in the GenAR group, but this difference was not significant.
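The logic of the post hoc procedure can be sketched as follows. Dunn's test compares the mean ranks of each pair of groups, and the Bonferroni adjustment multiplies each two-sided p value by the number of pairwise comparisons (six for four groups). The code is a minimal sketch with hypothetical function names and invented example data; it is not the SPSS implementation and does not use the study's raw data.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Ranks 1..n; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def dunn_bonferroni(groups):
    """Return {(i, j): (z, adjusted p)} for all pairs of groups."""
    combined = [v for g in groups for v in g]
    n = len(combined)
    ranks = average_ranks(combined)
    mean_ranks, start = [], 0
    for g in groups:
        mean_ranks.append(sum(ranks[start:start + len(g)]) / len(g))
        start += len(g)
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values()) / (12 * (n - 1))
    base_var = n * (n + 1) / 12 - tie_term
    pairs = list(combinations(range(len(groups)), 2))
    results = {}
    for i, j in pairs:
        se = math.sqrt(base_var * (1 / len(groups[i]) + 1 / len(groups[j])))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
        results[(i, j)] = (z, min(1.0, p * len(pairs)))
    return results


# Hypothetical data: three groups of three scores each
demo = dunn_bonferroni([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Bonferroni adjustment of a single z value for six comparisons
adjusted_p = min(1.0, math.erfc(2.87 / math.sqrt(2)) * 6)
```

For |z| = 2.87 and six comparisons, the adjusted p value under these assumptions is approximately 0.025.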

4 Discussion

This study builds on recent findings of Parong and Mayer (2018), who reported that supplementing learning strategies based on generative learning theory (Fiorella and Mayer 2016; Wittrock 1974, 2010) did not negatively affect learners’ affective and motivational attitudes towards learning with contemporary digital technologies. Furthermore, this study responds to the call to investigate different teaching approaches with the same technology to examine the affective effects of the teaching approach rather than only conducting comparative media studies with the same teaching method (Mayer 2019a, b; Nistor 2018, 2020; Parong and Mayer 2018). In this study we investigated AR technology from the perspective of generative learning theory and the extent to which adding the learning strategies of self-explaining and self-testing affected the attitudes of primary school students towards AR as a learning technology. Our first hypothesis, based on the findings of Parong and Mayer (2018) on IVR, was that adding learning strategies to an AR learning environment would not negatively affect attitudes and that fundamentally positive attitudes would be observed. Confirmation of this assumption required comparison of an experimental group to a control group and the observation of no differences between the groups. Table 1 shows high subscale and overall scale scores for both groups, which reflect a positive attitude towards AR as a learning technology. All values are well above the mean of 3.50, and the overall attitude values are very high, at 4.63 and 4.85; the values do not differ significantly between the two groups. These results allow the conclusion that the first hypothesis can be confirmed. Students within an AR learning environment with additional learning strategies show positive attitudes towards AR as a learning technology. 
However, a more detailed look at the individual subscales shows that the usability of AR was significantly worse and that the skepticism regarding AR as a learning technology was much greater in the experimental group (GenAR). Although these findings are not significantly reflected in the overall attitude scores, it is important to keep them in mind for future research projects and implementations of AR for teaching and learning purposes.

The different usability scores in this study may be due to another variable that was not the focus of this study. For example, the control group used professional AR material, whereas the experimental group worked with AR material that teachers had created themselves. Future research projects could build on this additional result of this study and follow it up. This will especially be necessary as an increasing number of AR platforms come onto the market that allow teachers to create AR materials for classroom use, e.g., web AR studios (Areeka 2020; Qiao et al. 2019).

The fact that the students in the experimental group were more skeptical about AR as a learning technology may be due to reasons that have already been reported for other media that were perceived as difficult. For example, Salomon (1984) showed that television is perceived as easy, while reading is perceived as difficult; therefore, the liking of these media differs. It is therefore possible that the children in this study associated AR primarily with games and entertainment, and when they had to perform additional tasks that required mental effort, this previous association was undermined. Thus, the children in the experimental group may have formed a more skeptical attitude towards AR than those in the control group, who were allowed to interact with the AR materials much more autonomously and without further learning activities. It is important to note that liking alone does not always lead to better learning. However, the facilitation of learning is the goal for every teacher, and a balance of ability and effort should be emphasized (Mayer 2017; Sweller 2020).

The second hypothesis addressed the recurring question of the influence of gender on technology-enhanced learning. In line with other studies on AR (e.g., Adedokun-Shittu et al. 2020), we assumed that both genders would have positive attitudes towards the use of AR as a learning technology. Significant differences were found only for accessibility; otherwise, girls and boys did not differ significantly. The accessibility results are consistent with other findings regarding greater self-efficacy among boys in terms of their computer skills and their ability to use computers (e.g., Ong and Lai 2006).

An interesting detail is that the gender difference in overall attitude only narrowly missed significance, with p = 0.065. According to this result, the girls assessed learning with AR somewhat less positively than the boys. This result deserves further consideration in research as well as in practice. Further research will be necessary to explicitly investigate the gender variable and to investigate other technologies besides AR. In practice, this result should be taken into account, for example, by introducing or discussing an application with all students in advance. In addition, pretraining in how to use and learn with AR may reduce or even eliminate negative attitudes.

Finally, the third hypothesis investigated whether attitudes would be influenced by the respective learning environment and gender. A significant influence was found for usability. The boys in the ConAR condition were significantly more satisfied than the girls and boys in the experimental group with the self-made AR materials. This result is a further indicator that children are already used to a very high standard of computer technologies and therefore might appreciate self-made implementations less. However, it is much too early to draw an absolute conclusion, as this is the first study to address this topic. All other attitude factors and the overall attitude did not differ significantly. This leads to the conclusion that both boys and girls generally have a positive attitude towards learning with AR, even if learning strategies are added.

5 Limitations and future directions

A limitation of this study may be the focus on only affective learning outcomes in terms of primary students’ attitudes towards AR as a learning technology. Future studies should not only verify these results but also expand them to include additional variables such as cognitive learning outcomes, immersion, emotional attachment, or cognitive load and to clarify findings in this still underrepresented field of value-added media studies.

In addition, future studies may use other research designs, e.g., include pretesting or conduct randomized experiments to better control possible moderator variables. For this purpose, it would be advisable not to use different AR materials as in this study but rather to use the same AR materials. The focus in this study was on the comparison of mobile vision-based AR in different learning conditions, not on the content of these AR materials. Attention to content could be included in the future. The same applies to the question of whether different or the same content would be conveyed via see-through, spatial, location-based, web AR or vision-based (markerless) AR, including learning strategies that influence affective and/or cognitive learning outcomes.

It would also be possible to use a within-subject design to investigate the change in the dependent variable within a sample.

As always in such studies, it must be noted that the sample size was not large enough to make generally valid statements regarding the hypotheses. The ecological validity, on the other hand, can be classified as good since this study was carried out in the field under real classroom conditions.

6 Conclusion

In summary, the major finding of this research is that the attitudes of primary school students towards the use of AR as a learning technology are very positive, and even the addition of learning strategies does not significantly diminish these positive attitudes. The gender differences found relate solely to the accessibility factor, which may be explained by boys’ stronger self-confidence regarding computer skills.

The general assessment of learning with AR was slightly more negative for girls than for boys, although this difference was not significant. This should be investigated further and also taken into account in practical use. Preparing students for the application through pretraining is recommended.

Although this study makes an important contribution to a subject area that has only recently become of more interest to the research community, the results presented here should be viewed only as a starting point. The present research indicates the need to conduct further investigations with additional variables, content and AR types.

For practical purposes, this contribution reveals important findings demonstrating that primary school students perceive AR not only as a playful and entertaining medium but also as a real learning medium which can contribute to effective and enjoyable learning.