Introduction

Interest in using augmented reality (AR) to create immersive and interactive learning environments has increased significantly over the past two decades (Arici et al., 2019; Bacca et al., 2014; Ibáñez & Delgado-Kloos, 2018; Pellas et al., 2020; Radu, 2014). In educational AR, learners’ physical environment is augmented with additional digital information through AR-enabled devices such as AR glasses, smartphones, or tablets, making it more engaging and exciting. Unlike other immersive technologies such as virtual reality (VR), AR specifically emphasizes enhancing the real-world environment. This involves blending virtual elements seamlessly with real-world objects (Chang et al., 2022). Consequently, AR visually combines immediate real-world surroundings with digital information (Schwan, 2022). The advancement of AR technology (leading to more powerful software or lightweight and inexpensive devices) has also significantly expanded the possibilities for designing authentic learning activities (Cai, 2017; Hobbs & Holley, 2016; Hwang et al., 2023; Lee et al., 2018; Ogle et al., 2018; Own, 2017; Rosenbaum et al., 2007). Hence, the use of AR in education exhibits significant potential, offering an expanded and alternative “real-world” environment. This technology proves particularly valuable in granting access to highly realistic content, especially when learners are unable to gain firsthand experiences. Given these attributes, AR is anticipated to foster a conducive learning environment and make substantial contributions across various educational settings, including primary, secondary, and higher education. It is applicable in diverse domains such as medicine and STEM education, both inside and outside the traditional classroom setting (Akçayir & Akçayir, 2017; Ibáñez et al., 2014; Ibáñez & Delgado-Kloos, 2018; Sommerauer & Müller, 2014).
Research on the educational use of AR is expanding and predominantly reveals positive effects on cognitive outcomes such as conceptual or procedural knowledge gains (Akçayir & Akçayir, 2017; Garzón & Acevedo, 2019) and on motivational learning outcomes, e.g., engagement or situational interest (Sun et al., 2023; Zimmerman et al., 2015). However, there has been limited empirical investigation into the connection between AR-based learning and the perception of authenticity (e.g., Wei et al., 2018; Aitamurto et al., 2020; Hsu et al., 2016). Moreover, previous studies have not addressed instructional support in AR-based learning, especially not in conjunction with investigating perceived authenticity. This paper aims to contribute to filling this research gap. Investigating the role of students’ perception of authenticity during learning is necessary because it can have affective, cognitive, and behavioral effects, e.g., on learners’ situational interest or domain-specific knowledge (Betz, 2018; Schüttler et al., 2021). Fink et al. (2023) investigated the effect of authenticity on interest in VR-based environments (VR being, like AR, an immersive and interactive technology) and found that authenticity-related variables influence interest development. Further, until now, limited research has been conducted on the impact of authenticity perception on knowledge acquisition. Nachtigall and Rummel (2021) investigated the perceived authenticity of an out-of-school learning environment and reported that students’ perceived authenticity correlates with situational interest but not with knowledge acquisition. However, to our knowledge, no similar research has been done on AR-based learning. We investigate perceived visual authenticity, which describes how realistic or real the AR environment feels to the learner and is related to how accurately the AR experience replicates real-world scenarios.
When learners feel that the AR experience is authentic and closely mimics real-world situations, they are likely to be more motivated and engaged in learning. This can lead to increased attention and participation, enhancing knowledge acquisition.

Integrating and investigating instructional support in AR-based learning is further relevant because the technical affordances of AR might be challenging during learning, so encouraging the use of instructional scaffolds could help students reach desired learning goals (e.g., Ibáñez et al., 2015). We investigate two different generative learning strategies that involve qualitatively distinct modes: drawing (generating coherent visual representations) and self-explaining (generating coherent verbal representations; Fiorella & Mayer, 2016; Fiorella, 2023), and analyze their impact on perceived authenticity, knowledge gain, and satisfaction with the AR environment. Engaging students in generative learning strategies has proven effective in non-AR learning (Fiorella & Mayer, 2016). However, some studies reported media effects, i.e., different results in different media (e.g., immersive virtual reality vs. video; Makransky et al., 2021). Therefore, gaining deeper insights into generative learning strategies within the specific context of AR-based learning is essential. Further, a comparative analysis of these two learning strategies is of particular interest because both involve generative activities yet address distinct core elements of authenticity. While the drawing task addresses the visual aspect of authenticity, the self-explaining task addresses the content-related or function-related aspects of authenticity. As this study focuses on an AR-learning environment, a key question is whether it is more advantageous to represent ideas visually (by creating drawings) or verbally (via self-explanations).
Additionally, students using learning strategies during AR-based learning reported positive attitudes toward the learning environment but a worse perception of usability and higher skepticism toward AR as a learning technology than students learning without such strategies (Buchner, 2022). Thus, this paper examines an AR-learning environment regarding the learners’ perception of visual authenticity and determines the relationship between the perception of visual authenticity and instructional support stimulating the use of generative learning strategies. Due to the previously mentioned advantages of AR, we used AR in this study to teach a specific aspect of human anatomy: the cardiovascular system. First, against the background of the authenticity model from Betz et al. (2016; see also Nachtigall et al., 2022), we examine the relationship between authenticity-relevant person-related cognitive prerequisites (prior knowledge, cognitive abilities) and motivational prerequisites (thematic interest) and the perception of visual authenticity in AR-based learning, depending on the learning strategy. Incorporating person-related variables is deemed relevant, as they typically exert an influence on learning processes, especially in multimedia learning (Kalyuga, 2014; Krüger et al., 2022; Schrader et al., 2018). Second, we analyze the role of learning strategies in AR-based learning and investigate possible influences on the perception of visual authenticity and learning outcomes. The effect of generative learning strategies while learning with AR has not yet been investigated in great detail (by so-called value-added studies, e.g., Buchner, 2022), and in particular, the influence of those strategies in AR-based learning on the perception of visual authenticity has yet to be determined. To date, these issues have been addressed independently, with minimal attention given to the interconnection between them.

Theoretical background

Augmented realities as authentic learning environments

AR-learning environments enable a close link between the learners’ natural environment and an instructionally arranged digital environment, offering diverse possibilities for ubiquitous and situated learning (Moser & Zumbach, 2012; Moser, 2017; Dawley & Dede, 2014). In terms of research on AR-based learning, there is growing evidence that educational AR can enhance learning outcomes and improve learning processes: research has repeatedly shown a medium positive influence on learning success, motivation, and cognitive load (Buchner et al., 2021; Chang et al., 2022; Garzón & Acevedo, 2019; Kalemkuş & Kalemkuş, 2022; Li et al., 2021; Radu, 2014). In a meta-analysis, Chang et al. (2022) found a nearly large effect on facilitating students’ authentic performance (applying and transferring what they have learned to their jobs or authentic situations). Regarding cognitive load, a recent meta-analysis states that compared to other technologies, AR seems to be less cognitively demanding when providing learners with a spatially and temporally integrated format, thus reducing the risk of overloading students’ working memory (Buchner et al., 2021). AR often contains a variety of features not possible in the real world to enhance users’ engagement and learning (Dawley & Dede, 2014) and, thus, is often used to enrich pedagogical learning situations and facilitate teaching rather complex content or content that could otherwise not be shown (Garzón & Acevedo, 2019; Liono et al., 2021).

Dunleavy and Dede (2014) state that AR is primarily aligned with situated and constructivist learning theory, as AR-learning environments position the learners within real-world physical and social contexts while guiding, scaffolding, and facilitating participatory and metacognitive learning processes (such as authentic inquiry or active observation) with multiple modes of representation. Aligned with problem-based learning, a high degree of self-regulated learning integrated into authentic contexts enables the transfer of learned competencies to other problems, leading to more profound and sustainable knowledge structures (Georgiou & Kyza, 2018).

Key features of authentic AR environments

Real-object-based learning is typically seen as highly authentic. However, this authenticity may not strictly apply to augmented realities, as they digitally visualize objects without representing the actual substance or material (Schwan, 2022). Nevertheless, existing literature highlights diverse facets and design elements of authenticity. This suggests that authenticity cannot be defined by a single characteristic, and the use of specific design elements is influenced by the intended authenticity (Betz, 2018; Fougt et al., 2019; Nachtigall et al., 2022; Newman & Smith, 2016; Schwan, 2022). In this sense, authenticity is not solely attributed to original objects but can be extended to the physical verisimilitude of natural objects or situations, the reality-like appearance, and the media-supported approximation toward the originals (known as iconic authenticity; Grayson & Martinec, 2004; Lee, 2004; Schwan, 2022). AR offers realistic learning material and makes it possible, for example, to create authentic learning situations by providing immersive features and realistic 3D visualizations as representations of abstract concepts (Liono et al., 2021). Aitamurto et al. (2020) note that those specific characteristics of AR contribute “[…] to the perceived accuracy, authenticity, and credibility of visuals, similar to video and multimedia” (p. 4). The visual fidelity of AR content and its integration into the real world (e.g., the accuracy of the virtual objects’ placement or the real-time interactions) play a significant role in its authenticity. Thus, one key element of authentic AR environments can be seen in the degree of their contents’ realistic depiction or representation fidelity, providing authentic experiences (Dağ et al., 2023). High-quality AR provides detailed and realistic virtual elements that closely resemble their real-world counterparts.
The optimal level of realism depends on the to-be-learned content and might, e.g., lie on the more realistic end of the spectrum for learning tasks focused on memorizing shapes (Skulmowski, 2022). In this regard, Kenderdine and Yip (2018), for example, speak of immersive AR technologies as representing high-fidelity digital facsimiles. They state that “Technologies of reproduction are able to record objects and sites in sufficiently high resolution to produce visual replicas with a spatial and structural integrity that respects the original’s materiality” (Kenderdine & Yip, 2018, p. 274).

Another critical feature of authentic AR environments is their immersive element, which can be seen as a form of cognitive and emotional absorption (Georgiou & Kyza, 2018), allowing users to interact with AR content naturally and intuitively. Dağ et al. (2023) report that the immersive experience of AR users positively affects perceived authenticity. In addition, Burden and Kearney (2015) argue that authentic (mobile) learning is a highly fluid construct that will continue to evolve with technological advancements. Against this background, it is not only the realistic presentation that makes AR authentic but also virtual data integration into the real world in real time. Real-world learning and AR can be linked to create powerful, authentic educational experiences. For example, learners can move around a 3D virtual object and inspect it from any angle, just like a real object (Thees et al., 2020). Therefore, AR enables learners to combine real sensory experiences with digital information, gain an authentic perception of the virtual environment, and facilitate real-world interactions (Azuma, 1997; Weng et al., 2018). Additionally, in an attempt to describe authenticity in general based on constructive and essentialist views (van Gerven et al., 2018), one can assume that the concept of authenticity is psychologically, socially, and culturally constructed and depends on the context of an object, its history, material properties, and environment, as well as on the experience and perception of a person who judges whether or to what extent something is (considered) authentic. Concerning virtual or virtually enriched environments, Gilbert (2016) states that “(…)‘authenticity’ refers to whether the virtual environment provides the experience expected by the user, both consciously and unconsciously” (p. 322). 
Following these arguments, if the concept of authenticity is not only dependent on certain specific design elements of AR environments but is also regarded as a subjective construct that reflects, among others, the expectations toward the realism of representations, it becomes apparent that AR environments are closely linked to authenticity (Schwan, 2022).

Authentic AR in education

Research on AR has been conducted in various domains, mainly in STEM fields (science, technology, engineering, and mathematics; Altinpulluk, 2019; Chang et al., 2020), where AR offers engaging, interactive, and immersive experiences that facilitate a deeper understanding of complex concepts. AR is particularly beneficial in supporting mental models of spatial representations and in teaching content that could otherwise not be taught (Chen et al., 2011). In the fields of biology or medical training, students can use AR to view 3D models of cells, organs, and organisms, which can be challenging to understand from flat diagrams or textbooks. Thus, it fosters a dynamic and hands-on approach to learning. This proves particularly beneficial in the instruction of human anatomy (such as teaching the function of the human cardiovascular system), where relevant human body structures should be examined from all angles. For various complex medical contents, AR has already demonstrated its effectiveness as an authentic instructional approach (Chien et al., 2010; Moro et al., 2021; Munzer et al., 2019; Nuanmeesri et al., 2019; Salar et al., 2020; Tang et al., 2020; Zhu et al., 2014). Zhu et al. (2014) found AR to be of growing importance in medical education for different learners, from high school to medical students. Moro et al. (2021) reported that AR contributed to knowledge gain in anatomy classes and promoted intrinsic benefits such as increased learner immersion and engagement.

Taken together, AR can generate perceptions of authentic experiences (Alimamy & Nadeem, 2021) that mediate, e.g., the relationship between immersive experiences and place satisfaction of an AR-based learning environment (Dağ et al., 2023). Following Betz et al. (2016), learners develop a subjective perception of authenticity resulting from the characteristics of the learning setting and the individual personality characteristics. Thus, the users’ perceptions are related to the authenticity of AR. In consequence, augmented learning environments can be considered authentic, especially when they enable learner-centered, self-directed, and active engagement with the to-be-learned content by providing interactive options or incorporating experiences from real life (Rosenbaum et al., 2007; Rule, 2006; Wu et al., 2013). Much literature explores authenticity’s influence on learning across various domains (e.g., Betz, 2018; Chang et al., 2010; Gulikers et al., 2005; Nachtigall & Rummel, 2021), but regarding AR-based learning, research on authenticity perception and its impact on outcomes and processes is limited and occasionally conflicting. Wei et al. (2018) found that students using an AR-based simulator reported enhanced feelings of authenticity in the domain of technology learning. Based on student surveys, McCord et al. (2023) stated that AR has a high potential to simulate authentic learning in architecture, engineering, and construction domains. However, Aitamurto et al. (2020) revealed that AR did not increase the perceived authenticity of the information given compared to other forms of visuals when exploring AR in journalism. The authors reported that the investigated viewing modalities (AR vs. interactive non-AR visualizations vs. non-interactive static visualizations) did not differ in the perceived authenticity of the visuals. Hsu et al. (2016) found that using an AR surgery simulator promotes students’ engagement and motivation but not perceived authenticity.

Person-related variables

Taking the latter results into account, learning with AR technology evidently faces several challenges. Personal prerequisites of the learners, like prior knowledge, interest in the to-be-learned content, or spatial abilities, might impact the learning environment’s effectiveness (Brucker et al., 2014). In terms of prior knowledge, Chang et al. (2020) found that higher prior knowledge led to better post-performance when learning about socioscientific reasoning with AR. Compared to non-AR learning environments, Conley et al. (2020) reported that the AR experience was particularly advantageous for students without prior knowledge of statistical reasoning. In the case of spatial abilities, no relationship was found between spatial abilities and knowledge acquisition in a recent meta-analysis conducted by Bölek et al. (2021). However, Krüger et al. (2022) report that high spatial abilities may provide the necessary skills to learn with a 3D AR model. In addition, besides possible difficulties with technology or usability, AR technology’s affordances and diverse possibilities might distract learners and hinder their learning (Dunleavy et al., 2009; Schneider et al., 2011). Realistic visualizations, such as those provided in high-quality AR, could distract learners from relevant information (Skulmowski, 2023). In line with this, some studies found no significant or even adverse effects of AR on learning outcomes compared to other, more common forms of learning material (Bölek, 2021; Zumbach et al., 2022). This being the case, Ibáñez et al. (2015) state that “(…) a challenge in using AR in education in general, and in science in particular, is to take advantage of its benefits while keeping students’ attention on the learning activity rather than on the learning technology” (p. 2). Ibáñez et al. (2015) propose providing additional scaffolding mechanisms to help students focus their work toward specific goals.
Accordingly, in this study, we apply empirically proven effective instructional scaffolds and investigate generative learning strategies’ role in supporting AR-based learning.

Generative learning strategies in AR-based learning

Learning, in general, is not a passive task but involves several constructivist activities such as receiving and selecting information, making meaning of or interpreting the to-be-learned information, drawing inferences from it, and integrating it into one’s knowledge structure (Wittrock, 1974; Mayer, 2014). Engaging students in generative learning strategies (GLS) aims to improve their learning by encouraging them to actively engage with the subject matter (Fiorella & Mayer, 2016). GLS can be defined as “(…) activities that prompt learners to produce something meaningful that goes beyond the information provided” (Brod, 2021, p. 1296). GLS are considered adequate for learning as they promote active sense-making of to-be-learned information, involving mental reorganization and integration with one’s prior knowledge - an essential part of effective learning (Fiorella & Mayer, 2016; Wittrock, 1974; Mayer, 2014). They are suitable for learners of different ages (Brod, 2021) and applicable across different subject areas and types of knowledge. On the one hand, they promote different representations of the learning content. On the other hand, they stimulate different cognitive processes. GLS that result in external products are of particular value because, by externalizing knowledge, learners can reduce their working memory load (Fiorella, 2023). The use of GLS has already been shown to be beneficial in domains like the STEM disciplines, which require students to understand complex concepts profoundly and solve problems (Chi, 2021). Its use was also found advantageous during multimedia learning. For example, Parong and Mayer (2018) investigated learning with a virtual reality simulation and additional content summarizing. They note that the additional learning strategy of summarizing led to higher learning outcomes and, at the same time, did not lead to different attitudes toward the learning environment compared to a control group. Klingenberg et al. (2020) reported a significant interaction between media and methods: using the GLS of teaching significantly improved transfer, retention, and self-efficacy when learning through immersive virtual reality but not desktop virtual reality. Makransky et al. (2021) also showed differences in the beneficial influence of GLS, depending on the media used: they reported that using the enactment learning strategy leads to significantly better procedural knowledge and transfer when learning with immersive virtual reality, but not when learning with videos. The authors explain their findings by arguing that incorporating learning strategies constitutes a suitable instructional design for VR lessons, given their high cognitive demands. Thus, using learning strategies ensures that students’ engagement fosters meaningful generative processing, emphasizing the selection of pertinent information over irrelevant details. Some GLS are considered highly relevant for supporting deeper cognitive processes, e.g., drawing and self-explaining strategies. Both strategies are classified as model-focused (Leopold & Leutner, 2012) and are the present study’s focus. According to Fiorella (2023), self-explaining generalizes knowledge, while drawing organizes knowledge. Both learning strategies relate to qualitatively distinct modes: self-explaining refers to generating coherent verbal representations, while drawing refers to visualizing, i.e., generating coherent visual representations.

Learning by drawing is based on the idea that learning is a multi-representational process (Ainsworth & Scheiter, 2021). During drawing, learners engage in generative learning processes while constructing an external representation of the learning content (van Meter, 2013; Leutner & Schmeck, 2014), e.g., they draw an illustration that looks like or corresponds to a studied concept from a text (Scheiter et al., 2017) or from a multimedia environment like a simulation (Kohnle et al., 2020). Drawing may lead to additional improvements by encouraging students to elucidate abstract spatial relations, which are often typical in science education (Miller-Cotto et al., 2022). This constructivist engagement may result in a deeper understanding of the learning content as it provides the learners with an additional chance to include prior knowledge or visualize relationships between different aspects of a certain topic (i.e., learning through diagrams or illustrations; Fiorella & Mayer, 2014; Schmidgall et al., 2019). However, drawing might also be a barrier to learning due to the higher need for extraneous processing if the learner has to focus more on the act of drawing itself than on the to-be-learned content (Fiorella & Mayer, 2014). Compared to other model-focused strategies (e.g., summarizing), Fiorella and Zhang (2018) found that the effects of creating drawings are mixed and may depend on the level of drawing guidance provided, e.g., pre-drawn elements (see also Leutner & Schmeck, 2014). The authors also state that individual differences in spatial ability may moderate the effect of drawing in learning.
Considering this, a study by Bobek and Tversky (2016), e.g., found that in particular students with lower spatial ability benefited from creating drawings after learning about the mechanics of a bicycle pump, when compared to students who wrote verbal explanations, while students with higher levels of spatial ability might not need the supportive drawing activity.

Explaining something to oneself is a cognitive activity that should lead to learners developing a deeper understanding of learning content. Learning by self-explaining involves explaining (either orally or in writing) in one’s own words the content of a lesson, a text, or a multimedia environment to oneself during learning (Chi, 2000, 2021; Fiorella & Mayer, 2014; Mayer & Johnson, 2010). By self-explaining, learners generate inferences about causal connections and conceptual relationships that enhance understanding (Bisra et al., 2018). Encouraging students to self-explain can be considered an effective instructional strategy because it stimulates deeper processing of information among students (Sweller, 2012). Fiorella et al. (2019) revealed that learners using the self-explaining strategy significantly outperformed learners who drew when learning from diagram-heavy lessons. Explicit training in self-explaining effectively leads to higher learning outcomes (Kurby et al., 2012). Despite these advantages, alternative instructional techniques may be more effective under some conditions, for example, when self-explanations improve the transfer of information to a new task but reduce recall of details (Rittle-Johnson & Loehr, 2017). Chi (2000) found in her research on self-explanation that some learners spontaneously engage in self-explaining activities, while others do not. She also reported a broad range in the number of explanations that the learners wrote during learning. It remains an open question whether individual differences in verbal ability play a role here, similar to the role of spatial ability when creating drawings (see paragraph above). Self-explanation is a learning strategy in which learners explain concepts or solve problems in their own words, either aloud or in writing, to help them understand and retain the material better.
Learners with high verbal abilities may naturally excel in self-explanation because they can articulate their thoughts more effectively. They might use a more extensive and precise vocabulary to express their understanding. While high verbal abilities may be an advantage when using self-explanation, to our knowledge, this has not yet been investigated.

As shown above, both strategies have been studied numerous times in multimedia learning, e.g., learning with videos or simulations (e.g., Fiorella et al., 2019; Kohnle et al., 2020). However, research comparing drawing and self-explaining is still scarce and contradictory. For example, Miller-Cotto et al. (2022) found that when students solved science problems, self-explanation led to greater accuracy than drawing or a combination of both strategies. Yet, Bobek and Tversky (2016) and Scheiter et al. (2017) reported that students who produced drawings while reading science texts outperformed students who produced written self-explanations, in particular if the products were of high quality and students had low spatial ability. Furthermore, research on GLS in AR-based learning is still scarce. For example, Buchner (2022) found that students using learning strategies (self-explaining and self-testing) during AR-based learning reported positive attitudes toward the learning environment but a worse perception of usability and higher skepticism toward AR as a learning technology than students learning without such strategies. The author attributes these results (1) to the fact that the experimental group learned with an AR program designed by their teachers, while the control group used professional AR material, and (2) to the assumption that performing additional tasks, as required by the use of learning strategies, demanded additional mental effort, so that the participants in the control group had a less skeptical attitude, as they were allowed to interact with the AR materials much more autonomously and without further learning activities. Examining learning strategies in the context of AR is also relevant, as aligning media formats and instructional methods with appropriate learning strategies is essential for effective learning (Fiorella et al., 2019; Klingenberg et al., 2020; Makransky et al., 2021).
The questions of whether and how GLS support learning gains in AR-based learning, as well as the relationship between GLS and the perception of visual authenticity during AR learning, still need clarification. Additionally, the role of individual differences in prior knowledge and cognitive abilities (verbal and spatial ability) has yet to be determined. Therefore, the present research project aims to investigate the relationship between perceived authenticity and different GLS during learning with AR.

Research questions

Nachtigall et al. (2022) state in their review that “providing authentic materials (as a design element of authentic learning settings) to resemble real-life experiences (as an intention of authenticity) could be a double-edged sword, as they feature both authentically designed learning settings with low effects on cognitive outcomes and settings with high effects on motivational outcomes” (p. 1). Therefore, it is relevant to investigate the effects of AR-based learning (with realistic and authentic representations via spatial 3D models): realistic details could hinder or enhance learning (Skulmowski et al., 2022), and personal aspects like spatial ability might affect information processing as well (Brucker et al., 2014; Krüger et al., 2022). In addition, learning strategies during AR-based learning have not yet been sufficiently studied (Buchner, 2022). Thus, it appears questionable whether instructional support enhances or decreases participants’ perception of visual authenticity when learning with AR and which learner and situation characteristics are related to perceived authenticity. Furthermore, given the arguments described earlier, it remains to be clarified how learning with AR can be most usefully supported and what role learners’ perception of visual authenticity plays in this process. Therefore, in this study, we aim to investigate the role of learners’ perception of the visual authenticity of the learning content when learning with different learning strategies in an AR-based learning environment.

In the case of the heart and the cardiovascular system, appearance and functionality play a central role in comprehensive understanding. The GLS used in the current study, self-explaining and drawing, involve both aspects, raising the question of which GLS is more likely to enhance or diminish the experience of authenticity: perception of visual authenticity through visual drawing or understanding via self-explaining? Thus, the main objective of this study is to examine the effect of learning strategies on learners’ perceived visual authenticity of the learning environment and on learning success and situational interest.

Against the background of the authenticity model from Betz et al. (2016; see also Nachtigall et al., 2022), different authenticity-relevant variables are examined: as person-related predispositions, we include learners’ prior thematic knowledge, spatial and verbal skills, and motivational aspects (topic-related interest). Here, we are interested in the relationship between those variables and perceived authenticity. Regarding the characteristics of the learning situation, the influence of the two mentioned GLS is investigated. Perceived visual authenticity, knowledge acquisition, and participants’ satisfaction with the learning environment are dependent variables. Further, learners’ perceived authenticity of the AR content is investigated as a factor possibly correlating with the learning process’s cognitive and motivational target variables.

Our research questions are, therefore, as follows:

  • RQ1: To what extent is there a relationship between learners’ cognitive prerequisites (prior knowledge, cognitive abilities), motivational prerequisites (thematic interest), and their perception of the visual authenticity of the AR content?

  • RQ2: To what extent is there a difference between the three comparison groups (AR + self-explanation learning strategy; AR + drawing learning strategy; AR-only group) regarding perceived visual authenticity, learning outcomes, and participants’ satisfaction with the learning material?

  • RQ3: To what extent is perceived visual authenticity related to learning outcomes, satisfaction, and situational interest (catch and hold)?

Method

Sample and design

The current study has a 1-factorial experimental comparative research design and uses pre- and post-tests. Students of a German university were approached by the project leaders on campus and after lectures and asked whether they wanted to participate in this study. Sixty-two participants (mainly students, with a majority of pre-service teachers of different STEM subjects; n = 26) participated in the study in their free time (age M = 24.24 years; SD = 3.58; 28 women; 32 men; 1 non-binary person). As a reward for participation, the participants could win one of three shopping gift cards. Participants were randomly assigned to one of three conditions: the first group learned with self-explaining as a learning strategy (n = 20), the second group used drawing as a learning strategy (n = 20), and a control group learned without using learning strategies (n = 22). The participants in the different conditions did not differ regarding their gender ratio (χ2 (1, N = 62) = 1.95, p = .745), age (F2,57 = 0.01, p = .996), prior knowledge about the cardiovascular system (F2,58 = 1.91, p = .160), thematic interest (F2,58 = 1.27, p = .881), figural intelligence (F2,58 = 0.43, p = .958), or verbal intelligence (F2,58 = 1.02, p = .368).

Materials and process

Project leaders introduced participants to the study using a standardized script. They welcomed the participants and briefly explained the upcoming learning process and the learning materials. Organizational questions regarding the process were answered.

Afterward, the participants received a tablet PC. They completed an online questionnaire on sociodemographic data (age, gender), their highest education degree, study domain, thematic interest, and cognitive ability (verbal/visual intelligence). Their prior knowledge was assessed on a paper-pencil questionnaire. During learning, (1) participants of the drawing group were prompted to apply the drawing learning strategy and created a drawing of the content to be learned (see below); (2) participants of the self-explaining group were prompted to apply the self-explaining learning strategy, which means that they wrote down self-explanations of the content (see below); and (3) participants of the control group did not apply any learning strategy.

This was followed by the intervention, in which the participants were first introduced to the AR app InsightHeart (AnimaRes GmbH, 2017; https://animares.com/portfolio/insight-heart) on the tablet PC, the accompanying material, and guidance on their learning strategy. The cardiovascular system is rather demanding to learn due to its complexity and dynamic processes, and it is expected that learners can benefit from visualization using AR. Using InsightHeart, participants learned about three relevant components of the human cardiovascular system: pulmonary/systemic circulation, anatomy of the heart, and blood flow within the heart. InsightHeart is an augmented reality application that uses the tablet PC camera to display the human heart on the screen within the learners’ natural surroundings. The heart is displayed as a high-resolution floating three-dimensional image (3D hologram), offering an immersive experience. The app aims to provide an authentic and realistic representation of human anatomy, so that it can be studied as genuinely and realistically as on the open heart. The app explained the relevant functions of the heart via additive pop-up labels, different layers, or sounds. Participants could explore the human heart and blood circulation via touch gestures on the tablet (zoom, rotate, and scale the 3D heart hologram) and walk around to inspect it from all angles. For the first component, they saw the human body in full-body view, showing the life-size human anatomy. By zooming in, exploring from different angles, and following the course of various blood vessels, the structure of the cardiovascular system became apparent. The second component dealt with the anatomy of the heart. The heart could be rotated, enlarged, and viewed from different angles. Finally, participants investigated the blood flow through the heart, which was displayed for observation.

The accompanying learning material, presented on paper, guided all learners through the learning process and occasionally gave additional background information not visible in the app (e.g., the connection of the vessels with the different heart chambers). First, the procedure in the app was explained step by step. Then, the overall topic (human heart) and the three relevant components (pulmonary/systemic circulation, anatomy of the heart, and blood flow in the heart) were introduced. The learning process for each component was always prompted by leading questions relevant to the to-be-learned content (e.g., “Why is the human circulatory system referred to as a double circuit?”) and practical tips for the observation (“First, look at the human circulatory system as a whole. Take different perspectives” or “You can also display the blood flow”). In addition, for the participants who were supposed to use the learning strategies, the accompanying learning material prompted them to make drawings or self-explanations during their learning process (e.g., “Determine why the human circulatory system is referred to as a double circulatory system. Draw a labeled sketch that schematically represents the double blood circulation with the relevant components” or “Identify the double circulatory system, including the relevant components, and explain why the human circulatory system is called a double circulatory system”). The project leaders offered extra paper and pencils to the learners to engage them in the learning activities. In sum, the participants of the treatment groups received five prompts to use the respective GLS. Learners could then make their drawings or write down their explanations on paper. The project leaders encouraged students when they noticed that they did not draw or write self-explanations.
The learning time was kept approximately the same between the three groups: while learners in the experimental groups were engaged in the learning strategies, learners in the control group partly engaged longer with the AR materials (min. 30, max. 45 min).

Finally, another online questionnaire was completed on the tablet PC (asking questions about the learners’ perceived visual authenticity of the AR environment and their satisfaction with the AR environment), and a post-knowledge test was collected in paper-pencil format. The learners in the intervention groups were not allowed to use their drawings or self-explanations during the knowledge test. In the following, the instruments will be explained in more detail.

Instruments

All scales were presented online, except the knowledge tests (paper-pencil), and participants reached the questionnaire via tablet.

Knowledge about the cardiovascular system was measured in the pre- and post-test via eight questions on the three components of the cardiovascular system that were addressed in the learning environment. Questions contained verbal and visual content and were presented as four open questions and four single-choice questions addressing recall and comprehension, related to the leading questions of the learning material. Tasks 1 and 2 asked for the internal structure of the heart and the blood flow through it, once by making a sketch (task 1) and once by explaining it (task 2). In tasks 3 and 4, the double blood circulation was to be drawn (task 3) and explained (task 4), respectively. Tasks 1/2 and tasks 3/4 each had the same score. The single-choice questions 5 to 8 tested the physiologically correct distribution of oxygenated and deoxygenated blood in the ventricles and atria of the heart, the correct labeling of the four most important vessels, the type of blood transported in the pulmonary veins, and the classification of the circulatory system. The test was balanced by integrating verbal as well as visual items, so that none of the treatment groups had an advantage due to the learning strategy used while dealing with the learning material. In the pre-test, there was also the option to answer “I don’t know” for each of the questions. In addition to designing the knowledge test, the project leaders created a coding scheme for each task and also evaluated the test. An example of a visual task is as follows: “Draw a sketch of the internal structure of the heart and label all the components drawn. Then draw the direction of blood flow through the heart using arrow symbols.” The sketch was scored as follows: Five points for drawing the two ventricles, two atria, and most important vessels (aorta, pulmonary artery, body vein/hollow vein, pulmonary vein) with 1/2 point allocated for each.
An additional point for the correct arrangement of these components. Four points for labeling and equivalent terms, with 1/2 point given for each labeled item. Another point for a correct depiction of the blood flow. An example of a verbal task is as follows: “Explain why we speak of a double circulatory system in humans. Explain the structure of the circulatory system including the important organs.” The explanation was rated as follows: explaining pulmonary and systemic circulation (1/2 point each); explaining the double, closed blood circulatory system (1 point per feature); naming lungs, heart, body (1/2 point each). One example of a single-choice question is as follows: “What type of blood does the pulmonary vein transport? Mark the correct answer. A) Mixed blood, B) Capillary blood, C) Oxygen-poor blood, D) Oxygen-rich blood.” The overall points from open responses and single-choice questions were combined into one score. Participants could gain a maximum score of 33 points.

To measure participants’ verbal and figural intellectual abilities in the pre-test, two subtests were selected from the basic module of the Intelligence Structure Test 2000 R (IST-2000-R; Liepmann et al., 2007) that assessed both verbal and figural (i.e., visuospatial) reasoning with 12 items each. The IST-2000-R is one of the German standard tests assessing intelligence (Roth & Herzberg, 2008). Verbal and figural intelligence were determined via 12 tasks each: sentence completion, verbal analogies, and similarities for the verbal scale, and figure selection, cubes, and matrices for the figural scale. (The original questionnaire had 20 items for each category; however, for economic reasons, we administered four items with different difficulty levels per category. For this purpose, the items were sorted according to their given difficulty indices, which indicate the percentage of correct answers, and every fifth item was selected.) The scores achieved in the reasoning tests were added up for each scale (max. 12 points per test).
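The item-reduction step described above (sort 20 items by difficulty index, keep every fifth) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_every_fifth`, the item labels, and the difficulty values are hypothetical, and the choice to start counting at the fifth-easiest item is an assumption, since the text does not say which offset was used.

```python
def select_every_fifth(difficulty_by_item, step=5):
    """Rank items from easiest to hardest and keep every `step`-th one.

    `difficulty_by_item` maps an item label to its difficulty index,
    i.e., the proportion of correct answers (higher = easier).
    """
    ranked = sorted(difficulty_by_item, key=difficulty_by_item.get, reverse=True)
    # Assumption: counting starts so that the 5th, 10th, 15th, ... ranked items are kept.
    return ranked[step - 1::step]


# Hypothetical difficulty indices for one 20-item category (not values from the IST-2000-R).
analogies = {f"item_{i:02d}": 0.95 - 0.04 * (i - 1) for i in range(1, 21)}

selected = select_every_fifth(analogies)
print(selected)
```

With 20 items per category and a step of five, this yields four items per category, which matches the 12 items per scale (three categories each) reported above.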

Thematic interest in the cardiovascular system before learning was measured in the pre-test via four items based on Krapp (2002; α = 0.92; example item: “I find the topic of the cardiovascular system exciting.”). Each item could be answered on a 5-point Likert scale (“agree” to “disagree”).

Perceived visual authenticity of the AR content was measured in the post-test with Aitamurto et al.’s (2020) scale on the perception of visual authenticity, which consists of three items assessing how authentic, realistic, accurate, and credible the participants perceived the AR visuals to be (“The visuals in the AR application seemed authentic to me,” “The visuals accurately reflected reality,” “The visuals were realistic and convincing”; α = 0.63). Participants rated their agreement with the statements on a 5-point Likert scale (1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree).

In the post-test, we assessed participants’ satisfaction with the AR environment. To this end, we used the respective subscale of Diaz-Noguera et al.’s (2017) Augmented Reality Applications Attitude Scale (ARAAS). Participants rated their agreement with five statements on a 5-point Likert scale (1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree; α = 0.86; example item: “Demonstration of 3D objects, videos and animations with AR applications increases my curiosity”; one original item was omitted to improve internal consistency).

Situational interest was measured in the post-test with a 12-item scale by Lewalter (2020; see also Knogler et al., 2015). In theoretical conceptions of situational interest, a distinction is made between two components: situational interest catch and situational interest hold (Knogler et al., 2015; Krapp, 2002). Situational interest catch refers to the initial emergence of situational interest, in which a person’s attention is drawn to a certain subject matter, their curiosity for this content or subject area is aroused, and a positive emotional experience is created. Situational interest hold is present when a person wants to continue to engage with the content beyond a brief period of attention. They perceive it as meaningful and want to learn more about it (Knogler et al., 2015; Lewalter, 2020). The scale consists of two subscales: catch (6 items; example: “I found the learning environment exciting”; α = 0.86) and hold (6 items; example: “I would like to learn more about parts of the learning environment”; α = 0.80). Participants indicated their answers on a 5-point Likert scale ranging from 1 = completely disagree to 5 = completely agree.

Data analysis

To detect any relevant prior differences between the groups, we used the chi-square test for differences in gender ratio and MANOVA to find differences in prior knowledge, cognitive abilities, and prior thematic interest (see sample section). Repeated measures ANOVA was used to test for a significant knowledge gain from pre- to post-test. For answering RQ1, we calculated Pearson’s correlation coefficients to analyze relationships of perceived authenticity with personal prerequisites (prior knowledge, cognitive abilities, thematic interest). For answering RQ2, we computed MANOVA with post hoc tests to analyze differences between the three conditions in perceived visual authenticity of the AR content, post-knowledge, and satisfaction with the AR environment. For analyzing relationships of perceived visual authenticity with cognitive outcomes (post-knowledge) and motivational outcomes (satisfaction and interest), we again calculated Pearson’s correlation coefficients (RQ3).
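For readers less familiar with the correlation analyses used for RQ1 and RQ3, the following minimal Python sketch shows how Pearson’s r and its conventional degrees of freedom (n − 2) are obtained from paired scores. The function name `pearson_r` and the example scores are hypothetical and are not data from this study; the analyses reported here were presumably run in a statistics package, which the text does not name.

```python
import math


def pearson_r(x, y):
    """Return Pearson's r and its degrees of freedom (n - 2) for paired scores."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equally long lists with at least 3 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y), n - 2


# Hypothetical scores for five learners (illustration only).
authenticity = [3.0, 3.7, 4.0, 4.3, 5.0]
interest_hold = [2.5, 3.0, 3.5, 3.0, 4.5]

r, df = pearson_r(authenticity, interest_hold)
print(f"r({df}) = {r:.2f}")
```

Computing the coefficient separately per condition, as done for RQ1 and RQ3, simply means calling the function once for each group’s paired scores.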

Results

Basic analysis regarding the perception of visual authenticity, satisfaction, and learning effects

Descriptive values of prior knowledge and the three dependent variables knowledge, perceived visual authenticity, and satisfaction for the three groups are presented in Table 1.

Table 1 Descriptive data for prior knowledge, perceived visual authenticity, satisfaction, and post-knowledge test for each group

Overall, repeated measures ANOVA reveals a significant increase in knowledge between pre-test (M = 12.65; SD = 8.79) and post-test (M = 24.54; SD = 6.55; F(1,60) = 171.67, p < .001; η2 = 0.17). Differences in dependent variables between the three comparison groups are described below in the section on RQ2.

RQ1: relationship between learners’ prerequisites and perceived visual authenticity

Correlation analysis reveals no significant relationships between learners’ prerequisites (prior knowledge, figural/verbal intelligence, and thematic interest) and perceived visual authenticity. However, in the control group, a tendency of a positive relationship can be seen for cognitive abilities (figural and, in particular, verbal intelligence; see Table 2).

Table 2 Correlations of perceived visual authenticity with learners’ prerequisites

RQ2: differences between the three comparison groups in regard to the perceived visual authenticity, post-knowledge, and satisfaction

The assumption of homogeneity of variance was assessed before conducting MANOVA: The results of Levene’s test showed that the assumption of homogeneity is met and the variances for all three dependent variables are equal between the three comparison groups (perceived visual authenticity, F(2,58) = 0.332, p = .719; post-knowledge, F(2,58) = 2.24, p = .115; perceived satisfaction, F(2,58) = 71.30, p = .735). Regarding differences between the three conditions, overall MANOVA results reveal mean differences among the three conditions (F(6,114) = 3.17; p = .007; η2 = 0.14). For the individual dependent variables, differences depending on the use of learning strategies can be found for the post-knowledge test (F(2,58) = 8.41; p < .001; η2 = 0.23), but not for perceived visual authenticity (F(2,58) = 2.51; p = .090; η2 = 0.08) or satisfaction with the AR environment (F(2,58) = 0.21; p = .814; η2 = 0.01). A more detailed look at post-knowledge shows that, descriptively, both conditions with learning strategies outperformed the group without learning strategies, and the self-explaining group performed better than the drawing group (see Table 1). To compare group means in post-knowledge, we performed post hoc tests (Bonferroni), revealing that the self-explaining group performed significantly better than the drawing group (p = .044) and the AR-only group (p < .001). No difference in post-knowledge between the drawing and the control group was found (p = .403).
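As a reading aid for the effect sizes above, the sketch below converts an F value and its degrees of freedom into partial eta squared. It assumes that the reported η2 values for the univariate tests are partial eta squared, which the text does not state explicitly; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic.

    Uses eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Univariate tests reported for RQ2, each with df = (2, 58).
for label, f_value in [
    ("post-knowledge", 8.41),
    ("perceived visual authenticity", 2.51),
    ("satisfaction", 0.21),
]:
    print(label, round(partial_eta_squared(f_value, 2, 58), 3))
```

Under this assumption, the computed values lie within about 0.01 of the effect sizes reported for the three dependent variables.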

RQ3: relationship between perception of visual authenticity, post-knowledge, situational interest and satisfaction

Correlation analyses revealed no significant correlations of perceived visual authenticity with post-knowledge in any group (drawing, r(19) = −0.12, p = .623; self-explaining, r(19) = 0.02, p = .934; control group, r(20) = 0.08, p = .718). We did find a significant correlation between perceived visual authenticity and situational interest hold in the self-explaining group (r(18) = 0.47, p = .037), but not with situational interest catch (r(18) = 0.23, p = .335). However, we did not find any corresponding relationships in the drawing group (perceived visual authenticity and situational interest catch, r(18) = 0.15, p = .542; perceived visual authenticity and situational interest hold, r(18) = 0.12, p = .490) or in the control group (perceived visual authenticity and situational interest catch, r(19) = 0.39, p = .071; perceived visual authenticity and situational interest hold, r(19) = 0.01, p = .962). Perceived visual authenticity also correlates with satisfaction in the self-explaining group (r(19) = 0.47, p = .039) but not in the drawing group (r(19) = 0.21, p = .384) or the control group (r(19) = 0.38, p = .081).

Discussion

This paper contributes to the existing research on the educational use of AR by exploring the relationship between instructional support and perceived visual authenticity in AR-based learning, and its relation to cognitive and motivational learning outcomes (learners’ post-knowledge and satisfaction). First, the descriptive values of all three groups reveal that the participants perceived the AR content as authentic (the mean values of all three groups were above four, on a scale from one to five). The significant knowledge gain from the pre- to post-test adds empirical evidence to prior findings that AR-based learning is a promising way to support learning (e.g., Garzón & Acevedo, 2019; Radu, 2014), also in the field of human anatomy (Moro et al., 2021; Nuanmeesri et al., 2019; Tang et al., 2020). The present research results suggest that AR can provide learning content that enhances complex medical learning scenarios by offering the chance to thoroughly study the anatomy of a structure by virtually investigating anatomical parts, enabling a comprehensive understanding of the subject.

Concerning RQ1, learners’ perceived visual authenticity did not show significant correlations with learners’ prerequisites (prior knowledge, verbal and figural intelligence, and interest) in any group. Thus, our data yielded inconclusive evidence regarding the assumption that the perception of visual authenticity is related to prior personal characteristics, no matter if and what kind of support was presented (visual drawing or verbal self-explaining). However, we think it is worth mentioning that the correlation between learners’ verbal intelligence and perceived visual authenticity just missed significance in the control group, whereas no such tendency appeared in the experimental groups. Previous research has already shown that cognitive ability can have an impact on performance and attitudes in learning with multimedia (Burin et al., 2021; Heo & Toomey, 2020; Höffler, 2010). For verbal ability, e.g., Burin et al. (2021) and Kim and Lombardino (2019) found that the comprehension of multimedia material is affected by verbal ability, as verbal ability was a strong concurrent predictor of learning outcomes. In our study, deficits in cognitive abilities, in particular verbal skills, might have been counterbalanced to a certain extent by the engagement in instructional strategies, so they may have less influence on the perception of visual authenticity. The learners’ expectations and assumptions about the features of the real objects that the AR environment is attempting to represent might be more satisfied by using learning strategies, thus making learning and resulting attitudes more independent of prior existing cognitive abilities. However, the correlation results did not achieve statistical significance. To draw more definitive conclusions regarding these assumptions, repeating the experiment with a larger sample would be beneficial.

For RQ2, in regard to the question of which GLS is more likely to enhance or decrease the experience of authenticity in AR-based learning environments with learning contents such as the human cardiovascular system, learners of the three groups reported no significantly different perceptions of the visual authenticity of the AR content. The control group without learning strategies reported a descriptively higher perception of visual authenticity than both experimental groups, but this difference was not statistically significant, leaving inconclusive results regarding the assumption that these learning activities play a significant role in the perception of visual authenticity. Nevertheless, the results suggest that the perception of visual authenticity is maintained in learning situations where learning strategies are used. The findings offer additional insights and contribute to previous findings regarding the users’ perceived visual authenticity of AR content (Aitamurto et al., 2020; Hsu et al., 2016; Wei et al., 2018), showing that the guidance on visual or functional aspects of the content provided via the strategies might not influence the perceived visual authenticity. Regarding knowledge acquisition, our results reveal that both groups using learning strategies show descriptively higher post-knowledge than the control group. This adds to prior research stating that GLS are adequate for learning by leading learners to mentally reorganize the to-be-learned content and integrate it into their prior knowledge (Fiorella & Mayer, 2016; Wittrock, 1974; Mayer, 2014). Concerning the differences between the groups, self-explaining appears to be more effective in supporting AR-based learning than drawing in this learning setting, as indicated by the significantly higher post-knowledge of the self-explaining group compared with both the drawing group and the control group.
A possible explanation for this is that learners in the self-explaining group produced more additional information than learners in the drawing group (which is closer to the modality of AR content; Hausmann & VanLehn, 2010). Further, according to dual-coding theory (Clark & Paivio, 1991), the use of nonverbal and verbal codes, being functionally independent, can have additive effects on recall. Together with the visual input from the AR learning environment, the learners of the self-explanation group might have created a more coherent mental representation by utilizing the verbal self-explanation strategy. Thus, encouraging learners to explain concepts and contexts to themselves seems to be an appropriate way to promote deeper understanding in AR-based learning. This understanding might then again be more easily translated into both types of test items of our knowledge test. Although the learners in this study did receive support on what to include in the drawing (e.g., to integrate relevant components of the cardiovascular system), other supportive elements, such as pre-drawn parts of the cardiovascular system, might further enhance learning with the drawing strategy (Fiorella & Zhang, 2018; Leutner & Schmeck, 2014). Regarding the way that learners feel about AR learning environments, related research found that using learning strategies during AR-based learning positively influenced learners’ attitudes toward the learning environment but at the same time led to higher skepticism toward the learning environment than learning without learning strategies (Buchner, 2022). In our study, regarding the perceived satisfaction with the learning material, we could not find any significant differences between the three groups. All participants reported overall high satisfaction with the presented AR content.

Regarding RQ3, Betz et al. (2016) reported that the perception of authenticity may lead to motivational, affective, and cognitive effects. In this respect, we found moderate positive correlations between perceived visual authenticity and motivational aspects only in the self-explaining condition: the higher the perceived visual authenticity, the higher the situational interest (hold), and the higher the satisfaction with the learning material. Thus, perceived visual authenticity is not related to catching situational interest (catching students’ attention) but is related to the more sustained holding of interest, at least in the self-explaining group.

Limitations

These findings are specific to the domain of human anatomy (the cardiovascular system) and are constrained by the use of a short-term intervention without controls for long-term effects. However, the instruments used (e.g., the questions on learners’ perception of visual authenticity) are also only useful for short-term interventions. Nevertheless, controlling for longer-term effects, e.g., via follow-up tests, could be useful. In addition, with a small sample size, caution must be applied, and a larger sample could add to the generalizability of the study’s results. However, samples with 20 or more persons per group are generally considered robust for detecting most effects (Simmons et al., 2011). According to power analyses with “G*Power,” with an alpha level of 0.05 and a power of 0.80 (Faul et al., 2007), a sample size of 51 is deemed sufficient for detecting medium effect sizes. However, a larger sample size could enhance the ability to detect even small effects in future investigations.

Future research

For future studies, it might be beneficial to include a control group employing a non-generative learning strategy for a more comprehensive comparison of effects. Additionally, assessing the quality or quantity of drawings and self-explanations, along with a potential impact on learning outcomes, could provide valuable insights. Evaluating the quality of the learning strategy products may enhance the interpretation of the observed results.

Conclusion

This study examined how instructional support through learning strategies in AR-based learning relates to learners’ perceptions of authenticity, knowledge acquisition, and satisfaction with the AR environment. By exploring diverse instructional methods using the same technology, our study extends current AR research beyond mere media comparisons and examines in more detail research questions that go beyond the scope of traditional media-oriented studies. The empirical findings offer new insights into the role of instructional support via learning strategies in AR-based learning. In summary, our results indicate that learning strategies can enhance factual knowledge without compromising the perceived visual authenticity of AR content or learners’ satisfaction with the learning environment. Our study contributes to existing research on authenticity in AR-based learning by indicating that instructional scaffolds like learning strategies neither diminish nor increase the perception of visual authenticity or satisfaction with the learning material. However, there is a notable improvement in learning outcomes, particularly with the self-explanation learning strategy, suggesting that learners in this group may have developed more comprehensive mental representations. In conclusion, these findings suggest that incorporating learning strategies in AR-based learning is beneficial for educational practices. Future research could explore additional relevant aspects such as the quality of learning strategy products, content transfer, cognitive load, or prior AR experience.