Introduction

The goal of designing authentic learning settings is to contextualize learning experiences in such a manner that learners recognize the value, utility, meaning, and functionality of the knowledge to be acquired (e.g., Nachtigall et al., 2022). Thereby, authentic learning settings aim to evoke both cognitive (e.g., development of deep understanding) as well as motivational (e.g., development of intrinsic motivation) effects (Betz et al., 2016; Lepper, 1988; Newmann & Wehlage, 1993). One way to design for authenticity is to implement learning activities that emulate the work of professionals of a certain discipline (e.g., Nachtigall et al., 2022). As shown in a study by Stamer et al. (2021), implementing videos of working scientists can be a suitable option to provide students with authentic insights into scientific practices. Moreover, the use of video modeling examples, which allow students to observe a person (the so-called ‘model’) performing a task (van Gog & Rummel, 2010; van Harsel et al., 2022), has also been proven to be effective for promoting students’ scientific reasoning (e.g., Kant et al., 2017; Omarchevska et al., 2022). Video modeling examples may be a particularly promising opportunity to familiarize novices with inquiry learning without requiring much scientific reasoning and high cognitive demands from students and, thus, without being too cognitively overstraining for them (see, e.g., Kirschner et al., 2006; Tuovinen & Sweller, 1999). This seems particularly relevant with regard to mathematics education, in which inquiry learning is not traditionally used. Nevertheless, mathematics and science are closely connected fields, and mathematics, whatever its specificity, is not a purely deductive science but also has an experimental component (e.g., Artigue & Blomhøj, 2013). For instance, the mathematical scientist Pólya (1957, p. vii) described “mathematics in the making […] as an experimental, inductive science”. 
However, teachers often do not associate learning mathematics with experimental approaches (see, e.g., Geisler & Beumann, 2020), and hands-on experimentation is rarely implemented in mathematics classes (e.g., Hagenkötter et al., 2024). Hence, students probably have little or no experience with it, and giving them first authentic insights into mathematical hands-on experimentation by letting them observe corresponding video modeling examples seems to be a particularly promising and not too cognitively demanding approach.

However, the effectiveness of (video) modeling examples can be affected by different characteristics of the model (e.g., Schunk, 1987; van Gog & Rummel, 2010). From a social-cognitive perspective of example-based learning (see, e.g., Bandura, 1994; Buunk et al., 2003; Hoogerheide et al., 2016a; Schunk, 1987), one could assume that the observation of students of the same age with similar expertise (i.e., peers) may be particularly conducive to learning. But from a perspective of authentic learning (see, e.g., Betz et al., 2016; Nachtigall et al., 2022), it seems that the observation of experts, such as scientists, is particularly suitable for promoting students’ perception of authenticity and, thus, their motivational as well as cognitive learning outcomes. Therefore, the question arises of how students perceive and learn from models with different degrees of authenticity (i.e., model authenticity) performing a mathematical hands-on experiment. To address this question, we compare video-mediated observation of peer models with video-mediated observation of scientist models engaged in mathematical hands-on experimentation and examine the effects on students’ perceived authenticity as well as their motivational (i.e., situational interest) and cognitive (i.e., knowledge acquisition) learning outcomes.

Learning from video modeling examples

Example-based learning has been studied from a cognitive (cognitive load theory; Sweller et al., 2011) and social-cognitive perspective (social learning theory; Bandura, 1986). Research from the cognitive perspective focused particularly on the effects of worked examples, which typically provide students with “a written step-by-step explanation of a full and correct solution procedure of how to solve a problem” (van Harsel et al., 2022, p. 704; see also, e.g., van Gog & Rummel, 2010). Research from the social cognitive perspective mainly dealt with (video) modeling examples, which provide students with the opportunity to observe “a model demonstrating and possibly explaining the solution procedure step by step on video” (van Harsel et al., 2022, p. 704; see also, e.g., van Gog & Rummel, 2010). Nowadays, new forms of video examples are being developed that combine features of worked and modeling examples, for instance, by demonstrating information in a step-by-step manner and combining dynamic visual information and the model’s narration (e.g., Hoogerheide et al., 2014; van Gog & Rummel, 2010; van Harsel et al., 2022). According to the example-based-learning-principle (e.g., van Harsel et al., 2022, p. 705), studying an example containing a step-by-step solution (whether written and/ or demonstrated and explained in a video) instead of solving a task is assumed to reduce unnecessary cognitive load, whereby learners might be enabled to use their working memory resources to build a problem-solving schema for later problem-solving situations. However, studying examples can lose its effectiveness or may even hamper learning when students have some prior knowledge of the problem (i.e., the expertise reversal effect; e.g., Kalyuga et al., 2001, 2003). 
Thus, in summary, “for novices studying several examples […] leads to better test performance (i.e., is more effective) attained with less time and/ or effort investment (i.e., is more efficient) than practice problem solving only” (van Harsel et al., 2022, p. 705; see also, e.g., Cooper & Sweller, 1987; Renkl, 2014; van Gog & Rummel, 2010).

The example-based-learning-principle has been demonstrated in diverse contexts such as algebra (e.g., Sweller & Cooper, 1985), argumentative writing (e.g., Braaksma et al., 2002), programming (e.g., Kalyuga et al., 2001), or subtraction (e.g., Schunk & Hanson, 1985; for reviews, see, e.g., Renkl, 2014; van Gog & Rummel, 2010). With regard to inquiry learning, previous studies show that observing video modeling examples of inquiry learning effectively reduced the high cognitive demands on students and increased cognitive learning outcomes compared to independent inquiry learning (e.g., Kant et al., 2017; Omarchevska et al., 2022). For example, the results of a study by Kant et al. (2017) on inquiry learning in physics reveal that students who watched a video modeling example reported lower mental effort and exhibited higher values in learning process measures as well as higher performance in a scientific reasoning test than students who solved the inquiry task on their own. A study by Omarchevska et al. (2022) shows similar findings. They compared the effects of watching video modeling examples (with and without prompts) with unguided inquiry learning on students’ hypothesis and argumentation quality in a subsequent training and transfer task. The findings reveal that students who watched video modeling examples demonstrated higher hypothesis and argumentation quality in the training task and higher hypothesis quality in the transfer task than students in the unguided inquiry group. Additionally, in order to investigate the effects on students’ scientific reasoning and self-regulation processes, they used screen captures and think aloud protocols of the training and transfer task. They observed that students who watched video modeling examples were self-regulating more frequently during scientific reasoning activities in the training task than students in the unguided inquiry group.

Based on these findings, observing video modeling examples of mathematical hands-on experimentation seems to be particularly suitable to reduce the cognitive demands on students and foster their cognitive learning outcomes. However, when developing video modeling examples, several design choices have to be made that may influence their effectiveness, the most salient of which is the choice of model (e.g., Hoogerheide et al., 2016b; see also, e.g., van Gog & Rummel, 2010).

Characteristics of the observed models in terms of age and expertise

Research has shown that different characteristics of the model, such as age or expertise, can impact the effectiveness of (video) modeling examples (for an overview, see, e.g., Schunk, 1987; van Gog & Rummel, 2010). From a social-cognitive perspective, particularly according to the model-observer similarity hypothesis (e.g., Bandura, 1994; Schunk, 1987), it is assumed that (perceived) similarity between learners and the model in terms of age or expertise moderates the effectiveness of modeling examples. It is likely that especially novices, whose prior knowledge as well as self-efficacy and perceived competence are low, are affected by model-observer similarity, as they are particularly likely to engage in social comparison (see, e.g., Buunk et al., 2003). Thus, it can be assumed that “the higher the degree of similarity between observer and model, particularly when the observer is novice to the task at hand, the more cognitive outcomes of learning (e.g., performing the same or novel tasks) and affective aspects of the learning process (e.g., self-efficacy, perceived competence) may be enhanced” (Hoogerheide et al., 2016a, p. 71).

However, in terms of both model-similarity in age and expertise, findings have been mixed (see, e.g., Hoogerheide et al., 2016b; Schunk, 1987; van Gog & Rummel, 2010). Regarding model-similarity in age, previous research compared learning from peer models who are similar in age to the learners with adult models who are dissimilar in age. In line with the model-observer similarity hypothesis, the results of a study by Schunk and Hanson (1985) with students who were low achieving in mathematics, for example, showed that peer models were more effective than adult models (who were more effective than no model) in enhancing self-efficacy and cognitive learning outcomes. In contrast, Bandura and Kupers (1964), for instance, found adult models to be more effective in transmitting self-reinforcing responses than peers. With respect to model-observer similarity in expertise, previous studies compared the effects of learning from high expertise models (e.g., experts) to low expertise models (e.g., advanced students) who are closer in knowledge and skill to novice learners. In line with the model-observer similarity hypothesis, for example, Braaksma et al. (2002) observed that secondary education students with weak writing skills benefited more from being instructed to focus on weak models than from focusing on strong models, whereas the reversed effect was shown for more competent students. On the other hand, the results of Sonnenschein and Whitehurst (1980), for instance, reveal that for primary school students, a more expert model was more beneficial for learning communication skills relative to a low expertise model. According to Hoogerheide et al. (2016b), one possible reason for these mixed findings may relate to the different example contents and the quality of the explanations provided by the model. Therefore, in their study, Hoogerheide et al. (2016b) kept the content equal and compared learning by observing models of the same age (i.e., peer models) or dissimilar age (i.e., adult models) which were introduced as having low or high expertise, respectively. Contrary to the model-observer similarity hypothesis, they found that adult models were more effective and efficient to learn from than peers. Moreover, the results show no effect of the alleged expertise of the observed models. Nevertheless, they found that students who observed adult models found the model’s explanations to be of higher quality than those who observed peer models, although the peer and adult models provided the exact same explanation. One possible explanation for this finding given by the authors relates to the fact that adult models may be more beneficial than peer models for behaviors that are viewed more appropriate for adults and in which adults are considered to be more of an expert (i.e., age- and task-appropriateness).

In summary, due to the mixed findings, it remains unclear to what extent model-observer similarity in terms of age and expertise affects the effectiveness of video modeling examples. However, as already mentioned, students probably have little or no experience with mathematical hands-on experimentation (see, e.g., Geisler & Beumann, 2020; Hagenkötter et al., 2024). Therefore, according to the model-observer similarity hypothesis (e.g., Bandura, 1994; Schunk, 1987), it can be assumed that they are particularly likely to engage in social comparison and, thus, are affected by model-observer similarity (see also Buunk et al., 2003; Hoogerheide et al., 2016a). In line with the findings by Schunk and Hanson (1985) as well as Braaksma et al. (2002), it may further be assumed that, from a social-cognitive perspective, students of the same age with similar expertise (i.e., peers) may be a good choice as model.

(Perceived) Authenticity of the observed models

From a perspective of authentic learning, the presence of experts is considered to be a design element of authentic learning settings (e.g., Nachtigall et al., 2022). Consequently, models of experts, such as scientists, can be expected to increase the authenticity of the learning environment perceived by students. For example, as already mentioned, the results of a study by Stamer et al. (2021) indicate that observing scientists can foster students’ perception of authenticity. They found that students who performed nanotechnological experiments and watched videos showing regular practices of scientists reported higher perceived authenticity and developed more adequate beliefs about scientific practices than students who performed the same experiments but without watching the videos. Thus, the authors concluded that videos of working scientists can be a suitable option to promote students’ perception of authenticity and to convey an authentic conception of scientists’ work. As illustrated, for example, in the model of authenticity in teaching and learning contexts by Betz et al. (2016; see Fig. 1), students’ perceived authenticity, in turn, is assumed to affect their motivational (e.g., situational interest) as well as cognitive (e.g., knowledge acquisition) learning outcomes.

Fig. 1 Model of authenticity in teaching and learning contexts, adapted from Betz et al. (2016, p. 816; see also Nachtigall et al., 2022, p. 1486)

However, so far, to the best of our knowledge, the effects of the presence of experts as one of the design elements of authentic learning settings (e.g., Nachtigall et al., 2022) have not been explicitly investigated. Although Stamer et al. (2021) found that the observation of scientists can increase students’ perception of authenticity and promote more adequate beliefs, they did not examine the hypothesized effects on students’ motivational as well as cognitive learning outcomes (see Fig. 1). Furthermore, Itzek-Greulich and colleagues (Itzek-Greulich & Vollmer, 2017; Itzek-Greulich et al., 2015, 2017) as well as Betz (2018), for example, examined the authenticity perceived by students and the effects of different instructors, but not separately, only together with the learning location. Specifically, Itzek-Greulich and colleagues (Itzek-Greulich & Vollmer, 2017; Itzek-Greulich et al., 2015, 2017) compared learning in an out-of-school lab together with a scientist and a lab assistant and learning in school together with the regular science teacher (and a combination of both and a control condition). Betz (2018) compared learning in an out-of-school lab together with a professional linguist and the project leader and learning in school with the project leader only. Moreover, the findings are highly inconsistent. On the one hand, Itzek-Greulich and colleagues found no or even a negative effect of learning in an out-of-school lab together with a scientist and a lab assistant on students’ achievement (Itzek-Greulich et al., 2015, 2017) and no effect on students’ state motivation (e.g., situational interest; Itzek-Greulich & Vollmer, 2017; Itzek-Greulich et al., 2017). In line with the theoretical assumptions of the model of authenticity (Betz et al., 2016; Fig. 1), the results of Betz (2018), on the other hand, reveal a positive effect of learning in an out-of-school lab with a professional linguist on students’ perceived authenticity as well as their situational interest mediated by students’ perceived authenticity. The authenticity of a learning setting, for example of the instructor, is likely to evoke motivational effects, especially on students’ situational interest, as it is assumed that situational interest is first triggered and then maintained by external features of the learning setting (e.g., Hidi & Renninger, 2006). According to Hidi and Renninger (2006, p. 112), situational interest refers to “the psychological state of engaging or the predisposition to reengage with particular (…) content.” Based on these results, however, it remains unclear what role the out-of-school lab scientist or professional linguist in particular played in students’ perceived authenticity as well as further learning outcomes.

The present study

To conclude, the choice of model can influence the effectiveness of video modeling examples (e.g., Hoogerheide et al., 2016b; see also, e.g., van Gog & Rummel, 2010). However, findings from both a social-cognitive perspective and a perspective of authentic learning have been mixed. On the one hand, from a social-cognitive perspective, it can be assumed that students, as they probably have little or no experience with mathematical hands-on experimentation (see, e.g., Geisler & Beumann, 2020; Hagenkötter et al., 2024), are particularly likely to engage in social comparison and, thus, are affected by model-observer similarity (see Buunk et al., 2003; Hoogerheide et al., 2016a). One may therefore assume that especially students of the same age with similar expertise (i.e., peers) may be a good choice as model to foster students’ cognitive learning outcomes. On the other hand, from a perspective of authentic learning (e.g., Betz et al., 2016; Nachtigall et al., 2022), the observation of scientists as experts seems to be particularly promising for fostering students’ perception of authenticity and, thus, their cognitive as well as motivational learning outcomes.

Against the background of these contradictory assumptions, the present study aims to investigate the effects of observing models with different degrees of authenticity performing a mathematical hands-on experiment on students’ perceived authenticity as well as their motivational and cognitive learning outcomes. For this purpose, we compared the effects of video-mediated observation of peer models with video-mediated observation of scientist models of mathematical hands-on experimentation. In line with the model of authenticity (Betz et al., 2016; Fig. 1) and in light of the demonstrated beneficial effects of learning in an out-of-school lab with a professional linguist on students’ situational interest by Betz (2018), we focus on examining the effects on students’ situational interest as well as their knowledge acquisition.

Hypotheses

Based on the assumption that scientist models are likely considered more as experts than peer models, which is one of the design elements to implement disciplinary authenticity (e.g., Nachtigall et al., 2022), and on the findings of Stamer et al. (2021) indicating that observing scientists can promote students’ perception of authenticity, we hypothesize that students who observe scientist models of mathematical hands-on experimentation will report higher perceived authenticity of the observed models than students who observe peer models (Hypothesis 1/H1).

Furthermore, building on the model of authenticity by Betz et al. (2016; see Fig. 1), which assumes that authentic learning settings may evoke positive motivational (and cognitive) effects, we hypothesize that students who observe scientist models of mathematical hands-on experimentation will report higher situational interest than students who observe peer models (Hypothesis 2/H2). With additional reference to the mediating effect of students’ perceived authenticity on their situational interest found by Betz (2018), we further assume that the effect of observing models with different degrees of authenticity on students’ situational interest will be mediated by their perceived authenticity of the observed models (Hypothesis 3/H3).

With regard to students’ knowledge acquisition, based on the arguments above, the following contradictory hypotheses can be derived: From a social-cognitive perspective, it can be assumed that students, as they probably have little or no experience with mathematical hands-on experimentation (see, e.g., Geisler & Beumann, 2020; Hagenkötter et al., 2024), are particularly likely to engage in social comparison and, thus, are affected by model-observer similarity (see Buunk et al., 2003; Hoogerheide et al., 2016a). Therefore, on the one hand, it can be hypothesized that students who observe peer models of mathematical hands-on experimentation will achieve higher knowledge acquisition (i.e., performance on a knowledge test) than students who observe scientist models (Hypothesis 4a/H4a). In contrast, from a perspective of authentic learning (e.g., Betz et al., 2016; Nachtigall et al., 2022), it can be assumed that students who observe scientist models of mathematical hands-on experimentation will achieve higher knowledge acquisition than students who observe peer models (Hypothesis 4b/H4b) and that the effect of observing models with different degrees of authenticity on students’ knowledge acquisition will be mediated by their perceived authenticity of the observed models (Hypothesis 5/H5).

Method

Participants and design

Participants were 105 tenth graders (M_age = 15.38, SD = 0.60; 48% female, 50% male, 3% diverse) from one school in Germany. Participants were randomly assigned to one of the two conditions: video-mediated observation of peer models (peer condition, n = 52) or video-mediated observation of scientist models (scientist condition, n = 53) performing a mathematical hands-on experiment.

Materials

The students in both conditions watched a 30-minute video in which three people (two male and one female) performed a mathematical hands-on experiment on beer foam decay in an out-of-school lab. In the video of the peer condition, the three models were introduced as peer students by showing photos of students of the same age. In contrast, the three models in the video of the scientist condition were introduced as scientists by showing photos of adults. To ensure a high level of authenticity of the models in the video and, thus, that students perceive the models as real scientists, we selected pictures of people with stereotypical attributes of scientists, namely glasses, older age, more male than female, and a woman without styling (see, e.g., Christidou, 2011; Hagenkötter et al., 2021). In order to draw students’ attention to the aspects the models are talking about, in both videos only the models’ arms and activities as well as written notes were visible during mathematical hands-on experimentation (see, e.g., van Gog & Rummel, 2010; see Fig. 2). Moreover, students in both conditions heard the same voices but with minor differences in language use in order to provide an age-appropriate context (i.e., peer models’ experiences of pouring beer for their parents and their parents’ friends and scientist models’ experiences of pouring beer in general) and to create a genuine conversation between the three models. For example, when planning their experiment to investigate the beer foam decay, one peer model says: “We definitely need the beer here first. And then, of course, the bottle opener.” One scientist model, on the other hand, says: “We definitely need the bottle and a bottle opener.”

Fig. 2 Screenshot of the video

Based on the steps of mathematical hands-on experimentation (see Fig. 3), the models in both videos first developed different assumptions about how the beer foam decays (1. Assuming). They then planned together an experiment to investigate beer foam decay and thought about the materials they would need to do so (2. Planning). Afterwards, they carried out their planned experiment (3. Conduction). In doing so, the models filled in their measured values in a table (4. Mathematization; see also Fig. 2) and then transferred them to a coordinate system (5. Mathematical Work). With the help of their graph, they analyzed and interpreted the results (6. Interpretation) and compared them with their initial assumptions (7. Validation). At the end, the models reflected together on their procedure and considered what they would do differently next time (8. Reflection).
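The steps from measurement table to graph to interpretation (steps 4 to 6) can be illustrated with a minimal sketch. The readings below are hypothetical and are not data from the study or the video; the function name `fit_exponential_decay`, the starting height of 10 cm, and the decay constant of 0.01 per second are assumptions chosen purely for illustration. The sketch assumes that foam height follows h(t) = h0 · exp(−k · t) and estimates h0 and k with a least-squares line through the log-transformed heights.

```python
import math

def fit_exponential_decay(times, heights):
    """Fit h(t) = h0 * exp(-k * t) via a least-squares line on log(heights)."""
    logs = [math.log(h) for h in heights]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    # Ordinary least-squares slope and intercept for log(h) = log(h0) - k * t
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope

# Hypothetical "measurement table": time in seconds, foam height in cm.
# The heights are generated from an assumed decay law, not measured.
times = [0, 30, 60, 90, 120, 150]
heights = [10.0 * math.exp(-0.01 * t) for t in times]

h0, k = fit_exponential_decay(times, heights)
half_life = math.log(2) / k  # time after which the foam height has halved

print(f"h0 = {h0:.2f} cm, k = {k:.4f} per second, half-life = {half_life:.1f} s")
```

With real, noisy measurements the fitted line would only approximate the data, and comparing the fitted curve with the initial assumptions corresponds to the validation step.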

Fig. 3 Steps of mathematical hands-on experimentation (based on Geisler, 2021; see also Ganter & Barzel, 2012)

Neither video could be stopped, fast-forwarded, or rewound by the students. In addition, to encourage the students to watch the video actively, there was an interruption of 90 seconds after each of the above steps, during which the students had to answer multiple-choice questions related to the previously displayed content, for example: “Which assumptions are mentioned in the video?”

Measures

Dependent variables

To assess students’ perceived authenticity, we used an adapted version of the questionnaire for a multidimensional assessment of the perception of authenticity in science education (FEWAW) from Finger et al. (2022), which distinguishes between the following four authenticity dimensions: instructor, location, method, and innovation. However, as we intended to assess students’ perceived authenticity while observing models performing a mathematical hands-on experiment rather than during independent experimentation, we adapted the questionnaire to measure students’ perceived authenticity of the observed models, the location where the observed models experimented, the methods the observed models used, and the innovation of the experiment performed by the observed models. Students were asked to rate 13 different statements (e.g., “The observed people in the video are real scientists.”) related to the video-mediated observation on a five-point Likert scale ranging from 1 (completely wrong) to 5 (completely right). The internal consistencies of all four adapted dimensions are acceptable (Cronbach’s α ≥ 0.72).
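For readers less familiar with the internal consistency coefficient reported here, the following minimal sketch shows how Cronbach’s α is commonly computed from item ratings, using α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The ratings are hypothetical and the function name `cronbach_alpha` is an assumption for illustration; the sketch does not reproduce the study’s data or analysis software.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items in the scale
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's sum score
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: 5 respondents x 3 items on a five-point Likert scale
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items of a dimension vary together across respondents, which is why the coefficient is read as a measure of internal consistency.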

In order to measure students’ situational interest, we adapted a questionnaire from Lewalter and Geyer (2009) which distinguishes between triggered (i.e., catch dimension) and maintained (i.e., hold dimension) situational interest (see also Hidi & Renninger, 2006). As Lewalter and Geyer (2009) used their questionnaire to assess students’ situational interest after visiting science and technical museums, we slightly adapted the wording of the items for the learning environment of the present study. The students were asked to rate six items of the catch dimension (e.g., “Did the beer foam experiment capture your attention?”) and six items of the hold dimension (e.g., “Would you like to learn more about certain aspects of the beer foam experiment?”) on a five-point Likert scale ranging from 1 (not at all) to 5 (a lot). The internal consistencies of both dimensions were good (Cronbach’s α ≥ 0.86).

To measure students’ knowledge acquisition, we used a self-developed knowledge test with a total of 15 items that tested students’ content knowledge about exponential decay (i.e., reproduction task on the decay of milk foam) and growth (i.e., transfer task on the growth of cress) processes. The items asked the students to reproduce or transfer the knowledge they gained while observing the video modeling example, especially in terms of interpreting and validating. For the most part, the students were asked to first select which answers they thought were correct from given statements and then to explain their choice. For example, the students were shown a graph and five statements describing the decay of milk foam and asked to first select the correct statements and then to explain their choice. The students could achieve a total of 47 points. Two raters coded around 20% of the knowledge test, with satisfactory interrater reliability (ICC = 0.94; 95%-CI [0.93, 0.95]). As the results of a principal axis factor analysis with oblique rotation (direct oblimin) did not confirm the separation between reproduction and transfer task, we did not differentiate between these two types of tasks in the following, but instead considered students’ overall knowledge acquisition. The internal consistency of the 15 items is nearly acceptable (Cronbach’s α = 0.69).

Control variables

Prior to watching the video, we assessed students’ demographics (i.e., gender and age) as well as grades in mathematics, biology, chemistry, and physics as control variables. In addition, we assessed students’ self-concept and interest in the mentioned subjects as control variables as it is hypothesized that these characteristics of learners may influence their perception of authenticity (Betz et al., 2016; see Fig. 1). To assess students’ self-concept, we used a questionnaire from PISA 2012 (Mang et al., 2018). The students were asked to rate five statements per subject (e.g., “In …, I learn quickly.”) on a four-point Likert scale ranging from 1 (do not agree at all) to 4 (totally agree). The internal consistencies for all four subjects were good (Cronbach’s α ≥ 0.84). Moreover, we used a questionnaire from Rost et al. (2008) to assess students’ interest in mathematics, biology, chemistry, and physics. The students were asked to rate seven statements per subject (e.g., “I enjoy working on tasks in ….”) on a six-point Likert scale ranging from 1 (does not apply at all) to 6 (totally applies). The internal consistencies for all four subjects were excellent (Cronbach’s α ≥ 0.91). Furthermore, we surveyed students’ prior experience in the field of mathematical hands-on experimentation as another control variable. The results of an interview study we conducted show that the surveyed students often did not understand the intended meaning of the activities during the different steps of mathematical hands-on experimentation when these were only named (Hagenkötter et al., 2024). Therefore, we assessed students’ prior experience at the end (i.e., after watching the video) by asking them the overarching question of how often they had already done activities similar to those observed in the video in mathematics class. We used one item for each step of mathematical hands-on experimentation. For example, we asked the students how often they had made their own assumptions about experiments (1. Assuming) or conducted an experiment (3. Conduction) in their mathematics class. The students answered on a five-point Likert scale from 1 (never) to 5 (very often). The internal consistency of the eight items is good (Cronbach’s α = 0.85).

Procedure

The study was conducted in November and December 2021. Participation lasted about 135 minutes and took place in a separate room at the school. The students participated in the study with their mathematics courses and their mathematics teachers, but worked individually on their own laptops throughout the entire study. All questionnaires and the videos were provided in a computer-based environment.

On the day of the intervention, we first explained the procedure of the study to the students (see Fig. 4). Then the students completed the first questionnaire on their demographics as well as grades, self-concept, and interest in mathematics and the natural sciences. Afterwards, immediately before watching the video, students received a short introduction to the technical features of the video (e.g., no possibility to stop, fast-forward, or rewind it). We also informed the students about the questions included in the video and the knowledge test that would follow. While watching the video, the students used headphones. After watching the video, the students filled in the second questionnaire on their situational interest and perceived authenticity while watching the video. After a break, the students worked on the knowledge test. As the students were not yet familiar with exponential processes before participating in our study and, thus, the beer foam experiment served as an exploration of exponential processes, we only used the knowledge test after the intervention. To complete the knowledge test, the students had a total of 30 minutes. Finally, the students filled in the third questionnaire on their prior experience in mathematical hands-on experimentation.

Fig. 4 Overview of the procedure

Results

Preliminary analyses

Prior to our analyses, we tested whether the random assignment resulted in comparable groups. We did not find a statistically significant difference between the groups regarding the aforementioned control variables (i.e., grades, self-concept, and interest in mathematics as well as the natural sciences, and prior experience in the field of mathematical hands-on experimentation), F(8,96) = 1.35, p = 0.228, ηp² = 0.10. Moreover, the descriptive statistics (see Table 1) indicate that the students who participated in our study had rather little prior experience in the field of mathematical hands-on experimentation.

Table 1 Descriptive statistics of the control variables: Students’ grades, self-concept, and interest in mathematics as well as the natural sciences (i.e., averaged from biology, chemistry, and physics), and prior experience in the field of mathematical hands-on experimentation

Furthermore, we tested whether the students in both conditions differed in their answers to the multiple-choice questions related to the content of the video. On a scale totaling 8 points (i.e., 1 point for each question), students in both conditions achieved an average of about 5 points (peer condition: M = 4.96, SD = 1.76; scientist condition: M = 4.80, SD = 1.59). We did not find a statistically significant difference between the groups regarding their responses to the multiple-choice questions in the video, t(82) = 0.46, p = 0.649, d = 0.10.
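As a rough plausibility check, the reported effect size can be approximated from the group means and standard deviations alone. The sketch below assumes equal group sizes, which is an assumption made here for illustration (the group sizes are not stated in this section), and the function name is hypothetical.

```python
from math import sqrt


def cohens_d_equal_n(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d with a pooled SD, assuming both groups have the same size."""
    pooled_sd = sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Means and SDs of the multiple-choice scores as reported in the text
d = cohens_d_equal_n(4.96, 1.76, 4.80, 1.59)
print(round(d, 2))  # 0.1
```

Under this assumption the result agrees with the reported d = 0.10, a difference of roughly one tenth of a standard deviation.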

Students’ perceived authenticity of the observed models

To test whether students who observe scientist models report higher perceived authenticity of the observed models than students who observe peer models (H1), we conducted a MANCOVA with condition as factor and the adapted dimensions of students’ perceived authenticity as dependent variables. We included students’ self-concept in the natural sciences as covariate due to its significant correlation with students’ perceived authenticity of the observed method (r = 0.20, p = 0.045). In line with our H1, the analysis reveals a significant effect of condition on students’ perceived authenticity, F(4,99) = 2.62, p = 0.020 (one-sided), ηp² = 0.10. We conducted post-hoc univariate ANCOVAs for every adapted authenticity dimension (i.e., perceived authenticity of the observed models, the location where the observed models experimented, the method the observed models used, and the innovation of the experiment performed by the observed models). As expected, the analyses show a significant difference between the conditions in students’ perceived authenticity of the observed models, F(1,102) = 4.19, p = 0.022 (one-sided), ηp² = 0.04, but no significant differences in any of the other adapted authenticity dimensions (the location where the observed models experimented: F(1,102) = 0.22, p = 0.637 (two-sided), ηp² < 0.01; method the observed models used: F(1,102) = 0.60, p = 0.439 (two-sided), ηp² = 0.01; innovation of the experiment performed by the observed models: F(1,102) = 0.25, p = 0.621 (two-sided), ηp² < 0.01). The descriptive statistics with regard to students’ perceived authenticity are shown in Table 2.
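The effect sizes in this section can be related to the reported test statistics through the standard identity that expresses partial eta squared in terms of F and its degrees of freedom. The following sketch is only an illustrative check, not the authors’ analysis script; the function name and the dictionary of reported values are introduced here for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported in the text
reported = {
    "multivariate effect of condition": (2.62, 4, 99),
    "observed models": (4.19, 1, 102),
    "location": (0.22, 1, 102),
    "method": (0.60, 1, 102),
    "innovation": (0.25, 1, 102),
}

for label, (f_value, df_effect, df_error) in reported.items():
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

Rounded to two decimals, the recovered values agree with those reported, for example 0.10 for the multivariate effect and 0.04 for the perceived authenticity of the observed models.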

Table 2 Descriptive statistics for students’ perceived authenticity

Students’ situational interest

To test whether students who observe scientist models report higher situational interest than students who observe peer models (H2) and whether the effect of observing models with a different degree of authenticity on students’ situational interest is mediated by their perceived authenticity of the observed models (H3), we conducted two mediation analyses. We used condition as a predictor variable (X) and students’ situational interest (either catch or hold) as an outcome variable (Y). We considered only students’ perceived authenticity of the observed models as a mediator (M) variable in our mediation analyses because, as expected, this was the only adapted authenticity dimension in which the two groups differed. We included students’ interest in mathematics and the natural sciences as covariates due to significant correlations with students’ triggered (interest in mathematics: r = 0.26, p = 0.007; interest in the natural sciences: r = 0.32, p < 0.001) and maintained (interest in the natural sciences: r = 0.26, p = 0.008) situational interest. We conducted the mediation analyses with 95% percentile bootstrap confidence intervals from 10,000 bootstrap samples using the SPSS macro PROCESS (see Hayes, 2022). Contrary to our H2, the analyses reveal no significant direct effect of the intended model authenticity on either students’ triggered (see Fig. 5) or maintained (see Fig. 6) situational interest. As already mentioned with regard to students’ perceived authenticity, the analyses indicate a significant effect of the condition on students’ perceived authenticity of the observed models, which, in turn, significantly affects students’ triggered, but not their maintained situational interest. Furthermore, against our H3, the analyses reveal no significant indirect effect of the intended model authenticity on either students’ triggered or maintained situational interest through their perceived authenticity of the observed models (which is indicated by the fact that zero is included in the bootstrap confidence intervals; e.g., Field, 2018; see Footnote 1). Table 3 provides the descriptive statistics.
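The logic of a percentile bootstrap test of an indirect effect can be sketched as follows. This is not the PROCESS macro and does not reproduce the reported analyses: it is a simplified illustration that omits the covariates, uses ordinary least squares for both paths, and runs on synthetic data generated purely for demonstration. All function names and the simulated values are assumptions introduced here.

```python
import random


def _cross(u, v):
    """Sum of cross-products of deviations (unnormalised covariance)."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """a*b, with a from regressing M on X and b from regressing Y on X and M."""
    sxx, smm = _cross(x, x), _cross(m, m)
    sxm, sxy, smy = _cross(x, m), _cross(x, y), _cross(m, y)
    denom = sxx * smm - sxm ** 2
    if sxx == 0 or denom == 0:
        raise ZeroDivisionError("degenerate sample")
    a = sxm / sxx                        # path X -> M
    b = (sxx * smy - sxm * sxy) / denom  # path M -> Y, controlling for X
    return a * b


def percentile_bootstrap_ci(x, m, y, n_boot=10_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        try:
            estimates.append(indirect_effect(
                [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples (e.g., only one condition drawn)
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Synthetic demonstration data (hypothetical, not the study's data)
rng = random.Random(42)
x = [i % 2 for i in range(100)]                # 0 = peer, 1 = scientist
m = [0.4 * xi + rng.gauss(0, 1) for xi in x]   # mediator
y = [0.3 * mi + rng.gauss(0, 1) for mi in m]   # outcome
lower, upper = percentile_bootstrap_ci(x, m, y, n_boot=2_000)
print(f"95% CI for the indirect effect: [{lower:.3f}, {upper:.3f}]")
```

If the resulting interval contains zero, the indirect effect is not considered statistically significant, which is the decision rule referred to above.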

Fig. 5 Results of the mediation analysis with triggered situational interest as outcome variable

Fig. 6 Results of the mediation analysis with maintained situational interest as outcome variable

Table 3 Descriptive statistics for students’ situational interest

Students’ knowledge acquisition

We again conducted a mediation analysis to test three hypotheses: the competing hypotheses that students who observe peer models achieve higher performance on a knowledge test than students who observe scientist models (H4a) or, conversely, that students who observe scientist models achieve higher performance on a knowledge test than students who observe peer models (H4b), and the hypothesis that the effect of observing models with a different degree of authenticity on students’ knowledge acquisition is mediated by their perceived authenticity of the observed models (H5). We used condition as a predictor variable (X), students’ knowledge test performance as an outcome variable (Y), and students’ perceived authenticity of the observed models as a mediator (M) variable. Due to significant correlations with students’ knowledge test performance, we included students’ grades in mathematics (r = -0.24, p = 0.013) and the natural sciences (r = -0.33, p < 0.001) as covariates. Again, we conducted the mediation analysis with 95% percentile bootstrap confidence intervals from 10,000 bootstrap samples using the SPSS macro PROCESS (see Hayes, 2022). In line with neither H4a nor H4b, the analysis indicates no significant direct effect of the condition on students’ knowledge test performance (see Fig. 7). Moreover, against our H5, the analysis reveals neither a significant effect of students’ perceived authenticity of the observed models on their knowledge test performance nor a significant indirect effect of the intended model authenticity on students’ knowledge test performance through perceived authenticity of the observed models. The descriptive statistics with regard to students’ knowledge test performance are depicted in Table 4.

Fig. 7 Results of the mediation analysis with knowledge test performance as outcome variable

Table 4 Descriptive statistics for students’ knowledge test performance

Discussion

Given that the effectiveness of video modeling examples, which seem to be a particularly promising and not too cognitively demanding way to provide students with first authentic insights into mathematical hands-on experimentation, is strongly influenced by the choice of model (e.g., Hoogerheide et al., 2016b; see also, e.g., van Gog & Rummel, 2010), the present paper aimed to investigate the effects of observing models with different degrees of authenticity performing a mathematical hands-on experiment. On the one hand, from a social-cognitive perspective (e.g., Bandura, 1994; Buunk et al., 2003; Hoogerheide et al., 2016a; Schunk, 1987), it may be assumed that, especially for novices, students of the same age with similar expertise (i.e., peers) may be a good choice as model and particularly conducive to learning. On the other hand, from a perspective of authentic learning, the presence of experts is considered a design element of authentic learning settings (e.g., Nachtigall et al., 2022) which may foster students’ motivational and cognitive learning outcomes (e.g., Betz et al., 2016; Lepper, 1988; Newmann & Wehlage, 1993). Against this background, we compared the effects of video-mediated observation of peer models with video-mediated observation of scientist models performing a mathematical hands-on experiment on students’ perceived authenticity as well as their motivational (i.e., situational interest) and cognitive (i.e., knowledge acquisition) learning outcomes.

As expected (H1), our results showed that students who observed scientist models reported significantly higher perceived authenticity of the observed models than students who observed peer models. Because we varied the model authenticity in isolation rather than together with other design elements of authentic learning settings, as previous studies did (e.g., Betz, 2018; Itzek-Greulich & Vollmer, 2017; Itzek-Greulich et al., 2015, 2017), we are able to draw conclusions about the effects of this single design element. Thus, our results support the assumption that the (virtual) presence of experts can be a design element for creating authentic learning settings. Moreover, our results indicate that, in order to increase students’ perceived authenticity, it is not mandatory to film “real” scientists while they work on relevant and up-to-date scientific topics, as, for example, Stamer et al. (2021) did, or to provide students “with opportunities for direct interaction with practitioners of the culture” (Hod & Sagy, 2019, p. 146). Although direct contact can probably foster students’ perception of authenticity even more, time constraints often make it difficult to provide students with direct contact to scientists (e.g., Stamer et al., 2021).

With respect to the hypothesized motivational effects of authentic learning, our results showed neither a direct effect of condition on students’ situational interest (against H2) nor an indirect effect mediated by their perceived authenticity of the observed models (against H3). We merely found a statistically significant effect of students’ perceived authenticity of the observed models on their triggered situational interest. Thus, our findings do not support the assumption that learners’ perceived authenticity relates to motivational effects as hypothesized in the model of authenticity by Betz et al. (2016; see Fig. 1). Our results are in line with the findings of Itzek-Greulich and colleagues (Itzek-Greulich & Vollmer, 2017; Itzek-Greulich et al., 2017), which revealed no effect of learning in an out-of-school lab together with a scientist and a lab assistant on students’ situational interest. Unlike Betz (2018), we further did not find a mediating effect of students’ perceived authenticity of the observed models on their situational interest. One possible reason for this could be that Betz (2018) varied several design elements of authentic learning settings simultaneously, namely the instructor together with the learning location. In line with previous research on authentic learning in out-of-school labs indicating a potential interrelatedness of different characteristics of authentic learning settings (e.g., Nachtigall et al., 2018; Nachtigall & Rummel, 2021), the authenticity level of one feature, namely the location, may have affected students’ perceived authenticity of another feature, namely the instructor. As a result, the combination may have induced the mediating effect. In contrast, the students of the present study participated in their school. Perceived authenticity of the observed models alone, as in our study, does not seem to be sufficient to evoke motivational effects. Furthermore, our results only revealed a mediating effect of students’ perceived authenticity of the observed models on their triggered situational interest if we, like Betz (2018), did not include any covariates in the mediation analysis. However, this effect was no longer evident if we included students’ interest in mathematics and the natural sciences as covariates in the mediation analysis. Thus, contrary to the assumptions of the model of authenticity (Betz et al., 2016), students’ perception of authenticity seems to play a less important role than hypothesized, and a less important role than other variables (i.e., discipline-specific interest), in fostering their motivational learning outcomes, such as situational interest.

Regarding the expected positive cognitive effects of authentic learning, we found neither a significant direct effect of the condition on students’ knowledge test performance (against both H4a and H4b) nor an indirect effect mediated by students’ perceived authenticity of the observed models (against H5). On the one hand, from a social-cognitive perspective, these findings do not support the assumption that, especially when learners are novices, a high degree of model-observer similarity leads to better cognitive learning outcomes (see, e.g., Bandura, 1994; Buunk et al., 2003; Hoogerheide et al., 2016a; Schunk, 1987). Although the students who participated in the present study had little prior experience in the field of mathematical hands-on experimentation (see Table 1) and were not yet familiar with exponential processes before participating in our study, the observation of students of the same age with similar expertise (i.e., peers) did not lead to higher knowledge acquisition than the observation of scientists. On the other hand, from an authentic learning perspective, our results are in line with the findings of Itzek-Greulich et al. (2015) showing no effect of learning in an out-of-school lab together with a scientist and a lab assistant on students’ achievement. The absence of a difference may be attributed to the fact that students in both conditions performed poorly in the knowledge test (see Table 4). We may, therefore, suspect that the students did not observe the video modeling examples attentively. However, the analysis of students’ responses to the multiple-choice questions related to the content of the video showed that the students observed the video modeling examples carefully. Another explanation may be that, while observing the video modeling examples, the students did not primarily acquire content knowledge about exponential processes (as required in our knowledge test), but that other learning outcomes were promoted instead, such as knowledge about the steps of mathematical hands-on experimentation or inquiry learning in general.

Limitations and future directions

Although we implemented an experimental design and, thus, assigned the participants individually to one of the two conditions, a first limitation of the present study may relate to the small sample size of participating students from only one school. It would be advisable to conduct the present study with a larger sample of students from different schools. Another possible limitation relates to the selection of the models in the scientist condition. As we wanted to make sure that the models are perceived as scientists by the students, we used pictures of stereotypical scientists (i.e., with glasses, older age, more male than female, and a woman without styling; see, e.g., Christidou, 2011; Hagenkötter et al., 2021; Nachtigall & Rummel, 2021) in the scientist condition. However, there are, of course, many scientists to whom these widespread stereotypical attributes do not apply. Future studies should therefore not only investigate the influence of stereotypical scientist models, but also use non-stereotypical scientist models and investigate the effects on students’ perceived authenticity as well as further learning outcomes. Furthermore, as we explicitly varied the intended model authenticity separately, both peer and scientist models carried out the same task, namely mathematical hands-on experimentation on beer foam decay. In contrast to the study by Stamer et al. (2021), in which students observed scientists during their regular work on relevant and up-to-date scientific topics, this task may not be considered scientific content and, thus, not typical for “real” scientists. Hence, the fact that the content of the present study is not about scientific practices per se, but specifically about mathematical hands-on experimentation, may have made the scientist models less important for students’ perception of authenticity and, consequently, hampered the hypothesized positive effects of observing scientist models. A further shortcoming of the present study relates to the minor differences in the language use between the peer and scientist models. Even though we kept the differences as small as possible, we had to use slightly different contexts (i.e., peer models’ experiences of pouring beer for their parents and their parents’ friends and scientist models’ experiences of pouring beer in general) and adapt the language use in order to create a genuine conversation between the three models, which may have influenced the results. Another limitation of the present study, which should be addressed in future research, relates to the use of video modeling examples as a very strong form of instructional structure. Further research should try to involve students more (and not only by asking them to answer content-related questions in order to actively watch the video) during mathematical hands-on experimentation as an authentic learning activity. This could be achieved, for example, if students discuss what they have seen while observing or if not all steps of mathematical hands-on experimentation are carried out by the models in the video, but the students also carry out steps on their own. As already mentioned, students likely acquire not only content-related skills, but also process-related skills during mathematical hands-on experimentation. Hence, future research should also examine the effects on students’ performance regarding process-related skills, for example, by evaluating them while performing a mathematical hands-on experiment on their own after observing the video modeling examples.

Conclusion

To conclude, the (virtual) presence of experts can be a design element for creating authentic learning settings. The observation of scientist models of mathematical hands-on experimentation led to a higher perception of authenticity of the observed models by students than the observation of peer models. However, students’ perceived authenticity of the observed models influenced neither their situational interest nor knowledge acquisition. Based on our results, it does not seem to be sufficient to only vary the authenticity of the observed models during learning from video modeling examples in order to evoke the hypothesized positive motivational and cognitive effects of authentic learning (e.g., Betz et al., 2016). Instead, our results support the assumption that different characteristics of authentic learning settings are interrelated (e.g., Nachtigall et al., 2018; Nachtigall & Rummel, 2021). Nevertheless, it is likely that the observation of models perceived as authentic by students, such as scientists, performing a mathematical hands-on experiment may foster further learning outcomes that were not examined in the present paper. For example, against the background of students’ naïve epistemological beliefs about mathematics (e.g., Köller et al., 2000; Schoenfeld, 1992) as well as limited conceptions about the work of mathematical scientists (Hagenkötter et al., 2022), it seems particularly important to foster more adequate conceptions. This may be achieved by students observing (non-stereotypical) scientists performing a mathematical hands-on experiment (without discouraging them through the complexity of working on open mathematical research questions; see, e.g., Ziegler & Loos, 2014). In addition, students’ conceptions about mathematics are probably largely based on their experience in mathematics classes (e.g., Schoenfeld, 1988). As mathematics teachers consider the use of hands-on experimentation in mathematics lessons to be very time-consuming (e.g., Hagenkötter et al., 2024), the video-mediated observation of models of mathematical hands-on experimentation represents an opportunity to increase the use of experimentation in mathematics lessons.