Example-based learning: should learners receive closed-book or open-book self-explanation prompts?

In learning from examples, students are often first provided with basic instructional explanations of new principles and concepts and second with examples thereof. In this sequence, it is important that learners self-explain by generating links between the basic instructional explanations' content and the examples. Therefore, providing learners with self-explanation prompts is a well-established practice. However, there is hardly any research on whether these prompts should be provided in a closed-book format or in an open-book format. In a closed-book format, learners cannot access the basic instructional explanations during self-explaining and thus have to retrieve from memory the main content of the instructional explanations that is needed to explain the examples (i.e., retrieval practice). In an open-book format, learners can access the instructional explanations during self-explaining. In two experiments, we varied whether learners received closed- or open-book self-explanation prompts. We also varied whether learners were prompted to actively process the main content of the basic instructional explanations before they proceeded to the self-explanation prompts. When the learners were not prompted to actively process the basic instructional explanations, closed-book prompts yielded detrimental effects on immediate and delayed (1 week) posttest performance. When the learners were prompted to actively process the basic instructional explanations beforehand, closed-book self-explanation prompts were not less beneficial than open-book prompts regarding performance on a delayed posttest. We conclude that at least when the retention interval does not exceed 1 week, closed-book self-explanation prompts do not entail an added value and can even be harmful in comparison to open-book ones.

Example-based learning is a common and powerful instructional means to introduce learners to new content (e.g., Hoogerheide and Roelle 2020; Van Gog et al. 2019). One highly effective sequence of learning from examples works as follows (see Renkl 2014; Wittwer and Renkl 2010). First, learners are provided with instructional explanations that communicate basic knowledge concerning new principles and concepts. Second, learners are given examples that illuminate these principles and concepts (e.g., Atkinson 2002; Foster et al. 2018; Schworm and Renkl 2007). One important activity in learning from this sequence is generating principle-based self-explanations. During this learning activity, students explain the examples by using the content of the basic instructional explanations (e.g., Hausmann and VanLehn 2010; Renkl 2014).
As learners usually do not engage in this learning activity to a sufficient degree of their own accord, one important instructional principle for example-based learning is to encourage learners to self-explain the examples (Renkl 2014). Frequently, this principle is implemented by providing the examples in conjunction with self-explanation prompts (e.g., Atkinson et al. 2003; Conati and VanLehn 2000; Schalk et al. 2020). The benefits of these prompts are evident: a wealth of research clearly shows that self-explanation prompts foster the benefits of example-based learning (for an overview, see Renkl 2014). Nevertheless, self-explanation prompts are not always equally effective, which indicates that their benefits depend on certain factors.
One factor that could substantially moderate the benefits of self-explanation prompts but has scarcely been considered is the format of the self-explanation prompts. Basically, self-explanation prompts can be implemented in two different formats: a closed-book format, in which learners cannot reinspect the basic instructional explanations while generating their self-explanations, or an open-book format, in which learners have access to the basic instructional explanations while they self-explain the examples. In the present study, we investigated the role of this potential moderator. We predicted that when the retention interval is short, open-book self-explanation prompts would be more beneficial than closed-book prompts and that this effect would be mitigated when the retention interval is long. Furthermore, we predicted that closed-book self-explanation prompts might be even more effective than open-book self-explanation prompts when learners actively processed the main content items of the basic instructional explanations before they engaged in self-explaining and thus are well-prepared to respond to closed-book self-explanation prompts.
Below, we describe the theoretical rationale behind our predictions and then report two experiments. In both experiments, learners were introduced to new content by means of example-based learning. We manipulated (a) whether learners received closed- or open-book self-explanation prompts, and (b) whether learners were prompted to actively process the main content items of the basic instructional explanations before they received the self-explanation prompts. As the main dependent variables, we used the number of generated self-explanations as well as learners' performance on an immediate and a one-week delayed posttest.

Self-explanation prompts viewed through the lens of generative learning theory
Generative learning activities involve learners manipulating given information or generating new information in order to understand the information that has already been given to them (e.g., Fiorella and Mayer 2016). These activities are theorized to benefit learning because they […] and when no feedback is provided, learners who practice the retrieval of specific idea units from memory show lower forgetting rates and thus demonstrate better long-term retention than learners who do not. The beneficial effects of retrieval practice are frequently explained by the fact that it contributes to the consolidation of memory traces. Two main mechanisms that likely support this consolidation effect are outlined in the episodic context account and the elaborative retrieval account.
The episodic context account proposes that retrieving knowledge items leads to consolidation of the items' representation in memory because the context representation that is stored with the respective items is updated. In order to retrieve certain idea units or knowledge items, learners can reinstate the episodic context in which idea units were initially encoded. When learners are successful in retrieving the idea units, features from the context in which the idea units are currently retrieved are added to the context representation. These newly added context features can be used as further cues to access the idea units in memory on future retrieval occasions. Thus, through the addition of contextual features, the strength of the idea units' mental representation is enhanced. In contrast, the elaborative retrieval account (e.g., Carpenter 2009, 2011) proposes that in using certain retrieval cues to search for targeted idea units, several other idea units that are semantically related to the retrieval cues are activated and, as a consequence, more strongly linked to the representation of the targeted idea units. On future occasions, these more strongly linked units can serve as additional retrieval routes for the respective idea units. The beneficial effects of retrieval practice are well established. Numerous studies have indicated that engaging in (different types of) retrieval practice fosters learning (e.g., Bae et al. 2018; Butler 2010; Grimaldi and Karpicke 2014; Karpicke and Blunt 2011; Roediger and Karpicke 2006; Rummer et al. 2017). In example-based learning, self-explanation prompts can serve as retrieval practice tasks.
More specifically, when self-explanation prompts are provided in a closed-book format, which does not allow learners to review the basic instructional explanations while responding to the prompts, the self-explanation prompts inevitably require learners to retrieve the main idea units of the basic instructional explanations required to explain the examples from memory.
When self-explanation prompts are viewed not only through the lens of generative learning theory but also through the lens of retrieval-based learning theory, at first glance it is reasonable to predict that closed-book self-explanation prompts should be more beneficial than open-book prompts. The rationale behind this prediction is that closed-book prompts engage learners in two types of beneficial learning activities (generative and retrieval practice activities), whereas open-book prompts engage learners only in one type of beneficial learning activity (generative activities). However, recent studies have indicated that this line of argumentation might be too simple.

The downsides of closed-book self-explanation prompts
Learners seldom retrieve information perfectly (e.g., Dunlosky and Rawson 2015; Rowland 2014). Hence, it is reasonable to assume that learners who receive closed-book self-explanation prompts would not be able to retrieve all of the required knowledge components of the instructional explanations. This, in turn, should hinder the generation of self-explanations. As a result, one could predict that closed-book self-explanation prompts elicit fewer self-explanations than open-book self-explanation prompts. In line with this notion, Blunt and Karpicke (2014) reported that the learners in their study generated fewer idea units when concept mapping and summarizing were prompted in a closed- rather than in an open-book format. Notably, this effect occurred even though the closed-book learners were provided with the opportunity to restudy the learning material at one point during concept mapping. Similarly,  found that implementing adjunct questions designed to prompt learners to elaborate on an expository text in a closed-book format led to a decrease in the number of generated elaborations in comparison to an open-book format (for similar results, see Agarwal et al. 2008; Roelle and Nückles 2019; Waldeyer et al. 2020). Hence, although learning tasks in which generative learning activities and retrieval practice are combined can benefit learning (e.g., Endres et al. 2017; Hinze et al. 2013; Heitmann et al. 2018), in comparison to pure generative learning tasks the combination of generative learning activities and retrieval practice entails certain downsides.
The downsides that come along with imperfect retrieval concern not only the generative part of learning tasks in which generative learning and retrieval practice are combined but also the retrieval practice part. At least when no feedback is provided, the benefits of practicing retrieval substantially depend on the degree to which learners are able to correctly retrieve the required knowledge items (e.g., Karpicke et al. 2014; Rowland 2014). Hence, the benefits of both the generative and the retrieval part of learning tasks should be impaired when learners are not able to successfully retrieve all of the required knowledge items from memory.
Against this background, it can be assumed that closed-book self-explanation prompts would not necessarily be more beneficial than open-book self-explanation prompts. However, given that retrieval practice decreases forgetting rates (see Rowland 2014), it can be predicted that the relative effects of closed-book and open-book self-explanation prompts change over time. When the retention interval is short (e.g., when a posttest follows immediately after the learning phase), open-book self-explanation prompts might be more beneficial than closed-book prompts because open-book prompts elicit more self-explanations and potential retrieval-driven benefits of closed-book prompts should be relatively low. This potential superiority of open-book prompts should decrease and might even disappear when the retention interval is long because the retrieval-driven benefits of closed-book self-explanation prompts might increasingly outweigh the lower number of generated self-explanations. These hypotheses are supported by previous studies that compared closed- and open-book generative learning tasks. Both Agarwal et al.'s (2008) and  findings suggest that closed-book learning tasks catch up to open-book learning tasks over time.
However, at least when complex generative learning tasks (such as self-explanation prompts) were involved, these previous findings also show that a closed-book format did not lead to higher learning outcomes than an open-book format even after a delay (but see Rummer et al. 2019). One potential underlying reason for this lack of superiority of the closed-book format could be the large detrimental effect of a closed-book format regarding the number of elicited generative learning activities. This detrimental effect, however, should substantially depend on learners' processing in the initial study phase (i.e., before they engage in retrieval-based learning activities). With respect to example-based learning, this notion means that the degree to which learners actively process the main content items of the basic instructional explanations (provided in the initial step of the sequence) is crucial for the benefits of closed-book self-explanation prompts. Learners who actively process the main content of the basic instructional explanations should be able to successfully retrieve the required knowledge components when they receive the closed-book self-explanation prompts to a greater extent than learners who do not actively process the main content items beforehand (e.g., Fiorella and Mayer 2016; Kintsch et al. 1990). High retrieval rates, in turn, should increase the benefits of both the retrieval practice part (because retrieval is effective only when it is successful) and the generative part of the self-explanation prompts (because learners are able to generate more self-explanations). 
It follows that if learners are encouraged to actively process the basic instructional explanations before they receive the closed-book self-explanation prompts, the benefits of a closed-book format should substantially increase. In this case, closed-book prompts engage learners in (additional) retrieval practice without substantially hindering the generative activity of self-explaining. Regarding performance on a delayed posttest, they might therefore even be more effective than open-book self-explanation prompts, which solely engage learners in the generative activity of self-explaining.

Hypotheses
In view of the outlined findings and theoretical considerations, the present study was designed to investigate the role of the format (closed-book vs. open-book) of self-explanation prompts in example-based learning. In terms of learning processes, we hypothesized that when learners were not prompted to actively process the main content items of the basic instructional explanations before they received the examples and self-explanation prompts, closed-book self-explanation prompts would elicit fewer self-explanations than open-book self-explanation prompts (hypothesis 1a). Under these circumstances, we furthermore predicted that open-book self-explanation prompts would yield higher learning outcomes on an immediate posttest (hypothesis 2a) but not on a delayed posttest (hypothesis 3a).
When the learners were prompted to actively process the main content items of the basic instructional explanations in the initial study phase, we predicted a different pattern of results. In this case, we did not expect a significant difference between closed- and open-book prompts regarding the number of generated self-explanations (hypothesis 1b) and immediate posttest performance (hypothesis 2b). Regarding delayed posttest performance, we expected the closed-book prompts to be more effective (hypothesis 3b).

Sample and design
The participants were N = 97 eighth-grade students from German high schools (57 female). The students were between 12 and 15 years old (M = 13.43, SD = 0.53) and received €10 for their participation. The first language of all participants was German. As the learning material was mainly text-based (see below), participants whose first language was not German were excluded from the experiment.
The participants were introduced to three new topics via example-based learning. Specifically, for each topic the learners in the first step received a basic instructional explanation that was designed to impart basic knowledge concerning new principles and concepts. These basic instructional explanations were provided either with or without prompts that were designed to elicit active processing of the main content of the basic instructional explanations (hereafter: active processing of instructional explanation prompts). In the second step, all learners were provided with two examples of the principles and concepts that were communicated by the basic instructional explanations and two self-explanation prompts that asked the learners to explain the examples using the idea units of the basic instructional explanations. Depending on the experimental condition, the self-explanation prompts were provided in either a closed-book format in which the learners could not review the respective basic instructional explanation while generating their self-explanations or in an open-book format in which the basic instructional explanations were still available while the learners engaged in self-explaining.
Jointly, these manipulations resulted in a 2 × 2 factorial between-subjects design with the factors active processing of instructional explanation prompts (with vs. without) and format of self-explanation prompts (closed-book vs. open-book). The participants were randomly assigned to the conditions.
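The 2 × 2 assignment described above can be illustrated with a minimal sketch. It assumes block randomization that keeps the four cells as equal in size as possible; the source states only that assignment was random, so the procedure, the function name `assign_conditions`, the participant IDs, and the seed are all hypothetical.

```python
import random
from itertools import product

# Factor levels as named in the design description.
PROCESSING_PROMPTS = ("with", "without")
PROMPT_FORMAT = ("closed-book", "open-book")


def assign_conditions(participant_ids, seed=None):
    """Randomly assign participants to the four cells of the 2 x 2 design.

    Assumption: shuffled round-robin assignment, which keeps cell sizes
    within one participant of each other. The actual procedure used in
    the study is not described beyond "randomly assigned".
    """
    rng = random.Random(seed)
    cells = list(product(PROCESSING_PROMPTS, PROMPT_FORMAT))
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Deal the shuffled participants round-robin into the four cells.
    return {pid: cells[i % len(cells)] for i, pid in enumerate(ids)}


# 97 hypothetical participant IDs, matching the reported sample size.
assignment = assign_conditions(range(1, 98), seed=42)
```

With 97 participants, this scheme yields one cell of 25 and three cells of 24; simple (unblocked) randomization would be an equally valid reading of the text.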

Materials
A science educationalist (the second author) supervised the design of all materials. All materials were based on, and thus highly similar to, the materials used in .
The computer-based learning environment covered three new topics that were part of the participants' regular curriculum: (a) the Bohr model, (b) the formation of ions (ionization), and (c) ionic reactions. The topics were introduced using the sequence that is established in the field of learning from worked examples (see Renkl 2014). Hence, each topic's introduction began with an instructional explanation providing basic declarative knowledge concerning the new concepts and principles. These basic instructional explanations were provided in text form and were between 163 and 192 words in length (see Fig. 1).
In the conditions with active processing of instructional explanation prompts, each of the basic instructional explanations was combined with a prompt that required the learners to actively process the basic instructional explanations' main content in their own words. More specifically, based on the recommendations by Berthold and Renkl (2010), the active processing of instructional explanation prompts were explicitly designed to focus learners on the most relevant content items, i.e., the content items that are crucial for self-explaining the subsequently provided examples. It is important to highlight that although the active processing of instructional explanation prompts asked the students to respond to the prompts in their own words (e.g., "Based on the text, answer the following question in your own words: How are atoms structured according to the Bohr model?"), they hardly required the learners to go beyond the content of the basic instructional explanations. Rather, the prompts mainly engaged learners in attending to the instructional explanations' main content items. Consequently, the active processing of instructional explanation prompts did not qualify as a type of self-explanation prompt, which usually requires learners to go (substantially) beyond the provided information (e.g., by relating the content to one's prior knowledge or by explicating to what extent the respective information entails new insights; see, e.g., Chi et al. 1994; Wylie and Chi 2014). The learners typed their prompt responses into text boxes that were positioned next to the basic instructional explanations. In the conditions without active processing of instructional explanation prompts, the students were given the instruction to "Please type any thoughts that come to mind into this text box as you attempt to understand the principles of the Bohr model/the formation of ions/ionic reactions." Although this general prompt could be conceived of as an active processing prompt as well, it likely stimulated active processing to a lesser degree than the active processing of instructional explanation prompts.
After they had finished processing the basic instructional explanations, the learners proceeded to the two examples of the previously introduced basic content (i.e., principles and concepts). Each of these examples (between 44 and 73 words in length) included one graphic that was based on graphics that are typically used in high-school chemistry textbooks (see Fig. 1). Each example was provided together with a self-explanation prompt that asked the students to explain the respective example by using the content of the basic instructional explanations. For example, the learners had to "Answer the following question based on the principles of the formation of ions you worked on above: Why do sodium atoms form positive ions?".
In the conditions in which the self-explanation prompts were provided in a closed-book format, the basic instructional explanations were hidden once the students proceeded to the examples and self-explanation prompts (see Fig. 1). Thus, the learners in these conditions had to retrieve the idea units that were needed to make sense of the examples from memory. By contrast, in the conditions in which the self-explanation prompts were provided in an open-book format, the basic instructional explanations were still presented on the screen when the learners proceeded to the examples. Thus, these learners were not dependent on retrieval from memory in generating self-explanations. In order to prevent test expectation effects (e.g., Agarwal and Roediger 2011), the learners were not informed about the format of self-explanation prompts in advance. The learners typed their prompt responses into text boxes that were positioned next to the examples. The copy and paste commands were disabled so that the learners had to actively type their text box entries.

Instruments and measures
Pretest: assessment of prior knowledge
A pretest assessed the participants' prior knowledge using 10 open-ended questions. Four questions assessed very basic prior chemistry knowledge (e.g., "What is an electron?"). The other six items directly related to the three content topics of the learning environment. For instance, the learners were asked "How are atoms structured according to the Bohr model?" or "Why do sodium atoms form positive ions?".
Using a scoring protocol, two independent raters (blind to the conditions) scored the responses of 20 learners. For each of the questions, the number of correct arguments was determined (1 point per argument; incomplete but correct arguments were awarded 0.5 points). Interrater reliability, as determined by the intraclass correlation coefficient (absolute agreement), was very good for each of the 10 questions (all ICC > .85). Thus, only one rater scored the rest of the written answers. For the later analyses, we summed up the points over all 10 questions (Cronbach's α = .79).
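The scoring rule and the internal-consistency index mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the function names and the toy data are hypothetical, and Cronbach's α is computed with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of sum scores).

```python
from statistics import variance


def score_answer(n_complete, n_incomplete):
    """Score one answer: 1 point per correct argument,
    0.5 points per incomplete but correct argument."""
    return 1.0 * n_complete + 0.5 * n_incomplete


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-participant item-score lists.

    item_scores[p][j] is the score of participant p on question j.
    """
    k = len(item_scores[0])  # number of questions
    # Variance of each question across participants.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Variance of the participants' sum scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: three participants, two perfectly consistent items.
toy_scores = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
alpha = cronbach_alpha(toy_scores)
example_score = score_answer(n_complete=2, n_incomplete=1)
```

In the toy data the two items rank participants identically, so α reaches its maximum of 1; real item sets, such as the 10 pretest questions, yield lower values.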
Prompt responses: assessment of learning processes
In order to gain insight into the learning processes the learners executed throughout the example-based learning sequence, we separately analyzed the entries in the two types of text boxes (i.e., the text boxes that were provided together with the basic instructional explanations and the text boxes that were provided together with the examples). Regarding instructional explanation processing, the overwhelming majority of learners' text box entries consisted of restatements of the main content items that were included in the basic instructional explanations (e.g., facts such as "Each atom has a nucleus.", "All atoms have protons." and "Protons are inside the nucleus."). These content items were subsumed under the category covered main content items. There were also some segments in which the learners related the content of the basic instructional explanations to their prior knowledge (e.g., the atom "[…] wants to be moved to the state of a noble gas."). However, segments that fell into this category of integration/elaboration were very rare (< 1% of all segments).
In terms of example processing, most of the text box entries consisted of relations between the basic instructional explanations and the examples that were generated by the learners during self-explaining. These segments were subsumed under the category of self-explanations. In some instances, the learners referred to knowledge that was not communicated by the basic instructional explanations in their self-explanations (e.g., "Sodium has three electron shells because it is in the third period."). As these references to prior knowledge were very rare (less than one reference per participant), these self-explanations were not differentiated from the self-explanations that solely referred to content of the basic instructional explanations.
Two independent raters analyzed the entries of 20 students. To distinguish between the different content items and interrelations, they used a list of separately coverable content items and interrelations for each basic instructional explanation and for each example. As the examples that were to be explained partly included multiple features that could be related to the basic instructional explanations (e.g., in explaining why sodium atoms react with fluoride atoms at a ratio of 1:1, the learners needed to refer to knowledge components regarding the occupation of electron shells and regarding principles of ionic reactions), it was possible that the learners generated more than one self-explanation per example. Specifically, each generated interrelation between an example and the previous basic instructional explanations that did not overlap with previously generated interrelations was counted as a self-explanation. In view of the findings that both erroneous processing of basic instructional explanations (e.g., Roelle et al. 2015) and erroneous self-explanations (e.g., Berthold and Renkl 2009) can detrimentally affect learning outcomes, segments that included errors were not scored. Interrater reliability was very good for both the basic instructional explanation and the example processing categories (all ICC > .85).
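The counting rule for self-explanations (each non-overlapping, error-free interrelation counts once) can be illustrated with a small sketch. The data structure, the interrelation IDs, and the function name are hypothetical; in the study, human raters applied the rule using their list of separately coverable interrelations.

```python
def count_self_explanations(segments):
    """Count distinct, error-free interrelations in one learner's entry.

    segments: list of (interrelation_id, has_error) tuples, where
    interrelation_id identifies an entry in the raters' list of
    separately coverable interrelations (IDs here are hypothetical).
    """
    seen = set()
    for interrelation_id, has_error in segments:
        if has_error:
            continue  # segments that include errors are not scored
        seen.add(interrelation_id)  # a repeated interrelation counts once
    return len(seen)


# Hypothetical coded entry for the sodium/fluoride example.
example_entry = [
    ("shell_occupation", False),
    ("ionic_reaction_ratio", False),
    ("shell_occupation", False),  # overlaps with an earlier segment
    ("noble_gas_state", True),    # contains an error, so it is skipped
]
n_self_explanations = count_self_explanations(example_entry)
```

Here the learner would be credited with two self-explanations: the repeated interrelation is counted once and the erroneous segment is dropped.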
Posttest: assessment of learning outcomes
A posttest assessed the students' conceptual knowledge using 18 open-ended questions (six items per topic). Six of the 18 questions were identical to questions used in the pretest, while the other 12 items were new. Four of the new questions were similar to the self-explanation prompts (e.g., "Why does beryllium form a positive ion?"), whereas the other eight came in new formats such as "This example about the formation of boron ions contains a mistake. Give reasons why this example is incorrect and explain how a boron ion is correctly formed." Based on a scoring protocol that was analogous to the one used in the pretest, two independent raters (blind to the conditions) scored the answers of 20 students. Interrater reliability was very good for each of the 18 questions (all ICC > .85). For the later analyses, the points were summed up over all 18 questions (Cronbach's α immediate = .84; Cronbach's α delayed = .87).

Procedure
The experiment took place in the students' regular classrooms. Throughout the entire experiment, the students worked individually at a computer. First, they took the pretest. Second, they were provided the basic instructional explanations (with or without active processing of instructional explanation prompts) and examples with self-explanation prompts (closed-book or open-book format) regarding the three topics. All learners took a posttest immediately after the learning phase as well as one week later. After they had completed the immediate posttest, the learners were instructed not to deal with the learning content and not to talk to each other about the learning content until the end of the second session (delayed posttest). At the end of the second session, all learners indicated that they had complied with this instruction. The time spent on the posttest was limited to 36 min for both the immediate and the delayed posttest. Table 1 shows the mean scores and standard deviations for the four groups on all measures of the study. An α-level of .05 was used for all tests.

Preliminary analyses
To analyze whether the random assignment had resulted in comparable groups, we first compared the students' pretest scores and chemistry grades (all students were in classes of high-track high schools [i.e., Gymnasium in Germany], and thus the approaches to award grades should be comparable among all participants). We did not find any statistically significant effects of condition, F(3, 93) = 0.22, p = .876, ηp² = .00, and F(3, 93) = 1.15, p = .331, ηp² = .03. Hence, the groups did not significantly differ concerning these important learning prerequisites. To reduce error variance, both variables were included as covariates in our subsequent analyses. The assumption of homogeneous within-group regression slopes was not violated in any of the analyses (see Footnote 3).

Learning processes
We were interested in whether closed-book self-explanation prompts would elicit fewer self-explanations than open-book self-explanation prompts when the learners were not prompted to actively process the main content of the basic instructional explanations before they proceeded to the examples and self-explanation prompts (hypothesis 1a). Furthermore, we sought to find out whether the potential detrimental effect of closed-book self-explanation prompts would be mitigated when learners were prompted to actively process the main content of the basic instructional explanations beforehand (hypothesis 1b).
In terms of covariates, we found a statistically significant effect of the pretest score but not of the chemistry grade, F(1, 91) = 32.11, p < .001, ηp² = .26, and F(1, 91) = 0.24, p = .620, ηp² = .00. Concerning main effects, we did not find a statistically significant effect of either the format of self-explanation prompts, F(1, 91) = 1.50, p = .219, ηp² = .01, or the active processing of instructional explanation prompts, F(1, 91) = 0.15, p = .694, ηp² = .00. However, there was a statistically significant interaction effect, F(1, 91) = 4.36, p = .040, ηp² = .04. The pattern of the interaction effect is shown in Fig. 2a. Probing the interaction revealed that the closed-book format reduced the number of generated self-explanations for the learners who did not receive active processing of instructional explanation prompts, F(1, 44) = 4.44, p = .041, ηp² = .09, but not for the learners with active processing of instructional explanation prompts, F(1, 45) = 0.40, p = .526, ηp² = .00. We also analyzed whether the active processing of instructional explanation prompts actually fostered the degree to which the students actively processed the main content items of the basic instructional explanations before they received the examples and self-explanation prompts. Regarding the number of covered main content items, the ANCOVA revealed a statistically significant main effect of active processing of instructional explanation prompts, F(1, 93) = 15.98, p < .001, ηp² = .14. The learners who received the active processing of instructional explanation prompts covered the main content items to a higher extent in their text box entries than their counterparts. The covariate chemistry grade was a statistically significant predictor in this model, F(1, 93) = 4.40, p = .039, ηp² = .04, whereas the pretest was not a statistically significant predictor, F(1, 93) = 2.62, p = .109, ηp² = .02. With respect to the degree to which the students related content items of the basic instructional explanations to their prior knowledge, there was no statistically significant effect of the active processing of instructional explanation prompts, F(1, 93) = 0.00, p = .927, ηp² = .00. Furthermore, none of the covariates entailed a statistically significant effect (both Fs < 1).
3 Please note that we did not conduct multilevel analyses because the participants were individually randomly assigned to the conditions and all dependent variables were measured on the individual level. Hence, even if the classrooms from which the participants stemmed would have differed concerning their performance, the pattern of results should not be biased due to the potential influence of the L2-variable classroom. Furthermore, before we addressed our main hypotheses, which were related to the dependent variables generated self-explanations and performance on the immediate and delayed posttest by conducting three separate ANCOVAs, we conducted a MANCOVA that included all three main dependent variables. This MANCOVA revealed statistically significant effects of both covariates, F(3, 75) = 6.78, p < .001, ηp² = .21 for chemistry grade, and F(3, 75) = 11.45, p < .001, ηp² = .31 for pretest score, but no statistically significant main effects, F(3, 75) = 1.79, p = .155, ηp² = .06 for the active processing of instructional explanation prompts, and F(3, 75) = 0.52, p = .667, ηp² = .02 for the format of self-explanation prompts. By contrast, there was a statistically significant interaction effect, F(3, 75) = 3.05, p = .034, ηp² = .10.
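The effect sizes reported with each F test can be cross-checked from the test statistic alone, because for a fixed-effects F test partial eta squared equals (F × df_effect) / (F × df_effect + df_error). The following Python sketch illustrates this identity. It is a reading aid rather than part of the reported analysis, and the function name is hypothetical. Small deviations in the second decimal can arise because the published F values are themselves rounded.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Hypothetical helper for illustration; it assumes a fixed-effects F test.
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Pretest covariate in the self-explanation ANCOVA: F(1, 91) = 32.11
eta_pretest = partial_eta_squared(32.11, 1, 91)

# Simple effect of prompt format without active processing prompts: F(1, 44) = 4.44
eta_simple_effect = partial_eta_squared(4.44, 1, 44)

print(round(eta_pretest, 2), round(eta_simple_effect, 2))  # prints: 0.26 0.09
```

Both recomputed values agree with the effect sizes reported for these two tests.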

Learning outcomes
We predicted that when learners were not prompted to actively process the main content items of the basic instructional explanations before they received the examples and self-explanation prompts, open-book self-explanation prompts would yield higher learning outcomes than closed-book self-explanation prompts on an immediate posttest (hypothesis 2a). Furthermore, we hypothesized that this effect would be mitigated when the posttest was delayed (hypothesis 3a). When the learners were prompted to actively process the main content of the basic instructional explanations in the initial study phase, we did not expect to find a significant effect of the format of prompts regarding immediate posttest performance (hypothesis 2b). In terms of delayed posttest performance, we assumed that closed-book prompts would be more effective than open-book prompts (hypothesis 3b).
Fig. 2 Interactions between active processing of instructional explanation prompts (with vs. without) and format of self-explanation prompts (closed-book vs. open-book) regarding the number of self-explanations and immediate posttest scores in Experiment 1. Error bars represent standard errors of the means.
In view of the fact that 14 learners missed the delayed posttest (due to a sports event at one of the schools), which in the above-mentioned mixed ANCOVA reduces the statistical power not only regarding the delayed but also regarding the immediate posttest, we also conducted separate analyses for the immediate posttest and the delayed posttest. Regarding immediate posttest performance, we found statistically significant effects of both the covariate pretest, F(1, 91) = 28.77, p < .001, ηp² = .24, and the covariate chemistry grade, F(1, 91) = 15.82, p < .001, ηp² = .14. In terms of main effects, as in the mixed ANCOVA there was no statistically significant effect of the active processing of instructional explanation prompts, F(1, 91) = 0.08, p = .772, ηp² = .00, and also no statistically significant effect of the format of the self-explanation prompts, F(1, 91) = 1.33, p = .251, ηp² = .01. However, there was a small but statistically significant interaction effect, F(1, 91) = 4.40, p = .039, ηp² = .04. The pattern of the interaction is depicted in Fig. 2b. Probing the interaction revealed that the closed-book self-explanation prompts decreased posttest performance for the learners without active processing of instructional explanation prompts, F(1, 44) = 5.69, p = .021, ηp² = .11, but not for the learners with active processing of instructional explanation prompts, F(1, 49) = 0.35, p = .552, ηp² = .00. For explorative purposes, in the next step we analyzed whether in the groups without active processing of instructional explanation prompts, the higher immediate posttest performance of the learners with the open-book self-explanation prompts was due to the fact that these learners had generated more self-explanations in the learning phase than their closed-book counterparts. In order to address this question, we performed a mediation analysis using Hayes' (2013) SPSS macro PROCESS.
Specifically, we calculated 95% bootstrap percentile confidence intervals of the potential mediation effect from 10,000 bootstrap samples. The results of the mediation analysis are shown in Fig. 3. We found a statistically significant indirect effect via the generated self-explanations, a × b = 3.46 [0.09, 8.31]. Hence, because of the mediating function of the generated self-explanations, the learners in the open-book group had an advantage of 3.46 units over the learners in the closed-book group on the immediate posttest.
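To make the logic of a percentile bootstrap for an indirect effect concrete, the following Python sketch estimates a × b and its 95% confidence interval. It is a simplified illustration and not the PROCESS macro: it omits the covariates, uses synthetic data with arbitrary group sizes and effect sizes, and all function and variable names are hypothetical. Here x codes the prompt format (0 = closed-book, 1 = open-book), m stands for the number of self-explanations, and y for the posttest score.

```python
import random
from statistics import fmean, quantiles


def indirect_effect(x, m, y):
    """Return a*b, where a is the slope of m on x and b is the slope of y on m
    while controlling for x (ordinary least squares with an intercept)."""
    mx, mm, my = fmean(x), fmean(m), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    det = sxx * smm - sxm ** 2
    if sxx == 0 or det == 0:
        raise ZeroDivisionError("degenerate sample")
    a = sxm / sxx                       # path X -> M
    b = (sxx * smy - sxm * sxy) / det   # path M -> Y, controlling for X
    return a * b


def bootstrap_percentile_ci(x, m, y, n_boot=10_000, seed=1):
    """95% percentile interval of a*b from n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(indirect_effect([x[i] for i in idx],
                                             [m[i] for i in idx],
                                             [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples, e.g. only one group drawn
    cuts = quantiles(estimates, n=40)   # 39 cut points in steps of 2.5%
    return cuts[0], cuts[-1]


# Synthetic data for illustration only (not the study's data).
data_rng = random.Random(0)
x = [0] * 24 + [1] * 24
m = [5 + 3 * xi + data_rng.gauss(0, 2) for xi in x]
y = [10 + 1 * xi + 1.2 * mi + data_rng.gauss(0, 2) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)
lower, upper = bootstrap_percentile_ci(x, m, y, n_boot=2_000)  # fewer resamples for speed
print(f"a*b = {point:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes zero, as in the reported result, is what is read as a statistically significant indirect effect.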

Experiment 2
The findings of Experiment 1 indicate that closed-book self-explanation prompts can have detrimental effects as compared to open-book self-explanation prompts. When the learners were not prompted to actively process the main content items of the basic instructional explanations beforehand, the open-book self-explanation prompts were more beneficial in terms of both learning processes (generated self-explanations) and learning outcomes as measured by an immediate posttest (hypotheses 1a and 2a). However, both differences between the closed- and open-book self-explanation prompts decreased and did not reach statistical significance when the learners were prompted to actively process the main content items of the basic instructional explanations before proceeding to the examples and self-explanation prompts (hypotheses 1b and 2b). In terms of delayed learning outcomes, as expected we did not find a significant effect of the prompt format when the learners were not prompted to actively process the main content of the basic instructional explanations (hypothesis 3a). However, contrary to our expectation, the closed-book self-explanation prompts were not superior to the open-book prompts when the learners received active processing of instructional explanation prompts beforehand (hypothesis 3b). The effect sizes and lack of significant interaction effects of the mixed ANCOVA furthermore indicate that the differences concerning the pattern of results between the immediate and delayed posttest were relatively small. Thus, these results should be interpreted cautiously.
The results regarding the groups that were not prompted to actively process the main content items of the basic instructional explanations are in line with previous research that compared closed- and open-book generative learning tasks (Agarwal et al. 2008; Waldeyer et al. 2020). More specifically, the mediation analysis indicates that because it detrimentally affects the number of executed generative learning activities (here: self-explanations), a closed-book format detrimentally affects immediate posttest performance. In terms of delayed posttest performance, the detrimental effect of the closed-book format is substantially decreased; however, even in this case the closed-book format is not superior to an open-book format.
Fig. 3 Results of the mediation analysis in Experiment 1
Our results with respect to the groups that were prompted to actively process the main content items of the basic instructional explanations go beyond previous findings. These results reveal that the detrimental effect of a closed-book format regarding the execution of generative learning activities can be reduced through encouraging learners to engage in active processing in the initial study phase (i.e., before learners engage in retrieval). However, even in this case we did not find a significant superiority of the closed-book prompts. At first glance, these findings question whether closed-book self-explanation prompts can be more beneficial than open-book self-explanation prompts in example-based learning. However, the pattern of results regarding delayed posttest performance should be interpreted with caution because it was confounded by the presence of the immediate posttest. Specifically, the immediate posttest can be conceived of as a further retrieval practice activity, which itself affected learning outcomes. As the groups differed regarding their performance on the immediate posttest, it likely biased the pattern of results of the delayed posttest.
Against this background, in Experiment 2 we pursued two main goals. First, in light of recent emphasis on the importance of replicating novel findings (e.g., Maner 2014; Simons 2014), we aimed at replicating the results of Experiment 1 regarding learning processes (generation of self-explanations) with a second sample of eighth-grade high school students. Second, we aimed at testing the prediction that closed-book self-explanation prompts would be superior to open-book self-explanation prompts when the posttest was delayed and when learners were prompted to actively process the main content items of the basic instructional explanations beforehand (hypothesis 3b) without confounding through an immediate posttest.

Sample and design
We recruited N = 124 eighth-grade students from different German high schools (88 female) as participants for Experiment 2 (as in Experiment 1, students whose first language was not German were excluded from the experiment). They were between 12 and 14 years old (M = 13.24, SD = 0.44) and received €10 for their participation. 4 As in Experiment 1, the learners were randomly assigned to one condition of a 2 × 2 factorial between-subjects design with the factors active processing of instructional explanation prompts (with vs. without) and format of self-explanation prompts (closed-book vs. open-book).

Materials
We used the same learning materials as in Experiment 1.

Instruments and measures
Pretest: assessment of prior knowledge The pretest was identical to Experiment 1. Two raters scored the answers of 20 students (all ICC > .85). In view of the high interrater reliability, only one rater scored the rest of the written answers. The points were summed up over all 10 questions for the later analyses (Cronbach's α = .71).
Prompts responses: assessment of learning processes The students' text box entries were examined using the same procedure as in Experiment 1 with the only exception that the category elaboration/integration was not used for instructional explanation processing because there were no segments that fell into this category. 5 Interrater reliability was very good for both the instructional explanation and the example processing (all ICC > .85).
Posttest: assessment of learning outcomes The posttest was identical to the one used in Experiment 1. Interrater reliability was very good for each of the 18 questions (all ICC > .85). For the later analyses, the points were summed up over all 18 questions (Cronbach's α = .80).
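For readers unfamiliar with the internal-consistency index reported for the summed test scores, Cronbach's α can be computed from the item variances and the variance of the sum score as α = k / (k − 1) × (1 − Σ item variances / variance of the sum score), where k is the number of items. The following Python sketch applies this formula to a small made-up score matrix. The data and the function name are hypothetical and serve only to illustrate the computation.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix with one row per participant
    and one column per item (sample variances, n - 1 in the denominator)."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two participants")
    item_variances = [variance([row[j] for row in scores]) for j in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Made-up points of five participants on three questions (illustration only).
toy_scores = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 4],
    [0, 1, 1],
    [2, 2, 3],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 2))  # prints: 0.97
```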

Procedure
Except for the fact that there was no immediate posttest, the procedure was identical to Experiment 1. At the end of the first session, the learners were instructed not to deal with the learning content and not to talk to each other about the learning content until the end of the second session. At the end of the second session, all learners indicated that they had complied with this instruction. Due to illness, 10 participants missed the delayed posttest. Table 2 shows the mean scores and standard deviations for the four groups on all measures of the study. An α-level of .05 was used for all tests.

Preliminary analyses
Prior to addressing our hypotheses, we first tested whether there were differences between the groups with regard to their pretest scores and chemistry grades. In terms of both variables, we did not find a statistically significant effect of condition, F(3, 120) = 1.22, p = .302, ηp² = .03, and F(3, 120) = 1.24, p = .287, ηp² = .03. Thus, there were no significant differences between the groups with respect to these important learning prerequisites. As in Experiment 1, both pretest score and chemistry grade were included as covariates in the subsequent analyses. For all analyses, the assumption of homogeneous within group regression slopes was not violated. 6

Learning processes
As in Experiment 1, we were interested in whether closed-book self-explanation prompts would elicit fewer self-explanations than open-book self-explanation prompts when the learners were not prompted to actively process the main content items of the basic instructional explanations before they proceeded to the examples and self-explanation prompts (hypothesis 1a). Moreover, we sought to find out whether the potential detrimental effect of closed-book self-explanation prompts would be mitigated when learners received active processing of instructional explanation prompts beforehand (hypothesis 1b).
In terms of covariates, the ANCOVA revealed neither a statistically significant effect of pretest, F(1, 118) = 0.45, p = .500, ηp² = .00, nor of chemistry grade, F(1, 118) = 0.50, p = .481, ηp² = .00. Concerning main effects, there was no statistically significant main effect of the format of self-explanation prompts, F(1, 118) = 2.76, p = .099, ηp² = .02, and also no statistically significant main effect of active processing of instructional explanation prompts, F(1, 118) = 0.47, p = .493, ηp² = .00. However, there was a marginally significant interaction effect, F(1, 118) = 3.83, p = .053, ηp² = .03. Although the interaction effect was merely marginally significant, for explorative purposes we analyzed whether the interaction pattern (see Fig. 4a) corresponded with the pattern that was found in Experiment 1. For the learners without active processing of instructional explanation prompts, the closed-book self-explanation prompts significantly decreased the number of self-explanations, F(1, 58) = 7.65, p = .008, ηp² = .11; this effect was not found for the learners with active processing of instructional explanation prompts, F(1, 58) = 0.01, p = .903, ηp² = .00. We also analyzed whether and to what extent the active processing of instructional explanation prompts increased the degree to which the learners actively processed the main content items of the basic instructional explanations before they received the examples and self-explanation prompts. An ANCOVA showed that the active processing of instructional explanation prompts substantially increased the extent to which the learners covered the main content items of the basic instructional explanations in their text box entries, F(1, 120) = 26.27, p < .001, ηp² = .18. The covariates pretest score and chemistry grade were not statistically significant predictors in this model (both Fs < 1).

Learning outcomes
Regarding learning outcomes, we assumed that when learners were not prompted to actively process the main content items of the basic instructional explanations before they received the examples and self-explanation prompts, there would be no significant difference between open- and closed-book self-explanation prompts in terms of delayed posttest performance (hypothesis 3a). However, when the learners were prompted to actively process the main content items of the basic instructional explanations in the initial study phase, we assumed that closed-book prompts would be superior (hypothesis 3b). With respect to covariates, the ANCOVA indicated neither a statistically significant effect of the chemistry grade, F(1, 108) = 0.05, p = .809, ηp² = .00, nor of the pretest score, F(1, 108) = 0.65, p = .420, ηp² = .00. Regarding main effects, we found a statistically significant effect of active processing of instructional explanation prompts, F(1, 108) = 8.99, p = .003, ηp² = .07. The learners who received active processing of instructional explanation prompts achieved higher scores than the learners who did not receive active processing of instructional explanation prompts. There was no statistically significant main effect of the format of self-explanation prompts, F(1, 108) = 0.33, p = .563, ηp² = .00. However, there was a statistically significant interaction effect, F(1, 108) = 7.10, p = .009, ηp² = .06. The pattern of the interaction effect is shown in Fig. 4b. Probing the interaction showed that the interaction was due to the fact that the closed-book self-explanation prompts significantly decreased posttest performance for the learners without active processing of instructional explanation prompts, F(1, 53) = 11.85, p = .001, ηp² = .18, but not for the learners with active processing of instructional explanation prompts, F(1, 53) = 1.45, p = .232, ηp² = .02.
6 Similar to Experiment 1, before we addressed our main hypotheses, which were related to the dependent variables generated self-explanations and performance on the delayed posttest by conducting separate ANCOVAs, we conducted a MANCOVA that included both main dependent variables. This MANCOVA revealed no statistically significant effects of the covariates, F(2, 107) = 0.64, p = .526, ηp² = .01 for chemistry grade, and F(2, 107) = 0.33, p = .720, ηp² = .00 for pretest score. Further, there was a statistically significant main effect of active processing of instructional explanation prompts, F(2, 107) = 12.71, p < .001, ηp² = .19, but not of the format of self-explanation prompts, F(2, 107) = 1.62, p = .202, ηp² = .02. The interaction effect between the two factors was statistically significant, F(2, 107) = 3.52, p = .033, ηp² = .06.
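The error degrees of freedom of the Experiment 2 ANCOVAs can be traced back to the sample sizes. The following Python sketch shows the arithmetic under the assumption that each model contains the four cells of the 2 × 2 design plus one slope per covariate and that no further participants were excluded. The function name is hypothetical.

```python
def ancova_error_df(n_participants: int, n_cells: int, n_covariates: int) -> int:
    """Error df of a between-subjects ANCOVA: N minus one parameter per design
    cell minus one slope per covariate (hypothetical helper for illustration)."""
    return n_participants - n_cells - n_covariates


n_recruited = 124      # participants in Experiment 2
n_missed_delayed = 10  # participants who missed the delayed posttest

# Self-explanations were recorded in the first session, so all 124 cases count.
df_self_explanations = ancova_error_df(n_recruited, n_cells=4, n_covariates=2)

# The delayed posttest is available for 124 - 10 = 114 learners.
df_delayed_posttest = ancova_error_df(n_recruited - n_missed_delayed, n_cells=4, n_covariates=2)

print(df_self_explanations, df_delayed_posttest)  # prints: 118 108
```

These values correspond to the F(1, 118) tests reported for the generated self-explanations and the F(1, 108) tests reported for delayed posttest performance.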

General discussion
In the present study, we investigated the role of the format (closed-book vs. open-book) of self-explanation prompts in example-based learning. We predicted that closed-book self-explanation prompts would hinder self-explanations when learners were not prompted to actively process the main content items of the basic instructional explanations before they received the examples and self-explanation prompts, which should detrimentally affect immediate but not necessarily delayed posttest performance. When the learners were prompted to actively process the main content items of the basic instructional explanations in the initial study phase, we did not predict a significant difference between closed- and open-book prompts regarding self-explanations and immediate posttest performance. Rather, in this case we predicted a superiority of the closed-book self-explanation prompts concerning delayed posttest performance.

Closed-vs. open-book self-explanation prompts: effects on learning processes
When there were no prompts that elicited active processing of the main content items of the basic instructional explanations that were provided in the first phase of the example-based learning sequence, closed-book self-explanation prompts reduced the number of generated principle-based self-explanations in both experiments (hypothesis 1a). One explanation for this result is that the learners who received the closed-book self-explanation prompts were not able to successfully retrieve all of the required knowledge components of the basic instructional explanations when they engaged in self-explaining. Hence, in comparison to the learners in the open-book groups who could review the basic instructional explanations and thus were not dependent on successful and complete retrieval while self-explaining, the closed-book learners were at a disadvantage when it came to establishing interrelations between the basic instructional explanations and examples. This pattern of results is consistent with previous findings regarding the effects of the format (open-book vs. closed-book) of generative learning tasks (see Agarwal et al. 2008; Blunt and Karpicke 2014; Waldeyer et al. 2020).
In both experiments, we also found that the detrimental effect of the closed-book self-explanation prompts was substantially mitigated when the learners were prompted to actively process the main content items of the basic instructional explanations that were provided in the first step of the example-based learning sequence (hypothesis 1b). An explanation for this finding, which goes beyond previous research, is that the active processing that was elicited by the prompts enhanced the quality of the mental representations that the learners formed when processing the basic instructional explanations. Based on generative learning theory, it is reasonable to assume that these enhanced mental representations fostered learners' retention and understanding of the main content of the basic instructional explanations (e.g., Fiorella and Mayer 2016; Kintsch et al. 1990; Roelle and Nückles 2019). Hence, by fostering active processing, the prompts decreased the forgetting rates of the knowledge components of the basic instructional explanations which, in turn, helped the closed-book learners to generate nearly the same number of principle-based self-explanations as their open-book counterparts. Notably, these consistent results were found even though the samples in Experiments 1 and 2 substantially differed in terms of prior knowledge. Although implicitly already reflected in the fact that the assumption of homogeneous within group regression slopes was not violated in any of the analyses for the covariate pretest score, this result suggests that prior knowledge does not seem to be a substantial moderator of the effects of the format of self-explanation prompts in example-based learning.
It is important to highlight, however, that the (marginally) significant interactions regarding the generation of self-explanations that were found in both experiments likely were due in part to an unexpected side effect of the active processing of instructional explanation prompts for the learners with open-book self-explanation prompts. More specifically, an inspection of the interaction patterns in both experiments suggests that the active processing of instructional explanation prompts detrimentally affected the generation of self-explanations on the part of the open-book learners (see Figs. 2a, 4a). One explanation for this pattern of results could be that responding to the active processing of instructional explanation prompts was exhausting, which might have led the learners who received open-book self-explanation prompts to respond to the self-explanation prompts in a relatively economic manner. Consequently, in comparison to their counterparts without active processing of instructional explanation prompts, they did not exploit the full potential of the open-book self-explanation prompts.
Against this background, the lack of a difference between the closed- and open-book self-explanation prompts for the groups that received active processing of instructional explanation prompts beforehand should be interpreted with caution. Future studies should test whether active processing of instructional explanation prompts would also be sufficient to prevent detrimental effects of closed-book self-explanation prompts when learners are allowed to take breaks in order to recover from the potential exhaustion that is due to responding to active processing of instructional explanation prompts. In these future studies, it would furthermore be useful to use larger sample sizes and conduct a priori power analyses, which were lacking in the present study. In terms of the low to medium interaction effects regarding generated self-explanations, the power of our experiments to detect these effects was relatively low, which reduces the interpretability of the respective findings.

Closed-vs. open-book self-explanation prompts: effects on learning outcomes
In terms of learning outcomes, our findings did not fully support our hypotheses. More specifically, in line with hypothesis 2a we found that when the learners were not prompted to actively process the main content items of the basic instructional explanations that were provided in the first step, the closed-book self-explanation prompts yielded lower performance on the immediate posttest than the open-book self-explanation prompts. However, contrary to hypothesis 3a, this inferiority of the closed-book prompts was not consistently mitigated when the posttest was delayed; at least when there was no confounding through an immediate posttest, the closed-book learners were outperformed by the open-book learners regarding delayed posttest performance (Experiment 2).
One explanation for this pattern of results is that the engagement in retrieval practice on the part of the closed-book learners did not fall on fertile ground. As these learners generated fewer self-explanations than their counterparts, their mental representations of the learning content likely were of relatively low quality after the learning phase. Consolidating these deficient mental representations might have entailed relatively little benefit regarding learning outcomes, which resulted in an inferiority of the closed-book learners even at the delayed posttest. An alternative explanation could be that the retention interval that was used in the present study was too short. For instance, Rummer et al.'s (2017; see also Rummer et al. 2019) results suggest that when retrieval practice tasks are compared to relatively strong control conditions (such as an open-book equivalent), their benefits might be found only after longer delays (e.g., after two weeks). Hence, in future studies it might be fruitful to test whether closed-book self-explanation prompts (even when they are not preceded by active processing of instructional explanation prompts) might entail beneficial effects in comparison to open-book prompts after longer retention intervals.
When the learners were prompted to actively process the main content of the basic instructional explanations beforehand, we found a different pattern of results. Specifically, when there was no confounding through an immediate posttest, we found that the closed-book group was not outperformed by the open-book group regarding delayed posttest performance in Experiment 2 (hypothesis 3b). One explanation for this pattern of results is that the active processing of instructional explanation prompts prevented the closed-book learners from generating fewer self-explanations than their open-book counterparts. Consequently, the quality of the mental representations after the learning phase should have been equal for the closed- and the open-book learners. Yet even under these circumstances, the additional retrieval on the part of the closed-book learners scarcely paid off. As mentioned above, one explanation for why the benefit of the closed-book format was relatively small and did not reach statistical significance even when the closed-book learners did not generate fewer self-explanations than their open-book counterparts could be that the retention interval was too short. A further explanation could be that the delay between processing the basic instructional explanations and responding to the self-explanation prompts was too short as well. In the retrieval-based learning literature, there is evidence which suggests that the benefits of engaging in retrieval practice increase with an increasing delay between the initial study phase and the retrieval phase because the temporal context hardly changes when the delay is short. Consequently, little context updating takes place and thus the process of retrieval hardly contributes to building distinctive context cues that can be used to retrieve the information in the future (see Karpicke et al. 2014). Hence, in future studies it could be useful to analyze the benefits of closed-book self-explanation prompts in settings with longer delays between processing the basic instructional explanations and self-explaining the examples.
In addition to exploring the outlined potential means to enhance the benefits of closed-book self-explanation prompts, it could also be fruitful to investigate potential optimizations of open-book self-explanation prompts. In the present study, although the copy and paste commands were disabled, the open-book learners could nevertheless exactly type or closely paraphrase content items of the basic instructional explanations. In view of the finding that paraphrasing is less beneficial than engaging in deep-oriented generative learning activities (e.g., Hausmann and VanLehn 2010), a consequence of the learners' engagement in exactly typing and paraphrasing could be that they did not exploit the full potential of the open-book self-explanation prompts. Informing learners about the low effectiveness of copying and paraphrasing during self-explaining could be a viable means to reduce their engagement in these activities and could thus further enhance the benefits of open-book self-explanation prompts. Future studies should address this potential optimization of open-book self-explanation prompts.
A further limitation that should be addressed in future research relates to the prompts that were designed to elicit active processing of the basic instructional explanations' main content. As stated in the Method section of Experiment 1, these prompts were merely designed to engage learners in attending to the basic instructional explanations' main content items but scarcely required learners to deeply engage with the provided content.

Consequently, the learners in the groups with and without active processing of instructional explanation prompts differed only concerning the quantity of covered content items but not concerning the types of learning processes, because the learners without the active processing of instructional explanation prompts also engaged in active processing of the instructional explanations' content to some degree (i.e., ca. 10-13 covered idea units, see Tables 1 and 2). In view of this pattern of results and the finding that prompts that require learners to generate inferences or elaborations and thus go beyond the provided information can be substantially more effective than active processing prompts that merely require attending processes (e.g., Chi 2009; Roelle et al. 2015), our active processing of instructional explanation prompts likely were suboptimal. Although the prompts were sufficient to substantially decrease the detrimental effect of closed-book self-explanation prompts concerning the generation of self-explanations, it would be interesting to test whether the example-based learning sequence that was used in the present study could be further optimized by increasing the depth of the processing of the basic instructional explanations.

Conclusions
The format of self-explanation prompts clearly matters. When learners are not encouraged to actively process the main content items of the basic instructional explanations before they receive the examples and self-explanation prompts, open-book self-explanation prompts elicit more self-explanations than closed-book self-explanation prompts, which can beneficially affect both immediate posttest performance (Experiment 1) and delayed posttest performance (Experiment 2). By contrast, when learners are encouraged to actively process the main content of the basic instructional explanations beforehand, open-book and closed-book self-explanation prompts do not significantly differ concerning the number of elicited self-explanations. However, even in this case open-book self-explanation prompts are not less beneficial than closed-book self-explanation prompts concerning learning outcomes, at least when the retention interval does not exceed one week. We conclude that instructors should provide learners with open-book prompts for self-explanation because in case of sufficient processing of the basic instructional explanations they are not less beneficial and in case of insufficient processing of the basic instructional explanations they are more beneficial than closed-book self-explanation prompts.

Informed consent Written informed parental consent was given for all participants.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.