I review six popular GLSs that are used to improve students’ learning: generating concept maps, generating explanations, generating predictions, generating questions, generating answers (i.e., practice testing), and generating drawings. Each review consists of three sections: The first section briefly describes how the technique is commonly implemented and how it is supposed to improve students’ learning. The second section reviews the available evidence regarding the technique’s effectiveness in improving learning in various age groups. To provide a balanced evaluation of the evidence base, I relied on existing meta-analyses whenever possible. The third section reviews the literature concerning age-related differences in the strategy’s effectiveness. While this review is clearly influenced by the evaluation provided in the second section, the third section particularly reviews studies that tested learners of various ages. It, thus, aims at evaluating evidence for an effectiveness trajectory of a particular strategy.
Generating Concept Maps
How Is It Supposed to Work?
Concept maps depict the hierarchy of and relations among concepts (for a toy example of a concept map, see Fig. 1). This technique is based on Ausubel’s (1960) assimilation theory of cognitive learning and was developed by Novak and colleagues as an instructional tool to represent knowledge structures and to promote meaningful learning (e.g., Novak 1990). According to Ausubel, meaningful learning can be achieved by anchoring new ideas or concepts with previously acquired knowledge. Concept maps thus serve the goal of activating relevant prior knowledge and thereby providing an “optimal anchorage for the learning material” (Ausubel 1960, p. 271).
Concept maps come in various forms and can be used in various ways (for an overview, see O’Donnell et al. 2002). They can range from being entirely pre-generated (typically by the teacher) to being entirely student generated. Because the focus of this review is on generation by students, it is limited to studies in which students at least partly generated the maps themselves; that is, they either modified a map they were given or generated the map entirely by themselves. Furthermore, in line with the constructivist view of learning, meta-analyses suggest that self-generating a concept map is more effective than studying a provided concept map (Nesbit and Adesope 2006; Schroeder et al. 2018; but see Horton et al. 1993). The implementation of this technique also varies in when it is employed. In the tradition of Ausubel (1960), concept maps are often used as advance organizers, intended to activate relevant prior knowledge so as to prime students for learning new information. Additional uses, not targeted by this review, are as an activity-closing summary and as an evaluation to test students’ knowledge.
For What Ages Has Its Effectiveness Been Demonstrated?
Several reviews and meta-analyses indicate that generating concept maps is generally beneficial for learning (e.g., Horton et al. 1993; Nesbit and Adesope 2006; Novak 1990). The most recent and comprehensive meta-analysis by Schroeder et al. (2018), which clearly distinguished self-generated and provided concept maps, revealed that the mean size of the effect of self-generated concept mapping on students’ achievement scores was 0.72 SDs. As can be expected, this beneficial effect was greater for studies that compared generating concept maps to passive activities such as hearing a lecture (1.05 SDs) than for studies that included active control conditions such as summarizing (0.48 SDs). The meta-analysis also looked at whether children’s grade (a proxy for age) influenced the effectiveness of having them generate concept maps. Results indicated consistent beneficial effects across studies performed with 4th–8th graders (0.68 SDs), 9th–12th graders (0.74 SDs), and college students (0.73 SDs).
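Effect sizes throughout this review are expressed in standard deviation units (SDs). The following is a minimal sketch of what such a number means, assuming a simple two-group comparison and Cohen's d with a pooled standard deviation. The cited meta-analyses may use a bias-corrected variant (such as Hedges' g) and weight studies by precision, so this is an illustration rather than their procedure. The function name and the scores are hypothetical and invented for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference: (mean_t - mean_c) / pooled sample SD."""
    n_t, n_c = len(treatment), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical test scores (not data from any cited study).
concept_map_group = [14, 16, 15, 18, 17]
control_group = [13, 15, 14, 17, 16]

d = cohens_d(concept_map_group, control_group)
print(f"Effect size: {d:.2f} SDs")
```

With these invented scores, the groups differ by one point and the pooled SD is about 1.58 points, so the effect is roughly 0.63 SDs. An effect of 0.72 SDs can be read the same way: the average student in the generation condition scored about 0.72 pooled standard deviations above the average student in the comparison condition.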
The youngest children in the studies included in the meta-analyses were fourth-graders. What about children below fourth grade? Novak (1990, p. 37) writes that, based on his experience, concept maps are successful only in children from grade 4 onwards, but no empirical data were provided to bolster this claim. Furthermore, generating concept maps that contain written language is, of course, constrained by children’s ability to write, which makes this technique unlikely to work in young children. In summary, results indicate that there is good evidence that letting students generate concept maps is generally effective from grade 4 onwards, while there is a clear lack of evidence regarding its effectiveness in younger students.
Are There Age-Related Differences in Its Effectiveness?
Concept maps can be implemented in various ways and with varying instructional support, which may impact their effectiveness in different age groups. Direct evidence for this idea has been provided in a study comparing concept mapping among university and high-school students, which revealed a map coherence × age group interaction (Gurlitt and Renkl 2008). University students profited more when they received a low-coherence map that required them to create and label lines and thus called for more self-organization of the material being learned, whereas high-school students profited more when they received a high-coherence map that required only labeling existing lines, and thus provided a highly structured learning environment and more support for the learner. Differences in prior knowledge between the groups were not assessed in that study, but are likely to have contributed to the observed interaction. Nevertheless, the study indicates that learners of different ages need different amounts of support for concept-map generation to be maximally effective.
Further support for this conclusion was provided in a study with late-elementary-school children that varied the level of support provided during concept mapping (Karpicke et al. 2014). Children in the no-support condition (experiment 1) had to generate the entire concept map by themselves, which led to poor concept maps and did not benefit their performance on a later test in which they had to retrieve key concepts of the target material. Children who only had to fill in parts of an existing concept map (experiments 2 and 3) did significantly better than children in the no-support condition and children who did not work on concept maps at all.
To conclude, existing evidence suggests that—among students grade 4 and older—concept-map generation can be similarly effective in students of different ages. However, demands on prior knowledge, working memory, and self-regulation during mapping vary strongly depending on the complexity of the concept and on the support given by pre-existing structures. Initial evidence suggests that a suboptimal implementation of the technique for a particular age group can render it ineffective, and that younger learners generally need more support than older learners. There is a clear need for further studies that systematically vary the amount of information to be generated within age groups and then compare the results across different age groups.
Generating Explanations
How Is It Supposed to Work?
Researchers have suggested that asking students to generate explanations during learning activates relevant prior knowledge and facilitates integration and organization of new information (Chi et al. 1994; Pressley et al. 1992; Renkl et al. 1998). This strategy further encourages elaboration of the new information, such as processing similarities and differences among elements of the to-be-learned content (Dunlosky et al. 2013). Such processing has been shown to be beneficial for memory (Hunt and Lamb 2001). Furthermore, asking students to provide explanations is supposed to prompt them to generate inferences that go beyond the given information and to revise their mental models (Chi 2000). Generating explanations, thus, requires reasoning abilities in that additional information has to be inferred or deduced on the basis of prior knowledge.
The strategy of student-generated explanations can be implemented in various formats. Most research on this strategy has focused either on elaborative interrogation or on self-explanation. Although both involve answering questions during learning, elaborative interrogation refers to explaining something to other people and focuses on “Why?” questions that are directly related to understanding the phenomenon at hand (see Pressley et al. 1992). Self-explanation refers to explaining something to oneself and includes a much wider array of questions, ranging from simple “Why?” questions, as used in elaborative interrogation, to metacognitive questions about the learning process (Dunlosky et al. 2013). Typically, self-explanation is performed silently, such as during silent reading (e.g., Chi et al. 1989; McNamara 2004). Because the focus of this review is on GLSs, I discuss only those self-explanation studies that used questions intended to prompt reflection on the to-be-learned content. In this case, self-explanation requires cognitive mechanisms similar to those required by elaborative interrogation, which is why I consider the two together. However, it deserves mentioning that explaining something to other people may be more demanding than explaining something to oneself (Pressley et al. 1992).
For What Ages Has Its Effectiveness Been Demonstrated?
Beneficial effects of both elaborative interrogation and self-explanation have been reported for various student groups and under various learning conditions (for an overview, see Dunlosky et al. 2013). A recent meta-analysis on self-explanation (Bisra et al. 2018) that did not include unpublished studies revealed a mean effect size of 0.55 SDs when conditions in which participants were instructed to explain content to themselves were compared with conditions in which they were not instructed to do this. Focusing on the domain of mathematics instruction, a recent meta-analysis of the literature on self-explanation prompts revealed a mean effect size of 0.33 SDs for short-term improvement in conceptual knowledge (Rittle-Johnson et al. 2017). However, less is known about the persistence of these effects, and only a few studies equated time on task (cf. Rittle-Johnson et al. 2017). When considering only studies in which time on task was equated, the meta-analysis by Bisra et al. (2018) found an effect size of 0.41 SDs.
Bisra et al. (2018) reported a slightly reduced effect size of this strategy for elementary-school students (0.48 SDs, based on 10 studies) and high-school students (0.43 SDs, based on 13 studies) compared with university students (0.61 SDs). Most of the studies included in this comparison did not control for time on task, however, which suggests that these effect sizes may be overestimated. In the meta-analysis by Rittle-Johnson and colleagues (2017) on mathematics instruction, none of the studies that were performed with elementary-school children revealed lasting beneficial effects on conceptual knowledge. In summary, while asking students to generate explanations has proven beneficial across a wide variety of learning situations, evidence is mixed as to how much elementary-school children benefit from it.
Are There Age-Related Differences in Its Effectiveness?
Because generating an explanation for novel phenomena requires at least some reasoning abilities, it could be expected to disadvantage younger students. Indeed, it is not until around age 9 that children are consistently able to generate inferences based on deep structural similarity (Gentner and Toupin 1986), and this limitation should hamper their ability to generate explanations based on underlying properties of to-be-learned material. However, laboratory experiments with children as young as 3 years of age (e.g., Legare and Lombrozo 2014) have demonstrated a beneficial effect of this strategy on conceptual understanding. Furthermore, in classroom studies with elementary-school children, beneficial effects of explanation have repeatedly been observed (e.g., Pine and Messer 2000; for overviews, see Dunlosky et al. 2013; Pressley et al. 1992).
Turning to studies that included a wider age range, Rittle-Johnson (2006) compared effects of self-explanation and direct instruction in a sample of third-, fourth-, and fifth-graders and found comparable improvements in conceptual knowledge for the two instructional methods. In a sample of second-, third-, and fourth-graders, McEldoon et al. (2013) compared a self-explanation condition with two conditions in which students had to solve additional problems instead of generating explanations. In one of these comparison conditions, time on task was equated by letting children solve additional practice problems, and in the other comparison condition, it was not. Whereas the self-explanation condition yielded larger gains in conceptual knowledge than the non-time-equated control condition, there was no difference in gains between the self-explanation condition and the time-equated control condition. It deserves mentioning that solving additional practice problems can be considered a generative activity as well and, thus, constitutes a rigorous control condition. One could, thus, interpret this finding as providing some evidence that generating explanations can be effective relative to non-generative activities already in elementary-school children.
Turning to research on elaborative interrogation, a study in fourth- to eighth-graders (Wood et al. 1990) reported beneficial effects of elaborating on “Why?” questions related to a text relative to control conditions in which children just read the text or were provided with elaborative answers. While the study found a strong link between students’ benefit from elaborative interrogation and the quality of their generated answer, which is likely higher in older students, it did not analyze potential age-related differences. Studies that investigated younger children did not find benefits of elaborative interrogation (Miller and Pressley 1989; Wood et al. 1993), which led Pressley et al. (1992, p. 103) to conclude that the beneficial effects of elaborative interrogation likely increase from early childhood to adulthood, and that this might be due to age-related increases in the extent and accessibility of prior knowledge. In conclusion, the available literature suggests an age-related increase in the benefit from generating explanations. This conclusion remains clearly speculative, however, because of the lack of research directly comparing results obtained with the same task in different age groups.
Generating Predictions
How Is It Supposed to Work?
In this technique, teachers ask learners to generate a prediction (often called hypothesizing in science classes) about a specific fact or outcome before providing them with the to-be-learned information. Generating a prediction requires accessing prior knowledge and connecting it to the new information being learned (Schmidt et al. 1989). Furthermore, generating a prediction may stimulate curiosity about the correct answer (Brod and Breitwieser 2019; Potts et al. 2019), and if the correct answer differs from the prediction, learners experience surprise (Brod et al. 2018). Both curiosity and surprise are epistemic emotions that are supposed to lead to increased attention to the to-be-learned information, which strengthens learning (D’Mello et al. 2014). Whereas generating a prediction requires retrieving prior knowledge, learning from a prediction requires processing feedback on that prediction (Vaughn and Rawson 2012). Monitoring feedback on a prediction has been shown to require at least basic executive-function abilities (Brod et al. 2019).
For What Ages Has Its Effectiveness Been Demonstrated?
Asking students to generate predictions has been successful in classroom studies that have investigated ways to improve students’ learning in various fields of study, including learning from text (Fielding et al. 1990; Palincsar and Brown 1984), physics (Champagne et al. 1982; Inagaki and Hatano 1977; Liew and Treagust 1995), and biology (Schmidt et al. 1989). In addition, asking students to generate predictions is a component of teaching with audience response systems, which have been shown to enhance students’ engagement in university lectures and learning from them (e.g., Crouch et al. 2004). However, these classroom studies have often used generating predictions along with other tasks that were intended to foster learning, which makes it difficult to tell how much of the observed benefit was uniquely due to predicting. A recent laboratory study that examined the specific effects of generating predictions among university students revealed that this strategy led to greater learning of geography than did another generative activity (Brod et al. 2018). However, because that study lacked a control condition without generative activities, it does not allow an assessment of the effectiveness of generating predictions relative to more passive learning. The heterogeneity of the studies in this area might further explain why no meta-analysis is available that would allow drawing conclusions regarding the general effectiveness of generating predictions.
In contrast to other GLSs, however, many of the studies on the effects of generating predictions have been performed with children and adolescents. Inagaki and Hatano (1977) taught fourth-graders about the conservation law in physics and observed beneficial effects of letting them generate predictions first. Similar beneficial effects were revealed in studies on text comprehension (Fielding et al. 1990; Palincsar and Brown 1984), which were conducted with third-graders and seventh-graders, respectively. Brod et al. (2019) tested third- to fifth-graders on a belief-revision task and found that a prediction-generation condition improved belief revision compared with a no-generation control condition. Similarly, a facts-learning study with second-graders revealed memory benefits of generating a prediction before seeing the correct answer relative to control conditions without the prior prediction (Marsh et al. 2012). In summary, asking students to generate predictions has proven beneficial across a wide variety of ages and learning situations relative to passive control conditions. However, the lack of meta-analyses means that the robustness of the effect as well as the role of potential moderators such as prior knowledge or learning material is currently unclear.
Are There Age-Related Differences in Its Effectiveness?
Metcalfe and Kornell (2007) compared sixth-graders and college students on a definition-learning task in which participants had to generate (mostly incorrect) definitions before seeing the correct one. Similarly for both age groups, this prediction condition yielded better learning than control conditions in which the correct definition was presented at the same time as the word or in which no correct definition was presented (i.e., no feedback on the correctness of the prediction). A recent study (Breitwieser and Brod 2020) compared fourth- and fifth-graders’ and university students’ learning of trivia facts under two conditions; participants had to generate either a prediction about the fact or an example related to the fact. Although these two GLSs were similarly effective for the university students, the children clearly gained more from generating predictions than from generating examples. These findings suggest that, for children, the benefits of generating predictions extend beyond prior knowledge activation. However, because that study lacked a control group that did not engage in generative activities, it does not allow an assessment of differences between the age groups in the benefit of generating predictions relative to a passive control condition.
While the results of these studies offer initial evidence against age-related differences in the effectiveness of generating predictions, they also suggest that processing of feedback on a prediction is crucial for the success of this method. There is a wealth of studies suggesting that feedback processing increases across childhood and that this increase is related to developing executive functions (Crone et al. 2004; Zelazo et al. 1996). Support for this conjecture comes from a study in which preschool to third-grade children either had to study complete word pairs or they had to guess the second word of the pair based on the first (Carneiro et al. 2018). It was shown that the benefit of guessing over studying increased with age, and that this was partly due to a decrease in interference from wrongly guessed words. With increasing age, children were better able to use corrective feedback and inhibit their initial guess. In summary, while generating predictions has been shown to be effective already in lower elementary grades, it is tempting to speculate that there still is an age-related increase in effectiveness because of increases in feedback processing. However, the currently available evidence is insufficient to draw such a conclusion.
Generating Questions
How Is It Supposed to Work?
Asking a good question requires accessing and elaborating relevant prior knowledge, and a key instructional goal of this technique is to help learners identify gaps in their knowledge (for overviews, see Graesser et al. 1992; King 1992). This technique can be implemented either in an interactive context, such as in reciprocal teaching (Palincsar and Brown 1984) or guided questioning of peers (King 1994), or in the form of questioning oneself (Wong 1985). For the purpose of this review, only the latter form of questioning will be considered. It has been suggested that asking learners to generate questions on the taught content also guides their attention to the main ideas and helps them and the instructor monitor their current state of understanding (Palincsar and Brown 1984). The quality of the questions asked is a viable indicator both of learners’ knowledge and of their evaluation of their current state of understanding (Graesser and Olde 2003). Thus, in order to generate a good question, metacognitive abilities are needed. By teaching students to generate questions, it is argued, one is also teaching them a self-regulatory cognitive strategy that helps them to learn by themselves (Garcia and Pearson 1990; Scardamalia and Bereiter 1985).
For What Ages Has Its Effectiveness Been Demonstrated?
The overwhelming majority of studies on question generation have been carried out in the domain of text comprehension. A review of studies that examined the effects of generating questions in this domain (Wong 1985) indicated that generating questions had a beneficial effect if students were properly instructed on how to do it beforehand and if they were given sufficient time to do so. A meta-analysis of intervention studies in which students were first taught to generate questions either during or after reading a text (Rosenshine et al. 1996) revealed a positive overall effect on comprehension tested at the end of the intervention (0.35 SDs for standardized tests and 0.82 SDs for non-standardized tests). In their meta-analysis, Rosenshine et al. (1996) qualitatively reviewed the effects at different grade levels of the students who participated in the studies. This review indicated that college students consistently showed positive effects of question-generation training (but see Hoogerheide et al. 2019), but only one out of four studies that tested third-graders showed such an effect. Results were mixed for fourth- through ninth-graders. In line with this meta-analysis, a large study including a question training manipulation in third-grade science and math units found no benefit of the question training for learning outcomes (Souvignier and Kronenberger 2007). These results suggest that third-graders are unable to profit from generating questions even if they have been intensively trained to do so.
For older children, the mixed evidence indicates that generating questions might work if teachers provide sufficient instructional scaffolding. An illustrative study with fourth- and fifth-graders (King 1994) found that providing them with generic questions or question stems led to beneficial effects of questioning, but unguided questioning did not. It seems uncertain, though, whether the guided conditions truly involved question generation given that the learners were provided with generic questions that they only had to apply. In an illustrative study with ninth-graders, training them to pose questions for themselves during a classroom lecture proved effective in promoting their comprehension of and memory for the lecture (King 1991).
In summary, there is sufficient evidence to conclude that generating questions can be an effective learning strategy in university students, while it is likely not effective in younger elementary school students. For the intermediate ages, evidence is mixed, suggesting that generating questions might work if sufficient instructional scaffolding is provided by the teachers.
Are There Age-Related Differences in Its Effectiveness?
The evidence discussed in the last section already strongly suggests the existence of age-related differences in the effectiveness of generating questions. One study has examined age-related differences in the effectiveness of student-generated questions directly (Denner and Rickards 1987). It compared text comprehension in fifth-, eighth-, and eleventh-graders and found that self-generated questions were less effective than provided questions and not more effective than rereading in promoting comprehension across all age groups. However, both the effectiveness of self-generated questions relative to rereading and the number of conceptual questions increased with age, which the authors attribute to older students’ better metacognitive abilities. This interpretation seems plausible given that only among the eleventh-graders did the majority of students direct their questions toward the main ideas in the text. Younger students mainly asked simple factual questions that often targeted less relevant details in the text. In summary, the findings of that study suggest that the unguided generation of questions by students will result in strong age-related differences. For this method to be effective, especially without intensive training or provision of question stems, high levels of metacognitive abilities and prior knowledge seem necessary, which presents a challenge for children.
Generating Answers (Practice Testing)
How Is It Supposed to Work?
Answering test questions about previously learned material is known by many names, most prominently self-testing, practice testing, and retrieval practice. The focus on practice makes clear that this method qualifies as a GLS because it is not intended as a one-shot summative assessment of learners’ knowledge but rather involves repeated activity to enhance their retention of the material (Fiorella and Mayer 2016). According to the rationale for this strategy, attempting to retrieve target information involves memory search processes that activate related prior knowledge (Carpenter 2009). The co-activated knowledge gets bound to the target information, resulting in an elaborated memory trace that facilitates later retrieval of the target because more cues can be used to guide memory search. Attempting to retrieve target information can be effective even when the attempt fails because it helps to structure existing knowledge, thereby facilitating retrieval in the future. Retrieval failures may allow learners to identify ineffective cue–target connections and to shift to more effective ones (Pyc and Rawson 2012). As with generating predictions, self-testing is most effective when corrective feedback is provided (Kang et al. 2007), which means that it requires the capability for at least basic feedback monitoring.
For What Ages Has Its Effectiveness Been Demonstrated?
In the past two decades, a plethora of studies have found a strong beneficial effect of practice testing on learning (also called the testing effect; for reviews, see Rawson and Dunlosky 2012; Roediger and Butler 2011). A meta-analysis (Rowland 2014) that took into account unpublished studies revealed a moderately strong effect size (0.50 SDs) for studies that compared testing with a restudy control condition. Another meta-analysis that also included no-activity control conditions revealed slightly larger effect sizes (i.e., between 0.60 SDs and 0.70 SDs; Adesope and Trevisan 2017). The meta-analyses further suggested that testing is similarly beneficial when performed in the classroom or in the laboratory, as well as for learning verbal and nonverbal material.
Although most of the studies on the effects of practice testing have been conducted with university students, several studies have involved children as well (for a review, see Fazio and Marsh 2019). The meta-analysis by Adesope and Trevisan (2017) found negligible differences in effect size between studies conducted with elementary, secondary, and postsecondary students, indicating that testing can be successful across a wide age range. Moreover, the beneficial effects of practice testing seem to be independent of learners’ prior knowledge (Adesope and Trevisan 2017; Carroll et al. 2007). In summary, the existing evidence indicates that practice testing is an effective strategy in students of all ages.
Are There Age-Related Differences in Its Effectiveness?
Turning to studies that compared different age groups, Lipowski et al. (2014) tested first- and third-graders and found robust beneficial effects of testing compared with restudying. While the benefit of testing compared with restudying was stronger in third-graders than in first-graders, the interaction did not reach significance. A significant interaction between age and condition (testing, restudying) was found in a study that compared lower elementary, upper elementary, and college students on the “forward testing effect” (Aslan and Bäuml 2016). In this variant of the common testing effect, testing is found to enhance learning of subsequent information. While this was the case in college students and, to a lesser extent, in upper elementary students, lower elementary students did not profit from testing. The authors attribute this age-related increase to younger children’s difficulties in combating interference from the previously studied information, which is related to the development of executive functions. In summary, while most studies indicate that testing is an effective strategy early on, recent results point to an additional age-related increase in effectiveness. Future research involving different age groups is certainly necessary to bolster this claim, however.
Generating Drawings
How Is It Supposed to Work?
Asking learners to draw an illustration that looks like or corresponds to a studied concept has been investigated mainly in the context of learning from instructional texts (but see Ploetzner and Fillisch 2017). To draw an illustration, learners have to translate the verbal information into a picture that represents spatial relationships among the elements mentioned in the text (Alesandrini 1984; Schwamborn et al. 2010). Drawings are different from concept maps in that the latter do not physically resemble the concepts they depict (Van Meter and Garner 2005).
Grounded in Mayer’s generative theory of textbook design (Mayer et al. 1995), Van Meter and Garner (2005) proposed that this method requires learners to engage in three cognitive processes: selecting the relevant information from the text, organizing the selected information to build up an internal verbal representation, and constructing an internal nonverbal representation that corresponds to the verbal one. Activating relevant prior knowledge, such as suitable nonverbal representations, and connecting it with the new information is crucial during all three steps. In addition, the task involves analogical reasoning in that a good drawing has to be structurally similar to the text, which means that learners have to constantly check the correspondence between the text and their drawing. A specific feature of drawings is the integration of the verbal and nonverbal representational domains. The translation from the verbal to the nonverbal representation involves metacognitive processes such as monitoring and regulation, as learners have to go back and forth between the two representations (Schwamborn et al. 2010; Van Meter and Garner 2005).
For What Ages Has Its Effectiveness Been Demonstrated?
Research has by and large revealed positive effects of learner-generated drawing, but findings have been inconsistent (for reviews, see Alesandrini 1984; Leutner and Schmeck 2014; Van Meter and Garner 2005). A recent review/meta-analysis (Fiorella and Zhang 2018) that included only published studies revealed that drawing was clearly more effective than just reading a text for both text comprehension (0.46 SDs) and transfer performance (0.75 SDs). However, when compared to provided illustrations, drawing was only advantageous when instructional support was very high. This conclusion is supported by a study in which a drawing-only condition was compared with a reading-only condition as well as with two drawing-plus-support conditions (Van Meter 2001). Participants in the condition with the most support, who were prompted to compare their drawing with a provided illustration, generated more accurate drawings, acquired more conceptual knowledge, and engaged in more beneficial self-monitoring than did participants in the other conditions. In line with this finding, drawing accuracy and self-monitoring processes have repeatedly been found to mediate the beneficial effects of learner-generated drawing (Van Meter and Garner 2005).
The effects of learner-generated drawings have been examined across a wide age range. The aforementioned reviews, however, did not discuss potential age-related differences in the effectiveness of this method. The summary tables of empirical studies on learner-generated drawings provided by Van Meter and Garner (2005) and Fiorella and Zhang (2018) reveal no clear pattern regarding age-related differences either. They rather indicate mixed findings regardless of learners’ age, which again points to the importance of examining how drawing was implemented in the different studies. For example, a study with first-graders revealed a benefit relative to provided drawings (Lesgold et al. 1975), but in that study children only had to assemble cutouts, which changes the nature of the learning activity. A study with fourth- and fifth-grade children that provided only little instructional support (Rasco et al. 1975, Experiment 3) found that children profited less from drawing than from being provided with illustrations.
Studies with slightly older children revealed more promising results, however. For example, eighth-grade biology students who were provided with background parts for their drawings showed better text comprehension than students who only read the text or who read the text and received the complete drawings (Schmeck et al. 2014). In summary, while there is sufficient evidence to conclude that generating drawings can be an effective learning strategy in university students, it is likely not effective in elementary-school students unless instructional support is extremely high, thus requiring only limited self-generation.
Are There Age-Related Differences in Its Effectiveness?
While the evidence discussed in the last section already strongly suggests the existence of age-related differences, two studies directly compared the effectiveness of generating drawings between different age groups. One (Van Meter et al. 2006) compared fourth- and sixth-graders in a design similar to that of Van Meter (2001), including a read-only condition as well as three drawing conditions that varied in instructional support. Gains in conceptual knowledge were assessed on a problem-solving task. Only the sixth-graders benefited from drawing, and this benefit was enhanced in the high-support condition, in which students were prompted to compare their drawing with a provided illustration. Contrary to the prediction that the fourth-graders would benefit more from additional support than the sixth-graders would, the fourth-graders did not benefit from drawing even in the high-support condition. This outcome could not be explained by differences in prior knowledge or general comprehension ability, which were comparable between the two age groups. The authors speculated that the provided support was not high enough for fourth-graders even in the high-support condition or that fourth-graders would need additional practice in the technique. The second study comparing the effectiveness of learner-generated drawing between age groups (Van Essen and Hamaker 1990) compared first-, second-, and fifth-graders in an intervention design in which children were instructed to generate drawings to help them solve arithmetic word problems. Despite extensive practice and instructional support, first- and second-graders’ problem solving did not profit from drawing at all, whereas fifth-graders’ did. The lack of an active control group makes it difficult, however, to assess how much of this effect in fifth-graders is due to drawing itself and how much of it is due to other effects of the intervention.
To conclude, the available evidence suggests strong age-related differences in the ability to profit from drawing. While elementary-school children struggle to profit from drawing at all, secondary school children can profit only if instructional support is high. Because generating drawings that correspond to a studied concept requires good analogical reasoning and monitoring abilities, this technique’s effectiveness can be expected to exhibit a slowly ascending age trajectory that reaches its peak only in early adulthood.
Overall Summary
The first and foremost goal of this review was to evaluate the evidence for the effectiveness of six popular GLSs in improving declarative learning in different age groups. All six techniques reviewed generally work well for university students, which should make them promising for younger students as well. The success of these techniques with university students is unsurprising, however, given that most of the techniques were initially developed and tested at universities. Do they work equally well for younger students? Table 1 is a first attempt toward a general summary of each strategy’s level of effectiveness for different age groups. In keeping with the fact that many of the studies assessing them were performed in the classroom, the table is organized by grade level instead of age. Grade levels are combined on the basis of the grade distribution of the available studies. Because many of these studies involved upper-elementary-school children (grades 4–5), this group is considered separately.
It bears mention that this assessment for the different age groups is tentative because it often rests on very few studies. Moreover, the conclusions are based predominantly on published studies. Because studies finding no benefit of a particular strategy are less likely to be published (the well-known file-drawer problem; Rosenthal 1979), the table likely overestimates effectiveness. In addition, for several GLSs, very few studies involving children below fourth grade were found. A pragmatic reason for this might be that it is difficult to use written assessments with children in this age range. A conceptual reason might be that GLSs require a great deal of self-regulatory and especially inhibitory abilities, which are relatively slow to mature.
An overall pattern that can be observed is that there is stronger evidence for the effectiveness of GLSs in older students, and that the evidence for their effectiveness in elementary-school children is quite variable. For example, whereas practice testing and predicting seem to be effective already in lower-elementary-school children, generating drawings seems to be largely ineffective until secondary school. These findings also speak to the second goal of this review, which was to examine whether there are age-related differences in the effectiveness of GLSs and whether these differ between strategies. However, the limited number of age-comparative studies precludes any strong conclusions regarding age-related differences and their mechanisms. Therefore, in the final sections of this review, I will discuss theoretical reasons that speak for an age-related increase in effectiveness and for differences therein between strategies, along with caveats against premature conclusions. This discussion will also open up ways to alleviate some of the difficulties that children in particular face in using GLSs successfully.