Introduction

Cognitive load theory (see Sweller et al., 2011; see also Sweller et al., 1998, 2019) is an instructional theory to optimize learning activities and materials based on the human cognitive architecture. Recent systematic reviews (e.g., Li et al., 2019; Mutlu-Bayraktar et al., 2019) have shown the importance of cognitive load theory in current research about multimedia learning.

A basic recommendation of cognitive load theory to multimedia designers is that they should produce educational materials that do not overload the limited capacity of working memory (see Cowan, 2001; see also Oberauer et al., 2018). This is particularly needed when the learning materials are complex (high element interactivity), so there is a risk of overloading working memory (e.g., Ashman et al., 2020; Chen et al., 2017; see Paas & van Merriënboer, 2020).

The strategies to avoid working memory overload are termed effects by cognitive load theorists (see Sweller et al., 2011). Researchers of the related cognitive theory of multimedia learning describe these strategies as principles (see Mayer, 2014a, 2014b, 2020; Mayer & Fiorella, 2014; Mayer & Moreno, 2003). In the present review article, we focus on five of these effects or principles (see also Castro-Alonso et al., 2019a): (a) multimedia principle, (b) split-attention effect or spatial contiguity principle, (c) redundancy effect (similar to the coherence principle), (d) signaling principle, and (e) transient information effect or segmenting principle.

Each of these strategies avoids a problematic multimedia design that could hinder learning. For example, the multimedia principle (see Butcher, 2014; Mayer, 1997) promotes adding visualizations to text-only instructional materials, as pictures aid understanding of visual and spatial relationships between the learning elements (e.g., Eitel et al., 2013). Building on the multimedia principle (Butcher, 2014; Mayer, 1997), researchers have examined how to most effectively present instructional materials to students and how different types of visualizations support specific cognitive processes (Guo et al., 2020; Mayer, 2014a; McCrudden & Rapp, 2017; Renkl & Scheiter, 2017). For example, visualizations are most effective when they are spatially integrated with written text (split-attention effect or spatial contiguity principle, e.g., Chandler & Sweller, 1991), when they do not contain extraneous nonessential information (redundancy effect, similar to the coherence principle, cf. Harp & Mayer, 1998), when they emphasize essential information (signaling principle, e.g., de Koning et al., 2009), and when they provide the opportunity to manage transient information (transient information effect or segmenting principle, e.g., Castro-Alonso et al., 2018).

Typically, researchers of the cognitive load theory and the cognitive theory of multimedia learning have focused on instructor-managed strategies to deal with these instructional design issues. For example, the recommendation of adding visualizations to text-only materials (the multimedia principle) is a solution executed by instructors and multimedia designers, not by learners. However, in the present review article, we propose that these strategies can also be used to promote learner-managed solutions to problematic instructional formats (see Table 1).

Table 1 Instructor- and learner-managed strategies to solve problematic instructional designs

The present review article has three main goals: (a) describe five strategies that researchers of the cognitive load theory and the cognitive theory of multimedia learning have investigated to help instructors and designers produce more effective instructional materials; (b) describe the five strategies with a novel focus on recommendations for students to foster self-managing or self-regulating their cognitive load and learning; and (c) provide examples that compare the effectiveness of the strategies when applied by instructors versus learners. Next, we present these strategies as solutions to problematic instructional materials.

Problem: Material Contains Only Text

Learning materials such as lectures or textbooks often emphasize verbal modes of presenting information, either spoken or printed text (see Fig. 1a). Text comprehension requires using one’s limited working memory resources to mentally organize ideas and integrate them with prior knowledge, forming a mental model (Kintsch, 1998). Mental models are depictive representations assumed to contain a visual–spatial structure analogous to the ideas presented in the text (Schnotz, 2014). Thus, understanding text often requires translating the text into a coherent mental image. For example, if the text explains how a tire pump works, the reader must mentally depict critical spatial relations, such as how, when the piston moves up, the inlet valve opens and the outlet valve closes, allowing outside air to enter the cylinder (Mayer & Anderson, 1991). This process can be cognitively demanding, especially for students with low prior knowledge or limited expertise.

Fig. 1

a Problematic design containing only text. b Instructor-managed solution supplementing texts with visualizations. c Learner-managed solution by generating own visualizations

Instructor-Managed Solution: Supplement Text with Appropriate Visualizations

One solution to enhance lessons that rely on text is to provide students with supplementary instructional visualizations (see Guo et al., 2020; Mayer, 2014a; Renkl & Scheiter, 2017; Schraw et al., 2013), following the multimedia principle (see Butcher, 2014; Mayer, 1997). Visuals such as diagrams or illustrations can serve as a scaffold from which learners construct their mental model (Eitel et al., 2013; Schnotz, 2014). Unlike purely verbal representations, visualizations organize complex ideas into meaningful images that depict important conceptual, structural, and temporal relationships (McCrudden & Rapp, 2017). For example, as shown in Fig. 1, it is more straightforward to understand the structure of the coronavirus SARS-CoV-2 (e.g., Shereen et al., 2020) by studying texts and visualizations (Fig. 1b) than only texts (Fig. 1a).

A wide body of literature has documented the learning benefits when instructors supplement text-based lessons with corresponding diagrams, illustrations, models, graphs, or dynamic visualizations (Mayer, 2014a, 2020). According to the multimedia principle (Butcher, 2014), lessons containing words and visualizations support understanding (as reflected by performance on transfer tests) better than lessons that contain words alone. In one study by Butcher (2006), students generated self-explanations aloud as they studied a lesson on the human circulatory system that contained text only or text and graphics. Students who learned from text and graphics generated more accurate inferences than students who learned from text only, resulting in better learning outcomes. Other studies report a similar pattern of findings (Ainsworth & Loizou, 2003; Cromley et al., 2010; Lindner, 2020). This research suggests visuals support learning by helping students more accurately build connections among the ideas presented in the text and their existing knowledge, which are critical processes in mental model construction.

Learner-Managed Solution: Generate Own Visualizations

One potential downside of instructor-provided visualizations is that students may not process them deeply (Renkl & Scheiter, 2017). Research suggests that many students tend to focus their attention on the text or struggle to build appropriate connections between the text and corresponding parts of the graphics (Johnson & Mayer, 2012; Renkl & Scheiter, 2017; Schnotz & Wagner, 2018; Schüler et al., 2019). Instructor-provided visuals also run the risk of causing students to overestimate their level of understanding because the provided images provoke feelings of fluency or familiarity (Serra & Dunlosky, 2010; Wiley, 2019).

An alternative, learner-managed solution is to ask students to generate their own images from the text (Ainsworth et al., 2011; Fiorella & Mayer, 2016, 2017; Van Meter & Garner, 2005), as shown in Fig. 1c. For example, research on learning by drawing suggests that students understand science texts better when they create drawings than when they only read the text or use text-focused learner strategies like summarizing (Fiorella & Kuhlmann, 2020; Fiorella & Zhang, 2018). In a study by Leopold and Leutner (2012), students either created drawings or generated verbal summaries while learning from a scientific text about the structure of water molecules. Students who drew outperformed students who summarized on subsequent comprehension and transfer tests. Other studies suggest that asking students to create more abstract visual representations, such as concept maps, also enhances learning from text (Schroeder et al., 2018).

According to the cognitive model of drawing construction (CMDC; Van Meter & Firetto, 2013), generating drawings from text encourages students to select the most important ideas from the text, organize these ideas into a descriptive representation (or propositional network), and integrate these with prior knowledge into a depictive mental model. Furthermore, students must continually engage in metacognitive processes to monitor and regulate the drawing construction process, such as by consulting the text, updating their mental model, and revising their drawings. Indeed, research by Van Meter (2001) indicates that students who draw engage in more self-monitoring behavior during learning than students who only study text with provided illustrations.

Nevertheless, there are also potential downsides of learner-generated drawings. Drawing can be time consuming and cognitively demanding, and without adequate guidance from the instructors, students tend to generate inaccurate drawings (Fiorella & Zhang, 2018). As noted in recent reviews (Ainsworth & Scheiter, 2021; Fiorella & Zhang, 2018; Guo et al., 2020), research suggests drawing is most effective when students receive instructional support, such as scaffolds to complete partially provided visuals (Schmeck et al., 2014) or opportunities to compare their drawings to an instructor-provided visualization and revise their drawings accordingly (Gagnier et al., 2017; Van Meter, 2001).

Problem: Material Presents Texts and Visualizations Separately

Although the multimedia principle (see Butcher, 2014; Mayer, 1997) predicts that the combination of text and pictures results in higher learning outcomes than learning from text alone, certain presentation formats that combine text and pictures are more cognitively challenging. Hence, not all combinations of text and pictures are equally effective for learning.

A frequently ineffective combination is produced when mutually referring text and visualizations are spatially separated (see Fig. 2a). To learn from such a format, learners must split their attention between the textual and graphical representations and are required to mentally integrate these two information sources to construct an adequate and coherent mental representation (see Ayres & Sweller, 2014; Mayer, 1997). Specifically, learners must continuously switch between searching for relevant information in one information source (e.g., text) and matching this information to the corresponding information in the other information source (e.g., visualization). Both the switching of attention and the requirement to keep information active in working memory when switching from one source to another do not directly contribute to learning and unnecessarily increase the demands on the learner’s working memory resources (Ayres & Sweller, 2014).

Fig. 2

a Problematic design with texts and visualizations presented separately. b Instructor-managed solution presenting texts and visualizations contiguously. c Learner-managed solution by moving texts closer to the visualizations

Instructor-Managed Solution: Present Texts and Visualizations Contiguously

A typical instructor-managed solution to avoid split-attention is to physically integrate the textual and pictorial information such that corresponding textual information and pictorial information are presented close together (i.e., integrated format, see Fig. 2b), following the spatial contiguity principle (Mayer, 1997; Mayer & Fiorella, 2014). Two meta-analyses have shown the positive effects of integrated formats over split-attention designs for multimedia learning. The meta-analysis by Ginns (2006), which included 37 effect sizes, showed an overall effect size of d = 0.72, favoring spatial contiguity. The more recent meta-analysis by Schroeder and Cenkci (2018), which included 58 independent effect sizes, showed an overall effect size of g = 0.63. These effect sizes correspond to large effects (Kraft, 2020).
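As a rough illustration of the two effect-size metrics reported above, the sketch below computes Cohen's d and Hedges' g for two groups. The scores are hypothetical and are not taken from Ginns (2006) or Schroeder and Cenkci (2018). The function names are likewise illustrative, and the formulas are the standard pooled-variance definitions, not necessarily the exact procedures used in either meta-analysis.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def hedges_g(group_a, group_b):
    """Cohen's d multiplied by the usual small-sample correction factor."""
    n_total = len(group_a) + len(group_b)
    correction = 1 - 3 / (4 * n_total - 9)
    return cohens_d(group_a, group_b) * correction


# Hypothetical posttest scores (assumed data, for illustration only).
integrated_format = [7, 8, 6, 9, 7, 8]
split_attention_format = [6, 7, 5, 8, 6, 7]

d = cohens_d(integrated_format, split_attention_format)
g = hedges_g(integrated_format, split_attention_format)
print(f"d = {d:.2f}, g = {g:.2f}")
```

Because the correction factor is below 1, g is always slightly smaller in magnitude than d for the same data, and the two values converge as sample sizes grow.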

The benefits of an integrated format over a split-attention format cover a range of topics, such as mathematics (e.g., Tarmizi & Sweller, 1988), biology (e.g., Chandler & Sweller, 1991; Cierniak et al., 2009), electrical engineering (e.g., de Koning et al., 2020a), meteorology (e.g., Makransky et al., 2019; Mayer et al., 1995; Moreno & Mayer, 1999), and technology of mechanical devices (e.g., Bodemer et al., 2004; Johnson & Mayer, 2012).

It is important to note, however, that empirical evidence for the split-attention effect does not surface in all integrated vs. split-attention comparisons. For example, several recent attempts using exactly the same or slightly adjusted instructional materials from prior studies on the split-attention effect showed null differences between split-attention and integrated formats for learning outcomes and cognitive load (Cammeraat et al., 2020; Pouw et al., 2019). The learners’ expertise or existing knowledge may be influencing these results, as students with high levels of expertise do not always benefit from integrated formats (e.g., Kalyuga et al., 1998).

Learner-Managed Solution: Move, Trace, or Imagine Texts into Visualizations

An increasing number of studies have been investigating the effectiveness of asking learners to move text segments into corresponding elements in the pictures (e.g., Agostinho et al., 2013; Gordon et al., 2016; Roodenrys et al., 2012; Sithole et al., 2017). Across studies, this learner-managed integration took place by moving paper-based cut-out text segments with the hand or by moving digital text segments with the mouse or via a touchscreen interface (see Fig. 2c).

In addition to moving elements, another learner-managed solution is tracing relationships between visualizations and texts. For example, Macken and Ginns (2014) investigated adult participants studying paper-based texts and illustrations about the human heart in a split-attention design. Some participants used their fingers to trace connections between the separate texts and illustrations (tracing group), whereas others did not gesture (control group). Retention and comprehension tests showed that the tracing group outperformed the nongesturing condition. Korbach et al. (2020) extended these positive effects of tracing on split-attention designs, by adapting the biology materials to touch screen devices given to university students.

Notably, these physical integration solutions of moving or tracing can only be used when the learning environment allows for (sufficient) physical interaction. This is not always the case, for example, when studying text and pictures from a noninteractive webpage. In such cases, an alternative could be found in a mental self-management strategy. Several studies show that imagining the integration of spatially separated textual and pictorial information supports learning more than just studying a split-attention format. A recent study by de Koning et al. (2020b) used a specific imagination instruction on how and why to move text to the corresponding, yet spatially separated, part in the picture, and compared this to (a) a physical learner-managed integration strategy, (b) learning without a strategy, and (c) learning with an (instructor-managed) integrated format. Results indicated that the mental learner-managed strategy produced higher retention and comprehension outcomes than the physical learner-managed strategy (see also de Koning et al., 2020a) and the split-attention format. No significant differences were found between mental learner-managed integration and instructor-managed integration. Comparable findings have been obtained by Bodemer and Faust (2006) when students were simply prompted to mentally integrate textual and pictorial information, suggesting that benefits of learner-managed mental integration can be obtained even with less specific instructions (not indicating how and why to integrate).

Problem: Material Contains Redundant or Nonessential Information

Some instructors may be tempted to present the same information in multiple formats (e.g., text and images, see Fig. 3a) or to provide detailed elaborations on the learning material (e.g., Harp & Mayer, 1998). After all, these approaches may help students process the same information multiple times, thereby reinforcing and strengthening learning. However, research suggests presenting redundant or nonessential information can backfire—that is, sometimes less is more (e.g., Adesope & Nesbit, 2012).

Fig. 3

a Problematic design with redundant texts and visualizations. b Instructor-managed solution removing the block of text. c Learner-managed solution by generating a summary of the block of text

According to cognitive load theory, redundant information creates extraneous cognitive load because it interferes with processing the essential information (see Kalyuga & Sweller, 2014). As with the split-attention effect, redundancy also depends on learners’ expertise or existing knowledge: information that is redundant for students with high levels of expertise may not be redundant for students with low levels of expertise (Chen et al., 2017). Also, sometimes a little redundancy is helpful rather than detrimental (e.g., de Koning et al., 2017; Mayer & Johnson, 2008; Yue et al., 2013). To overcome the problems caused by too much redundant information, there are some simple instructor- and learner-managed solutions.

Instructor-Managed Solution: Remove Redundant or Nonessential Information

The most straightforward solution to the redundancy effect is to remove any redundant or nonessential information from the instructional materials, a solution related to the coherence principle (see Mayer & Fiorella, 2014). One common example of redundant content that should be discarded is when the same information is presented verbally and visually (see Fig. 3a). Importantly, redundancy occurs when one source of information—either the text or the visualizations—could be understood independently; instructional materials are not redundant when understanding depends on integrating both sources (see Kalyuga & Sweller, 2014).

This important point was demonstrated in a study by Chandler and Sweller (1991), in which students learned from text and/or diagrams of a lesson on the flow of blood in the heart, lungs, and body. The results indicated that students learned best when the text and diagrams were spatially separated from each other (rather than integrated) or when students only learned from the diagrams without the text. From the standpoint of cognitive load theory, integrating the text and visuals made it likely students had to process both sources of information. Since the text and diagrams for this lesson could be understood in isolation, processing them both was redundant. When the diagram was separated from the text, students could presumably focus on only the diagram and ignore the text. Yet the best approach was to remove the text entirely to avoid imposing unnecessary processing demands on learners (see Fig. 3b).

A second common example of redundancy is when multimedia lessons contain on-screen text that is redundant with spoken text (e.g., Kalyuga et al., 2000). This approach is evident when instructors or presenters read aloud text directly from their slideshow presentations (see Horvath, 2014). In this case, learners must mentally reconcile what they are hearing with the text they are seeing on the screen. If the two sources are identical, it forces learners to engage in unnecessary cognitive processing, which can prevent learners from mentally integrating the text with provided visuals (assuming the text and visuals themselves are not redundant). In a study by Mayer et al. (2001), college students viewed a narrated animation on the process of lightning formation. Students performed significantly worse on subsequent retention and transfer tests when the lesson contained on-screen text that summarized or was identical to the spoken narration. Several other studies have reported a similar detrimental effect of including redundant on-screen text in multimedia lessons (see Mayer, 2020; Mayer & Fiorella, 2014).

Redundancy can also occur when lessons are unnecessarily elaborated, such as textbook chapters that contain many details in addition to the core concepts (e.g., Eitel et al., 2019; Harp & Mayer, 1998). In a study by Mayer et al. (1996), undergraduates studied learning materials about lightning and then completed recall and transfer tests. Students either read a full textbook chapter, a short visual summary (i.e., simple illustrations with brief captions), or both the full chapter and the summary. The results indicated that students learned at least as well or better from only studying the summary than from reading the full chapter (with or without the summary). Furthermore, adding text to the visual summary made it less effective. The results highlight the importance of designing concise instructional materials that focus only on the essential information and reduce redundant or unnecessary information (see also Sundararajan & Adesope, 2020).

Learner-Managed Solution: Generate Textual or Visual Summaries

In some cases, redundant instructional materials will be unavoidable for students. Many textbooks, PowerPoint presentations, and video lectures contain redundant or unnecessary information, and sometimes students are expected to integrate across multiple sources of overlapping learning materials. In these cases, learners can employ various summarization strategies to reduce redundancy and create their own concise and coherent representation of the learning material (see Mayer et al., 2020).

As shown in Fig. 3c, summarizing involves concisely stating the main ideas or ‘gist’ of the learning material in one’s own words (Brown et al., 1981; Dunlosky et al., 2013; Fiorella & Mayer, 2016). According to generative learning theory (Fiorella & Mayer, 2015; Wittrock, 1989), summarizing encourages students to select only the essential information from the lesson and reorganize it using their existing knowledge. Consistent with this explanation, research suggests that prompting students to generate summaries of text can enhance learning beyond rereading the text or verbatim copying from the text (Bretzing & Kulhavy, 1979; Doctorow et al., 1978; Wittrock & Alesandrini, 1990). In an exemplary study by Bretzing and Kulhavy (1979), high school participants studied a text describing a fictitious tribe of people. Some students were instructed to write a short summary of the main ideas for each page of text; others were asked to copy the main ideas from each page verbatim. Results indicated that the summarizing group outperformed the copying group on both immediate and delayed recall tests.

However, not all studies report benefits of summarizing. In fact, learner-managed solutions for redundancy, including the generation of summaries, can be difficult for students (e.g., Mirza et al., 2020), and many students struggle to generate summaries of sufficient quality without instructional guidance or training (Bednall & Kehoe, 2011; Dunlosky et al., 2013; Hooper et al., 1994). If students are not able to accurately extract the key ideas from the lesson, it is unlikely to enhance learning. For example, Bednall and Kehoe (2011) found no overall benefit of asking undergraduates to summarize lessons describing logical fallacies, yet the quality of the summaries students generated was positively associated with posttest performance. Similarly, Spirgel and Delaney (2016) tested summarizing across a wide range of studying conditions and found no evidence that writing summaries was more effective than restudying the text. Students were better able to remember the ideas they included in their summaries, but many students struggled to identify the main ideas. These findings suggest students need training or guidance on how to generate quality summaries (e.g., Bean & Steenwyk, 1984; King, 1992; Taylor & Beach, 1984).

Another point is that, although past research typically defines summarizing as generating verbal summaries, research suggests creating visual summaries (such as schematic drawings) can often be more effective, especially when learning from complex science texts (Bobek & Tversky, 2016; Leopold & Leutner, 2012). Overall, students learn better when they are taught how to use strategies for creating more concise verbal or visual summaries of the learning material.

Problem: Material Does Not Emphasize Essential Information

As described above, if the learning material includes nonessential information, the learner must divert cognitive resources to manage this information in addition to the key information for learning. The signaling principle (e.g., Mautone & Mayer, 2001; see van Gog, 2014) recommends incorporating cues to signal the most important learning elements. This strategy allows learners to focus only on the key elements to learn instead of diverting working memory resources to process nonrelevant information. Also, because of effects of distinctiveness on working memory (see Oberauer et al., 2018), visually cueing an element to make it more salient aids in making it more memorable. Fig. 4 depicts the attachment of the coronavirus to its host cell; Fig. 4a shows a nonsignaled format, while Fig. 4b shows the key elements in the viral attachment being signaled with a thick circular frame (i.e., a border around the relevant elements).

Fig. 4

a Problematic design not emphasizing the essential information. b Instructor-managed solution signaling with an added element. c Instructor-managed solution signaling without adding elements. d Learner-managed solution by highlighting the essential textual information

Four recent meta-analyses (Alpizar et al., 2020; Richter et al., 2016; Schneider et al., 2018; Xie et al., 2017) reported significant small to large effect sizes (Kraft, 2020), for both retention and transfer scores, indicating that instructional multimedia with signals was more effective than the formats without signaling features. In addition to these effects with multimedia materials, the positive effects of signaling have also been observed with text-only passages, as described below.

Instructor-Managed Solution: Signal the Essential Information

Instructor-managed signaling techniques can be broadly organized in two groups (see de Koning et al., 2009; see also Castro-Alonso et al., 2019a; de Koning & Jarodzka, 2017): (a) signaling with added elements, and (b) signaling without added elements. Examples of signaling with added visual elements include pointing devices (e.g., arrows, lines, fingers, and hands), frames, labels, and underlined text, among others. Examples of signaling without added elements include contrasts and spotlights, zooming, color coding, transparencies, blurring, lighting, and combinations. Figure 4a depicts a nonsignaled format, while Fig. 4b shows signaling with one added element (a thick circular frame), and Fig. 4c shows signaling without added elements (less transparency for the key elements).

The study by Lin and Atkinson (2011) provides an example of effective signaling with added elements, where red arrows were used as the added elements for signaling depictions about rock cycles. The study showed that undergraduates shown static images with the red arrows were faster to learn about the geology topic than students not given this type of signaling. In an experiment with elementary school students learning about plant morphology in a virtual reality classroom, Liu et al. (2020) compared a control condition without signaling with a condition in which signaling involved lines and arrows. Results of the comprehension and transfer tests showed that the signaled group outperformed the nonsignaled condition.

Effective signaling is also reported by Wang et al. (2018) in two experiments with undergraduates learning from multimedia lessons about chemical synaptic transmission. The signaling devices in these experiments were the gestures provided by the pedagogical agents who taught the lessons. Participants who watched the agents using deictic hand gesturing and eye gaze outperformed the students who observed static agents without gesturing. Similarly, in a study with children learning math equivalence through multimedia, Cook et al. (2017) also observed that gesturing pedagogical agents were more effective than nongesturing agents.

Nevertheless, as suggested by de Koning et al. (2009), signaling with added elements can be ineffective. This has been reported in studies in which the groups watching hands signaling the learning elements presented lower achievement scores than the groups that did not watch these extra hands (Castro-Alonso et al., 2015, 2018; Schroeder & Traxler, 2017). As predicted by de Koning et al. (2009), when comparing signaling with added elements to signaling without added elements, the latter may be more effective, as it does not add to the number of elements to be processed in working memory. In other words, the hands may produce a negative redundancy effect (Kalyuga & Sweller, 2014).

The spotlight strategy of signaling without added elements was used by de Koning et al. (2010). This signaling uses light contrasts that keep the original brightness of the essential elements and dim the secondary visuals. In an experiment wherein psychology undergraduates learned about the human cardiovascular system through animations, de Koning et al. (2010) found that spotlight signaling was effective for retention, inference, and transfer scores. A zooming-in technique was tested by Amadieu et al. (2011) in an experiment with psychology undergraduates learning about synapses through animations. This signaling without added elements was effective in increasing the comprehension scores of the students.

Cueing with colors has also shown positive effects of signaling without added elements (e.g., Ozcelik et al., 2010). For example, Jamet (2014) investigated undergraduates studying the cognitive theory of multimedia learning through static images and narrations. The comparison was made between a condition in which the learning elements were colorized in synchrony with the narration and a condition without these color changes. Results showed that these signals were effective for guiding attention to the learning areas and producing higher scores in the retention test, but not the transfer test.

Learner-Managed Solution: Underline or Highlight Information

Underlining or highlighting text to signal the essential information is one of the most popular learner-managed strategies intended to support learning (e.g., Dunlosky et al., 2013), particularly because they can be applied easily and without much time and effort in addition to the main learning task. Figure 4d shows an example of highlighting text to signal key information.

Underlining and highlighting text are effective techniques because they have two functions (see Miyatsu et al., 2018). One is a storage function, meaning that underlining or highlighting makes the marked text easier to identify later. This is related to the isolation effect described by Dunlosky et al. (2013), in which information that is more distinctive is better remembered than less distinctive information (see also Oberauer et al., 2018). The other function mentioned by Miyatsu et al. (2018) to explain the effectiveness of these techniques is a generative function (see Fiorella & Mayer, 2015), meaning that underlining or highlighting requires learners to personally select the important information, which likely elicits more thorough processing because they have to decide which information is most important.

Several studies (e.g., Blanchard & Mikkelson, 1987; Fowler & Barker, 1974; Rickards & August, 1975; Yue et al., 2015) have demonstrated that texts signaled via learner-managed underlining or highlighting were better remembered than nonsignaled information. However, students do not always take full advantage of these signaling techniques (e.g., Dunlosky et al., 2013; Nist & Kirby, 1989). A prime concern is that students often do not know how to highlight correctly, and need to be guided or trained to highlight in order to increase its effectiveness (see Dunlosky et al., 2013; see also Miyatsu et al., 2018). While some studies have shown benefits with multisession training of several hours in total (e.g., Amer, 1994), other studies have shown that comparable benefits can be obtained in a single session of one or two hours of training (e.g., Leutner et al., 2007). Similar to signaling texts, highlighting, underlining, or other forms of marking should also be effective strategies to signal visualizations (cf. Schlag & Ploetzner, 2011).

Problem: Material Shows Too Much Transient Visual Information

As described by Ayres and Paas (2007), a detrimental transient information effect occurs when dynamic visualizations (e.g., videos and animations) show too many visual elements leaving the screen rapidly (see also Castro-Alonso et al., 2014). Because of these fast-paced dynamic visualizations, students do not have enough time to process the depictions in working memory. Cognitive load theory predicts that this problem becomes worse when more transient information must be managed in working memory (e.g., Castro-Alonso et al., 2018). In contrast, materials with no transient information, such as static images, should be better suited for learning.

Despite the cognitive demands of learning from dynamic visualizations, two recent meta-analyses (Berney & Bétrancourt, 2016; Castro-Alonso et al., 2019b) showed overall small-sized effects favoring dynamic over static visualizations. Nevertheless, Castro-Alonso et al. (2016) criticized prior research comparing dynamic to static visualizations because it sometimes fails to control moderating variables, such as appeal, media, and interaction. Also, gender has not been properly controlled in these comparisons (see Castro-Alonso et al., 2019b).

In all, although the dynamic versus static comparisons need to provide more conclusive evidence with well-matched experimental designs, there is evidence favoring less-transient over more-transient dynamic visualizations. For example, an ineffective design of an animation includes excessive transient information, as shown in Fig. 5a. As this dynamic visualization accumulates transient information by showing several consecutive steps, it may not allow students to effectively process it in working memory.

Fig. 5 (a) Problematic design of a dynamic visualization with too much transient information. (b) Instructor-managed solution segmenting the dynamic visualization. (c) Learner-managed solution by controlling the pace of the dynamic visualization

Instructor-Managed Solution: Segment Dynamic Visualizations

An instructor-managed solution to avoid the problematic transient information effect is to provide segmented animations separated by the necessary pauses (see Fig. 5b), in order to allow working memory to replenish before additional information is shown (cf. Chen et al., 2018; Leahy & Sweller, 2019).

Positive evidence for segmenting is provided by Biard et al. (2018), who had occupational therapy university students learn a medical hand procedure from videos. Students who learned from segmented videos outperformed students who were given unsegmented full-length videos. This study used segments that were understandable; that is, the cuts that produce the segments must not disrupt the narrative of the dynamic visualizations. Disrupting this logical flow, for example by introducing midsentence breaks in the narrations, has proved to be ineffective, even though these breaks stop the transience of the visualizations. In other words, segmentation is effective if each segment has an internal logic, so that it can be processed by working memory as a single element (see Kurby & Zacks, 2008; see also Zacks, 2020).

In an example of this phenomenon, Boltz (1992) studied university participants watching chapters of a miniseries. An experimental condition with short commercial breaks that did not interrupt major idea units was compared to a condition in which the breaks did stop the flow of ideas. As predicted, the interrupting breaks hindered memory for the plot of the chapters, but the noninterrupting breaks did not. In a later study, Schwan et al. (2000) measured recall of details in videos depicting procedural tasks. Increased recall was observed in the participants who watched the videos with changes of camera angles coinciding with the breaks in the idea units. In other words, changing the camera angle was an effective method to highlight a change in the narrative unit.

In all, these findings support two (instructor-managed) techniques for improving video lessons when there is a transition from one idea to the next. The most effective technique is to add short breaks between these idea changes, as this not only preserves the idea unit but also controls the transient information problem of the dynamic visualization. The second technique, which appears somewhat less effective, is to change the camera angle. Arguably, changing the camera angle is less effective because it does not manage the transient nature of the information depicted.

Learner-Managed Solution: Control the Pace of Dynamic Visualizations

When students self-manage transient information, they are given interactive features, such as a scrollbar (e.g., Hatsidimitris & Kalyuga, 2013) or a next button (e.g., Mayer & Chandler, 2001; Stiller et al., 2009), in order to control the pace of the videos or animations (see Fig. 5c). As with segmented materials, pace-controlled dynamic visualizations reduce the amount of transient information that must be processed at once, and thus provide a more effective format for learning. For this learner-managed solution to be effective, the learners should have some knowledge about when to perform these interactions with the materials. Also, some of these interactions (e.g., using a scrollbar) could demand more cognitive resources than others (e.g., clicking).

An example of effective pace-control is reported in Stiller et al. (2009), who investigated university participants studying the structure of the eye through a multimedia module. Results indicated that the pace-control condition outperformed a control condition without these pace-control features on a subsequent post-test. Similarly, in a study by Höffler and Schwartz (2011), university students learned about dirt removal from a surface. The groups allowed to rewind, fast-forward, and pause the presentation outperformed the groups without these pace-control features and also reported lower cognitive load.

Effective pace-control was also reported by Hatsidimitris and Kalyuga (2013) in two experiments with physics undergraduates studying animations with or without the pace-control feature of a scrollbar. Both experiments showed that the scrollbar conditions outperformed the nonscrollbar conditions. There were also two notable observations in this study. First, Experiment 2, in which the physics expertise of the learners was controlled, showed that only novices benefited significantly from the scrollbar, whereas more knowledgeable students were neither helped nor hindered by this learner-managed feature. Second, in both experiments the participants were pretrained on how to use the scrollbar, as participants did not use it effectively without prior guidance. This study with pace-control showed the importance of the learners’ expertise. The expertise or prior knowledge of the learner is also central in every instructor- and learner-managed strategy, as described next.

Comparing the Effects of Instructor-Managed Versus Learner-Managed Solutions

As described in the introductory section, most of the research on cognitive load theory and the cognitive theory of multimedia learning has investigated the effects of different strategies managed by instructors, teachers, and instructional designers. There is less evidence concerning learner-managed (or self-regulated learning) solutions to deal with the cognitive load of instructional materials (e.g., Eitel et al., 2020b; see de Bruin et al., 2020). There is also less research comparing the effectiveness of instructor-managed versus learner-managed strategies.

The expertise reversal effect, a key phenomenon described by cognitive load theory (see Chen et al., 2017; Kalyuga et al., 2003), helps predict the learning scenarios in which an instructor- or a learner-managed solution would be most effective. Similar predictions can be made following generative learning theory (see Fiorella & Mayer, 2016; Fiorella & Zhang, 2018). The expertise reversal effect suggests that instructor-managed solutions may be most effective for novice learners, whereas learner-managed solutions may be most effective for more advanced students. Similarly, generative learning theory proposes that learners need adequate background knowledge or guidance from the instructor (e.g., scaffolding or feedback) to benefit from engaging in learner-managed solutions (i.e., generative learning activities).

Although confirmatory research is needed in the literature, we predict that instructor-managed solutions will be most effective when learners’ expertise is low. In this case, instructor-managed solutions will be preferable to learner-managed solutions because novices (low expertise) will have problems implementing solutions (see Dunlosky et al., 2013) without extensive guidance or training in how to implement learner-managed solutions effectively (e.g., generating high-quality text summaries, Colliot & Jamet, 2018). The rationale is that novice learners do not have enough working memory resources to deal with the learning materials and the learning strategies at the same time, so they need solutions or guidance provided by the instructors (Chen et al., 2017; Leahy & Sweller, 2020).

In contrast, learning scenarios with higher levels of learners’ expertise may be more adequate for learner-managed solutions. In these learning scenarios, the learners’ expertise would help the students to cope with the generative actions of learner-managed solutions (see Fiorella & Mayer, 2016; Fiorella & Zhang, 2018). In these cases, learners would not only manage the generative actions but they would also benefit from attempting them, as they could generate accurate relationships among the ideas presented through the educational materials and their existing knowledge or expertise (Fiorella & Mayer, 2016; Fiorella & Zhang, 2018).

Although there is emerging literature comparing instructor- and learner-managed solutions to the problematic instructional designs described in the present review, these studies have not systematically tested the aforementioned predictions by comparing different degrees of learners’ expertise. In other words, this section of the present review is somewhat speculative. However, this relatively novel line of research has produced interesting findings that warrant further investigation. In general, these results are mixed, sometimes supporting the instructor solutions, sometimes showing more support for the learner solutions, and sometimes failing to show differences between these two approaches, as described next.

Material Contains Only Text

As with the other four strategies described here, the multimedia principle research that has compared the use of instructor-provided and learner-generated visualizations has produced mixed results. Van Meter and colleagues (Van Meter, 2001; Van Meter et al., 2006) found benefits of generating drawings over studying text with provided illustrations, especially when students were provided strong forms of instructors’ guidance, such as explicitly prompting students to compare their drawings to an instructor-provided illustration. It is also important to note that students in the drawing conditions of these studies spent considerably more time than those who did not draw. In contrast, Leopold et al. (2013) found that creating drawings produced excessive cognitive load, resulting in poorer learning outcomes than studying provided illustrations. Finally, Schmidgall et al. (2019) found no significant difference on immediate or delayed learning outcome measures between learning from provided or generated visuals. Future research could investigate whether different levels of learners’ expertise or instructors’ guidance influence these results.

For example, the results of the recent experiment by Zhang and Fiorella (2019), which combined instructor-managed and learner-managed solutions, could be explained by novice learners benefiting most from generative drawing when they received an instructor-provided visualization as feedback (i.e., high guidance). In the study, university participants studied a text-based lesson on the human circulatory system and had opportunities to both study provided visualizations (instructor-managed solution) and generate their own visualizations (learner-managed solution). Across two consecutive study periods, students either studied a provided illustration with the text twice (provided–provided), generated their own visual from the text twice (generated–generated), studied a provided visualization first and then generated their own visualization (provided–generated), or generated their own visualization first and then studied the provided visualization (generated–provided).

Results indicated that the benefit for transfer test performance was strongest for students in the generated–provided condition. An explanation is that generating a drawing and then studying a provided illustration allows students to maximize the unique benefits of learner-managed and instructor-managed visualizations: first, they are forced to use their limited prior knowledge or expertise to translate the text into their own visual representation; then, they receive feedback by studying an accurate instructor-provided illustration. These findings tend to support the predictions of the expertise reversal effect and the generative learning hypotheses for learners with low expertise, but further research is needed to replicate these results.

Material Presents Texts and Visualizations Separately

Regarding the split-attention effect or spatial contiguity principle, there are also mixed results supporting the instructor-managed or the learner-managed solutions. For example, some studies (Agostinho et al., 2013; de Koning et al., 2020a, b; Gordon et al., 2016) are more supportive of the instructor-managed solution. In these cases, the integrated formats provided by the instructor were more effective than the split-attention designs, but the learner-managed action of moving on-screen text to the picture neither hindered nor benefited learning.

However, two of these studies (de Koning et al., 2020a, b) revealed that a mental learner-managed strategy, in which the participants imagined moving text segments near the visualizations, was as effective as the instructor-managed solution providing an integrated format. Null differences were also observed by Tindall-Ford et al. (2015) in a study with secondary school children studying through computer multimedia. Transfer results showed that the instructor-managed integrated format was as effective as the learner-managed condition in which the students used the computer mouse to move the texts into the pictures. Both conditions outperformed the split-attention format.

Lastly, there is also evidence showing that the learner-managed solution can be more effective. For instance, among university students, using the physical self-managed integration strategy when studying information in a split-attention format yielded higher recall (Sithole et al., 2017) and transfer (Roodenrys et al., 2012; Sithole et al., 2017) performance than studying either a split-attention or an instructor-managed integrated format (Sithole et al., 2017). Furthermore, Roodenrys et al. (2012) showed that students who had been taught the learner-managed strategy for a first learning task spontaneously used the learner-managed strategy on a second set of split-attention instructional materials and outperformed learners who had studied the first learning task in a conventional split-attention format or an instructor-managed integrated format.

Bodemer et al. (2004) reported two experiments with university students learning from split-attention formats versus instructor- and learner-managed integrated multimedia. Experiment 1, which investigated an easy task of learning the functioning of a tire pump, showed that both instructor-integrated and learner-integrated formats were superior to the split-attention format. Although the learner-managed condition showed marginally higher learning scores than the instructor-managed group, the differences were nonsignificant. In contrast, the difference was significant in Experiment 2, which included a more difficult task of learning statistics. Specifically, Experiment 2 showed that the learner-managed group outperformed the other two conditions, instructor-managed and split-attention, which did not differ from each other. Future research controlling the learners’ expertise and instructors’ guidance could help explain these mixed results comparing the instructor- and learner-managed solutions.

Material Contains Redundant or Nonessential Information

Concerning the redundancy effect and coherence principle, there is also a need for future investigations testing the role of learners’ prior knowledge. The only studies we are aware of that compared an instructor-managed condition with a learner-managed condition were reported by Mirza et al. (2020). They conducted three experiments with primary school children studying the water cycle to examine whether they could self-manage the redundancy effect after being instructed on how to remove evidently redundant information. A condition in which children had to study redundant materials (control) was compared to a condition in which children had to study redundancy-free materials (instructor-managed), and a condition in which children had to study redundant materials but were guided on how to remove the redundancy (learner-managed). Redundancy was created by repeating information provided in the diagram as textual information in text boxes. Because the text boxes in the learner-managed condition were provided as paper cut-out sections and attached to a sheet of paper containing the diagram, they could easily be removed by the participants. None of the experiments revealed significant differences between the conditions in learning outcomes. Although the authors claimed that the results suggested that teaching learners to remove redundant information was just as effective as presenting them with instructor-managed materials, more studies are needed to substantiate this claim.

The study by Eitel et al. (2019) could also shed some light on the effectiveness of learner-managed redundancy, but it needs to be noted that the redundancy effect was confounded with the signaling principle in this study. Eitel and colleagues investigated three groups of adult learners studying a booklet about the steps in the formation of lightning. One condition included nonessential seductive information consisting of texts and pictures. In contrast, there were instructor-managed and learner-managed conditions to deal with this redundant nonessential information. The instructor-managed solution group was provided with booklets including only the essential information for the learning task. The learner-managed condition was informed that a red frame (signaling with added elements) indicated the essential learning information. The results did not clearly favor one solution over the other, as both were better than the control condition with the redundant unsignaled information.

Also, these findings showed that a learner-managed solution to avoid problematic redundancy could be achieved not only by generating more concise materials (e.g., written summaries or drawings) but also by reading only the essential information (see also Eitel, Bender, & Renkl, 2020a). Nevertheless, because the frame signaling the essential information was only provided in the learner-managed condition, it cannot be ruled out that these effects were attributable to signaling rather than to avoiding redundancy. Signaling examples are provided next.

Material Does Not Emphasize Essential Information

The signaling principle has also provided mixed results of higher learning associated with either learner- or instructor-managed solutions. Fowler and Barker (1974) compared conditions in which students were: (a) asked to read scientific texts while they had to highlight relevant information, (b) provided the texts in which relevant information was highlighted, or (c) provided a text without these visual cues. Results of retention scores showed that students had a better memory for highlighted information. Crucially, scores were higher for students who had to highlight the text themselves, compared to students given the highlighted texts. These findings and similar results reported by Rickards and August (1975) support the learner-managed solutions.

In contrast, there are findings more supportive of instructor-managed solutions. For example, in an experiment with psychology undergraduates, Colliot and Jamet (2018) compared three groups of students learning about the human memory systems through different formats of multimedia: (a) control condition, presented only the texts plus the illustration of the multimedia; (b) the multimedia plus an instructor-managed outline that signaled the topics and subtopics of the texts; and (c) the multimedia plus a learner-managed system to generate an outline signaling the topics and subtopics. Results of the transfer test showed that the highest scores were obtained by the group of students learning with the instructor-managed outlines, followed by the learner-managed condition, followed by the control group.

Similarly, in three experiments, Stull and Mayer (2007) investigated university students learning from textual passages about reproductive barriers between species. The experiments compared the effect of supplementing the texts with instructor-managed or learner-managed graphic organizers signaling the textual relationships. The three experiments, which differed in the complexity of the graphic organizers to study or generate, revealed that participants in the instructor-managed conditions outperformed those in the learner-managed conditions on the transfer tests. As with the other strategies, the signaling principle warrants further investigation, in order to test the relative effectiveness of instructor- and learner-managed solutions.

Material Shows Too Much Transient Visual Information

Again, as with the other strategies, the transient information effect and segmenting principle have not been investigated with different levels of learner expertise. Most of the research has focused on learning scenarios with low learners’ expertise, where the predictions of the expertise reversal effect and the generative learning theory support the instructor-managed solution (segmenting) over the learner-managed solution (pace-control).

As stated, the key difference between the segmenting and pacing control strategies is the agent who segments the dynamic visualization (see Spanjers et al., 2010; see also Merkt et al., 2018). Segmenting is the responsibility of an instructor or expert who chooses where to add pauses to foster a meaningful presentation of the contents (see Spanjers et al., 2010). In contrast, pacing control is usually the responsibility of a novice learner who could add pauses in improper places and thus break the idea units that allow a meaningful presentation (cf. Boltz, 1992). This is the rationale for predicting that segmenting (instructor-managed solution) should be more effective than pacing control (learner-managed solution) for novice learners without much guidance (feedback and scaffolding) from the instructors.

The recent meta-analysis by Rey et al. (2019) provides supporting evidence for this prediction. Specifically, segmenting showed large effect sizes (Kraft, 2020) for both the transfer (d = 0.35) and retention (d = 0.42) tests, whereas the pacing control strategy showed a significant large effect only for transfer (d = 0.45) and a nonsignificant effect for retention (d = 0.19). Further research is needed to test the transient information effect and the segmenting principle for different levels of learners’ expertise.
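
For readers less familiar with the metric, the d values reported above are standardized mean differences. The following is a minimal sketch of the conventional textbook definition (Cohen's d with a pooled standard deviation). It is an assumption that the values follow this definition, as the meta-analysis may have used a bias-corrected or otherwise adjusted variant. The symbols M, SD, and n (group mean, standard deviation, and sample size) and the subscripts are introduced here for illustration only.

```latex
d = \frac{M_{\text{treatment}} - M_{\text{control}}}{SD_{\text{pooled}}},
\qquad
SD_{\text{pooled}} =
\sqrt{\frac{(n_{\text{treatment}} - 1)\,SD_{\text{treatment}}^{2}
          + (n_{\text{control}} - 1)\,SD_{\text{control}}^{2}}
         {n_{\text{treatment}} + n_{\text{control}} - 2}}
```

Under this reading, a value of d = 0.42 for retention would indicate that the segmenting groups scored, on average, 0.42 pooled standard deviations above the comparison groups.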

Discussion

Research under the frameworks of cognitive load theory and the cognitive theory of multimedia learning has provided several strategies to optimize instructional materials. In this review article, we described five of these strategies, which have usually been recommended for instructors, teachers, and designers. Here, we also provided recommendations for students who can self-manage these strategies and self-regulate their learning. The first strategy, known as the multimedia principle, advises instructors to include visualizations supplementing texts, and recommends that learners generate (partially or completely) their own drawings or visualizations from textual materials. Second, the split-attention effect or spatial contiguity principle advises instructors to present texts integrated with visualizations (or contiguous with them), and suggests that learners move, trace, or imagine moving texts into visuals. The third strategy, the redundancy effect (similar to the coherence principle), advises instructors to remove nonessential information, and suggests that learners generate textual or visual summaries. Fourth, the signaling principle advises instructors to include signals cueing the essential information and recommends that learners underline or highlight this information. Last, the transient information effect or segmenting principle advises instructors to segment animations or videos and recommends that learners control the pace of these dynamic visualizations.

As the research about learner-managed or self-regulated cognitive load is relatively recent (see de Bruin et al., 2020), there is not a corpus of results consistently supporting the instructor-managed or the learner-managed solutions. Currently, most evidence is mixed, either advocating for the instructor solutions or the learner solutions. There are also many studies not showing significant differences. Furthermore, some learner-managed solutions are easier to implement than others, as Mirza et al. (2020) reported for materials containing redundant information.

However, cognitive load theory can help predict these effects. Specifically, the expertise reversal effect suggests that the instructor-managed solutions, which are provided by instructors who are experts in their field, would be more effective for learners with low expertise (novices). In contrast, the learner-managed solutions would tend to be more effective for more expert students and/or when more guidance in how to use learner-managed solutions is provided by the instructors (e.g., feedback and scaffolding). Future research is needed that considers both instructor- and learner-managed solutions plus levels of learners’ expertise.

Instructional Implications

A first instructional implication of the current review is to follow the five strategies of cognitive load theory and the cognitive theory of multimedia learning described here. In other words, efforts should be made to provide students with materials that (a) contain both text and visualizations, (b) present texts and visualizations contiguously or integrated, (c) strive to contain only essential learning information, (d) emphasize essential learning information, and (e) do not show too much transient visual information.

A second instructional implication concerns which agent, the instructor or the learner, should execute these strategies. Instructor-managed solutions could be more effective for novice students, whereas learner-managed solutions may be more effective for knowledgeable students. As the level of students’ expertise determines whether instructor- or learner-managed solutions are preferable, an implication is that instructors should continuously assess the level of expertise of their students, in order to gauge whether the design solutions should be pursued by the instructors or the learners.

Limitations and Future Directions

One limitation of this review is that we considered the cognitive load theory evidence for groups of students, without consideration of their individual characteristics (besides expertise levels). Future research on instructor- and learner-managed solutions could not only consider learners’ expertise as an important individual characteristic, but also assess the moderating effects of other learner characteristics, such as gender (e.g., Castro-Alonso et al., 2019b; Heo & Toomey, 2020; Wong et al., 2018), visuospatial processing (see Allen et al., 2019; Buckley et al., 2018; Castro-Alonso, 2019), mental effort (e.g., van Gog et al., 2020), and motivation (e.g., Eitel, Endres, et al., 2020).

A second limitation is that we focused on visualizations and texts, although cognitive load theory can be applied to different modalities. Future research could investigate the effects of instructor- and learner-managed solutions when the learning materials involve different modalities, including visual, verbal, and haptic (see Baddeley, 2012; see also Sepp et al., 2019).

A third and last limitation is that we did not focus on cognitive load measures, but on learning and performance scores. As different techniques to measure cognitive load have been developed, such as physiological or objective measures (see Castro-Alonso & de Koning, 2020; Charles & Nixon, 2019), a future direction of research is to include these measures when investigating the instructor- and learner-managed solutions reviewed here.

Conclusion

Researchers of cognitive load theory and the cognitive theory of multimedia learning have provided several strategies to optimize instructional materials and multimedia. Usually these strategies have been recommended to teachers and instructors who want to produce more effective learning in their students. Here, we describe how these strategies can also be pursued by learners who want to self-manage their learning process. As predicted by the expertise reversal effect and generative learning theory, novice students would tend to benefit more from instructor-managed strategies, whereas expert students may benefit more from their own learner-managed strategies.