Introduction

Practical elements are important parts of educational practice. In many instances, learners need to experience activities, mimic movements, or perform actions in order to successfully reach a learning objective. Learning activities can range from mental tasks such as making predictions, through using body movement to grasp abstract concepts in a concrete manner, all the way to creating drawings and other products. A long-standing controversy within the fields of psychology and education is whether and how activity-based forms of learning (also commonly known as “learning by doing”) help learners. Learning by doing is typically framed as the polar opposite of learning facts in an entirely non-practical manner devoid of applications of this knowledge (Reese, 2011). Crucially, despite the many advantages of activities for learning, there are several findings suggesting that activity-based learning can easily become demanding for learners, leading to worse learning outcomes (e.g., Ploetzner & Fillisch, 2017; Stull & Mayer, 2007). Based on the contradictory and often confusing result patterns of this research area, this paper presents an in-depth analysis of the cognitive processes involved in activity-based learning and of how to avoid the pitfalls of this instructional strategy.

In this review, the terms “activity-based learning” and applications of the “learning-by-doing principle” will be used interchangeably. Different forms of activity-based learning will be discussed as having more or less strongly developed activity-related demands and prerequisites. The analysis of activity-based learning will be focused on exemplary motor-based activities, complex generative learning activities, and mental activities. These facets of activity-based learning differ in their demands regarding movement and physical activity, mental effort, and task-related knowledge. Although in educational practice, “learning by doing” is often understood as involving physical actions rather than mental activities (Reese, 2011), cognitive activities such as making predictions will be discussed as a form of activity-based learning without a physical component.

Movement-related instructional interventions are usually based on the notion of embodied cognition, a perspective that emphasizes that cognitive processes are highly dependent upon sensory perception, bodily experience, and movement-related knowledge (for an overview, see Wilson, 2002). The application of these cognitive factors to learning is referred to as embodied learning (Kontra et al., 2012; Skulmowski & Rey, 2018). Such interventions encourage learners to trace using their fingers (e.g., Agostinho et al., 2015), to point their fingers (e.g., Zhang et al., 2023b), to enact (Lindgren et al., 2016), and to use gestures (Yohannan et al., 2022) or full-body movement (Johnson-Glenberg et al., 2014). Thus, embodied learning invokes the motor system with the aim of making content easier to understand (e.g., Shvarts & van Helden, 2023). The activity-related knowledge required in order to complete these types of activities is usually rather small, in particular when the body is used only as a means to more effectively understand certain information using simple motor actions such as pointing or tracing.

While movement-centered activities emphasize bodily activity and usually do not have high prerequisites, more complex activities can require learners to possess a certain amount of task-related knowledge (or practice). Currently, forms of activity-based learning known as generative learning are of particular interest in the literature (Fiorella, 2023; Fiorella & Mayer, 2016). Generative learning can range from mental activities, such as making predictions (Brod, 2021), to more complex forms, such as creating drawings (Wu & Rau, 2019; Zhang & Fiorella, 2021) or teaching (Fiorella & Mayer, 2013). Generative learning tasks risk introducing their own cognitive demands, thereby leading to an undue focus on the activity (the “doing”) instead of the actual learning task. These tasks add a component of task-related knowledge, such as knowledge of how to generate drawings during learning or of how to teach others in order to learn. These activities require considerable task-related prerequisites, such as the ability to draw shapes in perspective or knowledge regarding teaching and instruction. Unlike activities relying on easily performed motor actions such as re-enacting movements, complex generative learning tasks may distract learners from the actual learning goal through their complexity, which may explain some of the negative results found for them.

In this review, different forms of activity-based learning are analyzed using the framework of cognitive load theory (Sweller et al., 1998, 2019) and Geary’s (2008) distinction between evolved and acquired knowledge. As examples of generative learning, learning by drawing and learning by teaching are discussed in detail. Both instructional strategies have in common that they include an (at least partially) extraneous activity component that is hoped to enhance or facilitate the rest of the learning process. Thus, both risk triggering “doing without learning.” The analyses of these two forms of activity-based learning can be transferred to other types of learning by doing. In addition to a cognitive load analysis and discussion of several forms of activity-based learning, this review contains a discussion of often-overlooked inter-individual differences relevant to these activities that may impact learning. Finally, recommendations for the integration of activity-based learning in educational practice are given. In the following section, cognitive load theory is presented as the basis for the cognitive analysis of learning by doing.

Cognitive Load Theory

In order to analyze the cognitive effects of activity-based learning in depth, the theoretical framework of cognitive load theory will be used in this paper. Cognitive load theory is built upon memory models that assume a limited working memory capacity (Sweller et al., 1998, citing Baddeley, 1992). Learners can only keep a certain number of items (referred to as “elements” within cognitive load theory, Sweller et al., 1998) in their working memory before becoming cognitively overwhelmed. As a result, one way of optimizing learning is to ensure that only relevant content is being processed in working memory during a learning task (Sweller et al., 1998). In order to achieve this aim, cognitive load theory recommends reducing unnecessary cognitive load as much as possible so that learners’ cognitive systems have the opportunity to focus on the relevant cognitive load of a task (Sweller et al., 2019). In this theory, extraneous cognitive load is the label for all the cognitive demands that may distract learners from, or hinder them in, engaging with the actual content to be learned (i.e., intrinsic cognitive load). Thus, if an instructional design lets learners engage with relevant content without distracting them, reducing extraneous cognitive load maximizes the working memory capacity available for intrinsic cognitive load.

In previous research, numerous design factors have been found to contribute towards extraneous cognitive load, such as placing related elements at a (spatial or temporal) distance (Chandler & Sweller, 1992). Thus, optimizing visual design by placing related elements in closer proximity has been found to facilitate learning (Chandler & Sweller, 1992). But beyond this and other design factors primarily relevant for the design of texts and visualizations, instructional activities have also been shown to introduce their own forms of extraneous cognitive load (see Kalyuga & Singh, 2016). Based on the understanding of the cognitive system during learning tasks gained from cognitive load theory, the following section describes which types of actions generate different levels of cognitive load.

Biologically Primary and Secondary Knowledge

The effort required for learning can be estimated based on the type of knowledge involved (Paas & Sweller, 2012). Geary (2008) describes two types of knowledge. On the one hand, learners are assumed to possess biologically primary knowledge, which is thought not to require deliberate acquisition but rather to exist as innate (or at least easily grasped) capacities (Paas & Sweller, 2012). The ability to speak is one example (Paas & Sweller, 2012, citing Kuhl, 2000). On the other hand, beyond such rather basic abilities, most knowledge needs to be acquired with deliberate effort (Paas & Sweller, 2012). Learning a foreign language, complex mathematical procedures, or biological knowledge could be considered as candidates for biologically secondary knowledge (Paas & Sweller, 2012). Utilizing biologically secondary knowledge usually generates cognitive load, while making use of biologically primary knowledge generally does not (Paas & Sweller, 2012).

This distinction between biologically primary and biologically secondary knowledge has been applied to instructional settings in several studies. It has been found that instructional design built upon the distinction between biologically primary and biologically secondary knowledge can be effective in optimizing learning, potentially due to a lower level of cognitive load involved. For instance, easily performed biologically primary motor actions such as finger tracing can provide cognitive benefits (such as linking visually displaced information) without causing cognitive load (e.g., Agostinho et al., 2015; Korbach et al., 2020).

It is not the focus of the current review to discuss whether the utilization of biologically primary knowledge indeed causes little cognitive load exclusively due to evolutionary factors. Nevertheless, this theoretical distinction lets us distinguish between rather basic, automated activity components, such as finger tracing, speaking, and arranging physical items, and more elaborate activities, such as browsing the Internet or creating a digital presentation. The latter activities require a multitude of smaller activity components and deliberately acquired knowledge, such as the knowledge on how to use a computer, digital presentation software, and the wide variety of options within it. These types of activities demand a higher amount of cognitive resources, occupying learners’ working memory capacity. The following section summarizes the different components generally involved in activities and enables a more precise analysis of different learning-related activities.

Layers of Action Planning and Performance

Earlier work on cognitive load theory utilized fine-grained analyses of different cognitive processes involved in problem-solving activities (Sweller, 1988). Sweller (1988) described how novice learners working on problem-solving tasks tend to invest more unnecessary cognitive effort by using means-end analyses (i.e., working backwards from the final goal of a task and setting up a number of intermediate goals). Accordingly, this less effective way of completing a task is so demanding that learning is inhibited (Sweller, 1988). In the current review, the design of activity-based learning will be analyzed by listing the task demands introduced by activities and whether the (components of) learning activities become so demanding that the primary task of learning is suppressed by a secondary learning task.

In order to discuss in which phases of action planning and performance cognitive load may arise, it is necessary to define the different layers of the cognitive system involved in actions. An influential phenomenological model of action was presented by Pacherie (2008). In this model, different intentions involved in the planning and performance of actions are the main focus. Pacherie describes distal intentions (D-intentions), proximal intentions (P-intentions), and motor intentions (M-intentions). In Pacherie’s model, D-intentions and P-intentions constitute the higher-level intentions that agents form. D-intentions consist of deliberate plans that are aimed at achieving rather global goals (Pacherie, 2008). These intentions are thought to be able to span long periods of time and may involve several sub-actions (Pacherie, 2008). Examples of D-intentions could be the plan to write an essay or to create a drawing. In the model, P-intentions are concerned with practical actions that can be performed in an immediate situational context, such as writing something down or drawing individual objects. P-intentions are realized using low-level M-intentions (Pacherie, 2008). M-intentions consist of motor representations at the neural level, involving biomechanics (Pacherie, 2008). M-intentions fulfill P-intentions and generate state changes in an agent’s environment through motor action and feedback loops (Pacherie, 2008). An example would be the motor patterns involved in generating marks on paper in order to write letters or to draw contour lines of objects.

Pacherie’s (2008) model highlights that actions usually involve deliberate planning, their actualization at an individual level, and the necessary motoric and neural base of these actions. Thus, it is possible to broadly divide actions into their actualization at the neural and motor level (such as performing gestures, tracing, and pointing) and the more global, conscious, and deliberate, goal-oriented combinations of movements involving activity-related task knowledge (such as drawing).

Important for the purpose of analyzing instructional activities is that D-intentions usually have the aim to accomplish a complex activity, composed of several proximal activities, which in turn consist of individual motor activities. Motor activities are usually deeply ingrained in learners, resulting in a low level of cognitive load during their execution. Collections of larger numbers of motor activities arranged in patterns to achieve complex learning activities can be considered to place a higher cognitive burden on learners. These distinctions between learning activities will be presented in detail in the following section.

Types of Learning Activities

Harnessing the potential of activity and movement has been an important aim of cognitive load theory research over the last years (Mavilidi et al., 2022; Paas & Sweller, 2012; Sepp et al., 2019). Based on the theoretical perspectives just discussed, it is possible to distinguish between certain types of approaches to activity-based learning. Movement-based and embodied approaches highlight how motor activities of varying levels of bodily engagement affect learning (Johnson-Glenberg et al., 2014; Mavilidi et al., 2022; Sepp et al., 2019; Skulmowski & Rey, 2018). Regardless of the extent of bodily engagement, movement-based learning can be characterized as using relatively simple movement patterns within learning tasks (for an overview, see Skulmowski & Rey, 2018), in contrast to complex learning activities that require certain prerequisites before they can be performed. Among these more complex activities, generative learning (Fiorella, 2023) plays a particularly important role, as this type of learning lets learners create external representations, such as drawings (Zhang & Fiorella, 2021), or involves them in more engaging forms of learning, such as learning by teaching (Fiorella & Mayer, 2013). Both motor-based and generative learning activities will be discussed in detail in the following sections.

Motor Activity–Based Learning

The types of interventions categorized under the label motor activity–based learning have the goal of using the motor system to (1) learn procedures and (motor) skills (e.g., Rabattu et al., 2023; for an overview, see Mavilidi et al., 2018) and (2) enact, understand, or mentally connect otherwise abstract content (e.g., Amico & Schaefer, 2021; Cherdieu et al., 2017; Damsgaard et al., 2022; Parrill et al., 2023; Smyrnis et al., 2022; Yohannan et al., 2022; Zhang et al., 2021). Often, embodied learning interventions are designed to convey declarative knowledge, but they can also be targeted at motor learning. The motor activity utilized in these interventions can range from finger tracing (e.g., Agostinho et al., 2015) to whole-body movements (e.g., Johnson-Glenberg et al., 2014). As examples of the former type of intervention, several studies support the claim that finger tracing can help learners to understand content while being such a simple, biologically primary motor activity that it does not generate cognitive load (Agostinho et al., 2015). On the other end of the spectrum, studies on the effects of whole-body movements do not always find benefits when compared to the utilization of less bodily activity (for an overview, see Skulmowski & Rey, 2018).

In response to this pattern of findings, several theoretical accounts aimed at explaining and predicting the benefits of motor activity have been developed. One influential approach is the taxonomy of Johnson-Glenberg et al. (2014), in which learning is presented as supported by the three factors motoric engagement, gestural congruency, and immersion, with the assumption that an increase in these dimensions will lead to enhanced learning. Since then, two related alternative classification systems have been proposed. Skulmowski and Rey (2018) simplified Johnson-Glenberg et al.’s (2014) system to the two dimensions bodily engagement and task integration, with bodily engagement being mostly identical to the motoric engagement component in Johnson-Glenberg et al.’s conceptualization. Bodily engagement was found to have a nonlinear influence on learning, as too little of it deprives learners of opportunities to enact or experience certain types of knowledge, while excessive bodily activity was described to be linked to extraneous cognitive load (Skulmowski & Rey, 2018). Skulmowski and Rey’s (2018) dimension of task integration describes how well the actions performed by learners match the content to be learned, similar to Johnson-Glenberg et al.’s (2014) concept of gestural congruency.

The relationship between movements and tasks has been explored in more detail by Mavilidi et al. (2022) and their 2 × 2 system with the dimensions integration and relevance. They define integration as the temporal connection between movements and tasks and present relevance as the correspondence between movements and learning on the content level (the latter definition based on Mavilidi et al., 2018). The meta-analysis by Mavilidi et al. (2022) revealed that the largest positive effects on memory were achieved using the combination of a high integration and a high relevance. Judging from these reviews and meta-analyses, the aspect of providing learners with opportunities for performing relevant motor activities appears to be a crucial factor for fostering learning using movement. The question of how much bodily activity should be utilized seems to be more complex.

The effectiveness of movement-based learning has been found to be dependent upon a number of aspects derived from the theories discussed above. In recent years, research on movement-based learning increasingly relies on the notion of cognitive costs and benefits (e.g., Amico & Schaefer, 2021; Lui et al., 2020; Skulmowski et al., 2016). In this perspective, performing movements is thought to generate a certain level of (extraneous) cognitive load depending on their complexity (e.g., Skulmowski et al., 2016). If the cognitive benefits that can be gained from performing movements do not exceed these costs, movement-based learning is considered detrimental (Amico & Schaefer, 2021; Skulmowski et al., 2016). In other words, in an unsuccessful case of movement-based learning, learners need to generate complex motor intentions (M-intentions) or even learn how to perform certain motor activities in a deliberate and conscious manner, thereby requiring them to build additional biologically secondary knowledge and to manage this additional cognitive load. Thus, they may end up “doing without learning.”

In sum, judging from the available research, movement-based learning can be considered promising if certain conditions are met. A number of studies have shown that movement-based learning has the lowest risk of cognitive costs if the motor activity involved in learning avoids cognitive load, for instance by making use of biologically primary knowledge, such as tracing (Agostinho et al., 2015) and pointing (Zhang et al., 2023b) behaviors. As the performance of motor activity encoded as biologically primary knowledge does not generate cognitive load (Paas & Sweller, 2012), these types of activities can be considered to pose only a small risk of cognitive overload. More extensive use of the body needs to have a clear relevance for the task while not causing substantial extraneous cognitive load. Requiring additional biologically secondary knowledge to be acquired at the same time would be one example of a risky combination. Beyond motor activities, other instructional approaches engage learners in even more elaborate tasks with a greater amount of task-related knowledge. These types of activities will be discussed in the following section.

Complex and Generative Activity–Based Learning

In contrast to motor activity–based learning, which is primarily concerned with utilizing learners’ motor system, more complex forms of activity-based learning challenge learners to engage in meaningful tasks that often have the aim of generating tangible products. These tasks typically consist of a number of different operations. Thus, learners do not only need to (unconsciously) form motor intentions and perform movements (as in many embodied learning interventions), but rather have to deliberately plan how to accomplish specific, elaborate task goals. Thus, learners need to generate D-intentions, P-intentions, and M-intentions. As already mentioned, these tasks often have a generative nature, meaning that the activities produce tangible or observable results, such as in the case of learning by drawing (for an overview, see Fiorella & Zhang, 2018) and learning by teaching (Fiorella & Mayer, 2013). Similarly to the suboptimal design of motor activity–based learning with excessive and distracting movements (as already discussed above), introducing more complexity into an activity may increase (extraneous) cognitive load, thereby diminishing the working memory capacity learners’ cognitive system can devote to the actual content to be learned. As in the case of motor activity–based learning, complex activity-based learning involving activity-related knowledge is also increasingly being framed in relation to costs and benefits. For instance, Zhang and Fiorella (2021) judge the benefits of creating drawings during learning in relation to the time and effort necessary in generating these drawings. An in-depth discussion of issues of complex activity-based learning is provided in the following sections using example techniques.

Learning by Drawing

A popular instructional approach with a high prima facie plausibility is learning by drawing. Learning by drawing is thought to aid learners in creating a detailed mental model (Fiorella & Zhang, 2018). While it is often assumed that drawing is a skill that virtually all learners have (or are at least able to acquire), the empirical evidence suggests that this form of learning by doing can indeed cognitively overwhelm learners. Some studies have shown that letting learners generate drawings, concept maps, and other representations can lead to worse learning results than providing them with pre-made materials (e.g., Ploetzner & Fillisch, 2017; Stull & Mayer, 2007). However, the literature also contains positive results of having learners generate drawings themselves (for an overview, see Ainsworth & Scheiter, 2021). A review on learning by drawing found that letting learners draw their own external representations as a means to support learning from texts is more effective than solely providing texts, but in many instances not as effective as other strategies such as using pre-made drawings (Fiorella & Zhang, 2018). The level of guidance and drawing training appears to be an important determinant of successful learning by drawing (Fiorella & Zhang, 2018), suggesting that using this strategy without appropriate guidance could be too demanding. As a result, learning by drawing is a good example of the importance of the instructional design of learning activities.

Several distinctions between types of drawings can be made, such as whether they feature abstract or representational content (McCrudden & Rapp, 2017; Fiorella & Zhang, 2018; see Fig. 1). Learning by drawing often focuses on generating representational drawings that bear a resemblance to a physical entity (Fiorella & Zhang, 2018). Generating a representational drawing requires learners to engage with the spatial structure, shape, and other visual features of an object. Creating relational drawings (Hegarty, 2011), such as concept maps, does not necessarily need to involve visual features, but instead prompts learners to focus on relationships, semantic connections, and other more abstract features of physical or theoretical entities.

Fig. 1

The difference between relational drawings and representational drawings. While the relational drawing (as defined by Hegarty, 2011) merely presents the abstract relationship between elements, representational drawings depict a physical entity and thus contain shapes and colors to generate a more or less faithful visual representation of reality (the pictured example of the human heart is loosely based on Gray, 1918, pp. 527, 546). Creating a recognizable representational drawing usually involves more skill than is necessary to draw a relational diagram, thus generating cognitive load during learning by drawing

When analyzed using the distinction between biologically primary and secondary knowledge, it could be argued that generating simple relational drawings involves only a limited amount of biologically secondary knowledge. If learners are able to place the elements they want to map out in a manner that enables them to highlight their relationships, all they need to do (on the motoric level) is to draw circles, rectangles, and connecting lines. These are fairly easy and well-rehearsed activities that should not generate a high level of cognitive load. Thus, creating a relational drawing during learning could be thought to involve a certain level of intrinsic cognitive load (i.e., mentally structuring the information to be presented) and very little extraneous cognitive load stemming from the drawing activity. Relating this analysis to the layers of activities described by Pacherie (2008), learners do not need to form a complex distal intention of creating an elaborate drawing, but merely have the aim to arrange letters and simple types of marks in an understandable way. While this can be cognitively demanding (depending on the content to be presented), summarizing and visually presenting information is a skill that can be trained and is unlikely to evoke a cognitive overload in typical learning situations. The proximal intentions would constitute the plans to place and connect individual elements on a page, while the motor intentions would simply be to write letters and draw certain types of lines. All of these activity components should result in a relatively low level of cognitive load after sufficient training.

Unfortunately, the same cannot be said about creating representational drawings. Achieving a drawing that faithfully represents a physical entity places a number of burdens on learners. They need the manual dexterity to perform precise movements with their fingers and hands as well as the ability to draw straight lines and complex shapes without excessive distortions. Furthermore, they need a trained sense of proportion to transfer the spatial relationships between the elements of an object onto their page or digital canvas. In addition, certain types of representational drawings require knowledge of perspective and vanishing points. Lastly, creating a representational drawing can also involve shading and other detailing for which learners need to possess knowledge of how to render shadows on surfaces (for an overview of the aforementioned components of drawing, see Robertson & Bertling, 2013). Although some of these aspects are typically taught as a part of art education, it would be quite optimistic to assume that all students achieve sufficient drawing skills to effortlessly accomplish all these objectives while at the same time trying to memorize complex learning content in a learning task. Analyzed using Pacherie’s (2008) model (see Fig. 2), learners need to generate the D-intention of creating a drawing with a particular composition, perspective, and style, which can be considered cognitively effortful, as these aspects rely on biologically secondary knowledge that needs to be trained. Then, learners must form a series of P-intentions, such as constructing the individual objects they want to draw. While many of the steps can be automated through practice, drawing representational visualizations in perspective still remains an effortful use of biologically secondary knowledge.
Only the motor aspects of drawing individual strokes could be considered as a form of biologically primary knowledge (when taking into account the early emergence of cave paintings, see d'Errico et al., 2016; García-Alonso et al., 2022; see also the discussion of early artistic expression as a form of biologically primary knowledge presented by Tang et al., 2019). Drawing has previously been compared to tracing, a type of biologically primary knowledge (Ginns & King, 2021). However, given that the lines and strokes in representational drawings need to be highly precise and must closely follow the shapes of real objects, it could be argued that the drawing process may introduce a certain level of task demands on the motor level. For instance, constructing lines in correct perspective or applying accurate cross-hatching as a shading technique may add some cognitive load to every stroke. In sum, in contrast to relational drawings, creating representational drawings adds cognitive load by heavily relying on biologically secondary knowledge at most levels of the task. Thus, learning by generating representational drawings could be considered to be an example of a learning-by-doing activity with a high risk for an increased extraneous cognitive load arising from activity-related task demands, again potentially leading to “doing without learning.”

Fig. 2

A summary of the combination of Pacherie’s (2008) model of action planning, Geary’s (2008) distinction between biologically primary and biologically secondary knowledge, and cognitive load theory (Sweller et al., 2019). Complex learning tasks usually contain task-related knowledge, such as the knowledge of how to create a representational drawing. Thus, learning tasks often rely on biologically secondary knowledge that has been trained in its own right. The distal intention of creating a drawing can be broken down into several proximal intentions that incrementally help learners achieve their goal. In the case of drawing, this would be the construction of individual objects in perspective, usually involving shading. Drawing these shapes on paper requires a large number of motor intentions that result in making a number of marks on paper. Such basic motor activities can be considered as making use of biologically primary knowledge and do not generate much cognitive load. As learners become increasingly proficient in creating drawings, the cognitive load of the task-related knowledge needed for the drawing task decreases and learners can focus on the intrinsic cognitive load of the learning task. However, high task demands may require the utilization of biologically secondary knowledge, for instance when detailed and physically correct shading or an accurate construction of shapes in perspective is necessary. In this case, individual strokes may be cognitively demanding, which risks distracting learners from the actual learning task

In fact, the results regarding learning by drawing found in the literature paint an interesting pattern. As summarized by Fiorella and Zhang (2018), letting learners generate drawings generally results in higher learning outcomes than text-based strategies such as reading or writing. The authors of that paper explained this result using the assumption that drawing triggers more thorough forms of mental engagement. However, the same review reports that other instructional methods that go beyond reading and writing can be more effective than learning by drawing, such as simply imagining a phenomenon rather than drawing it (the authors cite Leutner et al., 2009, who also found that imagining reduced cognitive load compared to drawing). Interestingly, drawing was found to be particularly useful for later tests involving drawing (Schleinschock et al., 2017), underlining that more elaborate learning activities need to be matched to the assessment method (Skulmowski & Xu, 2022). Furthermore, a drawing task on tablet computers has resulted in worse learning outcomes regarding text-based information when compared to learning without drawing (Jamet & Michinov, 2022), highlighting that the use of generative learning strategies can transform a learning task and should not be perceived as an “add-on” of minor importance.

In sum, learning by drawing is an example of a method relying on learners’ activity that can turn out to be cognitively overwhelming and may require learners to allocate their cognitive resources towards the completion of a secondary activity, thereby potentially suppressing learners’ focus on the actual content to be learned. Analyzed using the framework of biologically primary and secondary knowledge, the example of learning by drawing from texts could be interpreted as using a biologically secondary form of knowledge (drawing representational visualizations) in order to acquire the biologically secondary knowledge contained in texts. Since the use of biologically secondary knowledge is associated with a greater cognitive load as summarized above, it becomes clear why offering drawing guidance has been found to be beneficial by Fiorella and Zhang (2018). If learners receive sufficient guidance for their drawing task, it is possible that this task primarily relies on motor actions that mainly make use of biologically primary knowledge, such as when drawing simple marks in order to connect different elements in a drawing, rather than having to go through all the mental steps involved in creating a drawing from scratch. Thus, avoiding cognitive overload in a learning activity by relying mainly on biologically primary knowledge (or at least highly automated and well-trained skills) could be a strategy to optimize learning by doing while keeping learners actively engaged in a task instead of using more passive instructional methods. Otherwise, the cognitive costs of drawing may outweigh the cognitive benefits and could constitute an example of a learning activity that inhibits learning due to its complexity. An alternative approach to avoiding cognitive overload would be to restrict the use of learning by drawing to easier and familiar topics (Lowe & Mason, 2017).
This analysis of learning by drawing using a combination of cognitive load theory, Geary’s (2008) distinction between biologically primary and secondary knowledge, and Pacherie’s (2008) model can be transferred to various other generative and creative tasks in order to predict their success in fostering learning.

Learning by Teaching

Another activity-oriented instructional approach is called learning by teaching and, as suggested by the name, aims at enhancing learning by letting learners teach others (e.g., Fiorella & Mayer, 2013). Despite positive findings concerning this method (e.g., Fiorella & Mayer, 2013), the approach also risks that learners invest their cognitive resources in tangential or entirely unrelated mental processes. After all, becoming a professional teacher usually requires a university degree in which effective teaching strategies, among other aspects, are covered and trained. However, it may be argued that core aspects of teaching are based on communicative processes, which could be considered a form of biologically primary knowledge. Calero et al. (2018) even speak of a teaching instinct, highlighting how deeply ingrained teaching is in human nature (see also Strauss & Ziv, 2012). Nevertheless, it may still be valuable to analyze which aspects of teaching can be cognitively taxing.

Using a similar analysis based on Pacherie’s (2008) and Geary’s (2008) work as presented in the preceding section, learning by teaching could be divided into the following components: The distal layer of this task would be the intention to gain knowledge by explaining information to others, requiring at least some insight into training techniques, constituting biologically secondary knowledge. Learners using the learning-by-teaching method may choose to use particular strategies on how to organize, present, and communicate information while keeping several contextual aspects (such as their “students’” prior knowledge) in mind. The use of these techniques and strategies represents an additional task involving deliberate cognitive activity and monitoring processes that first need to be trained, hence constituting biologically secondary knowledge. Based on this analysis, this facet of the learning task is likely to introduce cognitive load, at least in people not used to teaching others. The different actions needed to communicate information to others form the proximal layer of the task. These individual actions, such as verbally explaining information, creating simple organizers on a blackboard, and formulating questions, could be considered to generate only a limited amount of cognitive load (as they present combinations of biologically primary and biologically secondary knowledge). Lastly, the components of these actions on the motor level would be simple movements such as gesturing and pointing. Judging from the example of gesturing, this type of simple movement does not cause cognitive load (Hostetter & Bahl, 2023; Roth & Welzel, 2001) and can even contribute towards learning (e.g., Broaders et al., 2007; Novack et al., 2014).
Thus, in contrast to perspectives emphasizing the instinctive nature of teaching (e.g., Calero et al., 2018), a closer analysis reveals that there is a considerable amount of biologically secondary knowledge involved that poses the risk of creating a secondary teaching task. Such a task has the potential to overshadow the primary learning task. Additional problematic factors involved in learning by teaching will be reviewed in the following sections.

Compared to other activity-based approaches, learning by teaching usually adds another component, namely the social aspect. A recent study highlights that this component needs profound consideration. Wang et al. (2023) compared teaching alone to a camera with two groups involving in-person “students,” either in a one-to-one setting or realized by teaching a group of students. They found that teaching to a camera was more effective than teaching other humans in-person, as the former reduces anxiety and extraneous cognitive load, collectively referred to by the authors of that paper as distraction. As reviewed by Lachner et al. (2022), other studies similarly revealed negative effects of a high social presence during learning by teaching. Thus, although humans have evolved to be communicators, the specific situation of having to communicate material in a highly understandable way that allows them to instruct others may not be the optimal way to study that material compared to settings with smaller extraneous task demands.

As with the method of learning by drawing, a superficial analysis of learning by teaching might overlook many intricacies that may negatively impact learners. Potential examples that could complicate learning by teaching would be learners lacking in social skills (or even having deficits in theory of mind, see Kline, 2015), having a highly introverted personality (see Fontana & Abouserie, 1993), or having a false understanding of teaching methods. Thus, learning by teaching introduces a number of extraneous factors that may be exacerbated by learners’ individual differences. These factors may introduce additional extraneous cognitive load, at times potentially by relying on biologically secondary knowledge (such as knowledge on teaching methods). Thus, while learning by teaching may be an engaging task that invites learners to thoroughly understand the material they want to teach, the multitude of extraneous task demands involved could lead learning by teaching to be considered a secondary task. In line with this reasoning, learning by teaching may be another example of a generative instructional method that puts too much emphasis on the aspect of doing without offering enough opportunity to learn in return.

Mental Activities

As discussed above, some forms of learning by doing can be demanding tasks in their own right, possibly diminishing learning when compared to other types of learning activities. Learning by drawing and learning by teaching are only two examples of this problem. Both of these approaches have in common that they introduce a major secondary task into the learning process (drawing and teaching, respectively). In contrast to learning by drawing and learning by teaching, there are several instructional activities that do not introduce a secondary task into the learning process. These kinds of activities primarily make use of learners’ mental faculties, but go beyond the types of memorization and elaboration techniques typically used by learners. As mentioned in the section on learning by drawing, it can be valuable to mentally engage with learning contents by imagining them (e.g., Leopold et al., 2019; Leutner et al., 2009), which has previously been named the imagination effect (Leopold & Mayer, 2015). Despite the very limited or even nonexistent motor activity found in this kind of learning, it is considered to be a type of generative learning (Fiorella & Mayer, 2016).

Mental activities that profoundly engage learners in learning tasks can make use of biologically primary knowledge combined with emotional responses. For instance, a body of research suggests that predicting the results of scientific phenomena or processes can enhance learning (for an overview, see Brod, 2021), and the surprise element is often emphasized in such studies (Brod et al., 2018, 2022; Theobald & Brod, 2021). Pupillometric variations during learning tasks provide intriguing insights into the effects that epistemic emotions such as curiosity and surprise can have on learning (Theobald et al., 2022).

In summary, there are learning activities that are highly engaging and provide a strong component of meaningful engagement without having to introduce a cognitively costly secondary task relying on the motor system or complex activity routines. The mental activities outlined in this section pose a challenge for accounts of learning by doing that emphasize the motor aspect of learning. Research on mental activities such as imagining or predicting provides evidence for the claim that activity-based learning should not be narrowly focused on introducing motor activity or secondary tasks, but rather on engaging learners while avoiding extraneous cognitive load. Activities that do not fit the stereotype of “learning by doing” due to their omission of motor activity and other more elaborate interactions with the environment highlight that the optimization of learning activities by avoiding demanding secondary tasks may be a promising direction. In addition, recent research revealed evidence regarding the effectiveness of combining minor motor activities such as tracing and mental activities such as imagining (Wang et al., 2022). The results of that study suggest that the combination of tracing for a first set of problems and tracing with closed eyes (i.e., imagining) for the rest of the task can reduce problem-solving times compared to not using tracing and imagining during the acquisition of knowledge using worked examples in the context of mathematics.

Summary

Summarizing the theoretical approaches to complex and generative activity-based learning, which could be considered the prime candidate for what is commonly referred to as “learning by doing,” it becomes apparent that increasing the complexity of learning activities raises the risk of cognitively overwhelming learners. In the case of learning by drawing, it has been found that simplifying the drawing task can have a positive effect (e.g., Fiorella & Zhang, 2018). However, it has also been revealed that using instructor-generated visuals still often outperforms letting learners generate their own drawings (e.g., Ploetzner & Fillisch, 2017). Similarly, as summarized above, learning by teaching can also result in cognitive costs without providing adequate cognitive benefits, in particular due to the social dimension of teaching with the potential to complicate the task. Cognitive load stemming from learners being cognitively overwhelmed by the social situation of teaching other students can be avoided by reducing the number of people that are to be taught by learners, for instance by letting them teach towards a webcam (Wang et al., 2023).

Both of these instructional methods run the risk of introducing a secondary task that relies on biologically secondary knowledge—learners need to consciously think about creating a drawing or have to teach in an understandable manner. Thus, these approaches could be considered as high-risk methods in which (extraneous) cognitive load needs to be avoided in every possible way. Otherwise, more passive methods can be preferable, casting doubt on the universality of the notion of learning by doing. There are activities emphasizing learners’ engagement with content that feature a comparably small risk of cognitively overwhelming learners, for example letting learners generate predictions (e.g., Brod et al., 2018). Although such activities do not cause a physical change in the world and do not even necessarily activate the motor system, the engagement that can be achieved using them should not be underestimated.

In addition, the analysis of learning by drawing and learning by teaching revealed that both activities share a high level of cognitive load in their distal intentions (i.e., their global activity plan) while consisting of relatively simple motor components (e.g., drawing lines and gesturing). The analyses presented in this part of the review suggest that learning using activities may be more effective if the extraneous task demands are removed and the focus is shifted towards the less demanding components, such as gesturing.

Considerations for Digital Activities

The preceding sections primarily dealt with learning in the physical world. However, the turn towards digital enhancements and replacements of physical materials and activities raises several highly important questions regarding the transferability of traditional and analogue activities into the digital space. In this section, issues that may arise from the utilization of digital alternatives are discussed.

Digital activities offer learners possibilities to practice specific movements (e.g., in virtual reality sports training simulations, Harris et al., 2021) or to train entire behavioral routines (e.g., safety procedures in virtual reality, Makransky et al., 2019). Despite positive findings of activity-based learning using such technology, generative activities have been found to be challenging in the digital world and require additional consideration. Of particular interest in the context of the current review are teaching approaches aimed at promoting learning by creating material products, such as concept maps, drawings, or other representations. As outlined above, even when using analogue tools in order to create representations such as drawings, there have been discussions warning that learners may be cognitively overwhelmed in such settings if the contents to be learned are unfamiliar and too complex (Lowe & Mason, 2017). A problem arising in the effort to promote digitalization in education is the common assumption that digital alternatives will have a similar or even equal effect on cognitive, emotional, motivational, and social factors during learning. A considerable number of studies suggest the opposite: digital substitutes often have a distinct impact on these variables that in some cases is not comparable to the analogue version despite their superficial similarities. This can already be shown using very basic activities. Schueler and Wesslein (2022) found that some activities that were previously found to avoid cognitive load by relying on biologically primary knowledge do not have the exact same effects when ported to the digital world. In their study, finger tracing was revealed to be difficult when performed on tablet computers, in contrast to several previous studies affirming the potential of finger tracing.
In addition, a recently published study revealed results concerning how finger pointing differs from pointing behavior using a computer mouse. Zhang et al. (2023a) found that while finger pointing increased retention performance compared to learning without pointing when learning from online material designed in a way to scatter learners’ visual attention, mouse pointing had no positive effect. This result suggests that even very basic actions differ in their effect on cognitive processing depending on whether they are performed in the physical or virtual world.

Another striking example highlighting that digital tools lead to different outcomes is the case of creativity and collaboration. Both in a laboratory study and in field experiments, Brucks and Levav (2022) found that the number of creative ideas generated in a collaborative task is decreased by webcam-based collaboration in contrast to in-person collaboration. The authors of that paper explained this finding using gaze-tracking data that suggests that digital collaboration can lead to an unnatural fixation on the screen, reducing gaze patterns that involve looking around the room instead of “staring” at the collaboration partner on-screen. Thus, the use of digital technology always needs to be analyzed concerning the differences to analogue activities, and even small differences can have a strong impact.

An important aspect to consider when turning to digital versions of real-world activities concerns the simplification of activities and movement. Educational activities in the physical world allow a rich set of actions and perceptions to occur that are often essential for the task (see Cooper & Tisdell, 2020). Digital learning activities often use a very limited selection of interaction possibilities, such as mouse clicks or controller interactions that usually lack haptic feedback and freedom of movement. Virtual reality (VR) and augmented reality (AR) offer the potential to remedy these problems by providing a wide range of motion and realistic motion-based input (for an overview, see Lindgren & Johnson-Glenberg, 2013). In tasks with a focus on transfer and procedural learning, the benefits of these forms of technology become evident. For instance, in a study by Makransky et al. (2019), VR improved participants’ performance in movement-based transfer tests in a safety training setting compared to reading about safety procedures. In that study, VR resulted in positive effects on more tests than the “desktop VR” version using a regular computer and screen. A study on learning in VR by Johnson‐Glenberg et al. (2021) revealed that an inappropriate use of movement can harm learning. Their study indicates that a passive design can be appropriate for the use on desktop computers, but learning in VR without having the possibility to interact with the environment is detrimental. Therefore, different technologies require their own design for optimal results.

Biologically Secondary Knowledge in Digital Learning

As in the case of traditional instruction, the distinction between biologically primary and secondary knowledge can also be highly useful when analyzing why some educational approaches result in less effective learning when transferred to digital tools and environments. It becomes apparent that digital implementations of learning by doing are at risk of cognitively overwhelming and distracting learners. While the creation of a drawing, a poster, or other types of external representations can rely on a number of rather simple “action units,” such as making marks on paper, spatially organizing materials, and other basic actions that do not induce excessive cognitive load, achieving the same results using digital tools usually relies almost exclusively on biologically secondary knowledge. For instance, creating a digital drawing or poster requires users in most design software to input settings for a blank canvas, to make numerous choices in the presentation (such as the fonts, colors, and formatting), and to draw shapes using a computer mouse. All of these steps risk being more demanding and distracting for learners than using their biologically primary knowledge (together with a limited amount of biologically secondary knowledge) in order to create materials during the learning process (see Fig. 3).

Fig. 3

A visual representation of different amounts of cognitive load involved in physical learning activities and their digital counterparts. Creating a poster uses a number of capabilities that are usually highly practiced and automated, such as physically arranging items to be glued on the poster or writing down words. Learners generating a poster thus mainly have to be concerned with the overall design of the poster and can keep their cognitive resources reserved for engaging with the content. While creating a digital presentation or poster, however, learners not only deliberately put effort into the design, but also need to navigate the user interface, which gives them a large number of options. As engaging with the user interface and considering the variety of formatting options may not be automated, these biologically secondary activities generate their own extraneous cognitive load.

For novice software users, this becomes an even more challenging task as they need to pick up the skills to use the software as they are trying to learn the content that they are visualizing or presenting using that software. Although expert users of many software packages claim that they do not need to consciously think about most basic actions (that are thought to have entered their “muscle memory”), it is debatable whether they go through similar steps of action planning and execution using digital tools as less experienced users, only somewhat faster. As a result, educators should not assume that transferring a learning activity from the physical to the digital world will result in similar (or even identical) cognitive processes and demands. These concerns for unnecessary secondary tasks can be transferred to various other forms of activity-based learning, such as serious games that first require a certain investment in learning the controls, mechanics, and narrative of an educational computer game (Skulmowski & Xu, 2022).

Summary

In summary, approaches relying on learning by doing should be carefully evaluated when transferred to the digital space. As many tasks involving digital tools involve more extensive planning and rely on memorized steps in their execution, these digital activities may be challenging for novices and young learners. Furthermore, even simple activities such as finger tracing can cause more cognitive load when performed using digital devices instead of physical entities (e.g., Zhang et al., 2023a), highlighting the risk of cognitive overload. It is therefore important to carefully consider how and when to transfer physical and generative activities to the virtual realm. While virtual environments feature several unique possibilities, the cognitive load that can be generated through their use should be considered. The danger of a cognitive overload could be mitigated by choosing user interfaces that avoid introducing biologically secondary knowledge (for instance by relying on intuitive interface implementations focusing on touchscreen activities and gestures) and using artificial intelligence to externalize demanding aspects of complex activities (e.g., Wahn et al., 2023; for an overview of digital externalization, see Skulmowski, 2023) to leave more cognitive resources for the actual learning process.

Implications and Recommendations

Activities are an essential part of learning and instruction. However, the term “learning by doing” is often used in such overly general and haphazard ways that (aspiring) educators may gain the impression that virtually all learning tasks can be enhanced or made more palatable by adding some practical element. Importantly, it is usually not feasible to facilitate learning using activities that are themselves cognitively demanding. In such cases, the “doing” can replace the “learning” by failing to leave sufficient cognitive resources for learners to memorize content.

The reviewed research indicates that it is necessary to distinguish between various forms of learning activities, namely more immediate motor-focused activities and complex, generative learning activities. In the case of activities targeted at utilizing the motor system, it is essential to avoid generating cognitive load as a result of overwhelming learners with too much (and potentially task-irrelevant) movement. However, for many procedural learning tasks, movement is an indispensable component that can optimally convey the information to be learned. In such instances, the cognitive load introduced by the activity could be considered as an essential part of the learning task, thereby transforming the otherwise extraneous cognitive load into intrinsic cognitive load (see Skulmowski & Xu, 2022). As a result, an increase of cognitive load that is related to the learning objective may even be desirable (Skulmowski & Xu, 2022) as long as the combined cognitive load does not cognitively overwhelm learners.

In the case of complex activities, which include approaches such as generative learning, it is advisable to assess whether these activities introduce a secondary task that takes up learners’ cognitive capacity without an appropriate return on this cognitive investment. For instance, learners do not automatically gain a better understanding of something they had to draw during learning compared to working with provided visualizations. In fact, drawing may be considered as a cognitively demanding task as soon as complex shapes, perspective, and shading become necessary. Similarly, learning by teaching may foster specific thought processes (Marno et al., 2021; Pi et al., 2021), but the social component may be too demanding for some learners.

Using the proposed analysis system in which cognitive load theory, Geary’s (2008) distinction between biologically primary and biologically secondary knowledge, and Pacherie’s (2008) phenomenological model of action layers are combined, educators can predict how well an activity will help learners achieve their learning objectives. By listing whether the action components on the distal, proximal, and motor levels are cognitively demanding, educators have a starting point for their assessment of whether an activity will likely be too complex. In this case, educators may choose to simplify the task by adding guidance, replacing demanding aspects of the task, or substituting the use of biologically secondary knowledge with biologically primary knowledge.

However, if a complex task such as learning to draw or learning to teach is part of the learning objective, the increase in cognitive load due to the activity would still be in alignment with the instructional aims. Skulmowski and Xu (2022) describe cognitive load alignment as a strategy based on cognitive load theory that is useful for contexts involving a high level of extraneous cognitive load, in particular stemming from various aspects of digital learning. In this approach, a higher cognitive demand of an activity, such as dealing with complex interaction patterns, can be regarded as ultimately fostering learning if these interactions are a necessary component of the learning task. Applied to the present problem of learning by doing, it may be advisable to limit learning activities to learning tasks that can be instructionally aligned with these activities. For example, learning how to code is already a highly demanding learning activity that is unlikely to be enriched by adding the component of learning by teaching. However, for learning content that features a communicative component, such as language learning, an activity such as learning by teaching may be well-suited. While these two examples remain speculative until they are investigated empirically, the strategy of cognitive load alignment may provide a starting point for the cognitive analysis of tasks. Framed using concepts from research on embodied learning, this approach could be linked to the dimensions of integration and relevance presented by Mavilidi et al. (2022). Applied to the broader field of learning by doing, learning activities should be well-integrated into the learning task and need to have relevance for the learning objective.

Educators are asked to closely consider the trade-offs found in different learning activities. The preceding analysis suggests that if content is relatively easy and perhaps not very engaging, introducing a learning activity may be a helpful strategy, particularly when the activity is in alignment with the objective. If a learning task is already quite demanding, it may be possible to facilitate learning by using activities that introduce little additional cognitive load, such as imagining or predicting.

Future research needs to identify (components of) learning activities that can interfere with an otherwise engaging learning activity. These findings could be used to formulate more specific guidelines on how to design learning by doing. While the cognitive demands of two popular generative learning activities, learning by drawing and learning by teaching, and two less taxing, regularly used mental activities are analyzed in detail in this paper, the general pattern of this analysis can be applied to many other forms of (activity-based) instruction.

Contextual factors of the learning task may require additional considerations. For instance, the effectiveness of learning by drawing may depend on learners’ prior knowledge (Lin et al., 2017). As Lin et al. (2017) found that drawing only fostered learning for learners with low prior knowledge, they theorize that drawing may be an unnecessary addition for learners with high prior knowledge.

An important implication of the analysis presented in this paper is that teacher education, in particular regarding digital learning, needs to include a balanced view of activity-based learning and the use of digital tools. Aspiring teachers need to be made aware of the various cognitive demands that “learning by doing” can involve. Most importantly, teachers need to be cautious when combining content with learning activities. Without a consideration of the intrinsic cognitive load of learning content and the extraneous cognitive load that can be introduced through learning activities, these two types of cognitive load can easily add up to cognitively overwhelm learners. Thus, the cognitive analysis of learning tasks according to cognitive load theory and related principles should be practiced by pre-service teachers and other educators.

Conclusion

In conclusion, some ill-designed interventions based on the learning-by-doing principle may be comparable to opening up a restaurant while merely trying to learn how to cook. They can cognitively overwhelm learners with a considerable number of unrelated tasks and thereby may prevent learners from focusing on the intrinsic cognitive load of a learning assignment. Thus, it is important that educators strip away superfluous and distracting elements of activities in order to keep learners from being burdened with irrelevant secondary tasks. However, less demanding activities and activities that are well-aligned with the learning objective remain important components of instruction and offer great potential for learning.