The spacing effect was described by Ebbinghaus (1885) in what may be the first effect demonstrated by the then-new discipline of experimental psychology. It occurs when learning episodes that are spaced with gaps between them result in superior learning compared to the same episodes presented in massed form without spacing. Interleaving is sometimes described as an example of spacing (e.g., Taylor & Rohrer, 2010), because the learning episodes relevant to one skill are spaced by interleaved learning episodes relevant to a different skill. In this paper, we argue that the spacing and interleaving effects have distinct causes. Spaced practice has been explained by a variety of constructs, including the cognitive load related effect of working memory resource depletion, whereas interleaved practice is explained by the discriminative-contrast hypothesis, which provides an example of a learning-to-perceptually-discriminate effect.

We suggest two different hypothesised processes underlie these two effects:

  1. In spaced practice, periods of mental activity are alternated with periods of mental rest-from-deliberate-learning. Mental rest-from-deliberate-learning offers the opportunity to restore depleted working memory resources caused by mental effort during learning.

  2. In interleaved practice, periods of practicing one skill are alternated with periods of practicing a different skill. Interleaving does not involve mental rest-from-deliberate-learning but rather assists learners to discriminate between similar concepts or topics of a domain.

A detailed justification of these two hypotheses is provided next.

Cognitive Load Theory and the Spacing Effect

Cognitive load theory (CLT) is an instructional theory based on human cognitive architecture and evolutionary psychology (Sweller et al., 2011; Sweller et al., 2019). It can be described by five fundamental principles (Sweller & Sweller, 2006).

We acquire information using either the borrowing and re-organising principle under which information is obtained from other people, frequently teachers, or if we cannot obtain information from others, we randomly generate new information and test it for effectiveness using the randomness as genesis principle during problem solving. That information then is processed in a limited capacity and duration working memory by the narrow limits of change principle before being stored for indefinite periods of time in our large, long-term memory using the information store principle. Based on signals from the external environment, indefinite amounts of that stored information then can be transferred back to working memory to govern appropriate action using the environmental organizing and linking principle. These five principles serve as the cognitive base for instructional design. They have been used to predict that when dealing with complex, high element interactivity information that imposes a heavy working memory load, a reduction in element interactivity will enhance learning (Sweller, 2010).

Working memory resource depletion occurs when a task requires a heavy cognitive effort that depletes the limited working memory resources indicated by the narrow limits of change principle, so depressing performance on an immediately subsequent task. Substantial evidence for working memory resource depletion using multiple experiments was provided by Schmeichel (2007). There also is some evidence that working memory resource depletion may be more likely when two tasks share common cognitive components (Healey et al., 2011). The depleted resources can be restored by rest-from-deliberate-learning (e.g., Tyler & Burns, 2008). CLT assumes that inserted spacing with rest-from-deliberate-learning during spaced practice allows the restoration of depleted working memory resources, thus enhancing subsequent learning.

As indicated above, cognitive load effects usually require high element interactivity information that imposes a heavy working memory load. The claim that the spacing effect is a cognitive load effect that is obtainable with simple materials that are low in element interactivity requires further discussion, as it apparently contradicts the usual cognitive load theory assumption that cognitive load effects require complex materials that are high in element interactivity. Most cognitive load effects are caused by an instructional procedure that overwhelms working memory during learning. Learning is enhanced when compared with an alternative instructional procedure that reduces working memory load. For example, presenting information in a split-attention format requires the use of more working memory resources than presenting the same material in a physically integrated format, leading to the split-attention effect. If intrinsic cognitive load is low because element interactivity is low, the effect will not be obtained because even under split-attention conditions, total cognitive load does not exceed working memory resources.

Working memory depletion may be more likely under high element interactivity conditions caused by extraneous cognitive load and so contribute to cognitive load effects, but depletion is not required as an explanation. The effects can be explained by cognitive load theory without any recourse to depletion effects.

The spacing effect is different. The effect is not obtained because working memory is overwhelmed at a given point due to high element interactivity. Rather, we have hypothesised that the effect is obtained because working memory is depleted when learning continuously without rest-from-deliberate-learning. Currently, we do not know whether depletion is greater using high than low element interactivity information. It may well be greater but given the large number of examples of the spacing effect using simple memorisation tasks that are low in element interactivity, the effect clearly does not require high element interactivity information. The fact that most cognitive load effects require high element interactivity information does not contradict the hypothesis that working memory depletion can occur using high or low element interactivity information because most cognitive load effects do not require the incorporation of depletion into their explanation.

Chen et al. (2018) designed two experiments to test the hypothesis that the spacing effect is due to working memory resource depletion. Primary school students were assigned to either massed or spaced practice when learning fraction addition. Massed practice was completed in 1 day. The same amount of practice was completed by the spaced practice group over a period of 3 days. For both groups, after all practice sessions were completed, a working memory capacity test was administered immediately prior to the content test phase. For massed practice, all learning activities, including the working memory resources test and post-content test, were conducted within a day, whereas the spaced practice group received the working memory resources test and post-content test a day after the third day of practice. Accordingly, the working memory resources test for the massed group was administered immediately after the learning phase and immediately before the content test while the same test for the spaced group was administered after a rest-from-deliberate-learning but also just before the content test. This procedure allowed differences in the working memory test results between the two groups to indicate whether any depletion due to deliberate learning was reversed after rest and whether that reversal was associated with differences on the content test. The results showed that spaced practice resulted in superior learning to massed practice with that superiority associated with increased availability of working memory resources.

Spacing Effect and Rest-from-Deliberate-Learning

A spacing gap can be either across individual items that together constitute a single concept with those items massed or spaced, or across practice sessions (Carpenter, 2014). In all studies classified in this paper as testing the effect of spacing, the space allows rest-from-deliberate-learning rather than being filled by another task that requires deliberate learning. Rest-from-deliberate-learning does not refer to rest from other cognitive activities, such as subtracting from 100 in steps of 7, which may impose a heavy cognitive load but in which any learning is incidental rather than deliberate. None of the studies classified as spacing effect studies in this paper intentionally presented participants with explicit learning tasks during the gap, although during long periods of rest-from-deliberate-learning, some of the uncontrolled activities of participants may have included intentional learning tasks. For our purposes, the only relevant criterion for the rest-from-deliberate-learning phase is that learners had periods when intentional learning did not occur. Critically, for the interleaving effect, the “resting” period is filled with deliberate learning and it is that factor that distinguishes spacing from interleaving experiments. The effect of mental rest-from-deliberate-learning on cognitive function may be analogous to resting muscles after heavy physical exercise.

Explanations for the Spacing Effect

As indicated above, based on the rest-from-deliberate-learning and working memory resource depletion assumption, Chen et al. (2018) suggested that the spacing effect is due to working memory resource depletion during learning with recovery after rest-from-deliberate-learning. There are many earlier explanations. Delaney et al. (2010) favored a study–phase hypothesis according to which the gaps between learning sessions increase forgetting, so more effortful retrieval is required for the next learning session compared to massed learning. It follows from the study–phase hypothesis that an even larger spacing effect should be obtained if the rest-from-deliberate-learning period is replaced by other, mentally taxing activities that might result in even more forgetting of the target activity and so requiring even more effortful retrieval. Accordingly, the study–phase hypothesis and cognitive load theory generate opposing predictions. These predictions are incidentally tested by interleaved practice, discussed below.

There are earlier suggestions such as rehearsal, consolidation, voluntary–attention, habituation, and encoding–variability (Hintzman, 1974) that have been proposed as explanations of the spacing effect. The rehearsal explanation suggests that learners continue to process the learned material during spacing and so memory of the information is consolidated. During massed practice, interference from the following information during rehearsal will occur (Atkinson & Shiffrin, 1968; Delaney et al., 2012; Greeno, 1967; Rundus, 1971). The consolidation explanation is similar to the rehearsal explanation in that if no intrusion comes from the immediately following information, the learner engages in strengthening and consolidating the memory trace of the initial event, even though the initial event is absent (Delaney et al., 2012; Landauer, 1969, 1974). These explanations contradict the working memory recovery hypothesis since they assume that the information continues to be processed during “rest” periods, suggesting that real rest is not occurring and so working memory recovery also should not occur.

The encoding–variability explanation assumes that the greater the variability of presentations of the information, the more successful is encoding. Increasing the intervals between presentations increases the chance of similar information being coded differently thus increasing the strength of memory traces (e.g., Maddox, 2016; Martin, 1972; McFarland Jr. et al., 1979). Assuming that differential coding of the same information increases the burden on working memory, this explanation also contradicts an increase in working memory resource availability when information or activity is spaced.

Both the voluntary–attention explanation (Hintzman, 1974, 1976; Koval, 2019; Underwood, 1969, 1970) and the habituation explanation (Hintzman, 1974; Koval, 2019) suggest that there is insufficient attention paid to the following information if it arrives too soon after the initial information or activity. Both explanations are in accord with the working memory recovery hypothesis since a decrease in attention or habituation due to subsequent cognitive activity is consistent with the effect of a decrease in working memory resources due to activity.

The Discriminative-Contrast Hypothesis and Interleaved Practice

Blocked practice occurs when all episodes associated with learning skill A are presented before switching to all episodes of learning skill B (e.g., AABB). Alternatively, interleaved practice occurs when each episode of learning skill A is alternated with an episode of skill B (e.g., ABAB). Generally, research has shown that interleaved practice is superior to blocked practice (e.g., Kang & Pashler, 2012; Kornell & Bjork, 2008; Taylor & Rohrer, 2010; Rohrer & Taylor, 2007; Shea & Morgan, 1979; Wahlheim et al., 2011a, b). The interleaving effect can be explained by the discriminative-contrast hypothesis.
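The two orderings can be made concrete with a minimal sketch. The function names below are hypothetical and serve only to illustrate the AABB and ABAB patterns described above; they are not taken from any of the cited studies.

```python
def blocked_schedule(skills, episodes_per_skill):
    """All episodes of one skill are presented before the next skill (e.g., AABB)."""
    return [skill for skill in skills for _ in range(episodes_per_skill)]


def interleaved_schedule(skills, episodes_per_skill):
    """Each episode of one skill is followed by an episode of the next skill (e.g., ABAB)."""
    return [skill for _ in range(episodes_per_skill) for skill in skills]


print("".join(blocked_schedule("AB", 2)))      # AABB
print("".join(interleaved_schedule("AB", 2)))  # ABAB
```

Both schedules contain the same episodes; only their order differs, which is the manipulation that blocked versus interleaved comparisons rely on.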

Early research on interleaved practice was built on the contextual interference hypothesis (Battig, 1972) which makes predictions identical to the discriminative-contrast hypothesis. Both hypotheses assume that increased contextual interference during practice could produce more distinctive and elaborative processes that allow learners to distinguish between similar categories of information. The discriminative-contrast hypothesis was initially proposed by Kornell and Bjork (2008) to deal with category learning, such as distinguishing types of birds or types of paintings. Kang and Pashler (2012) designed a study to test the hypothesis. Different paintings from different artists were interleaved for discrimination. In order to test the influence of discriminative practice, participants in one group viewed the paintings from the same artist that were interleaved by cartoons that were very easy to discriminate from paintings. Zulkiply and Burt (2013) used a similar design to Kang and Pashler (2012). In both studies, interleaved practice was found to be more effective for materials with a low degree of discriminability, supporting the discriminative-contrast hypothesis.

Explanations for the Interleaving Effect

A common explanation for the interleaving effect is that since interleaving automatically includes spacing, interleaving is simply an example of the spacing effect and so can be explained in an identical manner. As indicated below, we doubt that this explanation is viable. The discriminative-contrast or contextual interference hypothesis provides an alternative. Rohrer et al. (2014) extended that hypothesis. Their association hypothesis, built on the discriminative-contrast hypothesis, suggests that students not only learn to discriminate among and contrast problems but also associate a specific response strategy with individual problems when those problems are presented in interleaved form. The association hypothesis may explain the interleaving effect when using problems that are easier to discriminate between, although it is not clear why blocked practice would not also facilitate associations. Possibly, novice learners do need to learn to discriminate between problems from the same field (e.g., mathematics) even if, for a more expert learner, they are completely different.

Detailed evidence in the literature for the mental rest-from-deliberate-learning and discriminative-contrast hypotheses

In this study, literature about interleaving and spacing effects was systematically searched and analysed to test the hypotheses that the spacing effect is due to mental rest-from-deliberate-learning while the interleaving effect can be explained by the discriminative-contrast hypothesis. An interleaved design necessarily results in spacing, increasing the difficulty of establishing a clear boundary between, and definition of, interleaving and spacing effects. Research studies on spaced and interleaved practices were systematically searched using Scopus, Web of Science, ERIC, and PsycARTICLES as databases. Using “Spacing effect” AND “Learning” and “Interleaving effect” AND “Learning” as separate keyword queries, we found and screened a total of 657 studies from the four databases. We conducted our searches from the 10th of May 2020 to the 20th of August 2020 using Scopus and Web of Science. ERIC and PsycARTICLES searches were completed by the 6th of January 2021.

Inclusion Criteria

The following inclusion criteria for the review were used: (a) the language for publication was English; (b) a quantitative measurement of performance was included; (c) across all included studies, participants were students across all stages; (d) publications were in journals, conference proceedings, or books.

Exclusion Criteria

Some conditions were used to exclude some of the searched studies: (a) the language of publication was not English; (b) the authors did not report an experimental study; (c) the authors did not measure learning but instead measured other factors such as motivation.

Article Screening

Based on inclusion and exclusion criteria, the first author screened each study by reading the abstract initially, leaving 150 studies for further screening. Thereafter, the full texts were read, focusing more on experimental materials and procedures used, resulting in the exclusion of a further 31 studies. This procedure left 119 studies to be included for analysis.
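As a quick check on the screening counts, the arithmetic can be written out as follows. The figures 657, 150, and 31 are those reported in the text; the number excluded at the abstract stage is not reported and is derived here by subtraction, so it should be read as an inference. Variable names are illustrative only.

```python
records_identified = 657        # studies found across the four databases
after_abstract_screening = 150  # studies retained after reading abstracts
excluded_at_full_text = 31      # studies excluded after reading full texts

# Derived, not reported in the text: studies removed at the abstract stage.
excluded_at_abstract = records_identified - after_abstract_screening

# Studies remaining for analysis after full-text screening.
included = after_abstract_screening - excluded_at_full_text

print(excluded_at_abstract)  # 507
print(included)              # 119
```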

Coding of Included Studies

Two tables were created to code studies demonstrating successful spacing and interleaving effects separately. In this systematic review, rest-from-deliberate-learning is defined as activities that did not require learners to process information in working memory that needed to be transferred to long-term memory for later use. Accordingly, rest-from-deliberate-learning activity tasks were not targeted for learning and testing. For example, the play activity between two targeted items in the study of Vlach et al. (2008) was classed as a rest-from-deliberate-learning activity. Based on this classification of rest-from-deliberate-learning, 48 studies were classed as testing for the spacing effect (Table 1) and 67 as testing for the interleaving effect (Table 2). In addition, 2 studies of the 67 classed as interleaving studies along with another 3 studies that could not be obviously classified as either spacing or interleaving studies based on rest-from-deliberate-learning found neither spacing nor interleaving effects.
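The coding rule described above can be sketched as a simple decision function. This is an illustrative simplification under stated assumptions: the class `GapActivity` and its field names are hypothetical, and the actual coding relied on reading each study's materials and procedures rather than on a single flag.

```python
from dataclasses import dataclass


@dataclass
class GapActivity:
    """Activity that fills the gap between two targeted learning episodes."""
    description: str
    # True if the gap content was itself targeted for learning and testing,
    # i.e. the gap was filled with deliberate learning.
    targeted_for_learning: bool


def classify_study(gap: GapActivity) -> str:
    """Apply the rest-from-deliberate-learning criterion to one study."""
    return "interleaving" if gap.targeted_for_learning else "spacing"


play = GapActivity("play activity between two targeted items", False)
lag = GapActivity("intervening concepts from the same domain", True)
print(classify_study(play))  # spacing
print(classify_study(lag))   # interleaving
```

Under this rule, a study is classified by what fills the gap, not by the label used in the original report.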

Table 1 Summary of spaced practice studies
Table 2 Summary of interleaved practice studies

In Table 1 for the spacing effect, which included rest-from-deliberate-learning for all studies, the factors coded were (a) the authors and year of publication; (b) the major rest-from-deliberate-learning activity and its difficulty; (c) the complexity of materials; (d) the length of rest-from-deliberate-learning; (e) effect sizes. In Table 2 for the interleaving effect, none of which included rest-from-deliberate-learning, the factors coded were (a) the authors and year of publication; (b) the degree of similarity of the materials; (c) effect sizes.

Analysis of the Literature Search

The Spacing Effect and Evidence Supporting Hypothesis 1

Except for Bego et al. (2017) published in conference proceedings and Ebbinghaus (1885) published in a book, the other studies were all journal articles. All 48 spaced practice studies (see Table 1) obtained the spacing effect and used a rest-from-deliberate-learning period either by distributing learning sessions across multiple days or months or by inserting rest-from-deliberate-learning activities between learning sessions for 30 s to hours or days. From Table 1, it can be seen that the majority of rest-from-deliberate-learning activities comprised sleeping and taking a break between two learning sessions. The type of break varied between studies, from direct breaks with no tasks to playing or reading some non-targeted passages. In all cases, the break could result in mental rest-from-deliberate-learning, such as sleeping, allowing the restoration of working memory resources that had been depleted during the learning session. Additionally, activities that were in some cases inserted during breaks but which differed from the targeted learning materials or concepts and were not relevant to learning also fell within our definition of rest-from-deliberate-learning. The length of rest-from-deliberate-learning also varied, from 30 s to a few days. Eleven of the 48 studies found a spacing effect with very short rest-from-deliberate-learning periods (at least 30 s but less than 12 h), but the majority of studies found a spacing effect with a long rest-from-deliberate-learning period. Thirty of the 48 studies used simple experimental materials, such as word pairs. Although the materials were themselves simple, during the experiments, participants had to complete multiple trials continuously.

Thirty-seven of the 47 studies included sleeping as rest-from-deliberate-learning. The remaining ten studies did not include sleeping but included breaks, such as playing or reading learning-irrelevant passages to rest learners from deliberate learning. The main function of breaks was to take students’ attention away from learning, similar to the breaks between classes in schools. Based on reported effect sizes in Table 1, the spacing effect is robust for varied lengths of resting, from 30 s to a few days. For example, a large effect was found for resting of 30 to 50 s (Vlach et al., 2008).

The positive effects of spaced practice have been supported by the research studies of Table 1 in many topic areas from learning motor movements (Dail & Christina, 2004a, b) to learning physics (Grote, 1995). Other studies, focusing on recalling word lists, also showed superior results of spaced practice (Cain & Willey, 1939). Rea and Modigliani (1985) found a benefit of spaced practice with young children when learning multiplication facts and spelling lists.

Although some studies found superior results of spaced practice with very short study interventions (seconds: e.g., Whitten II & Bjork, 1977), others have obtained the effect with longer interventions (days or weeks: Dobson et al., 2017; Chen et al., 2018). The duration of an intervention seems not to be a moderator for spaced practice (Gerbier et al., 2015; Godbole et al., 2014; Kapler et al., 2015).

Two types of spaced practice that are discussed are uniform spaced practice and altering spacing intervals. Whereas uniform spaced practice consists of fixed or equal intervals between two or more learning episodes, in expanding spaced practice, intervals are gradually increased while in contracting spaced practice, intervals are gradually reduced. Küpper-Tetzel et al. (2014) investigated the effectiveness of uniform, contracting and expanding spaced practice. Twenty-eight word pairs were created by using 56 concrete and highly familiar nouns with no obvious semantic associations among them. For uniform spaced practice, the learning intervals were 3 days, and for expanding spaced practice, the first learning interval was 1 day and then 5 days with contracting spaced practice moving from 5 to 1 day. Results indicated that expanding and uniform spaced practices were better for long retention intervals compared to contracting spaced practice, which was better for shorter retention intervals. A clear superiority of expanding spaced practice was found by Toppino et al. (2018) and Vlach and Sandhofer (2012). Generally speaking, expanding spaced practice is superior to uniform spaced practice (Cepeda et al., 2006) and contracting spaced practice is the least effective for long-term retention.
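The three interval patterns can be illustrated with a short sketch. It assumes three learning sessions separated by two gaps, with the first session on day 0; the uniform case is taken to be two 3-day gaps, which is an assumption since the text states only that the uniform intervals were 3 days. Function names are hypothetical.

```python
from itertools import accumulate


def session_days(intervals, first_day=0):
    """Day on which each learning session falls, given the gaps between sessions."""
    return list(accumulate(intervals, initial=first_day))


def schedule_type(intervals):
    """Label a sequence of gaps as uniform, expanding, contracting, or mixed."""
    pairs = list(zip(intervals, intervals[1:]))
    if all(a == b for a, b in pairs):
        return "uniform"
    if all(a < b for a, b in pairs):
        return "expanding"
    if all(a > b for a, b in pairs):
        return "contracting"
    return "mixed"


for gaps in ([3, 3], [1, 5], [5, 1]):
    print(schedule_type(gaps), session_days(gaps))
# uniform [0, 3, 6]
# expanding [0, 1, 6]
# contracting [0, 5, 6]
```

Under these assumptions, all three schedules span the same 6 days from first to last session, so they differ only in where the middle session falls.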

The results of these studies are consistent with the hypothesis that spaced practice is a cognitive load effect due to working memory resource depletion. Chen et al. (2018) provided evidence that working memory resources decreased after cognitive effort and recovered after rest-from-deliberate-learning. Based on the definition of rest-from-deliberate-learning, all included studies were classified as spaced practice without necessarily following the original definition of spacing or interleaving in the reported study.

The Interleaving Effect and Evidence Supporting Hypothesis 2

As indicated in Table 2, Maass et al. (2015) and Ostrow et al. (2015) reported an interleaving effect in conference proceedings with all other experiments reported in journal articles. All 67 studies demonstrated the interleaving effect and used very similar interleaved concepts during learning. They shuffled targeted concepts or inserted lags (i.e., intervening concepts) between two targeted concepts with no mental rest-from-deliberate-learning. A lag effect is categorised as a spacing effect in many studies, as different concepts or skills were spaced, but based on our definition of rest-from-deliberate-learning, introducing a lag should be classed as interleaving, as no rest-from-deliberate-learning was inserted although there was spacing. The 67 studies testing for interleaved practice (see Table 2) are in accord with the hypothesis that the benefit of interleaved practice is due to learning to discriminate and contrast. The general design testing for the interleaving effect is as follows:

  1. Shuffling: ABCBCACAB……. [where A, B, C are targeted skills]

  2. Lagging: AaaaBbbbCccc…… [where A, B, C are targeted skills that are from a similar domain to the skills associated with a, b, c]
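The two general designs above can be sketched as schedule generators. This is a minimal illustration with hypothetical function names. The shuffling variant assumes each round presents every targeted skill once in random order, which is one way to obtain sequences like the one shown; the lagging variant assumes a fixed number of intervening items, represented here by the lower-case letter of the preceding targeted skill.

```python
import random


def shuffled_schedule(skills, rounds, seed=0):
    """Present every targeted skill once per round, in a random order each round."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    schedule = []
    for _ in range(rounds):
        block = list(skills)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule


def lagged_schedule(skills, lag):
    """Follow each targeted skill with `lag` intervening items from a similar domain."""
    schedule = []
    for skill in skills:
        schedule.append(skill)
        schedule.extend([skill.lower()] * lag)
    return schedule


print("".join(shuffled_schedule("ABC", 3)))
print("".join(lagged_schedule("ABC", 3)))  # AaaaBbbbCccc
```

In both designs every gap between two presentations of a targeted skill is filled with other material to be learned, so neither includes rest-from-deliberate-learning.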

An interleaving effect has been found on inductive learning in which learners acquire a concept or category by observing exemplars. In Kornell and Bjork’s (2008) study, a given artist’s paintings were presented in blocked practice or interleaved with other artists’ paintings as interleaved practice. Participants were required to discriminate between paintings by different artists and indicate which of the previously seen artists, if any, painted each of a series of new paintings. Similar results were reported by Kornell et al. (2010). Wahlheim et al. (2011a, b) presented bird species either individually or in pairs. Participants were better able to recognise different bird species in pairs. Carvalho and Goldstone (2014) designed a series of blobs, created by randomly generating curvilinear segments, to investigate interleaved practice. When designing those blobs, they varied not only similarity between—but also within—categories. The results again found that interleaved practice was more effective for high-similarity categories.

In addition to category learning, some research studies in other domains, such as mathematics, music, and language, also supported the discriminative-contrast hypothesis with interleaved practice (Rohrer & Taylor, 2007; Taylor & Rohrer, 2010). Rohrer and Taylor (2007) compared interleaved practice with blocked practice in two experiments. The results of both studies showed interleaved practice to be more effective than blocked practice. Taylor and Rohrer (2010) designed four types of mathematics problems that were either blocked or interleaved. They contrasted and controlled the spacing factor in order to test the interleaving effect only. Interleaved practice greatly enhanced learning performance by improving students’ ability to associate each problem with its correct solution. Wong et al. (2020) chose music pieces composed by six musicians. Participants were assigned either to the interleaved condition, in which the music pieces from different composers were alternated, or to the blocked condition, in which participants listened to all music pieces from one composer before switching to all music pieces from another composer. Interleaved practice was found to be more effective than blocked practice in distinguishing styles of different composers. A study of interleaved practice in foreign language learning was reported by Pan et al. (2019). In their experiment, interleaved practice was designed for learning Spanish grammar and found to be superior to blocked practice. Rozenshtein et al. (2016) presented six examples of 12 common chest radiographic patterns in either interleaved or blocked practice format. Interleaved practice enabled participants to better distinguish and recognise differences in patterns than blocked practice, which supported the discriminative-contrast hypothesis.

It needs to be noted that for some studies, such as Rohrer et al. (2015), it is not possible to disentangle spacing and interleaving effects. In these studies, learning sessions were distributed across days, allowing sleeping/breaks for restoring working memory resources (testing the spacing effect), and in addition, including interleaving within each assignment for contrasting and discriminating purposes (testing for the interleaving effect). It also should be noted that in the study of Rohrer et al. (2014), while dissimilar mathematics problems were used, students were novices with respect to those problems and so had to learn which solution was appropriate, an activity which required them to contrast and discriminate between the problems leading to the interleaving effect.

Null Effects of Spacing and Interleaving

Tables 1 and 2 list positive examples of spacing and interleaving, respectively. They are in accord with the two hypotheses that the spacing effect requires a mental rest-from-deliberate-learning period while the interleaving effect requires two tasks that students need to learn to discriminate between. Of course, stronger evidence for the mental rest-from-deliberate-learning hypothesis would be obtained if it could be shown that spacing without mental rest-from-deliberate-learning reduced or failed to demonstrate the spacing effect while interleaving domains that were obviously different reduced or failed to demonstrate the interleaving effect.

There were five studies that (1) all used spacing between different concepts or skills with no rest-from-deliberate-learning, resulting in no spacing effect; (2) all used materials that were very dissimilar, resulting in no need to contrast and discriminate materials and so finding no interleaving effect. These null effects support both hypotheses. The relevant studies were all interleaving experiments in which the domain sets were obviously different and equally difficult. Thus, all the studies used experimental designs in which there was spacing with no rest-from-deliberate-learning and interleaving without the need to learn to discriminate between domains. With the exception of the study of Ostrow and Heffernan (2015) that was published in a conference proceeding, the other four studies were all journal articles.
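The joint predictions of the two hypotheses can be laid out as a small truth table. This is only a schematic restatement of the argument, with a hypothetical function name; it treats each design feature as present or absent, whereas the studies themselves vary in degree.

```python
from itertools import product


def predicted_effects(rest_from_deliberate_learning, needs_discrimination):
    """Effects predicted by Hypothesis 1 (rest) and Hypothesis 2 (discrimination)."""
    return {
        "spacing_effect": rest_from_deliberate_learning,
        "interleaving_effect": needs_discrimination,
    }


# Enumerate all four combinations of the two design features.
for rest, discrimination in product([True, False], repeat=2):
    print(rest, discrimination, predicted_effects(rest, discrimination))
```

The five null-effect studies correspond to the case in which both features are absent, where neither effect is predicted.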

Ostrow and Heffernan (2015) compared an interleaved design (A1, A2, B1, B2, C1, C2, A3, B3, C3, B4, C4, A4) with a blocked design (A1, A2, A3, A4, B1, B2, B3, B4, C1, C2, C3, C4). They found no difference between conditions with knowledgeable learners. There were three skills taught: Complementary and Supplementary Angles, Surface Area of a Pyramid, and Compound Probability without Replacement. Although the different skills were automatically spaced in the interleaved design, no rest-from-deliberate-learning was allowed, and so the spacing effect was not generated, in accord with Hypothesis 1. In addition, as learners were knowledgeable in the domain, there was no need to contrast and discriminate similar concepts and skills; therefore, the interleaving effect was not generated, in accord with Hypothesis 2.

De Croock and Van Merriënboer (2007) investigated the effects of blocked practice and interleaved practice on troubleshooting skills in three dissimilar scenarios: One set dealt with injury cases, one with damage cases, and one with traffic cases, and found no difference between groups. A design similar to Ostrow and Heffernan (2015) was used. The failure to find the effects due to either spacing or interleaving with neither rest-from-deliberate-learning nor a need to contrast or discriminate again supported both hypotheses.

Carpenter and Mueller (2013) presented a series of French words that represented different pronunciation rules. The words were arranged either blocked by rule or interleaved by rule. The performance of blocked practice was superior to that of interleaved practice. The explanation was that there was no discrimination required when learning those words and rules. No spacing effect was found as rest-from-deliberate-learning was not allowed. Similarly, when asking students to learn word pairs, such as anatomy names and Indonesian words, in either interleaved or massed practice, a null effect was found by Hausman and Kornell (2014). They suggested that word pairs might be very easily distinguished, so the discriminative-contrast hypothesis was not tested. Since there was no rest-from-deliberate-learning period, the spacing effect was again not obtained.

Yan and Sana (2020) investigated the interleaving effect by interleaving either one level (domain or concept) or two levels (domain and concept); the concepts were chosen from two unrelated domains: physics and statistics. When the two levels were both interleaved, there was no significant difference between an interleaved design and a blocked design. This result again supports our hypotheses since there was neither rest-from-deliberate-learning nor difficult-to-discriminate domains or concepts.

Similarity of Materials

The materials used in the studies in Table 2 and in the five studies indicating no effect may need further discussion. With respect to Table 2, all materials were chosen from the same domain or area and were not easily distinguishable by novice learners. Often, participants had to contrast and discriminate between formulae. Paintings were another type of material used by many of the studies in Table 2. Participants needed to contrast and discriminate the paintings by the style of each painter, a very subtle distinction. Other studies used words or word pairs from the same language as stimuli for experiments. Many studies used words or word pairs that were unfamiliar to participants, such as nonsense words, or low- or medium-frequency words. Of the five studies indicating no effect, Carpenter and Mueller (2013) and Hausman and Kornell (2014) also used word pairs, but they were easily distinguishable by the use of different rules associating the word pairs or by their appearance. In some cases, anatomy names and foreign words needed to be distinguished. In the five studies indicating no effect, the experimental materials were clearly different, reducing the need for learners to learn to discriminate between them.

Limitations and Future Studies

The concept of mental rest-from-deliberate-learning requires further empirical evidence to determine relevant parameters. Some factors, such as the length of resting and the levels of cognitive demands of the tasks that are being rested from, need empirical investigation. A meta-analysis also would be useful.

For the spacing effect, Chen et al. (2018) did not control learner activities during resting. While the rest was sufficiently long to ensure some rest-from-deliberate-learning did occur, more strictly controlled experiments would be useful to test whether working memory depletion is directly linked to controlled resting.

We have suggested that the spacing effect with its rest-from-deliberate-learning procedure under spaced conditions allows depleted working memory resources to be replenished. While results supporting this suggestion were obtained by Chen et al. (2018) in two experiments, we are not aware of any other studies that assessed working memory resources. Future research should directly test the hypothesis that the spacing effect is a cognitive load effect due to working memory depletion while the interleaving effect can be explained by the discriminative-contrast hypothesis.

Contributions of the Study

This systematic review aimed to provide evidence that the spacing and interleaving effects cannot be explained by the same theoretical base. The results of this study suggest that the two effects should be seen as theoretically distinct with the spacing effect as a cognitive load effect and the interleaving effect as a perceptual effect. From a practical perspective, the study provides practitioners with guidance on when to apply spaced instruction or when to apply interleaved instruction in classrooms. The conditions appear to be very different. Spacing should be used when learners need to mentally rest from learning while interleaving should be used when learners need to discriminate between apparently similar but in fact different instructional areas.

Conclusion

The data of this review suggest that the spacing effect needs a rest-from-deliberate-learning period between learning episodes. A requirement for a rest-from-deliberate-learning period can be directly tested using the procedures of the interleaving effect. Interleaving provides spacing but without rest-from-deliberate-learning because what would otherwise be a rest-from-deliberate-learning period is filled with the interleaved activity. Accordingly, if a rest-from-deliberate-learning period is required for the spacing effect, and if interleaving were simply a form of spacing, the interleaving effect should not occur. The fact that the interleaving effect does occur leads to the next question: are the interleaving and spacing effects caused by the same cognitive processes?

We suggest the requirement for a rest-from-deliberate-learning period is demonstrated by a particular category of experiments on the interleaving effect. Differences between interleaved and blocked practice can be more difficult to obtain depending on the discriminability of the interleaved materials. For experiments with very obviously different materials that are immediately discriminable, the interleaving effect is unlikely to be obtained. It is this class of experiments that provides evidence that rest-from-deliberate-learning between episodes, not just spacing, is required for the spacing effect. Other experiments, with materials that are more difficult to discriminate between, provide clear evidence of the interleaving effect. The effect is more likely to be obtained if interleaved information is not readily discriminable. Interleaving may assist learners to discriminate between areas that are more difficult to tell apart, resulting in superiority of interleaving over blocking.

Based on this analysis, we suggest that rest-from-deliberate-learning periods permit recovery of depleted working memory resources as indicated by the results of Chen et al. (2018). That mechanism is not usually available in interleaving studies because in those studies, the spacing is due to interleaving rather than rest-from-deliberate-learning and so the interleaving effect is likely to have different causal factors. The suggested mechanism that drives the results of successful interleaving studies is the discriminative-contrast hypothesis. That hypothesis is not relevant in spacing studies because multiple areas that learners need to discriminate between are not used.

By distinguishing between studies that did and did not use rest-from-deliberate-learning, on the one hand, and did and did not use multiple, difficult-to-discriminate learning areas, on the other hand, the two effects can be clearly separated and attributed to two different cognitive mechanisms. The spacing effect can be explained by working memory depletion after cognitive effort and recovery during rest-from-deliberate-learning. In contrast, the explanation of interleaved practice focuses on discrimination between highly similar concepts, with evidence showing that the benefit of interleaved practice is not moderated by varying working memory capacity (Sana et al., 2018).

One of the consequences of interleaved practice automatically involving spaced practice is that the two procedures frequently and understandably are conflated (e.g., Taylor & Rohrer, 2010). Some research studies have tried to distinguish the two practices by inserting time delays within blocked practice to match the time delays of interleaved practice (e.g., Kang & Pashler, 2012; Taylor & Rohrer, 2010). However, in most cases, interleaved practice automatically combines interleaving and spacing manipulations (Kornell & Bjork, 2008), a fact that has erroneously led to the assumption that both effects have the same cause.