When learning a new domain of knowledge or skill, is learning more efficient if encounters with the same material are adjacent or spaced out in time? Atkinson and Shiffrin (1968) studied one form of this question, comparing items in a paired-associates task that appeared on immediately consecutive trials (massed items) with ones whose appearances were separated by other intervening items (spaced items). Their dual-store memory model explained the observed memory advantage of spaced items, because massed items enjoy less total time in the short-term buffer and thus have less opportunity to enter long-term memory. The present article considers spacing on a different timescale, whereby the same set of material (e.g., a whole list of paired associates) is studied repeatedly, either with no delay between blocks (massed condition) or alternating with another intervening task (spaced condition). The buffer explanation of Atkinson and Shiffrin (1968) is not relevant in this paradigm, but their model’s control processes for search and retrieval from long-term memory are, as discussed in detail below.

Extensive research has demonstrated that spaced practice is superior to massed practice, at least when it comes to retention. That is, when training has ceased and learning gains are compared after an equivalent retention interval, spaced acquisition typically causes superior memory. This retention advantage has been documented both when massed practice is defined as the reexposure of the same stimulus with zero intervening items (Cepeda, Pashler, Vul, Wixted, & Rohrer, 2006) and when massed practice is defined as the least spaced of several conditions, despite the existence of some intervening items during massed practice (Kahana & Howard, 2005). However, less research has been conducted into the effects of spacing practice during the acquisition process, in which learning gains per training event are compared not after an equivalent retention interval, but during the course of training itself.

It is widely predicted that it is more efficient to train with a massed than a spaced practice schedule, where again we use these terms to refer to spacing at the level of lists, not individual items. This prediction stems from three theoretical mechanisms. First, decay theory (extant since Thorndike’s law of disuse in 1913) holds that memory strength decays with time. Because more time passes during spaced than during massed practice, decay theory predicts a stronger memory trace during massed than during spaced acquisition. Second, (retroactive) interference theory (popular since McGeoch’s refutation of the law of disuse in 1932) holds that cognitive processing, not time, between encoding and retrieval of information leads to a weaker memory trace. Because the amount of cognitive processing is positively correlated with the passage of time, interference theory generally makes the same predictions as decay theory regarding massed and spaced acquisition. As the determinant of forgetting is attributed to the amount rather than the type of cognitive processing, this mechanism is referred to here as general interference. Third, theories of memory have emphasized contextual drift as a forgetting mechanism, such that an increased amount of time between encoding and retrieval is associated with reduced similarity between temporal context cues stored in memory and those available at retrieval (e.g., search of associative memory, or SAM; Raaijmakers & Shiffrin, 1981; see Footnote 1). Whether under the name of decay, general interference, or contextual drift, this time-dependent loss of memory strength or accessibility is a common characteristic of many major theories of memory, such as the buffer model of Atkinson and Shiffrin (1968; see their Fig. 1), SAM, the retrieving effectively from memory (REM; Shiffrin & Steyvers, 1997) model, the context maintenance and retrieval (CMR; Polyn, Norman, & Kahana, 2009) model, the adaptive character of thought (ACT-R; Pavlik & Anderson, 2005) model, and the new theory of disuse (Bjork & Bjork, 1992). The incorporation of a time-dependent memory diminishment leads naturally to the prediction of massed acquisition advantages.

Fig. 1. Proportion correct by spacing condition and block in Experiment 1. Error bars represent standard errors of the mean.

This theoretical prediction has been repeatedly upheld in the literature. Although not originally framed as such, Atkinson and Shiffrin (1968, Experiment 3) assessed the effect of lag (i.e., size of spacing delay) between successive study events of a paired associate and the total number of study events on acquisition performance. Spacing delays were occupied by the study of other paired associates. They found that longer spacing delays were associated with worse performance. This massed advantage at acquisition has since also been found in the domains of English paired associate learning (Karpicke & Roediger, 2007; Maddox & Balota, 2015), Japanese–English word-pair learning (Pavlik & Anderson, 2005), and the learning of psychology texts (Rawson & Dunlosky, 2013). However, data supporting the opposite prediction, that of spaced acquisition advantages, have also been observed. Such research has been conducted in the domains of inverted alphabet printing (Archer & Bourne, 1956), rotary pursuit (Bourne & Archer, 1956), olfactory conditioning in honeybees (Deisig, Sandoz, Giurfa, & Lachnit, 2007), appetitive conditioning in rats (Sunsay & Bouton, 2008), and several studies within the domain of motor learning as chronicled in the meta-analysis by Lee and Genovese (1988). If massed practice advantages support the mechanisms of trace decay, general interference, and (possibly) contextual drift and their implementation within models of memory, then spaced practice advantages should give pause to those theories.

What might explain the inconsistency within spaced acquisition research? We suggest that to bring order to this pattern of results, attention must be brought to the role of selective interference in determining whether massed or spaced acquisition advantages are observed. Whereas a general interference dynamic predicts that the amount of cognitive processing between encoding and retrieval influences loss of memory strength, a selective interference dynamic (latent in theories of memory such as the modal model of Atkinson & Shiffrin, 1968, the working memory model of Baddeley, 1992, and the multiple-resources model of Wickens, 2008) predicts that the type of interpolated mental activity matters as well. Specifically, selective interference calls to attention the variation in overlap between cognitive processes used in completing two or more tasks, with similar tasks being more interfering to each other than distinct tasks.

We hypothesize that the presence of both massed and spaced acquisition advantages in the literature is attributable to the role of selective interference during spaced learning; specifically, massed advantages occur due to the involvement of selective interference during spaced acquisition conditions. This hypothesis receives support from an examination of the studies reporting massed or spaced acquisition benefits above (see Table 1): When the spaced practice involved the conduct of tasks of the same processing type as the primary task, massed advantages were often observed; when the spaced practice was defined by a general delay and not undertaking a directly conflicting task, spaced advantages were often observed. Despite the hints these studies provide, we are aware of no investigation that has included the crucial test of manipulating the nature of interference within a single spacing effect study during acquisition. The present investigation seeks to determine whether spacing or massing acquisition trials leads to superior performance, as a function of the nature of the relationship between interleaved tasks.

Table 1. Nine studies of the spacing effect at acquisition demonstrate mixed results that might be explained by the relationship between the primary and delay tasks

To evaluate the effects of selective interference on spaced acquisition, we take as our starting point the distinction between verbal and spatial processing domains. The assumption that performing verbal and spatial tasks emphasizes distinct processing codes (Atkinson & Shiffrin, 1968; Baddeley, 1992; Wickens, 2008) suggests a means to manipulate selective interference in the context of the spacing effect. Under this assumption, the theory of selective interference predicts that spacing out learning trials of one task with trials of another task that draws upon the same processing code (i.e., verbal or spatial) will elicit a massed acquisition advantage (e.g., as in Atkinson & Shiffrin, 1968, Experiment 3). However, in contrast to theories of memory emphasizing forgetting due to trace decay, general interference, or (possibly) contextual drift, selective interference theory predicts that spacing out learning trials of a task with trials of another task that taxes a different processing code should not impair spaced learning, leading to no difference between spaced and massed conditions. In fact, because some past research has demonstrated a spaced advantage at acquisition (e.g., Bourne & Archer, 1956), we might predict that alternating tasks that use different processing codes will yield a spaced advantage, due to some additional (to be determined) mechanism.

Experiment 1

We first investigated the acquisition of paired associates in an anticipation paradigm in which subjects were repeatedly tested upon letter-number pairs, a procedure that has long been used for assessing the effect of elapsed time on memory (e.g., Atkinson & Shiffrin, 1968, Experiment 3). Between blocks, subjects received either 5 min of reading (spaced practice) or no reading (massed practice). Reading periods were occupied by the studying of text passages that participants were instructed to encode for later retrieval. Although the paired associate learning and reading tasks are otherwise very dissimilar, they both tax verbal processing codes. We hypothesized that the addition of spacing delays with the reading task between paired associate blocks would introduce selective interference in the form of added memory load, invoking a performance detriment that would lead to a massed practice advantage.

Method

Participants

A total of 171 subjects (114 female, 57 male; median age = 18 years) enrolled in introductory psychology courses were randomly assigned to massed and spaced practice conditions. Subjects were given course credit for their participation.

Procedure

Stimuli consisted of 15 two-digit, two-letter pairs, introduced to participants under the guise of fictional chemical elements (e.g., Nz : 97). The procedure for each trial followed an anticipation paradigm that showed the subject a cue (e.g., Nz) on the computer screen and waited up to 5 s for a two-digit response on the keypad, concluded by pressing the space bar. Feedback was provided for 5 s by displaying the correct two-digit response where the text-entry box had been previously. Following feedback termination, the words “Next Element” were presented in the middle of the screen for 2 s to serve as a fixation point. Each of the 10 blocks of 15 trials included one presentation of every stimulus. The presentation order was different for each block and was pseudorandomly generated such that no item began or ended a block more than once, and all subjects experienced the same orders.
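The ordering constraint described above can be illustrated with a short sketch. This is not the authors' generation procedure, which is not reported; it assumes that "no item began or ended a block more than once" means that each item occupies the first position at most once and the last position at most once across blocks. The function name `make_block_orders`, the seed, and the placeholder item labels are hypothetical.

```python
import random


def make_block_orders(items, n_blocks, seed=2024):
    """Return n_blocks orderings of items such that no item starts more than
    one block and no item ends more than one block (assumed reading of the
    constraint). A fixed seed gives every subject the same orders."""
    if len(items) < 2 or n_blocks > len(items):
        raise ValueError("need at least 2 items and n_blocks <= number of items")
    rng = random.Random(seed)
    # Draw distinct first items and distinct last items; redraw until no
    # block has the same item in both its first and last position.
    while True:
        firsts = rng.sample(items, n_blocks)
        lasts = rng.sample(items, n_blocks)
        if all(f != l for f, l in zip(firsts, lasts)):
            break
    orders = []
    for first, last in zip(firsts, lasts):
        # Shuffle the remaining items into the middle positions.
        middle = [item for item in items if item not in (first, last)]
        rng.shuffle(middle)
        orders.append([first] + middle + [last])
    return orders


# Placeholder cue labels; the actual stimulus set is not listed in the text.
cues = [f"cue{i:02d}" for i in range(15)]
orders = make_block_orders(cues, n_blocks=10)
```

Because the first items are all distinct, the ten orders necessarily differ from one another, which matches the requirement that the presentation order be different for each block.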

The only procedural detail that varied between conditions occurred between blocks. For subjects in the massed condition, instructions between blocks stated, “The previous series is over. The next series of elements begins now. Remember to submit your two-digit responses by pressing the SPACE BAR before the 5-second time limit has ended.” For subjects in the spaced condition, instructions stated, “The previous series is over. You will now spend 5 minutes reading a text passage. The text passage will close by itself after 5 minutes.” These text passages (word length M = 901, SD = 210), which described introductory psychology concepts (Heffner, 2001), were displayed on a single scrollable screen for 5 min, automatically transitioning to the next block of elements upon the elapse of 5 min. Instructions for the spaced condition noted that this procedure of list learning block followed by a 5-min passage would be repeated for 10 blocks, with a new psychology passage provided between each pair of successive paired associate blocks (no passage was presented following the 10th block). The passages contained no explicit references to the psychology of learning or memory. After subjects in the massed condition completed the tenth block, they were shown all nine psychology passages for 5 min each. At the conclusion of the experiment, all subjects were asked four four-alternative multiple-choice questions about passage contents to confirm they had attended to the passages. As the timing of these questions differed between groups, an analysis of group differences on these questions is not informative, although we note that overall accuracy on these questions was 89.1%, demonstrating that subjects were attentive to this verbal task.

Results

Four subjects were excluded for extremely low scores on the paired associate task (two from each condition); whereas the mean proportion correct on the final block was 87.6% for other subjects (SE = 1.4%), all four excluded subjects scored 0% on the final block. All other subjects were retained for analysis, leaving a final sample of 167 (85 in the massed and 82 in the spaced condition).

In the first block, subjects had no prior exposure to the correct answers and so were almost always wrong in their responses (correct guess rate 1.7%). Therefore, Block 1 was excluded from the analysis. A 2 (condition: massed, spaced) × 9 (block: 2 through 10) ANOVA with repeated measures on block was performed on proportion correct on the paired associates task. Importantly, a main effect of condition was observed, F(1, 165) = 5.30, p = .023, η2 = .03, such that the massed-practice group (M = 64.0%, SE = 1.1%) performed significantly better than did the spaced-practice group (M = 57.6%, SE = 1.2%; see Fig. 1). A main effect of block was also observed, F(8, 1320) = 828.53, p < .001, η2 = .84, indicating substantial improvement throughout the experiment (Block 2: M = 13.3%, SE = .9%; Block 10: M = 87.6%, SE = 1.4%). The interaction between condition and block was not significant, F(8, 1320) = 1.05, p = .398, η2 = .01.
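The degrees of freedom reported above are consistent with a standard mixed design (one between-subjects factor, one within-subjects factor) without a sphericity correction. The following sketch shows the arithmetic; the function name `mixed_anova_dfs` is hypothetical, and the absence of a correction is an inference from the reported values rather than something stated in the text.

```python
def mixed_anova_dfs(n_subjects, n_groups, n_blocks):
    """Degrees of freedom for a groups x blocks mixed ANOVA with repeated
    measures on blocks, assuming no sphericity correction is applied."""
    df_group = n_groups - 1
    df_error_between = n_subjects - n_groups
    df_block = n_blocks - 1
    df_interaction = df_group * df_block
    df_error_within = df_error_between * df_block
    return {
        "condition": (df_group, df_error_between),
        "block": (df_block, df_error_within),
        "condition_x_block": (df_interaction, df_error_within),
    }


# Experiment 1: 167 retained subjects, 2 conditions, Blocks 2 through 10.
dfs = mixed_anova_dfs(n_subjects=167, n_groups=2, n_blocks=9)
```

With these inputs the sketch gives (1, 165) for condition and (8, 1320) for both block and the interaction, matching the F tests reported for Experiment 1.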

These results suggest that massed practice was more efficient than spaced practice. This massed practice advantage is consistent with an explanation based on selective interference, whereby the spaced practice group experienced a memory load from the secondary task that emphasized the same verbal processing code as the primary task. However, these findings could also be explained by theories of memory emphasizing trace decay, general interference, or contextual drift, as the massed practice condition experienced less time between blocks during which memory could weaken. Experiment 2 was conducted to decide between these explanations, by determining whether a massed practice advantage would persist when the processing code used during the spacing delay was distinct from that used in the primary task.

Experiment 2

Experiment 2 was identical to Experiment 1 with one exception. Rather than using letter–number paired associates, the primary task was altered to tax spatial processing code, and therefore required subjects to learn associations between numbers and spatial positions. With this reduced overlap between processing codes used in the primary and interleaved tasks of the spaced practice condition, we hypothesized that the detrimental effects of selective interference would no longer affect spaced acquisition. In contrast to Experiment 1, we predicted that either the two groups would not differ, or the spaced practice group would outperform the massed practice group, as has been found by other researchers using interpolated tasks that do not share processing code with the primary task (e.g., Archer & Bourne, 1956; Bourne & Archer, 1956). For reasons already reviewed, theories of memory emphasizing trace decay, general interference, or (possibly) contextual drift would predict a massed practice advantage.

Method

Participants

A total of 201 subjects (114 female, 86 male, one undeclared; median age = 19 years) enrolled in introductory psychology courses were randomly assigned to massed and spaced practice conditions. Subjects were given course credit for their participation.

Procedure

We modified the procedure of Experiment 1 to use the same motor responses while altering the processing code emphasized by the primary learning task. We randomly assigned the 10 digits on the keypad (0–9) to new key locations such that no key remained in its normal location. This new number mapping was randomized only once to ensure that the mapping of numerals to key locations was consistent across subjects and thereby matched the motor requirements of Experiment 1 (for correct responses). Unlike in Experiment 1, masking tape was placed over the number keys to encourage a spatial rather than a verbal representation. Similar to the paradigm used in Experiment 1, the procedure on each trial followed an anticipation paradigm that showed subjects a two-digit number and waited 5 s for a two-digit response. The task was to learn the new mapping of numerals to spatial locations (no letters were shown) and to use these associations to type the numbers presented on the screen. Subjects were required (unbeknownst to them) to press the same keys as were pressed by subjects in Experiment 1. For example, Experiment 1 subjects correctly typed “97” in response to the cue “Nz.” To equate key presses across experiments, subjects in Experiment 2 were instructed to type a two-digit number (in this case “82”) whose rearranged keys corresponded to where “97” would otherwise have been, followed by the space bar (the concluding keystroke). The trial sequence was also matched across tasks, such that stimulus “82” appeared on the same trial numbers that “Nz” had. Upon typing of each digit, a pound symbol (#) was added to the screen in a text entry box to inform the subject that the digit had been submitted without showing him or her what that number was. 
After the submission of a two-digit response and press of the space bar (or the expiration of the trial after 5 s), feedback was provided that lasted 5 s and consisted of displaying the two-digit number the subject had actually typed (given the rearranged keyboard) presented in blue font, below the number he or she was asked to type, presented in black font. Thus, feedback differed between experiments in a subtle manner: In Experiment 1, the correct target was displayed after each trial, regardless of the response that had been submitted. In Experiment 2, feedback consisted of presentation of the typed responses.
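The keypad manipulation can be sketched as follows. The article reports only one concrete instance of the mapping (the keys that normally produce "97" produce "82" after rearrangement), so the full mapping drawn by `derange_digits` below is illustrative and is not the mapping used in the experiment. Both function names are hypothetical.

```python
import random


def derange_digits(seed=0):
    """Draw a random reassignment of the digits 0-9 to key locations such
    that no key keeps its normal digit. Returns a dict mapping each physical
    key (named by its normal digit) to the digit it now produces."""
    rng = random.Random(seed)
    keys = list("0123456789")
    while True:
        labels = keys[:]
        rng.shuffle(labels)
        if all(k != l for k, l in zip(keys, labels)):
            return dict(zip(keys, labels))


def stimulus_for(exp1_response, key_to_digit):
    """Number to display in Experiment 2 so that typing it on the rearranged
    keypad requires the same key presses as exp1_response did in Experiment 1."""
    return "".join(key_to_digit[key] for key in exp1_response)


# Only these two entries are given in the text; the rest are not reported.
reported_entries = {"9": "8", "7": "2"}
example_stimulus = stimulus_for("97", reported_entries)

# An illustrative full mapping (an assumption, not the experimental one).
illustrative_mapping = derange_digits()
```

The same `key_to_digit` lookup also describes the feedback in Experiment 2: applying it to the keys a subject actually pressed yields the number shown in blue font.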

All other aspects of the procedure were identical to those in Experiment 1. Overall accuracy on the four multiple-choice questions about passage contents was 89.7%.

Results

One subject (from the spaced condition) was excluded for extremely low scores, scoring 0% accuracy on half of the blocks, including the final block, despite 94.9% overall mean accuracy for other subjects on the final block. This exclusion criterion matches that used in Experiment 1. All other subjects were retained for analysis, leaving a final sample of 200 (104 in the massed and 96 in the spaced condition).

A 2 (condition: massed, spaced) × 9 (block: 2 through 10) ANOVA with repeated measures on block was performed on proportion correct. A main effect of condition was not observed, F(1, 198) = 1.60, p = .208, η2 = .01, such that the spaced practice group (M = 86.3%, SE = 0.7%) did not perform significantly differently than did the massed practice group (M = 84.1%, SE = 0.7%). A main effect of block was observed, F(8, 1584) = 248.04, p < .001, η2 = .56, indicating substantial improvement throughout the experiment (Block 2: M = 53.8%, SE = 1.8%; Block 10: M = 94.9%, SE = 0.6%). Importantly, the interaction between condition and block was significant, F(8, 1584) = 2.77, p = .005, η2 = .01. Examination of this interaction revealed an advantage for spaced over massed practice on early trials that disappeared on later trials when performance approached the ceiling (see Fig. 2).

Fig. 2. Proportion correct by spacing condition and block in Experiment 2. Error bars represent standard errors of the mean.

These results contrast with those of Experiment 1, which found a massed acquisition advantage. This finding is in keeping with previous experiments that demonstrated spaced acquisition advantages with primary spatial tasks interleaved with verbal secondary tasks. For example, Archer and Bourne (1956) and Bourne and Archer (1956) had subjects practice inverted alphabet printing and rotary pursuit skills interleaved with rest periods occupied by conversing with the experimenters. Although their procedure did not impose memory demands from the secondary task, whereas our current procedure did, spaced acquisition advantages were observed.

Experiments 1 and 2 compared

Given the extensive similarity in procedures between Experiments 1 and 2, as well as the common subject population, we conducted an analysis that combined data from both experiments and treated experiment (and by proxy, processing code overlap) as a between-subjects variable, which allows for a test of the interaction between spacing condition and code overlap despite the lack of random assignment to experiments. The analysis took the form of a 2 (condition) × 2 (experiment) × 9 (block) ANOVA on proportion correct with repeated measures on block. The main effect of experiment was significant, F(1, 363) = 244.67, p < .001, η2 = .40, indicating that participants were more accurate making the spatial responses (M = 85.2%, SE = 0.5%) than the verbal responses (M = 60.8%, SE = 0.8%). The main effect of condition was not significant, F(1, 363) = 1.83, p = .177, η2 = .01, indicating no overall difference in accuracy between conditions when collapsing across experiment (massed: M = 75.1%, SE = 0.7%; spaced: M = 73.1%, SE = 0.8%). Importantly, the interaction of Condition × Experiment was significant, F(1, 363) = 7.37, p = .007, η2 = .02, demonstrating that the difference in acquisition performance between spacing conditions reliably depends upon processing code overlap. Additionally, there was a main effect of block, F(8, 2904) = 969.12, p < .001, η2 = .73, a significant interaction of Block × Experiment, F(8, 2904) = 113.36, p < .001, η2 = .24, a significant interaction of Block × Condition, F(8, 2904) = 2.94, p = .003, η2 = .01, and no three-way interaction, F(8, 2904) < 1.
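As a check on how the combined means relate to the per-experiment results, the sketch below recomputes the marginal means from the rounded cell means and final sample sizes reported for Experiments 1 and 2. It assumes the marginal means are sample-size-weighted averages of the cell means, which is an inference that the rounded values happen to support, not a stated method. Variable and function names are hypothetical.

```python
# (mean percent correct, n) for each cell, as reported in Experiments 1 and 2.
cells = {
    ("verbal", "massed"): (64.0, 85),
    ("verbal", "spaced"): (57.6, 82),
    ("spatial", "massed"): (84.1, 104),
    ("spatial", "spaced"): (86.3, 96),
}


def weighted_mean(selected):
    """Sample-size-weighted mean over the selected cells."""
    total_n = sum(n for _, n in selected)
    return sum(mean * n for mean, n in selected) / total_n


massed = weighted_mean([v for (_, cond), v in cells.items() if cond == "massed"])
spaced = weighted_mean([v for (_, cond), v in cells.items() if cond == "spaced"])
verbal = weighted_mean([v for (task, _), v in cells.items() if task == "verbal"])
spatial = weighted_mean([v for (task, _), v in cells.items() if task == "spatial"])

# Condition x Experiment interaction expressed as a difference of differences:
# the massed-minus-spaced gap in Experiment 1 minus the same gap in Experiment 2.
gap_verbal = cells[("verbal", "massed")][0] - cells[("verbal", "spaced")][0]
gap_spatial = cells[("spatial", "massed")][0] - cells[("spatial", "spaced")][0]
interaction_contrast = gap_verbal - gap_spatial
```

Under this assumption the recomputed marginals agree with the reported values to within rounding, and the massed-minus-spaced gap changes sign across experiments (6.4 percentage points in Experiment 1, −2.2 in Experiment 2), which is the pattern the significant Condition × Experiment interaction reflects.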

Results from Experiments 1 and 2 support our hypothesis that spaced practice is hindered in the presence of selective interference. By holding constant essentially all procedural details except primary task content and therefore processing code overlap, comparison of these experiments enabled a direct comparison to be drawn between spacing effects in the presence and absence of selective interference. Our results thus demonstrate that spacing effects at acquisition are sensitive to selective interference: Only in its presence are massed acquisition advantages observed.

However, there is a potential confound that deserves further investigation. Specifically, it is possible that the difference in results between Experiments 1 and 2 was due not to contributions of selective interference, but instead to some inherent difference in learning or memory processes between verbal and spatial tasks. That is, the two experiments differed not only in overlap between primary and secondary tasks but also in the nature of the primary task itself. A similar distinction has been observed in discussions of the procedural reinstatement principle (Healy, Wohldmann, & Bourne, 2005; Lohse & Healy, 2012), by which declarative information (e.g., verbal knowledge) is intrinsically difficult to retain but easy to transfer across contexts, whereas procedural information (as in motor tasks) is intrinsically easy to retain but difficult to transfer across contexts. To rule out this interpretation, we conducted a follow-up study in which the primary verbal learning task of Experiment 1 was paired with a spatial interleaved task rather than a verbal interleaved task. If verbal learning is inherently more amenable to massed acquisition regardless of the interspersed task, then a massed acquisition advantage should be observed. If, however, selective interference caused by shared processing code is the underlying cause of Experiment 1’s massed practice advantage, then the advantage should disappear. Furthermore, rather than using an interleaved task that yields data only once at the end of the experiment (as in Experiments 1 and 2), the spatial task in Experiment 3 is assessed on a block-by-block basis. Thus there is no canonical distinction between primary and interleaved tasks in Experiment 3: Both tasks can be treated as primary, with the other acting as the interleaved task. This design thereby crosses the primary task (verbal vs. spatial) with the practice schedule (massed vs. spaced) within the context of a noninterfering interleaved task learning procedure.

Experiment 3

To evaluate the selective interference interpretation of the spacing effect at acquisition, Experiment 3 required that participants learn two tasks of equal importance: a mirror-tracing task emphasizing spatial processing and the paired associate task of Experiment 1 emphasizing verbal processing. A massed practice group experienced all blocks of paired associates before the mirror-tracing task, whereas a spaced practice group alternated blocks between the two tasks. We assumed that (as in Experiment 2) the spaced practice group would not experience selective interference because the two tasks do not share processing code. Accordingly, we predicted that acquisition performance on both tasks would be equal or greater in the spaced practice condition as compared with the massed practice condition, as only with selective interference would massed advantages be predicted. If confirmed, our prediction would support the selective interference interpretation, provide evidence against theories of memory emphasizing trace decay, general interference, or (possibly) contextual drift, and rule out the possibility that verbal and spatial tasks exhibit fundamentally different patterns of spacing effects at acquisition.

Method

Participants

A total of 229 subjects (148 female, 81 male; median age = 18 years) enrolled in introductory psychology courses were randomly assigned to massed and spaced practice conditions, and subjects were given course credit for their participation.

Procedure

The procedure consisted of two learning tasks that were performed either in a massed design with no delay between blocks of the same task, or in a spaced design in which the two tasks alternated from block to block, with blocks of Task A serving as the delay between blocks of Task B, and vice versa.

Tasks

The verbal task used in this experiment was identical to the paired-associate task used in Experiment 1. A spatial task was selected to minimize the possibility of verbal mediation and rely only on procedural rather than declarative learning, in an attempt to maximize the verbal-spatial distinction between the two tasks. To this end, we selected the mirror-tracing task made famous in the study of H.M. by Milner, Corkin, and Teuber (1968). Unlike in previous studies, in which the task was mechanical and performance was recorded via the conductance of an aluminum plate beneath the shapes to be traced, we coded this task onto a computer for ease of alternating it with the verbal task, and participants performed the task by manipulating a wireless mouse. Participants began the mirror-tracing task by reading instructions as follows: “You will perform a ‘mirror-tracing’ task for 5 minutes. In this task, you will see a shape, and be asked to trace it with the mouse. Your goal will be to trace each shape for 60 seconds, keeping your cursor on the black line of the shape, and going around the shape either clockwise or counter-clockwise. You will attempt to trace the shape as many times as possible while minimizing errors until time expires.” As noted in the instructions, each trial lasted 60 s, and there were five trials per block, each tracing a different shape: a five-pointed star, a hexagon, a square with concave rounded corners, a cartoon heart, and a wide symmetrical cross, presented in one of nine randomized orders. Each trial began by presenting the shape in the center of the screen, with a white background and black perimeter lines to trace. Mouse position was continuously recorded, starting at the moment the cursor (a red dot) entered any point of the black tracing perimeter. The cursor began the first trial of each block at the top left corner of the screen and began each subsequent trial where it had previously been at the conclusion of the last trial. 
The cursor’s path was depicted by a thin blue line trailing the cursor as it moved, to simulate the mark from a pen. The movement of the mouse in either horizontal direction led to movement of the cursor in the opposite direction, simulating movement in a mirror. Vertical movement was unmodified. Whereas previous studies used completion time as a primary dependent variable for this task, our measure differed to allow for control over the amount of time spent on the task. This modification was enacted because the spatial task served as the delay activity for the verbal task and thus needed to last a consistent length of time. Performance on this task was thus measured as time on target: the amount of time spent with the cursor atop the perimeter line of the shape (1/64th the width of the screen; approximately the width of a pencil on a standard monitor) divided by the total amount of time since the cursor’s first movement onto the perimeter.
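The mirror transformation and the time-on-target measure described above can be sketched as follows. This is an illustration under stated assumptions, not the experimental software: it assumes cursor state is sampled at discrete timestamps and that each sampled state holds until the next sample, and it assumes a screen coordinate system in which only the horizontal component is reversed. The names `mirror_step` and `time_on_target` are hypothetical.

```python
def mirror_step(cursor, mouse_delta):
    """Apply one mouse movement to the cursor: horizontal movement is
    reversed (as in a mirror), vertical movement is unmodified."""
    x, y = cursor
    dx, dy = mouse_delta
    return (x - dx, y + dy)


def time_on_target(samples):
    """Proportion of time the cursor was on the perimeter, counted from its
    first arrival on the perimeter.

    samples: time-ordered list of (timestamp_seconds, on_perimeter) pairs.
    Returns None if the cursor never reached the perimeter or if no time
    elapsed after it first did so.
    """
    start = next((i for i, (_, on) in enumerate(samples) if on), None)
    if start is None:
        return None
    total = samples[-1][0] - samples[start][0]
    if total <= 0:
        return None
    on_time = 0.0
    # Each sample's state is assumed to hold until the next sample.
    for (t0, on), (t1, _) in zip(samples[start:], samples[start + 1:]):
        if on:
            on_time += t1 - t0
    return on_time / total


# Toy trace: off target for 1 s, on target for 2 s, then off target for 2 s.
trace = [(0.0, False), (1.0, True), (2.0, True), (3.0, False), (5.0, False)]
score = time_on_target(trace)  # 2 s on target out of 4 s since first contact
```

Starting the clock at first contact means that time spent moving the cursor to the shape does not count against the score, which is how the measure in the text is defined.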

Design

All subjects, regardless of condition, began with the paired-associates task (see Footnote 2). For subjects in the massed practice condition, instructions noted that each block of element symbol–number pairs would begin immediately after the previous one. After these subjects completed the 10th block, they saw instructions for the mirror-tracing task and immediately began massed practice of this spatial task for nine blocks of five 1-min trials each.

For subjects in the spaced practice condition, instructions stated that they would alternate between learning the chemical elements and performing a 5-min mirror-tracing task, with accompanying instructions for each task. This procedure was repeated for nine blocks, thereby providing the same number of data points as was provided by 10 blocks of verbal learning (where the first block is uninformative).
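The two block schedules can be written out explicitly. The sketch below assumes that, in the spaced condition, the nine mirror-tracing blocks fall between the ten verbal blocks (so that the session ends with the tenth verbal block); the text implies but does not state this ordering. The function name `build_schedule` is hypothetical.

```python
def build_schedule(condition, n_verbal=10, n_spatial=9):
    """Return the sequence of (task, block_number) pairs for one session."""
    verbal = [("verbal", i) for i in range(1, n_verbal + 1)]
    spatial = [("spatial", i) for i in range(1, n_spatial + 1)]
    if condition == "massed":
        # All verbal blocks back to back, then all spatial blocks.
        return verbal + spatial
    if condition == "spaced":
        # Alternate tasks, starting with the verbal task.
        schedule = []
        for i, verbal_block in enumerate(verbal):
            schedule.append(verbal_block)
            if i < len(spatial):
                schedule.append(spatial[i])
        # Any spatial blocks beyond the number of verbal blocks go last.
        schedule.extend(spatial[len(verbal):])
        return schedule
    raise ValueError(f"unknown condition: {condition!r}")


massed_schedule = build_schedule("massed")
spaced_schedule = build_schedule("spaced")
```

Both schedules contain the same 19 blocks; only their order differs, so each task serves as the delay activity for the other in the spaced condition.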

Results

Three subjects (two from the massed condition, one from the spaced condition) were excluded for extremely low verbal accuracy scores, scoring 0% accuracy on the final block, despite 87.2% overall mean accuracy for other subjects on the final block. This exclusion criterion is identical to that used in Experiments 1 and 2. Ten additional subjects (eight from the massed condition, two from the spaced condition) were excluded for repeatedly failing to attempt to learn the mirror-tracing task: They exhibited two or more trials in which they failed to move the cursor onto the shape at all, despite having 60 s to do so. Subjects were excluded from the entire experiment rather than solely from the analysis for the task in which they failed to perform to criterion, as they exhibited behaviors indicative of not taking the experiment seriously. All other subjects were retained for analysis, leaving a final sample of 216 (105 in the massed and 111 in the spaced condition).

Verbal learning

A 2 (condition: massed, spaced) × 9 (block: 2 through 10) ANOVA with repeated measures on block was performed on proportion correct (see Fig. 3). Unlike in Experiment 1, a main effect of condition was not observed, F(1, 214) < 1, such that the spaced practice group (M = .624, SE = .010) did not perform significantly differently from the massed practice group (M = .606, SE = .011). A main effect of block was observed, F(8, 1712) = 1,050.89, p < .001, η2 = .83, indicating substantial improvement throughout the experiment (Block 2: M = .157, SE = .009; Block 10: M = .872, SE = .012). The interaction between condition and block was not significant, F(8, 1712) < 1. These results contrast with the massed practice advantage observed in Experiment 1, which used the same primary task but a verbal interleaved task.
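As a reading aid, the degrees of freedom and the effect size for the block effect can be recovered from the reported values. The sketch below assumes that the reported η2 is partial eta squared (the text does not say which variant was used) and that the design is a standard mixed ANOVA with one between-subjects and one within-subjects factor; the function names are illustrative.

```python
def mixed_anova_dfs(n_subjects: int, n_groups: int, n_blocks: int):
    """Degrees of freedom for a groups x blocks mixed ANOVA."""
    df_between_error = n_subjects - n_groups       # error term for condition
    df_block = n_blocks - 1                        # block (and interaction) effect
    df_within_error = df_between_error * df_block  # error term for block effects
    return df_between_error, df_block, df_within_error


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Final sample of 216 subjects, 2 conditions, 9 analyzed blocks.
dfs = mixed_anova_dfs(n_subjects=216, n_groups=2, n_blocks=9)
eta_block = partial_eta_squared(1050.89, 8, 1712)
print(dfs, round(eta_block, 2))  # (214, 8, 1712) 0.83
```

Under these assumptions, the computed values agree with the degrees of freedom and the η2 reported for the main effect of block.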

Fig. 3 Proportion correct for verbal data by spacing condition and block in Experiment 3. Error bars represent standard errors of the mean

Spatial learning

Unlike in the verbal learning task, subjects’ performance on the first block of the spatial learning task was indicative of learning (i.e., not chance performance) due to the presence of continuous feedback (as opposed to feedback provided only after each item’s response submission in the verbal task), so data from this block were included in the analysis. A 2 (condition: massed, spaced) × 9 (block: 1 through 9) ANOVA with repeated measures on block was performed on accuracy, calculated as proportion of time on target, averaged across the five shapes in each block (see Fig. 4). A main effect of condition was observed, F(1, 214) = 4.642, p = .032, η2 = .02, such that the spaced practice group (M = .744, SE = .05) performed the mirror-tracing spatial task significantly better than did the massed practice group (M = .700, SE = .006). A main effect of block was observed, F(8, 1712) = 61.447, p < .001, η2 = .22, indicating substantial improvement throughout the experiment (Block 1: M = .603, SE = .012; Block 9: M = .740, SE = .012). The interaction between condition and block was also significant, F(8, 1712) = 2.126, p = .031, η2 = .01. This interaction appears to be due to equivalent performance between massed and spaced conditions on Block 1 (as expected; on Block 1 the spacing manipulation had not yet commenced). Results from this analysis support and generalize the findings of Experiment 2 that spaced practice benefits acquisition in the absence of processing code overlap.

Fig. 4 Motor accuracy for mirror-tracing data by spacing condition and block in Experiment 3. Error bars represent standard errors of the mean

Effects of verbal and spatial delay tasks compared

The procedure involved in collecting paired associates data from Experiments 1 and 3 differed only in the nature of the delay task (psychology text passage learning in Experiment 1 and mirror-tracing task learning in Experiment 3) and drew from the same subject population. We conducted a between-experiments analysis to assess the impact of the delay task on the spacing effect at acquisition.

The analysis took the form of a 2 (condition: massed, spaced) × 2 (delay task type: verbal [Experiment 1], spatial [Experiment 3]) × 9 (block: 2 through 10) ANOVA on accuracy scores for verbal learning with repeated measures on block. The main effect of condition was not significant, F(1, 379) = 1.488, p = .223, η2 < .01, nor was the main effect of delay task type, F(1, 379) < 1. The main effect of block was significant, F(8, 3032) = 1,858.286, p < .001, η2 = .83. Importantly, the Condition × Delay Task Type interaction was significant, F(1, 379) = 4.715, p = .030, η2 = .01. In Experiment 1, the massed condition exhibited better accuracy (M = .640, SE = .01) than the spaced condition (M = .576, SE = .01). In Experiment 3, this advantage reversed, with the massed condition performing somewhat worse (M = .606, SE = .01) than the spaced condition (M = .624, SE = .01). Neither the Block × Delay Task Type nor the Block × Condition two-way interaction was significant, both Fs < 1, nor was the three-way interaction among delay task type, condition, and block, F(8, 3032) = 1.429, p = .179, η2 = .01. These results indicate that the effect of spaced practice on learning verbal materials reliably depended upon the nature of the delay task used: When the delay task involved processing code overlap with the primary verbal learning task, engaging in that delay task impaired acquisition. However, when the delay task drew upon a different processing code so that processing code overlap was minimal, engaging in that delay task had a statistically nonsignificant but numerically opposite effect upon learning the primary verbal task.
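The direction of the interaction between condition and delay task can be made concrete with the reported cell means. The sketch below only restates those four means as a difference of differences; it is descriptive arithmetic, not a reanalysis, and the dictionary keys and function name are illustrative.

```python
# Reported mean proportion correct for verbal learning (Blocks 2 through 10).
cell_means = {
    ("verbal_delay", "massed"): 0.640,   # Experiment 1
    ("verbal_delay", "spaced"): 0.576,   # Experiment 1
    ("spatial_delay", "massed"): 0.606,  # Experiment 3
    ("spatial_delay", "spaced"): 0.624,  # Experiment 3
}


def massing_advantage(delay_task: str) -> float:
    """Massed minus spaced accuracy for a given delay task type."""
    return cell_means[(delay_task, "massed")] - cell_means[(delay_task, "spaced")]


adv_verbal = massing_advantage("verbal_delay")    # positive: massing helps
adv_spatial = massing_advantage("spatial_delay")  # negative: spacing numerically ahead
interaction_contrast = adv_verbal - adv_spatial   # difference of differences

print(round(adv_verbal, 3), round(adv_spatial, 3), round(interaction_contrast, 3))
# 0.064 -0.018 0.082
```

The sign change between the two simple differences is what the significant interaction reflects; whether each simple difference is itself reliable is a separate question addressed by the within-experiment analyses.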

General discussion

The present investigation sought to test whether massed and spaced acquisition advantages, each independently observed across previous investigations, would both be observed within a series of controlled experiments depending upon the nature of the relationship between primary and delay tasks used. Our intent was to determine whether this inconsistency within the relevant literature could be explained by the effects of selective interference caused by processing code overlap between interleaved acquisition trials. In line with the predictions of selective interference, our results demonstrate a pattern whereby massed acquisition is beneficial relative to spaced acquisition only when the task being spaced shares processing code (i.e., verbal or spatial modality) with another task that is experienced during the spacing delay. This pattern was manifested in the massed advantage accompanying shared processing codes in Experiment 1. Our results also demonstrate a pattern whereby spaced acquisition is beneficial relative to massed acquisition when the task being spaced does not share processing code with the delay task. This pattern was manifested in the interaction between spacing condition and block in Experiment 2 and in the main effect of condition for the dependent measure of spatial learning in Experiment 3. Although no main effect of spacing condition was observed for verbal learning in Experiment 3, there was a numerical trend favoring the spaced condition. The nature of code-specific interference introduced by delay tasks determines whether massed or spaced acquisition advantages are observed, a conclusion further supported by the contrasts of Experiments 1 and 2 and of Experiments 1 and 3. The present data cannot be explained by time-dependent loss of memory strength, as produced by the theoretical mechanisms of trace decay or general interference, but such data can be explained using the principle of selective interference.

How might the buffer model of Atkinson and Shiffrin (1968) explain these results? In this model, information that is re-presented while still in the buffer (e.g., consecutive trials of the same cue–target pair) receives deficient processing; it does not benefit from a second complete round in the buffer as a novel cue–target pair (or a pair re-presented after falling out of the buffer) would. Thus, when lag between trials of the same cue–target pair is zero intervening items (i.e., massed at the trial level rather than list level), massed acquisition would be characterized by less total time in the buffer for each item and consequently less transfer to long-term memory than spaced acquisition (Atkinson & Shiffrin, 1968). However, when massed acquisition is at the list level (and such lists exceed the capacity of short-term memory) rather than the trial level, as in the present investigation, reexposure of the same cue–target pair occurs on a time scale far exceeding the capacity of the short-term store. Thus, the buffer explanation does not apply to spaced acquisition advantages at this time scale.

However, the buffer model is relevant to explaining selective interference, albeit for a different reason. Atkinson and Shiffrin (1968, see their Fig. 1) assume that at each level of memory (i.e., sensory, short term, and long term), information is divided according to modality or processing code. Although modality subdivision is the hallmark of several theories of short-term memory (e.g., Baddeley, 1992; Wickens, 2008), the extension of this distinction into long-term memory by Atkinson and Shiffrin (1968) is especially relevant. We propose that it is the susceptibility to interference within but not across the processing codes of long-term memory that helps explain the massed acquisition advantages on the timescale of our experiments.

How does this code-specific interference occur? One possible explanation has been provided by the successors of the Atkinson and Shiffrin model, in particular SAM’s (Raaijmakers & Shiffrin, 1981) and REM’s (Shiffrin & Steyvers, 1997) tenets that storage and access of contextual representations underlie the ability to retrieve information from long-term memory. According to a subset of these contextual fluctuation theories of memory, context can be decomposed into temporal and task context (Annis, Malmberg, Criss, & Shiffrin, 2013). In these theories and others (e.g., CMR; Polyn et al., 2009), representations consist of item information (what was presented) as well as task context (how the items were presented) and temporal context (when the items were presented). This three-faceted model of contextual memory might account for both massed and spaced acquisition advantages in the present study, as follows. Retrieval cues are more useful when they are uniquely indicative of their association. When task context is similar across tasks (as when code-specific interference is present), task context as a retrieval cue is less useful than when task context is unique. Stated differently, knowing that one has engaged in a verbal task is of little help to a learner trying to recall information from one of two verbal tasks, but it is useful for restricting the search set for a learner who had been studying one verbal and one spatial task. Moreover, components of memory probes are generally assumed to combine multiplicatively (see Shiffrin & Steyvers, 1997, for a normative justification), and so task context will be especially important for distinguishing between two tasks that have both been performed recently (i.e., that have similar temporal contexts).
Therefore, by alternating learning trials of two activities that share processing code modality and therefore task context, learners undertaking spaced acquisition of shared-code tasks may be especially vulnerable to interference in retrieval. The massing advantages we have observed in the presence of processing code overlap are consistent with an account of memory whereby diminished use of task context due to shared-code acquisition under spaced conditions hinders performance relative to both massing practice and to different-code acquisition.
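The role of multiplicative cue combination in this account can be illustrated with a toy calculation. The sketch below is loosely in the spirit of ratio-rule sampling in models such as SAM, but it is not an implementation of any published model: the match values, the two-trace memory, and the function name are all hypothetical and chosen only to show the qualitative pattern.

```python
from typing import Dict, Tuple


def sampling_probabilities(traces: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Toy ratio rule: each trace's strength is the product of its
    temporal-context match and task-context match to the probe."""
    strengths = {name: temporal * task for name, (temporal, task) in traces.items()}
    total = sum(strengths.values())
    return {name: s / total for name, s in strengths.items()}


# Hypothetical match values (temporal_match, task_match) to a verbal probe.
# Both tasks were performed recently, so temporal matches are equal.
shared_code = {
    "primary_verbal_task": (0.8, 0.9),
    "verbal_delay_task": (0.8, 0.9),   # same processing code: task cue does not discriminate
}
different_code = {
    "primary_verbal_task": (0.8, 0.9),
    "spatial_delay_task": (0.8, 0.1),  # different code: task cue discriminates
}

p_shared = sampling_probabilities(shared_code)["primary_verbal_task"]
p_different = sampling_probabilities(different_code)["primary_verbal_task"]
print(round(p_shared, 2), round(p_different, 2))  # 0.5 0.9
```

With temporal context equated, the probability of sampling the primary task depends entirely on how well task context discriminates the two tasks, which is the situation described above for interleaved tasks that share a processing code.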

How can models of context-based retrieval explain spaced acquisition advantages in the absence of processing code overlap? Here the predictions of these models are not necessarily straightforward due to their inclusion of memory dynamics that oppose one another. On the one hand, similarity between temporal context on any given trial and stored contexts from previous trials is greater under massed acquisition than under spaced acquisition. This dynamic in isolation predicts a massed acquisition advantage. On the other hand, spaced acquisition also affords the learner a more varied set of context elements to draw from at retrieval, and this varied context set is theorized to function as a retrieval facilitator (e.g., Malmberg & Shiffrin, 2005; Raaijmakers, 2003). At retrieval (i.e., after a fixed delay), only the latter of these dynamics is relevant, but at acquisition both are. Thus, the benefits in a massed condition of greater similarity between current context and context in individual memory traces are offset by restricted variability of those stored contexts, and, conversely, the benefits in a spaced condition of diverse context are offset by dissimilarity of current context to individual stored contexts. It is not obvious which dynamic prevails or, consequently, which condition leads to better acquisition performance. The predictions of formal models are liable to depend on extraneous implementation details, such that one could design a model consistent with either prediction. Nevertheless, it may be viable to develop a formal model in which context variability dominates context similarity, thus producing a spacing advantage for nonoverlapping tasks, and the contributions of task context reverse this effect to a massing advantage with overlapping tasks, thus capturing our full set of results.

Theories emphasizing context in memory are not the only candidates for explaining a spacing advantage in acquisition in the absence of selective interference. In memory consolidation theories (for systems consolidation, see Squire, Genzel, Wixted, & Morris, 2015; for cellular consolidation, see Rudy, 2014), recently encoded information is subject to two opposing dynamics: trace decay, which leads to massed advantages (Hardt, Nader, & Nadel, 2013), and the process of consolidation, in which information undergoes a period of temporary vulnerability in exchange for lasting strength. Although consolidation theories and the buffer model differ in important ways (see Footnote 3), they share the principle of information being vulnerable before it reaches a robust state. Conceivably, the act of interleaving tasks that do not share processing code allows consolidation to occur, such that Task A is consolidated during the conduct of Task B. This consolidation process might strengthen the target memories, a benefit not afforded by massed acquisition trials characterized by less consolidation time. However, when the interleaved tasks do share processing code, engaging in Task B might not constitute a consolidation event for Task A, but rather might somehow prevent that consolidation from occurring.

In the present investigation, we tested three of the four possible combinations of primary task and delay task processing codes, such that we assessed verbal and spatial learning with a verbal delay, and verbal learning with a spatial delay, but we did not assess spatial learning with a spatial delay. The reason this combination was not assessed is that the literature already provides ample data relevant to this combination. For example, results by Shea and Morgan (1979) exemplify research into the effects of interleaving (or “contextual interference”) on the acquisition of spatial (i.e., motor) skills, and in doing so provide a combination of spatial learning with a spatial delay task (see Footnote 4). These authors demonstrated that when spatial tasks are spaced with other spatial tasks, blocked (i.e., massed) acquisition is superior to interleaved (i.e., spaced) acquisition. This finding has been confirmed in several other spatial learning contexts. Blocked acquisition advantages have been reported in spatial domains, such as those of piano playing (Abushanab & Bishara, 2013), physical and imagined movement sequences (Gabriele, Hall, & Lee, 1989), and baseball batting (Hall, Domingues, & Cavazos, 1994), all using spatial tasks as both primary and “delay” or interleaved tasks. Blocked acquisition advantages have also been observed in verbal tasks such as using logic rules (Schneider, Healy, Ericsson, & Bourne, 1995), mathematics (Taylor & Rohrer, 2010), and verb conjugation (Pan, Tajran, Lovelett, Osuna, & Rickard, 2017), each using verbal tasks as primary and delay tasks, reinforcing our conclusion that the harmful effects of selective interference at acquisition are not specific to either verbal or spatial learning. The essential equivalence between studies nominally conducted in the spacing and interleaving paradigms suggests that interleaving and spacing effects would be usefully conceptualized together.

In particular, we contend that both of these domains of research would benefit from our findings of selective interference, as our results bear on whether performing a second task significantly affects the rate of acquisition of the first task, regardless of whether the second task is thought of as a delay or an interleaved task. Various studies of the spacing effect occupy the period of delay between blocks of the primary task with filler tasks that require attention, yet are presumed not to be interfering with the primary task (e.g., Taylor & Rohrer, 2010) or instructions to rest by reading or speaking with the experimenter (e.g., Bourne & Archer, 1956). However, in many other studies, delay tasks are very similar to the primary task and even necessitate attention and effortful processing. This procedure of using effortful activity during the delays constituting interstudy intervals is evident in studies such as those by Karpicke and Roediger (2007), Maddox and Balota (2015), and those reviewed in the classic chapter by Hintzman (1974), in which successive spaced occurrences of any given verbal associate pair were separated by other paired associate trials. Clearly this within-list spacing procedure would be expected to introduce far more code-specific interference than the requirement to wait for time to elapse without activity constraints. In the case of interleaving effects, one task is typically alternated with one or more other tasks that are certain to interfere with the learning of the first task (e.g., Hall et al., 1994; Shea & Morgan, 1979). There thus seems to exist a continuum of procedures used to investigate the effects of delay tasks on repeated performance that encompasses both spacing and interleaving effects. On one end of the continuum lies the distributed practice of a task interspersed with a delay task that introduces no interference, such as quiet rest or sleep (e.g., Jenkins & Dallenbach, 1924). 
On the other end of the continuum lies the distributed practice of a task interspersed with a delay task known to cause interference, such as a variant of the primary task. Somewhere in the middle lie procedures in which the interfering effect of a delay task is uncertain. The present study suggests that the primary determinant of interference on acquisition performance is the presence of processing-code overlap between tasks. This characteristic determines where on the interference continuum a given procedure will lie and, as we have shown, influences whether massed or spaced advantages at acquisition should be expected. If studies of the spacing effect were always conducted with minimal interference (just as studies of interleaving effects are always conducted with substantial interference), we could propose that spacing effect paradigms are typically accompanied by spacing advantages, whereas interleaving effect paradigms are accompanied by massed (or blocking) advantages. Absent this consistency from the spacing effect literature, attention must be paid to the relationship between the primary and delay tasks used in studies of the spacing effect, as the choice of tasks that share the same processing code will likely affect performance and should therefore be theoretically motivated.

The present investigation constitutes an important step toward understanding the structure of human memory systems, particularly with respect to the conditions under which interference occurs at the level of long-term memory, and in doing so raises many interesting questions worthy of further research. If code-specific interference is truly operating on long-term memory (via influences either on retrieval or on consolidation), then the effect of task overlap should be observed in acquisition schedules spaced over hours, days, or weeks. We anticipate difficulty in manipulating the presence of verbal and spatial processing over these longer time frames, but our theorizing predicts that such effects should still be found at such extended intervals. We also note that in order to collect data at acquisition, learning trials must also be test trials, as is the case in the anticipation method for paired associates used here. This reliance upon test trials calls into question whether these effects are in some manner reliant upon the act of testing to elicit them, an important consideration given that testing has been found to benefit memory more so than restudying (Roediger & Karpicke, 2006). In other words, it is possible that testing during acquisition, rather than acquisition per se, is at least partially responsible for the effects we have recorded.

In conclusion, the present research on spacing effects at acquisition suggests that long-term memory is characterized by susceptibility to interference, in either storage or retrieval, that is selective to processing code, such as verbal versus spatial modality. Massing was observed to benefit acquisition only when this interference was present, and spacing was observed to benefit acquisition only when it was absent. These data might be accounted for by contextual fluctuation theories, born of the work by Atkinson and Shiffrin (1968), or by consolidation theories, espoused by neuroscience. Which class of theories can better explain this important and relatively unexplored domain of research is now an empirical question.

Author note

Adam P. Young, Alice F. Healy, Matt Jones, and Lyle E. Bourne, Jr., Department of Psychology and Neuroscience, University of Colorado Boulder.

The research reported here was supported in part by Grant No. DRL1246588 from the National Science Foundation. We are indebted to the members of the Center for Research on Training at the University of Colorado for their helpful suggestions about this research. We are especially grateful to James Foster for his help programming the mirror-tracing task used in Experiment 3, and for the helpful feedback received at the Symposium on Memory Dynamics and the Optimization of Instruction Revisited, American Psychological Association Convention, Denver, August 2016.