Whereas previous studies of interleaved mathematics practice had required students to learn kinds of problems that were nearly identical in appearance (Fig. 3), the results reported here demonstrate that this benefit also holds for problems that do not look alike (Fig. 4). That is, the benefit of interleaved mathematics practice is not limited to the ecologically invalid scenario in which students encounter only superficially similar kinds of problems. Although it might seem surprising that a mere reordering of problems can nearly double test scores, it must be remembered that interleaving alters the pedagogical demand of a mathematics problem. As was detailed in the introduction, interleaved practice requires that students choose an appropriate strategy for each problem and not only execute the strategy, whereas blocked practice allows students to safely assume that each problem will require the same strategy as the previous problem.
However, the interleaved practice effect observed here might reflect the benefit of spaced practice rather than the benefit of interleaving per se. As we explained in the introduction, the creation of interleaved mathematics assignments guarantees not only that problems of different kinds will be interleaved, but also that problems of the same kind will be spaced across assignments, and spacing ordinarily has large, robust effects on delayed tests of retention. We therefore believe that spacing contributed to the large effect observed here (d = 1.05). Still, we have reason to suspect that interleaving, per se, contributed as well. In one previous interleaved mathematics study, students in both the interleaved and blocked conditions relied on spaced practice to the same degree, and interleaving nevertheless produced a large positive effect (d = 1.23; Taylor & Rohrer, 2010). In the present study, though, we chose to compare interleaved practice to the kind of assignment used in most textbooks, which is a massed block of problems.
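The confound between interleaving and spacing described above can be illustrated with a minimal sketch. It is not the procedure used to build the assignments in this study: the function names, the round-robin ordering, and the four problem labels are assumptions chosen only for illustration. The sketch orders the same set of problems in two ways and measures how far apart problems of the same kind fall.

```python
def blocked_order(kinds, per_kind):
    """All problems of one kind, then all of the next (massed blocks)."""
    return [kind for kind in kinds for _ in range(per_kind)]


def interleaved_order(kinds, per_kind):
    """Round-robin over kinds, so consecutive problems differ in kind."""
    return [kind for _ in range(per_kind) for kind in kinds]


def mean_gap(order):
    """Mean number of positions between successive problems of the same kind."""
    last_seen = {}
    gaps = []
    for position, kind in enumerate(order):
        if kind in last_seen:
            gaps.append(position - last_seen[kind])
        last_seen[kind] = position
    return sum(gaps) / len(gaps)


# Hypothetical labels for four kinds of problems, three problems per kind.
kinds = ["slope", "graph", "proportion", "equation"]
blocked = blocked_order(kinds, per_kind=3)
interleaved = interleaved_order(kinds, per_kind=3)

print(mean_gap(blocked))      # 1.0: same-kind problems are adjacent (massed)
print(mean_gap(interleaved))  # 4.0: same-kind problems are spread out (spaced)
```

Under these assumptions, any ordering that separates consecutive problems by kind also increases the distance between problems of the same kind, which is why a comparison of interleaved and blocked assignments cannot by itself isolate the contribution of interleaving.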
Theoretical accounts of the interleaved mathematics effect
How does interleaving improve mathematics learning? The standard account holds that the interleaving of different kinds of mathematics problems improves students’ ability to distinguish or discriminate between different kinds of problems (e.g., Rohrer, 2012). Put another way, each kind of problem is a category, and students are better able to identify the category to which a problem belongs if consecutive problems belong to different categories. This ability to discriminate is a critical skill, because students cannot learn to pair a particular kind of problem with an appropriate strategy unless they can first distinguish that kind of problem from other kinds, just as Spanish-language learners cannot learn the pairs PERRO–DOG and PERO–BUT unless they can discriminate between PERRO and PERO.
This discriminability account parsimoniously explains the interleaving effects observed in previous mathematics interleaving studies, because participants in these studies were required to discriminate between nearly identical kinds of problems (Fig. 3). For instance, one of these previous studies included an error analysis, and it showed that the majority of test errors in the blocked condition, but not in the interleaved condition, occurred because students chose a strategy corresponding to one of the other kinds of problems that they had learned—for example, using the formula for prism edges rather than the formula for prism faces (Taylor & Rohrer, 2010). Furthermore, the students in this study were given a second final test in which they were given the appropriate strategy for each test problem and asked only to execute the strategy, and the scores on this test were near ceiling in both conditions. In sum, the data from this earlier experiment are consistent with the possibility that interleaving improves students’ ability to discriminate one kind of problem from another (or discriminate one kind of strategy from another).
However, in the present study, discrimination errors appeared to be rare. In a post-hoc error analysis, three raters (two of the authors and a research assistant, all blind to conditions) examined the written solution accompanying each incorrect answer and could not find any solutions in which students “used the wrong strategy but one that solves another kind of problem.” The raters then expanded the definition of discrimination error to include solutions with at least one step of a strategy that might be used to solve any kind of problem other than the kind of problem that the student should have solved. With this lowered threshold, discrimination errors still accounted for only 33 of the 756 incorrect answers (4.4 %), with no reliable difference between conditions (5.1 % for interleaved, 4.0 % for blocked). For the other incorrect answers, students chose the correct strategy but incorrectly executed it (45.9 %), or they relied on a strategy we could not decipher, often because they did not show their work (49.7 %). The virtual absence of discrimination errors is arguably not surprising, partly because the different kinds of problems did not look alike, and partly because some strategies were obviously an inappropriate choice for some kinds of problems (e.g., trying to graph a line by creating a proportion). The rarity of discrimination errors in the present study raises the possibility that improved discrimination cannot by itself explain the benefits of interleaved mathematics practice.
We suggest that, aside from improved discrimination, interleaving might strengthen the association between a particular kind of problem and its corresponding strategy. In other words, solving a mathematics problem requires students not only to discriminate between different kinds of problems, but also to associate each kind of problem with an appropriate strategy, and interleaving might improve both skills (Fig. 5). In the present study, for example, students were asked to learn to distinguish a slope problem from a graph problem (a seemingly trivial discrimination) and to associate each kind of problem with an appropriate strategy (e.g., for a slope problem, use the strategy “slope = rise/run”), and the latter skill might have benefited from interleaved practice. Yet why would interleaving, more so than blocking, strengthen the association between a problem and an appropriate strategy? One possibility is that blocked assignments often allow students to ignore the features of a problem that indicate which strategy is appropriate, which precludes the learning of the association between the problem and the strategy. In the present study, for example, students who worked 12 slope problems in immediate succession (i.e., used blocked practice) could solve the problems without noticing the feature of the problem (the word “slope”) that indicated the appropriate strategy (slope = rise/run). In other words, these students could repeatedly execute the strategy $(y_2 - y_1)/(x_2 - x_1)$ without any awareness that they were solving problems related to slope. In brief, blocked practice allowed students to focus only on the execution of the strategy, without having to associate the problem with its strategy, much like a Spanish-language learner who misguidedly attempts to learn the association between PERRO and DOG by repeatedly writing DOG.
It might be possible to experimentally tease apart the effects of interleaving on discrimination and association. In one such experiment, participants would receive either blocked or interleaved mathematics practice during the learning phase, as they typically do, and then take two tests. The first test would assess only discrimination. For example, students might be shown a random mixture of five problems, four of one kind (e.g., word problems requiring a proportion) and one of a different kind (e.g., a word problem requiring the Pythagorean theorem), and then be asked to identify the problem that does not fit with the others (the Pythagorean theorem problem). Students would repeat this task many times with different kinds of problems. On a second test measuring both discrimination and association, students would see problems one at a time and, for each problem, choose the correct strategy, but not execute it. Scores on the first test (discrimination only) should be greater than scores on the more challenging second test (discrimination and association), with larger differences between the two test scores reflecting a poorer ability to associate a kind of problem and its strategy. Therefore, if interleaving improves association, the difference between the two test scores should be smaller for students who interleaved rather than blocked.
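The comparison proposed above can be sketched as a difference-of-differences computation. This is only one possible way to summarize such data, not an analysis that has been carried out: the function names are hypothetical, scores are assumed to be proportions correct per student, and the numbers in the usage lines are invented solely to exercise the functions.

```python
from statistics import mean


def association_gap(discrimination_scores, combined_scores):
    """Mean per-student difference: test 1 (discrimination only) minus
    test 2 (discrimination plus association). A larger gap suggests a
    poorer ability to associate a kind of problem with its strategy."""
    if len(discrimination_scores) != len(combined_scores):
        raise ValueError("each student needs a score on both tests")
    return mean([d - c for d, c in zip(discrimination_scores, combined_scores)])


def interleaving_advantage(blocked, interleaved):
    """Gap in the blocked group minus gap in the interleaved group.
    Each argument is a (test 1 scores, test 2 scores) pair. A positive
    value is the pattern predicted if interleaving improves association."""
    return association_gap(*blocked) - association_gap(*interleaved)


# Invented scores for two students per condition, for illustration only.
blocked_scores = ([0.9, 0.8], [0.5, 0.6])
interleaved_scores = ([0.9, 0.8], [0.8, 0.7])
print(interleaving_advantage(blocked_scores, interleaved_scores) > 0)  # True
```

A real analysis would also need an inferential test of the condition-by-test interaction; the sketch shows only the descriptive quantity that the prediction concerns.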
Category learning
Finally, although we focused here on mathematics learning, several studies have examined the effect of interleaved practice on category learning. For example, participants might see photographs of different kinds of birds (jays, finches, swallows, etc.) one at a time, in an order that was either blocked (each of the jays, then each of the finches, etc.) or interleaved (jay, finch, swallow, etc.), and interleaving would produce greater scores on a subsequent test requiring participants to identify previously unseen birds (e.g., Birnbaum, Kornell, Bjork, & Bjork, 2013; Kang & Pashler, 2012; Kornell & Bjork, 2008; Wahlheim, Dunlosky, & Jacoby, 2011; but see Carpenter & Mueller, 2013). As with the results of previous interleaved mathematics tasks, the positive effect of interleaving on category learning could also be attributed to an improved ability to discriminate between, say, a jay and a finch. To our knowledge, though, it remains an untested possibility that this effect might also reflect a strengthened association between each category (e.g., finches) and the category name (“finch”). The relative contributions of enhanced discrimination and stronger associations to interleaving effects could be disentangled by an experiment analogous to the mathematics experiment proposed in the previous section. Participants would receive two tests: a discrimination-only test requiring them to sort birds (or identify the one bird that is different from the others), and the usual test requiring them to name novel birds, which would require both discrimination and association. In summary, although strong evidence exists showing that interleaved practice can improve both mathematics learning and category learning, it remains unclear why either of these effects occurs.