Although psychologists often assume independence between responses when analyzing their data, analyses across different areas of cognitive psychology have revealed that making a response modifies the probability and timing of subsequent responses (e.g., Bertelson, 1961; Gilden, 2001; Remington, 1969; Stewart, Brown, & Chater, 2002). The study of human memory has revealed a number of examples of such sequential dependencies. For instance, recalling an item often reduces the probability of recalling that item again (Farrell & Lewandowsky, 2012; Henson, 1998a; Jahnke, 1969; Vousden & Brown, 1998) and can also limit the probability of recalling other items from memory (Nilsson, Wright, & Murdock, 1979; Roediger, 1974; Tulving & Arbuckle, 1966). In recognition memory, Malmberg and Annis (2012) observed a number of sequential dependencies (e.g., an “old” response tends to be followed by another “old” response), and in free recall, items that are presented close together in time tend to be recalled together (Howard & Kahana, 1999; Kahana, 1996).

This article is concerned with sequential dependencies in the recall of sequences from short-term memory and what they can tell us about the mechanism(s) subserving this fundamental competency. Empirically, short-term memory for serial order is examined using the serial recall task, in which participants are given short sequences of verbal items (e.g., letters, digits, or words) that they must subsequently recall in order. Over the past 15 years, this task has supported considerable theoretical development in our understanding of how people remember sequences over the short term (Brown, Preece, & Hulme, 2000; Burgess & Hitch, 1999; Farrell, 2012; Henson, 1998b; Page & Norris, 1998; see Lewandowsky & Farrell, 2008b, for a review).

Existing theories can be broadly divided into three different classes. Chaining models (Lewandowsky & Murdock, 1989) hold that items are linked together in a chain, such that having recalled item n, people can recall item n + 1 by directly using item n as a cue to that next item. Primacy models (Farrell & Lewandowsky, 2002; Page & Norris, 1998) assume that order is represented in a decreasing gradient of activations across elements in a sequence, with recall of that sequence being accomplished by repeatedly choosing the most active item as the next response and then suppressing its activation in order to prevent it from being recalled again. Finally, in positional models (Brown et al., 2000; Burgess & Hitch, 1999; Farrell, 2012; Henson, 1998b; Lewandowsky & Farrell, 2008b), items are associated with some exogenous representation of their position in the sequence (e.g., states of a timing signal; Brown et al., 2000), such that the item at a particular position can be retrieved by cuing memory with the positional representation corresponding to that position.

Much empirical and theoretical work in recent years has been dedicated to distinguishing between these different theoretical accounts of order memory (e.g., Farrell & Lelièvre, 2009; Farrell & Lewandowsky, 2004; Henson, Norris, Page, & Baddeley, 1996; Oberauer & Lewandowsky, 2008). One apparent regularity that has been argued to be particularly informative for distinguishing between these different mechanisms is the form of the sequential dependencies following an error during recall. Consider the case where a sequence of four elements—here labeled A, B, C, and D—has been presented to a person to remember. Of interest is what happens following the first ordering error; in our case, let us assume that A is correctly recalled at the first output position but item C is then erroneously recalled at the second output position. In such a situation, which item tends to be recalled next, item B or item D? Chaining models predict that item D will be recalled next; having recalled item C, the person is assumed to use C as a cue to the next item in the chain, which is item D. Primacy models predict the reverse effect; of the two remaining items, B is more active than D and will tend to be selected for recall next. A very basic version of the positional model will predict no preference between B and D, since the positional marker associated with C will serve equally well as a cue for B and D. Nevertheless, the various positional models incorporate additional assumptions, such as overwriting in the weight matrix storing associations (Burgess & Hitch, 1999), a primacy gradient in conjunction with the positional cues (Burgess & Hitch, 1999; Farrell, 2012; Henson, 1998b; Lewandowsky & Farrell, 2008b), or response suppression (Brown et al., 2000; Burgess & Hitch, 1999; Henson, 1998b; Lewandowsky, 1999), which will bias the model to preferentially recall item B or D next.
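The contrasting predictions for the A, C example can be made concrete with a toy sketch. The two functions below are deliberately noise-free caricatures written for illustration only; the function names and the linear activation gradient are assumptions, and neither function reproduces any of the published models cited above.

```python
def chaining_next(recalled, study_list):
    """Toy chaining rule: cue with the last recalled item and return the
    nearest following list item that has not yet been recalled."""
    last_index = study_list.index(recalled[-1])
    for item in study_list[last_index + 1:]:
        if item not in recalled:
            return item
    return None


def primacy_next(recalled, study_list):
    """Toy primacy rule: activation decreases linearly with list position
    (an illustrative assumption); recalled items are fully suppressed."""
    activation = {item: len(study_list) - i for i, item in enumerate(study_list)}
    candidates = [item for item in study_list if item not in recalled]
    return max(candidates, key=activation.get) if candidates else None


study_list = ["A", "B", "C", "D"]
recalled = ["A", "C"]  # A correct, then C recalled one position too early

print(chaining_next(recalled, study_list))  # D (infill)
print(primacy_next(recalled, study_list))   # B (fill-in)
```

With noise added to the cue or to the activations, both rules would produce a distribution over B and D rather than a single answer, which is why the empirical analyses focus on the relative frequency of the two outcomes.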

Until recently, analyses of typical serial recall data have shown a pattern that is qualitatively consistent with primacy gradient models: The premature erroneous recall of an item tends to be followed by the displaced item (B in our example above), thereby “filling in” the empty spot left by the prematurely recalled item (C in our example) (Henson, 1996; Surprenant, Kelley, Farley, & Neath, 2005). The ratio of such fill-in errors to “infill” errors—where the premature recall of an item drags along with it the following item (D in the above example)—tends to be approximately 2:1 (Henson, 1996; Surprenant et al., 2005). This dependency in serial recall errors, in conjunction with other error dependencies associated with the recall of sequences containing repeated items (e.g., Henson, 1998a) and sequences mixing rhyming and nonrhyming items (Farrell & Lewandowsky, 2003; Henson et al., 1996; Lewandowsky & Farrell, 2008a), constitutes strong evidence against chaining models of serial recall.

However, a recent article by Solway, Murdock, and Kahana (2012) reported striking evidence of sequential dependencies more in line with chaining models. Solway et al. applied two analyses that they used as indices of positional clustering and temporal clustering, respectively. Positional clustering (the tendency for recalled items to cluster around their correct position) was measured by examining the probability of recall as a function of the distance between an item and its correct position, commonly called transposition distance. When measured in aggregate form, this reveals a peaking around a distance of 0 (i.e., correct recall), with decreasing probabilities for larger absolute distances (see, e.g., the top left panel of Fig. 1). Solway et al. showed that a more diagnostic pattern is revealed by making this analysis conditional on the first ordering error, which will usually be an anticipation (i.e., premature recall) of an item in the sequence. When the transposition distance was computed only for those responses immediately following the first order error, Solway et al. found that the function peaked at negative values, indicating that a further anticipation was made. This is similar to an infill error in the analyses of Henson (1996) and Surprenant et al. (2005) and is consistent with participants navigating a chain of associations.

Fig. 1

Aggregate positional clustering plots. Each panel shows, for a particular data set, recall probability as a function of displacement (the difference between the output position at which an item was recalled and its correct output position). Error bars plot 95 % repeated measures confidence intervals. Refer to Table 1 for the source of the data illustrated in each panel

Solway et al. (2012) also measured temporal clustering (the tendency for recalled items to cluster around the position of the last item recalled), using the lag conditional response probability (lag-CRP) function, which has been commonly applied to examine sequential dependencies in free recall (e.g., Farrell & Lewandowsky, 2008; Howard & Kahana, 1999; Kahana, 1996; Ward, Tan, & Grenfell-Essam, 2010). The lag-CRP function records the probability of recalling an item k as a function of the lag k − j between that item and the item just recalled, j. As for the positional analysis, temporal clustering in the aggregate form is mostly uninformative and shows a pattern similar to that seen in the top left panel of Fig. 3, whereby the most likely next response is at lag +1 (i.e., the item that followed the just recalled item on the input list) and response probability otherwise declines monotonically over the range examined by Solway et al. The temporal clustering effect is more informative when again made conditional on the first ordering error having just been committed during recall. Under such conditions, the lag-CRP function shows little change in its form in the analyses of Solway et al., indicating once more that an anticipation tends to be followed by a further anticipation, because the first erroneously recalled item (which is usually an anticipation) was followed by the next list item (which then also must be an anticipation). Solway et al. presented a reduced-form compound chaining model—incorporating remote as well as contiguous associations between items—and showed that it was able to account for the sequential dependencies in errors witnessed in their analyses, whereas a positional model (Burgess & Hitch, 2006) was not.

When one attempts to adjudicate between extant models of serial recall, the results of Solway et al. (2012) create renewed and intriguing ambiguity. Whereas the data of Henson (1996) and Surprenant et al. (2005) speak against chaining as a process supporting serially ordered behavior, the analyses of Solway et al. point to chaining as a key mechanism. To address these apparent inconsistencies, we analyzed a number of existing serial recall data sets gathered for other purposes, using the measures employed by Solway et al. and Henson (1996). As is detailed next, we found a consistent pattern of results that runs contrary to the predictions of chaining models.

Analysis of typical serial recall data

We analyzed data from 19 experiments that were previously examined for different purposes in a recent article (Farrell & Lewandowsky, 2012). The details of the experiments are given in Table 1. The experiments all involved serial recall of letters or digits, and the sequences presented to participants were of a length typical of tasks in the serial recall literature (five to seven). The experiments differed in terms of additional features. Some of the experiments were control conditions for experiments examining phonological similarity effects (Farrell & Lewandowsky, 2003; Lewandowsky & Farrell, 2008a). A number of the experiments varied the timing of the presentation of stimuli within or between sequences in order to examine phenomena such as purported temporal isolation effects (Farrell, 2008; Lewandowsky, Brown, Wright, & Nimmo, 2006; Nimmo & Lewandowsky, 2006), while others varied the nature and timing of distracting activity following list presentation (Lewandowsky, Geiger, Morrell, & Oberauer, 2010; Lewandowsky, Geiger, & Oberauer, 2008). Finally, several other experiments were run primarily to examine other aspects of performance, such as response latencies (Farrell & Lewandowsky, 2004) and sequential dependencies affecting the recency effect (Farrell & Lewandowsky, 2012).

Table 1 Details of experiments

Positional clustering

Figure 1 shows the transposition gradients for each of the 21 conditions. Each panel shows the probability of recalling an item as a function of the displacement of the item (the signed difference between its output position and its list position) for one of the data sets described in Table 1. Positive values represent postponements (i.e., recalling items too late), and negative values represent anticipations (recalling items too early). Following Solway et al. (2012), these probabilities were corrected for the number of opportunities to make positional confusions over various distances (e.g., there are many ways to confuse adjacent items, whereas a confusion over the longest possible distance can involve only the first and last items on the list); repetition and extralist intrusion errors were ignored. The pattern in Fig. 1 is stable and consistent with Solway et al.’s results in showing a peak at displacement 0 (representing a large number of correct responses) and a falling off with increasing displacement.
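The opportunity-corrected transposition gradient described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for Fig. 1: the function name, the coding of responses as 1-based input positions, the use of `None` for excluded responses, and the rule that an opportunity exists for every list item not yet recalled on that trial are all assumptions made for the example.

```python
from collections import Counter


def displacement_gradient(trials):
    """Opportunity-corrected probability of each displacement.

    trials: list of recalled sequences; each is a list of 1-based input
    positions in output order, one entry per output position. None marks an
    omission or extralist intrusion. List length is assumed to equal the
    number of output positions.
    Displacement = output position - input position, so negative values are
    anticipations and positive values are postponements.
    """
    observed, possible = Counter(), Counter()
    for recalled in trials:
        list_length = len(recalled)
        used = set()
        for out_pos, in_pos in enumerate(recalled, start=1):
            if in_pos is None or in_pos in used:
                continue  # ignore omissions, intrusions, and repetitions
            # Every item not yet recalled could have been produced here.
            for candidate in range(1, list_length + 1):
                if candidate not in used:
                    possible[out_pos - candidate] += 1
            observed[out_pos - in_pos] += 1
            used.add(in_pos)
    return {d: observed[d] / possible[d] for d in sorted(possible)}


# One perfect trial and one trial with items 2 and 3 transposed.
gradient = displacement_gradient([[1, 2, 3, 4], [1, 3, 2, 4]])
for displacement, probability in gradient.items():
    print(displacement, round(probability, 3))
```

Dividing by the number of opportunities matters because short displacements can arise at almost every output position, whereas the most extreme displacements are possible only at the ends of the list.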

The more diagnostic analysis examines those same transposition gradients immediately following the first order error (such that, at most, a single response is included from each sequence); these are plotted in Fig. 2. These show a flatter profile and cluster around a displacement of 0 or a positive displacement. This implies a consistent pattern in these data: An anticipation error (which the first order error will typically be) tends to be followed by a postponement error.
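Selecting the single response that follows the first order error can be sketched as below. The function name and data coding (1-based input positions, `None` for omissions or intrusions) are assumptions for illustration, and the treatment of omissions as neither correct nor an order error is one possible choice rather than the rule used in the analyses reported here.

```python
def displacement_after_first_error(recalled):
    """Return the displacement (output position - input position) of the
    response immediately following the first order error, or None if the
    trial has no order error or no scorable following response."""
    for out_pos, in_pos in enumerate(recalled, start=1):
        if in_pos is None or in_pos == out_pos:
            continue  # omission/intrusion or correct response: keep scanning
        # First order error found at out_pos; look at the next response only.
        if out_pos < len(recalled):
            next_item = recalled[out_pos]  # response at output position out_pos + 1
            if next_item is not None:
                return (out_pos + 1) - next_item
        return None
    return None


print(displacement_after_first_error([1, 3, 2, 4]))  # 1: a postponement follows
print(displacement_after_first_error([1, 3, 4, 2]))  # -1: a further anticipation
```

Because the function returns after the first error, each sequence contributes at most one response to the conditional gradient, as in the analysis described above.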

Fig. 2

Conditional positional clustering plots. Each panel shows, for a particular data set, recall probability as a function of displacement when responses are conditioned on the first order error. Refer to Table 1 for the source of the data illustrated in each panel

These results contrast strikingly with those observed by Solway et al. (2012) in the four data sets they examined, where displacements following the first order error tended to peak at negative values. Before attempting to explain the discrepancy in these results, we first report analyses of temporal clustering and fill-in/infill errors.

Temporal clustering

The lag-CRP analyses point to a similar discrepancy between these data sets and those analyzed by Solway et al. (2012). Lag-CRP was analyzed by calculating the lag (i.e., signed difference) between successive items in the output sequence, excluding extralist intrusion, repetition, and omission errors. As for the positional analyses, these values were corrected for the opportunity of making lags over various distances. Figure 3 shows the lag-CRP analysis for all responses, and the general pattern replicates that seen in Solway et al.’s analyses: Transitions over larger distances are generally less likely to be committed, and transitions of +1—following recall of an item with the item that followed it on the list—are highly favored. However, analyses of the lag-CRP functions immediately following the first order error (Fig. 4) show a quite different pattern. In 19 out of 21 cases, the probabilities for a lag of −1 are numerically greater than for a lag of +1, and in 13 cases, this difference is statistically significant (see the figure for details). What this means is that when people have committed an anticipation, the most likely next response is an item from earlier in the list—in all likelihood, the item just skipped over for the anticipation. Our analysis contrasts with the results of Solway et al., who found that +1 transitions dominated responses even when the analysis was restricted to the first order error.
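A lag-CRP computation with opportunity correction can be sketched in the same style. Again this is an illustrative sketch under stated assumptions, not the code behind Figs. 3 and 4: the function name is hypothetical, an opportunity is counted for every list item not yet recalled, and a transition that spans an excluded response (coded `None`, or a repetition) is simply not scored.

```python
from collections import Counter


def lag_crp(trials, list_length):
    """Conditional response probability as a function of lag.

    trials: list of recalled sequences of 1-based input positions in output
    order; None marks an omission or extralist intrusion.
    Lag = input position of the current response minus that of the
    previous response.
    """
    actual, possible = Counter(), Counter()
    for recalled in trials:
        used = set()
        previous = None
        for in_pos in recalled:
            valid = in_pos is not None and in_pos not in used
            if valid and previous is not None:
                # Each item not yet recalled defines one available lag.
                for candidate in range(1, list_length + 1):
                    if candidate not in used:
                        possible[candidate - previous] += 1
                actual[in_pos - previous] += 1
            if valid:
                used.add(in_pos)
                previous = in_pos
            else:
                previous = None  # do not score across an excluded response
    return {lag: actual[lag] / possible[lag] for lag in sorted(possible)}


crp = lag_crp([[1, 2, 3, 4], [1, 3, 2, 4]], list_length=4)
for lag, probability in crp.items():
    print(lag, round(probability, 3))
```

The conditional version of the analysis would apply the same counting to only the transition that immediately follows the first order error on each trial.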

Fig. 3

Aggregate temporal-clustering plots. Each panel shows, for a particular data set, recall probability as a function of the lag (difference) in input position between successive items in the output sequence. An asterisk in the top right of a panel indicates that the difference between +1 and −1 lags was statistically significant by a chi-square test. Refer to Table 1 for the source of the data illustrated in each panel

Fig. 4

Conditional temporal-clustering plots. Each panel shows, for a particular data set, recall probability as a function of the lag between successive responses, when conditioned on the first order error. An asterisk in the top right of a panel indicates that the difference between +1 and −1 lags was statistically significant. Refer to Table 1 for the source of the data illustrated in each panel

Fill-in versus infill errors

Examination of fill-in:infill ratios (Henson, 1996; Surprenant et al., 2005) for these data leads to similar conclusions. This ratio varied from 0.33 to 6.57 across data sets, with a geometric mean of 1.98. When made conditional on the first order error, as had been done in previous fill-in analyses (Henson, 1996; Surprenant et al., 2005), a larger range was observed (min = 1, max = 17) and a larger geometric mean (3.77). These analyses agree with the positional and temporal-clustering analyses in showing that the anticipation of an item tends to be “filled in” by the postponement of an earlier item, rather than triggering an additional anticipation error.
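The classification of responses as fill-in or infill errors, and the summary of ratios by a geometric mean, can be sketched as follows. The function names and data coding are assumptions for illustration; the classification follows the A, C, B (fill-in) versus A, C, D (infill) example given in the introduction and is conditional on the first order error being an anticipation.

```python
import math


def classify_after_anticipation(recalled):
    """Label the response after the first order error as 'fill-in', 'infill',
    or None. recalled holds 1-based input positions in output order."""
    for out_pos, in_pos in enumerate(recalled, start=1):
        if in_pos is None or in_pos == out_pos:
            continue  # not an order error: keep scanning
        if in_pos < out_pos or out_pos >= len(recalled):
            return None  # first error is not an anticipation, or nothing follows
        next_item = recalled[out_pos]
        if next_item == out_pos:
            return "fill-in"  # the displaced item is recalled next
        if next_item == in_pos + 1:
            return "infill"   # the list successor of the anticipated item
        return None
    return None


def fill_in_ratio(trials):
    """Ratio of fill-in to infill errors across trials."""
    labels = [classify_after_anticipation(trial) for trial in trials]
    fill, infill = labels.count("fill-in"), labels.count("infill")
    return fill / infill if infill else math.inf


def geometric_mean(ratios):
    """Average ratios on a log scale so that 2:1 and 1:2 cancel out."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


trials = [[1, 3, 2, 4], [1, 3, 2, 4], [1, 3, 4, 2]]
print(fill_in_ratio(trials))
print(geometric_mean([1.0, 4.0]))
```

A geometric rather than arithmetic mean is the natural summary for ratios, since it treats a ratio and its reciprocal symmetrically.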

Analysis of Grenfell-Essam and Ward (2012)

The results we have reported here show that the general tendency toward infill errors observed in Solway et al.’s (2012) analyses is not representative of the serial recall task as it is typically conducted. Rather, corroborating previous analyses (Henson, 1996; Surprenant et al., 2005), we found a general tendency toward fill-in errors, such that items that have been displaced by anticipations are reported in the immediately following output positions. This raises the following question: What is the source of this discrepancy, and might this be informative?

There are a number of procedural differences between the experiments analyzed here and those examined by Solway et al. (2012). Notably, their experiments employed longer lists (between 10 and 19 items per sequence), and all required participants to remember words rather than letters or digits. In addition, their experiments did not require precise placement of items; rather, the requirement was that any items that were recalled should be recalled in their relative order of presentation, such that participants were free to skip any number of items without penalty. The experiments examined here, by contrast, all required participants to report items in their precise location and also recorded omissions explicitly if these were allowed. Finally, some of the experiments examined by Solway et al. were serial learning experiments in which participants were given repeated presentations of the same list.

To attempt to pin down at least some of the possible sources of this discrepancy, we examined the large data set of Grenfell-Essam and Ward (2012) in a further analysis. In their Experiment 2, participants were required to recall lists of words in serial order, but words could be skipped over without penalty (participants were also required to report the presentation position of items by placing them in a grid; we ignore this aspect of the task in this analysis and focus specifically on dependencies in output order). An important manipulation was list length, which varied from a single word to 15 words. Figures 5 and 6 show the results of this analysis, focusing on the pattern following the first order error for the condition in which list length was known by participants prior to presentation. The results for longer list lengths are similar to those obtained by Solway et al. (2012) in their analysis of the data of Golomb, Peelle, Addis, Kahana, and Wingfield (2008): The displacement function tends to be weighted toward negative displacements (Fig. 5), and the lag-CRP function shows a clear tendency to produce +1 transitions (Fig. 6). At shorter list lengths, these patterns are less pronounced but do not show the opposite pattern that was observed in the other analyses reported here for serial recall of short lists. Accordingly, list length is an apparent determinant of the nature of the sequential dependencies observed, although it is clear that other aspects of the tasks—including the instructions given to participants and the type of stimuli (letters and digits vs. words)—also play a role in determining the precise pattern observed.

Fig. 5

Conditional positional-clustering plots for the data of Grenfell-Essam and Ward (2012). Each panel shows, for a particular list length, recall probability following the first order error as a function of displacement (the difference between the output position at which an item was recalled and its correct output position). Error bars plot 95 % repeated measures confidence intervals

Fig. 6

Conditional temporal-clustering plots for the data of Grenfell-Essam and Ward (2012). Each panel shows, for a particular list length, recall probability as a function of the lag between successive responses, when conditioned on the first order error. An asterisk in the top right of a panel indicates that the difference between +1 and −1 lags was statistically significant

When is recall dominated by fill-in versus infill?

On the basis of the analyses presented here, we can entertain several possible explanations for the different findings from different experiments and their implications for models of serial recall.

The first possibility is that the difference primarily reflects the distinction between short- and long-term memory. That is, the mechanism supporting the retention of shorter lists of letters or digits may be different from that supporting memory for longer lists of words. This is consistent with several models of serial recall (Burgess & Hitch, 1999; Page & Norris, 1998) that were explicitly developed as computational versions of the phonological loop model (Baddeley, 1986), the classic theory of verbal short-term memory. From this perspective, the results of Solway et al. (2012) are not problematic, since they speak to processes that are simply beyond the purview of such models. Equally, the results presented here reinforce the conclusion of other authors about the inadequacy of chaining of items as a mechanism supporting short-term memory (Baddeley, 1968; Farrell & Lewandowsky, 2003; Henson, 1996; Henson et al., 1996; Surprenant et al., 2005), while acknowledging that longer-term sequence memory may primarily be driven by a mechanism such as chaining, as suggested by the results of Solway et al. However, the results of Grenfell-Essam and Ward (2012) complicate this picture somewhat, since a systematic pattern is not observed for shorter lists when their data are also taken into account.

A second possibility is that the building up of a stable representation of a sequence over multiple trials—as in two of the experiments examined by Solway et al. (2012)—involves the transition to a chained representation. Initial theories of serial learning assumed that such chaining occurs (e.g., Ebbinghaus, 1885), and chaining-based computational models have been successfully applied to a wealth of serial learning data (Lewandowsky & Murdock, 1989). In addition, findings from transfer to paired-associate learning (e.g., Crowder, 1968) and from the spin list technique (e.g., Ebenholtz, 1963; Kahana, Mollison, & Addis, 2010) provide some evidence against positional representations (although see Hitch, Fastame, & Flude, 2005) and seem to point to the role of pairwise associations between items in supporting sequence memory. However, this presents a straightforward explanation for only two of the experiments examined by Solway and colleagues, since the other two experiments (Golomb et al., 2008; Kahana & Caplan, 2002)—along with Grenfell-Essam and Ward (2012)—involved serial recall of novel sequences.

Rather than positing systemic or functional differences, we instead seek to explain the data within a single model architecture. Recently, Farrell (2012) presented a model that accounts for performance in short-term serial recall and episodic memory (free recall and serial recall) tasks. Farrell’s (2012) account of short-term serial recall is akin to existing computational models (Brown et al., 2000; Henson, 1998b; Lewandowsky & Farrell, 2008b) in assuming that items are associated with positional markers but that a primacy gradient in associations is also enforced (for a full explanation of the model, see Farrell, 2012). Figure 7 shows that without any modifications, Farrell’s (2012) model does a good job of accounting for the sequential effects presented here. This figure analyzes the simulation reported in Fig. 3 of Farrell (2012), which modeled serial recall of short (six-item) sequences of letters; the only modification is that a larger number of model replications was used here (50,000). The top row of Fig. 7 shows that the model produces the correct pattern of displacements; in the unconditional analyses (top left panel), responses cluster around displacement 0, and when the analysis is contingent on the first order error, the model shifts toward favoring +1 displacements, as can be seen in a number of the experiments reported here. The lag-CRP analysis (bottom row) is also consistent with the data: Transitions of lag +1 are most favored in the unconditional analysis (bottom left panel), and this flips around to favor transitions of lag −1 in the conditional analysis (bottom right panel). The fill-in ratio produced by the simulation is 4.44, a little higher than the geometric mean from the studies analyzed here (3.77), but still a good approximation. Experimentation with the model revealed that it produced a lower fill-in ratio and produced more 0 displacements in the conditional analysis (top right panel) when the primacy gradient in the model was flattened.

Fig. 7

Predictions from Farrell’s (2012) model for short sequences of letters. Panels give predictions for aggregate positional clustering (top left panel), positional clustering when conditioned on the first order error (top right panel), aggregate temporal clustering (bottom left panel), and temporal clustering when conditioned on the first ordering error (bottom right panel)

What about longer-term recall? Solway et al. (2012) showed that another positional model—namely, that of Burgess and Hitch (2006)—failed to account for their data. If recall is driven by positional associations, one might expect Farrell’s (2012) model to fail on these data in the same way. Although this might appear to necessarily be the case at first glance, Farrell (2012) made some additional assumptions that are important for understanding longer-term memory. One of these assumptions was that participants will spontaneously group a sequence into subsequences. Such grouping behavior has been suggested to apply to short-term memory (e.g., Henson, 1996; Madigan, 1980) and is captured by assuming that sequences are represented at two levels: the position of an item in a group and the position of that group in the sequence overall (see, e.g., Henson, 1998b). Along with the assumption that there is some variability in the group sizes and that participants recall the items from a group by first recalling the group context binding those items together (unless the last group is recalled immediately, in which case its group context is carried over from the list), Farrell (2012) showed that the model could account for a number of findings from longer-term recall. An important consequence of this grouping is that the grouping binds together proximate items in a manner that might look like chaining when analyzed in the manner employed here; if an item is recalled at a particular position, that will often be because its group context was recalled, meaning that the next item in that same group will tend to be recalled next.

In one of his demonstrations, Farrell (2012) simulated one of the data sets examined by Solway et al. (2012)—that of Golomb et al. (2008)—and Fig. 8 shows the analysis of the simulation results using the measures employed by Solway et al.; again, the model details and parameter values are exactly as described in Farrell (2012), with only the number of replications being increased. The figure shows that the model produces the appropriate pattern of positional and temporal clustering. When plotted in aggregate form, responses cluster around their correct position (top left panel), whereas when the analysis is contingent on the first order error, responses tend to cluster at earlier positions (top right panel). The bottom row shows that the model produces an asymmetry in the lag-CRP favoring +1 transitions, irrespective of whether the analysis is made conditional on the first order error. The model produces a fill-in ratio of 0.51, in the ballpark of the fill-in ratio calculated from Golomb et al.’s data (0.39) and consistent with Solway et al.’s analyses in suggesting that anticipations are followed by further anticipations.

Fig. 8

Predictions from Farrell’s (2012) model for long sequences of words. Panels give predictions for aggregate positional clustering (top left panel), positional clustering when conditioned on the first order error (top right panel), aggregate temporal clustering (bottom left panel), and temporal clustering when conditioned on the first ordering error (bottom right panel)

Finally, the model also accounts for Grenfell-Essam and Ward’s (2012) data displaying a tendency to apparent chaining-like behavior at shorter list lengths. Farrell (2012) reported a simulation of a similar free recall experiment conducted by Ward et al. (2010); this simulation was adapted to the methodology of Grenfell-Essam and Ward by requiring forward recall (skipping items allowed) and assuming a probability of recalling only the last group of .3 (the same value as that used in the simulation of the Golomb et al., 2008, data). Figure 9, which shows the results of the simulation of lists of six words (longer lists gave results similar to those for the Golomb et al., 2008, simulation), makes clear that the model can produce apparent chaining-like behavior even for short lists.

Fig. 9

Predictions from Farrell’s (2012) model for a shorter list of words. Panels give predictions for aggregate positional clustering (top left panel), positional clustering when conditioned on the first order error (top right panel), aggregate temporal clustering (bottom left panel), and temporal clustering when conditioned on the first ordering error (bottom right panel)

The simulations of Farrell’s (2012) model show that the difference in sequential dependencies between the various analyses presented here and in Solway et al. (2012) does not imply a shift in representation or the use of different memory systems. Rather, these differences are attributable to an interaction between grouping and task constraints. One difference between simulations producing fill-in versus infill is that in the short-term serial recall task, all items are stored in a single group; however, an additional simulation showed that the model produced similar results if spontaneous grouping was assumed. The more fundamental difference is that the tasks modeled in the first simulation all require that participants produce the entire sequence; in some cases, omission errors were allowed if people wished to skip a particular position, but some report of this skip was explicitly required. In the experiments examined by Solway et al. (including that of Golomb et al., 2008) and in the study of Grenfell-Essam and Ward (2012) analyzed here, participants could skip items, the only requirement being that the items that were reported be recalled in their relative presentation order. From the perspective of the model, this means that participants could skip over entire groups. Indeed, to account for the recency observed in Golomb et al.’s serial recall data, Farrell (2012) assumed that on each trial, there was a .3 probability that only the last group would be recalled; a similar assumption was made here in modeling the Grenfell-Essam and Ward data. Otherwise, the model attempted recall of the groups in a forward order, but confusions between groups mean that the wrong group context could be retrieved, such that items in those groups would tend to “travel together” in the output. Both of these processes (recalling only the last group and retrieving the wrong group context) can give the illusion of item-level chaining in cases of longer lists.

This theoretical analysis accords with Ward et al.’s (2010) suggestion that variations in performance across different memory tasks are likely to reflect differences in task constraints; that is, these patterns can be as much a reflection of how people use their memory as about the basic processes of memory itself. In the case of strict serial recall, participants are required to recall an item at every position, and so skipping intermediate items is not possible. In the case of looser methodologies—free recall or forward order tasks such as that of Golomb et al. (2008)—participants need not initiate recall with the first item and have an increasing tendency to initiate recall with the last few list items as list length increases (Grenfell-Essam & Ward, 2012; Ward et al., 2010). In the case of free recall, this is a sensible strategy for maximizing accuracy, since the last few list items are especially accessible immediately following list presentation (Lewandowsky, Brown, & Thomas, 2009). When the constraint of forward order is introduced, the recency advantage trades off against the constraint of only being able to recall items following those that have been recalled so far. The nonzero probability of recalling only the last group in the model captures the idea that participants may abandon attempting recall of the entire list in favor of having access to a relatively high-fidelity serially ordered representation of the last few list items.

Conclusions

We have shown that in short-term serial recall, there is a consistent tendency for participants to “fill in” gaps in the sequence left by anticipation errors with earlier items that have been displaced. This pattern has been observed in other short-term serial recall data (Henson, 1996; Surprenant et al., 2005), but it contrasts with recent similar analyses of longer sequences (Solway et al., 2012). Although this discrepancy may, at first glance, suggest that memory for different types of information or for sequences of different lengths relies on different types of representations (positional and chaining, respectively), we instead have shown that both patterns are consistent with a model assuming a hierarchical representation of positional information (Farrell, 2012), the primary factor being whether participants are allowed to confuse or skip over groups.

Although the data are by themselves constraining, there are more detailed aspects of the fill-in and infill error data that have yet to be subjected to rigorous quantitative modeling. Specifically, Surprenant et al. (2005) examined fill-in ratios at the level of individual serial positions and noted no strong tendency for the error ratio to change across output positions. Surprenant et al. suggested that this finding is problematic for all serial recall models. It is clear why this is the case for some models (e.g., primacy gradient models will presumably predict that the ratio should decrease over output positions, since the ratio of activations of successive items decreases), but it is an open question whether models combining positional representations with a primacy gradient (Brown et al., 2000; Farrell, 2012; Farrell & Lewandowsky, 2004; Henson, 1998b; Lewandowsky & Farrell, 2008b) might account for these more intricate features of the data.
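
For concreteness, a position-wise fill-in analysis of this kind could be computed along the following lines. This is a hedged sketch rather than the scoring procedure used by Surprenant et al. (2005) or in the present analyses. It assumes a simplified definition in which only the first error on a trial is scored, an anticipation error is an item recalled exactly one position too early, and the following response counts as fill-in if it is the displaced item and as infill if it is the next item in the list. The function names `classify_trial` and `fill_in_ratio_by_position` are hypothetical.

```python
from collections import Counter, defaultdict

def classify_trial(recalled):
    """Classify the response that follows the first error on a trial.

    recalled -- input serial positions (1-based) in output order; for
                example [1, 3, 2, 4] means item 3 was reported second.
    Returns (output_position, label) when the first error is an
    anticipation of the immediately following item, otherwise None.
    """
    for k, item in enumerate(recalled[:-1]):
        pos = k + 1                    # 1-based output position
        if item == pos:
            continue                   # correct so far
        if item != pos + 1:
            return None                # first error is not a one-step anticipation
        following = recalled[k + 1]
        if following == pos:
            return pos, "fill-in"      # displaced item recalled next (e.g., A C B)
        if following == pos + 2:
            return pos, "infill"       # list successor recalled next (e.g., A C D)
        return pos, "other"
    return None

def fill_in_ratio_by_position(trials):
    """Return {output position: fill-in / infill ratio, or None if no infill}."""
    counts = defaultdict(Counter)
    for recalled in trials:
        result = classify_trial(recalled)
        if result is not None:
            pos, label = result
            counts[pos][label] += 1
    return {
        pos: (c["fill-in"] / c["infill"] if c["infill"] else None)
        for pos, c in sorted(counts.items())
    }
```

Given a list of trials, `fill_in_ratio_by_position(trials)` yields one ratio per output position at which an anticipation error occurred. Whether that ratio is flat or declining across positions is the pattern at issue above.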

What is clear is that the results of our analyses are problematic for simple chaining models (e.g., Lewandowsky & Murdock, 1989), which assume pairwise associations between successive items, as well as for complex chaining models, such as the model presented by Solway et al. (2012; see also Murdock, 1995), which incorporate remote, as well as pairwise, associations between items (although see Dennis, 2009). These models predict that premature recall of an item will drag a chain of items following that item along with it. In contrast, the results are consistent with other experiments and modeling efforts pointing to a fundamental role of positional representations in supporting sequence memory. A number of contemporary models of serial recall assume a multilevel representation of sequences in order to account for constraining data patterns from the recall of grouped lists (Brown et al., 2000; Burgess & Hitch, 1999; Henson, 1998b; Lewandowsky & Farrell, 2008b), particularly the tendency for confusions of items between groups to maintain their within-group position (e.g., Farrell & Lelièvre, 2009; Henson, 1999; Ryan, 1969a, 1969b). The learning of sequences over repeated trials also points to some hierarchical organization (e.g., Bower, Lesgold, & Tieman, 1969; Bower & Winzenz, 1969; Hitch, Flude, & Burgess, 2009; Johnson, 1970; Martin & Noreen, 1974), and work on sequential task switching implies some positional representations within chunks (Mayr, 2009). These findings suggest that positional representations are a ubiquitous feature of human short- and long-term memory, and, as is demonstrated here, Farrell’s (2012) model, with its assumption of hierarchically organized, primacy-graded positional markers, is able to provide a good account of the observed dependencies in errors across different tasks and time scales.