Modeling working memory: An interference model of complex span
This article introduces a new computational model for the complex-span task, the most popular task for studying working memory. SOB-CS is a two-layer neural network that associates distributed item representations with distributed, overlapping position markers. Memory capacity limits are explained by interference from a superposition of associations. Concurrent processing interferes with memory through involuntary encoding of distractors. Free time in-between distractors is used to remove irrelevant representations, thereby reducing interference. The model accounts for benchmark findings in four areas: (1) effects of processing pace, processing difficulty, and number of processing steps; (2) effects of serial position and error patterns; (3) effects of different kinds of item–distractor similarity; and (4) correlations between span tasks. The model makes several new predictions in these areas, which were confirmed experimentally.
Keywords: Working memory · Computational modeling
Working memory can be characterized as a system for holding a limited amount of information available for processing. Its limited capacity has been shown to have considerable generality across various contents and methods of measurement (Kane et al., 2004; Oberauer, Süß, Schulze, Wilhelm, & Wittmann, 2000). Variations in working memory between groups and between individuals have been shown to correlate with performance in a broad range of complex cognitive activities (for a review, see Conway, Jarrold, Kane, Miyake, & Towse, 2007).
The most commonly used paradigm for measuring working memory capacity is the complex-span paradigm. There are several variants of complex span, the earliest being the reading span (Daneman & Carpenter, 1980) and counting span (Case, Kurland, & Goldberg, 1982) tasks, later followed by operation span (Turner & Engle, 1989) and spatial variants of the paradigm (Shah & Miyake, 1996). The general schema of all complex-span tasks is that encoding of a list of memoranda (e.g., words, letters) for serial recall is interleaved with a distracting processing task (e.g., reading a sentence or verifying an equation). The term complex span has been coined in contrast to simple span, which refers to immediate serial recall without a parallel distractor task.
Multiple variants of complex span have been validated as measures of working memory capacity by the findings that they correlate well with each other and with other indicators of working memory capacity (Oberauer et al., 2000; Schmiedek, Hildebrandt, Lövdén, Wilhelm, & Lindenberger, 2009) and that they are good predictors of a range of performance indicators in tasks that are theoretically assumed to require working memory, such as tests of reasoning and fluid intelligence (Conway, Kane, & Engle, 2003), text comprehension (Daneman & Merikle, 1996), and explicit learning of a rule (Unsworth & Engle, 2005), as well as a number of experimental tasks requiring cognitive control, such as the Stroop task (Kane & Engle, 2003) and the antisaccade task (Unsworth, Schrock, & Engle, 2004). Therefore, understanding the cognitive processes in the complex-span paradigm would be a fundamental step toward understanding the capacity limits of cognition. The success of complex span as a measure of working memory capacity has inspired much experimental work and various theoretical efforts directed at analyzing the underlying processes (e.g., Barrouillet, Bernardin, & Camos, 2004; Bayliss, Jarrold, Gunn, & Baddeley, 2003; Engle, Cantor, & Carullo, 1992; Oberauer & Lewandowsky, 2011; Towse, Hitch, & Hutton, 2000; Unsworth & Engle, 2007).
With few exceptions, theories of the processes involved in complex span, like theories of working memory in general, have so far remained verbal descriptions of mechanisms. This is problematic because it is generally acknowledged that working memory is a complex system, and comprehensive theories of working memory typically assume numerous mechanisms and processes that operate together (Baddeley, 1986; Cowan, 1995). With theories of such complexity, unambiguously determining predictions for a specific set of circumstances easily surpasses our human reasoning abilities (Farrell & Lewandowsky, 2010). The problem is often compounded by the vagueness of verbal theories, which leave many critical details unspecified (for an example, see Lewandowsky & Farrell, 2011). These problems can be addressed using computational modeling. Writing a theory as a computer program forces the theorist to specify the model in sufficient detail for the program to run. Moreover, running the program provides a means to derive precise and unambiguous predictions from the model. Every single decision on the way from the general principles of a theory to its detailed implementation, and every step on the way to its predictions for a specific experiment, is fully transparent in the programming code.
Computational modeling has been applied fruitfully to one experimental paradigm of working memory research, the serial-recall task (Burgess & Hitch, 1999; Farrell & Lewandowsky, 2002; Henson, 1998b; Page & Norris, 1998). The goal of the present work is to apply what we have learned from modeling of serial recall to developing a computational model of behavior in the complex-span paradigm. This is no trivial step, because complex span appears to rely on core cognitive abilities to a far greater extent than does simple span. For example, even though the surface similarity between different complex-span tasks (e.g., operation span vs. sentence span) is far less than the surface similarity between simple-span tasks from different domains, performance correlates more highly across domains for the complex-span than for the simple-span task (Kane et al., 2004). Moreover, because the complex-span task shares many features with other paradigms of working memory research—short-term retention of information, a requirement to retain serial order, distraction by a concurrent task, and coordination of multiple competing processes—a computational model of complex span will serve as a springboard for more precise theorizing in the field as a whole.
One central theoretical question about working memory is why it has limited capacity. Many theories explain the capacity limit by assuming that representations in working memory quickly decay over time unless they are actively maintained by rehearsal or refreshing (Baddeley, 1986; Barrouillet et al., 2004). This assumption has been incorporated into the only two computational models of complex span proposed so far (Daily, Lovett, & Reder, 2001; Oberauer & Lewandowsky, 2011). The assumption of rapid time-based decay, however, has been repeatedly questioned by empirical observations (for a review, see Lewandowsky, Oberauer, & Brown, 2009). One common alternative to decay is that working memory capacity is limited by interference between representations (Jonides et al., 2008; Nairne, 2002; Saito & Miyake, 2004). To date, however, the concept of interference has remained underspecified, thus limiting its theoretical utility (Jonides et al., 2008). We overcome this limitation here by instantiating the interference notion in a detailed computational model of complex span.
Our model attributes the capacity limit of working memory entirely to interference. The model accounts for all of the findings that provided the initial empirical support for a decay-based theory of complex span, the time-based resource-sharing (TBRS) theory of Barrouillet, Camos, and colleagues (Barrouillet et al., 2004; Barrouillet, Bernardin, Portrat, Vergauwe, & Camos, 2007). The TBRS theory has, arguably, been the strongest contender for explaining complex-span performance to date, and therefore we will compare our new model to a computational implementation of the TBRS theory (Oberauer & Lewandowsky, 2011).
This article proceeds as follows: We start by presenting our model—first informally as a set of theoretical assumptions, and then formally as a computational instantiation. We then apply the model to four sets of empirical findings. These represent benchmark findings from the complex-span paradigm that should serve as priority targets for modeling. The first is a set of findings concerning the relation between short-term retention and the temporal parameters of concurrent processing. These findings provided the empirical basis for the TBRS theory (Barrouillet et al., 2007; Barrouillet, Portrat, & Camos, 2011). The second set of findings represents a detailed analysis of recall errors in complex span, which has proved highly informative for models of simple span. The third set of findings concerns the effects of different kinds of similarity between memory items and distractors. The fourth set pertains to the pattern of correlations between span tasks across different domains that arises from the study of individual differences. The model is shown to handle all four sets of findings.
A distributed neural-network model for complex span
Our model is an extension of the SOB (“serial-order-in-a-box”) model, a distributed neural-network model of serial recall (Farrell & Lewandowsky, 2002). The initial SOB was an auto-associator in the tradition of the brain-state-in-a-box (BSB) architecture (Anderson, Silverstein, Ritz, & Jones, 1977), from which the model derived its name. The second version, called C-SOB (Farrell, 2006; Lewandowsky & Farrell, 2008b), has a two-layer structure, with one layer representing serial positions and the other representing items (the prefix “C” stands for “context,” because the position representations are a form of context). Both item and position representations are distributed—that is, they consist of patterns of activation across a large number of processing units in the network. Different items are represented by different patterns across the same set of units. Thus, item representations have well-defined similarity relations to each other, reflected in the similarity of the patterns representing them; the same holds for positions. Items are encoded in C-SOB through Hebbian associations between item and position representations: The first list item is associated with the first position representation (a.k.a. a position marker), the second item is associated with the second position marker, and so on. Memory for order is maintained by the patterns of association in the weight matrix that connects position markers to item representations. The use of context markers to represent order is a standard tool among memory theorists and has gained substantial empirical support (Lewandowsky & Farrell, 2008b).
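To make this architecture concrete, the following sketch stores a three-item list by superimposing Hebbian item-position outer products in a single weight matrix and then retrieves an item by cueing with its position marker. The layer size, random activation patterns, and list length are illustrative assumptions, not values taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units = 50  # illustrative layer size (an assumption, not from the article)

# Distributed representations: one random activation pattern per item
# and per position marker, all defined over the same set of units
items = [rng.standard_normal(n_units) for _ in range(3)]
positions = [rng.standard_normal(n_units) for _ in range(3)]

# Encoding: superimpose the Hebbian outer products of all item-position
# pairs in one shared weight matrix
W = np.zeros((n_units, n_units))
for v, p in zip(items, positions):
    W += np.outer(v, p)

# Retrieval: cue W with a position marker; the result is a distorted copy
# of the stored item, because all associations share the same weights
retrieved = W @ positions[0]
sims = [float(np.dot(retrieved, v) /
              (np.linalg.norm(retrieved) * np.linalg.norm(v)))
        for v in items]  # the first item should be the closest match
```

Cueing with the first position marker recovers the first item only approximately; the distortion contributed by the other superimposed associations is the source of the capacity limit described above.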
Memory performance is limited because all item-to-position associations are superimposed in the same weight matrix, so that at the point of recall the matrix represents each individual association only in a distorted fashion. One feature of both SOB and C-SOB, which is at the heart of much of the models’ predictive power, is that encoding strength is determined by an item’s novelty. Novelty is assessed by computing an expectation for each incoming item, on the basis of already-encoded memories, and determining the similarity between this expectation and the actual item. The more novel the incoming item is, the more strongly it is encoded. This process of assessing novelty to determine an item’s encoding strength is termed “novelty-gated encoding.” The assumption of novelty-gated encoding has been part of SOB since its inception, and has received independent empirical support (Farrell & Lewandowsky, 2003).
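Novelty-gated encoding can be sketched as follows: the expectation for a position is obtained by cueing the weight matrix, and energy (the negative dot product of expectation and actual item) indexes the mismatch between the two. The logistic squashing of energy into encoding strength is written here in a generic form whose threshold and gain defaults follow the parameter values given later in the text; the exact equations of SOB-CS may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

def energy(W, p, v):
    # Expectation for position p is W @ p; energy is the negative dot
    # product of expectation and actual item v, so a better match yields
    # more negative energy (i.e., less novelty)
    return -float(np.dot(W @ p, v))

def encoding_strength(E, thresh=-1000.0, gain=0.0033):
    # Logistic squashing of energy into encoding strength: more novel
    # (less negative) energy yields stronger encoding
    return 1.0 / (1.0 + np.exp(-gain * (E - thresh)))

p1 = rng.standard_normal(n)
v1 = rng.standard_normal(n)
W = np.zeros((n, n))

e_first = energy(W, p1, v1)    # empty memory: maximal novelty (energy 0)
W += encoding_strength(e_first) * np.outer(v1, p1)
e_repeat = energy(W, p1, v1)   # same item again: now expected, less novel
```

Re-presenting an already-encoded item produces strongly negative energy, so it would be re-encoded only weakly; this is the mechanism by which repeated distractors incur less interference than novel ones.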
Our new model, SOB-CS (CS for “complex span”), builds directly on C-SOB (Farrell, 2006; Lewandowsky & Farrell, 2008b), maintaining its original theoretical principles but slightly updating its mathematical formalization (see Electronic Supplementary Material for an explanation of these technical details). In addition, SOB-CS incorporates two further theoretical assumptions whose introduction was necessitated by the presence of distractors in the complex-span task.
First, we assume that processing a distractor, such as reading a word or carrying out an arithmetic operation, inevitably results in the encoding of a representation of the distractor into working memory in the same way as the memoranda (Oberauer & Lewandowsky, 2008). There is considerable precedent in the literature for this assumption (e.g., Logan, 1988). Therefore, distractors create interference with encoded item representations. Novelty-gated encoding applies to distractors in the same way that it applies to items, so that repeatedly processing the same distractor incurs less interference than does processing different distractors.
Second, like most models of working memory, ours assumes that the system engages in active restoration of an unimpaired memory state when time allows. This assumption is motivated by the finding that memory performance in complex span is better when distractor operations are demanded at a slower pace, leaving more free time between each distractor and the next stimulus (Barrouillet et al., 2004; Barrouillet et al., 2007). Whereas in decay-based models, active restoration typically refers to boosting decayed traces up to their original strength (by some sort of rehearsal or refreshing mechanism), active restoration must be conceptualized differently in interference-based models. When the main limiting factor for performance is interference, active restoration must reduce the impact of interference. This can be accomplished in several ways. For reasons of parsimony, we have so far implemented only one of them in SOB-CS, by applying a mechanism that is already embodied in all SOB models to date—namely, the removal of interfering material from memory, which by implication restores the quality of earlier memories. Because the removal notion is central to SOB-CS, it deserves to be placed into a broader theoretical context.
One general theoretical insight that emerged from our modeling work is that a successful model must have a mechanism for clearing working memory of no-longer-relevant contents. Without this, the system would soon be overloaded with outdated material. For example, when mentally solving an expression such as “24 × 3,” it would be inopportune to retain “3” in working memory during the final step of adding “12” to “60”—even though the concept “3” necessarily had to be brought to mind a brief moment before in order to compute “12.” In general, rapid updating of working memory would be impossible without a clearing or removal mechanism, because the system would soon be choked by proactive interference, and demonstrably, this does not happen (Kessler & Meiran, 2008; Oberauer & Vockenberg, 2009).
In decay-based models, removal of old contents from memory occurs by default (viz., they simply fade away), and active maintenance must be engaged to retain contents that are still relevant. Interference-based models operate by the reverse logic: All contents are maintained by default, and active removal is necessary to remove those that are no longer relevant. Thus, whereas decay-based models must be equipped with a mechanism for active maintenance, interference-based models must include a mechanism for active removal.
This necessary link between interference and removal has been largely ignored in the literature (for an exception, see Hasher, Zacks, & May, 1999). SOB-CS provides a precise mechanism explaining how removal is accomplished whenever there is free time in-between distractor operations—for example, during a pause between solving a distracting equation and presentation of the subsequent memorandum. During those pauses, the immediately preceding distractor representation is gradually removed from memory using Hebbian antilearning. This operation gradually undoes the association between the distractor and the position marker (see Kessler & Meiran, 2008, for the related idea of “dismantling” outdated bindings in an updating task).
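The effect of removal on memory quality can be illustrated with Hebbian antilearning: subtracting part of the distractor-position outer product from the weight matrix restores the fidelity with which the item bound to that position can be retrieved. The vector sizes and the removal fraction below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
item_v = rng.standard_normal(n)
pos = rng.standard_normal(n)
dist_v = rng.standard_normal(n)  # distractor bound to the same position

# Item and distractor associations superimposed in one weight matrix
W = np.outer(item_v, pos) + np.outer(dist_v, pos)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

before = cosine(W @ pos, item_v)

# Hebbian antilearning during free time: subtract a fraction of the
# distractor-position association; the fraction grows with available time
removal_fraction = 0.8  # illustrative value
W -= removal_fraction * np.outer(dist_v, pos)
after = cosine(W @ pos, item_v)  # item retrieval quality is restored
```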
The assumption of distractor removal (or “unbinding”) is a generalization of an assumption that is common in models of serial recall: Once a list item is recalled, it is removed from memory to avoid perseveration. There is strong evidence to support such a mechanism for response suppression (Farrell & Lewandowsky, 2004; Henson, 1998a), and it has been implemented in many models of serial recall (G. D. A. Brown, Preece, & Hulme, 2000; Burgess & Hitch, 1999; Page & Norris, 1998). In all versions of SOB, response suppression has been modeled using the mechanism of Hebbian antilearning. Accordingly, response suppression is an instance of removing no-longer-relevant information from memory. Here, we simply generalize this notion to distractors.
By specifying how information is removed from working memory, we flesh out one basic operation for controlling the contents of memory. Control over which information is held in working memory is being recognized as an important source of individual differences in working memory capacity (Hasher et al., 1999; Jost, Bryck, Vogel, & Mayr, 2010; Vogel, McCollough, & Machizawa, 2005). Explicitly modeling the control processes operating on the contents of working memory is a prerequisite for understanding why working memory capacity also correlates with various control processes in tasks with little involvement of memory (Kane, Conway, Hambrick, & Engle, 2007).
To summarize, strong conceptual considerations mandate the presence of some control process that can clear working memory of unwanted contents. The removal notion is supported by data and theoretical precedent, and in SOB-CS we instantiate the removal process using a mechanism of proven theoretical utility.
In SOB-CS, removal of distractors plays a role similar to that of rehearsal or refreshing of memory items in other theories. Our model does not presently include a maintenance process for the strengthening of items (e.g., rehearsal or refreshing), for three reasons. First, rehearsal or refreshing are necessary mechanisms of maintenance when memory traces are assumed to decay over time; however, in a model that attributes forgetting to interference, the threat to remembering comes from the presence of interfering material, not from the decay of memoranda, and therefore, the most effective way of protecting memory is to remove the sources of interference. Second, the existing evidence does not yield strong support for a causal role of rehearsal in complex-span tasks. While there is no doubt about the existence of articulatory rehearsal, the evidence for it being causally responsible for superior memory performance is less than compelling. For example, people who report using articulatory rehearsal in a complex-span task do not perform much better than those who report just reading the memory items as their strategy, and more effective strategies, such as elaboration, are reported by only a minority of participants (Dunlosky & Kane, 2007; Kaakinen & Hyönä, 2007). Third, as we show below, we have successfully modeled benchmark findings cited in support of refreshing without actually requiring the refreshing of memory items, and we therefore have omitted that mechanism for reasons of parsimony.
We remain open to the possibility that a rehearsal or refreshing mechanism might become necessary in a future extension of the model, if new results become available that mandate its inclusion. To summarize this crucial point: We do not claim that people do not rehearse during complex-span tasks. The existence of rehearsal is beyond dispute. We also do not rule out the possibility that rehearsal benefits memory; however, the evidence to date has turned out to be inconclusive upon closer inspection. What we demonstrate in the remainder of this article is that rehearsal is not needed to account for benchmark data in complex span.
In addition to the two new assumptions just discussed, we make explicit two hitherto tacit assumptions in SOB. First, previous versions of SOB have—for simplicity—modeled basic processes as time invariant, on the basis that in all paradigms to which the theory has been applied to date, sufficient time was available for processes to run to completion. Under those circumstances, the theory could leave temporal aspects of those processes unspecified for parsimony’s sake. By contrast, in SOB-CS, we must explicitly model the time dependence of encoding and retrieval processes because, in complex span, performance is strongly influenced by temporal parameters (Barrouillet et al., 2004). In particular, we make the uncontroversial assumption that the degree of encoding and the extent of removal of stimuli increases up to a point, as more time is available to those processes. There is considerable evidence that encoding into short-term memory takes time to be accomplished (Jolicœur & Dell’Acqua, 1998). Likewise, there is evidence that removal of information from working memory takes time (Oberauer, 2001).
Second, we make explicit the notion of a focus of attention in SOB-CS. The last stimulus encoded into working memory, or the last item manipulated, typically enjoys a privileged status of heightened availability (Garavan, 1998; McElree, 2006; Oberauer, 2003a), supporting the notion that, by default, the last representation operated upon remains in the system’s focus of attention. In SOB-CS, as in other distributed neural-network models, there is at any point in time a pattern of activation in each layer of units that (more-or-less accurately) represents an event (i.e., an item or a distractor in a specific position). We regard this currently active representation as the content of the focus of attention. The active representations in the focus of attention are those that are available for processes such as encoding (through Hebbian learning) and removal (through Hebbian antilearning, as explained below). By default, the last-presented stimulus (item or distractor), together with its serial position, is in the focus of attention, and thereby the association between that stimulus and its position can be encoded or removed.
Definition of variables in SOB-CS
p_i: Vector representing list position i
v_i: Vector representing list item i
Wp_i: Vector retrieved when the weight matrix is cued with position p_i
d_{j,k}: Vector representing distractor k following list item j
W: Weight matrix connecting the item layer and the position layer
Similarity between neighboring positions
Similarity between dissimilar list items
Similarity between similar list items
s_c: Similarity between the item prototype and the distractor prototype
R: Encoding rate for memory items and distractors
r: Removal rate for memory items and distractors
e: Threshold for the logistic function transforming energy into encoding strength
g: Gain for the logistic function transforming energy into encoding strength
N_o: Output interference (noise added to the weight matrix after each recall event)
c: Discriminability between recall candidates (the steepness with which similarity falls off with Euclidean distance from the retrieved vector)
Energy of item i; energy of distractor k following item j
A(i); A(j, k): Asymptotic encoding strength of item i and of distractor k after item j, respectively
Ω(i); Ω(j, k): Asymptotic strength of antilearning for item i and for distractor k after item j, respectively
η_e(i): Encoding strength for encoding item i, computed from A(i), t_e, and R
Encoding strength for encoding distractor k following item j, computed from A(j, k), t_d, and R
Antilearning strength for removing item i, computed from Ω(i), t_s, and r
Antilearning strength for removing distractor d_{j,k}, computed from Ω(j, k), t_s, and r
t_e: Time spent attending to, and thereby encoding, an item
t_d: Time spent attending to, and thereby encoding, a distractor
t_s: Time spent removing an item or distractor
Architecture and representations
Whereas position markers are shared by all tasks requiring memory for serial order, the representations of stimuli depend on the category of items and the distractors involved in a task. We constructed representations for four categories of stimuli used in the experiments that we simulated: letters, digits, words, and generic visuospatial stimuli, each of which can serve as memory items or as distractors. For letters, we used the representations of 16 consonants (six similar and ten dissimilar) constructed by Farrell (2006) and Lewandowsky and Farrell (2008a) for use with C-SOB. The similarity structure between these 16 vectors reflects a three-dimensional multidimensional scaling solution for an empirical confusion matrix between these letters (Hull, 1973). The average similarity between consonant representations, computed as their cosine, was .65 for the similar and .50 for the dissimilar subsets.
Nine vectors were constructed to represent digits. These were created from a common prototype such that their average similarity was .50; this value reflects the fact that digits are, on average, less confusable than letters (Jacobs, 1887). Finally, we created nine sets of nine words each. The words within each set were similar to each other (cosine = .65), and words from different sets were dissimilar (cosine = .5). For simulations of experiments not manipulating similarity, we used a random mixture of similar and dissimilar letters or words for all memory lists and distractor sets. Visuospatial stimuli (used in Simulations 5 and 6) were generated in the same way as the words, except that they are represented in a separate section of the item layer.
Distractor representations were created in the same way as the item representations—that is, by sampling individual distractors from a distractor prototype. For simulations in which items and distractors came from the same broad category (i.e., digits, letters, or words), we used the same prototype for generating both items and distractors, so that the average similarity between an item and a distractor equaled that between two items and between two distractors. For simulations in which the items and distractors came from different categories, we derived the prototype of the distractors from the prototype of the items, so that they had a similarity governed by the item–distractor similarity parameter sc, which was set to .35, the same value as in previous simulations (Oberauer & Lewandowsky, 2008).3
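One simple way to construct such stimulus sets is to mix a unit-length prototype with fresh noise for each sample, choosing the mixing weights so that two samples have an expected cosine near the target value. This construction is an assumption made for illustration; the article does not specify the exact sampling procedure.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100  # illustrative vector length

def unit(x):
    return x / np.linalg.norm(x)

def sample_from_prototype(prototype, s, rng):
    # Mix the unit-length prototype with a fresh unit-length noise vector
    # so that two independent samples have an expected cosine of about s
    noise = unit(rng.standard_normal(len(prototype)))
    return unit(np.sqrt(s) * prototype + np.sqrt(1.0 - s) * noise)

# Nine "digit" vectors with an average pairwise similarity near .50
proto = unit(rng.standard_normal(n))
digits = [sample_from_prototype(proto, 0.50, rng) for _ in range(9)]
sims = [float(np.dot(a, b))
        for i, a in enumerate(digits) for b in digits[i + 1:]]
mean_sim = float(np.mean(sims))
```

Deriving a distractor prototype from the item prototype in the same way, with mixing weight s_c, would yield the item-distractor similarity structure described above.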
Encoding and recall
Encoding strength is governed by the item’s energy (see Lewandowsky & Farrell, 2008b). The computation of energy can be interpreted as the generation of an expectation for the item in position i, given the current state of memory as reflected in the state of the weight matrix before encoding of item i. The expectation is computed by cueing the weight matrix W with the new position p_i. Energy is the negative dot product between the expectation (computed as Wp_i) and the actual item v_i, and it reflects the degree of mismatch between the expectation and the actual item—that is, the item’s novelty. Equation 4 implies that more novel items, which have less negative (or even positive) energy values, are encoded more strongly. The use of energy to compute the weighting of incoming information is a core principle of SOB that turns out to be critical for the model’s predictions for complex span. Recent research on the processing of novelty in the hippocampus lends support to a mechanism very similar to the one assumed for SOB-CS (Kumaran & Maguire, 2007).
Our computation of encoding strength in SOB-CS differs from previous instantiations of SOB in that it makes encoding strength time-dependent. The duration of encoding an item into working memory can be estimated from dual-task studies (Jolicœur & Dell’Acqua, 1998) and from studies using masked presentation of visual stimuli (Vogel, Woodman, & Luck, 2006). These studies converge on an estimate of 150–300 ms as the average time for encoding an item into working memory. Jolicœur and Dell’Acqua applied a formal model to three of their experiments, from which we estimated the encoding rate for individual letters to be about six items per second.4 We therefore set the encoding rate to R = 6 in Eq. 3. This value implies that ηe(i) reaches 95 % of its asymptote after 500 ms. Thus, with encoding times of 500 ms or more, encoding in SOB-CS is virtually indistinguishable from encoding in previous versions of C-SOB, and Eq. 3 functionally reduces to the simple equality ηe(i) = A(i) that—bar the use of a logistic squashing function—is familiar from earlier applications of C-SOB.
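The stated figures (R = 6, roughly 95 % of asymptote after 500 ms) are consistent with an exponential approach to asymptote of the form η_e(i) = A(i)·(1 − exp(−R·t_e)); the sketch below uses that form, which should be read as one plausible realization consistent with the text rather than the model's verbatim equation.

```python
import math

def encoding_strength(A, t, rate=6.0):
    # Saturating growth of encoding strength with encoding time t (in s):
    # strength approaches the asymptote A at rate R (items per second)
    return A * (1.0 - math.exp(-rate * t))

# With R = 6, encoding reaches ~95% of asymptote after 500 ms
frac_at_500ms = encoding_strength(1.0, 0.5)
```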
At retrieval, the similarity between the retrieved vector and each recall candidate declines with the Euclidean distance measure D between them, weighted by the free parameter c, which determines the discriminability between retrieval candidates. With larger values of c, similarity falls off more steeply with distance, so that the most similar retrieval candidate is more clearly discriminated from the less similar ones. For computational reasons, D is normalized by subtracting the minimum distance across all of the n retrieval candidates from the distance for each candidate.
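The retrieval rule can be sketched as follows. The exponential similarity function exp(−c·D) and the normalization of similarities into choice probabilities are assumptions consistent with, but not taken verbatim from, the text.

```python
import numpy as np

def recall_probabilities(retrieved, candidates, c=1.3):
    # Euclidean distance of each candidate from the retrieved vector,
    # normalized by subtracting the minimum distance; similarity falls
    # off exponentially with distance at steepness c, and choice
    # probabilities are proportional to similarity
    D = np.array([np.linalg.norm(retrieved - cand) for cand in candidates])
    D = D - D.min()
    sim = np.exp(-c * D)
    return sim / sim.sum()

rng = np.random.default_rng(4)
cands = [rng.standard_normal(20) for _ in range(5)]
noisy = cands[0] + 0.3 * rng.standard_normal(20)  # distorted trace of item 0
probs = recall_probabilities(noisy, cands)
```

Because the retrieved vector is a distorted version of the first candidate, that candidate receives by far the largest recall probability, while the remaining probability mass produces occasional transposition and intrusion errors.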
The set of recall candidates includes not only the list items but also other items in the experimental vocabulary. This enables the model to generate extralist intrusion errors. When the distractors come from the same stimulus category as the items (e.g., both are words), we assume that the distractors are also included in the candidate set, so that intrusions of distractors in recall can be modeled. (When distractors are categorically different from the memoranda, they are excluded from the set of candidates because people can prevent intrusions on the basis of categorical information.)
Overt recall itself impairs memory for the remaining items; this effect is known as output interference (Cowan, Saults, Elliott, & Moreno, 2002; Fitzgerald & Broadbent, 1985; Oberauer, 2003b). In SOB-CS, as in C-SOB, we implemented output interference by adding Gaussian noise with a standard deviation No to each weight in W after recall of each item.
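Output interference can be sketched by corrupting the weight matrix with Gaussian noise after each simulated recall event and tracking how retrieval quality degrades; the matrix size and the single stored association are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
v = rng.standard_normal(n)
p = rng.standard_normal(n)
W = np.outer(v, p)  # a single stored item-position association

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

quality = [cosine(W @ p, v)]
for _ in range(3):  # three successive recall events
    # Output interference: Gaussian noise with SD No = 1.5 added to each
    # weight after every recall
    W = W + rng.normal(0.0, 1.5, size=(n, n))
    quality.append(cosine(W @ p, v))
```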
Distractor encoding and removal
The distractor is encoded while attention is devoted to it (Phaf & Wolters, 1993). Like items, distractors reach near-asymptotic encoding strength after about 500 ms.
Estimates from experiments in which participants were instructed to (temporarily or permanently) remove part of the contents of working memory have shown that removal takes between 1 and 2 s (Oberauer, 2001, 2002). Therefore, we set r to 1.5, which implies that the rate of antilearning for removal has reached 95 % of its asymptote Ω(j, k) after 2 s.
To summarize, SOB-CS has four fixed parameters and six free parameters (see Table 1). We regard as fixed parameters those that were treated as fixed parameters in previous versions of C-SOB and whose values we did not change; they all pertain to the similarity between representations. We regard as free parameters those that we adjusted manually—either on the basis of independent evidence, as in the case of the rate parameters for encoding and removal, or to find values that generated good model fits to the benchmark data. We set the free parameters to the same values in all simulations reported in this article, except in a few cases, which will be explicitly noted: the threshold and gain parameters of the logistic function that translates energy into encoding strength, e = –1,000 and g = 0.0033; the encoding rate, R = 6; the removal rate, r = 1.5; the confusability parameter for items during retrieval, c = 1.3; and the output interference parameter, No = 1.5.
Before describing the application of SOB-CS to data from the complex-span paradigm, we briefly summarize an alternative account of complex-span performance, TBRS* (Oberauer & Lewandowsky, 2011), as that model serves as an important baseline for assessing SOB-CS’s account of several key phenomena.
An alternative theory: The time-based resource-sharing (TBRS) model
A new model must at the very least explain the data that constitute the main empirical support for extant theories and models. One contender for explaining performance in complex span is the time-based resource-sharing (TBRS) theory (Barrouillet et al., 2004), which has recently been instantiated in a computational model, TBRS* (Oberauer & Lewandowsky, 2011). TBRS* is the most sophisticated implementation yet of two popular assumptions about working memory: that memory traces decay rapidly over time, and that decay can be prevented by some form of active maintenance (i.e., rehearsal or refreshing). It is clear from the foregoing discussion that those assumptions stand in diametric opposition to the architecture of SOB-CS. Our first goal in this article is therefore to demonstrate that SOB-CS can explain the key phenomena cited in support of TBRS and TBRS* without committing to the core assumptions of the TBRS theory. Here, we summarize the TBRS theory and the key findings in its support.
Experiments within this paradigm have revealed three consistent regularities that lend support to the cognitive-load equation: First, memory performance decreases as the pace at which processing steps are required increases (Barrouillet et al., 2004; Barrouillet et al., 2007). This effect has been found across a large variety of memory materials and distractor tasks (Hudjetz & Oberauer, 2007; Vergauwe, Barrouillet, & Camos, 2010), and with children as well as adults (Barrouillet, Gavens, Vergauwe, Gaillard, & Camos, 2009; Portrat, Camos, & Barrouillet, 2009). In terms of the cognitive-load equation, increasing the pace means increasing the ratio of aN to T by increasing N, reducing T, or both. Second, when the pace is held constant, memory declines as the duration of the individual operations increases; in terms of the equation, longer operations at a constant pace increase a while N and T remain unchanged, and hence increase cognitive load.
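The ratio at the heart of these regularities can be made concrete with a short sketch; the variable names follow the text's cognitive-load equation (a = time each operation captures attention, N = number of operations, T = total time available).

```python
def cognitive_load(a, N, T):
    """Cognitive load as described in the text: the ratio of aN to T,
    where a is the time each operation captures attention, N is the
    number of operations, and T is the total time available."""
    return a * N / T

baseline = cognitive_load(0.5, 4, 4.0)          # 0.5
# Increasing the pace raises load by increasing N...
faster_more_ops = cognitive_load(0.5, 6, 4.0)   # 0.75
# ...or by reducing T...
faster_less_time = cognitive_load(0.5, 4, 2.5)  # 0.8
# ...whereas scaling N and T by the same factor leaves load unchanged.
same_load = cognitive_load(0.5, 8, 8.0)         # 0.5
print(baseline, faster_more_ops, faster_less_time, same_load)
```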
Third, when the pace and the time demands of individual operations are held constant, increasing the number of operations following each memory item—and, hence, the total duration for each processing episode—has often been found to leave memory performance unaffected (Barrouillet et al., 2004; Oberauer & Lewandowsky, 2008). This is predicted by the cognitive-load equation: Increasing the number of operations at a constant pace means increasing both N and T by the same proportion, so cognitive load is unchanged. This third finding, however, needs to be qualified: In a series of experiments using word reading as the distractor task, we varied the number of distractor words to be read aloud after each memory item. If the same word was repeated four times, memory was as good as when reading a single word, but when three different distractor words followed each item, additional forgetting was observed (Lewandowsky, Geiger, Morrell, & Oberauer, 2010).
In sum, the TBRS theory and its computational implementation, TBRS*, currently offer the strongest alternative account to SOB-CS for explaining key experimental results from the complex-span paradigm. We therefore regard the findings that provided initial crucial support for TBRS as the first set of benchmark results that our new model needs to explain.
Complex span: Benchmark findings and new predictions
Benchmark findings from complex span and new predictions from SOB-CS

Cognitive load
- Increasing pace of processing operations impairs memory.
- Increasing operation duration while holding pace constant impairs memory.

Effects of operation duration and free time
- The effect of operation duration is smaller than that of free time (present experiments; Electronic Supplementary Material).

Effect of number of operations
- At constant cognitive load, number of operations has no effect on memory when distractors are identical, and it impairs memory when distractors differ.

Serial-position effects and error types
- Serial-position curves are characterized by extended primacy and small recency.
- Serial-position curves of complex span are parallel to those of simple span.
- The proportion of order errors follows an inverted U-shaped function over serial position; the proportion of item errors increases monotonically.
- More order errors than item errors occur with simple span; the reverse is true for complex span.
- Transpositions follow the “locality constraint”: They are more likely to occur to close than to distant positions.
- Transposition gradients are equally steep for simple and complex span.

Item–distractor similarity: Proximity
- Similar distractors interfere less with memory than do dissimilar distractors if they immediately follow items that they are similar to (Oberauer et al., 2012).

Item–distractor similarity: Categorical
- Recall is worse when distractors come from the same category as the items than when they come from different categories.

Item–distractor similarity: Feature-space overlap
- Processing tasks in the same content domain interfere more with memory than do processing tasks in a different domain.

Correlation of simple and complex span
- Simple- and complex-span measures load on separate but correlated factors.

Correlation across content domains
- Span tasks with verbal and visuospatial content load on separate but correlated factors.
The benchmarks and the accompanying new predictions can be grouped into four sets. The first set consists of effects related to the interplay between short-term retention and processing, which is a longstanding topic of research and theorizing on working memory (Bayliss et al., 2003; Case et al., 1982; Towse et al., 2000). These are the effects that constitute the empirical support for the TBRS, described above. The second set consists of serial-position curves, error patterns, and transposition gradients. These are findings that, although given relatively little attention so far in the working memory literature, have been important in constraining models of serial recall. The third set pertains to the effects of similarity between memory items and distractors. These similarity effects are diagnostic for the mechanisms of interference in working memory. We test a new prediction arising from the assumptions in SOB-CS about the interference between items and distractors, and address the longstanding question of whether the disruption of immediate memory (“storage”) by distractor processing is domain-general or domain-specific. The fourth set concerns individual differences, which have been a major topic of research with the complex-span task. One of the reasons that so much interest has focused on complex span (and other working memory tasks) is that 50 % of the variance in working memory capacity across individuals is shared with measures of fluid intelligence (Conway et al., 2003). In addition, there has been considerable interest in the patterns of correlations between simple-span and complex-span tasks; we focus on those correlations because they fall within the scope of SOB-CS. In the remainder of this article, we will report the simulations through which we applied SOB-CS to these four groups of benchmark findings and present new model predictions, together with data testing them.
Cognitive load and the number of operations
The cognitive-load effect (Barrouillet et al., 2004) is an important regularity concerning the interplay between memory and processing in working memory. It implies that whereas the processing of distractors impairs memory, more generous intervals of free time between individual processing steps can be used to restore memory. The TBRS theory (Barrouillet et al., 2004) identifies decay as the cause of forgetting during processing operations, and refreshing as the beneficial force in the free intervals between operations. In contrast, SOB-CS assumes that distractor processing damages memory via the interference introduced from distractor representations entering working memory (Eqs. 13–15), and that the beneficial effect of free time arises from the gradual removal of distractor representations in-between processing steps (Eqs. 16–18).
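The two opposing processes can be caricatured in a few lines of code. The saturating-encoding and exponential-removal forms below are illustrative assumptions consistent with the rates reported earlier (R = 6, r = 1.5); they are not the model's actual Eqs. 13–18.

```python
import math

R = 6.0  # encoding rate from the parameter list
r = 1.5  # removal rate from the parameter list

def residual_interference(op_duration, free_time):
    """Interference left by one distractor: encoding saturates during
    the operation, and the trace is gradually unbound during the
    following free time (both functional forms are illustrative)."""
    encoded = 1.0 - math.exp(-R * op_duration)  # grows toward an asymptote
    return encoded * math.exp(-r * free_time)   # shrinks with free time

# More free time after each operation leaves less residual interference.
for free_time in (0.0, 0.5, 1.0, 2.0):
    print(round(residual_interference(0.5, free_time), 3))
```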
Table: Levels of cognitive load for Simulations 1 and 5 (0.3/0.5; 1.2/2.0; 0.3/0.5/0.7).
The results of Simulation 1 clarify how SOB-CS accounts for the first two of the three benchmark results cited in support of the TBRS theory. First, increasing the pace of processing for a fixed total processing duration decreases the free time after each distractor operation, thereby leaving less time to remove the preceding distractor. Second, increasing the duration of individual operations while holding their pace constant has two effects in SOB-CS. One is that longer attention to each distractor leads to stronger encoding, thereby creating more interference. This effect is small and levels off after 500 ms. The other, more pronounced effect is that as more of the fixed time between two operations is spent on processing the distractor, less time is left for removing it.
Unpacking cognitive load: Operation duration and free time
Cognitive load is determined by two temporal variables, the duration of distractor operations and the free time in between them. These components play different roles in SOB-CS and TBRS*. In SOB-CS, increasing the operation duration has only a limited effect through increasing the strength of encoding of interfering distractors (up to about 500 ms), whereas increasing the free time has a more pronounced beneficial effect because removal of the preceding distractor is a relatively slow process. In TBRS*, extending the operation duration leads to more decay, which continues as long as attention is captured by the operation, and extending the free time enables more refreshing of memory items. The two models differ in that SOB-CS predicts only a very small effect of operation duration if free time is held constant, whereas TBRS* predicts a much larger effect of operation duration (Oberauer & Lewandowsky, 2011, Fig. 6).
In Simulation 2, we modeled these experiments with SOB-CS and, for comparison, with TBRS*. We used the response latencies for size judgments to estimate the operation duration (separate estimates were taken for the first operation after each memory item and for successive operations because their latencies differed substantially; see Electronic Supplementary Material). Overt response latencies do not reflect the time for which a cognitive operation captures central attention (Barrouillet et al., 2007; Pashler, 1994) because sensory and motor processing components can be carried out independently of central attention. Estimates of the duration of those noncentral processing components are consistently between 350 and 550 ms (S. D. Brown & Heathcote, 2008; Ratcliff, Thapar, & McKoon, 2004, 2010). We therefore subtracted a noncentral component of 500 ms from the measured times to obtain an estimate of central processing duration per size judgment. The noncentral 500 ms were added to the nominal free time because this time could be used to remove distractors (SOB-CS) or to refresh memory items (TBRS*). In other words, we extended the nominal free time by 500 ms and reduced the distractor-processing time by the same amount in order to reflect the likelihood that some proportion of the distractor time did not involve the attentional bottleneck. Without this assumption, both models would predict greatly exaggerated effects of free time, and TBRS* would underpredict memory performance.
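This adjustment of the nominal timing parameters amounts to simple arithmetic, sketched here with a hypothetical helper function; the 500-ms noncentral component is the estimate used in the text.

```python
def adjusted_timing(response_latency, nominal_free_time, noncentral=0.5):
    """Shift an assumed noncentral (sensory + motor) component of each
    response latency out of the operation duration and into the free
    time, as described in the text.

    Returns (central_operation_duration, effective_free_time) in seconds.
    """
    central = max(response_latency - noncentral, 0.0)
    return central, nominal_free_time + noncentral

# e.g., a 1.2-s size judgment followed by 0.4 s of nominal free time:
print(adjusted_timing(1.2, 0.4))
```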
Figure 7 also shows the predictions of SOB-CS (middle panel) and those of TBRS* (bottom panel). For SOB-CS, we used the default parameter values, except for the discrimination parameter c, which we raised to 2.0 to bring the predictions into the overall accuracy range of the data. For TBRS*, we also used the default parameter values (Oberauer & Lewandowsky, 2011), except for the decay rate, which we reduced from 0.5 to 0.4 to raise performance to the empirical accuracy level. The simulation results of SOB-CS confirm what we saw in Simulation 1: Free time had a relatively large effect, whereas the effect of operation duration was tiny. The simulation with TBRS* shows much larger effects of operation duration than does SOB-CS. This is because in TBRS*, longer operations lead to more decay, which results in substantial forgetting.
For a quantitative comparison of the data with the predictions of the two models, we focused on the critical effect of operation duration. Across the three experiments, the mean effect of the manipulation of operation duration (i.e., size judgment difficulty) on memory was a 2.3-percentage-point loss of performance, with a 95 % confidence interval of [0.8, 3.8]. The predicted effect from SOB-CS was 1.0 percentage points, falling inside the confidence interval of the data. The predicted effect of TBRS* was 8.6 percentage points, clearly outside the confidence interval. Therefore, the data support the unique prediction of SOB-CS that the effect of cognitive load primarily reflects a beneficial effect of free time following a distractor, whereas the duration required to process the distractor plays only a minor role.
One potential objection to our model comparison in this section is that our simulations were contingent on our estimate of the time for sensory and motor processes (500 ms), during which central attention was not occupied. With different estimates for the duration of noncentral processes, TBRS* might give a better account of the data, and SOB-CS might look worse. To investigate this issue, we ran the simulations with different values for the assumed duration of noncentral processes, ranging from an implausibly short 0.1 s to an implausibly long 0.8 s. The results of these simulations are presented in Electronic Supplementary Material; they show that, irrespective of the particular estimate of the noncentral component in the size-judgment latencies, SOB-CS gives a better account of the data than does TBRS*.
The effect of the number of operations
The third benchmark finding cited in support of the TBRS is that the number of operations in between memoranda has no effect on memory. Barrouillet et al. (2004) predicted from their model that as long as cognitive load was held constant, the number of successive distractor operations in a complex-span task should not affect memory performance. This prediction plays an important role in the TBRS theory, because it protects the theory against a challenge that other decay-based theories face. Much evidence against decay has come from studies showing that extending a distractor-filled retention interval has no effect on memory (for a review, see Lewandowsky, Oberauer, & Brown, 2009). The TBRS theory apparently escapes this challenge by predicting that the retention interval will have no effect as long as cognitive load is held constant. Therefore, it is important to examine this prediction carefully.
Simulations with TBRS* have revealed a deviation from the predictions derived by Barrouillet et al. (2004), for reasons that become obvious upon closer inspection. TBRS* predicts that memory will decline with an increasing number of operations when cognitive load is at least moderately high (Oberauer & Lewandowsky, 2011). Brief reflection reveals this prediction to be inevitable within the TBRS theory: The only circumstance under which performance can be independent of the number of operations is when the time for refreshing exactly balances the decay experienced during processing. Whenever the effect of decay is stronger than that of refreshing during an individual processing operation (and the free time following it), the TBRS theory must predict that increasing the number of operations will lead to worse memory. We confirmed this prediction by simulation with TBRS* (Oberauer & Lewandowsky, 2011). We next consider the empirical pattern involving the effects of the number of operations, before we turn to simulations of SOB-CS to investigate whether the model can reproduce that empirical pattern.
Empirically, the effect of increasing the number of operations is quite nuanced and is determined by the relationship between the successive distractors in a processing episode (Lewandowsky et al., 2010; Lewandowsky, Geiger, & Oberauer, 2008). When the distractors are all identical (e.g., “April, April, April”), saying them three or four times does not lead to more forgetting than does saying them once. In contrast, when three different distractors (e.g., “April, May, June”) follow each memory item at encoding, recall is substantially impaired relative to a single distractor. Thus, any forgetting that could be attributed to decay is turned on or off depending on properties of the stimuli that are not considered relevant by the TBRS theory.
By contrast, these effects are predicted by a key principle of the SOB model series—namely, novelty-gated encoding: After processing and encoding the first distractor, each further identical distractor has negligible novelty, and hence is encoded with negligible strength. In contrast, when successive distractors differ from one another, each of them is to some degree novel, and therefore is encoded with substantial strength, thus adding to interference. When a series of different distractors follows each memory item, SOB predicts that memory will suffer when more of them are added.
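Novelty-gated encoding can be illustrated qualitatively: a distractor's encoding strength depends on how dissimilar it is to what has already been encoded. The cosine-based novelty score below only mimics the direction of the model's energy-based gating; it is not the model's actual computation, and the toy vectors are made up.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def novelty(new, already_encoded):
    """Illustrative novelty: 1 minus the maximal similarity to anything
    already encoded; novel inputs score near 1, repetitions near 0."""
    if not already_encoded:
        return 1.0
    return 1.0 - max(cosine(new, old) for old in already_encoded)

april = [1.0, 0.2, 0.1]  # toy distractor representations
may = [0.1, 1.0, 0.3]

# "April, April, April": a repeated distractor has negligible novelty,
# so it is encoded with negligible strength...
print(round(novelty(april, [april]), 3))
# ...whereas in "April, May, ..." each new distractor is still novel.
print(round(novelty(may, [april]), 3))
```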
The middle panel of Fig. 8 shows the results of Simulation 3, in which we applied SOB-CS to the same experimental conditions. The simulation used letters as memoranda and words as distractors to match the materials in the experiment. Representations were generated such that different words had an average similarity (i.e., vector cosine) of .5 with each other, and words overall had an average similarity of approximately .1 with the letters. Operation duration was set to 0.5 s, the value that results in near-asymptotic encoding of each distractor, in accordance with the measured word-reading latencies (>2 s for three words). Free time was set to 0.1 s to reflect the fact that there was hardly any temporal gap between the reading of successive words, as enforced by the experimenter, who urged participants to speak continuously without pauses and advanced the display sequence as soon as they had finished speaking.
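The text does not spell out how vectors with a target average pairwise cosine were generated; one standard recipe, given here purely as a hypothetical reconstruction, mixes a shared component into independent random vectors.

```python
import math
import random

def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def make_family(n_items, dim, shared, rng):
    """Random unit vectors whose expected pairwise cosine is roughly
    `shared`: each vector mixes a common direction (weight sqrt(shared))
    with independent Gaussian noise (weight sqrt(1 - shared))."""
    common = unit([rng.gauss(0, 1) for _ in range(dim)])
    family = []
    for _ in range(n_items):
        noise = unit([rng.gauss(0, 1) for _ in range(dim)])
        family.append(unit([math.sqrt(shared) * c + math.sqrt(1 - shared) * n
                            for c, n in zip(common, noise)]))
    return family

rng = random.Random(1)
words = make_family(20, 200, 0.5, rng)
cosines = [dot(words[i], words[j])
           for i in range(len(words)) for j in range(i + 1, len(words))]
print(round(sum(cosines) / len(cosines), 2))  # close to the target of .5
```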
With the exception of the recency effect for the single-distractor condition (which was predicted but absent in the data), the simulation closely matched the empirical data. In particular, SOB-CS accurately reproduced the interaction between number of distractor operations and distractor similarity: Increasing the number of distractors had an adverse effect on memory if and only if the distractors differed. This interaction presents a challenge for TBRS, which has no mechanisms sensitive to the similarity between distractors. This raises the question: how well could TBRS* account for the data of Experiment 3 in Lewandowsky et al. (2010) if additional assumptions were made that were particularly favorable to the model?
The bottom panel of Fig. 8 reproduces a simulation with TBRS* for that experiment (Oberauer & Lewandowsky, 2011). For this simulation, we assumed that whereas reading a new word occupies the attentional bottleneck for 0.3 s, repeating the same word does not require any further attention after the first word. Thus, cognitive load is assumed to be substantially lower in the condition with four identical distractors than in the condition with three different distractors. It is important to realize that those assumptions are maximally favorable to TBRS*: In actual fact, repeatedly reading the same word aloud off the screen is unlikely to be completely attention-free. Only if we make this favorable assumption can TBRS* account for the relative accuracies of the four experimental conditions, averaged across serial positions. However, even under these favorable circumstances, TBRS* erroneously predicts that the effects of distractors will be entirely absent at the first list position and will increase strongly over serial positions, particularly at the last position. We will explore the reason for this erroneous prediction in the next section, when we discuss serial-position effects.
To summarize, SOB-CS correctly predicts that the effect of the number of distractor operations is modulated by the similarity of successive distractors. TBRS* can provide a post-hoc explanation for this modulation, but still it accounts for the detailed pattern of data less well than does SOB-CS.
Discussion: Cognitive load and number of operations
The strong and approximately linear relationship between memory performance and cognitive load (Barrouillet et al., 2004; Barrouillet et al., 2011) has been one of the important discoveries of the last decade in the field of working memory. There is little doubt that this function results from the interplay of two opposing processes: one that is detrimental to memory and occurs during distractor processing, and one that is beneficial to memory and occurs during brief pauses in between processing of the memoranda and distractors.
To date, the only available explanation for the cognitive-load effect has identified time-based decay and refreshing of memory traces, respectively, as those two opposing processes. This explanation lies at the core of the TBRS model. Independent evidence, however, strongly speaks against a major role of time-based decay in short-term or working memory (Lewandowsky, Oberauer, & Brown, 2009). This raises the question of whether the effect of cognitive load on immediate memory can be explained without assuming decay. Our simulations have established that SOB-CS reproduces the benchmark cognitive-load findings from complex-span tasks without invoking decay, and without invoking rehearsal or refreshing. These simulation results show that the crucial finding upon which the TBRS was built—the cognitive-load function—does not constitute unique evidence for that theory. On the contrary, when cognitive load is broken down into its two temporal components—operation duration and free time—SOB-CS arguably provides a better quantitative account of their individual effects than does TBRS*.
SOB-CS also accounts for the detailed pattern of results concerning the third benchmark, the effect of the number of operations. A previous version of C-SOB correctly predicted that under high cognitive loads, the number of operations would matter if and only if the distractors differed from each other; SOB-CS reproduced that pattern here. TBRS* can explain this finding only with the addition of favorable assumptions about variations in operation duration, and even then it mispredicts the interaction of the distractor effect with serial position.
The success of SOB-CS in modeling the first three benchmark findings lends support to the assumptions responsible for this success: Forgetting in working memory is primarily due to interference; concurrent processing adds to interference because distractor information is encoded into working memory; and the strength of distractor encoding is modulated by the distractor’s novelty, whereas free time following distractor operations can be used to reduce interference by gradually unbinding the preceding distractor from its context marker.
Inside complex span: Serial-position curves and error patterns
Our second set of empirical benchmarks involves the serial-position curve for the conventional method of scoring (i.e., recall of the correct item in the correct position), as well as for item errors and order errors within the complex-span task. SOB-CS makes two novel predictions for these benchmarks, both of which pertain to a comparison of simple to complex span. Our search of the literature revealed that the experiments of Lewandowsky et al. (2010) are the only ones that afford a controlled comparison of the serial-position curves for simple and complex span. Simulation 3 above demonstrated that SOB-CS accurately reproduces these serial-position curves. For the present discussion, we will focus on the condition without distractors (simple span) and on the condition with three different distractors following each letter, because that condition is most representative of complex-span tasks.
One prediction from SOB-CS is that the serial-position curves of simple span and complex span are largely parallel, as is shown in the middle panel of Fig. 8. This prediction is important because it distinguishes SOB-CS from models assuming decay together with rehearsal or refreshing to counteract it, such as the TBRS* model. As noted above, TBRS* predicts a strong interaction of the contrast between simple and complex span with serial position, with hardly any effect of distractor processing on recall of the first list item, and increasingly adverse effects for later list items. The reason for this prediction is that TBRS* must assume cumulative refreshing; that is, in each free-time interval, refreshing starts with the first list item. Decay models of complex span must assume cumulative rehearsal or refreshing, because decay alone imposes a strong recency gradient on memory strength (i.e., early list items decay more than later items). Cumulative refreshing, which prioritizes earlier list items, is needed to overcome this recency gradient and instead to produce a primacy effect on recall (Oberauer & Lewandowsky, 2011).
Cumulative refreshing largely protects the first item from decay in complex span. As cognitive load in complex span increases, refreshing progresses less far into the list (because less free time is available), but as long as cognitive load is not extremely high, the first item is always refreshed. In consequence, TBRS* must predict that the first list item is largely immune to manipulations of interference or cognitive load, but that those effects will increase across serial positions. As we noted in the introduction, any theory assuming decay needs a mechanism of rehearsal or refreshing, and therefore faces this problem.
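The cumulative-refreshing schedule that TBRS* assumes can be sketched as follows; the per-item refreshing cost used here is a hypothetical value chosen for illustration, not TBRS*'s fitted parameter.

```python
def refreshed_positions(list_length, free_time, cost_per_item=0.08):
    """Cumulative refreshing: each free interval starts at position 1
    and proceeds forward while time remains, so early positions are
    refreshed on every interval and late positions only when free
    time is ample (cost_per_item is a hypothetical value)."""
    n_refreshed = int(free_time / cost_per_item)
    return list(range(1, min(n_refreshed, list_length) + 1))

# With little free time only the first items are refreshed, which is
# why TBRS* predicts that the first item is largely protected while
# distractor effects grow across serial positions.
print(refreshed_positions(6, 0.3))
print(refreshed_positions(6, 1.0))
```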
SOB-CS does not predict this strong interaction, because it does not require cumulative refreshing to maintain a list in memory. Instead, distractor interference and distractor removal apply in the same way to each list position. As is shown in the top panel of Fig. 8, the prediction of largely parallel serial-position curves for simple and complex span was borne out by the data. This confirms the first new prediction of SOB-CS.
Item and order errors
In the model, the locality constraint results from the association of items to position markers, together with the overlap of neighboring position markers, which decreases over positional distance. Therefore, each position marker cues not only the item associated with it, but also items associated with neighboring positions to the degree that the position markers overlap. As a consequence, the retrieved vector is similar not only to the correct item but also, to some degree, to the neighboring items, such that close neighbors have a higher chance of being confused with the correct item than do more distant neighbors.
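A toy sketch of the locality constraint: when a position marker cues retrieval, neighboring positions are co-activated in proportion to marker overlap. The exponential fall-off and its rate below are hypothetical choices for illustration, not the model's fitted overlap values.

```python
def marker_overlap(i, j, fall_off=0.5):
    """Similarity between position markers i and j, decreasing with
    ordinal distance (hypothetical exponential form)."""
    return fall_off ** abs(i - j)

# Cueing with the marker for position 3 in a five-item list: close
# neighbors are activated more than distant ones, so transpositions
# cluster around the correct position.
activations = {pos: marker_overlap(3, pos) for pos in range(1, 6)}
print(activations)
```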
Discussion: Serial-position effects and error types
To conclude, SOB-CS accurately predicted detailed patterns of behavior in simple and complex span: the serial-position curve, the proportions of item and order errors as a function of serial position, and the transposition gradients. The results of these simulations confirm a number of assumptions in SOB-CS. Complex span and simple span use the same basic mechanisms for remembering lists in serial order: Items are associated with position markers that overlap as a function of their ordinal distance. In complex span, as in simple span, position markers are advanced with every new list item, not with every event (i.e., not by distractors) or by the passage of time alone. If position markers changed with every distractor, or with the passage of time (as is assumed in temporal-distinctiveness models; G. D. A. Brown, Neath, & Chater, 2007), the position markers of neighboring items would be much more dissimilar in complex span than in simple span, and the transposition gradient of complex span would be flatter.
The main difference between the two paradigms is the added interference to item representations from the superposition of distractor information in complex span. Encoding of distractors distorts the associations of memory items with their positions, thereby impairing the reconstruction of the original item from the retrieved approximation vi'. Interference from distractors does not render the retrieved item representations more similar to each other; it only makes them less similar to their original representations. This is why the distractors increase item errors more than order errors. Because distractors are associated with the preceding item’s position, each list item suffers about the same degree of distractor interference in complex span. Therefore, the effect of distractors is largely additive with serial position. As we have seen in the simulations with TBRS*, the additive effect of distractors is not easily explained in the context of models that rely on decay and cumulative rehearsal or refreshing.
Similarity between items and distractors
Interference is commonly assumed to depend on similarity between the interfering materials. Therefore, every interference model should predict similarity effects in working memory correctly. Similarity between list items is known to have a detrimental effect on serial recall, and previous versions of SOB have accounted for these similarity effects in great detail (Farrell, 2006; Farrell & Lewandowsky, 2003). We therefore do not address interitem similarity within memory lists again, but instead focus on a similarity relation that is pertinent particularly to complex span—namely, the similarity between memory items and distractors—as our third set of benchmark results.
The first kind of similarity is proximity in feature space. When an item and a distractor come from the same broad category (e.g., both are words), they share the same feature space, which means that they can be meaningfully compared on the same feature dimensions. Their similarity can be evaluated as the proportions of features that they share (e.g., shared semantic features or shared rhyme), which is reflected in their proximity in feature space (see objects A and B in Fig. 11). This is the kind of similarity relation usually manipulated between memory items in simple-span paradigms.
The second kind is categorical similarity: Research on the relation between memory and distractor materials often manipulates whether they come from the same broad category. For instance, when words are used as the memory items, the distractors in the similar conditions would also be words (or sentences), and the distractors in the dissimilar condition could be digits (or equations). In this case, the items and distractors might still have a large degree of feature overlap (e.g., words and digits share many phonemes), but they would nevertheless be clearly distinct by their category membership (e.g., objects B and C in Fig. 11). This can be expressed by a category boundary in feature space.
A third kind of relation, which can also be thought of as similarity in a broad sense, is the degree of feature-space overlap. When items and distractors are from different representational domains (e.g., verbal vs. visuospatial), they cannot be meaningfully compared within the same feature space, because their features are values on different dimensions (e.g., objects A and D in Fig. 11). For instance, the question of how many features a phonological and a spatial representation share does not even arise, because spatial representations do not include phonemes, and phonological representations do not include spatial features. In what follows, we will investigate all three kinds of similarity through simulations in SOB-CS.
Proximity in feature space between items and distractors
When items and distractors come from the same stimulus category (e.g., words), similarity between items and distractors can be manipulated in the same way as similarity within lists. We recently explored this kind of similarity manipulation through simulations with SOB-CS and in a series of experiments (Oberauer, Farrell, Jarrold, Pasiecznik, & Greaves, 2012). We discovered that, under certain conditions, SOB-CS predicts better memory with higher item–distractor similarity. This similarity benefit arises if and only if the distractors immediately follow the items that they are similar to.
The item–distractor similarity benefit is a counterintuitive prediction that arises from the conjunction of two assumptions that are unique to SOB-CS: First, representations of items, distractors, and their positions are distributed, and second, each distractor is encoded by associating it with the position of the preceding item. The distributed nature of representations in SOB-CS implies interference by superposition. Each item–position association, and likewise, each distractor–position association, creates a distributed pattern of changes to the same weight matrix. Thus, all associations are superimposed, and each individual association is distorted by all others present in the weight matrix. This interference by superposition is the main cause of forgetting in SOB-CS.
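Interference by superposition can be demonstrated with a minimal Hebbian two-layer sketch (toy two- and three-dimensional vectors, not the model's actual representations): associations are outer products added into one weight matrix, and cueing with a position marker returns a blend of every association whose marker overlaps with the cue.

```python
def outer(p, v):
    """Hebbian association of position marker p with item vector v."""
    return [[pi * vj for vj in v] for pi in p]

def add(W1, W2):
    """Superimpose two association matrices in the same weights."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(W1, W2)]

def retrieve(W, p):
    """Cue the superimposed matrix with a position marker p."""
    return [sum(p[i] * W[i][j] for i in range(len(p)))
            for j in range(len(W[0]))]

p1, p2 = [1.0, 0.5], [0.5, 1.0]            # overlapping position markers
v1, v2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]  # orthogonal item vectors

W = add(outer(p1, v1), outer(p2, v2))  # both associations share one matrix
print(retrieve(W, p1))  # [1.25, 1.0, 0.0]: v1 distorted by a blend of v2
```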
We tested these predictions with four experiments manipulating the phonological similarity between nonword items and nonword distractors (Oberauer et al., 2012). Each memorandum was followed by two distractors to be read aloud. In the similar-following condition, each item was followed by two distractors similar to it. Using capital letters for memoranda and lowercase letters for distractors, such a sequence would be AaaBbbCccDdd, with each letter denoting a set of similar nonwords. In the similar-preceding condition, the items and distractors were still similar to each other, but the distractors were followed by the items similar to them: AbbBccCddDaa. In the control condition, the items and distractors were dissimilar throughout: AeeBffCggDhh.
Categorical similarity between items and distractors
Categorical similarity is manipulated by drawing items and distractors either from the same category or from two different categories. For instance, Turner and Engle (1989) created four versions of complex span by combining memory lists of words or of digits with distractor tasks involving words (reading sentences) or involving digits (verifying equations). Memory was worse for the combinations in which the items and distractors came from the same category, a pattern that Conlin, Gathercole, and Adams (2005) replicated. These data show that whereas high feature-space proximity of items and distractors within a category is neutral or even helpful for memory, as discussed in the immediately preceding section (Oberauer et al., 2012), high categorical similarity between them is detrimental.
SOB-CS treats these two forms of item–distractor similarity differently. When distractors come from a different category than the items, and people represent them as such (i.e., the two sets are separated by a represented category boundary), we assume that people exclude the distractors from the set of recall candidates. Even if the representation retrieved at a given position were very blurry, people would not report a digit if they knew that all of the memoranda were words; the category boundary prevents interference by confusion. Therefore, in our simulations we included the distractors in the candidate set if and only if they came from the same stimulus category as the items (with digits, letters, and words constituting the three available categories for verbal materials, which are so clearly distinct that people arguably represent them as such). As a consequence, item–distractor combinations from the same stimulus category are disadvantaged by an increased chance of intrusion errors from the distractor set. Simulation 4 implemented the four combinations of Turner and Engle (1989). We simulated recall of six-item lists with four operations after each item at an intermediate level of cognitive load (0.5 s of operation duration, followed by 0.5 s of free time). The mean accuracy for recalling digits was .75 with digit distractors but increased to .80 with word distractors; conversely, the mean accuracy for recalling words was .52 with digit distractors and fell to .46 with word distractors. Thus, SOB-CS reproduces the finding that memory is worse when the distractors come from the same broad category as the items.
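The role of the candidate set can be illustrated with a small, hypothetical simulation (not Simulation 4 itself; the vector dimensionality, the blend weight on the distractor, and the noise level are arbitrary assumptions). A blurred retrieved trace is matched against all recall candidates, and admitting the distractors to the candidate set opens the door to intrusion errors:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150                                      # feature-space size (illustrative)

def unit(v):
    return v / np.linalg.norm(v)

def recall_accuracy(include_distractors, trials=500):
    """Recall a blurred memory trace by picking the closest recall candidate.
    Distractors enter the candidate set only if include_distractors is True,
    standing in for the same-category vs. different-category conditions."""
    correct = 0
    for _ in range(trials):
        items = [unit(rng.standard_normal(n)) for _ in range(4)]
        distractors = [unit(rng.standard_normal(n)) for _ in range(4)]
        # the retrieved trace is the target item blended with a distractor
        # and some noise (blend weights are arbitrary assumptions)
        trace = unit(items[0] + 0.9 * distractors[0]
                     + 0.6 * unit(rng.standard_normal(n)))
        candidates = items + (distractors if include_distractors else [])
        best = max(candidates, key=lambda c: trace @ c)
        correct += best is items[0]
    return correct / trials

print(recall_accuracy(False))    # distractors excluded (different category)
print(recall_accuracy(True))     # distractors included (same category)
```

In this toy setup, excluding the distractors (as when they come from a different category) leaves recall nearly perfect, whereas including them produces a noticeable intrusion cost, mirroring the same-category disadvantage.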
Feature-space overlap between items and distractors
One longstanding question in working memory research has been whether working memory is a unitary system or should be conceived of as fractionated into separate, domain-specific subsystems. Much research along these lines has been guided by Baddeley’s (1986) tripartite model of working memory that proposes different subsystems for verbal (in particular, phonological) and visuospatial maintenance. Numerous studies have been conducted in search of double dissociations between verbal (including numerical) and visuospatial working memory with dual-task combinations that cross the content domain of the primary task (verbal vs. visuospatial) with that of the secondary task (see Jarrold, Tam, Baddeley, & Harvey, 2011).
We focus here only on those studies that have investigated complex-span performance with verbal and with visuospatial memory items, combining each with either verbal or visuospatial distractor tasks. The results have been mixed. Some experiments have found that distractor processing in the same domain impaired memory, whereas distractor processing in the other domain had no effect on memory at all (Hale, Myerson, Rhee, Weiss, & Abrams, 1996; Myerson, Hale, Rhee, & Jenkins, 1999), or had a substantially reduced effect (Chein, Moore, & Conway, 2011; Shah & Miyake, 1996). Other researchers have found only partial dissociations, such that verbal memory was impaired more by verbal than by visuospatial processing, but visuospatial memory was impaired approximately equally by processing in both domains (Bayliss et al., 2003; Vergauwe et al., 2010).
Different representational domains can be characterized as separate feature spaces (illustrated by the rectangular and diamond-shaped spaces in Fig. 11), which are implemented in distributed neural networks such as SOB-CS as nonoverlapping sets of units in the item layer. When items and distractors share no feature dimensions (i.e., are represented in nonoverlapping feature spaces), they are dissimilar in a different way than when they merely share no features (i.e., have low proximity within a feature space). Two phonologically very dissimilar words might share no features but still be located in the same feature space: They are represented as different patterns across the same set of units in the item layer, such that they distort each other when superimposed. In contrast, a phonological representation of a word and a visuospatial representation of an orientation cannot even be compared on any shared feature dimension; they are represented as patterns across nonoverlapping sets of units, and therefore do not interfere with each other.
Thus, if the representations involved in the processing activity of a complex-span task are from a domain entirely different from that of the memory items, SOB-CS predicts no interference between them. The problem with evaluating this strong prediction against the existing data is that we cannot be confident that a nominally visuospatial processing task involves only visual or spatial representations, and that a nominally verbal task involves only verbal representations, for at least four reasons.
First, some of the distractor tasks used in complex span require processing of both verbal and visuospatial information. For instance, the verbal processing task of Bayliss et al. (2003) involved searching a visual display for an object whose color matched that of an object named verbally. Even an easy search task such as theirs involves moving attention in space and processing the objects’ colors, thus generating spatial and visual representations. Second, the presentation and response modalities of nominally verbal processing tasks often involve visual and spatial features. For instance, word reading involves processing of the visual word form; sentence reading in addition involves eye movements. Distractor tasks often require manual responses to keys distinguished by their spatial locations (e.g., Vergauwe et al., 2010). Both eye movements and limb movements to spatial targets are known to disrupt spatial working memory (Lawrence, Myerson, Oonk, & Abrams, 2001). Third, it is usually not known to what degree people maintain in working memory a verbal representation of the task instruction for a nominally visuospatial processing task. For instance, the processing component of complex-span tasks sometimes involves choice tasks with arbitrary stimulus–response mappings (e.g., Vergauwe et al., 2010), and participants might use verbal self-instruction to remind themselves of which response key belongs to which stimulus category. Verbal self-instruction has also been shown to assist in task switching (Emerson & Miyake, 2003; Kray, Eber, & Karbach, 2008), and the complex-span paradigm requires frequent switches between encoding of memory items and working on the processing component. Finally, representations of memory items are often not domain-pure. Visual and spatial stimuli are often encoded in verbal format (e.g., by describing the position of a dot in a matrix as “middle-left”). 
Verbal items (in particular, words) are often represented semantically, and the meanings of many words are suffused with spatial aspects, both literally and metaphorically (Bar-Anan, Liberman, Trope, & Algom, 2007; Lakoff & Johnson, 1980). This impurity of stimulus representations can also apply to the stimuli involved in the processing task.
Accordingly, there are numerous reasons to believe that a nominally verbal processing task does not involve purely verbal representations, and that a nominally visuospatial task does not involve purely visual or spatial representations. Therefore, the strong prediction of SOB-CS that processing tasks from a different representational domain should not interfere at all with memory is very difficult to test in practice. Realistically, for most experiments we can only make the weaker prediction that processing tasks should interfere more with memory items in the same domain than with memory items in a different domain. On balance, the extant evidence summarized above is consistent with that prediction: There is some cross-domain interference, but it is weaker than within-domain interference (Jarrold et al., 2011).
Simulation 5 served to investigate the interference between memory items and distractors from different domains under the assumption of different proportions of shared feature dimensions. The simulation used a design similar to that of Simulation 1, testing memory span for the same 15 levels of cognitive load but holding the number of operations constant at four. The memory items were letters; for simplicity and consistency with the preceding simulations, we assumed purely verbal representations for the letters. The distractors were modeled as primarily visuospatial representations, with their degree of impurity (i.e., overlap with the verbal section of the item layer) varied over six levels: 0, 5, 10, 20, 30, and 50 percent of the 150 units of the verbal section were recruited for the distractor representations.
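The logic of Simulation 5 can be sketched in simplified form. This is a hedged toy version, not the real simulation: the verbal layer size matches the text, but the single-distractor design, the distortion measure, and the trial counts are simplifications of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)
n_verbal, n_spatial, n_pos = 150, 150, 30    # verbal size as in Simulation 5

def unit(v):
    return v / np.linalg.norm(v)

def mean_verbal_distortion(overlap, trials=200):
    """Mean distortion, within the verbal feature space, of a purely verbal
    item when one mostly visuospatial distractor recruits `overlap` of the
    verbal units."""
    total = 0.0
    for _ in range(trials):
        pos_item = unit(rng.standard_normal(n_pos))
        pos_dist = unit(rng.standard_normal(n_pos))
        item = unit(np.concatenate([rng.standard_normal(n_verbal),
                                    np.zeros(n_spatial)]))
        dist = unit(np.concatenate([np.zeros(n_verbal - overlap),
                                    rng.standard_normal(n_spatial + overlap)]))
        W = np.outer(item, pos_item) + np.outer(dist, pos_dist)
        retrieved = W @ pos_item             # cue with the item's position
        # error within the verbal feature space, where recall candidates live
        total += np.linalg.norm(retrieved[:n_verbal] - item[:n_verbal])
    return total / trials

for overlap in (0, 15, 75):                  # 0%, 10%, 50% of the verbal units
    print(overlap, mean_verbal_distortion(overlap))
```

With zero overlap the distractor leaves the verbal part of the retrieved pattern untouched; as more verbal units are recruited, distortion within the verbal feature space grows, which is the model's account of graded cross-domain interference.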
We conclude that SOB-CS can explain the occasional finding of cross-domain interference between memory and processing by assuming some degree of task impurity. Specifically, SOB-CS can explain the results of Vergauwe et al. (2010), who demonstrated cross-domain interference that increases linearly with cognitive load; this is shown in the declining span-over-load curves in Fig. 15. At the same time, the model can also reproduce the double dissociation of verbal and visuospatial working memory: With less-than-perfect overlap of feature dimensions, cross-domain interference is smaller than within-domain interference, as can be seen by comparing Fig. 15 to Fig. 5. With no overlap, there is no cross-domain interference at all. Thus, SOB-CS can explain the main experimental evidence for the distinction of domain-specific subsystems in working memory. SOB-CS does not require such domain-specific subsystems; it requires only the straightforward assumption that entities in different domains are represented in different feature spaces, such that their representations use different sets of units.
Discussion: Variety of item–distractor similarity
To summarize, SOB-CS accounts for the effects of three kinds of similarity (or dissimilarity) between items and distractors (see Fig. 11). The most radical form of dissimilarity is a change of content domain. When distractors come from a different content domain from that of the items, their representations use only partially overlapping sets of units, and therefore interference between them will be reduced and, in extreme cases of no overlap, eliminated. In this way, SOB-CS explains the frequently observed double dissociation between verbal and visuospatial working memory tests without assuming separate subsystems.
A second form of dissimilarity, combining items and distractors from different categories within a content domain, also reduces the amount of interference, because it facilitates exclusion of distractors from the set of recall candidates.
Whereas the first two kinds of similarity increase interference, SOB-CS predicts that the third kind, proximity between items and distractors within the same feature space, reduces interference under some conditions. Our experiments (Oberauer et al., 2012) have confirmed this counterintuitive prediction, lending strong support to the assumptions about item–distractor interference in SOB-CS.
To understand the effects of the three kinds of similarity on memory, it is important to consider how they modulate the two kinds of interference in SOB-CS: interference from superposition and interference by confusion. Interference from superposition determines how much the retrieved vector vi' is distorted relative to the vector vi representing the originally encoded stimulus. The mutual distortion of two representations is greater the more their feature spaces overlap. Within the feature space they have in common, however, higher similarity (i.e., higher proximity) implies less distortion.
The second form of interference occurs through confusion of the correct item with another recall candidate. This occurs when the retrieved vector vi' is compared to all recall candidates. The chance of interference by confusion depends on which elements are included in the candidate set; this is why excluding distractors from a different category than the memoranda improves complex-span performance. The probability of confusion also increases with the proximity among the candidates in feature space. Thus, proximity in feature space has two opposing effects: It reduces the degree of interference from superposition, and it increases the chance of interference by confusion. Both effects were shown in experiments in which we varied the phonological similarity of items and distractors (Oberauer et al., 2012): Higher similarity improved memory overall, but also led to a specific increase of intrusions from distractors replacing the items that they were similar to.
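These two opposing effects of feature-space proximity can be demonstrated in a toy simulation (hypothetical parameter values; the blend and noise weights are arbitrary): higher item–distractor similarity makes the retrieved trace match the item better (less distortion from superposition), yet makes the distractor more confusable with the item at recall:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 150

def unit(v):
    return v / np.linalg.norm(v)

def proximity_effects(s, trials=1000):
    """For item-distractor cosine similarity s, return (a) the mean match of
    the retrieved trace to the target item and (b) the distractor-intrusion
    rate at recall. Blend and noise weights are arbitrary assumptions."""
    match_sum, intrusions = 0.0, 0
    for _ in range(trials):
        items = [unit(rng.standard_normal(n)) for _ in range(4)]
        # construct a distractor with cosine exactly s to the first item
        ortho = rng.standard_normal(n)
        ortho = unit(ortho - (ortho @ items[0]) * items[0])
        distractor = s * items[0] + np.sqrt(1 - s ** 2) * ortho
        trace = unit(items[0] + 0.8 * distractor
                     + 0.8 * unit(rng.standard_normal(n)))
        match_sum += trace @ items[0]
        best = max(items + [distractor], key=lambda c: trace @ c)
        intrusions += best is distractor
    return match_sum / trials, intrusions / trials

low_match, low_intr = proximity_effects(0.0)
high_match, high_intr = proximity_effects(0.8)
print(low_match, high_match)    # higher similarity: less distortion
print(low_intr, high_intr)      # higher similarity: more intrusions
```

The same tension appears in the experiments cited above: overall memory benefits from proximity, but intrusions of distractors replacing similar items increase.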
Much of the appeal of the complex-span paradigm comes from its impressive success as a tool for assessing working memory capacity as an individual-differences variable, both within an age group (Conway et al., 2005; Engle, Tuholski, Laughlin, & Conway, 1999; Kane, Bleckley, Conway, & Engle, 2001) and across age groups at both ends of the life span (Bayliss, Jarrold, Baddeley, Gunn, & Leigh, 2005; Gathercole, Pickering, Ambridge, & Wearing, 2004; J. McCabe & Hartman, 2003). Our final simulation thus addressed individual differences in simple- and complex-span performance.
Findings from correlational studies with span tasks can be grouped into two sets: those concerning correlations between different kinds of simple and complex span, and those concerning correlations of span tasks with external criteria, such as measures of intelligence or academic achievement, or with experimental tasks measuring various cognitive constructs. The latter set, though undoubtedly theoretically highly relevant, is currently outside the scope of our modeling, because modeling these relationships would require modeling not only the span task but also the external criteria (e.g., performance on intelligence tests). Therefore, we will focus here on the first group of findings.
One noteworthy feature of the four-factor structure of Kane et al. (2004) is that the two complex-span factors are more closely correlated (.83) than are the two simple-span factors (.63). This pattern has also been observed in a large correlational study of working memory in children (Alloway, Gathercole, & Pickering, 2006). This finding can be interpreted as reflecting a domain-general source of variance that affects complex span more strongly than simple span.
In computational models such as SOB-CS, individual differences in task performance arise naturally from individual differences in parameter values. In Simulation 6, we introduced variance across the simulated subjects in the c parameter (which determines the discriminability between retrieval candidates) and the r parameter (removal rate). We chose differences in c as a source of variance shared between simple and complex span but specific to each content domain, because it is plausible that the discriminability of representations in the set of recall candidates is domain specific: Individuals might have highly distinct verbal representations but less distinct visuospatial representations, or the other way around. Therefore, we assumed two uncorrelated c parameters, one for verbal and one for spatial span tasks.
We chose differences in the removal rate r as a source of variance that plays a larger role in complex than in simple span, and thereby accounts for the distinction between complex-span and simple-span measures. Recall that in complex span, removal affects all distractors as well as list items after their recall, whereas in simple span, removal is limited to postrecall response suppression.
Simulation 6 reproduced the design of Kane et al. (2004), crossing content domain (verbal–numerical vs. visuospatial) with span type (simple vs. complex), representing each design cell with three independent tasks. We created a normal distribution of parameter values across subjects (N = 2,000), adding Gaussian noise with a mean of zero and a standard deviation of 0.15 to the c parameter, and adding Gaussian noise with a mean of zero and a standard deviation of 0.5 to the r parameter. Two uncorrelated distributions of c were created in that way, one for the six verbal span tasks and one for the six visuospatial span tasks. A single distribution of r applied to all tasks. The means of c and r, as well as the values of all other parameters, were the same as in Simulation 1.
The complex-span tasks of Kane et al. (2004) differed in the processing tasks that they involved, and little information is available about the operation durations and free time in these tasks. For simplicity, we used the same time values for all six complex-span tasks, assuming intermediate values (i.e., 0.5 s operation duration followed by 0.5 s free time). Each complex-span task involved four operations following each item.
The three span tasks in each design cell (e.g., the three verbal complex spans) were simulated as three independent replications with the same parameter values for each subject; they differed only in the stimulus sets, which were generated anew for each task, and in the random noise introduced by output interference. Tasks from different content domains additionally differed in their individual values of the discriminability parameter c. As in Kane et al. (2004), each subject completed three trials of each memory-set size for each task; whereas in the original study the range of set sizes was calibrated to each task's difficulty, in the simulation we ran all set sizes from one to nine for all tasks. Performance was scored, as in Kane et al. (2004), by calculating the proportions of items recalled in correct position, averaged across all trials of each task.
The bottom panel of Fig. 16 shows the results of fitting a four-factor structural-equation model to the simulated data. The model gave an excellent fit for the data, χ2(48) = 62.1, CFI = .998, RMSEA = .012, SRMR = .014. Simulation 6 reproduced the key results of Kane et al. (2004): Spans from different content domains loaded on separate but substantially correlated factors. Within each domain, complex-span factors were separate from, but highly correlated with, those for simple span. This correlation was driven by the shared variance of c. The cross-domain correlation was larger for complex than for simple span. Because the c parameters for the verbal and spatial tasks were uncorrelated, the positive correlation across domains could only come from variations in r. The removal parameter r had a larger effect on cross-domain correlations in complex span than on those in simple span, because in simple span it only governed the effectiveness of response suppression, whereas in complex span it also governed the effectiveness of distractor removal.
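The correlational logic at work here can be conveyed with a simplified generative sketch. This is not the SOB-CS simulation: span scores are replaced by a linear toy function of the parameters, and all coefficients other than the two parameter SDs taken from the text are arbitrary assumptions. It shows only how two uncorrelated domain-specific c parameters plus one shared r parameter, weighted more heavily for complex span, yield a larger cross-domain correlation for complex than for simple span:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 2000                                   # simulated subjects, as in Simulation 6

# Two uncorrelated domain-specific discriminability parameters and one shared
# removal-rate parameter (SDs of 0.15 and 0.5 follow the text).
c_verbal  = 1.0 + 0.15 * rng.standard_normal(N)
c_spatial = 1.0 + 0.15 * rng.standard_normal(N)
r         = 1.0 + 0.50 * rng.standard_normal(N)

def span_score(c, r_weight, noise=0.1):
    """Toy span score: a linear stand-in for model performance, with the
    removal rate r weighted more heavily for complex span."""
    return c + r_weight * r + noise * rng.standard_normal(N)

simple_verbal,  simple_spatial  = span_score(c_verbal, 0.1), span_score(c_spatial, 0.1)
complex_verbal, complex_spatial = span_score(c_verbal, 0.5), span_score(c_spatial, 0.5)

r_simple  = np.corrcoef(simple_verbal,  simple_spatial)[0, 1]
r_complex = np.corrcoef(complex_verbal, complex_spatial)[0, 1]
print(r_simple, r_complex)   # cross-domain correlation: complex > simple
```

Because the c parameters are uncorrelated across domains, all shared variance between domains flows through r, and r carries more weight in the complex-span scores; hence the higher cross-domain correlation for complex span.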
Simulation 6 demonstrated that SOB-CS can reproduce benchmark findings from individual-differences studies concerning the factorial structure of span tasks. The simulation showed that variation in two model parameters—the discriminability of representations in the recall candidate set, c, and the rate of removal, r—was sufficient to generate the benchmark pattern of correlations. We do not claim that variation in this particular pair of parameters is uniquely necessary to explain the data.
Evidence for a role of the distractor removal parameter r in explaining individual differences in complex span comes from a study by Carretti, Cornoldi, De Beni, and Palladino (2004). They used a version of complex span in which, on each trial, participants listened to several short word lists, remembering the last word of each list. Whenever participants heard an animal word, they had to tap on the table. At recall of the list-final words, people made more intrusions of animal than of nonanimal distractors (replicating a previous finding by De Beni, Palladino, Pazzaglia, & Cornoldi, 1998). Animal-distractor intrusions were specifically increased in people with low working memory capacity. Low-capacity participants also showed a larger priming effect for the complex-span distractors in a subsequent lexical decision task, and a larger latency advantage for accepting animal distractors in a recognition test, when these tests were carried out right after a complex-span trial. These findings show that individual differences in how strongly distractors remain in working memory at the end of a trial are related to individual differences in capacity, as would be expected if individual differences in the efficiency of removing distractors (parameter r) were in part responsible for variation in measures of working memory capacity.
Working memory is one of the core constructs of cognitive psychology. So far, theorizing in the field has primarily involved the verbal description of components and mechanisms. Our goal for this article was to apply the conceptual rigor of computational modeling of serial-recall tasks to complex span, one of the major paradigms for studying working memory. This computational approach is embodied in our interference model of working memory, SOB-CS. We now discuss the key assumptions of the new model, its limitations, its relations to other theories, and the most important theoretical conclusions from our work.
New assumptions in SOB-CS: Distractor encoding and removal
Our new model, SOB-CS, introduces two assumptions that go beyond existing theories and models of working memory.
The first new assumption is that distractors create interference by being encoded into working memory. In particular, distractors are associated with the position of the immediately preceding item, using the same mechanisms as item encoding. This assumption is motivated by a host of findings showing that memory encoding is an obligatory byproduct of processing (Craik & Lockhart, 1972; Hyde & Jenkins, 1969; Logan, 1988). In the field of immediate recall, Phaf and Wolters (1993) provided direct evidence that distractor words spoken aloud are incidentally encoded into memory to the degree that they attract attention (see also Aldridge, Garcia, & Mena, 1987). Therefore, we assumed that the strength of distractor encoding is a function of how long attention is devoted to processing a distractor, as well as of its novelty. Evidence that distractors are associated with the immediately preceding item's position marker comes from our finding that, when distractors intrude into recall, they are more likely to replace the immediately preceding item than another list item (Oberauer et al., 2012).
The second new assumption in SOB-CS is particularly novel and unique. During the free-time interval following encoding of a distractor, that distractor’s association with the currently focused position is gradually removed from memory. This assumption is supported by three lines of reasoning. First, distractor removal is a natural extension of the mechanism of response suppression in SOB, a process for which there is strong evidence. For example, people are very unlikely to commit repetition errors, even if lists contain repeated items (Duncan & Lewandowsky, 2005; Henson, 1998a; Jahnke, 1969). Our assumption in SOB-CS simply generalizes the rationale and the already-existing mechanism of response suppression: Representations that are no longer relevant are removed by Hebbian antilearning. Thus, similar to the way in which a just-recalled item is suppressed because it has become irrelevant, the just-processed distractor is removed by an identical process because it, too, has become irrelevant.
Second, removal of no-longer-relevant contents is a necessary mechanism for a functioning working memory system that does not rely on decay. Hasher, Zacks, and their colleagues have argued that removal (in their terms, "deletion") of irrelevant working memory contents is one of the inhibitory functions that become deficient in old age, with the consequence that working memory becomes cluttered and inefficient (Hasher & Zacks, 1988; Hasher et al., 1999). Direct evidence for the removal of irrelevant subsets of memory items comes from experiments in which people encoded two sets of digits or words and were then informed which of them was (temporarily or permanently) irrelevant for the upcoming task. The effect of the number of items in the irrelevant set on latencies for accessing elements from the remaining set diminished gradually over time, disappearing 1–2 s after the cue (Oberauer, 2001, 2002, 2005b). The time course of this vanishing irrelevant set-size effect guided our decision to set the removal-rate parameter r to a value at which removal was nearly complete after 2 s. A neuroimaging study with the same paradigm showed that the neural activity associated with the irrelevant set rapidly declines to baseline shortly after the cue and reemerges when the same set is cued as relevant later, directly demonstrating flexible control of the contents of working memory (Lewis-Peacock, Drysdale, Oberauer, & Postle, 2011).
Third, independent evidence has emerged for the notion that selective removal is an active process that not only takes time but also competes with other processes. Fawcett and Taylor (2008) combined an item-wise directed-forgetting paradigm with detection of a visual probe as a secondary task. The visual probe appeared at variable intervals after the cue that indicated for each item whether it should be remembered or forgotten. Response times to the visual probe were delayed more after a forget cue than after a remember cue, demonstrating that forgetting is an active process that delays responding to an attention-demanding secondary task. This effect was obtained only at delays of less than 2 s after the cue, in agreement with our estimate that removing a representation from memory is completed after about 2 s. Wylie, Foxe, and Taylor (2007) added further evidence that instructed forgetting of a just-encoded item is an active process that recruits brain regions not involved in active remembering or in unintended forgetting. These findings show that removal of irrelevant information is distinct from selective maintenance of relevant information, and they suggest that removal requires an attentional bottleneck.
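The removal mechanism itself can be sketched as repeated Hebbian antilearning. This is a toy version: the rate, the time step, and the single-association setup are illustrative assumptions, and the actual SOB-CS update may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(5)
n_item, n_pos = 150, 30

def unit(v):
    return v / np.linalg.norm(v)

pos = unit(rng.standard_normal(n_pos))
distractor = unit(rng.standard_normal(n_item))
W = np.outer(distractor, pos)              # the distractor-position association

r_rate, dt = 2.0, 0.1                      # removal rate and time step (assumed)
strengths = []
for _ in range(20):                        # 20 steps = 2 s of free time
    retrieved = W @ pos                    # what the position cue still retrieves
    W -= r_rate * dt * np.outer(retrieved, pos)   # Hebbian antilearning
    strengths.append(np.linalg.norm(W @ pos))

print(strengths[0], strengths[-1])         # strength decays toward zero
```

Because each step subtracts a fraction of what the position cue currently retrieves, the association strength decays approximately exponentially; with the assumed rate it is nearly gone after 2 s of free time, consistent with the time course described above.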
Distractor removal in SOB-CS plays the beneficial role that other theories of working memory assign to rehearsal or refreshing. For instance, in the TBRS theory, free time following a distractor operation is used to refresh memory items, and refreshing is a crucial component of how the TBRS theory explains the cognitive-load effect. In contrast, SOB-CS does not invoke rehearsal or refreshing to explain any of the benchmark findings.
The present authors differ in the extent to which they believe that rehearsal or refreshing plays a role in maintenance in working memory. So far, we have not implemented these processes in SOB-CS, for two reasons. The first and most obvious reason is that a restoration process is not needed to explain the benchmark findings modeled here. We have demonstrated that the cognitive-load effect, one important piece of evidence cited in support of refreshing, can be explained without appeal to that mechanism. The second reason is that rehearsal or refreshing is not easily integrated with the other mechanisms of SOB-CS. Rehearsal or refreshing in its simplest form would mean that items are retrieved, using a position cue, and then reencoded by associating them again with the same position cue. Such a mechanism would be fairly ineffectual, because encoding an item for a second time in the same position is dampened by novelty gating. The expected increase in memory strength would therefore be minor, at best. This small expected gain stands against a substantial risk: If the wrong item were retrieved in a given position, it would be encoded in that position, and because it would be novel in that position, it would be encoded fairly strongly, thus creating substantial interference. In sum, in the context of SOB-CS, there is little to gain and much to lose from such a mechanism of rehearsal or refreshing.8
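The novelty-gating argument can be made concrete with a toy sketch. The gating rule used here, which scales encoding strength by the mismatch between the candidate and what the position cue already retrieves, is a simplified stand-in for the energy-based novelty gating of SOB-CS:

```python
import numpy as np

rng = np.random.default_rng(6)
n_item, n_pos = 150, 30

def unit(v):
    return v / np.linalg.norm(v)

pos = unit(rng.standard_normal(n_pos))
item = unit(rng.standard_normal(n_item))
wrong = unit(rng.standard_normal(n_item))
W = np.outer(item, pos)                    # the item is already encoded here

def encoding_gain(candidate, W):
    """Novelty-gated Hebbian encoding: the weight change scales with the
    mismatch between the candidate and what the position cue retrieves."""
    retrieved = W @ pos
    novelty = np.linalg.norm(candidate - retrieved)   # 0 if already stored
    dW = novelty * np.outer(candidate, pos)
    return np.linalg.norm(dW)

print(encoding_gain(item, W))              # reencoding the stored item: ~0
print(encoding_gain(wrong, W))             # encoding a wrong item: large
```

Reencoding the item already stored at a position yields essentially no weight change, whereas encoding a wrong item at that position produces a large one; this is why simple retrieve-and-reencode rehearsal has little to gain and much to lose in the model.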
That said, we must underscore that we do not rule out a role for rehearsal in working memory. It is abundantly clear that people do rehearse; a substantial proportion of people, when asked about their strategies on complex-span tasks, report some form of rehearsal. About one third of participants report repeating the memory items to themselves as their main strategy, whereas another third report using no strategy except for reading the memoranda. The final third of participants report more elaborate strategies, such as generating visual images for the to-be-remembered words or trying to combine the words into sentences (Bailey, Dunlosky, & Kane, 2008; Dunlosky & Kane, 2007). Thus, rehearsal as a behavioral phenomenon is well established, and we do not question its occurrence. Nonetheless, our simulations show that rehearsal is not necessary as a causal explanatory construct to account for complex-span performance. New findings, or existing data not yet addressed by SOB-CS, may eventually require the addition of such a mechanism (e.g., Jarrold, Tam, Baddeley, & Harvey, 2010).
In this context, it is illuminating that the performance of those individuals who report rehearsal by repetition is hardly better than the performance of people who report merely reading the items as they are presented, lending support to our contention that rote rehearsal is not needed to explain memory performance in complex span. Unlike rote rehearsal, more elaborate strategies are associated with better performance (Dunlosky & Kane, 2007; Kaakinen & Hyönä, 2007). This pattern of results meshes well with the earlier analysis that mere retrieval and reencoding of items is bound to be fairly ineffective in SOB-CS. Exploring the possibility that elaborative rehearsal might prove more successful in the model is a task for the future.
In contrast to rehearsal, removal of distractors has not figured prominently in self-reported strategies. We do not regard this as problematic for our model. One trivial explanation could be that no researcher ever considered the need for active removal. However, we believe that there is more to this conspicuous absence. Removal is unlikely to be the subject of self-reports because people report cognitive processes to the extent that they pay attention to them and remember them. Removing a no-longer-relevant representation implies that it fades from the focus of attention and vanishes from working memory. As distractors are removed, increasingly clean and distinct representations of the memory items emerge, and these have an increasing chance of being remembered later, not only when it comes to recalling the items, but also when it comes to reconstructing a memory of one's own strategy. When asked what they did during a complex-span trial, participants will remember that, after processing a distractor, their attention eventually switched away from that distractor and toward one or more of the items. The experience of this transition, which we argue is facilitated by removal of the distractors, can plausibly be described by participants as "refreshing the items" or "rehearsal." The more efficient a person is at removing distractors from memory, the more rapidly and clearly the memory of the items will emerge from the fog of interference, and the more opportunity the person will then have to engage in further elaborative processes, such as visualizing the meanings of words or creating sentences from the words. Thus, faster distractor removal might be the common cause of better memory and of more elaborate processing of the items.
SOB-CS is an attempt to formulate a precise and empirically adequate model of one particularly popular and fruitful experimental paradigm of working memory research. Modeling complex span is clearly a necessary part of what it means to model working memory. At the same time, we recognize that it is only a small part of the theoretical and empirical landscape.
Computational models that spell out assumptions about representations and processes in as much detail as SOB are often limited to a single experimental paradigm, such as immediate serial recall. With the development of SOB-CS, we are generalizing the model, extending it from simple span to complex span. Nonetheless, to become a complete model of working memory, SOB-CS will have to be extended further to account for behavior on other prototypical working memory tasks, such as the Brown–Peterson paradigm and variations thereof (J. Brown, 1958; Jarrold et al., 2010; Peterson & Peterson, 1959) and memory-updating tasks (Ecker, Lewandowsky, Oberauer, & Chee, 2010; Pollack, Johnson, & Knaff, 1959; Yntema & Mueser, 1962). The model also needs to be extended to other response formats beyond serial recall and reconstruction, such as free recall (Bhatarah, Ward, & Tan, 2008; Farrell, 2012), probed recall (Tehan & Humphreys, 1995), and recognition (McElree, 2001; Oberauer, 2005a). We believe that the model is well suited for at least some of these extensions. For instance, updating of working memory requires efficient, targeted removal of representations that must be replaced (cf. Kessler & Meiran, 2008); SOB-CS already includes such a mechanism. Recognition requires a quick assessment of the familiarity of a probe; the computation of energy in SOB-CS offers a potential mechanism.
SOB-CS is also limited in that it does not make explicit how working memory relates to long-term memory. We are not committed to a strong distinction between working memory and long-term memory as separate systems, so we use these terms pragmatically as referring to memory phenomena over short time spans (on the order of seconds) and longer time spans; so far, SOB-CS has addressed only the former. The role of long-term memory is particularly pertinent to modeling the complex-span task. The complex-span paradigm is strikingly similar to the continuous-distractor paradigm (Bjork & Whitten, 1974) that has been commonly interpreted as reflecting recall entirely from long-term memory. Meanwhile, a substantial body of evidence from experimental (D. McCabe, 2008) and correlational (Unsworth, 2010; Unsworth, Brewer, & Spillers, 2009) studies, as well as from neuroscience (Chein et al., 2011), confirms that processes and performance on complex-span tasks are related to long-term memory (Unsworth & Engle, 2007).
The relation of working memory to long-term memory most likely goes in both directions: On the one hand, knowledge in long-term memory contributes to recall in working memory tasks. This is already acknowledged by all models that assume redintegration of distorted memory traces (e.g., Nairne, 1990; Schweickert, 1993), because redintegration requires intact long-term memory representations of recall candidates. Our simulation of individual differences in span tasks assumes that these individual differences arise in part from variation in the c parameter, which can be interpreted as reflecting the discriminability of representations in long-term memory. On the other hand, processes on working memory tasks generate memory traces that long outlast the individual trial (D. McCabe, 2008), exerting effects across trials that are sometimes beneficial, as in the so-called Hebb effect (Hebb, 1961), and sometimes harmful, as in proactive interference (Bunting, 2006).
So far, SOB-CS models only individual trials. After every trial, the weight matrix is reset to a state that reflects previous learning events in a very generic fashion (i.e., simply adding random noise to all weights with standard deviation No). Therefore, SOB-CS cannot yet account for interference or facilitation across trials. An obvious first step to account for effects beyond single trials would be to assume that the weight matrix is not reset after each trial, but instead squashed (i.e., multiplied by a value between 0 and 1). Squashing would be a mechanism for removing no-longer-relevant information in a wholesale manner, which is different from the targeted removal of individual representations. Incomplete squashing would leave traces of previous trials, giving rise to proactive interference. We anticipate that more sophisticated mechanisms will be needed to account for other aspects of the link between long-term and working memory.
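The squashing proposal amounts to a one-line operation on the weight matrix. A minimal sketch, with arbitrary illustrative weights and an arbitrary squashing factor (neither are parameter estimates from the model):

```python
# Hypothetical sketch of inter-trial "squashing": instead of resetting the
# weight matrix after a trial, every weight is multiplied by a factor
# between 0 and 1, leaving an attenuated trace of the previous trial.

def squash(weights, factor=0.5):
    """Scale all weights toward zero; factor=0 is a full reset, factor=1 keeps everything."""
    return [[w * factor for w in row] for row in weights]

# Weights left over from a completed trial (illustrative values).
W = [[0.8, -0.4],
     [0.2,  0.6]]

W_next = squash(W, factor=0.5)
# The residual weights in W_next can now act as a source of proactive
# interference on the next trial; incomplete squashing (factor > 0) is
# what leaves those traces behind.
```

Unlike the targeted removal of individual representations, squashing is indiscriminate: every binding in the matrix is attenuated by the same factor.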
One set of mechanisms has been proposed by one of us to explain the relationship between working memory and episodic memory (Farrell, 2012). Like SOB-CS, this model assumes that items are associated with a representation of temporal context, but that some portion of the context is used to bind together temporally adjacent items into episodic clusters. Successive lists in a working memory experiment would be partially separated by temporal context, which contributes to reducing interference between them. In simulating the effects of distractor activity in free recall (analogous to the complex-span tasks simulated here), Farrell assumed that distractors are clustered together with the item that they immediately follow, such that they are associated with the same cluster-level context as the preceding item. This parallels the assumptions made in SOB-CS and opens some avenues of integration across the two models. Other neural-network models of serial recall are making progress in explaining the effects of long-term learning on immediate recall (Botvinick & Plaut, 2006; Burgess & Hitch, 2006; Page & Norris, 2009), and we see this as an encouraging development from which we hope to learn for a future extension of our model.
Relation to other theories of working memory
In this section, we compare SOB-CS to other theories of short-term and working memory, beginning with a brief review of other computational models, followed by an attempt to relate our model to some of the most influential verbal theories of working memory.
As already noted, SOB-CS is closely related to other formal models of serial recall because it originated in that tradition. As a consequence, SOB-CS retains the achievements of previous versions of SOB in accounting for a multitude of phenomena in simple span (Lewandowsky & Farrell, 2008b), thus constituting the first computational model of working memory that generalizes across two paradigms, serial recall and complex span.
Several other computational models have addressed working memory, but they are concerned with paradigms that are beyond the current scope of SOB-CS (Ashby, Ell, Valentin, & Casale, 2005; O’Reilly & Frank, 2005; Oberauer & Kliegl, 2006). We are aware of only two other models that address complex span, and that are therefore direct competitors with SOB-CS: the ACT-R-based model of Daily et al. (2001) and our computational implementation of TBRS (Oberauer & Lewandowsky, 2011).
Daily et al. (2001) explained capacity limits in working memory through two factors: decay, and a limited resource for activating representations, which constrains the degree to which, during rehearsal and retrieval, the correct item can be activated more strongly than competing items. The model accounts reasonably well for some of the benchmark phenomena known at the time: the decline of accuracy with memory-list length and the serial-position curve. The model also gives an account of individual differences in serial-position curves for different list lengths by varying a single parameter—namely, the amount of available resources. Daily et al. published their model before the benchmark findings related to cognitive load emerged, but their model has the potential to account for these effects in a way similar to the TBRS* model, because the model includes decay and rehearsal, and it includes a processing bottleneck so that the model can rehearse only when it is not engaged in processing a distractor.
We regard TBRS*, our computational implementation of TBRS, as the strongest competitor to SOB-CS for explaining experimental results with complex-span tasks, and we therefore focused initially on addressing evidence that has been cited as being uniquely supportive of TBRS (viz., the cognitive-load function). We have shown that SOB-CS successfully handles those data—and in two regards does so even more successfully than TBRS*. First, SOB-CS gives a better account of the cognitive-load effect when it is broken down into the effects of operation duration and free time (Fig. 7). Second, SOB-CS correctly predicts the joint effects of number of distractor operations, similarity between successive distractors, and serial position, whereas TBRS* could not reproduce the pattern of results even with favorable ad-hoc assumptions (see Fig. 8).
We acknowledge that the comparison between SOB-CS and TBRS* can be regarded as unfair, because TBRS* was implemented by ourselves rather than by the authors of the TBRS theory. Thus, despite our best efforts to make TBRS* as strong as possible, it is conceivable that a better way exists of implementing the TBRS theory as a computational model. The mere possibility that this is the case does not render irrelevant the challenges arising for the TBRS theory from the present results. If the TBRS theory is to explain the benchmark data of complex span, then there must be at least one computational implementation of the theory that can coherently explain them. In other words, among the many different ways in which all of the details left out by the verbal theory can be filled in, there should be at least one that works. It is now incumbent on proponents of the theory to show that there is an implementation that fixes the problems noted above and at the same time retains the success of TBRS* in accounting for a broad range of other findings (Oberauer & Lewandowsky, 2011).
Verbal theories of working memory
Many readers will ask: Where do I find in SOB-CS the familiar concepts of contemporary theories of working memory? The best-known theories of working memory today are only verbally formulated. They are often broader in scope than computational models, but lack the precision of computational models. Here we relate SOB-CS to some of the better-known verbal theories of working memory.
The working memory theory of Baddeley (1986, 2000)
Probably the most popular theory of working memory is the one introduced by Baddeley and Hitch (1974) and developed further by Baddeley (1986, 2000). It consists of four interacting components: a central executive, an episodic buffer, and two slave systems for domain-specific maintenance, namely the phonological loop (for verbal information) and the visuospatial sketch pad (for visual object information and spatial location). The episodic buffer serves as a device for maintaining integrated representations that cut across domain boundaries. At first glance, it might be tempting to relate the memory mechanism of SOB-CS—that is, the two-layer architecture, the principles of Hebbian association, and the redintegration mechanism—to the two slave systems in Baddeley’s model. However, in SOB-CS the distinction of two domain-specific subsystems is unnecessary, because the double dissociation of verbal and visuospatial contents emerges from the model through the lack of superposition interference between disjoint representational domains. For the same reason, SOB-CS does not need a separate memory system for entities integrating verbal and nonverbal features, such as the episodic buffer; verbal and nonverbal features can be bound together simply by associating them with the same context representation. SOB-CS decidedly differs from Baddeley’s model in that it assumes no time-based decay of phonological (or other) memory traces. We regard this as a strength of our model, because there is no convincing evidence for decay in verbal short-term and working memory (Lewandowsky, Oberauer, & Brown, 2009).
Nothing in SOB-CS corresponds to the central executive in Baddeley’s (1986, 2000) model. Clearly, a complete model of working memory will have to spell out explicitly the executive processes that control its contents. We have only begun to do so by formalizing the basic processes of encoding and removal of representations from memory. Eventually, models of the memory component of working memory such as SOB-CS will have to be combined with computational models of the executive processes working on the memory contents (e.g., Botvinick, Braver, Barch, Carter, & Cohen, 2001; Chatham et al., 2011; Verguts & Notebaert, 2008).
The embedded-process theory of Cowan (1995, 2005)
Cowan (1995, 2005) conceptualized working memory as consisting of two embedded components, the activated part of long-term memory and the focus of attention. Activated long-term memory has no capacity limit, but its contents are prone to forgetting due to decay and interference. The focus has a limited capacity of approximately four chunks that it protects from decay and interference.
It is not obvious how to map SOB-CS onto the main constructs in Cowan’s theory. The focus of attention in SOB-CS is limited to a single content–context conjunction at any time, akin to the notion of a one-chunk focus of attention in other, related theories (McElree, 2006; Oberauer, 2002), not to a focus encompassing up to four chunks. The two-layer architecture and its connecting-weight matrix, which serves to maintain memory of several items, do not fit the notion of Cowan’s focus, either: In contrast to Cowan’s focus, the weight-based memory of SOB-CS is not limited to a discrete number of chunks, it is not immune to interference, and it bears no conceptual relation to attention. If anything, the weight matrix of SOB-CS could be considered as fleshing out the contribution of ancillary mechanisms serving maintenance functions over and above the focus of attention in Cowan’s theory. Thus, there is currently no counterpart for Cowan’s focus of attention in SOB-CS. This might be a weakness of our model, insofar as there is evidence for the characteristic features of Cowan’s focus—a fixed capacity limit of four chunks and immunity to interference for those chunks—that cannot be explained within SOB-CS. The main evidence for a focus with these characteristics comes from recognition paradigms that are, so far, outside the scope of SOB-CS (Cowan, Johnson, & Saults, 2005; Rouder et al., 2008; Saults & Cowan, 2007); this observation underscores the need to extend our model to these paradigms.
One of us (Oberauer, 2002, 2009) has proposed a framework similar to Cowan’s in which the region of direct access roughly corresponds to Cowan’s focus of attention. In contrast to Cowan’s focus, the region of direct access is not assumed to have a fixed capacity limit in terms of a “magical number” of chunks, and it is not assumed to be immune to interference. Rather, the capacity limit of the direct-access region is attributed to interference between different item–context bindings that are maintained simultaneously. In keeping with this idea, interference between item–context associations plays the main role in explaining the limitations on retrieval in SOB-CS. Thus, the associative memory mechanism of SOB-CS can be tentatively interpreted as a model of the direct-access region, and the current contents of the item and position layers can be regarded as the contents of the one-chunk focus of attention in the theory of Oberauer (2002, 2009).
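The claim that the capacity limit arises from interference between simultaneously maintained item–context bindings can be illustrated with a toy Hebbian network. The vectors below are arbitrary illustrative patterns, not the distributed representations actually used in SOB-CS:

```python
# Toy illustration of superposition interference: two items are bound to
# two overlapping position markers by Hebbian outer products, the bindings
# are superimposed in one weight matrix, and retrieval with a position cue
# returns a blend of both items.

def outer(a, b):
    return [[ai * bj for bj in b] for ai in a]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def retrieve(W, cue):
    # activation of each item unit given a position cue
    return [sum(w * c for w, c in zip(row, cue)) for row in W]

item1 = [1.0, 0.0, 1.0, 0.0]   # toy item patterns
item2 = [0.0, 1.0, 0.0, 1.0]
pos1  = [1.0, 0.5]             # overlapping position markers
pos2  = [0.5, 1.0]

W = mat_add(outer(item1, pos1), outer(item2, pos2))

out = retrieve(W, pos1)
# 'out' resembles item1 but is contaminated by item2, because the markers
# overlap and both bindings share one weight matrix.
```

With orthogonal position markers (e.g., [1, 0] and [0, 1]), retrieval with pos1 would return item1 exactly; the overlap between markers, combined with superposition in a single matrix, is what produces the blend.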
The theory of executive attention of Engle and Kane
Engle, Kane, and their colleagues developed a theory of working memory addressing primarily individual differences (Engle et al., 1999; Kane et al., 2007; Kane & Engle, 2002). They described performance on the complex-span task as reflecting contributions from domain-specific storage systems, plus a general resource for controlled, or executive, attention. They define executive attention as the ability to maintain goal-relevant representations in the face of distraction. In the context of SOB-CS, removal of no-longer-relevant representations from memory is the mechanism for minimizing interference from potentially distracting representations. Therefore, individual differences in the removal parameter could provide an explanation for the associations between complex-span and executive-attention measures without any memory component, such as the Stroop effect (Kane & Engle, 2003) and the antisaccade task (Unsworth et al., 2004). At this point of model development, distractor removal is the only control mechanism explicitly modeled, but other control mechanisms, such as the degree to which irrelevant information can be prevented from entering working memory, are also likely to contribute to individual differences in working memory performance (Awh & Vogel, 2008; Hasher et al., 1999; Jost et al., 2010). We envision that a parameter for the degree of filtering at encoding will be added in a later extension of the model.
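Targeted removal can be thought of as Hebbian learning run in reverse: subtracting a distractor’s binding to its position from the weight matrix. The following is a toy sketch with made-up two-unit vectors, not the distributed representations or removal dynamics of SOB-CS itself:

```python
# Minimal sketch of targeted removal: a distractor bound to a position is
# unlearned by subtracting (part of) its outer-product binding from the
# weight matrix, leaving any co-stored bindings untouched.

def update(W, item, pos, rate):
    # rate > 0 encodes a binding; rate < 0 removes it
    return [[w + rate * i * p for w, p in zip(row, pos)]
            for row, i in zip(W, item)]

distractor = [1.0, 0.0]
pos        = [0.0, 1.0]

W = [[0.0, 0.0], [0.0, 0.0]]
W = update(W, distractor, pos, rate=1.0)    # involuntary encoding
W = update(W, distractor, pos, rate=-1.0)   # complete removal in free time
# With only partial removal (e.g., rate=-0.6), a residue of the distractor
# binding remains in W and continues to interfere.
```

On this reading, individual differences in the removal parameter translate into how much of each distractor binding is subtracted per unit of free time.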
The executive-attention view could be interpreted in the context of SOB-CS as the claim that shared variance between complex span and measures of fluid intelligence comes primarily from variance in the efficiency of the executive-control parameters. An alternative view has been advanced by Colom, Rebollo, Abad, and Shih (2006), who argued that variance in the “storage” component of complex (as well as simple) span is responsible for the strong correlation with fluid intelligence. In the context of SOB-CS, this view would imply that fluid intelligence is mostly related to the memory parameters (e.g., the distinctiveness parameter c). A future analysis of individual differences in complex span in terms of the parameters of SOB-CS might be instrumental in moving beyond the dichotomy of “storage” versus “executive attention” and provide insights into which mechanisms and processes are responsible for the differences between individuals with high and with low complex span.
Neuroscientific theories of working memory
Theories about the neuronal substrate of working memory can be divided into two classes. The majority of theories assume that retention in working memory relies on persistent neural firing. An alternative view is that the contents of working memory are maintained by rapid changes of synaptic weights (e.g., Mongillo, Barak, & Tsodyks, 2008). In SOB-CS, memory is entirely based on the connection weights, and as such our model is most compatible with synaptic-change-based theories of working memory. Whereas there is compelling evidence for load-dependent neural activity during working memory retention (Curtis & D’Esposito, 2003; Vogel et al., 2005), recent evidence has suggested that this neural activity might not directly code the contents of working memory. Rather than remaining active during the entire retention interval, the patterns of neural activity correlated with working memory contents are reactivated when needed for processing (Barak, Tsodyks, & Romo, 2010; Lewis-Peacock et al., 2011). These findings support weight-based models such as SOB-CS, which (approximately) reproduce item representations as activation patterns from the weight matrix when given a cue related to that item.
At the same time, mapping mechanisms in SOB-CS to neuronal processes is far from straightforward. For instance, there is no obvious counterpart in SOB-CS for load-dependent neural activity during the retention interval. One possibility arises from the model of short-term synaptic potentiation of Mongillo et al. (2008), according to which rapid weight changes require recurrent nonspecific neural activity to be upheld over time. As more items are encoded, the weights in the weight matrix deviate further from zero, and as a consequence, more neural activity might be needed to uphold the weight matrix. Obviously, this is a very speculative attempt to reconcile our model with neuroscientific evidence. An important step toward tightening the link between weight-based models of working memory and neuronal mechanisms would be to investigate the neural substrates of rapid synaptic weight changes.
Conclusions
SOB-CS provides new answers to a number of pressing questions in working memory research.
First, with SOB-CS we offer a computational model of complex span that explains forgetting entirely through interference. Thus, we provide a simple answer to one of the key questions of working memory research: Why is working memory capacity limited? This explanation is more parsimonious than any alternative, because all theories of working memory acknowledge the existence of interference, and those that explain the capacity limit by other processes, such as decay or limited resources, must assume these processes in addition to interference. We have shown that a purely interference-based model can explain benchmark data, such as the cognitive-load effect, that constitute the empirical foundation for the currently most viable decay-based theory (Barrouillet et al., 2007; Barrouillet et al., 2011).
Our computational model provides a clear, unambiguous formulation of the mechanisms of interference in working memory, which serves as a starting point for a detailed empirical investigation of these processes. We distinguish two kinds of interference, one arising from superposition, the other arising from confusion. We show that different kinds of similarity modulate these two kinds of interference in different ways. The model generates new and in part counterintuitive predictions about the effect of item–distractor similarity, which were experimentally confirmed (Oberauer et al., 2012).
A second question that has long been debated in the working memory literature concerns the relation between memory and processing. SOB-CS accounts for the full range of benchmark findings relating to this issue. The key pair of assumptions that we make is that all information attended to during a concurrent processing task is encoded into memory, and thereby potentially interferes with other memories, and that interference can be reduced by gradual removal of irrelevant information. These assumptions lead to the correct predictions that the cognitive-load effect is primarily an effect of the free time between distractor operations (Simulation 2) and that the number of distractors has an effect on memory if and only if the distractors differ from each other (Simulation 3).
A new theoretical discovery emerging from our modeling efforts is the strong link between interference and removal. Any model that explains the limited capacity of working memory without appealing to decay must assume some form of removal of no-longer-relevant information. Without removal, the available capacity would soon be cluttered with irrelevant information. SOB-CS is the first model with an explicit, well-defined mechanism of removal.
Our modeling results also pose a challenge to theories assuming that memory representations are actively maintained by rehearsal or refreshing. As discussed above, we do not deny a potential role for these processes. However, our modeling results have shown that benchmark findings that so far have been interpreted as strong evidence for rehearsal or refreshing (in particular, the cognitive-load effect) can be explained without those processes. This finding raises the question of which phenomena demand the assumption of rehearsal or refreshing. Other modeling work (Oberauer & Lewandowsky, 2008, 2011) has shown that, even in the context of decay-based models, rehearsal is effective only in a narrow set of circumstances. Taken together, these demonstrations imply that rehearsal and refreshing are not the only conceivable processes of active restoration; removal of irrelevant information should be taken seriously as a contender. We have developed an explicit computational model of how removal could work; we hope that this encourages proponents of rehearsal or refreshing to specify with equal precision what happens when people rehearse or refresh.
Another longstanding question is whether working memory is a unitary system or should be conceptualized as consisting of several domain-specific subsystems. Proponents of both views can point to considerable evidence in their favor. Within our distributed connectionist modeling framework, we offer a principled explanation that accommodates the evidence cited in favor of both sides of the debate. Working memory is a unitary system that operates with representations from different content domains. Different content domains are characterized by different feature dimensions, which are represented by different sets of units in the content layer. To the degree that the contents of tasks carried out concurrently use representations from different domains, these tasks do not interfere, because the representations do not overlap in the neural network. In practice, however, hardly any task is content-pure, and therefore even nominally “visuospatial” tasks involve some verbal features, and nominally “verbal” tasks involve some visuospatial features. This explains why tasks used to represent different content domains nevertheless interfere with each other to some degree.
Evidence for both domain-general and domain-specific aspects of working memory also comes from correlational studies. Various measures of working memory capacity share a large proportion of their variance, thus pointing to a nonnegligible source of variance reflecting general working memory capacity. At the same time, verbal and visuospatial span tasks load on separate, though correlated, factors. We have shown that the patterns of correlations between various simple- and complex-span tasks can be explained with SOB-CS by assuming individual differences in two parameters, one domain-specific parameter that governs the distinctiveness of memory representations in simple- and complex-span tasks, and one general parameter affecting the removal of irrelevant information, which is particularly important in tasks combining memory with processing demands. Again, a unitary system operating on domain-specific representations explains the full range of results.
In conclusion, we have proposed the first purely interference-based computational model for a key paradigm in research on working memory. The model explicitly describes the basic mechanisms of working memory: encoding of items in their relative positions, retrieval of individual items by positional cueing, interference from concurrent processing, and the control of interference by removal of no-longer-relevant information. The model successfully accounts for a number of benchmark findings from complex-span tasks and makes successful new predictions. We hope that our work will encourage other researchers to make competing theoretical ideas equally explicit, so that the debate about the mechanisms of working memory can be advanced to a greater level of precision.
The program code is also available at our web pages: www.cogsciwa.com; www.psychologie.uzh.ch/fachrichtungen/allgpsy/Team/Oberauer_en.html; http://seis.bris.ac.uk/~pssaf/publications.html.
The similarity between neighboring positions was labeled tc in previous publications; here we rename it to sp to avoid confusion with the time parameters.
The actual similarity between items and distractors is lower (e.g., about 0.1 in the case of letters and digits) because items are derived from the item prototype and distractors are derived from the distractor prototype.
Jolicœur and Dell’Acqua (1998) fitted a model to the data of three of their experiments. In that model one parameter, τ1i, is the mean of an exponentially distributed time variable that represents the sum of the times for sensory and perceptual processing and for short-term consolidation of a single item. These means were estimated to be 0.304, 0.150, and 0.275 s for the three modeled experiments (4, 6, and 7), respectively. The rate of the exponential function is the inverse of the mean: that is, 3.3, 6.7, and 3.6, respectively, for the three experiments. The rate is higher for Experiment 6 than for the other experiments, and Jolicœur and Dell’Acqua argued that this could be because in Experiment 6, a single letter was to be encoded into memory on every trial, whereas the other experiments involved a mixture of trials with one and with three letters. In this regard, Experiment 6 might be the most representative one for the encoding of single letters in a serial-recall task.
For instance, span at a cognitive-load level .33 is worse than expected from a strictly linear function, and span at the following load level (.37) is better than expected. Between the first and the second of these data points, operation duration is more than doubled (from 0.3 to 0.7 s), and free time is doubled (from 0.6 to 1.2 s). The increase of operation duration, however, has only a negligible effect, whereas the increase of free time has a more substantial effect. Therefore, span at the higher load level exceeds that at the lower load level in this particular comparison, and in the others that generate nonmonotonicities in Fig. 5.
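The load values in this comparison follow from computing cognitive load as the proportion of time occupied by processing, that is, operation duration divided by the total time per operation (operation duration plus free time). A quick check of the numbers, in seconds:

```python
# Cognitive load = operation duration / (operation duration + free time).

def cognitive_load(operation, free):
    return operation / (operation + free)

low  = cognitive_load(0.3, 0.6)   # the .33 load level
high = cognitive_load(0.7, 1.2)   # the .37 load level
# Although 'high' is the larger load, it comes with twice the free time
# (1.2 s vs. 0.6 s), which in SOB-CS allows more distractor removal and
# hence the locally better span at the higher load level.
```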
We investigated other refreshing schedules in the context of TBRS* (Oberauer & Lewandowsky, 2011), such as selecting items for refreshing at random or refreshing only the last-presented item. These resulted in a serial-position curve with strong recency and little if any primacy, because early list items have the longest retention intervals, and therefore suffer most from decay. We believe that we investigated all simple and straightforward refreshing schedules, although other, more elaborate schedules might be compatible with the largely additive effects of serial positions and distractors, and at the same time still generate the correct shape of the serial-position curve. As the space of possible refreshing schedules is vast, it is impossible to explore them all. The onus is now on decay theorists to propose a schedule that brings their theory in line with the data.
Unsworth and Engle (2006), reanalyzing data from Kane et al. (2004), reported serial-position effects on correct recall, intrusions, and omissions that differed in some regards from published data with simple span tasks and from our data from complex-span tasks. Unsworth and Engle (2006) found that the probability of omissions was largest at the first and last output positions and that the probability of intrusions was largest at the second output position, and from that point decreased monotonically with output position. We believe that this unusual pattern was a consequence of the lack of control over output order: Participants were asked to write the memoranda into slots on an answer sheet; they could fill the slots in any order. It is likely that participants often began recall by putting the last list item in the last slot (Lewandowsky, Brown, & Thomas, 2009), thereby avoiding intrusion errors at that position. Because of the apparently uncontrolled output order in the study of Unsworth and Engle (2006), we do not compare our simulations to their detailed analysis of errors. This critique does not invalidate the analyses of Kane et al. (2004), which relied on scores aggregated across serial positions.
There are reasons to believe that this problem is not specific to SOB-CS. Existing computational models of working memory, including those that assume a central role for rehearsal, do not model it in detail (Page & Norris, 1998). In a previous modeling study, we found that even in the context of a decay-based model, rehearsal can be detrimental rather than beneficial (Oberauer & Lewandowsky, 2008).
K.O., S.F., and C.J. were supported by Grant RES-062-23-1199 from the Economic and Social Research Council (ESRC). S.L. was supported by a Discovery Grant from the Australian Research Council (ARC) and an Australian Professorial Fellowship. The help of Charles Hanich with collecting the data is greatly appreciated.
- Anderson, J. A. (1991). Why, having so many neurons, do we have so few thoughts? In W. E. Hockley & S. Lewandowsky (Eds.), Relating theory and data: Essays on human memory in honor of Bennet B. Murdock (pp. 477–507). Hillsdale, NJ: Erlbaum.
- Anderson, J. A. (1995). An introduction to neural networks. Cambridge, MA: MIT Press.
- Baddeley, A. (1986). Working memory. Oxford, U.K.: Clarendon Press.
- Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 8, pp. 47–89). New York, NY: Academic Press.
- Carretti, B., Cornoldi, C., De Beni, R., & Palladino, P. (2004). What happens to information to be suppressed in working memory tasks? Short- and long-term effects. Quarterly Journal of Experimental Psychology, 57A, 1059–1084.
- Chatham, C. H., Herd, S. A., Brant, A. M., Hazy, T. E., Miyake, A., O’Reilly, R., & Friedman, N. P. (2011). From an executive network to executive control: A computational model of the n-back task. Journal of Cognitive Neuroscience, 23, 3598–3619. doi:10.1162/jocn_a_00047
- Conlin, J. A., Gathercole, S. E., & Adams, J. W. (2005). Stimulus similarity decrements in children’s working memory span. Quarterly Journal of Experimental Psychology, 58A, 1434–1446.
- Conway, A. R. A., Jarrold, C., Kane, M. J., Miyake, A., & Towse, J. N. (Eds.). (2007). Variation in working memory. New York, NY: Oxford University Press.
- Cowan, N. (1995). Attention and memory: An integrated framework. New York, NY: Oxford University Press.
- Dale, H. C. A., & Gregory, M. (1966). Evidence of semantic coding in short-term memory. Psychonomic Science, 5, 153–154.
- De Beni, R., Palladino, P., Pazzaglia, P., & Cornoldi, C. (1998). Increases in intrusion errors and working memory deficit of poor comprehenders. Quarterly Journal of Experimental Psychology, 51A, 305–320.
- Hasher, L., Zacks, R. T., & May, C. P. (1999). Inhibitory control, circadian arousal, and age. In D. Gopher & A. Koriat (Eds.), Attention and performance XVII: Cognitive regulation of performance. Interaction of research and theory (pp. 653–675). Cambridge, MA: MIT Press.
- Hebb, D. O. (1961). Distinctive features of learning in the higher animal. In J. F. Delafresnaye (Ed.), Brain mechanisms and learning (pp. 37–46). Oxford, U.K.: Blackwell.
- Henson, R. N. A. (1996). Short-term memory for serial order. Doctoral dissertation, University of Cambridge. Retrieved from www.mrc-cbu.cam.ac.uk/~rh01/thesis.html
- Henson, R. N. A., Norris, D. G., Page, M. P. A., & Baddeley, A. D. (1996). Unchained memory: Error patterns rule out chaining models of immediate serial recall. Quarterly Journal of Experimental Psychology, 49A, 80–115.
- Jost, K., Bryck, R. L., Vogel, E. K., & Mayr, U. (2010). Are old adults just like low working memory young adults? Filtering efficiency and age differences in visual working memory. Cerebral Cortex, 21, 1147–1154.
- Kane, M. J., Conway, A. R. A., Hambrick, D. Z., & Engle, R. W. (2007). Variation in working memory capacity as variation in executive attention and control. In A. R. A. Conway, C. Jarrold, M. J. Kane, A. Miyake, & J. N. Towse (Eds.), Variation in working memory (pp. 21–48). New York, NY: Oxford University Press.
- Kane, M. J., Hambrick, D. Z., Tuholski, S. W., Wilhelm, O., Payne, T. W., & Engle, R. W. (2004). The generality of working memory capacity: A latent-variable approach to verbal and visuospatial memory span and reasoning. Journal of Experimental Psychology: General, 133, 189–217.
- Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago, IL: University of Chicago Press.
- Lewandowsky, S., & Farrell, S. (2008b). Short-term memory: New data and a model. In B. H. Ross (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 49, pp. 1–48). San Diego, CA: Elsevier Academic Press.
- Lewandowsky, S., & Farrell, S. (2011). Computational modeling in cognition: Principles and practice. Thousand Oaks, CA: Sage.
- McElree, B. (2006). Accessing recent events. In B. H. Ross (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 46, pp. 155–200). San Diego, CA: Academic Press.
- Oberauer, K., Farrell, S., Jarrold, C., Pasiecznik, K., & Greaves, M. (2012). Interference between maintenance and processing in working memory: The effect of item–distractor similarity in complex span. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38, 665–685. doi:10.1037/a0026337
- Unsworth, N., Schrock, J. C., & Engle, R. W. (2004). Working memory capacity and the antisaccade task: Individual differences in voluntary saccade control. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1302–1321.