Introduction

Working memory actively holds information in mind, making it accessible and manipulable in support of ongoing cognitive tasks (Baddeley, 1986). Working memory is critical to nearly all domains of cognition, constraining fluid intelligence (Fukuda, Vogel, Mayr, & Awh, 2010; Kane, Bleckley, Conway, & Engle, 2001) and relating to important predictors of life outcomes, such as reading comprehension and academic success (Alloway & Alloway, 2010; Daneman & Carpenter, 1980). Both the architecture and capacity of working memory have been studied extensively (Baddeley, 2000). In the broader working memory literature, explanations for its meager capacity include limited information buffers (Cowan, 2001; Oberauer, 2002), time-based decay (Baddeley & Scott, 1971; Broadbent, 1958; Portrat, Barrouillet, & Camos, 2008), and interference (Lustig, May, & Hasher, 2001; Oberauer, Lewandowsky, Farrell, Jarrold, & Greaves, 2012). In the domain of visual working memory, the ability to actively hold information in mind is generally thought to require a finite mental commodity or buffer that is shared by memory representations. This commodity is typically viewed either as “resources” that are continuously divisible and flexibly allocated to objects or features (e.g., Alvarez & Cavanagh, 2004; Bays & Husain, 2008; Fougnie, Asplund, & Marois, 2010; Wilken & Ma, 2004) or as fixed “slots” that are constrained to represent a discrete number of objects or chunks (e.g., Anderson, Vogel, & Awh, 2011; Awh, Barton, & Vogel, 2007; Cowan, 2001; Luck & Vogel, 1997; Miller, 1956; Zhang & Luck, 2008).

The slots versus resources debate, in brief

At their core, slot models propose that only a handful of representations can be stored in working memory. This view was supported by early research (Luck & Vogel, 1997; Vogel, Woodman, & Luck, 2001). In contrast, the core idea of resource models is that memory stores a limited amount of information, in unspecified units. Thus, the basic resource model predicts a trade-off between the amount of information that must be stored per item and the number of items that can be stored, consistent with the findings of other research (Alvarez & Cavanagh, 2004; Bays & Husain, 2008; Wilken & Ma, 2004). The discovery of a trade-off between quantity and quality would seem to invalidate slot models (Bays & Husain, 2008; Wilken & Ma, 2004). However, slot models have since evolved. To account for the quantity–quality trade-off, they now allow multiple slots to store copies of the same object (e.g., the slots + averaging model of Zhang & Luck, 2008) or are hybrid, having both a slot limit and a resource limit (Awh et al., 2007; Xu & Chun, 2006). Although these hybrid models are all nominally slot models, they are quite different from the “pure” slot models of earlier days (e.g., Cowan, 2001).

Given that the class of slot models has become so diverse, we argue here that focusing on the distinction between slots and resources leads to conflating or ignoring eight underlying issues: (1) the existence of an upper bound on the number of items about which information can be represented, (2) the quantization of the commodity used for storing memories, (3) the trade-off between the number of items stored and the amount of information stored about each item, (4) the extent to which this relationship is stable versus stochastic or variable, (5) the flexibility with which the memory commodity can be assigned to different items or reassigned to new items, (6) the format of the memory representation, (7) how working memories are formed through encoding, and (8) how we use our memory representations to make responses.

In our view, a dichotomous framing muddies the space of possible theories and, in doing so, leads to the confounding of what are distinct questions about the nature of working memory. In particular, the slots versus resources framing encourages researchers to treat theories as holistic, where in reality they consist of distinct commitments to these eight core theoretical issues, and possibly to other issues as well. We note that papers rejecting a class of models (e.g., slot models) on the basis of some particular piece of evidence tend to confound ideas that are ultimately separable. For example, they confound whether memory is discrete with whether it represents objects holistically, or whether it is variable in precision with whether there is an upper limit on the number of items that can be stored. The current paradigm—accepting and rejecting particular models as wholes, rather than considering each of their component commitments—hinders progress toward building a comprehensive theory of visual working memory. Thus, in this article, we focus on laying out a core set of questions about working memory, emphasizing the ways that the slots versus resources debate has led to these questions being either confounded or ignored.

7 ± 2 core questions

Question 1. Is there an upper bound on the number of items about which information can be stored?

Is there a fixed upper bound on the number of objects that can be stored in memory? Is it impossible for observers to store information about more than a handful of objects (Alvarez & Cavanagh, 2004; Cowan, 2001; Zhang & Luck, 2008), such that no information is retained about objects beyond the limit? Or are observers, instead, capable of storing information about a large number of objects with no specific upper bound, even if they choose to focus only on the few that are behaviorally relevant (Bays & Husain, 2008; Wilken & Ma, 2004)?

The amount of information stored in working memory is often described by estimating how many objects must have been remembered to achieve a given level of performance (e.g., Cowan’s K; Cowan, 2001). This number is usually found to be quite small, three to four objects’ worth, even when many objects are presented. However, these estimates generally rely on the assumption that observers either remember an object entirely or fail to remember it at all (a so-called “high-threshold” model; see, e.g., Rouder et al., 2008). In general, the same level of performance can also be achieved by remembering a small bit of information about each of many objects (Bays & Husain, 2008; van den Berg, Shin, George, & Ma, 2012). Because many models thus predict the same level of performance, they also predict the same values of K. Therefore, finding a maximum value of K does not in and of itself provide evidence of an upper bound on the number of items that can be stored.
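The dependence of K on the high-threshold assumption can be made concrete with a minimal Python sketch. It uses the commonly cited single-probe change detection formula K = N × (H − FA), where N is the set size, H the hit rate, and FA the false alarm rate. The function names and the guess_rate parameter are illustrative assumptions, not taken from any of the cited papers.

```python
def high_threshold_rates(set_size, items_stored, guess_rate):
    """Hit and false alarm rates under an all-or-none (high-threshold) model.

    Assumption: the probed item is in memory with probability
    items_stored / set_size; otherwise the observer guesses "change"
    with probability guess_rate.
    """
    p_in_memory = min(items_stored, set_size) / set_size
    hit_rate = p_in_memory + (1 - p_in_memory) * guess_rate
    false_alarm_rate = (1 - p_in_memory) * guess_rate
    return hit_rate, false_alarm_rate


def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Estimated number of stored items, K = N * (H - FA)."""
    return set_size * (hit_rate - false_alarm_rate)


# Simulate an observer who stores 3 of 8 items, then recover K from the rates.
hit, fa = high_threshold_rates(set_size=8, items_stored=3, guess_rate=0.5)
k_estimate = cowan_k(8, hit, fa)
```

Because cowan_k sees only the hit and false alarm rates, any model that produces the same two rates, including one that stores a little information about every item, yields the same K.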

Consequently, the question of whether there is an upper bound on working memory storage remains highly debated, and new approaches have been developed that shed light on the issue. For example, there is some evidence that observers can remember low-resolution information about many objects when doing so is advantageous (Bays & Husain, 2008), but these data may be contaminated by a test-display confound that allowed observers to respond correctly even when they did not remember the items (Thiele, Pratte, & Rouder, 2011). On the other hand, evidence for an upper bound has also been questioned. For example, initial findings showed that presenting more than three to four objects results in an increase in guessing but does not decrease the quality of memory for items that are stored, which was taken as evidence for an upper bound of three to four objects that can be remembered (Zhang & Luck, 2008). However, others have proposed that increasing the number of presented items increases swaps (i.e., reporting the wrong item from memory; Bays, Catalao, & Husain, 2009) and that putative guesses might actually represent low-fidelity memories (van den Berg et al., 2012). Thus, the extent to which there is an upper bound on the number of items that can be remembered remains unclear.

Question 2. What is the quantization of the mental commodity used for memory storage?

Is the mental commodity used for memory storage discretely divided or continuously divisible? Traditional slot and resource models disagree about the nature of the underlying memory-supporting commodity and whether it is quantized (Fig. 1). The central issue is whether the commodity is discrete or can be divided arbitrarily. Some models propose that the commodity is divided into a small number of equal-sized quanta, with the particular number possibly varying across people and trials (Zhang & Luck, 2008; fixed-resolution, slots + averaging), while others argue that the commodity can be continuously divided (resource models). Some of the models that assume a continuously divisible commodity also suggest an upper bound on how many representations can be maintained (Q1; Alvarez & Cavanagh, 2004; Awh et al., 2007; bounded, resource-limited), whereas others argue that the commodity can be continuously divided without any upper bound on the number of objects that can be represented (Bays & Husain, 2008; Wilken & Ma, 2004; unbounded, resource-limited).

Fig. 1. Distinguishing quantization from an upper limit on the number of stored objects. Each of N units of a mnemonic commodity is assigned to one of K stored objects, as indicated by the unit’s color. Memory can be more or less quantized independent of an upper limit on the number of items that can be stored.

Quantization is a fundamental concept with roots in the “magical numbers” of memory—the proposal that memory capacity is best described by the maximum number of psychologically meaningful units of information that can be remembered (e.g., 7 ± 2 chunks according to Miller, 1956, and four items according to Cowan, 2001). However, it is unclear whether the distinction between discrete and continuous division of the mental commodity is essential in the current debate between slots and resources. Discreteness is not sufficient to be a slot model, and continuity is not required to be a resource model. This is because slot models invariably go beyond discreteness by proposing a specific quantization: coarse, with only a handful of quanta, committing to both Q1 and Q2. In contrast, continuous resource models claim continuity but could make do with a less-than-continuous commodity—for example, one with finely chopped discreteness. Consider the possibility of a commodity divided into 100 quanta that can be apportioned at whim, but where people rarely try to remember more than three to four objects. Is this a slot model or a resource model? Although quantal, such a model shares much in common with continuous resource models and likely would not be accepted as a “slot model.” Thus, quantization per se may not be of direct theoretical importance to many existing models and is distinct from the commitment to an upper limit on the number of objects about which information can be stored (see Fig. 1).
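The independence of quantization and an upper bound can be illustrated with a short Python sketch in which the number of quanta and the item limit are separate parameters. The function allocate_quanta, its even-split rule, and the max_items parameter are hypothetical choices made for illustration only; no cited model is being reproduced.

```python
def allocate_quanta(n_quanta, n_items, max_items=None):
    """Split n_quanta as evenly as possible among the stored items.

    n_quanta sets the coarseness of quantization (Q2); max_items, if
    given, sets an upper bound on how many items are stored (Q1).
    Returns the number of quanta assigned to each presented item.
    """
    n_stored = n_items if max_items is None else min(n_items, max_items)
    n_stored = min(n_stored, n_quanta)  # an item needs at least one quantum
    if n_stored == 0:
        return [0] * n_items
    base, extra = divmod(n_quanta, n_stored)
    stored = [base + (1 if i < extra else 0) for i in range(n_stored)]
    return stored + [0] * (n_items - n_stored)


coarse = allocate_quanta(n_quanta=4, n_items=6)                      # [1, 1, 1, 1, 0, 0]
fine_unbounded = allocate_quanta(n_quanta=100, n_items=6)            # [17, 17, 17, 17, 16, 16]
fine_bounded = allocate_quanta(n_quanta=100, n_items=6, max_items=4) # [25, 25, 25, 25, 0, 0]
```

In this sketch, coarse quantization produces an item limit as a side effect, whereas fine quantization can be paired with or without an explicit limit, which is why the two commitments are worth stating separately.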

Question 3. What is the relationship between quantity (the number of items about which information is stored) and quality (the amount of information stored about each of them)?

Is there a trade-off between the number of items that are stored and how precisely each one is remembered? Researchers have made various assumptions about the answer to this question. For example, item-limit models in the style of Cowan (2001) assume a fixed upper bound on the number of possible representations but do not specify the relationship between their quantity and quality. In contrast, a recent variant of the slot model (Zhang & Luck, 2008) posits an upper bound in quantity and predicts a particular falloff in quality as quantity increases, with its form implied by the process of averaging independent samples. Other slot models (e.g., that of Awh et al., 2007) allow for greater flexibility in the allocation of the commodity to items and thus, while predicting a falloff in fidelity when a greater number of items are stored, do not make specific predictions about the falloff’s form or how allocation of the commodity determines fidelity. Continuous resource models are more flexible still, because they do not commit to a representational format for items stored in memory, requiring further assumptions to link quality to quantity. For example, the model of Bays and Husain (2008) assumes a power law relationship between the proportion of available resources that are assigned to an item and the precision of memory for it, which can then be translated to a measure of information. Similarly, the model of van den Berg et al. (2012) treats information as the commodity, assuming a power law relationship between the number of stored items and Fisher information. Both models rely on mathematical assumptions that are not directly derived from the format of observers’ representations. Consequently, support for these models, in the form of a power law relationship between the number of stored items and the precision of storage, does not provide evidence for any particular format of memory representations. 

Thus, models that commit to a specific, quantal organization for memory, like that of Zhang and Luck (2008), predict a particular relationship between the number of quanta assigned to an item and its fidelity; that is, the answers to Q1 and Q2 constrain the answer to Q3. However, models with a continuous resource component have considerably more leeway in specifying a function that relates the number of items stored to the fidelity with which they are stored. In general, any monotonic decrease in memory quality as quantity increases would be consistent with, but not specifically predicted by, a continuous resource model.
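The contrast between a quantity-quality function that is implied by the architecture and one that must be assumed can be sketched in Python. The first function follows from averaging independent samples, whose standard deviation shrinks with the square root of the number of samples; the second simply posits a power law with a free exponent. Both are simplified illustrations under stated assumptions (in particular, slots are treated as evenly divisible among items), and all names and parameters are hypothetical.

```python
import math


def slots_averaging_sd(set_size, n_slots, sd_one_slot):
    """Memory SD for a stored item when slots hold independent samples.

    Simplification: below the slot limit every item receives
    n_slots / set_size samples on average; at or above the limit each
    stored item receives exactly one sample.
    """
    slots_per_item = max(n_slots / set_size, 1.0)
    return sd_one_slot / math.sqrt(slots_per_item)


def probability_stored(set_size, n_slots):
    """Chance that a given item receives any slot at all."""
    return min(1.0, n_slots / set_size)


def power_law_precision(set_size, precision_one_item, exponent):
    """Precision under an assumed power law in set size (free exponent)."""
    return precision_one_item * set_size ** (-exponent)


for n in (1, 2, 4, 8):
    print(n,
          round(slots_averaging_sd(n, n_slots=4, sd_one_slot=20.0), 2),
          probability_stored(n, n_slots=4),
          round(power_law_precision(n, precision_one_item=2.0, exponent=1.0), 3))
```

In the first function the plateau in SD beyond four items falls out of the slot limit, whereas in the second the shape of the curve depends entirely on the chosen exponent.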

Models that do not specify the format of representations will, nonetheless, often predict some relationship between the proposed upper bound (Q1) and the shape of the quantity-fidelity curve (Q3) and, in doing so, use the answer to Q3 to infer something about Q1 (an upper limit) or Q2 (quantization). For example, in Anderson et al. (2011), the presence of a plateau in memory fidelity (with increasing set size) is taken as evidence of an upper bound in the number of items that can be stored (Q1). However, the assumption that a plateau in memory fidelity is determined solely by an upper bound in the number of stored items was recently challenged by Brady, Konkle, Gill, Oliva, and Alvarez (2013), who suggest that the plateau can occur for other, unrelated reasons, such as the difficulty of retrieving low-fidelity items from memory. Inferring an upper bound or quantization from the quantity-fidelity curve requires additional assumptions and should, therefore, be treated as a distinct theoretical commitment.

Question 4. Does the number of items stored completely determine the fidelity of storage?

Until recently, most models of visual working memory implicitly assumed a deterministic relationship between the number of items stored and their fidelity (e.g., Wilken & Ma, 2004; Zhang & Luck, 2008). Contrary to this assumption, recent work has demonstrated that memories are variably precise (Bae, Olkkonen, Allred, Wilson, & Flombaum, in press; Brady & Alvarez, 2012, 2014; Fougnie, Suchow, & Alvarez, 2012; van den Berg et al., 2012). The information content of items stored in memory may be variable (van den Berg et al., 2012) and subject to stochastic degradation (Fougnie et al., 2012), even when controlling for variability caused by display-level factors such as differences in the memorability of certain colors or locations (Fougnie et al., 2012). Furthermore, different configurations of items (e.g., different visual displays) produce reliably different estimates of how many items are represented and how precisely they are represented, even when holding constant the number of items that are present (Brady & Alvarez, 2012, 2014; Brady & Tenenbaum, 2013). This form of variability appears to be driven by configural or ensemble representations (Brady & Alvarez, 2012, 2014).

Variability in the quality of memory representations is sometimes taken as evidence that visual working memory is supported by a flexible resource (van den Berg et al., 2012). However, it is important to note that variability in precision does not necessitate rejecting other attributes of slot models (e.g., Q1, a fixed upper bound; or Q2, coarse quantization). Models with a fixed upper bound and a small number of quanta could nonetheless allow for variability in precision—for example, by allowing for uneven allocation of quanta to items. For this reason, it is worth considering the question of variability separately (as in van den Berg, Awh, & Ma, in press).
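As a rough illustration of how a coarsely quantized, bounded model could still produce variable precision, the Python sketch below assigns a fixed number of quanta to items at random on each trial. The uniform random assignment rule and the function name are assumptions made for this example and are not claimed by any of the cited models.

```python
import random


def allocate_unevenly(n_quanta, n_items, rng):
    """Assign each quantum to a randomly chosen item (hypothetical rule).

    The total commodity is fixed, but the share each item receives,
    and hence its precision, varies from trial to trial.
    """
    counts = [0] * n_items
    for _ in range(n_quanta):
        counts[rng.randrange(n_items)] += 1
    return counts


rng = random.Random(0)  # fixed seed so the sketch is reproducible
trials = [allocate_unevenly(n_quanta=4, n_items=3, rng=rng) for _ in range(5)]
for counts in trials:
    print(counts)
```

Every simulated trial uses exactly four quanta and never stores more than three items, yet the quanta per item differ across trials, so variability in precision alone does not rule out a fixed upper bound or coarse quantization.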

Question 5. Can the commodity be allocated or reallocated flexibly?

Visual working memory is the purposeful storage of relevant visual information over a short duration, and so models of it necessarily allow some top-down control in selecting which information from a scene is stored. Where theories differ is in the flexibility of the top-down selection process and in the possibility of reallocating the memory-supporting commodity after items have already been encoded in memory and are no longer visible.

The question of flexibility is often entangled with the debate over the coarseness of the commodity. For example, a recent paper argued against fixed-precision slot models by showing that allocation is more flexible than would be possible with a coarsely quantized slot system (Li, Shao, Xu, Shui, & Shen, 2013). On the other hand, Zhang and Luck (2011) argued that limits in the flexibility of allocation support the idea of coarse quantization (see also Machizawa, Goh, & Driver, 2012; Murray, Nobre, Astle, & Stokes, 2012). While the coarseness of a memory commodity constrains the flexibility with which it can be allocated, quantization and flexibility are distinct commitments of a model. For example, a model with a finely divisible commodity could have limits in the flexibility of allocation, such as the requirement that the commodity is divided evenly among the items to remember. Likewise, a model with a coarsely divisible mental commodity could place no constraints on how quanta are allocated to items. By considering questions of flexibility and control separately from other components of the slots versus resources debate, it also becomes possible to consider the degree to which allocation can be restricted to task-relevant items (Vogel, McCollough, & Machizawa, 2005).

Another question relevant to flexibility is whether the commodity can be reallocated after its initial allocation. Williams and colleagues have shown strong evidence for reallocation: When participants are told that particular items currently held in memory are no longer needed, performance improves for those that remain (Williams, Hong, Kang, Carlisle, & Woodman, 2013; Williams & Woodman, 2012; see also Makovski & Jiang, 2007; Matsukura & Hollingworth, 2011; Sligte, Scholte, & Lamme, 2008). To explain this result, they invoke the idea of reallocation—reassigning the mental commodity from one item to another during maintenance. In contrast, Zhang and Luck (2009) proposed a “sudden death” account of memory in which the commodity cannot be reallocated: When an item is forgotten, the commodity assigned to it goes down with the ship. Differences between these accounts may reflect the difference between purposefully and accidentally forgetting an item. However, explaining this discrepancy requires further elaboration of existing models.

The question of flexible allocation or reallocation is orthogonal to other components of the debate on slots versus resources. For example, one can imagine a “slot” model where slots are reallocated when an item is forgotten, or conversely, a “resource” model where the resource allocation is fixed and immutable after encoding.

Question 6. How are memories structured and what is their format?

Are visual working memories structured as monolithic object representations (Luck & Vogel, 1997), or are independent visual features such as colors and orientations stored separately from each other (Bays et al., 2011; Bays, Wu, & Husain, 2011; Magnussen, Greenlee, & Thomas, 1996)? Do objects have multiple levels of representation in working memory (Alvarez & Cavanagh, 2008; Fougnie & Alvarez, 2011; Wheeler & Treisman, 2002)? Do observers store information spanning multiple objects (e.g., texture, ensemble, or summary representations), and, if so, is this kind of representation independent of the storage of objects and features, or do they interact (Brady & Alvarez, 2011; Brady & Tenenbaum, 2013)?

Existing models provide a wide range of answers to these questions, having posited everything from constraints on the number of integrated objects that can be stored (Lee & Chun, 2001; Luck & Vogel, 1997), to constraints on features (Bays et al., 2011; Bays, Wu, & Husain, 2011; Magnussen et al., 1996), to intricate interactions between objects and features (e.g., a hierarchical representation that includes both an object level and a feature level, with constraints at both [“hierarchical feature bundles”; Brady, Konkle, & Alvarez, 2011]), to distinct visual feature representations whose stability is affected by the number of presented objects (Fougnie & Alvarez, 2011; Fougnie, Cormiea, & Alvarez, 2013).

Slot models have often suggested that memory maintains integrated objects (e.g., Luck & Vogel, 1997). Because of this, rejections of an object-based representation have been seen as evidence against slots. However, the format of the representation, the presence or absence of an upper bound (Q1), and the coarseness of quantization (Q2) are independent questions. For example, it is possible to construct a model with no upper bound (Q1) and fine quantization (Q2), but where the number of integrated objects determines fidelity (cf. Wilken & Ma, 2004).

Discovering the structure and format of visual working memory representations (the kinds of information that are stored and the constraints present on features, objects, and ensembles) is a prerequisite to understanding its architecture. Without knowing the units of representation, it is difficult to ask other questions. For example, the existence of an upper limit (Q1) cannot be determined without knowing what is represented (what is it an upper limit of?), and the quantization of the commodity (Q2) cannot be determined without knowing what it is allocated to (Fig. 2).

Fig. 2. Possible structures of representations in visual working memory. On the left is the stimulus display presented to the observer. On the right are three possible memory representations: bound objects, separable features, and hierarchical feature bundles. (Figure adapted from Brady et al., 2011)

Question 7. How are visual working memories formed?

Developing and testing a model of maintenance requires a model of encoding, because one cannot be certain that imperfect task performance is due to storage limits if encoding limits have not been ruled out.

Traditionally, models of working memory have assumed simple encoding models. For example, Cowan’s (2001) influential formulation of capacity assumes that observers select a random subset of K objects from a display and store them, encoding nothing about the others. It further assumes that enough time is allowed for complete encoding, such that it does not limit performance. To validate the use of simple encoding models like this one, measures are taken to ensure that effects are insensitive to stimulus timing. For example, Luck and Vogel (1997) manipulated how long stimuli were presented and found that observers’ performance was similar at 100 and 500 ms (see also Vogel et al., 2001). However, other studies have suggested that 100 ms is not sufficient for complete encoding, even with a display of simple objects (Bays et al., 2009; Bays et al., 2011; Bays, Wu, & Husain, 2011; Oberauer & Eichenberger, 2013). The disparity in reported rates of encoding, along with many studies’ failure to test for encoding limitations, suggests that caution is needed when ascribing performance limits solely to maintenance processes.

It is often assumed that observers encode items randomly from a display and that any regularities or imbalances in which items are encoded will average out over the course of many trials. There are good reasons to believe that this assumption is false. For example, Emrich and Ferber (2012) showed that spatial competition between nearby items during encoding affects which items are successfully stored. Competition at encoding can also occur when similar items are displayed simultaneously and can be alleviated by dividing items into two sequential presentations (Shapiro & Miller, 2011). Furthermore, this competition at encoding appears to depend on both the category of the stimuli and their location in the visual field (Cohen, Rhee, & Alvarez, 2013, 2014). Thus, different conditions in a working memory experiment (e.g., set sizes or stimulus categories) might be differentially limited by competition at encoding. Finally, work by Brady and Alvarez (2012, 2014) and Brady and Tenenbaum (2013) shows that observers do not, in fact, encode items at random. Rather, the configuration of items on a display can strongly affect what is encoded about each particular item. These configural effects differ in magnitude across different set sizes (Brady & Alvarez, 2012, 2014) and, thus, make it difficult to compare capacity across set sizes—a technique commonly used in determining the presence of an upper bound (Q1) and the coarseness of quantization (Q2).

The process of encoding stimuli into working memory remains understudied. Slot and resource models tend to be models of storage, without firm commitments about encoding. Unless these models are further elaborated, they will not be particularly useful in furthering our understanding of encoding. Therefore, focusing on the slots versus resources debate in its present form impedes progress in understanding this critical aspect of working memory.

Question 8. How do we use a memory representation to make a response?

Having specified how memories are encoded and maintained, a complete model of working memory must then specify how the process of retrieval occurs and how retrieved memories are used to make a response in a particular task. Some models of working memory treat this as a simple matter; for example, Zhang and Luck (2008) assume that responses are either guesses (if the item is not in memory) or samples from the underlying memory distribution (if the item is in memory). Other models focus on this question in greater detail, by considering, for example, the observer’s need to correctly recognize the correspondence between the probe and the item that is being tested (Bae, Wilson, Holland, & Flombaum, 2014; Bays et al., 2009). Some models even use counterfactual reasoning, with observers considering not only the current contents of memory, but also what would have been encoded and remembered had the display been different (Brady & Tenenbaum, 2013).
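The simplest of these response models, in which a response is either a random guess or a sample from the memory distribution, can be written as a two-component mixture over response error. The Python sketch below assumes a circular feature such as hue, a von Mises memory distribution centered on the true value, and a uniform guessing distribution; these distributional choices, the parameter names, and the series used for the Bessel function are assumptions made for illustration.

```python
import math


def bessel_i0(x, terms=50):
    """Modified Bessel function of order zero, via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2) ** 2 / (k + 1) ** 2
    return total


def mixture_pdf(error, guess_rate, kappa):
    """Density of a response error (radians, in [-pi, pi)).

    With probability guess_rate the item is not in memory and the
    response is a uniform guess; otherwise it is drawn from a von Mises
    distribution with concentration kappa centered on zero error.
    """
    von_mises = math.exp(kappa * math.cos(error)) / (2 * math.pi * bessel_i0(kappa))
    uniform = 1 / (2 * math.pi)
    return (1 - guess_rate) * von_mises + guess_rate * uniform


# Density at zero error versus at the opposite side of the feature circle.
peak = mixture_pdf(0.0, guess_rate=0.3, kappa=5.0)
floor = mixture_pdf(math.pi, guess_rate=0.3, kappa=5.0)
```

Under this sketch, guess_rate controls the height of the flat floor and kappa controls the width of the central peak, which is how such a model separates the probability of storage from the precision of what is stored.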

Consideration of how observers convert memories into responses is needed to fit models to empirical data. Slot models tend to assume a high-threshold, all-or-none model of response. For example, Cowan’s K assumes that observers either fully represent an item (i.e., they represent it well enough that memory quality does not limit performance) or fail to represent it at all. Similarly, it is possible that observers sometimes fail, in an all-or-none manner, to report an item despite having it stored in memory (attentional lapses; Rouder et al., 2008). Resource models tend to focus on more probabilistic response strategies, where observers respond imperfectly despite having a representation of an item (e.g., Wilken & Ma, 2004; van den Berg et al., 2012). These models use various signal detection strategies for producing a response given a memory representation (e.g., a maximum absolute differences model or a sum of absolute differences model; Wilken & Ma, 2004). Although slot and resource models tend to use different response models, these commitments are rarely derived from their slot-like or resource-like nature. The assumed response strategies are usually independent of the factors that distinguish slot and resource models.
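The two signal detection rules named above can be sketched as decision functions for a change detection trial. The Python code below is one plausible reading of the rule names, not a reproduction of the published models: it assumes that the observer compares a noisy remembered value with the test value for each item, treats feature values as linear rather than circular, and reports a change when a summary of the absolute differences exceeds a criterion. All names and numbers are hypothetical.

```python
def max_abs_difference_rule(remembered, observed, criterion):
    """Report "change" if the largest single-item difference exceeds the criterion."""
    return max(abs(r - o) for r, o in zip(remembered, observed)) > criterion


def sum_abs_difference_rule(remembered, observed, criterion):
    """Report "change" if the summed differences across items exceed the criterion."""
    return sum(abs(r - o) for r, o in zip(remembered, observed)) > criterion


# Hypothetical trial: three noisy memories compared with the test display.
remembered = [10.0, 52.0, 91.0]
observed = [12.0, 50.0, 95.0]

says_change_max = max_abs_difference_rule(remembered, observed, criterion=5.0)
says_change_sum = sum_abs_difference_rule(remembered, observed, criterion=5.0)
```

With the same memories and the same criterion, the first rule reports no change (largest difference 4) and the second reports a change (summed difference 8), which illustrates that the response rule is a separate commitment from how the memories themselves are stored.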

Another important question about using memory to produce a response is whether all the information that an observer uses comes from the active visual working memory system. Although tasks are often assumed to isolate limitations in working memory, many may involve the use of other memory systems, such as iconic memory (Saults & Cowan, 2007) or long-term memory (Lin & Luck, 2012; see Brady et al., 2011, for a review of the role of long-term memory in working memory tasks). Thus, when fitting a model that proposes an architecture for visual working memory to data from a particular task, researchers will need to take into account the possible contributions of other memory systems.

Discussion

We have outlined eight basic questions that are often confounded, obscured, or ignored in the current debate on slot versus resource accounts of visual working memory. Different instantiations of slot and resource models give different answers to these questions. For example, some “resource” models assume an upper bound on the number of representations that can be stored (Alvarez & Cavanagh, 2004), whereas others do not (Wilken & Ma, 2004). Similarly, some “slot” models assume that there is both a slot limit and a resource limit on memory (Awh et al., 2007), whereas others assume that there is only a slot limit, with the pattern of allocation of these slots completely explaining differences in item fidelity (Zhang & Luck, 2008). And existing slot and resource models rarely make firm commitments on all questions (e.g., Q4, the role of variability; Q5, the degree of flexibility; Q6, the content of stored representations; or Q7, the nature of encoding). From this, a complication arises: If a slot or resource model can be modified to give any answer to these questions, while still retaining its name and class, it becomes impossible to reject either class of models on the basis of experimental data. Instead, it becomes necessary to consider each model separately, not by its name or class, but in terms of the answers that it provides to each underlying question.

In addition to confounding some of these questions, the focus on slots versus resources has drawn attention away from research on other important questions. For example, relatively little is known about the relationship of visual working memory to other forms of working memory and to other cognitive capacities (although see Fukuda et al., 2010). Within the broader working memory literature, which does not focus as much on the slots versus resources debate, important areas of research include the separation between the maintenance of visual and auditory information (Baddeley, 1986; Fougnie & Marois, 2006, 2011; Morey & Cowan, 2004, 2005), the separation between active storage (which is a limited-capacity amodal system) and the activated portion of long-term memory (which may recruit modality-specific rehearsal systems) (Cowan, 2001), and the role of working memory in online processing (Hollingworth & Luck, 2009). It is noteworthy that there is sparse communication between the slots versus resources debate and these broader topics (e.g., Baddeley, 2012), despite the fact that they have profound theoretical implications and will encourage consideration of how the visual memory system fits within a broader cognitive architecture.

Conclusion

The recent debate about the structure and format of visual working memory divides models into two camps: slots versus resources. This dichotomous framing obscures a set of at least eight underlying questions, which we have elaborated here, and leaves open questions that are fundamental to understanding the visual working memory system and its relationship to the cognitive system more broadly. Reframing the debate in terms of these eight questions makes it possible to place slot and resource models as poles in a more expansive theoretical space. Doing so should make models of visual working memory more comprehensive and better integrated with the broader working memory literature.