As we look at the environment around us, our subjective visual perceptions seem filled with detail. One is given the impression that at any moment, we have awareness of everything our eyes can tell us. Yet, although our eyes certainly do receive a wealth of detail about the world, evidence from the psychophysics laboratory suggests that our subjective experience of detailed vision is misleading. In fact, our moment-to-moment visual representations seem to be highly impoverished, containing little more than the basic ‘gist’ of what is before our eyes (Noë, 2002; O’Regan & Noë, 2001; cf. Marr, 1982). The phenomenon of change blindness (Simons & Levin, 1997) is one way in which the limited awareness of what we see can be demonstrated. Change blindness is typically demonstrated using the flicker paradigm (Rensink, 2000; Rensink, O’Regan, & Clark, 1997). In this paradigm, two different versions of a photographic image or display of objects are presented in alternation, usually interleaved with a briefly presented blank screen. The observer is usually instructed to locate and identify the change in the picture or display as quickly as possible. Most observers are typically rather poor at this task, behaving as if blind to the existence of the change until it is eventually located. The change blindness phenomenon occurs not just for subtle changes, but even when manipulations are more dramatic. To give an example, Rensink et al. presented two versions of a photograph of a market scene, using the flicker task. As these two versions were shown in alternation, the trousers of the market stall attendant in the picture changed their colour between blue and brown. Observers would often require tens of iterations before the change was spotted, despite the substantial colour change and despite the large area of the scene involved in the change.

An important aspect of the change blindness phenomenon is the obviousness of the change once it is either discovered or pointed out to the observer. This shows that change blindness is not simply a failure of sensory registration or discrimination ability or due to low-level perceptual interference caused by the flickering of the display. Instead, the phenomenon seems to highlight a genuine difficulty in perceiving and identifying changes when they are presented in a way in which the luminance transients, normally associated with the appearance of a change, are masked: If no blank mask is inserted between the two alternating versions of the scene, the change is rendered obvious, due to its associated luminance transient capturing our attention.

The flicker paradigm unquestionably demonstrates the importance of the bottom-up guidance of transients in making us aware of changes as they occur in our environment. However, the evidence it provides is not, by itself, conclusive evidence that we lack detailed visual representations. The paradigm gives, at best, a rather indirect measure of our moment-to-moment visual awareness: Even if our conscious perceptions represented the pre-change scene in all relevant details, in order to detect a change this representation needs to be retained intact through the presentation of both the blank interval mask and of the post-change scene (Angelone, Levin, & Simons, 2003; Mitroff, Simons, & Levin, 2004; Simons, 2000). Even were our pre-change scene representation robust enough to survive this, one would still need some effective way of comparing the entire scene against the current scene representation, in order to ensure that the change component was detected (Varakin, Levin, & Collins, 2007). In practice, the inherent spatial uncertainty as to the change’s location in the flicker task requires multiple perceptual decisions to be made about the scene before the change location is revealed (Wright, Alston, & Popple, 2002; Wright, Green, & Baker, 2000).

Thus, the flicker paradigm presents us with a task which taps into a multitude of visual processes. It is as much a measure of our ability to retain and compare visual information as of our ability to initially apprehend it. The considerations above demonstrate fundamental limitations in the paradigm as a tool for exploring visual awareness. Fortunately, there are alternatives which can be considered as more suited to this purpose. One such alternative is the mudsplash technique. This is a variant of the flicker paradigm in which a change is repeatedly presented, but without the intervening blank frame shown in the flicker paradigm; instead, a set of irrelevant objects (‘mudsplashes’) appear and disappear in the display in synchrony with the change, but in locations where they do not spatially overlap it (O’Regan, Rensink, & Clark, 1999; Watanabe, 2003). The transients generated by the onset and offset of the mudsplashes conceal the change transient which would otherwise be obvious.

With the change transient concealed in this way, change detection is almost as effortful as in the flicker paradigm. However, again, what the task informs us about our moment-to-moment visual perceptions is debatable. In addition to disguising the transient, the mudsplashes presumably also interfere with the normal process by which attention is allocated across a scene (Cole, Gellatly, & Blurton, 2001; Kramer & Sowon, 1995). This mudsplash technique may, therefore, alter the nature of what we see by involuntarily drawing our attentional resources away from what we might otherwise perceive towards the more salient transients associated with the appearance and disappearance of the mudsplashes. Furthermore, the mudsplash technique, like the flicker paradigm, still involves spatial uncertainty about the location of the required perceptual decision. This spatial uncertainty in itself is likely to compound the difficulty of detecting a change, since it means that all perceptual comparisons must always be made across multiple regions of the display to locate the change.

Another task, a modification of the mudsplash technique, presents a scene onto which is added a single mudsplash that briefly occludes the display area where a change occurs before itself disappearing. This method has the advantage of drawing attention to the change location and so limiting the perceptual comparison to a single probed display region. Despite the greater cognitive simplicity of this method, as compared with, say, the flicker paradigm, O’Regan, Rensink and Clark (1999) still found that participants were rather poor in identifying the nature of a change, consistent with their having only a limited representation of the scene to draw upon. This difficulty was most apparent when participants were cued to a region of marginal interest within a scene (for example, changes to some letter markings on a photograph of a hockey pitch, rather than to the players at the centre of the image; O’Regan et al., 1999).

A further alternative to the cued change detection task described above, and one which arguably makes even fewer cognitive demands, is the immediate visual memory task (Reinecke, Rinck, & Becker, 2006; Wolfe, Reinecke, & Brawn, 2006). This is a task in which observers just have to make a simple decision about an object or location in a display or scene which is abruptly cued while the display is being viewed. A single transient or occluding mask is given both as a probe to the display location required for report and to conceal the to-be-reported attribute. No detection of change is required, so there is no requirement to compare previous and current percepts; the immediacy of the probe’s appearance means that memory requirements are as limited as possible.

Wolfe et al. (2006) presented observers with displays consisting of different coloured shapes in an experiment which used such an immediate memory task in an attempt to assess visual awareness. Individual shapes in the displays typically had one of a limited number of attribute values. This choice of stimuli meant that the task loaded heavily on the information held in visual representations, rather than on potential verbal or semantic memory processes which might also be utilised with displays consisting of mainly unique display items or with meaningful photographic scenes. In one version of the task, observers were shown 20 red or green disks. After viewing the display for about a second, 1 disk would suddenly become brighter. On half the trials, the disk would also change in colour (from red to green or green to red) simultaneously with the brightness increase. Despite seeing the change as it happened in front of their eyes, observers were little better than chance (52% correct) in reporting whether a change in colour had accompanied the increase in brightness. A second experiment showed that poor immediate memory performance was not limited to colour. In this experiment, displays consisting of 32 left- or right-leaning bars were presented for a duration similar to that in the previously described colour task. A probe consisting of an occluding square was abruptly presented to cover one of the bars, and the task was simply to report the orientation of the covered bar. Participants were no better than chance in this task. This was despite its apparent simplicity in asking only for a report about an object immediately after it was being viewed, with neither any spatial uncertainty about the to-be-reported-upon location/object nor any intervening interval to contend with.

The results Wolfe et al. (2006) found from these immediate visual memory tasks can be taken as direct evidence of a profoundly limited visual awareness of what is before our eyes. In these tasks, participants were doubtless aware that the scene contained a number of coloured dots or diagonal lines as they viewed the displays. What they seemed largely unaware of were the attributes of any particular object when actually tested. There seemed to be no iconic-like representation available that participants could draw on to assist them under these conditions of viewing when cued to report about what was at a location (cf. Becker, Pashler, & Anstis, 2000).

Further experiments reported by Wolfe et al. (2006) showed that spatial pre-cueing of an item leads to substantially improved accuracy in reports of its attributes. For instance, in one of these experiments (Wolfe et al., 2006, Experiment 6), a display of 20 coloured squares was presented; as observers viewed the display, three to eight of the items were pre-cued one by one by briefly flashing in series. After this, one of the pre-cued items in the display was covered, and the observer was asked to report the colour of this item. The more recently this probed item was pre-cued in the sequence, the more likely it was that its colour was accurately reported. Further analysis suggested that the pre-cueing advantage existed, at best, for the last four items in the sequence. The advantage enjoyed by pre-cued items occurred presumably because focal attention was drawn towards the item, leading to its consolidation in visual short-term memory (VSTM). The well-documented item capacity limitations of the VSTM store (Pashler, 1988; Vogel, Woodman, & Luck, 2001; cf. Wilken & Ma, 2004) accounted for the limited pre-cueing benefit of only the most recently attended items (see also Reinecke et al., 2006).

Together, these results suggest that our visual awareness at any moment is constrained to the attributes of no more than four objects, which can be maintained in VSTM if our attention has been specifically focused on them. However, the tasks given by Wolfe et al. (2006), while suggesting a very limited awareness of object attributes under conditions of diffuse attention, may underestimate what observers actually knew about the displays. It is possible, for instance, that although observers may have been at (or near) chance in reporting about, say, the colour of an object in a display, they may still have been aware of the existence of an object at the tested location. Thus, our awareness of object locations in a display may exceed our awareness of what attributes these objects possess. Indeed, work on change detection has shown evidence consistent with this possibility. Changes consisting of the addition or deletion of an object tend to be detected more efficiently than changes involving some object attribute or changes to the semantic identity of an object (e.g. Aginsky & Tarr, 2000; Bahrami, 2003; Cole, Kentridge, & Heywood, 2004; Henderson & Hollingworth, 1999, 2003; Sanocki, Sellers, Mittelstadt, & Sulman, 2010; Simons, 1996).

Such research possibly demonstrates that change detection mechanisms are rather more sensitive to changes in object layout than to attribute changes. However, as we earlier argued, standard change detection measures tend to give a rather indirect measure of awareness in often being confounded by the involvement of other cognitive processes. Thus, on this evidence alone, it is unclear what such findings say about our visual awareness of object location and attribute information when viewing a scene.

However, independently of this work on change detection, other paradigms have also revealed advantages in participants’ reporting about location information. High accuracy is often displayed on tasks where one need rely only on object location information; performance tends to be lower when knowledge of object surface features is also required. For instance, Horowitz et al. (2007), using a multiple object tracking (MOT) task, found that observers were more efficient in tracking objects in terms of just their positions than in tracking them when the task required attention to the unique identity of an item as well as its position.

More closely related to Wolfe et al.’s (2006) immediate memory tasks is a series of experiments performed by Huang and colleagues (e.g. Huang, Treisman, & Pashler, 2007). These tasks required observers to report about the locations and features of objects presented in static displays. They revealed an advantage in reporting about the former. In one of the tasks, two objects appeared briefly in a display either simultaneously or in close temporal succession. When the objects were presented in succession, participants were equally accurate in reporting about colour and location; with simultaneous presentation, reporting of location was as accurate as with successive presentation, while reporting of colour was substantially impaired.

These findings seem to reveal something fundamental about the underlying structure of our visual representations (what Huang & Pashler, 2007, refer to as ‘Boolean maps’) and how they can be accessed at any given moment. It is argued that only a single feature value can be accessed at a time from these maps, while location information can always be accessed in parallel (Huang & Pashler, 2007; Huang et al., 2007). Further evidence supporting this conclusion was obtained by Huang (2010). Here, using more complex displays containing several objects, the same basic location-over-colour advantage was found. Participants were presented with two separate displays, each containing up to seven randomly located coloured dots. Participants had to make a speeded decision concerning whether the pattern of colours or locations of the dots matched across the two displays. Results showed that response times increased substantially with set-size when the decision concerned object colour, while an essentially flat set-size function was found when the decision concerned location. Thus, with multiple items in a display, participants were rather more efficient in reporting about the locations of display items than about their colours.

Therefore, independent techniques suggest that location information about objects has precedence in our visual awareness. The poor immediate visual memory for object attributes found by Wolfe et al. (2006) may have underestimated what observers knew about the displays they viewed. They may have had little-to-no awareness of whether a certain location contained, say, a red or green item or a left- or right-oriented item, but they may have been aware that a display item was present at that location and, correspondingly, aware of the emptiness of locations which were unfilled by any item. The tasks used gave no evidence either way on this possibility, because observers were only ever responding about display locations in which an item was present.

In four experiments, we used a modified version of the immediate visual memory task. The task was modified so that observers were probed at empty locations as well as ones containing objects. This modification allowed us to measure awareness of object attributes, as compared against awareness of the presence/absence of an object at a tested location.

All the reported experiments showed participants displays which contained a variable number of red and blue coloured shapes. In all the experiments, these displays were presented for several hundred milliseconds before a probe (a small black square) appeared to occlude a small region of the display. On some trials, the probe covered one of the coloured shapes; on other trials, it covered an empty location. All the experiments required participants to make an unspeeded keypress to indicate something about what was at this indicated location before it was occluded. In the first three experiments, participants made a simple two-alternative (yes/no) response according to whether or not they thought that a particular target object was present or absent at the probe location (e.g. “did the location contain a red circle?”). Within this task, participants’ knowledge about the locations of objects, as compared with knowledge about their colour, was determined by comparing false alarm (FA) rates in incorrectly reporting a specified target at empty versus occupied display locations.

If observers did know more about object locations than about their colours, a certain pattern of FA errors was expected across different display locations. It was expected that FA errors would be relatively infrequent at empty locations (since participants should tend to know whether or not a display location contained an object) and relatively frequent at distractor locations (because observers would often know that the location contained an object, while being unaware of its colour and, thus, unaware of whether the object was a target or a distractor; see Footnote 1).

In all four experiments, the number of items in the display was parametrically varied. This was done for two reasons. The first was to determine how accuracy in immediate memory would be affected by this variable. A large number of visual cognitive tasks show robust monotonic performance decrements associated with display set-size, including visual search (Palmer, 1994; Treisman & Gelade, 1980), MOT (Pylyshyn & Storm, 1988), and change detection (Kempgens, Loffler, & Orbach, 2007; Rensink, 2000; Wright et al., 2002; Wright et al., 2000). To date, this variable has yet to be explored systematically within the immediate memory paradigm. The second reason for the set-size variation was to determine how the different response error rates varied at empty and object-occupied locations. For instance, as Huang (2010) found, differences in awareness of object features and object locations may be most apparent when complex displays (i.e. ones containing many items) are viewed.

Experiment 1

Experiment 1 gave an initial test of the hypothesis that when viewing displays, participants tend to be more aware of object locations than object colour, using a simple variation on the basic paradigm used in Wolfe et al. (2006, Experiments 1 and 2). Participants were shown a display of red and blue coloured letter ‘O’s. A probe then appeared which covered a single display location, either one containing a red or blue letter or an empty space. When the probe appeared, observers had to respond yes if they thought the location contained a red ‘O’ (hereafter referred to as a target) or, otherwise, respond no—that is, if the location contained a blue ‘O’ (hereafter referred to as a distractor) or was empty. In the experiment the set-size of target and non-target items was varied across trials to determine these variables’ effects on overall performance and the error pattern produced across distractor and empty locations.

Method

Participants

The study employed 16 participants between 18 and 45 years of age, recruited from staff and students in the Department of Psychology, Oxford Brookes University. Ethical approval for this and all the following experiments had been obtained from the University ethics panel for research involving human participants. All participants had normal or corrected-to-normal visual acuity and normal colour vision. Some participants received a course credit for taking part in the experiment.

Stimuli

Stimuli were displayed on an 18-in. flat screen Sony Trinitron CRT monitor running at a refresh rate of 60 Hz. The monitor was controlled by an IBM-compatible PC containing an Intel Pentium 4 (2.66 GHz) CPU and NVIDIA GeForce 4 graphics card, running purpose-written software routines in Microsoft Visual Basic (Version 6.0). The software controlled all aspects of the experiment, including randomisation, stimulus presentation, response recording and presentation of auditory feedback via loudspeakers. The monitor was viewed in a sound-deadened and darkened room. Some limited diffuse background illumination was provided by a light source positioned behind the participants in order for them to be able to see the keyboard. The monitor was viewed from an approximate distance of 120 cm. Accurate reproduction of stimuli was ensured by calibrating and testing the monitor according to the procedure described by Hunt (1991). Measurement during the calibration procedure was performed using a CRS ColorCal Colorimeter (Cambridge Research Systems, Cambridge, U.K.). The displayed stimuli consisted of coloured letters. In Experiment 1, these were all letter ‘O’ characters presented in Arial font (size 50). At this size, each character subtended an angle of 0.69° at the given viewing distance. The letters were either red (CIE 1931 coordinates Y = 23.3, x = .42, y = .34) or blue (Y = 23.3, x = .21, y = .25). Within the experiment, differing numbers of these items were presented at various locations on a square neutral grey background region which was equal in luminance to the red and blue of the letters (Y = 23.3). This equal luminance of coloured items and background minimised the possibility of afterimages being seen due to luminance contrast, which participants might otherwise rely on when a location was probed. The background area subtended an angle of 13.01° × 14.96°. The remaining portion of the monitor screen, which fell outside the background region, was shown as a black border.
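The visual angles quoted above follow from the standard relation between physical size and viewing distance. The sketch below illustrates that relation only; the function names are hypothetical, and the physical size of the characters is not reported in the text, so it is back-calculated here from the stated 0.69° and 120 cm rather than taken from the original setup.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of a given size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_for_angle_cm(angle_deg, distance_cm):
    """Physical size (cm) needed to subtend a given visual angle."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Values stated in the Stimuli section: 0.69 deg at roughly 120 cm.
# The resulting size is an inference, not a reported measurement.
letter_size_cm = size_for_angle_cm(0.69, 120)
print(f"implied letter size: {letter_size_cm:.2f} cm")
print(f"round trip: {visual_angle_deg(letter_size_cm, 120):.2f} deg")
```

Under these assumptions the implied character size comes out at roughly 1.4 to 1.5 cm on the screen.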

Procedure

Each trial consisted of presentation of a number of stimuli, each positioned at an individual location on the grey background display region. The number of targets and distractors was varied factorially across conditions. Three target set-size conditions (6, 12, 24) and two distractor set-size conditions (6, 24) were given. Within each set-size condition, there was some deliberate variation in the number of the target and distractor items. Thus, on any trial, the target and distractor items each had a 50% probability of having one less than the specified set-size number. So, for instance, for the target set-size 12, distractor set-size 24 factorial condition, the number of targets would be 11 or 12 with equal probability, and the number of distractors would be 23 or 24 with equal probability. This variation was necessary in order to discourage participants from adopting a counting strategy to indirectly determine whether or not an item had been present at the probe location, particularly in small set-size conditions. Items were positioned at 1 of 81 locations on a notional 9 × 9 grid, which itself was positioned on the grey background. Individual positions on this notional grid were spatially jittered across trials so that the grid was always irregular in shape and so that items tended not to be aligned with one another in the displays. The spatial jittering of grid positions was done with the constraint that adjacent items never overlapped or touched one another. Allocation for individual red and blue items on the grid was determined on each trial using a randomisation procedure. Empty locations were designated as notional grid locations to which neither a red nor a blue item had been allocated on a particular trial. The display of items was presented for a period that could vary randomly between 1,000 and 1,500 ms before the appearance of the probe. 
The probe consisted of an opaque black square (subtending an angle of 1.0°) which always covered one of the notional grid locations (see Fig. 1 for a schematic illustration of a single trial). The position in which the probe would appear on any trial was pseudo-randomised within the notional grid. However, probe position was weighted so that it had an equal probability of occurring over each of the three types of locations (target, distractor, empty location). Participants were instructed that their task on each trial was to report whether or not they thought that there had been a target at the probe location before it was covered. They were told to indicate their decision by making an unspeeded keypress of one of two designated keys on a standard computer keyboard. They were told to press the right slash key (‘/’) to make a yes response and to press the left slash key (‘\’) to make a no response. The right and left slash keys on the computer keyboard were relabelled ‘Y’ and ‘N’, respectively, to indicate their meaning in the experiment. Participants were told that they were required to guess if they were unsure on any trial. They were also informed that approximately one third of the trials would require a yes response and two thirds of the trials a no response. Responses were recorded on the computer’s hard drive. An error tone was given as auditory feedback after every incorrect response, on both the practice trials and the main trials. This feedback was given to maintain participant alertness on the task and to minimise possible response biases. Each response initiated the next trial after a 500-ms blank inter-trial interval. A demonstration and some practice trials were given to participants before starting the main trials in order to familiarise them with the displays, the nature of the task, and the required response mapping.
Two versions of practice trials were given in sequence. The first version of the task was deliberately made easy so that participants should not make errors. In this version, the probe was a hollow outline square rather than an opaque square. This ensured that participants could still see whether the location contained a target or a distractor or was empty as they made their response. Giving this task ensured that participants fully understood the instructions and the particular response mapping required. When participants showed that they could perform these first practice trials without making errors (meaning that they had successfully learned the response mapping required for the task), they were given a second set of practice trials, which were the same as the main trials (i.e. the probe was now opaque rather than hollow). Participants were given 30 randomly selected practice trials including trials from each of the combinations of target and distractor set-size conditions. There were 360 main trials. Within these, there were equal numbers of trials with each of the six factorial combinations of target and distractor set-sizes. Participants were given a 1-min break halfway through the experimental trials.
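The trial structure described in this Procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the original Visual Basic software: the function name `make_trial` and the returned dictionary are hypothetical, the spatial jitter of grid positions is omitted because its magnitude is not reported, and the preview duration is drawn from a continuous uniform range without modelling the 60-Hz frame timing.

```python
import random

GRID_SIZE = 9  # notional 9 x 9 grid (81 locations), as described above


def make_trial(target_set_size, distractor_set_size, rng=random):
    """Build one hypothetical trial: item layout, probe cell, preview time."""
    # Each item type independently has a 50% chance of one fewer item,
    # which discourages counting strategies.
    n_targets = target_set_size - rng.randint(0, 1)
    n_distractors = distractor_set_size - rng.randint(0, 1)

    # Randomly allocate grid cells to targets and distractors;
    # every unallocated cell counts as an empty location.
    cells = [(r, c) for r in range(GRID_SIZE) for c in range(GRID_SIZE)]
    rng.shuffle(cells)
    layout = {
        "target": cells[:n_targets],
        "distractor": cells[n_targets:n_targets + n_distractors],
        "empty": cells[n_targets + n_distractors:],
    }

    # The probe is equally likely to fall on each of the three location
    # types, then on a random cell of that type.
    probe_type = rng.choice(["target", "distractor", "empty"])
    probe_cell = rng.choice(layout[probe_type])

    return {
        "layout": layout,
        "probe_type": probe_type,
        "probe_cell": probe_cell,
        "preview_ms": rng.uniform(1000, 1500),
    }


trial = make_trial(12, 24, random.Random(1))
print(trial["probe_type"], trial["probe_cell"], round(trial["preview_ms"]))
```

Choosing the location type first and the cell second is what gives the three location types equal probe probability regardless of how many cells of each type a display contains.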

Fig. 1

Schematic diagram of a single trial in Experiment 1. Participants see a preview of the display consisting of red and blue ‘O’ shapes on a grey background for a period between 1,000 and 1,500 ms. After this, a probe (consisting of a small black square) appears to cover one of the display locations. In Experiment 1, participants had to make an unspeeded forced choice decision of whether the probe area contained a red ‘O’ or did not

Results and discussion

Results were analysed through calculation of the proportion of hits and FAs in the responses for different trial types. These were calculated separately for each set-size condition. The proportion of hits [p(Hit)] was calculated from trials on which target locations were cued. On such trials, correct responses were treated as hits, incorrect responses as misses: p(Hit) was computed as the sum of hits divided by the sum of hits plus misses. The proportion of FAs [p(FA)] was calculated from trials on which non-target locations were cued. On such trials, correct responses were treated as correct rejections and incorrect responses as FAs. Separate p(FA) measures were able to be calculated from the two types of non-target location (distractor, empty). Finally, a signal detection accuracy measure was calculated using the A-prime statistic (A′; see Macmillan & Creelman, 2005). A′ is a non-parametric response-bias-free measure of task performance; it is calculated from p(Hit) and p(FA) and returns a single value in a range between .5 (chance level) and 1.0 (perfect performance). A′ was calculated individually for each participant and for each set-size condition; for the calculation of A′, p(FA) was calculated from across both types of non-target trial, distractor and empty. The across-participant means are shown in Fig. 2. Statistical analysis concentrated on the two aspects of the data most relevant for our hypothesis: the A′ measure of task accuracy and the p(FA) rates at distractor and empty locations across the conditions. Both were looked at in terms of how they varied as a consequence of set-size.
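The measures described above can be illustrated with a short calculation. The sketch below assumes the commonly used A′ formula, including its mirrored form for the case in which false alarms exceed hits; the text cites Macmillan and Creelman (2005) without giving the formula, so this is an assumption rather than a statement of the exact computation used. The function names and the trial counts are hypothetical.

```python
def proportion(count, total):
    """Proportion of trials of a given type that received a 'yes' response."""
    return count / total


def a_prime(p_hit, p_fa):
    """Non-parametric sensitivity index A' computed from p(Hit) and p(FA)."""
    if p_hit == p_fa:
        return 0.5  # chance-level discrimination
    if p_hit > p_fa:
        diff = p_hit - p_fa
        return 0.5 + (diff * (1 + diff)) / (4 * p_hit * (1 - p_fa))
    # Mirrored form for below-chance responding (more FAs than hits).
    diff = p_fa - p_hit
    return 0.5 - (diff * (1 + diff)) / (4 * p_fa * (1 - p_hit))


# Hypothetical counts for one participant in one set-size condition.
hits, misses = 40, 20
fa_distractor, n_distractor_trials = 10, 60
fa_empty, n_empty_trials = 15, 60

p_hit = proportion(hits, hits + misses)
p_fa_distractor = proportion(fa_distractor, n_distractor_trials)
p_fa_empty = proportion(fa_empty, n_empty_trials)
# For A', p(FA) is pooled across both kinds of non-target trial.
p_fa_pooled = proportion(fa_distractor + fa_empty,
                         n_distractor_trials + n_empty_trials)

print(p_fa_distractor, p_fa_empty, a_prime(p_hit, p_fa_pooled))
```

Under this formula, values below .5 are possible when false alarms outnumber hits, so the .5 to 1.0 range applies to at-or-above-chance performance.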

Fig. 2

Results from Experiment 1. a. A′ discrimination accuracy for each of the three target set-size conditions, with separate bars shown for the two distractor set-size conditions (6 distractors, grey; 24 distractors, white). b. p(Hit) for each target set-size condition, with separate bars shown for the two distractor set-size conditions (6 distractors, grey; 24 distractors, white). c. For the 6-distractor set-size condition and d. for the 24-distractor set-size condition, p(FA) rates at distractor and empty locations for the three target set-size conditions (distractor locations are dark grey bars, and empty locations are light grey bars). Error bars show 95% confidence intervals around the mean.

Across-participant means for each of these measures across the different conditions are shown in Fig. 2. It can be seen here that the basic pattern of data was not consistent with the hypothesis that participants knew more about object locations than about the colours of objects at those locations. Against the hypothesis, FAs were never greater in number in distractor locations than in empty locations. When there were 6 distractors in a display, FAs were most frequent in empty locations (i.e. opposite to the direction predicted by the hypothesis; see Fig. 2c); when there were 24 distractors, there was little difference in the FA rates at distractor and empty locations (see Fig. 2d). Further statistical analysis, reported below, confirmed these observations.

A two-way repeated measures ANOVA (target set-size [three levels: 6, 12, 24] and distractor set-size [two levels: 6, 24]) was performed on the A′ scores. There was a significant main effect of target set-size, F(2, 30) = 55.41, MSE = 0.006, partial η² = .787, p < .001, and distractor set-size, F(1, 15) = 5.88, MSE = 0.013, partial η² = .282, p < .05, but the interaction between the two factors was not significant, F(2, 30) = 0.816, MSE = 0.099, p = .45. For the p(FA) data, separate two-way ANOVAs were done for the 6- and 24-distractor conditions. Each of these ANOVAs contained two factors: target set-size (three levels: 6, 12, 24) and cue location (two levels: distractor, empty). For the 6-distractor condition, both main effects were significant: cue location, F(1, 15) = 8.79, MSE = 0.061, partial η² = .369, p < .01, and target set-size, F(2, 30) = 32.64, MSE = 0.015, partial η² = .685, p < .001. The cue location × target set-size interaction did not reach significance, F(2, 30) = 2.74, MSE = 0.023, p = .081. For the 24-distractor condition, the main effect of cue location was not significant, F(1, 15) = 0.24, MSE = 0.007, p = .63, but that of target set-size was significant, F(1, 15) = 44.48, MSE = 0.013, partial η² = .748, p < .001. For the 24-distractor condition, as with the 6-distractor condition, the cue location × target set-size interaction did not reach significance, F(1, 15) = 1.69, MSE = 0.005, p = .85.
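The reported effect sizes can be checked against the F ratios, because partial η² is a direct function of F and its degrees of freedom: partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity to the values reported above; the function name is hypothetical and nothing is re-analysed from raw data.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported partial eta squared)
reported = [
    ("A': target set-size", 55.41, 2, 30, 0.787),
    ("A': distractor set-size", 5.88, 1, 15, 0.282),
    ("p(FA), 6 distractors: cue location", 8.79, 1, 15, 0.369),
    ("p(FA), 6 distractors: target set-size", 32.64, 2, 30, 0.685),
    ("p(FA), 24 distractors: target set-size", 44.48, 1, 15, 0.748),
]

for label, f_value, df1, df2, eta in reported:
    recovered = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: recovered {recovered:.3f}, reported {eta:.3f}")
```

Because the identity depends only on the ratio of the two degrees of freedom, F(1, 15) and F(2, 30) give the same partial η² for a given F, so this check cannot distinguish between those two sets of degrees of freedom.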

Thus, Experiment 1, while showing clear effects of certain variables, yielded no support for the hypothesis that participants knew more about the locations of objects than about their colour. The A′ measure showed that accuracy in the visual immediate memory task was strongly influenced by set-size. Significant effects were found for both target and distractor set-size. However, the effect of distractor set-size was somewhat smaller, consistent with the task irrelevance of these items within this experiment. Although accuracy declined with set-size, even with the most difficult condition (24 targets and distractors), accuracy was still well above chance.

As was noted above in the Results and Discussion section, FA responses [p(FA)] displayed an interesting pattern of data across distractor and empty locations, although not one which was consistent with our hypothesis regarding awareness of object locations in relation to awareness of their colour. When distractors were plentiful in the display (distractors = 24), approximately equal proportions of these responses were found at distractor and empty locations; when there were only a few distractors in the display (distractors = 6), a substantially greater number of FAs were found at empty locations, as compared with distractor locations. Although this effect of location was significant, it was in the opposite direction to our hypothesis; our prediction was that p(FA)s would be greater in distractor locations than in empty locations.

Although Experiment 1’s results were inconsistent with our hypothesis regarding awareness of object locations and object colour, it is quite possible that this was a consequence of the way in which this hypothesis was tested. Certainly, the results suggest that under some conditions, participants were adopting a grouping strategy, which may have acted against us finding our predicted pattern of results. The larger p(FA) rate at empty locations relative to distractor locations when distractors are few in number suggests that, under these conditions, participants tend to group distractors together as a single perceptual object. This grouping should lead to a more efficient encoding of the individual object locations, associated with the distractor colour which is the basis for the grouping. This might have the consequence of reducing the FA rate for these distractor items, as compared with empty locations, consistent with what we observed.

Such a grouping strategy would be far less effective when distractors were more plentiful in the display. This explains why the FA rate was no longer proportionally smaller at distractor locations, as compared with empty locations, when 24, rather than 6, distractors were present in a display. However, it does not, by itself, explain why a higher FA rate was not now observed for distractors, given that these items could no longer easily be strategically grouped together. Another aspect of the stimulus display might explain this. It is well known that items can be perceptually segregated efficiently on the basis of a single feature dimension such as colour (Beck, Peterson, & Angelone, 2007; Treisman & Gelade, 1980). The nature of the task may have allowed participants to weight attention towards the locations containing target items and away from those containing distractors (Bacon & Egeth, 1994; Wolfe, 1994). Weighting attention away from distractors may have reduced the likelihood that these items were, in many cases, consciously represented.

Experiment 2 was designed to increase the likelihood that distractor items would be attended on all trials, regardless of the number of them in each display. In it, targets were distinguishable from distractors only by a feature conjunction of colour and shape, to make perceptual segregation of targets and distractors more difficult (Kim & Cave, 1995).

Experiment 2

Experiment 2 presented the same target items as in Experiment 1 (red ‘O’s); unlike in Experiment 1, however, there were two types of distractors in the displays: ones differing in colour from targets (blue ‘O’s) and ones differing in shape (red ‘X’s). It was expected that this manipulation would make it harder to confine attention to the target items in the displays without also attending to at least some distractors. This change was predicted to have two effects. First, it was predicted that distractor set-size would have a greater effect on accuracy than it did in Experiment 1. Second, if awareness of object locations was superior to awareness of object colour, the greater attention received by distractor items in Experiment 2 would lead to a pattern of results in which FAs were more frequent at distractor locations than at locations which were empty.

Method

Participants

Sixteen participants were recruited from the same population as for the previous experiment. None had taken part in the previous experiment.

Stimuli and procedure

All aspects of the experiment were the same as for Experiment 1, except for the difference in the distractor items. Approximately half of the distractors were blue ‘O’s, and half were red ‘X’s.

Results and discussion

Data were analysed in the same manner as that described for Experiment 1. Across participants, averages for A′, p(Hit), and p(FA) statistics are shown in Fig. 3. Here, it can be seen that the pattern of FAs was essentially the same as that in Experiment 1 with regard to distractor and empty locations (see Figs. 3c, d), a pattern confirmed by the statistical analysis of these data, reported below.

Fig. 3

Results from Experiment 2. a. A′ discrimination accuracy for each target set-size condition, with separate bars shown for the two distractor set-size conditions (6 distractors, grey; 24 distractors, white). b. p(Hit) for each target set-size condition, with separate bars shown for the two distractor set-size conditions (6 distractors, grey; 24 distractors, white). c. For the 6-distractor set-size condition and d. for the 24-distractor set-size condition, p(FA) rates at distractor and empty locations for the three target set-size conditions (distractor locations are dark grey bars, and empty locations are light grey bars). Error bars show 95% confidence intervals around the mean.

For A′ scores, an ANOVA showed a significant main effect of target set-size, F(2, 32) = 10.46, MSE = 0.014, partial η² = .395, p < .001, and distractor set-size, F(1, 16) = 16.14, MSE = 0.017, partial η² = .502, p < .001, but no interaction between these two, F(2, 32) = 0.43, MSE = 0.007, p = .431. Further analysis of the accuracy data was done comparing the scores in Experiment 2 with those in Experiment 1. This was done to test our second hypothesis that the different type of distractor stimuli employed in Experiment 2 would lead to a proportionally greater influence of distractors in this experiment, as compared with the previous one. In particular, this analysis showed a significant target set-size × distractor set-size × experiment interaction, F(2, 62) = 134, MSE = 0.009, partial η² = .041, p < .05. This interaction shows that, while target set-size had a similar effect across the two experiments, the distractor set-size variable had a proportionally greater effect in Experiment 2 than it did in Experiment 1. Thus, having targets defined by a feature conjunction, as they were in Experiment 2, altered the relative importance of targets and distractors in the task, as compared with Experiment 1: While in Experiment 1 targets had a bigger effect on accuracy than did distractors, in Experiment 2 the reverse was the case.

The increase in the effect of distractors on accuracy in Experiment 2 shows that distractors were more difficult to ignore than targets in this task. Despite this, when we look at the FA data, we see essentially the same pattern as in Experiment 1. In particular, there was no effect of cue location consistent with our hypothesis. For neither the 6- nor the 24-distractor condition were FA errors greater in distractor than in empty locations. For the 6-distractor condition, the main effect of target set-size was significant, F(2, 32) = 7.29, MSE = 0.027, partial η² = .313, p < .002; that of cue location approached, but did not reach, significance, F(1, 16) = 3.81, MSE = 0.082, p = .069, although, as with Experiment 1, there was a slightly higher rate of FAs at empty locations than at distractor locations. For the 24-distractor condition, target set-size was again significant as a main effect, F(2, 32) = 14.87, MSE = 0.010, partial η² = .482, p < .001, but cue location was not, F(1, 15) = 0.30, MSE = 0.012, p = .865, reflecting the similar FA rate at distractor and empty locations in this condition. Neither distractor condition exhibited any significant interaction between the two main effects (min. p = .266).

Experiment 3

Neither of the first two experiments indicated any tendency for FA errors to be more frequent for distractor locations than for empty locations. These experiments were similar in presenting displays for which participants knew some items were task relevant (targets) and some which were task irrelevant (distractors). In independently varying these items, these experiments revealed some properties of immediate visual memory, in particular, the fact that even irrelevant items influence the accuracy with which relevant items can be reported. However, these experiments were possibly not well suited to test our basic question about the relative awareness of object locations and object colours, because the task encouraged participants to treat target and distractor items differently. The task situation is, therefore, possibly different from that in Wolfe et al. (2006). Experiment 3 was designed to produce an equal spread of attention across target and distractor locations. This was done, first, by always presenting approximately equal numbers of target and distractor items and, second, by varying the colour of to-be-reported targets across trials and informing participants of this colour only at the point where a response was required from them. In Experiment 3, a high or low tone informed participants about the target colour on any particular trial. Because items of either colour could be the target on any given trial, attention should be evenly distributed across all items. Again, it was predicted that if participants knew more about object locations than about their colours, more FAs should occur in reporting the presence of targets at non-target locations than at empty locations.

Method

Participants

Twenty-five participants were employed, recruited from the same population as that specified for the previous experiments. None had taken part in the previous experiments.

Stimuli and procedure

The same stimuli and equipment as in Experiment 1 were used. There were four item set-size conditions (6, 12, 24, 48), and displays always contained approximately equal numbers of red and blue items (co-varying target and distractor set-size), although, as in the previous experiments, some variation was introduced into the number of red and blue items in each display to minimise the likelihood of a counting strategy being adopted for smaller set-sizes. Displays were presented for a variable interval of between 1,000 and 1,500 ms before the onset of the cue. As in Experiment 1, the cue appeared with equal probability at the location of a red ‘O’ or a blue ‘O’ or at an empty location. Simultaneously with the cue onset, a tone lasting 500 ms was played through loudspeakers to inform participants of the target colour for the trial: A high (1,500-Hz) tone indicated that the target was a red ‘O’; a low (200-Hz) tone indicated that it was a blue ‘O’. High and low tones had an equal probability of occurring on each trial. Participants used a standard computer keyboard to respond. They were instructed to make an unspeeded response according to whether or not the cued location contained a target. The ‘9’ key of the numeric keypad of the standard keyboard was allocated as a yes response for trials where red items were designated as targets; the ‘3’ key of the numeric keypad was allocated as a yes response for trials where blue items were designated as targets. The ‘9’ and ‘3’ keys were relabelled with red and blue coloured paper, respectively, to indicate this fact. The space bar was allocated as a no response for both types of trial. Participants were told that if they heard a high tone and thought that the cued location contained a red ‘O’, they should press the red key (i.e. make a yes-red response); otherwise, they should press the space bar (i.e. make a not-red response). If they heard a low tone and thought that the cued location contained a blue ‘O’, they should press the blue key (i.e. make a yes-blue response); otherwise, they should press the space bar. The blue key was rendered inactive on trials with a high tone, and the red key was rendered inactive on trials with a low tone, since these were not valid responses on these trials. Participants were given a demonstration and practice as described for the previous experiments before doing the main trials of the experiment. A total of 300 main trials were given. As with all the previous experiments, an error tone was sounded when an incorrect response was given on both practice and main trials.
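The response mapping described above can be summarised as a small scoring routine. The sketch below is illustrative only: the function name, the outcome labels, and the string codes for tones, keys, and location contents are our own assumptions and do not come from the software used to run the experiment.

```python
from typing import Optional

# Hypothetical string codes for the tone-to-target and target-to-key mapping.
TARGET_FOR_TONE = {"high": "red", "low": "blue"}   # 1,500-Hz vs 200-Hz tone
YES_KEY_FOR_TARGET = {"red": "9", "blue": "3"}     # relabelled keypad keys
NO_KEY = "space"


def score_trial(tone: str, cued_content: str, key: str) -> Optional[str]:
    """Classify one Experiment 3 trial.

    cued_content is 'red', 'blue', or 'empty'. Returns an outcome label,
    or None when the pressed key was inactive for this tone.
    """
    target = TARGET_FOR_TONE[tone]
    yes_key = YES_KEY_FOR_TARGET[target]
    if key not in (yes_key, NO_KEY):
        return None  # e.g. blue key pressed after a high tone
    said_yes = key == yes_key
    if cued_content == target:
        return "hit" if said_yes else "miss"
    if said_yes:
        # A false alarm is split by what actually occupied the cued location.
        return "fa_empty" if cued_content == "empty" else "fa_distractor"
    return "correct_rejection"
```

Note that the single no key covers both non-target and empty locations, so a correct rejection does not reveal which of the two the participant believed was present.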

Results and discussion

The p(Hit), p(FA), and A′ measures were computed for each set-size condition. Across-participant averages of these are shown in Fig. 4. As can be seen from the pattern of FA responses in Fig. 4c, there was a small difference in the FA rate between distractor and empty locations, consistent with the hypothesis; however, as the analysis below shows, this difference was not a statistically significant one.
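For readers unfamiliar with the A′ statistic, the sketch below shows the widely used nonparametric formula for computing it from a hit rate and an FA rate. We assume this common form here for illustration; the function name is ours, and the example rates in the test are made up rather than taken from the data.

```python
def a_prime(hit_rate: float, fa_rate: float) -> float:
    """Nonparametric sensitivity index A' from hit and false-alarm rates.

    0.5 indicates chance performance and 1.0 perfect discrimination.
    The second branch handles below-chance performance (FA rate > hit rate).
    """
    h, f = hit_rate, fa_rate
    if not (0.0 <= h <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("rates must lie between 0 and 1")
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))
```

Because A′ combines both rates, a drop in p(Hit) and a rise in p(FA) with set-size both pull the index towards the chance value of 0.5.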

Fig. 4

Results from Experiment 3. a. A′ discrimination accuracy for each of the four display set-size conditions (6–48). b. p(Hit) for each display set-size condition. c. p(FA) rate for each display set-size condition, with distractor locations as dark grey bars and empty locations as light grey bars. Error bars show 95% confidence intervals around the mean.

A one-way ANOVA of the A′ values showed a significant effect of set-size, F(3, 72) = 109.76, MSE = 0.003, partial η² = .821, p < .001. A t-test showed that performance was still significantly above chance even in the condition with the lowest accuracy, t(24) = 13.24, p < .001. Analysis of p(FA) was done using a two-way ANOVA with two factors: cue location (distractor, empty) and set-size (6, 12, 24, 48). The main effect of cue location was not significant, F(1, 24) = 1.73, MSE = 0.016, p = .201; the main effect of set-size was highly significant, F(3, 72) = 147.88, MSE = 0.009, partial η² = .860, p < .001. There was no significant interaction between the factors, F(3, 72) = 1.71, MSE = 0.008, p = .172.

Experiment 3 was done to encourage participants to allocate equal attention to target and distractor locations. Unlike in the preceding experiments, the FA rate in Experiment 3 did exhibit a small tendency to be lower at empty locations than at distractor locations, although this tendency did not reach statistical significance. However, there is an issue with Experiment 3 which makes the interpretation of the results problematic with respect to the hypothesis regarding awareness of object locations and their colour. This issue was response bias. The pattern of responses suggested a tendency for participants to adopt a conservative criterion when responding about the presence of a target (i.e. to respond no when unsure). A formal calculation of response bias (B″D; see Donaldson, 1992) confirmed this tendency, which was particularly evident for the larger set-sizes (e.g. with set-size = 48, B″D = +0.37; positive values on this measure indicate conservatism).
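To make the bias measure concrete, the sketch below computes Donaldson's (1992) index from a hit rate and an FA rate, using the formula [(1 − H)(1 − F) − HF] / [(1 − H)(1 − F) + HF]. The function name is ours, and the rates in the usage example are invented for illustration; they are not the rates observed in Experiment 3.

```python
def donaldson_bias(hit_rate: float, fa_rate: float) -> float:
    """Donaldson's (1992) nonparametric response bias index.

    Ranges from -1 (always respond yes) to +1 (always respond no);
    positive values indicate a conservative criterion.
    """
    misses_times_cr = (1 - hit_rate) * (1 - fa_rate)
    hits_times_fa = hit_rate * fa_rate
    denominator = misses_times_cr + hits_times_fa
    if denominator == 0:
        # Happens only for perfect (H=1, F=0) or fully inverted (H=0, F=1) data.
        raise ValueError("bias is undefined for these rates")
    return (misses_times_cr - hits_times_fa) / denominator


if __name__ == "__main__":
    # Illustrative rates only: few hits and very few FAs give a positive value.
    print(round(donaldson_bias(0.6, 0.1), 3))
```

A participant who withholds yes responses when unsure lowers both the hit rate and the FA rate, which pushes the index in the positive direction.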

It is not clear why participants displayed such a bias, although there are at least two aspects of the experiment which may have contributed towards it. One possibility is the task contingencies: Fewer trials required a yes response than a no. The lower frequency of positive trials may have discouraged participants from making yes responses when uncertain. A second possibility concerns the response mapping. While the yes response was required only when a participant believed a target colour was present at a location, a no response was required in two cases: when the participants believed the location was empty and when they believed it contained a non-target. This response mapping, in which the no response covered two possible decisions, may also have encouraged participants to err towards making no responses when uncertain. The response conservatism would almost certainly have had the effect of reducing the number of FA errors on the task. As a consequence, this may have reduced the sensitivity of FAs as a measure of visual awareness and, in turn, reduced the possibility of our finding statistically reliable differences across distractor and empty locations in this measure.

To tackle the possible influence of response bias, one could manipulate the relative occurrence of different trial types (e.g. increase the number of trials on which a target item is probed to try to increase yes responses). In the present experiment, the cue was presented with equal probability at target, distractor, and empty locations. In increasing the number of positive (target) trials, this balance across the different trial types would be lost; in particular, a smaller proportion of trials would have to be given in which the cue covered one of the empty locations. In addition, this manipulation would not address the other possible issue associated with the response bias mentioned in Experiment 3—that of the yes or no response requirement in which a yes response is required for one trial type, while a no decision is required for two trial types.

Both of these problems are addressable by discarding the two-alternative yes/no paradigm used in the experiments so far described and, instead, adopting a task in which participants are given three response alternatives—that is, report red, blue, or empty. This three-alternative response task (3AFC) confers several advantages for our task. First, it means that each location type is mapped onto a single response key in the experiment. Second, the 3AFC allows us to maintain the same task contingencies as in Experiment 3, in which the probe occurred with equal probability in each location type. In allowing participants to directly report what they thought was present at the location of the probe on each trial by selecting one of three keypresses, it was hoped that this would remove the response bias issue present in Experiment 3 and, in doing so, provide a more sensitive measure of what people know about the displays when tested.

Experiment 4 gave such a task as a final test of our original prediction. With the 3AFC task, it is not possible to talk of FAs as such. However, were it the case that participants knew more about item locations than about their colours, one would expect a particular pattern of response errors. Errors in incorrectly making a colour response (i.e. incorrectly responding red at a location where no red item was present or responding blue where no blue item was present) should occur more frequently at filled locations than at empty locations. For instance, we would expect to find that participants tended to incorrectly respond red to a location containing a blue item (or to incorrectly respond blue to a location containing a red item) more than to make these response errors at unfilled locations.

Experiment 4

Method

Participants

Twenty-five participants were employed, recruited from the same population as that specified for the previous experiments. None had taken part in any of the previous experiments.

Stimuli and procedure

The same stimuli and equipment were used as in Experiments 1 and 3. The presentation of the visual displays was the same as in Experiment 3, except that no cue tone was presented when the probe appeared. When the probe appeared, observers were instructed to select from three response alternatives. The same response keys for red and blue were used as in the previous experiment; the space key was designated as the empty response key. These three keys were accordingly relabelled on the computer keyboard. A demonstration and practice, as described for Experiment 1, were given before the main trials of the experiment. A total of 300 main trials were given in the main experiment. Feedback in the form of an error tone was given for incorrect responses on both practice and main trials.

Results

The proportions of correct and incorrect responses were calculated for each participant. The means of these are shown in Fig. 5, which gives (a) the proportion of correct colour responses (i.e. trials on which the occluded location contained a red [or blue] item and a red [or blue] response was given); (b) the proportion of correct empty responses (i.e. trials on which the probe location was empty and an empty response was given); (c) the proportion of incorrect colour responses (i.e. responses in which participants incorrectly responded red [or blue] to a location where a red [or blue] item was not present; separate bars are shown for the proportion of such errors at object locations [e.g. responding red to the location of a blue item] and at empty locations [e.g. responding red to an empty location]); and finally, (d) the proportion of incorrect empty responses (i.e. trials on which the location was filled and the response was empty).
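The four response categories defined above can be tallied as in the following sketch. It rests on an assumption that the text does not state explicitly: each proportion is taken relative to the number of trials of the relevant location type (filled or empty). The function name, the category labels, and the example trials are ours and purely illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tally_3afc(trials: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Tally 3AFC outcomes from (content, response) pairs.

    content and response are each 'red', 'blue', or 'empty'.
    Assumption: proportions are relative to trials of that location type.
    """
    counts: Counter = Counter()
    n_filled = 0
    n_empty = 0
    for content, response in trials:
        if content == "empty":
            n_empty += 1
            if response == "empty":
                counts["correct_empty"] += 1          # panel (b)
            else:
                counts["colour_error_empty"] += 1     # panel (c), empty locations
        else:
            n_filled += 1
            if response == content:
                counts["correct_colour"] += 1         # panel (a)
            elif response == "empty":
                counts["incorrect_empty"] += 1        # panel (d)
            else:
                counts["colour_error_object"] += 1    # panel (c), object locations

    def proportion(key: str, n: int) -> float:
        return counts[key] / n if n else float("nan")

    return {
        "correct_colour": proportion("correct_colour", n_filled),
        "correct_empty": proportion("correct_empty", n_empty),
        "colour_error_object": proportion("colour_error_object", n_filled),
        "colour_error_empty": proportion("colour_error_empty", n_empty),
        "incorrect_empty": proportion("incorrect_empty", n_filled),
    }
```

Under this scoring, the hypothesis that location is better known than colour predicts that the colour-error proportion at object locations exceeds that at empty locations.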

Fig. 5

Results from Experiment 4. a. p(correct ‘colour’ responses) for each set-size condition. b. p(correct ‘empty’ responses) for each set-size condition. c. p(incorrect ‘colour’ responses) for each set-size condition, shown separately for these errors at object and empty locations. d. p(incorrect ‘empty’ responses) for each set-size condition. Error bars show 95% confidence intervals around the mean.

The key issue is the data pattern in Fig. 5c. Consistent with our hypothesis of observers knowing more about the locations of objects than about their colour, errors in reporting a coloured item were more frequent at filled (object) locations than at empty ones. This was the case across all four set-sizes.

The data in Fig. 5c were subjected to a two-way ANOVA: set-size (6, 12, 24, 48) and location (object, empty). The main effect of set-size was highly significant, F(3, 72) = 269.12, MSE = 0.002, partial η² = .918, p < .001; however, critically for the hypothesis, so was that of location, F(1, 24) = 10.2, MSE = 0.09, partial η² = .30, p < .001. The interaction between the two factors did not approach significance, F(3, 72) = 0.08, MSE = 0.003, p = .968.

Discussion

Experiment 4 involved presentations of the same stimulus displays as in Experiment 3, yet the basic result was different in terms of its significance. Results suggested that participants knew significantly more about object locations than about their colour. The pattern of errors in Experiment 3, while showing a trend in this same direction, did not show this effect to be statistically reliable. Thus, consistent with our original hypothesis, Experiment 4 finally demonstrated that participants did know somewhat more about the locations of the objects in the display than about the colours of the objects. However, it must be noted that this effect, although statistically reliable, was rather small in magnitude.

Given that the presented displays were the same in the two cases, why should Experiment 4 have yielded a significant effect consistent with our hypothesis when Experiment 3 did not? The only critical difference between the two experiments was in the responses participants were required to make and the differing task demands associated with these. Allowing three response options, rather than two, gives participants the opportunity to report more directly about what they think they know about a tested location. This may have made the task more sensitive to participants’ knowledge of what they had seen. When given a yes/no decision, participants seemed biased towards responding no under conditions of uncertainty. Because of this, what participants knew about object locations was, perhaps, not always expressed in the pattern of FAs. Such no responses were unavailable in Experiment 4. In this experiment, if participants knew that an object was present at a cued location but were not confident of its colour, it seems unlikely that they would respond empty; it is more likely that they would choose to respond either red or blue. Thus, participants’ knowledge about filled and empty locations would tend to be reliably expressed in the incorrect colour response rate in a way that it was not in the task given in the previous experiments. Indeed, examination of the response frequencies showed that the response bias prevalent in Experiment 3 was no longer an issue in Experiment 4. While in Experiment 3 there was a clear tendency towards a particular response, this was not the case in Experiment 4. In Experiment 4, each one of the three responses was made almost exactly on one third of the trials.

General discussion

Before deliberating on the findings of the four experiments regarding awareness of object locations and colour, we first briefly consider the immediate memory task itself. Across all the experiments, set-size variations had a pronounced effect on immediate memory task performance; generally, the more items there were in a display, the greater was the number of errors made. For target set-size, these effects were consistent with the greater load placed on the resources of visual attention and VSTM in monitoring and maintaining a representation of these items (Bays & Husain, 2008; Luck, Hillyard, Mouloua, & Hawkins, 1996; Luck & Vogel, 1997; Palmer, 1994). The task in Experiments 1 and 2, in always having the same target type across all trials, potentially allowed the distractors to be ignored. However, these experiments suggest that distractors were not ignored entirely. This is demonstrated by the fact that the number of the task-irrelevant (distractor) items significantly affected the accuracy with which the presence or absence of a target at a location was reported. The effect of distractor number was most evident when these items differed from targets across a conjunction of two features (Experiment 2); here, the number of distractors in the display influenced immediate memory accuracy more than did the number of targets.

The effect of distractors in these tasks presumably reflects the extent to which these items capture attention in an involuntary manner (Forster & Lavie, 2008; Theeuwes, 2004) and, in doing so, compete with targets for VSTM resources (Fukuda & Vogel, 2009; Vogel, McCollough, & Machizawa, 2005). Work on visual marking has often reported that when static irrelevant items are presented in a manner similar to that in the present experiments, they can be effectively inhibited and, thus, ignored (Donk, 2006; Watson & Humphreys, 2000). Contrary to this, our results suggested that irrelevant items were not totally disregarded, even when distinguishable from targets by a simple colour disjunction (Experiment 1). This aspect of our results was in accordance with findings of Vogel et al. (2005). These authors used a change detection paradigm in which the memory and test array consisted of varying numbers of red and blue tilted bars. There was a general tendency for accuracy in detecting orientation changes to be lower when two blue bars were present, as compared with when none was present (although this was most prevalent in participants deemed to have a low VSTM capacity). Experiment 1 showed a similar tendency for wholly irrelevant items to influence accuracy in the rather different circumstances of an immediate memory task in which only the presence or absence of a red item needed to be reported. The greater interference with distractors defined by a conjunction (Experiment 2) occurs presumably because of the increased difficulty in allocating attention to targets when these are defined on the basis of multiple feature values (Bettencourt & Somers, 2009; Kim & Cave, 1995; Pinsk, Doniger, & Kastner, 2004; Wolfe, 1994).

Despite these effects, there was no significant tendency for distractors to be misreported as targets any more than empty locations. The only difference in the FA rate observed between distractor and empty locations in Experiments 1 and 2 was in the opposite direction to our hypothesis. Here, a greater number of FAs tended to be observed at empty locations than at distractor locations when distractor number was small. We attribute this effect to distractors being grouped together as a single object in the task when few in number. Similar grouping effects with small numbers of objects have been reported with dynamic displays in MOT tasks (e.g. Yantis, 1992).

The most critical experiments regarding our proposed hypothesis were Experiments 3 and 4. Unlike Experiments 1 and 2, these experiments necessarily required equal attention to target and distractor items. In Experiment 3, this was done by telling participants the colour of the target item only after a display location was probed; in Experiment 4, it was done by not designating any items as targets and simply getting participants to report what they thought was contained at the probed location.

Experiment 3, like Experiments 1 and 2 preceding it, showed no significant tendency for distractor locations to be falsely reported as targets any more than empty locations. One clear issue we noted with the yes/no task in these experiments, however, was a marked tendency for participants to show bias towards making negative responses when uncertain. This may have meant that the participants’ knowledge about the locations of objects did not clearly translate into the pattern of errors this task produced. Experiment 4, in having three response options rather than two, seemed to address this possible response bias issue; here, an even spread was found across the three response categories, suggesting an absence of any tendency to favour any one response. In doing so, this task also revealed a clear tendency for certain response errors to occur more frequently at object locations, a pattern which suggested that participants sometimes clearly were aware that an object was present at a particular location when they were unaware what colour that object was. In other words, it showed that participants sometimes knew more about an item’s location than about its colour (although even in Experiment 4, it must be noted that this main effect was rather modest in size). At the same time, the absence of any significant interaction between this location effect and set-size showed that awareness of object locations declined significantly with set-size at the same rate as awareness of colour.

Colour was a relevant attribute in Experiment 4 (and all the other experiments in this study). Work on change detection has shown that our awareness of object attributes varies depending on how much attention is directed towards that attribute (Austen & Enns, 2000, 2003; Robinson & Triesch, 2008). Had colour been made less relevant to the task, the difference in awareness of object locations and object colour might have been even greater than that found in Experiment 4. The paradigm in Experiment 4 could easily be modified to test this possibility. This could be done by varying the type of response participants are required to make across different trials of the experiment, so that most trials did not require attention to colour. In such an experiment, most trials would require just a yes/no response about the presence or absence of any object at a cued location (meaning that colour could be ignored), and only occasional trials would require a three-alternative response, as in Experiment 4. Within the context of this experiment, a much larger difference might be found on the three-alternative response trials in the rate of incorrect colour responses at object and empty locations than that found in Experiment 4. Further research, using the immediate memory paradigm, could explore the influence of attention towards different attributes of an object.

The type of displays used in Experiment 4 may explain the modest size of effect found in this experiment with respect to errors at filled and empty locations. Experiment 4, like Experiments 1 and 3, had displays containing items of one of two colours. These two-colour displays may encourage grouping strategies in which spatially adjacent same-coloured items become represented as single perceptual objects (Rensink, 2001; Yantis, 1992). We, in fact, noted evidence for such a process based on the patterns of FAs in Experiments 1 and 2.

Using a change detection paradigm, Sanocki et al. (2010) recently found that participants tended to be rather good at detecting changes to object layout within a display. The good performance observed on this task was suggested to be due to the efficiency with which display items can be grouped by location. The authors argued that such grouping would be less effective for other attributes of objects than it would be for location. Contrary to this claim, our results seem to suggest that colour groupings are also effective. However, if displays contained multiple colours (rather than the two-colour displays used here), such grouping strategies might tend to founder and lead to poor accuracy in reporting about colour. Further research could explore the involvement of grouping processes in determining immediate memory performance and look at how the efficiency of such processes is affected by factors such as the heterogeneity of the display items.

The results of Experiment 4 are consistent with the findings of Huang and colleagues (e.g., Huang, 2010; Huang, Treisman, & Pashler, 2007) discussed earlier. Across several tasks, these authors found that location information was more readily available than colour information. For instance, Huang found that observers could more quickly determine whether two displays were different in terms of the locations of the objects they contained than they could determine whether they were different in the colours of the objects the displays contained. Their pattern of results suggested that the advantage for location information resulted from the efficiency with which location information could be selected in parallel. We found a similar advantage for location information over colour, although one more modest in terms of effect size.

One intriguing aspect of Huang’s (2010) data was the interaction between display set-size and awareness of object location and object colour. When the task required judgements about object locations, flat functions were obtained; when the task required judgements about colour, steep functions were obtained. In other words, the more items a display contained, the larger were the differences in awareness of location and colour information about objects. Inconsistent with this, Experiment 4 showed no hint of any interaction between set-size and location (distractor or empty). Nor did errors at empty locations produce anything like a flat function with set-size: Our data suggested that awareness of object locations deteriorated with set-size at the same rate as awareness of colour.

These differences in results are not surprising. Fundamental differences exist both in the complexity of the displays shown and in the demands placed on the participants in our experiments and in those performed by Huang (2010). First, in our experiments, displays tended to contain far more than the maximum of about seven items per display given in Huang’s work. Second, the immediate memory task measured an aspect of visual awareness quite different from that measured by Huang. Our experiments always required a perceptual decision about a single display location; those of Huang typically required decisions about multiple locations. Thus, Huang’s tasks tested the ability to simultaneously process multiple sources of visual information, while our tasks merely tested what people were able to report about a single location. What our experiments demonstrate is that we often have a limited awareness of much of the visual information present before our eyes. The more items are present in a display, the less likely it is that we will be able to report on any individual item when tested, in terms of either its colour or even its existence at a particular location. This presumably reflects the limited capacity of VSTM (Vogel et al., 2005; Wolfe et al., 2006). The experiments of Huang and colleagues (Huang et al., 2007) demonstrate a different limitation, one of simultaneous access to the attributes of multiple objects. While location information can be simultaneously accessed in this way, colour information cannot. Further work could explore the relationship between the limitation in awareness revealed by the present tasks and the limitations imposed by selection. This could be done within the immediate memory paradigm by presenting displays in the same manner, but with two probed locations on each trial, both of which require report. The two cues could be presented simultaneously or in sequence after a short temporal gap. Comparison between these conditions would show the extent to which information about different attributes such as colour or location can be selected in parallel under the conditions imposed by our experiments (see Footnote 4).

Another aspect of our data requires comment. Accuracy was generally rather better in our experiments than in those of Wolfe et al. (2006). They found that observers were little better than chance in reporting which of two colours an object was in a display, except in cases in which attention had been recently drawn to that item. We found higher than chance accuracy levels even for displays of 48 items—that is, more than double the number of items present in the equivalent task in this earlier study (Wolfe et al., 2006, Experiment 1). How can we explain the discrepancy? In Experiments 1 and 2, this could, perhaps, be explained by the fact that the task weighted attention towards a subset of the display items. However, Experiment 3 required participants to focus on all display items, making this basic task at least as difficult as that in Wolfe et al.’s original experiments (in fact, arguably, it was more challenging because of the greater spatial uncertainty with which the probe could appear in the display; it could appear in empty as well as filled locations, whereas in Wolfe et al.’s task, the probe was constrained to appear only in filled locations). Despite this, performance remained well above chance for most participants even at the largest set-sizes.

We suspect that the contingencies of the tasks might explain the better performance we observed. In Wolfe et al.’s (2006) experiments, display item number was held constant; in the present experiments, it was varied. With the smallest set-sizes presented in our experiments, the task was almost trivially easy, and few errors were made. The inclusion of such easy trials may have motivated participants to engage with the task more strongly, leading to better performance than would have been observed had the experiment contained only the most difficult trials. However, this is speculation. Further research is needed to explore these issues more thoroughly to determine the factors which influence the observed accuracy level on these tasks. For now, it seems clear that immediate memory for colour may not be as profoundly limited as might be concluded from Wolfe et al.’s data (see Footnote 5).

As was described earlier, the immediate memory paradigm can be considered to have advantages over other methods (such as the flicker or mudsplash techniques) for measuring what we know about what we see. One potential issue concerns the role of the probe in the task. We have argued that when an observer makes an incorrect response in reporting about a tested region, this is because his or her conscious visual representation of the scene lacked the required information. However, there is an alternative possibility: It could be that a representation was held but that it was substituted by the occluding probe object in a manner similar to that which occurs in the phenomenon of object substitution masking (see Di Lollo, Enns, & Rensink, 2000). Indeed, Wolfe and colleagues suggested such a possibility as a reason for the poor performance found with their occluding cue (Wolfe et al., 2006, p. 756). Against this explanation, it has been shown that object substitution masking effects diminish with the time for which items in a display are present; with items present for longer than about 500–800 ms, substitution effects are often weak or absent entirely (Gellatly, Pilling, Carter, & Guest, 2010). Given the long (>1,000 ms) display exposure times given in these immediate memory experiments, this suggests that object substitution effects are unlikely to be prevalent under the conditions of the experiments in this study. While object substitution per se may not be involved, other research has shown that representations of visual information are certainly vulnerable to overwriting by a subsequently presented probe, particularly under conditions in which attention is diffusely spread across multiple items (Makovski, Sussman, & Jiang, 2008; see Footnote 6). Further research is needed to determine the extent to which item information is simply unavailable in these immediate memory experiments under conditions where the presence or feature of an item cannot be reported or the extent to which the absence of this information is due to overwriting of an existing representation by the probe.

Fundamentally, the research described in this article demonstrates the utility and potential flexibility of the immediate memory paradigm as a method of probing our moment-to-moment awareness of what our eyes tell us. The task, in its basic form, is simple and makes relatively few assumptions about the relationship between what a task measures and the nature of our visual experience, as compared with rival methods such as the flicker paradigm. Further research using this method would be profitable in exploring the nature of our visual representations and how they are affected by the contents of the displays we view, by our current task goals, and by the limits of our attention. Our findings also show that it is surprisingly difficult to demonstrate that observers know more about the locations of objects than about their features (cf. Huang, 2010).