Object-based selection in visual working memory

Lin, Yin-ting; Kong, Garry; Fougnie, Daryl

doi:10.3758/s13423-021-01971-4

Brief Report
Open access
Published: 13 July 2021

Volume 28, pages 1961–1971, (2021)

Psychonomic Bulletin & Review

Abstract

Attentional mechanisms in perception can operate over locations, features, or objects. However, people direct attention not only towards information in the external world, but also to information maintained in working memory. To what extent do perception and memory draw on similar selection properties? Here we examined whether principles of object-based attention can also hold true in visual working memory. Experiment 1 examined whether object structure guides selection independently of spatial distance. In a memory updating task, participants encoded two rectangular bars with colored ends before updating two colors during maintenance. Memory updates were faster for two equidistant colors on the same object than on different objects. Experiment 2 examined whether selection of a single object feature spreads to other features within the same object. Participants memorized two sequentially presented Gabors, and a retro-cue indicated which object and feature dimension (color or orientation) would be most relevant to the memory test. We found stronger effects of object selection than feature selection: accuracy was higher for the uncued feature in the same object than the cued feature in the other object. Together these findings demonstrate effects of object-based attention on visual working memory, at least when object-based representations are encouraged, and suggest shared attentional mechanisms across perception and memory.

Introduction

Visual attention selects salient or behaviorally relevant objects, resulting in faster and more accurate responses to those objects at the expense of other information in the environment (Duncan, 1984; Egly et al., 1994; Maunsell & Treue, 2006; Posner, 1980). But what happens when information is no longer available to perception? We can temporarily hold task-relevant information in visual working memory (VWM), and recent research has suggested that selective attention mechanisms also operate in working memory (Chun et al., 2011; Kiyonaga & Egner, 2013). For example, cueing an item in VWM improves performance for that item, i.e., the retro-cue effect (Griffin & Nobre, 2003; Landman et al., 2003; Souza & Oberauer, 2016). Similarities in the effects of selection in perception and working memory are consistent with theoretical accounts suggesting a close relationship between attention and working memory (Cowan et al., 2005; Oberauer, 2019).

However, the exact manner in which attention and working memory interact is still unclear. Research has explored whether visual attention and VWM share limited resources (Cowan, 2001; Rensink, 2000; but see Fougnie & Marois, 2006), have common neural underpinnings (Awh et al., 1999; Awh & Jonides, 2001; Gazzaley & Nobre, 2012; Nobre et al., 2004), or rely on a common template (Kong et al., 2020; Olivers et al., 2011), mostly with ambiguous results. Work addressing whether selection in perception and VWM draw on similar representational properties (Kong & Fougnie, 2019) provides another avenue to investigate this question.

Extensive work has shown that multiple forms of attention exist in perception. In light of the debate on whether memory and perception share representation content (Harrison & Tong, 2009; Serences et al., 2009; Xu, 2017), it is important to examine whether this division between types of attention in perception also applies to VWM. Yet previous studies addressing this have focused on spatial selection (e.g., Bloem et al., 2018; Fang et al., 2019; Sahan et al., 2016; Souza et al., 2018). Given that early models of visual attention were predominantly focused on spatial properties (Downing & Pinker, 1985; Eriksen & Yeh, 1985; Posner et al., 1980), this is hardly surprising.

However, a considerable number of studies have since found that attention operates on non-spatial representations, such as features (Maunsell & Treue, 2006) or objects (for reviews, see Chen, 2012; Scholl, 2001). In perception, paradigms of object-based attention have shown enhanced performance for features on the same object, compared to those on overlapping or equidistant locations on different objects (Duncan, 1984; Egly et al., 1994), demonstrating a same-object advantage that is independent of space-based attention (but see Donovan et al., 2017; Vecera, 1994). Furthermore, this object-based mechanism was distinguished from feature-based selection, as attending to one feature of an object also enhances processing for other object features (Ernst et al., 2013; O’Craven et al., 1999; Schoenfeld et al., 2014).

Object-based attention is especially important here, as there are reasons to suspect that selective mechanisms in VWM can operate over objects rather than features or locations. Some suggest that objects are fundamental units of memory representations (Irwin & Andrews, 1996; Luck & Vogel, 1997; Vogel et al., 2001; but see Bays et al., 2009; Fougnie & Alvarez, 2011). Accordingly, research has suggested possible object-based effects within VWM (Awh et al., 2001; Bao et al., 2007; Gao et al., 2017; Hajonides et al., 2020; Matsukura & Vecera, 2009; Peters et al., 2015; Sahan et al., 2020; Woodman & Vecera, 2011). For example, Woodman and Vecera (2011) found that participants were less accurate when switching between different objects during memory retrieval. However, these studies often overlooked the potential contribution of location to object-based effects. Recent work has shown the importance of location in feature binding in memory (Golomb et al., 2014; Kovacs & Harris, 2019; Pertzov & Husain, 2014; Schneegans & Bays, 2017) and even suggested that observed object-based benefits arise from effects of spatial selection (Wang et al., 2016). Given that others did not find object-based attentional effects in VWM (Ko & Seiffert, 2009), it is important to isolate object-based benefits from space-based effects.

Here we investigate whether object-based effects in visual attention – beyond those that can be explained by spatial or featural attention – also apply when information is selected and updated in VWM. Experiment 1 used a memory updating task, in which participants updated equidistant features in same or different objects, to examine whether memory selection is also faster for features on the same object. Experiment 2 examined whether selection of a feature automatically leads to selection of other features within the same object by manipulating the relevance of both the object and the feature dimension (color or orientation) in a retro-cue task. Importantly, we presented objects at overlapping locations to control for location-based effects. If object-based representations guide memory selection in a similar way, selecting a feature should also facilitate selection of another feature in the same object, regardless of whether they share the same location or feature dimension.

Experiment 1

Egly et al. (1994) demonstrated object-based attentional effects in a spatial cueing paradigm (for reviews, see Reppa et al., 2012; Shomstein, 2012). Cueing one end of one rectangular bar facilitates detection of invalid targets at the opposite end of the cued bar, compared to those on a different rectangle. Because invalid targets in both the same and different rectangles were equidistant from the cue, their findings demonstrate a same-object benefit that cannot be attributed to effects of spatial proximity.

Here we used a memory updating task to examine whether a similar selection benefit occurs in VWM. Participants memorized two rectangle bars with colored ends. Subsequently, participants updated colors of two rectangle ends that could be on the same bar, at equidistant locations on different bars, or diagonally located on different bars. As in previous memory updating studies (Kong & Fougnie, 2019), we measured reaction times in a self-paced updating procedure to assess which items are selected more efficiently. Finally, participants were tested on their updated memory in a change-detection display, in which we scrambled the location and diminished the size of bars to encourage encoding of objects (rather than spatial positions).

Method

Participants

Eighteen students (15 female, 3 male; mean age 19.72 years, age range: 18–23 years) at New York University Abu Dhabi participated in Experiment 1 in exchange for course credit or an allowance of 50 AED per hour. One participant had an accuracy rate of 50% and was replaced.

To determine the sample size required for paired t-tests, we conducted a power analysis using G*Power (Faul et al., 2007), using the smallest effect size (d_z = 0.7) in a previous memory updating study (Kong & Fougnie, 2019; Cohen's d_z ranging from 0.7 to 1.3). As we were aiming to detect an effect of one of the main modes of attention, we decided that smaller effect sizes would not fulfill that criterion. We estimated that a sample size of 18 participants would yield a power of .80 at an alpha level of .05.
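The reported power can be sanity-checked without G*Power. The sketch below is a Monte Carlo approximation, not the authors' procedure: it assumes normally distributed paired differences with unit standard deviation (so that the mean equals d_z), a two-tailed test, and the tabled critical value of 2.110 for 17 degrees of freedom. The function name `simulated_power` and the simulation settings are illustrative.

```python
import math
import random

def simulated_power(d_z, n, t_crit, n_sims=5000, seed=1):
    """Estimate the power of a two-tailed paired t-test by simulation.

    Assumes paired differences are Normal(d_z, 1); t_crit is the
    two-tailed critical t for df = n - 1, supplied by the caller.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        diffs = [rng.gauss(d_z, 1.0) for _ in range(n)]
        mean = sum(diffs) / n
        var = sum((x - mean) ** 2 for x in diffs) / (n - 1)
        t = mean / math.sqrt(var / n)
        if abs(t) > t_crit:
            hits += 1
    return hits / n_sims

# d_z = 0.7 and n = 18 come from the text; 2.110 is the tabled
# two-tailed .05 critical value for 17 degrees of freedom.
power_n18 = simulated_power(d_z=0.7, n=18, t_crit=2.110)
print(f"Estimated power with n = 18: {power_n18:.2f}")
```

Because the estimate is simulation-based, it varies slightly with the seed and the number of simulated experiments.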

All participants reported normal or corrected-to-normal vision. Each participant gave written informed consent before the experiment. The study was approved by the New York University Abu Dhabi Institutional Review Board, and follows the principles laid out in the Belmont Report.

Apparatus and stimuli

The experiment was programmed in MATLAB using the Psychtoolbox extension (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997). All stimuli were displayed against a light grey background on a 24-in. BenQ XL2411 monitor (1,920 × 1,080 pixels) placed 57 cm from the participant. The memory display consisted of two parallel rectangle bars (11.047° × 1.93°), one black and one grey, oriented horizontally or vertically (second frame in Fig. 1). Each end of the bars contained a square (1.61° × 1.61°), the color of which was selected from nine possible options (white, red, green, blue, yellow, purple, pink, orange, brown). The probe display contained two parallel bars (5.52° × 0.97°) with colored squares (0.81° × 0.81°) at each end and in the same orientation as in the memory display.
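Stimulus sizes are given in degrees of visual angle. A minimal sketch of the conversion to pixels follows. It assumes that the nominal 24-in. diagonal is the visible diagonal of the panel and that pixels are square (the physical screen width is not reported), and that each stimulus is centered on the line of sight. The helper name `deg_to_px` is illustrative.

```python
import math

VIEW_DIST_CM = 57.0        # viewing distance, from the text
RES_X, RES_Y = 1920, 1080  # resolution, from the text
DIAGONAL_IN = 24.0         # nominal diagonal; assumed to be the visible area

# Assumption: derive the physical width from the diagonal and pixel aspect ratio.
SCREEN_W_CM = DIAGONAL_IN * 2.54 * RES_X / math.hypot(RES_X, RES_Y)

def deg_to_px(deg):
    """Pixels spanned by a centered stimulus subtending `deg` degrees."""
    size_cm = 2 * VIEW_DIST_CM * math.tan(math.radians(deg) / 2)
    return size_cm * RES_X / SCREEN_W_CM

for name, deg in [("bar length", 11.047), ("bar width", 1.93), ("square", 1.61)]:
    print(f"{name}: {deg} deg is about {deg_to_px(deg):.0f} px")
```

At a viewing distance of 57 cm, 1 cm on the screen subtends roughly 1° of visual angle, which is why this distance is a common choice.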

To exclude the possibility that participants were merely encoding color-location pairings instead of bound objects, we decreased the size of the probe display by 50% and manipulated the position of objects: there was a 50% probability that an object would be spun around, a 50% probability that the other object would be spun around, and a 50% probability that the objects would swap positions with each other such that each colored square had an equal probability of appearing at the four possible locations. Participants were instructed with regard to these potential changes and were told not to base the same or different judgment on these irrelevant changes. On half of the probe displays, the objects would be correctly updated, and on the other half, there was an equal probability that a target square retained its old color, that a target square was changed to a color that was not used during that trial, that diagonal squares swapped positions, or that the bars swapped colors.
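The claim that each colored square is equally likely to appear at any of the four probe locations can be checked by enumeration. The sketch below assumes that the three 50% events (spin one bar, spin the other bar, swap positions) are independent, and that "spun around" means the two ends of a bar exchange places. Both are readings of the text rather than stated details, and the function name `probe_location` is illustrative.

```python
from collections import Counter
from itertools import product

def probe_location(bar, end, rotate_a, rotate_b, swap):
    """Return the (slot, end) a square occupies in the probe display.

    bar: 0 for bar A, 1 for bar B; end: 0 or 1 within the bar.
    """
    rotated = rotate_a if bar == 0 else rotate_b
    new_end = end ^ int(rotated)   # spinning exchanges the two ends
    new_slot = bar ^ int(swap)     # swapping exchanges the two bar positions
    return new_slot, new_end

# Enumerate all 8 equally likely combinations for one square (bar A, end 0).
counts = Counter(
    probe_location(0, 0, ra, rb, sw)
    for ra, rb, sw in product([False, True], repeat=3)
)
for location, n in sorted(counts.items()):
    print(location, n / 8)
```

Under these assumptions, each of the four locations occurs in 2 of the 8 combinations, giving a probability of 0.25 per location.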

Design and procedure

A summary of the trial sequence is shown in Fig. 1. Each trial began with the presentation of a fixation cross (length 0.81°, width 0.16°) for 500 ms. To minimize verbal encoding and rehearsal, participants were instructed to repeat the word “COLA” upon the onset of the fixation cross and to continue repeating throughout the trial and to stop only once the memory response was completed. Articulatory suppression was monitored remotely by the experimenter. Afterwards, the memory display was presented and remained on-screen until participants indicated with a mouse click that they had fully memorized the objects. This was intended to ensure that participants had an accurate memory of the display. The display was then replaced by a blank screen (500 ms), followed by the updating task. Participants were shown on-screen instructions to update two of the colored squares into two new colors, such as “Change the BLUE item to PURPLE” and “Change the BROWN item to GREEN.” Trials were equally divided between three update conditions: the to-be-selected squares were on the opposite ends of the same rectangle, on adjacent locations of different rectangles, or on diagonal locations of different rectangles. Update times were self-paced.

After participants indicated with a mouse click that they had updated the items mentally, the probe display was presented for 1 s. The presentation time was kept brief to discourage participants from updating during the memory test. Participants responded with a left mouse click if the objects were updated correctly or with a right mouse click if they were updated incorrectly. Participants could start the response upon presentation of the probe display. Upon response, feedback on accuracy was provided by showing “CORRECT” in green or “INCORRECT” in red. The probe display remained on the screen. In the case where the probe was updated incorrectly, we gave additional feedback by outlining the incorrectly updated squares in correct colors or outlining the rectangle bars in red when their colors were swapped.

Participants completed 180 trials divided into seven blocks. The experiment was preceded by 15 practice trials. In addition, to increase motivation, participants received a 10 AED bonus if their accuracy rate was above 75% and a 20 AED bonus if it was above 85%.

Analysis

In order to determine whether it was easier to update the same object, we conducted paired t-tests to compare updating performance (probe accuracy and reaction times (RTs) for the self-paced updating period) for equidistant pairs of targets on the same object or on different objects. Since we had no specific hypotheses for the condition involving the diagonal update, it was left out of the analysis. Differences in probe accuracy are assumed to reflect differences in updating (not differences in encoding) since encoding did not differ between conditions. Similarly, longer update RTs are thought to reflect a more difficult updating transition between the first and the second update instruction. Updating times deviating more than 3 standard deviations from the mean were excluded as outliers, leading to a loss of 1.61% of all trials. Below we include analyses with incorrect trials included. However, excluding incorrect trials did not impact the findings.
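The 3 SD exclusion rule can be sketched as follows. The text does not say whether the mean and standard deviation were computed per participant, per condition, or over all trials, so the function below simply applies the rule to whatever list it is given. The name `exclude_outliers` and the sample update times are hypothetical.

```python
from statistics import mean, stdev

def exclude_outliers(times, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    m, sd = mean(times), stdev(times)
    return [t for t in times if abs(t - m) <= n_sd * sd]

# Hypothetical update times in seconds, with one implausibly long trial.
update_times = [3.5, 4.0, 4.5] * 7 + [40.0]
kept = exclude_outliers(update_times)
print(f"excluded {len(update_times) - len(kept)} of {len(update_times)} trials")
```

Note that the rule only has teeth with enough observations: in very small samples, a single extreme value inflates the standard deviation so much that it cannot exceed the 3 SD criterion.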

Our experiment included a manipulation of bar orientation and/or position between the memory and probe displays to discourage spatial encoding. If participants did not rely on a spatial strategy to perform the task, scrambling the locations of objects at test should not disrupt memory performance. To verify this, we analyzed probe accuracy and decision time with 3 (bar rotation: no change, one bar rotated, both bars rotated) × 2 (position swap of bars: no swap, swap) repeated-measures ANOVAs.

Results and discussion

Mean encoding duration for the initial display was 8.00 s (SD = 4.09). We did not further analyze encoding duration as memory displays did not differ across conditions.

Updating performance

Mean update times are shown in Fig. 2a. Participants were faster in updating target squares on the same (3.80 s) versus different objects (4.58 s), 95% confidence interval for mean difference, CI [0.13, 1.42], t(17) = 2.54, p = .021, d_z = 0.60. In addition, probe accuracy (Fig. 2b) was higher when targets were on the same (77.50%) versus different objects (74.07%), 95% CI [0.03, 6.82], t(17) = 2.13, p = .048, d_z = 0.50. This rules out the possibility that the observed difference in update time was due to a speed-accuracy tradeoff and provides further evidence for a same-object benefit in updating performance. Further, we analyzed the time to make a decision to the probe display to check whether faster updates for the same object were because participants chose to update during the probe display instead of during the instruction display. However, decision times did not differ between same (1.52 s) and different (1.58 s) objects, 95% CI [-0.23, 0.10], t(17) = 0.80, p = .437, d_z = 0.19, suggesting that there were no strategic differences in updating during the probe display.
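For a paired t-test, the effect size d_z relates to the test statistic by d_z = t / sqrt(n), and the confidence interval should be centered on the mean difference. The sketch below uses these identities to cross-check the values reported for the main comparisons, taking n = 18 from the Participants section. The function name `dz_from_t` is illustrative.

```python
import math

def dz_from_t(t, n):
    """Cohen's d_z implied by a paired t statistic with n participants."""
    return t / math.sqrt(n)

N = 18
reported = {               # t values as reported in the text
    "update time": 2.54,
    "probe accuracy": 2.13,
    "decision time": 0.80,
}
for label, t in reported.items():
    print(f"{label}: d_z = {dz_from_t(t, N):.2f}")

# The CI midpoint should match the difference of the condition means.
ci_mid = (0.13 + 1.42) / 2
mean_diff = 4.58 - 3.80
print(f"CI midpoint {ci_mid:.3f} vs mean difference {mean_diff:.2f}")
```

The recomputed effect sizes (0.60, 0.50, and 0.19) agree with the reported values, and the interval for update time is centered on the 0.78 s difference between conditions up to rounding.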

Probe manipulation

We tested whether spatial manipulation of the probe display disrupted performance. Analyses on both probe accuracy and decision time showed no main effects or interactions (all Fs < 3.06, ps > .098). Thus, presenting objects at a different location or in the opposite direction had little impact on performance, consistent with work showing that memory is relatively unimpaired by irrelevant location changes during the probe display (Logie et al., 2011; Treisman & Zhang, 2006; Udale et al., 2018; Woodman et al., 2012; but see Hollingworth, 2007; Jiang et al., 2000).

To assess whether our probe manipulation was critical to observing the object-based benefits, we analyzed updating performance on trials where the probe display preserved the spatial layout of the memory display (no bar rotation or swaps). There were relatively few trials in this baseline condition (meaning the statistical tests had reduced power). There was no effect in update time, t(17) = 1.75, p = .098, CI = [-1.61, 0.15], d_z = 0.41, but the trend was toward faster update times for the same (3.81 s) versus different objects (4.53 s). Accuracy did not differ between same (77.0%) and different objects (72.0%), t(17) = 0.93, p = .365, CI = [-6.37, 16.41], d_z = 0.22.

During visual perception, it is faster to shift attention between locations within the same object than across two objects (Egly et al., 1994), thus demonstrating object-based attention. In line with this, we found that updating two features on the same object is faster and more accurate than updating features across two objects despite an equal spatial separation between conditions. This suggests that selection over perceptual and memory representations operates via similar object-based mechanisms.

One potential limitation of Experiment 1 is that the colored squares within each object structure might not have been represented as an object (but see Xu & Chun, 2007, for evidence of object-based processing), but as a “chunk,” such that object-based benefits could instead reflect more efficient updating within a single chunk (Oberauer & Bialkova, 2009). Given that both chunks and objects involve the integration of multiple elements into a unified representation (e.g., Miller, 1956; Thalmann et al., 2019; Wheeler & Treisman, 2002), we are not confident that the two are separate constructs with independent mechanisms. Rather, the difference may reflect the fact that integration of features into objects is less effortful than chunking as described in standard accounts (Luck & Vogel, 1997). Regardless, to provide a stronger test of object-based effects, Experiment 2 used multi-feature objects to better align with most definitions of an object as a binding of different features.

Experiment 2

Experiment 1 suggested that participants could shift access more efficiently between equidistant colors on the same object than on different objects. Experiment 2 aimed to extend this finding by exploring how attention spreads in a retro-cueing task. On each trial, a pair of Gabors (with color and orientation information) was sequentially presented. A retro-cue then indicated the most relevant object and feature (e.g., color of the first Gabor). Participants reported the color or orientation of a probed item on a continuous response wheel. Of interest is whether attention to a single object feature spreads more towards the uncued feature or the uncued object. The object-based account predicts a benefit for another feature bound to the same object (e.g., Sahan et al., 2020), whereas the feature-based account predicts facilitation for the same feature dimension in another object (Niklaus et al., 2017). Finding a benefit for distinct features within the same object would provide converging evidence for object-based effects in VWM and highlight that such effects are stronger than any putative feature-based mechanisms.