Exogenous object-centered attention
Keywords: Jump Condition, Abrupt Onset, Color Singleton, Exogenous Attention, Gabor Patch
It is well-known that the sudden appearance of an object in a scene may capture attention independently of the observer's goals and beliefs (Theeuwes, 1994, 2010). Even when observers have a top-down set to look for a color singleton, an abrupt onset will summon attention, slowing down search for the target color singleton (Schreij, Owens, & Theeuwes, 2008; Theeuwes, 1994). In their seminal study, Posner and Cohen (1984) were the first to demonstrate this type of exogenous attention in a paradigm in which one of two peripheral placeholders was cued by brightening, followed by the presentation of a target inside one of these placeholders. Participants were faster in detecting the target when it appeared at the cued, relative to the uncued, position. Crucially, Posner and Cohen (Posner & Cohen, 1984; Posner, 1980) showed that this exogenous attentional facilitation is coded in retinotopic coordinates.
It has been suggested that abrupt onsets are effective in capturing attention because they strongly activate the transient channel, also referred to as the magnocellular pathway (e.g., Breitmeyer & Ganz, 1976; Mathôt & Theeuwes, 2012; Theeuwes, 1995; Yantis & Jonides, 1984). This pathway is basically color blind and sensitive to luminance transients and motion (Theeuwes, 1995; Todd & Van Gelder, 1979). Even though this pathway also provides input to the ventral stream, it is the dominant feedforward interrupt signal to the dorsal “where” pathway (Ungerleider & Haxby, 1994; Ungerleider & Mishkin, 1982). Since the dorsal pathway is basically retinotopically organized (Golomb & Kanwisher, 2012), it is not surprising that the frame of reference of exogenous attention is commonly believed to be retinotopic; that is, attention is allocated to the location on the retina where the abrupt onset is projected.
However, there is growing evidence that spatial cuing effects may not exclusively operate on the basis of retinotopic coordinates. One key finding concerns the related phenomenon known as inhibition of return, or IOR (Posner & Cohen, 1984), where a delay in target presentation results in slower detection times for cued than for uncued targets. Tipper, Driver, and Weaver (1991; see also Tipper, Weaver, Jerreat, & Burak, 1994) used the original paradigm of Posner and Cohen (1984) in which one of two peripheral squares was briefly brightened. Yet unlike in the original task, following the cue presentation, both squares rotated around a central square. Crucially, subsequent target detection was slowed when the target appeared at the location of the cued square, even though it had moved to a new location. This study was the first to provide evidence that IOR can be object based. This effect has been replicated in various subsequent studies (see Reppa, Schmidt, & Leek, 2012, for a recent review).
Furthermore, a recent study investigating the dynamics of attention in the interval surrounding an eye movement demonstrated that immediately after an eye movement, spatial attention has both a retinotopic (eye-centered) and a spatiotopic (world-centered) component (Mathôt & Theeuwes, 2010a; see also Golomb, Chun, & Mazer, 2008). Mathôt and Theeuwes (2010a) combined an exogenous cuing task with an eye movement task. Just before making an eye movement, a brief nonpredictive onset cue was flashed midway between the initial fixation point and the saccade goal, a few degrees above or below the required saccade trajectory. After executing the eye movement, a tilted bar was presented at the retinotopic or spatiotopic location of the cue. The results showed attentional cuing benefits for both retinotopic and spatiotopic locations. Using a similar paradigm, Mathôt and Theeuwes (2010b) investigated the locus of IOR and, similarly, found both a spatiotopic (predominantly at long postsaccadic intervals) and a retinotopic (predominantly at short postsaccadic intervals) component.
Paradigms that investigate object-based attention also typically use exogenous cues. For example, in Egly, Driver, and Rafal (1994), observers viewed displays consisting of two adjacent vertically or horizontally oriented rectangles. Then an abrupt onset cue was presented at one end of one of the rectangles. Because this abrupt onset cue automatically summoned attention to the end of the rectangle, observers were fast in detecting a target that appeared at that location. More interesting, however, they also found that the cue facilitated detection of targets that appeared anywhere within the cued object, as compared with targets that were equally far away from the cue but not within the same object (i.e., there was a within-object benefit). The prevailing view to explain this effect is that once a part of an object is attended, attention automatically “spreads” within the boundaries of the object (e.g., Vecera, 1994).
Overall, these findings point to a possible role for object-centered attention in exogenous spatial cuing (see also Boi, Vergeer, Ogmen, & Herzog, 2011). The present study was designed to determine whether the classic Posner exogenous cuing effect can operate in nonretinotopic coordinates. According to the classic notion, exogenous cuing effects should be found only at the location where the abrupt onset (the exogenous cue) is projected on the retina, since the transient channels are basically retinotopically organized.
Nineteen observers participated. For α = .05 and an effect size of d = 0.8, this experiment has a power of .91. Figure 1 provides an overview of the trial structure. Each trial started with a bright (90 cd/m²) fixation cross on a gray (45 cd/m²) background for 500 ms, followed by a dark (12 cd/m²) centrally positioned horizontal bar (21.3° × 4.3°). After a variable duration (μ = 1,500 ms, σ = 250 ms), an onset cue was briefly presented (x̄ = 38 ms, s = 18 ms) at one end of the bar. The cue was a uniform bright (90 cd/m²) patch with a Gaussian envelope (σ = 0.36°). Immediately following the offset of the cue, the bar rotated, jumped, or remained static, depending on the experimental condition. In the move condition, the bar rotated smoothly by 90° (clockwise or counterclockwise) to a vertical orientation. This was the crucial condition, designed to investigate object-centered cuing. We also included another control condition (the so-called jump condition) in which the horizontal bar jumped suddenly by 90° to a vertical position. In this condition, there was no smooth movement, which implies that the movement direction was ambiguous. Clearly, in this condition, the cued position on the horizontal bar was not associated with one of the positions on the vertical bar, and therefore, a cuing effect could not occur. We included this condition just to ensure that it is the actual movement of the object in the move condition that drives the object-centered cuing effect. The labels valid and invalid in the jump condition were, in fact, the same as those used in the move condition (i.e., same coding) even though they had no real meaning. In the wiggle condition, the bar rotated smoothly by 45° in one direction (clockwise or counterclockwise), after which it smoothly rotated back to its original orientation. This condition allowed us to investigate the effect of movement, without any net displacement of the bar. In the static condition, the bar did not move at all. This condition served to replicate the conventional cuing effect. The movement/jump/static interval was 109 ms (s = 18 ms). In the jump condition, the jump occurred halfway through this interval. Finally, a target and a distractor stimulus were briefly presented (x̄ = 38 ms, s = 18 ms). These were Gabor patches with a Gaussian envelope (σ = 0.36°) and a sinusoidal luminance modulation (90 cd/m² to <1 cd/m²; v = 2.2 cycles/°). The target was tilted 45° (clockwise or counterclockwise) from a vertical orientation. The distractor was oriented vertically. On validly cued trials, the target was presented at the same location within the object as the cue. On invalidly cued trials, the target was presented opposite from the cued location within the object.
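The Gabor stimuli described above can be written down as a luminance function. The following is a minimal sketch, not the authors' stimulus code: it assumes a cosine-phase carrier, a mean luminance of 45 cd/m² (equal to the background) and an amplitude of 45 cd/m² (so that the peak reaches 90 cd/m²), and the function and parameter names are hypothetical. Which sign of the tilt corresponds to clockwise depends on the screen coordinate convention.

```python
import math

def gabor_luminance(x_deg, y_deg, tilt_deg=0.0,
                    sigma_deg=0.36, freq_cpd=2.2,
                    mean_lum=45.0, amplitude=45.0):
    """Luminance (cd/m^2) at (x_deg, y_deg), measured from the patch center.

    sigma_deg and freq_cpd are taken from the text; mean_lum, amplitude and
    the cosine phase are assumptions made for this illustration.
    tilt_deg = 0 gives a vertically oriented grating (the distractor);
    tilt_deg = +45 or -45 gives a tilted grating (the target).
    """
    theta = math.radians(tilt_deg)
    # Position along the axis of luminance modulation (horizontal for tilt 0).
    u = x_deg * math.cos(theta) + y_deg * math.sin(theta)
    # Gaussian envelope that fades the grating into the background.
    envelope = math.exp(-(x_deg ** 2 + y_deg ** 2) / (2.0 * sigma_deg ** 2))
    # Sinusoidal carrier with freq_cpd cycles per degree.
    carrier = math.cos(2.0 * math.pi * freq_cpd * u)
    return mean_lum + amplitude * envelope * carrier

center = gabor_luminance(0.0, 0.0)       # peak of the central bright stripe
far_away = gabor_luminance(5.0, 5.0)     # envelope has decayed to background
print(center, far_away)
```

Sampling this function on a pixel grid (after converting pixels to degrees of visual angle for the monitor and viewing distance in use) yields the image of one patch.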
Participants reported the orientation of the target by pressing the “z” key on a computer keyboard if the target was tilted counterclockwise and the “/” key if the target was tilted clockwise. After each response, participants received feedback through a briefly presented colored fixation dot (500 ms; green on correct, red on incorrect). The experiment consisted of 64 practice trials, followed by 256 experimental trials across four blocks. The location of the cue (left/right) and the target (left/right or up/down) and the condition (move/jump/wiggle/static) were mixed within blocks. Stimuli were presented using OpenSesame (Mathôt, Schreij, & Theeuwes, 2012) on a 19-in. CRT monitor (1,024 × 768 pixels; 120 Hz). A movie of the experimental paradigm of Experiment 1 is available (see the on-line supplementary material).
Three participants were excluded from analysis due to low accuracy (more than 4 standard deviations [SDs] below the mean of the other participants). All trials where the response time (RT) was more than 2.5 SDs below or above the mean RT (per participant) were discarded (2.4 %). Mean correct RT was 581 ms. Mean accuracy was 93 %.
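The per-participant RT trimming rule can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis script: it assumes the mean and the sample SD are computed over all of a participant's trials (the text does not say whether trimming was done per condition), and the function names and the example RTs are made up.

```python
from statistics import mean, stdev

def trim_rts(rts, criterion=2.5):
    """Keep RTs within `criterion` sample SDs of this participant's mean RT."""
    m, sd = mean(rts), stdev(rts)
    lower, upper = m - criterion * sd, m + criterion * sd
    return [rt for rt in rts if lower <= rt <= upper]

def trim_all(rts_by_participant, criterion=2.5):
    """Apply the trimming rule separately to each participant's RTs."""
    return {pid: trim_rts(rts, criterion)
            for pid, rts in rts_by_participant.items()}

# Made-up RTs (ms) for one hypothetical participant; not data from the study.
example = {"p01": [550, 560, 570, 580, 590, 600, 610, 620, 630, 640,
                   580, 590, 600, 610, 1500]}
trimmed = trim_all(example)
n_discarded = len(example["p01"]) - len(trimmed["p01"])
print(n_discarded, "trial(s) discarded")
```

Because the bounds are computed per participant, a slow but consistent observer is not penalized relative to a fast one.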
An analysis of variance (ANOVA) with condition (move, jump, wiggle, static) and cue validity (valid, invalid) as within-subjects factors and mean correct RT as the dependent variable revealed main effects of condition, F(3, 15) = 11.5, p < .001, and validity, F(1, 15) = 11.0, p < .01, and a condition × validity interaction, F(3, 15) = 6.5, p < .001. Two-tailed paired-samples t-tests revealed an effect of cue validity in all conditions (all ps < .05), except the jump condition. An additional analysis showed that responses in the jump condition were, overall, significantly faster than in the move, t(15) = 7.19, p < .01, and the wiggle, t(15) = 3.88, p < .01, conditions, an effect that is likely due to the fact that this is the only condition in which the bar was presented as an abrupt onset. It is well known that an abrupt onset may capture attention, thus speeding up responses (e.g., Theeuwes, 1991). To directly compare the static (i.e., retinotopic) cuing effect with the object-centered cuing effect, we performed an additional ANOVA with condition (move, static) and cue validity as factors. There were main effects of validity, F(1, 15) = 12.7, p < .01, and condition, F(1, 15) = 10.64, p < .01. However, the interaction was not reliable, F(1, 15) = 1.81, n.s., suggesting that the cuing effect in the classic (static) retinotopic condition was about as strong as in the object-centered condition. The results are shown in Fig. 2.
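The per-condition validity tests reported above amount to a paired-samples t-test on each participant's cuing effect (invalid minus valid mean RT). A minimal sketch of that computation follows; the function names are hypothetical and the four pairs of RTs are made-up numbers for illustration, not data from the study (with the 16 participants analyzed here, the test would have 15 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

def cuing_effects(valid_rts, invalid_rts):
    """Per-participant cuing effect in ms: invalid minus valid mean RT."""
    return [inv - val for val, inv in zip(valid_rts, invalid_rts)]

def paired_t(valid_rts, invalid_rts):
    """Return (t, df) for a paired-samples t-test on the cuing effect."""
    diffs = cuing_effects(valid_rts, invalid_rts)
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Made-up mean RTs (ms) for four hypothetical participants in one condition.
valid = [500, 520, 540, 560]
invalid = [520, 530, 570, 580]
t_value, df = paired_t(valid, invalid)
print(f"mean cuing effect = {mean(cuing_effects(valid, invalid)):.1f} ms, "
      f"t({df}) = {t_value:.2f}")
```

A positive t value indicates faster responses on validly cued trials; running the same function once per condition (move, jump, wiggle, static) mirrors the structure of the follow-up tests.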
The overall ANOVA was also performed with accuracy as the dependent variable. There was only an effect of condition, F(3, 15) = 10.3, p < .001, such that accuracy was higher in the move and jump conditions than in the static and wiggle conditions (i.e., accuracy increased when the bar turned).
In three of the four conditions, we found a cuing effect such that RTs were shorter for validly cued than for invalidly cued targets. Only the jump condition did not show a cuing effect, which was fully expected since, in this condition, the movement direction is ambiguous. The static condition represents the classic condition in which the exogenous cuing effect can be explained in terms of retinotopy. The wiggle condition allowed us to isolate the effect of the movement of the object, since the bar rotated smoothly but, ultimately, moved back to the original retinotopic location. The critical move condition, in which the horizontal bar rotates to a vertical position, also shows a clear cuing effect, suggesting that exogenous attention does not necessarily operate in retinotopic coordinates but, instead, can move along with a rotating object. A direct statistical comparison between the static and move conditions indicated that the object-centered cuing effect was not significantly weaker than the classic retinotopic cuing effect.
The method was similar to that of Experiment 1, except for the following differences. Eighteen observers participated in the experiment. For α = .05 and d = 0.8, this experiment has a power of .89. Instead of a bar, a cross was presented (dimensions of the arms: 6.3° × 4.3°). Opposing arms of the cross had an outline of the same color (pinkish and greenish, chosen to be equiluminant and opposite in color space), giving the appearance of two crossed bars (see Fig. 3). The onset cue was presented for 50 ms at the end of one of the arms. Following the offset of the cue, the cross rotated by 90°, either clockwise or counterclockwise, in 124 ms. Following the rotation, the target and three distractors were presented for 58 ms at the end of the arms. The location of the target relative to the cue was the main independent variable (see Fig. 3b). The experiment consisted of 20 practice trials, followed by 160 experimental trials across eight blocks. A movie of the experimental paradigm of Experiment 2 is available (see the on-line supplementary material).
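The power figures quoted for the two experiments can be approximated with a short calculation. The text does not state which test the power analysis assumed; the sketch below assumes a two-tailed paired-samples (one-sample) t-test, for which power follows from the noncentral t distribution with noncentrality d·√n. The helper names are hypothetical, and the noncentral t distribution is evaluated by simple numerical integration using only the standard library. If that assumption matches the authors' calculation, the results should come out close to the reported values for n = 19 and n = 18.

```python
import math

def _norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def nct_cdf(t, df, delta=0.0, steps=4000):
    """P(T <= t) for a noncentral t variable with noncentrality `delta`.

    Uses T = (Z + delta) / sqrt(V / df) with V ~ chi-square(df), and
    integrates over V with the midpoint rule (adequate for df >= 3).
    """
    upper = df + 12.0 * math.sqrt(2.0 * df)  # chi-square mass beyond is negligible
    width = upper / steps
    log_norm = (df / 2.0) * math.log(2.0) + math.lgamma(df / 2.0)
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * width
        log_pdf = (df / 2.0 - 1.0) * math.log(v) - v / 2.0 - log_norm
        total += _norm_cdf(t * math.sqrt(v / df) - delta) * math.exp(log_pdf)
    return total * width

def t_critical(df, alpha=0.05):
    """Two-tailed critical t value, found by bisection on the central t CDF."""
    lo, hi = 0.0, 50.0
    target = 1.0 - alpha / 2.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if nct_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def paired_t_power(n, d, alpha=0.05):
    """Power of a two-tailed paired t-test with n participants and effect size d."""
    df = n - 1
    delta = d * math.sqrt(n)  # noncentrality parameter
    crit = t_critical(df, alpha)
    return (1.0 - nct_cdf(crit, df, delta)) + nct_cdf(-crit, df, delta)

for n in (19, 18):  # sample sizes of Experiments 1 and 2
    print(n, round(paired_t_power(n, 0.8), 3))
```

A dedicated statistics package would normally be used for this; the hand-rolled integration is only meant to make the dependence of power on n and d explicit.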
The same filtering criteria as those used in Experiment 1 led to the exclusion of 1 participant and 3.3 % of correct trials. Mean correct RT was 735 ms. Mean accuracy was 93 %.
Experiment 2 shows equally strong cuing effects for the object-centered and retinotopic reference frames. The object-centered cuing condition is basically a replication of the move condition of Experiment 1. It is quite remarkable that both object-centered and retinotopic cuing effects are simultaneously present and are equally strong. It suggests that both object-centered and retinotopic representations coexist at least immediately following the object movement, reminiscent of the dual retinotopy and spatiotopy that is observed immediately after an eye movement (Golomb et al., 2008; Mathôt & Theeuwes, 2010a; Mathôt & Theeuwes, 2011).
The present article shows that classic exogenous spatial cuing not only operates in retinotopic coordinates, but also can move along with a rotating object. Our Experiment 2 shows that both the retinotopic and object-centered reference frames are simultaneously present and accessible. These findings suggest that the notion that exogenous attention is rigid and closely tied to retinotopy should be revised.
A recent study by Boi et al. (2011) has also cast some doubt on the strict retinotopy of exogenous attention. Boi et al. used a cuing paradigm in which the exogenous cue (an abrupt onset) was followed by a variant of the Ternus–Pikler display in which three squares appeared to move laterally in tandem as a group. The cue was presented in the central square of the first frame. Then, in the second frame, participants searched for a target that could appear at the retinotopically cued, nonretinotopically (i.e., object-centered) cued, or invalid location. The results indicated attentional facilitation at both retinotopic and nonretinotopic locations. Boi et al. concluded that exogenous cuing can occur in a coordinate system that moves according to perceptual grouping relations present in the display. Even though this conclusion about exogenous attention seems reasonable given that abrupt onsets were used as cues, it may, in fact, be less well supported, given that in four out of five experiments the cue was predictive of where the target was going to be presented (either 100 % in Experiments 1, 2, and 5 or 80 % in Experiment 3). When a cue is predictive, one cannot speak about exogenous attention (Yantis & Egeth, 1999), since observers may use the cue to endogenously direct their attention to the likely target position. There is only one experiment (Experiment 4) in which the exogenous cue did not predict the location of the target, but all participants in this experiment had also participated in Experiment 1 (with a 100 % predictive cue) and may have learned that the abrupt onset predicts the location of the impending target. Given these methodological concerns, the conclusions regarding nonretinotopic exogenous cuing in perceptual grouping may not be as convincing as the Boi et al. study suggests. The present study, however, does not suffer from these shortcomings, since the cues were nonpredictive in both experiments.
Our results are also different from those in Boi et al. in that our retinotopic and object-centered cuing effects were about equal in size (our Experiment 2), while in Boi et al., object-centered cuing was significantly larger than retinotopic cuing (their Experiment 3). Again, the fact that the cues in this experiment were predictive (80 %) suggests that a stronger bias toward object-centered orienting may be related to the endogenous nature of the cues used.
The observation of a coexisting retinotopic and object-centered representation is consistent with studies that have shown the coexistence of space-based and object-based IOR (e.g., Tipper, Jordan, & Weaver, 1999). Since IOR follows the exogenous capture of attention, it may not be surprising that there is IOR at the retinotopic (originally stimulated) location. However, since IOR often is considered to be a foraging facilitator, in order to be effective, it has to be tightly connected to the object representation.
The coexistence of retinotopic and object-centered representations is consistent with the idea that there are two separate attentional systems: one for visual object processing and one for spatial processing. The neuroanatomical basis for dissociable systems is well established (Haxby et al., 1991; Ungerleider & Mishkin, 1982). The posterior parietal cortex is mainly concerned with spatial processing (dorsal stream), while the inferior temporal cortex is concerned with object processing (ventral stream). Also, studies involving patients with chronic visual neglect (typically as a result of right-hemispheric damage) have shown that some patients who cannot process information on the left side of a scene may process objects displayed on the extinguished left side but then omit the left half of objects presented across the scene (Driver & Halligan, 1991), suggesting two distinct attentional systems. Brain-imaging studies have suggested separate brain areas: one involved in the control of spatial attention (the superior parietal lobule) and another involved in the control of object-centered attention (intraparietal sulcus and frontal areas) (for a review, see Yantis & Serences, 2003). The present findings are consistent with the notion of distinct spatial and object-centered attentional systems that, at any moment, can coexist.
- Cousineau, D. (2005). Confidence intervals in within-subject designs: A simpler solution to Loftus and Masson’s method. Tutorials in Quantitative Methods for Psychology, 1, 42–45.
- Golomb, J. D., & Kanwisher, N. (2012). Higher-level visual cortex represents retinotopic, not spatiotopic, object location. Cerebral Cortex, 10, 1093.
- Mathôt, S., & Theeuwes, J. (2012). It's all about the transient: Intra-saccadic onset stimuli do not capture attention. Journal of Eye Movement Research, 5(2), e4.
- Posner, M. I., & Cohen, Y. (1984). Components of visual orienting. In H. Bouma & D. G. Bouwhuis (Eds.), Attention and performance X: Control of language processes (pp. 531–556). Hillsdale: Erlbaum.
- Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge: MIT Press.
- Yantis, S., & Egeth, H. E. (1999). On the distinction between visual salience and stimulus-driven attentional capture. Journal of Experimental Psychology: Human Perception and Performance, 25, 661–676.