Introduction

When viewing an object in a visual scene, an observer may attend to different properties of this object, such as its identity or location. A number of cognitive theories of visual attention assume that properties such as identity and location are processed in different feature maps that are located in different brain areas (for a review, see Shipp, 2004). The information coming from the different feature maps is subsequently combined into a signal that can be used to trigger an eye movement, or saccade, towards a certain location. This is particularly important if the observer’s task is to locate a target stimulus that is presented in the visual periphery, so that by making a saccade the fovea will be aligned with the stimulus and detailed location information can be extracted. As the initiation of a saccade takes place at least 100 ms after the onset of a target stimulus (e.g., Carpenter, 2004; Van Loon & Adam, 2006), it is noteworthy that localization performance increases during the period when no saccade is being initiated (Adam, Ketelaars, Kingma, & Hoek, 1993). This finding suggests that pre-saccadic processing is sufficient for coarse localization.

In this paper, we have observers extract multiple types of information from a visual scene. Specifically, we focus on the impact of identifying a number of digits at fixation on the localization performance of a peripheral target. We will first outline the two-process model of object localization proposed and investigated by Adam et al. (1993), Adam, Paas, Ekering, and van Loon (1995), and Adam, Huys, van Loon, Kingma, and Paas (2000) (see also Uddin, Ninose, & Nakamizo, 2004). We then present an experiment, using a dual-task method, that supports the assertion that visual attention is critical in localizing objects within the first 100 ms.

Adam et al. (1993) investigated the time course of visual object localization using a task in which participants had to locate a single target stimulus presented in one square of an imaginary 25 × 19 grid that contained 474 possible stimulus locations. They varied the presentation duration of the (masked) stimulus between 33 and 300 ms. Participants used the cursor to indicate the perceived target location. Results showed an initial steep rise in localization accuracy during the first 50 ms of stimulus duration, followed by a further but more gradual improvement from 100 ms onwards, finishing with near-perfect performance at about 300 ms.

Adam et al. (1993) interpreted these findings within a two-process model of visual object localization. In this model, a fast attentional process provides coarse localization information and precedes a slower saccadic system that provides more detailed information by aligning the fovea with the target. In support of the role of the saccadic system, Adam et al. (1993) showed that the further improvement in localization after 100 ms is absent when participants are instructed to abstain from making saccades. In addition, when saccades were allowed, eye movement analyses indicated that participants nearly always made a saccade (i.e., in 98.4% of all trials), but the saccadic onset latency was never less than 100 ms, indicating that the initial steep rise in localization performance during the first 50 ms of stimulus duration cannot be attributed to the saccadic system. Together, these results strongly suggest that the execution of saccades underlies the gradual improvement in localization performance after 100 ms.

In support of the view that the attentional system underlies the improvement in performance for the first 50 ms, Adam et al. (1993) cited the results of spatial cuing studies, showing that the largest gains in precuing typically occur within the first 50 ms; this provides an estimate of the time necessary to shift attention (Eriksen, 1990). Similarly, visual search experiments have demonstrated scanning rates, i.e., shifts of attention, in the order of 50 ms/item (e.g., Bergen & Julesz, 1983; Treisman & Gelade, 1980; but see e.g. Ward & Duncan, 1996, for much longer estimates). In addition, Adam et al. (2000) showed that advance knowledge about the possible location(s) of the target improves localization performance. In particular, they showed that localization performance improved with short duration (i.e., 71 ms) spatial precues, which accords with the notion that the spatial precue quickly directs spatial attention to the target area and thus mediates localization performance. Furthermore, localization performance for stimulus durations of less than 100 ms is greatly improved when the target stimulus is not backward masked (Adam et al., 1995). Assuming that the masking stimulus disrupts localization performance by involuntarily capturing attention (e.g., Yantis & Jonides, 1984), this finding too suggests that attention is involved in localizing stimuli.

So far, the role of visual attention in object localization is supported by experimental manipulations of events before (Adam et al., 2000) and after (Adam et al., 1995) the target stimulus. In this study, we sought to provide additional, converging evidence for the role of the attentional system in localization performance by examining the effect of a central to-be-identified distractor stimulus on localization performance of a simultaneously presented peripheral target stimulus. Thus, participants were facing a dual-task situation. Generally, in dual-task situations, interference occurs when both tasks need the same mechanism (e.g., Pashler & Johnston, 1998). Furthermore, it is well established that visual identification requires the operation of selective visual attention (e.g., Heinke & Humphreys, 2003; Kawahara, Di Lollo, & Enns, 2001). Hence, if localization needs attention too, then it should be vulnerable to the requirement to first identify the central distractor. If, on the other hand, localization is attention-independent, then it should not be sensitive to the requirement to first identify the central distractor.

We hypothesized that if localization depends on the allocation of visual attention, allocating attention to the distractor for identification should delay its availability for localization of the target, and thus localization performance should suffer. However, once the distractor has been identified, attention is free to move to the target for localization, and the interference effect should diminish or disappear. We also varied the number of to-be-identified distractors. We hypothesized that increasing the number of distractors should lead to longer time periods during which attention is engaged by the distractors for their identification and thus not available for localization of the target. Therefore, we expected greater and longer lasting localization performance decrements with increasing distractor load.

Method

Participants and design

Thirty participants were randomly assigned to one of three conditions in a 2 × 8 × 3 mixed factorial design, crossing the within-subject factors task (single vs. dual) and target stimulus duration (29, 57, 86, 114, 143, 200, 300, or 400 ms) and the between-subject factor distractor load (1, 2, or 3 digits). Thus, there were three groups of 10 participants each (1-distractor group, 2-distractor group, and 3-distractor group), with each group performing the same single-task condition (localization of the peripheral target stimulus with eight target duration conditions) but a different dual-task condition (either 1, 2, or 3 to-be-identified distractor items).

Procedure

All participants performed a visual localization task in two conditions (single- and dual-task). In the single-task condition, participants localized a peripherally presented target stimulus. In the dual-task condition, participants performed two perceptual tasks: identification and localization. Thus, in the dual-task condition there were two simultaneously presented visual stimuli: a distractor stimulus (containing 1, 2, or 3 digits) presented at fixation, which had to be identified; and a target stimulus (a single “*” sign) presented peripherally to the left or right of fixation, which had to be localized. After a variable delay of between 29 and 400 ms, the target stimulus (but not the distractor stimulus) was followed by a backward masking stimulus that eliminated its visibility.

At the beginning of each trial in the single-task condition (localization only), a fixation sign (‘+’) appeared in the center of the screen (Fig. 1). After 1,000 ms the target stimulus (‘*’) was presented at one of 50 possible stimulus locations on an imaginary horizontal row on either side of the fixation sign (25 stimulus locations to the left and 25 to the right). After a variable delay (target-mask onset delay) a masking stimulus was presented to control the visibility of the target stimulus. The masking stimulus consisted of two horizontal strings of 29 ‘*’ signs each, covering all possible target locations on the left and right of fixation (plus four extra, non-target positions). The masking stimulus remained present throughout the remainder of the trial. Eight target-mask onset asynchronies or target durations were employed: 29, 57, 86, 114, 143, 200, 300, and 400 ms. The participants’ task was to indicate the location of the target stimulus as accurately as possible by moving the cursor (a rectangle) from the fixation sign to the observed position of the target stimulus. The cursor was moved by pressing the cursor keys on the keyboard with the index and ring fingers of the right hand. When participants reached the perceived target location they confirmed their response by pressing the space bar with their left hand. No feedback was provided. An intertrial interval of 1.5 s separated the final response in one trial from the start of the next trial. Participants were instructed to fixate their eyes on the fixation sign at the beginning of each trial, but were told that they were free to make eye movements toward the target.

Fig. 1

Schematic representation of the visual stimuli (upper part) and trial sequence (lower part) in the single- and dual-task conditions. Note that the key difference between single- and dual-task conditions is the appearance of the central distractor stimulus in the dual-task condition, containing 1, 2, or 3 to-be-identified digits (for the 1-, 2-, and 3-distractor groups, respectively). In the single-task condition there is no distractor stimulus; here the neutral fixation sign (+) continues to be visible for the same duration as the distractor stimulus (i.e., 29 ms)

In the dual-task condition there was, in addition to the target stimulus, a briefly (i.e., 29 ms) presented distractor stimulus which had to be identified. Depending on the distractor load, the distractor stimulus contained 1, 2, or 3 digits, which were simultaneously presented and always different from each other. The experimenter wrote down the participant’s verbal response, which was later entered in the computer and analyzed for correctness. The digits were randomly drawn from the set 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 on condition that each number appeared 5 times in a set of 50 test trials.

Participants completed two sessions on separate days. In each session, participants received a series of 50 experimental trials (1 for each of the 50 possible target locations) at each of the eight stimulus durations under either the single-task or the dual-task condition. Order of task condition between days was counterbalanced. Order of target duration within a day was random. Each series of 50 test trials was preceded by 10 practice trials. In order to assess the effect of retinal eccentricity on localization performance, we established 5 global distances between fixation point and target location by subdividing the 25 possible target locations on each side of fixation into 5 groups of 5 adjacent target locations each. These five distance groups (1, 2, 3, 4, 5) had increasing distances from fixation, namely 1.72, 3.15, 4.58, 6.02, and 7.45° of visual angle, respectively (these values represent the center position within each distance group). The target stimulus, distractor stimulus, and the fixation sign each subtended a visual angle of about 0.23 × 0.29°. The distance between two adjacent target positions was 0.29° of visual angle. Viewing distance was about 60 cm.
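To relate the visual angles above to distances on the screen, the following minimal sketch converts between the two at the stated viewing distance of about 60 cm. The function names are illustrative, and the exact-tangent formula is an assumption: the text does not say which conversion was used to obtain the reported angles.

```python
import math

# "About 60 cm" in the text; treated here as exactly 600 mm (assumption).
VIEWING_DISTANCE_MM = 600.0


def angle_to_size_mm(angle_deg, distance_mm=VIEWING_DISTANCE_MM):
    """On-screen distance (mm) from fixation for an eccentricity in degrees."""
    return distance_mm * math.tan(math.radians(angle_deg))


def size_to_angle_deg(size_mm, distance_mm=VIEWING_DISTANCE_MM):
    """Eccentricity (degrees) for an on-screen distance from fixation in mm."""
    return math.degrees(math.atan(size_mm / distance_mm))


# Spacing between two adjacent target positions (0.29 degrees in the text).
spacing_mm = angle_to_size_mm(0.29)
print(round(spacing_mm, 2))
```

Under these assumptions, adjacent target positions are roughly 3 mm apart, which gives a scale for the localization errors reported in millimetres below.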

Results and discussion

Performance in the identification task was near-perfect. All but one of the participants correctly identified all the elements of the distractor stimulus in at least 95% of the trials. One participant (in the 1-distractor group) made too many identification errors (i.e., on 29% of the trials); the data of this participant were excluded from all further analyses. The mean percentage of correct identifications for the remaining 29 participants was 98.1% (1-distractor group = 98.6%; 2-distractor group = 99.0%; 3-distractor group = 96.6%). Estimates of localization performance were based solely on those trials in which the distractor stimulus had been identified correctly. Hence, we can assume that in the dual-task condition visual attention was first applied to the central distractor stimulus for identification before it was allocated to the target stimulus for localization.

We first analyzed the effect of target duration and target distance on localization performance in the single-task condition. This initial analysis forms the baseline to which the effect of distractor and number of distractors can be compared.

Single-task performance

Localization performance was quantified by calculating for each participant mean localization error, defined as the (horizontal) absolute distance between the target location and the response location, as a function of target duration and target distance. Figure 2 shows mean localization error for all 29 participants in the single-task condition (localization only) as a function of target duration (i.e., target-mask onset delay) and target distance. A two-way (target duration × target distance) repeated-measures analysis of variance (ANOVA) indicated that localization performance increased (i.e., localization error decreased) significantly as target duration increased and target distance decreased (F(7, 196) = 96.59, p < 0.001 and F(4, 112) = 50.94, p < 0.001, respectively). In Fig. 2c, it can be seen that with short durations, nearby targets were better localized than distant targets, but that this effect of distance diminishes, and eventually disappears, with longer target durations. This was supported by a significant interaction between target distance and target duration (F(28, 784) = 6.10, p < 0.001).
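The error measure defined above can be sketched as follows. The trial tuples, field order, and example values are hypothetical and serve only to illustrate how absolute error is averaged per cell of the target duration × target distance design; this is not the authors' analysis code.

```python
from collections import defaultdict
from statistics import mean


def mean_absolute_error(trials):
    """Mean |response - target| per (duration, distance group) cell.

    trials: iterable of (duration_ms, distance_group, target_mm, response_mm),
    with positions measured along the horizontal axis (hypothetical layout).
    """
    cells = defaultdict(list)
    for duration_ms, distance_group, target_mm, response_mm in trials:
        cells[(duration_ms, distance_group)].append(abs(response_mm - target_mm))
    return {cell: mean(errors) for cell, errors in cells.items()}


# Made-up trials for one participant (not data from the study).
example_trials = [
    (29, 1, 12.0, 15.0),   # 3 mm error
    (29, 1, 15.0, 13.0),   # 2 mm error
    (400, 1, 12.0, 12.0),  # perfect localization
]
cell_means = mean_absolute_error(example_trials)
print(cell_means)
```

Per-participant cell means of this kind would then be the input to the repeated-measures ANOVA.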

Fig. 2

Mean localization error in the single-task condition for all participants (that is, averaged over all three distractor groups; N = 29) as a function of (a) target-mask onset delay; (b) target distance; and (c) both target-mask onset delay and target distance. Error bars show standard errors

In previous work, we found that the localization performance function exhibited a steep rise in accuracy (or decrease in error) for durations less than 60 ms (Adam et al., 1993), whereas in this study localization performance in this range is constant (i.e., performance at 29 ms is similar to that at 57 ms). The reason for this is unclear but it could be related to the fact that in the present study the targets appeared on a horizontal axis, whereas in the previous work the targets appeared in a two-dimensional grid that moreover contained many more possible target locations, that is, 474 vs. 50 possible positions. Thus, positional uncertainty was much greater in our previous than in the present work, and, moreover, distributed across two dimensions. These procedural differences allow the possibility that the spatial and temporal dynamics of attention allocation in the two paradigms may have differed, with faster allocation in the present, less complex paradigm. Consistent with this idea, Tse, Sheinberg and Logothetis (2003) reported poorer change detection performance along the vertical than the horizontal axis, a finding they attributed to the relatively poor spatial resolution of attention along the vertical dimension (e.g., He, Cavanagh, & Intriligator, 1996). Furthermore, Pellizzer and Hedges (2003, 2004) reported that reaction time of reaching responses towards visual targets increased with positional uncertainty. This finding conformed to the predictions of a capacity-sharing model, which assumes that the processing resources for motor localization are limited, and that they are distributed as a function of the spatial distribution of the possible target locations. Thus, when the target appears, the processing resources must be reallocated to this new location. This reallocation is performed through an adjustment of the dispersion and location of the processing resources, and this affects reaction time. 
By these arguments, the allocation of attention to the target stimulus was easier to implement in the present uni-dimensional task, which contained relatively few possible target positions, than in the previous two-dimensional task, which contained many more possible target positions.

For target durations longer than 100 ms, localization error systematically decreased, with near-maximal performance achieved at target durations between 300 and 400 ms. This outcome is similar to the findings obtained in our previous studies, and can most likely be attributed to the eye movement system, which with longer target durations is increasingly able to execute a saccade (including secondary correction saccades) toward the target location while it is still visible. In this view, the strong improvement in localization performance between 100 and 300 ms is the result of progressively more saccades being initiated and executed before the target disappears. Note that this gradual improvement in localization performance between 100 and 300 ms corresponds nicely with the response latencies of saccades, which are typically on the order of 200 ms with a distribution that ranges between 100 and 300 ms (e.g., Carpenter, 2004).

The main effect of target distance reflected larger localization errors with increasing distance from initial fixation, except for the largest distance, which showed a slight improvement relative to the next largest distance (Fig. 2b). The former finding is caused by the fact that the visual acuity of the retina (i.e., its spatial resolution) deteriorates with increasing retinal eccentricity. The latter finding is probably related to the fact that the end of the masking stimulus acted as some kind of reference point. That is, sometimes localization performance appears to depend more on the distance between the target stimulus and a reference point than upon eccentricity (e.g., White, Levi, & Aitsebaomo, 1992).

Dual-task performance

In the dual-task condition, where the localization task had to be performed together with the identification task, the performance function showed an initial decrease in error, reaching a stable level of performance after which it decreased further (see Fig. 3). Figure 3a shows mean localization error in single- and dual-task conditions for the 1-distractor group (averaged over target distance). As can be seen in Fig. 3a, localization error was substantially greater in the dual-task (1-distractor) condition than in the single-task (no-distractor) condition (F(1, 8) = 14.59, p < 0.001). This finding shows that the requirement to first identify the central distractor stimulus hampered localization performance and suggests that localization performance depends on the availability of selective visual attention. Importantly, this effect was qualified by a significant interaction with the factor target-mask onset delay (F(7, 56) = 7.46, p < 0.01). This interaction indicated that at the shortest target duration of 29 ms there was a robust (p < 0.001) interference effect that disappeared with the longer stimulus durations of 57 and 86 ms. This important outcome suggests that at the shortest target duration of 29 ms attention was not sufficiently available for localizing the target, as it was allocated to the task of identifying the distractor stimulus. Presumably, this raised localization error at the unattended target location. With the longer target durations of 57 and 86 ms, however, the interference effect disappeared, suggesting that attentional identification of the distractor stimulus had been completed and that visual attention had become increasingly available for, and shifted to, the peripheral target stimulus. Note that this early interference effect and its fast disappearance cannot be attributed to the execution of eye movements because these effects occurred within 60 ms of presentation time, a time range far too short to execute saccades.

Fig. 3

The left panels show mean localization error as a function of target-mask onset delay in single- and dual-task conditions for the 1-, 2-, and 3-distractor groups (a, c, and e, respectively). The right panels show the results of the fitting procedure for the 1-, 2-, and 3-distractor groups (b, d, and f, respectively)

Interestingly, with still longer target durations, an interference effect appeared again until it disappeared at the longest target duration of 400 ms. This outcome seems to suggest that the identification task interfered not only with the attention system (at short target durations) but also with the eye movement system (at longer target durations). Probably, if attention is delayed in moving to the target stimulus, eye movements are delayed too because shifts of attention are thought to precede (and to be functionally related to) saccadic eye movements (e.g., Godijn & Theeuwes, 2003; Irwin & Gordon, 1998; Rizzolatti, Riggio, Dascola, & Umiltà, 1987). In sum, it appears that the requirement to first identify a single distractor item caused a temporal shift of the full target duration–localization performance function relative to the control, no-distractor function (see Footnote 1). Importantly, the interference effect present at the shortest target durations reflects the absence of attention for target localization, because attention cannot be allocated to the peripheral target stimulus while it is processing the central distractor stimulus.

To estimate the extent of this temporal shift and test the assertion that the performance function was shifted horizontally, we performed a fitting procedure that calculated the time shift (τ) necessary to produce an optimal fit between the no-distractor and 1-distractor performance functions. First, both functions were non-linearly interpolated (via a piecewise cubic Hermite interpolation procedure). Then, the optimal time shift τ was calculated by the following minimization procedure:

$$ \min_{\tau} \sum_{t = t_{1}}^{t_{2} - \tau} \left[ g(t + \tau) - f(t) \right]^{2} $$

(g = 1-distractor function; f = no-distractor function). This procedure yielded a time shift τ of 61 ms, which resulted, after fitting, in a correlation coefficient of 0.99 between the two functions (Fig. 3b).
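The minimization above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: plain linear interpolation is swapped in for the piecewise cubic Hermite interpolation to keep the code dependency-free, the sum is evaluated on a 1-ms grid, candidate shifts are capped by a hypothetical `max_shift` parameter, and the error values in the example are made up.

```python
from bisect import bisect_right


def interpolate(xs, ys, x):
    """Linear interpolation at x; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x outside the sampled range")
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def shift_cost(times, f_vals, g_vals, tau, step=1):
    """Sum over t = t1 .. t2 - tau of [g(t + tau) - f(t)]^2 on a step-ms grid."""
    t1, t2 = times[0], times[-1]
    total = 0.0
    t = t1
    while t <= t2 - tau:
        diff = interpolate(times, g_vals, t + tau) - interpolate(times, f_vals, t)
        total += diff * diff
        t += step
    return total


def best_shift(times, f_vals, g_vals, max_shift, step=1):
    """Return the shift tau (ms) in 0..max_shift that minimizes shift_cost."""
    if max_shift >= times[-1] - times[0]:
        raise ValueError("max_shift must be smaller than the sampled time range")
    candidates = range(0, max_shift + 1, step)
    return min(candidates, key=lambda tau: shift_cost(times, f_vals, g_vals, tau, step))


# Target-mask onset delays from the study; the error values are hypothetical.
delays_ms = [29, 57, 86, 114, 143, 200, 300, 400]
single_task_mm = [9.0, 9.0, 8.0, 7.0, 6.0, 4.5, 3.0, 2.5]   # f, made up
dual_task_mm = [12.0, 9.5, 9.0, 9.0, 8.5, 6.5, 4.0, 2.8]    # g, made up
tau_ms = best_shift(delays_ms, single_task_mm, dual_task_mm, max_shift=250)
print(tau_ms)
```

Because the summation range shrinks as τ grows, the raw sum contains fewer terms at large shifts; capping the candidate shifts, as done here, is one simple way to keep the comparison meaningful.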

Localization error as a function of target-mask onset delay for the 2-distractor and 3-distractor groups is presented in Fig. 3c and e, respectively. Separate two-way ANOVAs for the 2- and 3-distractor groups again indicated large distractor interference effects (F(1, 9) = 17.24, p < 0.01 and F(1, 9) = 43.22, p < 0.001, respectively) that varied as a function of target-mask onset delay (F(7, 63) = 2.25, p < 0.05 and F(7, 63) = 2.73, p < 0.05, respectively). In particular, increasing the number of distractor items shifted the target duration–localization performance function progressively further away from the control, no-distractor function, revealing longer lasting interference effects. In terms of the fitting procedure described above, time shifts τ of 145 and 193 ms were obtained for the 2- and 3-distractor groups, respectively (with correlation coefficients of 0.95 and 0.87, respectively; Fig. 3d, f). Of course, the restricted range and levels of target-mask onset delays used in the present study limit the reliability and accuracy of the fitting procedure, especially in the 2- and 3-distractor conditions.

Figure 4a depicts the mean overall interference effect (i.e., subtracting localization error in the single-task condition from that in the dual-task condition) for the 1-, 2-, and 3-distractor groups. Figure 4b depicts the time shift τ for the 1-, 2-, and 3-element distractor groups. Clearly, increasing the number of to-be-identified distractor items in the distractor stimulus caused increasingly larger and longer lasting localization interference effects. In fact, a linear regression analysis on the individual time-shift data revealed that the attentional system takes about 66 ms to identify each distractor item (F(1, 22) = 26.03, p < 0.001) without a general shift associated with dual-tasking (non-significant intercept = 2.5 ms).
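As a rough check on the per-item estimate, the sketch below fits a least-squares line to the three group-level time shifts reported in the text (61, 145, and 193 ms). Note the assumption: the regression reported above was run on individual time-shift data, which are not available here, so this group-level version reproduces the slope but is not expected to reproduce the reported intercept exactly.

```python
def least_squares_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


distractor_load = [1, 2, 3]
group_tau_ms = [61, 145, 193]  # group-level time shifts reported in the text
slope, intercept = least_squares_line(distractor_load, group_tau_ms)
print(slope, intercept)  # 66.0 ms per item, intercept 1.0 ms
```

The group-level slope of 66 ms per distractor item matches the reported estimate, and the near-zero intercept is in line with the reported absence of a general dual-task shift.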

Fig. 4

(a) Mean difference in localization error between the single- and dual-task conditions as a function of the number of distractor items in the distractor stimulus (i.e., the 1-, 2-, and 3-distractor groups) averaged over target-mask onset delay; (b) the optimal time shift (τ) necessary to produce the best fit between the localization performance functions of the single-task and dual-task conditions as a function of the number of distractor items in the distractor stimulus

Systematic localization errors

To examine the presence of a systematic bias in localization error we calculated the constant error (CE), which retains the sign or direction of the errors (undershoots or overshoots), as a function of task (single, dual), target-mask onset delay (eight levels), target distance (five levels), and distractor load (three levels). On average, participants tended to undershoot the target (mean CE = −2.8 mm), which is consistent with the general finding that localization judgments typically undershoot briefly presented targets (e.g., Müsseler, van der Heijden, Mahmud, Deubel, & Ertsey, 1999). An ANOVA indicated larger undershoots in the dual-task than in the single-task condition (−3.3 and −2.3 mm, respectively; F(1, 26) = 16.59, p < 0.001), larger undershoots with shorter target-mask onset delays (−0.5, −1.3, −2.4, −3.1, −2.9, −3.4, −4.0, and −4.8 mm for progressively shorter delays, respectively; F(7, 182) = 23.87, p < 0.001), and a U-shaped function relating undershoot to target distance (−1.1, −2.8, −3.7, −3.6, and −2.8 mm for increasing distances, respectively; F(4, 104) = 13.63, p < 0.001). This latter finding is probably related to the fact that the most peripheral targets fell close to the end of the masking stimulus, which may have acted as a reference point (e.g., White et al., 1992). The above main effects were qualified by a significant 3-way interaction involving task, target-mask onset delay, and target distance, F(28, 728) = 6.84, p < 0.001. This interaction is depicted in Fig. 5 and indicates that undershoots were disproportionately greater in the dual-task condition than in the single-task condition when targets were presented for less than 100 ms and at greater distances from fixation. This finding supports the idea that the distractor(s) strongly interfered with the operation of the attentional system, so much so that it eliminated the advantage of the most distant targets (falling near the end of the masking stimulus).
Interestingly, the size of the undershoot effect in the dual-task condition for the shortest target durations (distances 2, 3, 4, and 5) was about 10%, which is very similar to previous estimates of mislocalization (e.g., Van der Heijden, Van der Geest, De Leeuw, Krikke, & Müsseler, 1999).
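The difference between constant error and the absolute error used earlier can be made concrete with a small sketch. The sign convention is an assumption chosen to match the figure convention that negative values represent undershoots: positions are expressed as unsigned distances from fixation, so a response closer to fixation than the target yields a negative error on either side of the display. The trial values are made up.

```python
from statistics import mean


def constant_error(trials):
    """Mean signed error; negative means undershoot (response nearer fixation).

    trials: iterable of (target_ecc_mm, response_ecc_mm), both measured as
    unsigned distances from fixation (hypothetical data layout).
    """
    return mean(response - target for target, response in trials)


def absolute_error(trials):
    """Mean unsigned error for the same trials."""
    return mean(abs(response - target) for target, response in trials)


# Made-up trials: two undershoots and one overshoot.
example_trials = [(30.0, 27.0), (30.0, 31.0), (45.0, 40.0)]
ce_mm = constant_error(example_trials)
ae_mm = absolute_error(example_trials)
print(round(ce_mm, 2), ae_mm)
```

In this toy example the overshoot partly cancels the undershoots in the constant error but not in the absolute error, which is why the two measures can dissociate.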

Fig. 5

Systematic mislocalization in terms of constant error (mm) as a function of stimulus distance and stimulus duration (target-mask onset delay) in (a) single-task and (b) dual-task conditions. The eight levels of stimulus duration are shown in three lines that group together the shortest (29, 57, and 86 ms), intermediate (114, 143, and 200 ms), and longest (300 and 400 ms) stimulus durations. Negative values represent undershoots

Control experiments

Although we interpret the current findings within the two-process model as providing supporting evidence for the role of the attentional system in spatial localization, two alternative explanations for the data were addressed in control experiments. The first alternative relates to the confounding of the presence of the distractor(s) and the task requirement (single vs. dual task). We tested the hypothesis that the mere presence of the distractor led to the change in the performance function. Twelve participants performed the localization task in a condition with no distractor and in a condition with one distractor that did not require identification (but was merely present). Results showed the usual improvement in localization performance with longer target durations, F(7, 77) = 50.12, p < 0.001, and, moreover, that localization performance did not differ between these two conditions (mean localization error: 3.9 and 4.1 mm, respectively; F(1, 11) < 1, p > 0.4). More critically, this was also true for target durations shorter than 100 ms (mean localization error: 6.2 and 6.4 mm, respectively; F(1, 11) < 1, p > 0.6). These findings indicate that the distractor effect is not due to passive capture of attention by a foveal onset, but to active allocation of attention for identification of the distractor stimulus.

The second alternative relates to the delay in localization due to uttering the digits, during which the location information could have decayed from visual working memory. We tested this “delayed response hypothesis” with another group of twelve participants, who performed the localization task before or after identifying three digits. Again, target localization improved with longer target durations, F(7, 77) = 14.59, p < 0.001. An analysis of the localization onset times in these two conditions revealed, as expected, a substantially later localization onset in the locate-after condition than in the locate-before condition (mean onset times: 1,770 vs. 605 ms, respectively; F(1, 11) = 88.54, p < 0.001). However, localization performance did not differ between these two response conditions (mean localization error: 5.7 and 5.0 mm, respectively; F(1, 11) = 2.97, p > 0.1).

Together, the results of these control experiments argue against the view that the interference effect found in our main experiment is due to either the mere presence of the distractor or delayed responding in the distractor condition.

Finally, it is interesting to note that the way in which attention is shifted toward the target stimulus may be different for the single- and dual-task conditions. In the single-task condition, the sudden, peripheral onset of the target stimulus may have prompted an exogenous (i.e., involuntary, automatic) shift of attention, whereas in the dual-task condition the requirement to first identify the central distractor stimulus, which appears simultaneously with the peripheral target stimulus, most likely requires an endogenous (i.e., voluntary) shift of attention (e.g., Posner, 1980). However, this possible difference in attentional control between single- and dual-task conditions cannot explain the effect of distractor load, which showed greater and longer lasting interference effects at short target durations with increases in the number of distractor items (see Footnote 2). Hence, our conclusion remains that the diminished availability of attention for target localization at short target durations is responsible for the observed interference effect.

Retinal localization

The two-process model of localization performance emphasizes responses to the stimulus (attention shifts and eye movements) rather than the initial intake or coding of stimulus information by visual cells. This emphasis on attention and eye movements does not negate the fact that there is an initial period prior to the attention shift. Indeed, shifting attention to the target presupposes at least some knowledge of the target location prior to the attention shift.

In line with this observation, many theorists (e.g., Logan, 1992; Treisman, 1985) posit a preattentive level of visual input analysis in which the visual scene is coded in parallel along a number of separable dimensions or features, such as color, line orientation, and motion, but also position. On this account, the initial registration of sensory information by the retina already achieves some form of location coding because retinal visual cells are uniquely sensitive to information from different directions. However, although some degree of spatial localization might be afforded by preattentive analysis, the two-process model and the present findings suggest that subsequent processes (i.e., shifts of attention and eye movements) are needed to precisely localize items.

Functional connection between attentional and saccadic systems

Although the current experiment was designed to further establish the involvement of the attentional system in object localization, it is interesting to speculate briefly on the implications of the present results for the functional connection between the attentional and saccadic systems. Specifically, the systematic horizontal shift of 66 ms in the localization performance function for each to-be-identified central distractor item is consistent with the notion of an obligatory, functional link between the attentional and saccadic systems, in that attentional selection of the target stimulus seems to be required before a saccade toward it can be executed. In this view, the triggering of the saccade in the dual-task condition is delayed by the time it takes the attentional system to identify the distractor stimulus. Our data suggest a time delay of about 66 ms for each distractor item, an estimate that is in close agreement with the results of Kowler, Anderson, Dosher and Blaser (1995, Experiment 2), who reported an increase in saccadic latency of 50–75 ms when subjects, in addition to preparing a saccade, were also required to identify a single letter. Thus, our findings are consistent with a growing body of behavioral (e.g., Deubel & Schneider, 1996; Hoffman & Subramanian, 1995; Kowler et al., 1995), neurophysiological (e.g., Desimone, Wessinger, Thomas, & Schneider, 1989; Kustov & Robinson, 1996; Wardak, Ibos, Duhamel, & Olivier, 2006), and computational (e.g., Clark, 1999; Koch & Ullman, 1985) evidence that supports a strict and functional coupling between spatial attention and saccadic eye movements. This view is most prominent in the premotor theory of attention (e.g., Rizzolatti et al., 1987; Umiltà, Riggio, Dascola, & Rizzolatti, 1991), which was proposed 20 years ago and postulates a strong, direct coupling between spatial shifts of attention and the preparation of saccadic eye movements. According to the premotor theory, the allocation of attention to a location is intrinsically linked with the preparation to make a saccadic eye movement to that location. In addition, recent studies have demonstrated a close relationship between attentional shifts and the direction of micro-saccades (e.g., Engbert & Kliegl, 2003). However, direct evidence for the interdependency between attention and eye movements in the current localization paradigm awaits future studies that use eye movement recordings.
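
As a rough illustration only, the reported horizontal shift of about 66 ms per distractor item can be written as a linear delay of the localization performance function. The notation is an assumption introduced here and is not part of the original analysis: E_0(t) denotes localization error as a function of target duration t in the single-task condition, E_n(t) the error with n distractor items, and δ the per-item delay. The sketch further assumes that the function keeps its shape and that every item adds the same delay; the values of n shown are illustrative.

```latex
% Illustrative sketch under the stated assumptions (constant per-item delay,
% unchanged shape of the single-task function E_0).
\[
  E_n(t) \approx E_0(t - n\,\delta), \qquad \delta \approx 66~\text{ms}
\]
% Worked values of the implied total shift n * delta:
\[
  n = 1:\; 66~\text{ms}, \qquad
  n = 2:\; 132~\text{ms}, \qquad
  n = 3:\; 198~\text{ms}
\]
```

Under these assumptions, the delay attributed to identifying a single item (66 ms) falls within the 50–75 ms increase in saccadic latency reported by Kowler et al. (1995, Experiment 2).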

Conclusion

The present study demonstrated that when attention is preoccupied with identifying a distractor stimulus, localization of a target stimulus suffers a cost that depends on the number of items in the distractor stimulus. This finding has two important implications. First, it demonstrates that visual attention is critical in spatial localization during the first 100 ms and speeds up localization thereafter. Second, it indicates a strong interdependency between identification and localization, suggesting that attention is allocated at very early stages of visual processing.