Introduction

When we look in our refrigerator to retrieve a carton of milk, it is necessary to inhibit responses to non-targets (i.e., distractors) such as a carton of eggs or a jar of pickles. In this scenario, the oculomotor system must select a command appropriate for the target while ignoring—or inhibiting—similar commands to the distractor(s). A simple experimental corollary to the refrigerator example involves a visual target concurrently presented with a task-irrelevant visual distractor, and work has shown that the location of the distractor influences oculomotor planning (e.g., Lévy-Schoen 1969; Walker et al. 1997; DeSimone et al. 2015). For example, Walker et al. (1997) found that a target concurrently presented with a remote distractor (i.e., > 20° in angular coordinates from the target axis) increased saccade reaction time (RT) compared to when a target was presented in a distractor-free environment—a phenomenon referred to as the remote distractor effect (RDE) (cf. Corneil and Munoz 1996). In contrast, a target presented with a proximal distractor (i.e., within ± 20° in angular coordinates from the target axis) was found to not influence RT; however, amplitudes were biased toward the distractor’s location (i.e., the global effect or center-of-gravity effect) (Coren and Hoenig 1972; Deubel et al. 1984; Findlay 1982; Lévy-Schoen 1969; Walker et al. 1997; for review see Van der Stigchel and Nijboer 2011).

The competitive integration model (CIM) asserts that the RDE arises due to coeval encoding of target and distractor information onto a common saccade map with a retinotopic representation in the intermediate layers of the superior colliculus (SC) (Godijn and Theeuwes 2002; Meeter et al. 2010; see also Trappenberg et al. 2001). In particular, the CIM asserts that activity at one location in the map inhibits distant activity via a long-range intercollicular inhibitory pathway and delays the buildup of target-related saccade-generating signals (Trappenberg et al. 2001) (cf. Walker et al. 1997; Findlay and Walker 1999). Further, the CIM contends that a proximal distractor does not influence RTs because target and distractor proximity within the retinotopic map leads to activity that merges into a single movement vector representing a spatially averaged response. Although the spatially averaged response does not engender a cost to saccade RT, the CIM states that saccade amplitudes exhibit a global effect such that the response lands between the target and distractor.

The present investigation sought to determine whether the sensory modality associated with a distractor influences expression of the RDE and global effect. The basis for this question stems from work showing that the SC encodes spatial information about visual (Bender and Davidson 1986; Cang et al. 2018; Gale and Murphy 2014; Stein et al. 2001) and acoustic (Bednárová et al. 2018; Lau et al. 2018; Rajala et al. 2018) stimuli. In particular, non-human electrophysiology studies have shown that the superficial layers of the SC respond exclusively to visual stimuli, whereas the intermediate layers of the SC respond to auditory, somatosensory and visual stimuli (Casagrande et al. 1972; May 2006). In a classic demonstration, Jay and Sparks (1984) reported that intermediate and deep layer SC neurons that responded to visual stimuli also responded to acoustic stimuli. Thus, although visual and acoustic stimuli are encoded in retinal and head-centered coordinates, respectively, evidence suggests that both converge onto a common pathway for saccade generation in an oculocentric frame of reference (Jay and Sparks 1987; Yao and Peck 1997). Because the intermediate layers of the SC encode the direction and amplitude of a to-be-completed saccade (i.e., to achieve a desired movement error) (Sparks et al. 1976; Wurtz and Goldberg 1972), it is possible that remote and proximal acoustic distractors paired with a visual target elicit the RDE and global effect, respectively. Accordingly, Experiment 1 used a traditional visual target paired with a proximal and a remote visual distractor, whereas Experiment 2 entailed the same visual target paired with a proximal and a remote acoustic distractor. In addition, target-only trials were included in both experiments.
In terms of research predictions, if the RDE and global effect arise due to the encoding of target and distractor location within a retinocentric map (i.e., via the visual encoding of target and distractor) then the aforementioned phenomena should selectively manifest in Experiment 1. In turn, if the RDE and global effect reflect the encoding of target and distractor location in an oculocentric frame of reference (i.e., via the motor encoding of target and distractor) then the effects should be observed in Experiments 1 and 2.

Methods

Participants

Experiment 1 involved 12 individuals (6 female; age range: 20–25 years) and Experiment 2 involved a separate set of 10 individuals (5 female; age range: 21–25 years). Participants were recruited from the University of Western Ontario community, had normal or corrected-to-normal vision, and were self-declared right-hand dominant. Participants indicated they were free from neuropsychiatric or neurological disorder and eye injury. Pure-tone audiometry (500–4000 Hz) was used to confirm that hearing for all participants fell within the normal range for speech perception. All participants read a letter of information and signed a consent form approved by the Non-Medical Research Ethics Board, University of Western Ontario, and this work was conducted according to the Declaration of Helsinki.

Experiment 1: visual targets and visual distractors

Participants sat in a height-adjustable chair in front of a table with their head placed in a forehead and chin rest. Visual stimuli were presented on a 690 × 470 mm stimulus board centered on participants’ midline and located 520 mm anterior to the front surface of the table. The stimulus board contained horizontally aligned LEDs (see details below) presented behind black stereo cloth. The gaze location of participants’ left eye was measured via a video-based eye tracker (EyeLink 1000 Plus, SR Research, Ottawa, ON) sampling at 1000 Hz. Prior to data collection, a nine-point calibration of the viewing space was performed. Two monitors visible only to the experimenter provided real-time gaze position, trial-to-trial saccade kinematics and information related to the eye-tracking system (i.e., to perform a recalibration when necessary). Computer events and the presentation of stimuli were controlled via MATLAB (R2018b; The MathWorks, Natick, MA) and the Psychophysics Toolbox extensions (v 3.0) (Brainard 1997; Kleiner et al. 2007) including the Eyelink Toolbox (Cornelissen et al. 2002). The lights in the laboratory were extinguished during data collection.

A centrally located white LED served as the fixation location for each trial, and yellow LEDs 15.5° to the left and right of fixation, in the same horizontal meridian, served as targets (LED case size = 5 mm). A trial began with the illumination of the fixation LED, which instructed participants to direct their gaze to this location. Once a stable gaze was achieved (± 1.5° for 500 ms), a uniformly distributed random foreperiod between 1000 and 2000 ms was initiated, after which a target (i.e., left or right of fixation) was presented for 50 ms. For 20% of the trials, the target was presented without a distractor (i.e., target-only condition: TO). For 80% of the trials, the target was presented with a visual distractor (i.e., a red LED) in the same (i.e., proximal distractor: P) or the opposite (i.e., remote distractor: R) visual field as the target at eccentricities less than (10.5° from fixation; i.e., P− or R−) or greater than (20.5°; i.e., P+ or R+) the target (for schematic, see Fig. 1). The fixation LED remained visible during the foreperiod and was extinguished with the target (i.e., overlap paradigm). The onset of the target cued participants to saccade ‘quickly and accurately.’ Twenty randomly ordered trials were completed for each distractor and TO condition trial type.
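The trial structure described above can be sketched in code. The following Python sketch is illustrative only: the experiment itself was controlled via MATLAB, and the function names, the ASCII trial-type labels (e.g., "P-" for P−), the sign convention for positions, and the equal split of left and right targets within each trial type are assumptions that are not stated in the text.

```python
import random

TRIAL_TYPES = ["TO", "P-", "P+", "R-", "R+"]  # ASCII stand-ins for TO, P−, P+, R−, R+
TRIALS_PER_TYPE = 20                           # stated in the text
TARGET_ECC_DEG = 15.5                          # target eccentricity (deg)
NEAR_DEG, FAR_DEG = 10.5, 20.5                 # distractor eccentricities (deg)


def distractor_position(trial_type, target_side):
    """Signed horizontal distractor position in degrees (negative = left).

    target_side is -1 (left target) or +1 (right target).
    Returns None for target-only (TO) trials.
    """
    if trial_type == "TO":
        return None
    ecc = NEAR_DEG if trial_type.endswith("-") else FAR_DEG
    # Proximal distractors share the target's visual field; remote ones oppose it.
    side = target_side if trial_type.startswith("P") else -target_side
    return side * ecc


def build_trials(rng):
    """Return a shuffled list of trial dictionaries for one session."""
    trials = []
    for trial_type in TRIAL_TYPES:
        for i in range(TRIALS_PER_TYPE):
            # ASSUMPTION: left and right targets are balanced within a trial type.
            side = -1 if i % 2 == 0 else 1
            trials.append({
                "type": trial_type,
                "target_deg": side * TARGET_ECC_DEG,
                "distractor_deg": distractor_position(trial_type, side),
                # Uniformly distributed foreperiod between 1000 and 2000 ms.
                "foreperiod_ms": rng.uniform(1000, 2000),
            })
    rng.shuffle(trials)
    return trials


session = build_trials(random.Random(1))
```

With five trial types at 20 trials each, the sketch yields 100 trials, of which 20% are target-only and 80% include a distractor, matching the proportions reported above.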

Fig. 1

Timeline of visual events for Experiments 1 and 2. A centrally located LED (“+”) was presented for a 1000–2000 ms random foreperiod. Following the foreperiod, a visual target (i.e., a yellow LED depicted as an open white circle) was presented for 50 ms at 15.5° left or right of the fixation. For 20% of trials, the target was presented without a distractor (i.e., target-only condition: TO). For 80% of trials, a visual target was presented with a visual (Experiment 1) or acoustic (Experiment 2) distractor in: (1) the same visual field as the target (i.e., proximal distractor: P) or (2) the opposite visual field as the target (i.e., remote distractor: R) and at eccentricities less than (i.e., P− or R−) or greater than (i.e., P+ or R+) the target location (i.e., 10.5° and 20.5°). For this figure, we depict a visual target in the right visual field with an R+ visual and acoustic distractor (the other distractor and target locations are depicted in light gray). Visual distractors were red LEDs

Experiment 2: visual targets and acoustic distractors

Experiment 2 was identical to Experiment 1 with the exception that distractors were acoustic. In particular, the stimulus board was modified to contain 10 mm circular speakers (Neodymium Headphone Element, frequency response: 2000–12,000 Hz) covered with black stereo cloth (see Fig. 1). Uniform white noise was generated from the computer at 22,050 Hz, sent to a digital/analog converter and amplified by custom-built amplifiers (i.e., one amplifier per speaker). Following amplification, the sound was sent to a speaker to generate a 50 ms burst of 69 dBA noise that was presented concurrently with a visual target. Background noise in the laboratory was less than 36 dBA during data collection. Further, although saccades directed to acoustic targets are generally less accurate (i.e., when greater than 10° eccentricity) and/or more variable than visual targets, ample evidence has shown that humans reliably—and accurately—discriminate between the visual target and acoustic distractor locations used here (Heath et al. 2015, 2016; Yao and Peck 1997; Zambarbieri et al. 1987).
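The digital side of the acoustic distractor can be illustrated with a short Python sketch. It assumes that "uniform white noise" means independent samples drawn from a uniform amplitude distribution, and that the 22,050 Hz figure is the sample rate. The function name and the normalized amplitude range of −1 to 1 are hypothetical. The 69 dBA presentation level depends on the amplifier and speaker chain and is not modeled here.

```python
import random

SAMPLE_RATE_HZ = 22_050  # generation rate stated in the text
BURST_MS = 50            # burst duration stated in the text


def white_noise_burst(rng, sample_rate_hz=SAMPLE_RATE_HZ, duration_ms=BURST_MS):
    """Return one burst of uniformly distributed noise samples in [-1, 1].

    ASSUMPTION: samples are independent and normalized; the true output
    level is set downstream by the amplifier and is not represented.
    """
    # Integer division truncates the half sample (22,050 Hz x 50 ms = 1102.5).
    n_samples = sample_rate_hz * duration_ms // 1000
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


burst = white_noise_burst(random.Random(7))
```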

Experiment 1 and 2: data processing, dependent variables and statistical analyses

Gaze position data were filtered offline using a dual-pass Butterworth filter with a low-pass cutoff frequency of 15 Hz. These data were used to compute instantaneous velocities via a five-point central finite difference algorithm. Acceleration data were similarly obtained from the velocity. Saccade onset was marked when velocity and acceleration exceeded 30°/s and 8000°/s², respectively. Saccade offset was marked when saccade velocity was below 30°/s for 40 ms. Trials involving signal loss (e.g., eye blink) were excluded, as were trials with: (1) an amplitude less than 2° or greater than 2.5 times the participant-specific mean (Weiler and Heath 2014) and (2) an RT less than 50 ms or greater than 2.5 times the participant-specific mean (Wenban-Smith and Findlay 1991). Fewer than 5% of trials were removed for any participant.
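The differentiation and saccade-marking steps above can be sketched as follows. This is a minimal Python illustration, not the authors' analysis code. It assumes the standard five-point central-difference stencil (the coefficients are not given in the text), applies the thresholds to the absolute values of velocity and acceleration so that leftward and rightward saccades are treated alike, and takes an already-filtered position trace as input (the Butterworth filtering step is omitted). All function names are hypothetical.

```python
import math


def five_point_derivative(x, dt):
    """First derivative via the standard five-point central stencil."""
    n = len(x)
    if n < 5:
        raise ValueError("need at least five samples")
    d = [0.0] * n
    for i in range(2, n - 2):
        d[i] = (x[i - 2] - 8 * x[i - 1] + 8 * x[i + 1] - x[i + 2]) / (12 * dt)
    # ASSUMPTION: edge samples copy the nearest interior estimate.
    d[0] = d[1] = d[2]
    d[-1] = d[-2] = d[-3]
    return d


def detect_saccade(position_deg, fs_hz=1000, vel_thresh=30.0,
                   acc_thresh=8000.0, offset_ms=40):
    """Return (onset_index, offset_index); either may be None if not found."""
    dt = 1.0 / fs_hz
    vel = five_point_derivative(position_deg, dt)
    acc = five_point_derivative(vel, dt)
    speed = [abs(v) for v in vel]

    # Onset: first sample where speed AND |acceleration| exceed threshold.
    onset = next((i for i in range(len(speed))
                  if speed[i] > vel_thresh and abs(acc[i]) > acc_thresh), None)
    if onset is None:
        return None, None

    # Offset: first sample of a run that stays below threshold for offset_ms.
    hold = int(offset_ms * fs_hz / 1000)
    run = 0
    for i in range(onset + 1, len(speed)):
        run = run + 1 if speed[i] < vel_thresh else 0
        if run == hold:
            return onset, i - hold + 1
    return onset, None


# Synthetic trace: 100 ms fixation, a 15.5 deg cosine-shaped saccade
# lasting 50 ms, then 100 ms of stable gaze (sampled at 1000 Hz).
AMPLITUDE_DEG, DURATION_MS = 15.5, 50
trace = [0.0] * 100
trace += [AMPLITUDE_DEG * (1 - math.cos(math.pi * t / DURATION_MS)) / 2
          for t in range(DURATION_MS + 1)]
trace += [AMPLITUDE_DEG] * 100
onset, offset = detect_saccade(trace)
```

On the synthetic trace, the detected onset and offset fall within a few samples of the true movement start (sample 100) and end (sample 150); the small lag reflects the velocity threshold rather than an error in differentiation.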

Dependent variables included reaction time (i.e., time from response cuing to saccade onset: RT) and saccade gain (i.e., saccade amplitude/veridical target location). Dependent variables were examined via one-way repeated measures ANOVA involving trial type (i.e., TO, P−, P+, R−, R+) with an alpha level of 0.05.
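As a hedged illustration of how the dependent variables and the trial-screening criteria from the data-processing description might be computed, consider the Python sketch below. The function names and example values are hypothetical. The screening rule is implemented literally as worded (cutoffs at 2.5 times the participant-specific mean), and it assumes those means are supplied by the caller.

```python
def saccade_gain(amplitude_deg, target_deg=15.5):
    """Gain = saccade amplitude divided by veridical target eccentricity."""
    return amplitude_deg / target_deg


def keep_trial(amplitude_deg, rt_ms, mean_amplitude_deg, mean_rt_ms):
    """Apply the amplitude and RT exclusion criteria to a single trial.

    ASSUMPTION: the participant-specific means are computed beforehand
    over that participant's trials and passed in by the caller.
    """
    amp_ok = 2.0 <= amplitude_deg <= 2.5 * mean_amplitude_deg
    rt_ok = 50.0 <= rt_ms <= 2.5 * mean_rt_ms
    return amp_ok and rt_ok


def difference_scores(condition_means, baseline="TO"):
    """Distractor condition minus target-only condition, per trial type."""
    base = condition_means[baseline]
    return {name: value - base
            for name, value in condition_means.items() if name != baseline}


# Hypothetical participant means (ms), for illustration only.
example_rt = {"TO": 200.0, "P-": 198.0, "R+": 215.0}
example_diffs = difference_scores(example_rt)
```

A gain of 1.0 indicates a saccade that lands on the target, values below 1.0 indicate undershoot, and values above 1.0 indicate overshoot.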

Results

Experiment 1

RT produced a main effect for trial type, F(4,44) = 35.67, p < 0.001, η² = 0.44. Figure 2 shows that P− and P+ distractors did not differ from the TO condition (t(11) = −0.74 and 1.99, ps > 0.17, dz = 0.21 and 0.57), whereas RTs for R− and R+ conditions were longer than the TO condition (t(11) = 2.85 and 2.45, ps < 0.03, dz = 0.82 and 0.71). In addition, participant-specific RT difference scores (i.e., distractor condition minus TO condition) were computed to determine whether the magnitude of the distractor effect differed between R− and R+ trial types. A paired-samples t test indicated that trial types did not reliably differ (t(11) = 1.02, p = 0.31, dz = 0.29).
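The effect sizes reported above can be related to the t statistics. For a paired-samples comparison, Cohen's dz equals the t statistic divided by the square root of the number of participants. The text does not state how dz was computed, so this relationship is offered as an assumption, although the Experiment 1 values (n = 12) are consistent with it. The function name in the Python sketch below is hypothetical.

```python
import math


def dz_from_t(t_value, n_participants):
    """Cohen's dz for a paired-samples t test: dz = t / sqrt(n).

    ASSUMPTION: the reported dz values follow this standard relationship;
    the text does not describe the computation.
    """
    return t_value / math.sqrt(n_participants)


# Experiment 1 remote-distractor comparisons (n = 12), t values from the text.
dz_r_minus = round(dz_from_t(2.85, 12), 2)
dz_r_plus = round(dz_from_t(2.45, 12), 2)
```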

Fig. 2

The main panels present Experiment 1 and 2 group mean reaction times (ms) for each distractor trial type and target-only (TO) trials with error bars representing 95% within-participant confidence intervals. The offset panels represent RT difference scores (i.e., distractor condition minus TO condition) with error bars representing 95% between-participant confidence intervals, and the absence of overlap with error bars and zero (i.e., horizontal dashed line) represents a reliable difference interpreted inclusive to a test of the null hypothesis. The difference scores shown in gray in the offset panel represent the average of proximal (i.e., P+ and P−) and remote (i.e., R+ and R−) trial types with error bars representing 95% between-participant confidence intervals

Saccade gain revealed a main effect for trial type, F(4,44) = 39.73, p < 0.001, η² = 0.47: P− and P+ values were smaller and larger, respectively, than the TO condition (t(11) = 2.37 and 2.22, ps = 0.037 and 0.048, dz = 0.68 and 0.64). In turn, R− and R+ values were larger than the TO condition (t(11) = 2.24 and 2.25, ps = 0.045, dz = 0.64 and 0.65) (Fig. 3). We examined differences in the magnitude of distractor effects via absolute (i.e., unsigned) participant-specific gain difference scores (i.e., distractor condition minus TO condition). A one-way ANOVA did not yield a reliable effect, F(3,33) = 1.76, p = 0.15, η² = 0.05.

Fig. 3

The main panels present Experiment 1 and 2 group mean saccade gains (i.e., saccade amplitude divided by veridical target amplitude) for each distractor trial type and target-only (TO) trials with error bars representing 95% within-participant confidence intervals. The offset panels represent absolute gain difference scores (i.e., |distractor condition minus TO condition|) with error bars representing 95% between-participant confidence intervals

Experiment 2

RT yielded a main effect for trial type, F(4,36) = 21.40, p < 0.001, η² = 0.66. Figure 2 demonstrates that RTs for each distractor trial type were shorter than the TO condition (all t(9) > 7.70, ps < 0.001, all dz > 2.43). We also submitted participant-specific RT difference scores (i.e., distractor condition minus TO condition) to a one-way ANOVA and observed a significant effect, F(3,27) = 8.85, p < 0.001, η² = 0.50. In decomposing this effect, the offset panel of Fig. 2 shows that proximal distractors (i.e., P+ and P−) produced shorter RTs than remote distractors (i.e., R+ and R−) (t(9) = 3.09 and 3.05, ps < 0.014, dz = 0.97 and 0.96).

Figure 3 presents saccade gain and demonstrates that values were refractory to trial type, F(4,36) = 0.08, p = 0.99, η² = 0.01. In other words, proximal and remote acoustic distractors did not influence saccade amplitude to a visual target.

Discussion

Visual targets and visual distractors produce an RDE and global effect

Experiment 1 required saccades to visual targets in a target-only (TO) condition, and when presented concurrently with visual proximal and remote distractors. Results showed that remote distractors (i.e., R− and R+) produced longer RTs than their TO condition counterpart, whereas proximal distractors (i.e., P− and P+) did not influence RT. These findings support previous work documenting an RDE (DeSimone et al. 2015; Lévy-Schoen 1969; Walker et al. 1997) and are interpreted within the CIM’s assertion that saccade-related activity at distant locations in a common retinotopic map inhibits one another via a long-range intercollicular inhibitory pathway (Godijn and Theeuwes 2002). In terms of saccade gain, proximal distractors elicited a global effect such that responses were biased toward the distractor’s location (for review, see Van der Stigchel and Nijboer 2011). This finding is in line with the CIM’s contention that activity for a proximal distractor ‘spreads’ into the retinotopic representation associated with a target and leads to a spatially averaged movement vector (i.e., an amplitude intermediary between target and distractor). Interestingly, remote distractors (i.e., R− and R+) produced larger gains than the TO condition. Although this result is not accounted for in the CIM, it does correspond to earlier work by our group (DeSimone et al. 2015). In accounting for this result, we note that saccade trajectories curve away from a distractor in pursuit of a response goal (Doyle and Walker 2001; Tipper et al. 2000, 2001), and as a result top-down inhibition related to the spatial location of a remote distractor may bias trajectories contralateral to the distractor. In other words, a saccade may move further away from the location of a remote distractor to avoid task-irrelevant capture of visual information.

Visual targets and acoustic distractors: no evidence for an RDE or global effect

Although visual and acoustic stimuli are initially encoded in retino- and head-centered coordinates, respectively, the different sensory modalities are converted into a common oculocentric frame of reference for motor output (Zambarbieri et al. 1987). Moreover, Frens and Van Opstal’s (1998) single-unit recording work in non-human primates reported that saccadic burst neuron activity in the intermediate and deep layers of the SC encodes divergent sensory signals into an oculocentric frame of reference (see also Frens et al. 1995). Accordingly, it is possible that a visually guided saccade completed in the presence of a proximal or remote acoustic distractor might give rise to an RDE and global effect. In the present work, RTs for all distractor trial types were shorter than the TO condition, and RTs for proximal distractors were on average 29 ms shorter than remote distractors. In turn, saccade gains for all distractor trial types did not differ from the TO condition. As such, Experiment 2 provides no evidence of an RDE or global effect for a visually guided saccade paired with an acoustic distractor.

The current study sought to determine whether the concurrent activation of target and distractor location within a saccade map occurs at the level of sensory encoding (i.e., within a retinotopic map) or the level of motor programming (i.e., within an oculocentric frame of reference). As indicated above, the present findings provide no evidence that an acoustic distractor elicited an RDE or global effect. We believe that such findings are directly in line with the CIM’s assertion that ‘saccade programming occurs on a common saccade map with a retinotopic representation, in which information from different sources (e.g., endogenous and exogenous) is integrated’ (Godijn and Theeuwes 2002, p. 1039). Put more directly, the present results support the view that the competing activity of a remote distractor (i.e., the RDE) and the spatially averaged response of a proximal distractor (i.e., the global effect) occur in a retinotopic frame of reference prior to planning saccade motor error (i.e., the direction and amplitude of a saccade required to bring a target onto the fovea) in an oculocentric frame of reference (Sparks 1989).

At least four issues from Experiment 2 require addressing. The first three issues relate to: (1) why all distractor trial types produced shorter RTs than the TO condition, (2) why RTs for TO trials in Experiment 2 were longer than in Experiment 1 (see Fig. 2), and (3) why proximal distractors produced shorter RTs than remote distractors. In the first case, it has been reported that the coincident presentation of acoustic and visual stimuli yields SC activity in cats within 19 and 83 ms, respectively (Meredith et al. 1987). As such, although the auditory distractor and visual target were triggered synchronously, it is likely that the former increased baseline activity in the SC, leading to an earlier peak in SC activity compared to when the visual target was presented alone. We thus propose that the results are consistent with the intersensory facilitation effect (see Todd 1912), which holds that the integration of bimodal stimuli shortens RT (Colonius and Arndt 2001). Previous work has linked this finding to a bimodal integration process within an oculocentric frame of reference (Frens et al. 1995; Harrington and Peck 1998; Hughes et al. 1994, 1998; Nozawa et al. 1994). In other words, multisensory integration in the SC improves the efficiency of movement planning processes. In the second case, the longer RTs for TO trials in Experiment 2 could be accounted for by the between-experiment design; however, a more parsimonious account can be drawn from the intersensory facilitation effect outlined above. Because TO trials were presented on 20% of trials, it is possible that participants adopted a response-set (for review see Berkman 2018) wherein individual trials were planned to take advantage of the facilitatory effect of integrating visual and acoustic signals (i.e., because 80% of trials entailed bimodal stimuli).
As such, a visual target presented without an acoustic distractor may have engendered a planning cost associated with switching from a planned to an unplanned response set. In terms of the third issue, the finding that proximal distractors produced a larger RT reduction than their remote distractor counterparts is in line with Frens and Van Opstal’s (1998) behavioral and electrophysiological evidence that bimodal enhancement increases as the spatial separation between a visual target and acoustic distractor decreases (Frens et al. 1995). The fourth issue relates to the fact that acoustic distractors did not influence saccade gain (cf. Lueck et al. 1990). This finding may be accounted for by the fact that the spatial resolution for visual stimuli is greater than that for acoustic stimuli (for review, see Petro et al. 2017). Indeed, because sensory information supporting motor output is combined in a statistically optimal fashion (Ernst and Banks 2002), it may be that visual information is preferentially selected to encode the motor error for an ensuing saccade independent of a task-irrelevant (and statistically non-optimal) acoustic distractor.

Conclusion

The RDE and global effect are elicited when a visually guided saccade is paired with a visual—but not acoustic—distractor. Accordingly, and in line with the CIM, the present findings indicate that the RDE and global effect reflect a sensory-dependent phenomenon relating to conflicting (i.e., the RDE) and spatially averaged (i.e., the global effect) ‘visual’ signals within a common retinotopic map in the SC.