Introduction

The typical methods for examining reflexive and volitional orienting are two versions of Posner’s cuing task (Klein, 2005; Posner, 1980). In the exogenous (reflexive) task, a nonpredictive, peripheral cue is presented before the appearance of a target. The target can appear at the cued location (valid condition) or at the opposite location (invalid condition). The typical pattern of results is an early facilitatory effect (faster reaction time for the valid compared with the invalid condition), followed by inhibition of return (IOR; Posner & Cohen, 1984). In a typical version of Posner's endogenous task, participants are presented with a central, informative cue (e.g., a central arrow or a color patch), which is followed by a peripheral target. Importantly, the cue is informative regarding the location of the upcoming peripheral target (e.g., in 80% of the trials the cue predicts the target's location). As the interval between the cue and the target (stimulus onset asynchrony; SOA) increases, the common pattern of results is a developing facilitatory effect at the predicted (valid) location that is not followed by IOR.

Several differences between reflexive and volitional orienting have been demonstrated (for a review, see Klein, 2009, pp. 245-248). Endogenous facilitation is slower to develop, emerging at approximately 200 ms SOA (Remington & Pierce, 1984), than reflexive facilitation, which can be observed as early as 50 ms SOA (Shepherd & Müller, 1989). There are also differences in the level of automaticity; reflexive orienting is considered to be more automatic than volitional orienting (Carrasco, Loula, & Ho, 2006; Hein, Rolke, & Ulrich, 2006; Jonides, 1981; Yeshurun & Carrasco, 1998). In addition, reflexive attention is almost certainly phylogenetically older than volitional attention (Carrasco, 2011; Gabay, Leibovich, Ben-Simon, Henik, & Segev, 2013). Finally, and most pertinent to the current study, numerous studies have suggested that volitional orienting is highly connected to neocortical regions (e.g., frontal-parietal regions; Corbetta & Shulman, 2002; Robinson, Bowman, & Kertzman, 1995; Zackon, Casson, Zafar, Stelmach, & Racette, 1999; Andersen, Snyder, Bradley, & Xing, 1997; Kincade, Abrams, Astafiev, Shulman, & Corbetta, 2005; Peelen, Heslenfeld, & Theeuwes, 2004; Voytko et al., 1994; Yantis et al., 2002), whereas reflexive orienting also involves subcortical regions, such as the superior colliculus (Dorris, Klein, Everling, & Munoz, 2002; Gabay et al., 2013; Sapir, Soroker, Berger, & Henik, 1999; Self & Roelfsema, 2010).

Alternative viewpoint

Brain mechanisms are highly sensitive to regularities and can learn different types of sequences and statistical regularities even implicitly and incidentally (Chun & Jiang, 1998; Courville, Daw, & Touretzky, 2006; Gallistel & Gibbon, 2000; Goujon & Fagot, 2013). As discussed, to measure "volitional" orienting, many studies have used a probability manipulation, which is a basic feature of Posner's endogenous cuing task. In this task, in addition to the experimental instructions (which instruct participants to shift attention to the predicted location), symbolic cues inform the participants where the target is most likely to appear (Posner, 1980). This probabilistic association between a cue characteristic (e.g., the shape of an arrow or the color of the cue) and the target's location might be acquired through a low-level learning mechanism. The remaining question is whether common endogenous orienting tasks truly and exclusively elicit a "volitional" mechanism. Another possibility, which needs to be explored, is that a simple associative learning mechanism also might be involved in the facilitated responding to targets presented at the validly cued location in an endogenous cuing paradigm.

Associative learning depends on acquiring the contingency between a property of the cue and the target’s location (e.g., the target appears at the cued location in 80% of the trials; for a review, see Gallistel & Gibbon, 2000). Accordingly, across trials, the regularity between events, such as the target's location and the preceding cue's color, could be learned, and such an association might shorten responses to targets at the validly cued location. Something akin to this was demonstrated in studies that examined spatial attention (in which a statistical learning mechanism influences attentional allocation; Chun & Jiang, 1998; Geng & Behrmann, 2002), even without any voluntary intention to learn.

In addition, contrary to most of the literature, some data imply that neocortical regions are not exclusively responsible for volitional orienting and that subcortical regions also might be involved in the typical endogenous orienting tasks (Katyal & Ress, 2014; McAlonan, Cavanaugh, & Wurtz, 2008; Saban, Sekely, Klein, & Gabay, 2017a; Saban, Sekely, Klein, & Gabay, 2017b). For example, and most pertinent to the current study, consider two very recent findings. First, in a study examining human participants, it was demonstrated that the onset of endogenous orienting was apparent earlier when both cue and target were presented to the same monocular channel (vs. different channels), indicating that subcortical regions in humans play a functional role in endogenous orienting (Saban et al., 2017a). Second, when subjected to Posner’s endogenous orienting task, the archer fish (an evolutionarily older species) demonstrated a human-like endogenous facilitation and IOR, a pattern of results that commonly emerges in exogenous orienting tasks (Saban et al., 2017b). Hence, there is some basis to surmise the involvement of subcortical mechanisms when orienting is explored using the typical endogenous orienting task.

Therefore, the current study was designed to explore the possibility that a simple learning mechanism, which might functionally involve monocular (subcortical) regions, influences the pattern of results observed in the typical "volitional" endogenous orienting task. To examine this possibility, we compared the typical endogenous orienting task to a similar one without statistical contingencies between the cues' visual properties and the targets' locations. In both versions, an arbitrarily selected, nonspatial feature of a central cue (its color) was used to signal which target location should be attended endogenously. In addition, for each task, we measured whether there is a functional contribution of monocular visual channels.

How to probe behaviorally the contribution of subcortical visual channels?

Visual input from the two eyes is separated in the early stages of the visual processing stream. That is, visual information, once received by the retina, is monocularly segregated until it reaches binocular extrastriate regions (Horton, Dagi, McCrane, & de Monasterio, 1990; Menon, Ogawa, Strupp, & Ugurbil, 1997). Hence, two inputs from different eyes can be integrated mostly after the convergence of binocularly driven neurons in the neocortex. Using a stereoscope, one can control the visual information presented to each eye separately, and therefore can examine the involvement of monocular channels (mostly subcortical) in a specific cognitive process. This device has been used to explore the involvement of subcortical structures in many cognitive processes (Batson, Beer, Seitz, & Watanabe, 2011; Gabay & Behrmann, 2014; Karni & Sagi, 1991; Saban, Gabay, & Kalanthroff, 2017; Saban et al., 2017a; Self & Roelfsema, 2010). In the context of Posner’s endogenous cuing paradigm, the stereoscope allows us to manipulate the cue's and target's eye-of-origin and therefore provides a useful tool for isolating the involvement of monocular versus binocular (mostly cortical) visual channels in endogenous orienting.

In the current study, we applied this method and logic in a typical implementation of the endogenous orienting task. Importantly, we also applied it in a version of this task that does not involve statistical contingencies between cues and targets. In both versions, a central cue was presented before the appearance of the peripheral target. A simple target detection task was used, and participants were explicitly instructed to attend one of the two possible target locations according to the cue's color (e.g., if the cue was red, participants were instructed to attend the upper location). In the typical task, the central cue predicted at which location the peripheral target would appear. In the "pure" endogenous task, the cue was not predictive. Using the stereoscope, for each of the tasks employed, the eye to which the endogenous cue and target were presented was manipulated. In the same-eye condition, cue and target were presented to the same eye, and in the different-eye condition, they were presented to different eyes. In the “typical” condition, we would expect to replicate our previous finding that facilitation begins earlier with same-eye than with different-eye delivery of the cue and target. In addition, if in the “pure” condition we have eliminated the contribution of low-level learning mechanisms, then in that condition we would not expect a difference between the same-eye and the different-eye conditions.

Methods

Participants

A total of 48 participants (25 performed the pure task and 23 the typical task) volunteered to participate in exchange for payment or course credit. The mean age was 22.9 years (SD = 3.9), and 39 participants were female. All participants had normal or corrected-to-normal vision. The sample size in this study is sufficient, because studies have long employed the endogenous version of Posner's cuing task and consistently found the endogenous facilitation effect with much smaller sample sizes (e.g., 13 subjects in Shepherd & Müller, 1989; 16 subjects in Berger, Henik, & Rafal, 2005). The study was approved by the University of Haifa ethics committee.

Stimulus and apparatus

Stimulus presentation was performed using an HP Z200 computer running Windows 7. Stimuli were displayed on a Samsung LCD monitor (model S24C650PL) with a recommended resolution of 1680 × 1050. Responses were made using a DELL Hebrew-English Extended Keyboard (model RT7D50 SK-8115). The computer monitor was positioned 57 cm in front of a stereoscope (model ScreenScope LCD SA200LCD), which blocked the participant’s direct view of the monitor. The monitor presentation was divided into two halves (each half was presented to a different eye) and consisted of two rectangles (6° × 17.8°) placed 10.3° from the center of the screen and 20.6° from each other. Each rectangle contained three squares (2.8° each side) in a vertical alignment. The upper and lower squares were placed at 5.8° from the center of the screen, and the central square was placed at its center. A central fixation cross comprised two lines (0.7° each), centered within the central squares. Cues consisted of red or green colors filling in the central square. An asterisk target (0.7°) was then presented, centered within one of the peripheral squares. Except for the cues, all stimuli were white figures against a black background.
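For readers who wish to reproduce the display, the stimulus sizes given in degrees of visual angle can be converted to physical sizes on the screen. The following is a minimal sketch, assuming that the effective viewing distance equals the reported 57 cm (the stereoscope optics may change this in practice); the function and variable names are illustrative and are not taken from the original experiment code.

```python
import math

# Assumption: effective viewing distance equals the reported monitor distance.
VIEWING_DISTANCE_CM = 57.0


def degrees_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent (cm) of a stimulus subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Stimulus extents reported in the text, in degrees of visual angle.
STIMULI_DEG = {"square side": 2.8, "fixation line": 0.7, "asterisk target": 0.7}

# Corresponding on-screen sizes in centimeters (hypothetical helper table).
STIMULI_CM = {name: degrees_to_cm(deg) for name, deg in STIMULI_DEG.items()}
```

At 57 cm, one degree of visual angle corresponds to roughly 1 cm on the screen, which is why this viewing distance is convenient for specifying stimuli.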

Procedure

Typical experimental trials are depicted in Figure 1. Each trial began with a fixation cross appearing for 500 ms. Two hundred ms after fixation disappeared, the central cue was presented for 100 ms. After a variable SOA of 100, 300, or 500 ms, the target appeared for 3,000 ms or until a response was detected. In the pure group, participants were instructed to focus their eyes at the center of the screen throughout the experiment but to pay attention volitionally up or down depending on the color of the cue. Each color was associated with a specific location, but the target appeared in the instructed location in only 50% of the trials. In contrast, in the typical group, each color was associated with a specific location, and the target appeared at the predicted location in 80% of the trials. Participants were informed about the cues' predictability. In both groups, the target appeared at the instructed location (valid trial) or at the opposite location (invalid trial). The cue and target were presented to the left or right eye with equal probability. Four possible target locations varied equally and randomly: left eye-up, left eye-down, right eye-up, and right eye-down. Participants were instructed to respond to target appearance by pressing the space bar of a keyboard with their dominant hand as fast as possible. After the manual response, an intertrial interval of 1,000 ms was introduced. Each participant had 16 practice trials before the experiment began. In the typical task, each participant completed 480 experimental trials divided into four blocks. For each of the two eye-of-origin conditions, subjects performed 64 valid and 16 invalid trials for each of the three SOAs. In the pure task, each participant completed a total of 192 experimental trials divided into four blocks. For each of the two eye-of-origin conditions, subjects performed 16 valid and 16 invalid trials for each of the three SOAs. In 7.7% of the trials, no target appeared (i.e., catch trials), and the participant was instructed not to respond. Catch trials were dispersed randomly across the trials. All instructions were automated and were presented on the screen. The different experimental conditions were presented randomly.
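The trial counts and cue validities stated above follow directly from the per-cell design. As a minimal sketch (the names below are illustrative and not part of the original experiment code), the arithmetic can be checked as follows.

```python
# Design factors described in the Procedure.
EYE_CONDITIONS = ("same-eye", "different-eye")
SOAS_MS = (100, 300, 500)

# Valid/invalid trials per eye-of-origin x SOA cell, as stated in the text.
CELL_COUNTS = {
    "typical": {"valid": 64, "invalid": 16},
    "pure": {"valid": 16, "invalid": 16},
}


def experimental_trials(task):
    """Total experimental (non-catch) trials for one task."""
    per_cell = sum(CELL_COUNTS[task].values())
    return len(EYE_CONDITIONS) * len(SOAS_MS) * per_cell


def validity_proportion(task):
    """Proportion of target trials on which the target is at the cued location."""
    counts = CELL_COUNTS[task]
    return counts["valid"] / (counts["valid"] + counts["invalid"])


TOTALS = {task: experimental_trials(task) for task in CELL_COUNTS}
VALIDITY = {task: validity_proportion(task) for task in CELL_COUNTS}
```

With these cell counts, the typical task yields 480 trials with 80% valid cues and the pure task yields 192 trials with 50% valid cues, matching the values reported in the text.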

Figure 1

Typical task in which a green cue predicts a target at the upper square, whereas a red cue predicts a target at the lower square. Note that in the pure task the visual presentation is exactly the same. (A) A typical Valid, Same-eye condition trial in which the cue (green square) is presented to the right eye (right column) and the target is presented to the right eye (right column), at the upper square. (B) A typical Valid, Different-eye condition trial in which the cue (green square) is presented to the right eye (right column) and the target is presented to the left eye (left column), at the upper square. The middle columns represent the participant's fused perception.

Results

Trials in which RT was longer than 2,500 ms or shorter than 100 ms were excluded from the analyses (<1.5%). Participants responded on catch trials in less than 1% of the trials and failed to respond to target appearance in less than 1% of the trials. For each task (typical, pure), we performed a three-way analysis of variance (ANOVA), with eye-of-origin (same-eye, different-eye), SOA (100 ms, 300 ms, or 500 ms), and validity (valid, invalid) as within-subject factors and RT as the dependent variable. See Table 1 for a detailed presentation of RTs for the different conditions. Figure 2 presents RT as a function of eye-of-origin, SOA, and validity for each task separately.
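The exclusion rule and the validity effect (invalid minus valid RT) within one eye-of-origin by SOA cell can be illustrated with a short sketch. The trial records, field names, and RT values below are made up for illustration only and are not data from the study; the actual analysis pipeline is not described in the text.

```python
from statistics import mean

# Exclusion bounds stated in the Results (milliseconds).
RT_MIN_MS = 100
RT_MAX_MS = 2500


def trim_trials(trials):
    """Keep trials whose RT is neither shorter than 100 ms nor longer than 2,500 ms."""
    return [t for t in trials if RT_MIN_MS <= t["rt"] <= RT_MAX_MS]


def validity_effect(trials, eye, soa):
    """Mean invalid RT minus mean valid RT for one eye-of-origin x SOA cell."""
    cell = [t for t in trials if t["eye"] == eye and t["soa"] == soa]
    valid = [t["rt"] for t in cell if t["validity"] == "valid"]
    invalid = [t["rt"] for t in cell if t["validity"] == "invalid"]
    return mean(invalid) - mean(valid)


# Hypothetical trials (not real data).
demo = [
    {"eye": "same", "soa": 100, "validity": "valid", "rt": 380},
    {"eye": "same", "soa": 100, "validity": "valid", "rt": 400},
    {"eye": "same", "soa": 100, "validity": "invalid", "rt": 420},
    {"eye": "same", "soa": 100, "validity": "invalid", "rt": 3000},  # too slow
    {"eye": "same", "soa": 100, "validity": "valid", "rt": 80},  # too fast
]

kept = trim_trials(demo)
effect = validity_effect(kept, "same", 100)
```

A positive value indicates facilitation at the cued location; per-participant cell means of this kind would then enter the within-subject ANOVA.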

Table 1 Reaction time for the different experimental conditions
Figure 2

(A) The upper panels show the pattern of results in the Typical task, and (B) the lower panels show the pattern of results in the Pure task. For both tasks, RT is depicted as a function of SOA and validity for each eye-of-origin condition. Error bars show 95% confidence intervals. The two functions have been slightly offset horizontally to allow visualization of the error bars. *p < 0.05.

Typical Task

In the typical task, replicating previous findings, the main effects of SOA and validity were significant [F(2, 44) = 49.67, MSE = 1,549, p < 0.001, η2 = 0.69; F(1, 22) = 7.60, MSE = 1,769, p = 0.011, η2 = 0.26, respectively]. In contrast, the main effect of eye-of-origin was not significant (see Footnote 1) [F(1, 22) = 0.38, MSE = 894, p > 0.250]. The SOA x validity, SOA x eye-of-origin, and eye-of-origin x validity interactions were not significant [F(2, 44) = 2.16, MSE = 735, p = 0.126; F(2, 44) = 2.26, MSE = 942, p = 0.116; F(1, 22) = 0.499, MSE = 780, p > 0.250, respectively]. Most importantly, the three-way interaction between SOA x validity x eye-of-origin was significant [F(2, 44) = 4.32, MSE = 553, p = 0.019, η2 = 0.16].
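The reported effect sizes can be cross-checked against the F ratios and their degrees of freedom. The sketch below assumes that the reported η2 values are partial eta squared, which the text does not state explicitly; under that assumption, η2 = (F × df_effect) / (F × df_effect + df_error). The function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported for the typical task.
eta_soa = partial_eta_squared(49.67, 2, 44)
eta_validity = partial_eta_squared(7.60, 1, 22)
eta_three_way = partial_eta_squared(4.32, 2, 44)
```

Rounded to two decimals, these values agree with the reported 0.69, 0.26, and 0.16, which is consistent with the assumption that partial eta squared was used.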

To further investigate the three-way interaction, we examined the simple two-way interaction between SOA and validity for each eye-of-origin condition separately. When the cue and target were presented to different eyes, the SOA x validity interaction was significant [F(2, 44) = 5.21, MSE = 754, p = 0.009, η2 = 0.19], indicating a significant validity effect only at the last SOA [F(1, 22) = 1.01, MSE = 924, p > 0.250; F(1, 22) = 2.362, MSE = 1416, p = 0.138; F(1, 22) = 9.04, MSE = 907, p = 0.006, for the 100 ms, 300 ms, and 500 ms SOAs, respectively]. In contrast, when the cue and target were presented to the same eye, the SOA x validity interaction was not significant [F(2, 44) = 0.10, MSE = 534, p > 0.250], and the validity effects were significant at all SOAs [F(1, 22) = 5.25, MSE = 769, p = 0.031; F(1, 22) = 5.12, MSE = 478, p = 0.033; F(1, 22) = 4.50, MSE = 628, p = 0.045, for the 100 ms, 300 ms, and 500 ms SOAs, respectively].

Pure Task

Replicating previous findings, the main effects of SOA and validity were significant [F(2, 48) = 32.29, MSE = 2,414, p < 0.001, η2 = 0.57; F(1, 24) = 9.90, MSE = 2,632, p = 0.004, η2 = 0.29, respectively]. In contrast, the main effect of eye-of-origin was not significant [F(1, 24) = 0.26, MSE = 1,223, p > 0.250]. The SOA x validity, SOA x eye-of-origin, and eye-of-origin x validity interactions were not significant [F(2, 48) = 1.64, MSE = 1,094, p = 0.203; F(2, 48) = 0.18, MSE = 1,012, p > 0.250; F(1, 24) = 0.12, MSE = 1,016, p > 0.250, respectively]. In contrast to the typical task, the three-way interaction between SOA x validity x eye-of-origin was not significant [F(2, 48) = 0.02, MSE = 1,111, p > 0.250].

Although the SOA x validity x eye-of-origin interaction was not significant, to compare the pattern of results to that observed in the typical task, we examined the simple two-way interaction between SOA and validity for each eye-of-origin condition separately. When the cue and target were presented to different eyes, the SOA x validity interaction was not significant [F(2, 48) = 1.20, MSE = 828, p > 0.250], and further analyses revealed significant validity effects only at the last two SOAs [F(1, 24) = 1.02, MSE = 630, p > 0.250; F(1, 24) = 4.49, MSE = 1,236, p = 0.04; F(1, 24) = 4.46, MSE = 1,518, p = 0.04, for the 100 ms, 300 ms, and 500 ms SOAs, respectively].

When the cue and target were presented to the same eye, the SOA x validity interaction also was not significant [F(2, 48) = 0.60, p > 0.250], and further analyses revealed significant validity effects only at the last two SOAs [F(1, 24) = 1.03, MSE = 1,509, p > 0.250; F(1, 24) = 4.46, MSE = 1,045, p = 0.028; F(1, 24) = 4.41, MSE = 2,120, p = 0.046, for the 100 ms, 300 ms, and 500 ms SOAs, respectively].

Discussion

The current results provide novel insights for those interested in endogenous orienting. Replicating our previous findings, in the typical task—used to measure "volitional" orienting—facilitation was found as early as 100 ms after cue onset but only in the same-eye condition. In the "pure" task, which in contrast to the current typical task does not contain statistical regularities between the cue's color and the target's location, no difference was found between the two eye-of-origin conditions and facilitation was not observed until 300 ms after cue onset. In line with the previous literature, in both tasks, once facilitation emerged, it was maintained throughout the longer SOAs in both eye-of-origin conditions.

As previously demonstrated, volitional and reflexive processes (i.e., endogenous and exogenous attentional effects) can co-exist and under some conditions may influence performance in an additive manner (Berger, Henik, & Rafal, 2005; Berlucchi, Chelazzi, & Tassinari, 2000; Chica, Lupianez, & Bartolomeo, 2006). Several reflexive processes could initiate rapid attentional orienting as a result of what has typically been termed an "endogenous" cue. Central cues, such as arrows and gaze cues, can produce orienting responses even when they are not predictive of the target's location (Friesen & Kingstone, 1998; Kingstone, Friesen, & Gazzaniga, 2000; Pratt & Hommel, 2003; Ristic, Friesen, & Kingstone, 2002; Ristic & Kingstone, 2006). Such findings suggest that these types of cue possess a reflexive property, which has been hypothesized to elicit attentional orienting automatically because of a lifetime of associating their spatial properties with the location of important information in our environment (see Footnote 2). In contrast, in the typical cuing task of the current study, the association between the cue's nonspatial and arbitrarily selected property (color) and the target's location is specific to the experimental context and is learned through exposure to the task. The instruction to orient attention endogenously in response to the cue's color might have elicited some amount of true voluntary orienting, just as it did in the “pure” condition. We believe that this volitional orienting takes place in parallel with associative learning due to the statistical regularities (i.e., the correlation between a cue property and the target's location) present in the typical condition.

When a property of the cue (e.g., shape or color) is correlated, over numerous trials, with the location of the target, an associative learning process could be initiated by this contingency and result in more rapid responses to targets presented at the cued location. Such a learning process should develop as the experiment proceeds (see Footnote 3). As mentioned, volitional orienting in response to informative central cues is generally thought to take approximately 200 ms to reach its full magnitude (Remington & Pierce, 1984). There are, however, studies that used a larger number of trials than is typically used in this literature, between 3,500 (Shepherd & Müller, 1989) and ~6,500 trials (Cheal & Lyon, 1991). These studies have found substantial cuing effects with SOAs of 50 ms or less. Our interpretation is that when a property of the cue (e.g., shape or color) is correlated with the location of the target, simple learning mechanisms may kick in such that an orienting response becomes conditioned to the cue's property. This may masquerade as and/or co-exist with voluntary orienting. Such associated orienting responses are, like reflexive orienting, initiated rapidly in response to the appearance of the cue.

The involvement of an additional associative learning process only in the typical endogenous task should result in a greater facilitation effect at the 100 ms SOA in the same-eye condition of the typical task compared with the facilitation effect in the different-eye condition of the pure task. Yet, when this difference was examined specifically, it did not reach significance in the current study. This might be explained by individual differences between the two experimental groups (typical vs. pure task). Regardless, as indicated earlier, the finding of a validity-effect modulation as a function of SOA and eye-of-origin condition only in the typical task strengthens the conclusion that different processes are involved in the two tasks.

In the current experimental design, a central color cue was associated with the peripheral target's location. We propose that, owing to neural plasticity, monocular neurons started to associate the two spatiotemporal events by their spatiotemporal receptive fields. This explanation is in line with previous findings demonstrating perceptual learning in monocular channels (Karni & Sagi, 1991). Hence, the receptive fields of monocular channels are likely to associate two spatiotemporal events in the same eye, and this would lead to the observed same-eye cuing advantage. That is, when the same monocular channel (mostly subcortical) is presented with both the cue and the target, the associative learning mechanism can kick in and initiate a rapid orienting response to the appearance of the endogenous cue.

Implication for the study of volitional orienting processes

The current study has important implications for our conceptual understanding of volitional orienting processes. In contrast to the common perspective, "endogenous" orienting might not be equivalent to "volitional" orienting. The common tasks, which are suggested to manipulate volitional orienting in humans, usually manipulate attention by influencing the cue predictability. As suggested by the present study and our work with the archer fish, the contingency between the cue and the target location can produce an associative learning process that can masquerade as volitional orienting in these tasks. The pattern of results supports the possibility that volitional and reflexive processes may contribute jointly to behavior and, therefore, that in the typical Posner task cuing effects might be a combination of the associative learning effect and the intention to follow the instructions. This account is in accordance with a recent study, which suggests that implicit learning of cue-target contingencies can influence attentional effects (Risko & Stolz, 2010). We suggest that simple associative learning is contaminating tasks that are commonly used to measure "volitional" orienting even when participants do not have a lifetime of experience that links a spatial property of the cue with the location of task-relevant information. Hence, a reconceptualization of the way volitional processes are defined and measured is needed.