Introduction

In everyday life individuals are exposed to visual scenes that contain a tremendous amount of information. Because humans are not capable of recognizing all the visual objects around them simultaneously, they have to select a few objects that are important, and ignore the others (Desimone & Duncan, 1995). The human attentional system selects and inhibits information, and is biased toward task-relevant or highly salient objects to promote selection of necessary information (Corbetta & Shulman, 2002; Serences, Shomstein, Leber, Golay, Egeth, & Yantis, 2005; Serences & Yantis, 2007).

Previous research has indicated that reward shapes the deployment of visual attention to particular objects (e.g., Anderson, 2013; Chelazzi, Perlato, Santandrea, & Della Libera, 2013; Della Libera & Chelazzi, 2006; Engelmann & Pessoa, 2007; Hickey, Chelazzi, & Theeuwes, 2010a, 2010b; Kiss, Driver, & Eimer, 2009; Navalpakkam, Koch, Rangel, & Perona, 2010). For example, stimulus features associated with reward increase trial-to-trial priming (Hickey et al., 2010b; Kristjansson, Sigurjonsdottir, & Driver, 2010), facilitate target detection in rapid serial visual presentation (RSVP) streams (Raymond & O’Brien, 2009; Yokoyama, Padmala, & Pessoa, 2015) and visual search (Störmer, Eppinger, & Li, 2014), modulate contextual cueing effects (Tseng & Lleras, 2013), facilitate task-relevant behavioral responses, and impair inhibition of task-irrelevant information (Krebs, Boehler, & Woldorff, 2010).

In other studies, stimuli previously associated with reward during reward learning involuntarily captured attention in visual search even when the stimulus no longer predicted a reward outcome (e.g., Anderson, Laurent, & Yantis, 2011a, b). This effect of stimulus-reward association on visual attention is called value-driven attentional capture (VDAC; e.g., Anderson et al., 2011b). The magnitude of VDAC depends on reward learning. Stimuli associated with high reward induce larger VDAC than stimuli associated with low reward (e.g., Anderson et al., 2011a). Moreover, the effect of a stimulus-reward association learned in one task can transfer to other tasks (e.g., Anderson, Laurent, & Yantis, 2012). For example, the effect of a stimulus-reward association learned during a bottom-up search task (pop-out search) transfers to a top-down search task (serial search; Lee & Shomstein, 2014). Furthermore, the effect of stimulus-reward associations learned in a training phase persists for a long period of time, even when a reward is not given in the task (e.g., Anderson et al., 2011a, b; Della Libera & Chelazzi, 2009; Raymond & O’Brien, 2009). For example, an association learned in the training phase biased attention in a test phase from a few days to 9 months later (Anderson & Yantis, 2013; Della Libera & Chelazzi, 2009). Rewarded stimuli are preferentially processed in the visual environment, and capture visual selective attention.

Although the effect of reward learning has been observed in a range of situations in previous studies, the mechanism underlying the association between stimulus features and reward during the learning phase remains unclear. Most previous studies associated a target-defining feature with reward in the learning phase. For instance, participants received reward in the reward-learning phase depending on color in a color-search task (e.g., Anderson et al., 2011a, b; Sali, Anderson, & Yantis, 2014) and a Stroop task (Krebs et al., 2010), shape in a negative priming task or a shape search task (Della Libera & Chelazzi, 2009; Wang, Yu, & Zhou, 2013), orientation in an orientation search task (Laurent, Hall, Anderson, & Yantis, 2014; Lee & Shomstein, 2014), faces in a choice game (Raymond & O’Brien, 2009; Rutherford, O’Brien, & Raymond, 2010), and location in a visual discrimination task (Chelazzi et al., 2014) and a visual search task (Anderson, 2014).

In other studies, the stimulus features (color) associated with reward did not define the target stimulus in the visual search task (reward-learning phase). However, a target was always presented in the colored circle associated with reward, and this stimulus feature predicted target location (Failing & Theeuwes, 2014). During training, two letters were presented to the right and left of fixation, and participants judged which target (S or P) was presented. The letters were surrounded by colored circles, and one color was always paired with the target and associated with reward. This reward learning produced an attentional capture effect in a spatial cueing task during the test phase. Although the reward-associated feature (color) was no longer a target-defining feature, color was still task-relevant, in that participants could use color as a cue to localize the target.

Moreover, recent studies examined whether a reward-signaling stimulus could induce attentional and oculomotor capture even when participants have to inhibit this stimulus to obtain reward (Bucker, Belopolsky, & Theeuwes, 2015; Le Pelley, Pearson, Griffiths, & Beesley, 2015). Le Pelley et al. (2015) used an additional singleton search task (Theeuwes, 1991, 1992) to associate the color of a distractor with reward. This reward-signaling color was only bound to one of the distractors. The target was defined as a shape singleton, and participants had to report the orientation of the line segment within the target shape. Results showed that reaction times (RTs) were slower when the distractor associated with high reward was presented, suggesting that the distractor associated with high reward captured attention.

Although these recent findings suggest that reward-associated features need not be task-relevant for reward prediction to affect visual attention, a limitation of recent studies on VDAC using non-target-defining features is the lack of a test phase. Most previous studies on task-relevant stimulus features (e.g., target-defining features) associated with reward reported that VDAC was observed when this stimulus feature no longer predicted reward outcome (e.g., Anderson et al., 2011a, 2011b). For example, Anderson et al. (2011b) showed that the target-defining feature (i.e., color) associated with reward in color search (reward learning phase) implicitly captured attention even when this stimulus feature was presented as a distractor and predicted no reward outcome in shape search (test phase). However, Le Pelley et al. (2015) examined VDAC with features not defining the target only in the learning phase; thus, it is unknown whether this VDAC also occurs when reward-associated features no longer predict reward outcome. The VDAC observed with features not defining the target may be substantially weaker than, or qualitatively different from, the VDAC observed with target-defining features, such that the attentional capture disappears once the feature-reward association is eliminated.

Many studies have found that visual perceptual learning (VPL) occurs selectively for task-relevant features. For instance, when two stimuli are presented in a display, VPL is only observed for the stimulus to which participants voluntarily directed attention (Ahissar, 2001; Shiu & Pashler, 1992). However, VPL also occurs for task-irrelevant features. Stimulus features presented in the visual field evoked reinforcement signals, and these signals induced VPL even when the stimulus features were task irrelevant and unperceived (Seitz, Kim, & Watanabe, 2009). Such task-irrelevant VPL raises the possibility that VDAC is also mediated by associations between rewards and task-irrelevant features. As described in detail below, the series of experiments in the current study provides evidence for this possibility.

The purpose of this paper is to explore whether task-irrelevant stimulus-reward association elicits VDAC, even when the stimulus features no longer predict reward outcome. To test this hypothesis, we used the flanker task (Eriksen & Eriksen, 1974) in the reward-learning phase. In the flanker task, a target is flanked by distractor stimuli (e.g., BBABB: A is a target and Bs are distractors). When participants identify the target stimulus (i.e., respond “A or B” in this example), response inhibition occurs as a result of processing the distractors. An important characteristic of the flanker task is the lack of spatial uncertainty in the target location. The target is defined by “location,” not color, so color manipulation in the flanker task provides a redundant cue for target selection. Participants can use color to select the target, but they can perform the task without using color. Thus, using the flanker task with reward manipulation allows us to test whether VDAC requires a feature that is task relevant. To evaluate VDAC in the test phase, we used a visual search task with a color singleton, which is known to be sensitive to VDAC and where there is spatial uncertainty about the target’s location.

In Experiment 1, we associated reward with a target letter in reward learning, following most previous studies (e.g., Anderson et al., 2011a, b). In Experiment 2, color was bound to the distractor letters (Le Pelley et al., 2015). If VDAC requires that features associated with reward belong to the target in the flanker task, then VDAC will not be observed in Experiment 2. In contrast, if a feature serving as a redundant cue for target localization is sufficient to produce VDAC, then VDAC will be observed after training in the flanker task. These two experiments constitute a replication and extension of recent studies reporting VDAC with features not defining the target. They replicate VDAC with the flanker task, and extend the notion that VDAC occurs in a test phase without feature-reward associations.

Experiments 3 and 4 are critical tests of whether task-irrelevant reward learning induces VDAC in the subsequent test phase. Given that Experiments 1 and 2 added color to a particular stimulus (i.e., target or distractor) in the flanker task, this color would be useful for target discrimination, similar to previous studies. In Experiment 3, both a target and distractors were bound to a reward-predicting color to test the possibility that the reward-associated color improves discrimination of target and distractor. Furthermore, in Experiment 4, colored rectangular frames surrounding flanker letters were associated with reward. In these experiments, colors are irrelevant to performing the flanker task. If VDAC in the absence of reward only occurs when features associated with reward are task-relevant, then task-irrelevant stimulus-reward association will not induce VDAC. In contrast, if task relevance is not necessary to produce VDAC, then VDAC will be observed during subsequent visual search, in which this color predicts no reward outcome.

Experiment 1

Method

Participants

Twenty-six undergraduate and graduate students from Kyoto University, Kyoto, Japan, participated in Experiment 1 (eight females, mean age 20.2 years). All reported normal or corrected-to-normal visual acuity and color vision. Informed consent was obtained from all participants. Before the experiment, participants were told that rewards earned in the task were imaginary and were given course credit for their participation. Data from three participants were removed from the analysis due to accuracy of more than two standard deviations below the group mean.
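The exclusion rule described above (accuracy more than two standard deviations below the group mean) can be illustrated with a minimal sketch. The function name, the data layout, and the use of the sample standard deviation (n − 1 denominator) are assumptions made for illustration; the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def exclude_low_accuracy(accuracy_by_participant, n_sd=2.0):
    """Split participants into (kept, excluded) by an accuracy cutoff.

    accuracy_by_participant: dict mapping a participant ID to proportion correct.
    A participant is excluded when accuracy falls more than n_sd standard
    deviations below the group mean. The sample SD is an assumption here.
    """
    values = list(accuracy_by_participant.values())
    cutoff = mean(values) - n_sd * stdev(values)
    kept = [p for p, acc in accuracy_by_participant.items() if acc >= cutoff]
    excluded = [p for p, acc in accuracy_by_participant.items() if acc < cutoff]
    return kept, excluded


# Hypothetical accuracies, not data from the study.
example = {"p1": 0.95, "p2": 0.96, "p3": 0.94, "p4": 0.97,
           "p5": 0.95, "p6": 0.93, "p7": 0.96, "p8": 0.60}
kept, excluded = exclude_low_accuracy(example)
```

With these made-up values only the participant at 60 % accuracy falls below the cutoff.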

Apparatus

A PC (running Windows) equipped with Matlab software and the PsychToolbox extensions was used to present stimuli on a CRT monitor (DELL D1626HT). Participants were tested individually in a dark room, and the viewing distance was approximately 57 cm. Responses were collected using a keyboard (z and m keys).

Stimuli and procedure

The experiment was divided into training and test phases that lasted approximately 1 h. Practice trials preceded the experimental trials (training and test phases). Practice trials were identical to the experimental trials.

Training phase:

A conventional reward-learning paradigm was used (e.g., Anderson, 2013). There were five experimental blocks of 48 trials in the training phase (total of 240 trials; 80 trials for each condition: high-reward, low-reward, and control). On each trial (Fig. 1a), a white fixation cross (0.5° × 0.5° visual angle) was displayed in the center of a uniform black background for a variable interval selected at random, with equal probability, from 400, 500, or 600 ms. This was followed by a flanker display that contained a colored target letter (CIE; red = 0.6 0.3 21.2, green = 0.3 0.6 71.5, or blue = 0.2 0.1 7.2) in the center of the display (visual angle; 1.1° × 1.4°), flanked to the left and right by identical white letters of equal size (1.4° center-to-center). In the training phase, twenty letters (excluding “IJMQWZ”), divided into five groups of four letters based on visual similarity, served as targets and flankers. In each experimental block, a four-letter group was randomly selected, and two letters were assigned to each response key (“z” or “m”). For instance, in a block including A, B, C, and D, if the central target was “A” or “B” participants had to respond using the “z” key, and if the target was “C” or “D” they had to respond using the “m” key. In congruent trials, the response mapping for the target and distractors was compatible (e.g., AABAA), but the target and distractors were always different letters (e.g., AAAAA was never shown). In incongruent trials, the response mapping for the target and distractors was incompatible (e.g., AACAA). Participants made a two-alternative response on target identity according to the response mapping that was explained at the beginning of each block. The flanker display was shown until participants made a response or the trial timed out (800 ms), followed by a black screen for 1000 ms. Thereafter, feedback was presented for 1500 ms.

When participants made a correct response, the feedback display indicated the earnings for the trial and the cumulative total amount earned. When the response was incorrect or no response was made, the feedback display said “Incorrect.” Participants were informed that monetary reward was given randomly and was not related to their performance on the task. In the high-reward condition, high-reward feedback (+100 yen) was given on 75 % of the trials and low-reward feedback (+10 yen) on the remaining 25 % of the trials; for the low-reward condition, the probabilities of high and low reward were reversed. In the control condition, four asterisks (i.e., “****”) were presented as feedback on all correct trials so that no reward information was conveyed.
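The probabilistic feedback schedule described above can be sketched as follows. This is only an illustration: the function name, the returned strings, and the use of Python's `random.Random` are assumptions, and the cumulative total shown to participants is omitted for brevity.

```python
import random

HIGH_YEN, LOW_YEN = 100, 10
# Probability of receiving the high amount on a correct trial, per condition.
P_HIGH = {"high": 0.75, "low": 0.25}


def feedback(condition, correct, rng):
    """Return (message, yen earned) for one training trial.

    condition: "high", "low", or "control".
    correct: whether the response was correct (a timeout counts as incorrect).
    rng: a random.Random instance, passed in so the draw is reproducible.
    """
    if not correct:
        return "Incorrect", 0
    if condition == "control":
        return "****", 0  # no reward information in the control condition
    amount = HIGH_YEN if rng.random() < P_HIGH[condition] else LOW_YEN
    return f"+{amount} yen", amount


rng = random.Random(2014)  # arbitrary seed
high_draws = [feedback("high", True, rng)[1] for _ in range(10000)]
low_draws = [feedback("low", True, rng)[1] for _ in range(10000)]
prop_high_in_high = high_draws.count(HIGH_YEN) / len(high_draws)
prop_high_in_low = low_draws.count(HIGH_YEN) / len(low_draws)
```

Under this schedule the expected earning per correct trial is 0.75 × 100 + 0.25 × 10 = 77.5 yen in the high-reward condition and 0.25 × 100 + 0.75 × 10 = 32.5 yen in the low-reward condition.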

Fig. 1

Sequence of trial events in Experiment 1. (a) Training phase. A flanker display was followed by a display showing correct feedback and reward earnings. (b) Test phase. A singleton search display was followed by a display showing correct feedback. Note. 100 yen was approximately equal to $1 as of 2014.

The target color was counterbalanced across participants. In the high-reward condition, the target letter was red for half the participants, and green for the other half. In the low-reward condition, the target was green for half the participants, and red for the other half. In the control condition, the target was blue in all trials. Participants were instructed to respond “as quickly as possible while minimizing errors” and that color was irrelevant to the task.
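As an illustration of how the congruent and incongruent flanker displays of the training phase could be assembled, the following sketch builds one five-letter string per trial. The function name, the convention that the first two letters of a group map to “z” and the last two to “m”, and the random choice of target are assumptions based on the A/B/C/D example in the text; color assignment is left out.

```python
import random


def flanker_trial(group, congruent, rng):
    """Build one flanker display from a four-letter group such as "ABCD".

    Assumed mapping: group[0], group[1] -> "z"; group[2], group[3] -> "m".
    Flankers always differ from the target; on congruent trials they share
    the target's response key, on incongruent trials they map to the other key.
    """
    key_of = {group[0]: "z", group[1]: "z", group[2]: "m", group[3]: "m"}
    target = rng.choice(group)
    if congruent:
        candidates = [c for c in group
                      if c != target and key_of[c] == key_of[target]]
    else:
        candidates = [c for c in group if key_of[c] != key_of[target]]
    flanker = rng.choice(candidates)
    return {
        "display": flanker * 2 + target + flanker * 2,  # e.g., "AABAA"
        "target": target,
        "flanker": flanker,
        "correct_key": key_of[target],
        "flanker_key": key_of[flanker],
    }


rng = random.Random(1)  # arbitrary seed
congruent_trial = flanker_trial("ABCD", True, rng)
incongruent_trial = flanker_trial("ABCD", False, rng)
```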

Test phase:

There were four experimental blocks of 48 trials in the test phase (total of 192 trials; 64 trials for each condition: high-reward, low-reward, and control). Each trial (Fig. 1b) started with the same fixation as the training phase. Then, a search display consisting of a fixation cross surrounded by six numbers (1.3° × 1.6° visual angle) placed at equal intervals along an imaginary circle with 6° radius was presented. The six numbers in the display included one target and five distractors. All distractors were the same number, and the target was a different number from the distractors. The target and four of the five distractors were white, whereas the remaining distractor was red, green, or blue (randomized). The numbers used for the target and distractors ranged from 2 to 9. One number was assigned to the target and another to the distractors (e.g., 4 for target and 5 for distractors), such that in some trials all numbers (i.e., both target and distractors) were odd or even, and in other trials one number was odd and the other was even. The target number was counterbalanced, and the distractor number was selected randomly in all trials. Participants were asked whether the singleton target was an odd (“z” key) or even (“m” key) number. Unlike the learning phase, no reward feedback was given during the test phase, but corrective feedback (“Correct” or “Incorrect”) was provided.
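The geometry of the search display described above can be sketched as follows: six items at equal angular intervals on a circle of 6° radius, one target, and exactly one colored distractor. The function names, the starting angle of the first position, and the dictionary layout are assumptions for illustration only.

```python
import math
import random


def search_display(target, distractor, singleton_color, rng,
                   radius_deg=6.0, n_items=6):
    """Return a list of item dicts for one test-phase search display.

    Coordinates are in degrees of visual angle relative to fixation.
    The first position is placed at angle 0 (an assumption; the text only
    states that items were equally spaced along the circle).
    """
    target_idx = rng.randrange(n_items)
    distractor_idxs = [i for i in range(n_items) if i != target_idx]
    colored_idx = rng.choice(distractor_idxs)  # the color singleton is never the target
    items = []
    for i in range(n_items):
        angle = 2 * math.pi * i / n_items
        items.append({
            "digit": target if i == target_idx else distractor,
            "color": singleton_color if i == colored_idx else "white",
            "x": radius_deg * math.cos(angle),
            "y": radius_deg * math.sin(angle),
            "is_target": i == target_idx,
        })
    return items


def correct_key(target):
    """Odd targets require the "z" key, even targets the "m" key."""
    return "z" if target % 2 == 1 else "m"


items = search_display(4, 5, "red", random.Random(7))  # arbitrary seed
```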

The sequence of events in the test trials was identical to the training phase, except for the search display. The search display remained until a response was made or a maximum of 1500 ms was attained. Participants were instructed to respond “as quickly as possible while minimizing errors” and that color was irrelevant to the task. Numbers and target and distractor locations were randomized across trials. Only correct responses were analyzed, and all RTs more than three standard deviations above or below the mean of their respective conditions for each participant were excluded from analysis.
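The RT screening rule stated above (correct trials only; RTs more than three standard deviations from the mean of their condition, per participant, removed) can be sketched as below. The function name and trial format are hypothetical, and the use of the sample standard deviation and of a single trimming pass are assumptions, since the text does not state these details.

```python
from statistics import mean, stdev


def trim_rts(trials, n_sd=3.0):
    """Return {condition: [kept RTs]} for ONE participant.

    trials: list of dicts with keys "condition", "rt" (ms), and "correct".
    Incorrect trials are dropped first; then RTs farther than n_sd sample
    standard deviations from the condition mean are dropped (single pass).
    """
    by_condition = {}
    for t in trials:
        if t["correct"]:
            by_condition.setdefault(t["condition"], []).append(t["rt"])
    trimmed = {}
    for cond, rts in by_condition.items():
        if len(rts) < 2:  # SD is undefined for fewer than two trials
            trimmed[cond] = list(rts)
            continue
        m, s = mean(rts), stdev(rts)
        trimmed[cond] = [rt for rt in rts if abs(rt - m) <= n_sd * s]
    return trimmed


# Hypothetical trials: 30 ordinary RTs, one extreme RT, one error trial.
trials = [{"condition": "high", "rt": 500 + (i % 5), "correct": True}
          for i in range(30)]
trials.append({"condition": "high", "rt": 1500, "correct": True})
trials.append({"condition": "high", "rt": 480, "correct": False})
kept = trim_rts(trials)
```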

Results

The averaged accuracy across all participants was high in both training and test phases (Table 1). Consistent with previous studies, we did not observe an effect of reward in the training phase (Anderson, Laurent, & Yantis, 2011a, 2012, 2013; Chelazzi et al., 2014; Yokoyama et al., 2015). A two-way repeated-measures ANOVA on RT with reward (high, low, control) and congruency (congruent, incongruent) as within-subjects factors revealed only a significant main effect of congruency (F(1, 22) = 88.13, p < .001, ηp² = .800), with no main effect of reward (F(2, 44) = 0.11, p = .896, ηp² = .005) and no interaction (F(2, 44) = 0.75, p = .478, ηp² = .033; Table 2).
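For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The helper below is a small sketch of that identity (the function name is ours), applied to the training-phase values reported above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Training-phase values reported for Experiment 1.
congruency = partial_eta_squared(88.13, 1, 22)
reward = partial_eta_squared(0.11, 2, 44)
interaction = partial_eta_squared(0.75, 2, 44)
```

The three values agree with the reported .800, .005, and .033 to three decimal places.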

Table 1 Mean accuracy (%) and standard deviation from the training and test phases
Table 2 Mean reaction times (in ms) and standard error of the mean (SEM) from the training phase

Trials in the test phase were classified according to training phase condition. Figure 2 shows the test phase RTs. To test the effects of reward on attention, we conducted an ANOVA with reward (high, low, control) as a within-subjects factor. There was a significant main effect of reward (F(2, 44) = 6.26, p = .004, ηp² = .222). Post-hoc comparisons using Ryan’s method revealed a significant difference between the high-reward and control conditions (t(22) = 3.43, p = .001), and between the high-reward and low-reward conditions (t(22) = 2.47, p = .018), but the difference between the low-reward and control conditions was not significant (t(22) = 0.97, p = .339). There was no significant difference in error rates between reward conditions (F(2, 44) = 0.44, p = .647, ηp² = .020). These results suggest that the high-reward distractor preferentially captured attention because RTs were slower in the high-reward than in the low-reward and control conditions.

Fig. 2

Mean reaction times from the test phase in Experiment 1. Error bars show standard error of the mean (SEM). * p < .05, ** p < .01

In Experiment 1, a task-irrelevant feature associated with reward captured attention in a subsequent visual search task. This indicates that a cue that was redundant for target localization elicited VDAC. In Experiment 1, color was bound to the target letter, which may be a necessary condition for VDAC. That is, value-based capture may only occur when a feature associated with reward is bound to the target. Alternatively, it may be sufficient for the feature to be redundant for target localization. This was tested in Experiment 2 where colors were bound to distractors, not targets, but were still redundant for target localization.

Experiment 2

Method

Participants

Twenty-four undergraduate and graduate students at Kyoto University participated in Experiment 2 (14 females, mean age 19.4 years). All reported normal or corrected-to-normal visual acuity and color vision. Informed consent was obtained from all participants. Participants were informed that compensation for their participation was not dependent on task performance and were given course credit for their participation. Data from four participants were excluded from the analysis due to accuracy of more than two standard deviations below the group mean.

Apparatus, stimuli, and procedure

The apparatus, design, and procedure were identical to Experiment 1, with one exception. In the training phase, the target was white and the flankers were red, green, or blue (opposite of Experiment 1). In the high-reward condition, flanker letters were red for half the participants, and green for the other half. The test phase was the same as Experiment 1.

Results

Similar to Experiment 1, we conducted a two-way ANOVA on RTs in the training phase with reward (high, low, control) and congruency (congruent, incongruent) as within-subjects factors. There was a significant main effect of congruency (F(1, 19) = 60.74, p < .001, ηp² = .762), but no main effect of reward (F(2, 38) = 0.37, p = .697, ηp² = .019) and no interaction (F(2, 38) = 0.75, p = .480, ηp² = .038). Figure 3 shows the test phase RTs. A one-way repeated-measures ANOVA with reward (high, low, control) as a within-subjects factor revealed a significant main effect of reward condition (F(2, 38) = 5.01, p = .012, ηp² = .209). Post-hoc comparisons using Ryan’s method confirmed significant differences between the high-reward and control conditions (t(19) = 3.00, p = .005), and between the high-reward and low-reward conditions (t(19) = 2.37, p = .023). The difference between the low-reward and control conditions was not significant (t(19) = 0.64, p = .528). There were no significant differences in error rates between the reward conditions (F(2, 38) = 1.13, p = .333, ηp² = .056). The results of Experiment 2 are qualitatively similar to those of Experiment 1. Therefore, even though reward was associated with distractors in the learning phase, this reward association captured visual attention.

Fig. 3

Mean reaction times from the test phase in Experiment 2. Error bars show standard error of the mean (SEM). * p < .05, ** p < .01

In Experiment 2, we observed VDAC even when colors were not bound to the target, indicating that a feature does not have to be bound to a target to elicit value-based capture. Experiments 1 and 2 suggest that cues that discriminate between targets and distractors without spatial uncertainty are sufficient for VDAC. In Experiment 3, we examined whether the feature needs to discriminate between target and distractors. To this end, the target and distractors were the same color in each flanker task trial, so that color did not facilitate discrimination between targets and distractors. If the stimulus features associated with reward need to be useful for target-distractor discrimination, attentional capture will not be observed. In contrast, if it is not necessary for the rewarded feature to discriminate between targets and distractors to produce VDAC, value-based capture will be observed.

Experiment 3

Method

Participants

Twenty-four undergraduate and graduate students at Kyoto University participated in Experiment 3 (six females, mean age 21.1 years). All self-reported normal or corrected-to-normal visual acuity and color vision. Informed consent was obtained from all participants. At the end of the experiment, all participants received a book coupon for 1000 yen for their participation. Data from two participants were excluded from the analysis due to accuracy of more than two standard deviations below the group mean.

Apparatus, stimuli, and procedure

The apparatus, design, and procedure were identical to Experiment 2, with one exception. In the training phase, the target and flankers were both red, green, or blue. In the high-reward condition, both the target and flanker letters were red for half the participants, and green for the other half. The test phase was the same as Experiment 2.

Results

We conducted a two-way ANOVA on RTs in the training phase with reward (high, low, control) and congruency (congruent, incongruent) as within-subjects factors. There were significant main effects of reward (F(2, 42) = 5.56, p = .007, ηp² = .210) and congruency (F(1, 21) = 42.58, p < .001, ηp² = .670), but no interaction (F(2, 42) = 1.67, p = .200, ηp² = .074). Post-hoc comparisons for reward using Ryan’s method revealed significant differences between the high-reward and control conditions (t(21) = 2.95, p = .005), and between the low-reward and control conditions (t(21) = 2.82, p = .007), but no significant difference between the high-reward and low-reward conditions (t(21) = 0.13, p = .897). These results indicated that RTs for target identification were faster in both reward conditions (i.e., high and low) than in the control condition, suggesting that the color associated with reward captured attention.

Figure 4 shows the test phase RTs. A one-way repeated-measures ANOVA on RTs revealed a significant main effect of reward (F(2, 42) = 6.06, p = .005, ηp² = .224). To further assess the effect of reward, paired-sample t-tests using Ryan’s method were conducted. There were significant differences between the high-reward and control conditions (t(21) = 3.18, p = .003), and between the high-reward and low-reward conditions (t(21) = 2.82, p = .007), but the low-reward and control conditions failed to show a significant difference (t(21) = 0.36, p = .721). There were no significant differences in error rates between reward conditions (F(2, 42) = 0.67, p = .518, ηp² = .031). In Experiment 3, RTs were slower in the high-reward condition compared to the low-reward and control conditions, indicating that distractors associated with high reward captured attention. This suggests that reward learning occurs even when the stimulus feature associated with reward does not facilitate target-distractor discrimination.

Fig. 4

Mean reaction times from the test phase in Experiment 3. Error bars show standard error of the mean (SEM). * p < .05, ** p < .01

Experiments 1, 2, and 3 in the present study indicate that reward-associated features do not need to discriminate targets and distractors to produce reward learning. Even though they are unnecessary for target-distractor discrimination, reward-associated features may need to be bound to task-relevant objects (i.e., letters). This hypothesis was tested in Experiment 4. To this end, the target and distractors were surrounded by colored rectangular frames, and the color of the frame was associated with reward in the flanker task (reward learning phase). The stimulus feature associated with reward (i.e., frame color) was completely task irrelevant. Attentional capture due to reward learning should not be observed if reward learning requires that the reward is associated with a feature of a task-relevant stimulus. In contrast, attentional capture may be observed if reward does not have to be associated with a feature of a task-relevant stimulus.

Experiment 4

Method

Participants

Eighteen undergraduate and graduate students at Kyoto University participated in Experiment 4 (seven females, mean age 20.7 years). All reported normal or corrected-to-normal visual acuity and color vision. Informed consent was obtained from all participants. After the experiment participants were given a book coupon for 1000 yen for their participation. Data from one participant were excluded from the analysis due to accuracy of more than two standard deviations below the group mean.

Apparatus, stimuli, and procedure

The apparatus, design, and procedure were identical to Experiment 3, with the following exceptions. In the training phase, a colored rectangular frame (visual angle; 6.0° × 12.2°) was presented around both the target and flanker letters, which were white. The color of the rectangular frame was magenta (CIE; 0.3 0.2 28.5), yellow (CIE; 0.4 0.5 42.9), or cyan (CIE; 0.2 0.3 40.5), and all colors were equiluminant (16.5 cd/m2). Unlike Experiment 3, reward feedback was not based on the color of the target or distractor, but on the color of the rectangular frame. Participants were divided into three groups. For one-third of participants, the rectangular frame was magenta in the high-reward condition, yellow in the low-reward condition, and cyan in the control condition. For another third of participants, the rectangular frame was yellow in the high-reward condition, cyan in the low-reward condition, and magenta in the control condition. For the remaining third of participants, the rectangular frame was cyan in the high-reward condition, magenta in the low-reward condition, and yellow in the control condition. The test phase was the same as Experiment 3.
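The three-group counterbalancing of frame colors described above amounts to a cyclic rotation of the color list across conditions. The sketch below reproduces that mapping; the function name and the zero-based group index are assumptions, and how individual participants were allocated to groups is not stated in the text.

```python
COLORS = ["magenta", "yellow", "cyan"]
CONDITIONS = ["high", "low", "control"]


def frame_colors(group):
    """Map each reward condition to a frame color for group 0, 1, or 2.

    Each group shifts the color list by one position, so every color serves
    once as the high-reward, low-reward, and control color across groups.
    """
    return {cond: COLORS[(group + i) % len(COLORS)]
            for i, cond in enumerate(CONDITIONS)}


assignments = [frame_colors(g) for g in range(3)]
```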

Results

A two-way repeated-measures ANOVA on RTs in the training phase with reward (high, low, control) and congruency (congruent, incongruent) as within-subjects factors revealed a significant main effect of congruency (F(1, 16) = 42.16, p < .001, ηp² = .725), but no main effect of reward (F(2, 32) = 1.51, p = .236, ηp² = .086) and no interaction (F(2, 32) = 1.67, p = .205, ηp² = .094). In the test phase, trials were classified based on training phase condition. Figure 5 shows the test phase RTs. We conducted an ANOVA on these RTs with reward as a within-subjects factor. Similar to Experiment 3, there were significant differences between RTs based on reward condition (F(2, 32) = 4.52, p = .019, ηp² = .220). Post-hoc comparisons using Ryan’s method revealed significant differences between the high-reward and control conditions (t(16) = 2.74, p = .010), and between the high-reward and low-reward conditions (t(16) = 2.44, p = .021). However, we did not find a significant difference between the low-reward and control conditions (t(16) = 0.30, p = .763). There were no significant differences in error rates between reward conditions (F(2, 32) = 0.22, p = .805, ηp² = .014). In summary, in Experiment 4 we also found that target selection was impaired when a colored additional singleton distractor was previously associated with high reward, even though it was the color of the task-irrelevant rectangular frame that predicted reward during the training phase. These results suggest that the effect of reward learning occurs even when features of task-irrelevant stimuli are associated with reward.

Fig. 5

Mean reaction times from the test phase in Experiment 4. Error bars show standard error of the mean (SEM). * p < .05, ** p < .01

General discussion

In the present study, we examined whether a task-irrelevant stimulus feature associated with reward induces VDAC. In Experiment 1, the color of the target in the flanker task was associated with reward during the learning phase. In a subsequent visual search task (test phase), RTs were slower when the stimulus features were previously associated with high reward. Thus, VDAC occurred even though the reward-associated feature did not define a target in the learning phase. The results of Experiment 2 were qualitatively similar even though reward was associated with distractor stimuli in the flanker task (reward learning phase). In Experiment 3, both the target and distractors were the same color in the flanker task, so color information associated with reward could not be effective for target selection during reward learning. Nevertheless, VDAC was observed. Finally, VDAC was induced in Experiment 4, even though the reward-associated feature (color of the rectangular frame surrounding the letters) was task-irrelevant in the training phase. Taken together, these findings indicate that attentional capture induced by reward learning occurs even when the stimulus features associated with reward are task irrelevant.

The present study provides the first evidence that an association between stimulus features and reward is formed even when a task-irrelevant feature predicts reward, and that this association continues to affect attention after the feature no longer predicts reward. In most previous studies on reward-attention associations, stimulus features related to reward learning defined the target (e.g., Anderson et al., 2011a, b). In other studies, features that did not define a target (i.e., distractor features) were also associated with reward during the reward learning phase (Le Pelley et al., 2015). Although these findings indicated that stimulus-reward associations affect visual selective attention even when features not defining a target predict the reward outcome, the effect of reward learning was not examined in a subsequent test phase. In the current study, we used the flanker task during reward learning, and VDAC was observed even when the task-irrelevant features were associated with only the target or only the distractors (Experiments 1 and 2). Moreover, VDAC was produced not only when the reward-associated features were irrelevant to target discrimination (i.e., the color of the letters), but also when they were completely irrelevant to performing the flanker task (i.e., the color of the rectangular frame surrounding the letters; Experiments 3 and 4). In visual search tasks, it is well known that physically salient distractors capture visual attention (Theeuwes, 1992; Wei & Zhou, 2006; Yantis & Jonides, 1984). Because the visual search task in our test phase was similar to those used in previous studies, pop-out effects should have occurred in our study. The RT delays we observed could have been caused by attentional capture and delayed attentional disengagement due to the reward manipulation. Our findings extend previous research and indicate that reward learning also occurs when stimulus features associated with reward are task irrelevant.

Our findings of attentional capture as a consequence of reward learning with task-irrelevant features are consistent with previous studies (Anderson, 2013; Chelazzi et al., 2013) in many respects. First, similar to previous studies, colored targets and distractors can be associated with reward during the learning phase (Della Libera & Chelazzi, 2009; Della Libera, Perlato, & Chelazzi, 2011; Le Pelley et al., 2015). Second, the magnitude of reward in the learning phase modulated value-driven attentional capture in the test phase. Previous studies suggested that the magnitude of reward modulates task performance when the reward-associated features predict information about reward magnitude (e.g., Anderson et al., 2011a, 2012). For example, Anderson et al. (2011b) showed that the effect of reward learning on attentional capture (in a visual search task) was dependent on the magnitude of the reward, and larger attentional capture effects were observed in trials containing stimuli previously associated with high compared to low reward. This finding was replicated in the current study. Furthermore, to assess whether the magnitude of VDAC was modulated by the learning condition (e.g., the differences between the target-reward association and distractor-reward association), we conducted a mixed ANOVA with experiment (1, 2, 3, and 4) as a between-subjects factor and reward (high, low, control) as a within-subjects factor. The results showed no significant interaction (F(6, 156) = 0.21, p = .973, ηp² = .008), indicating that the magnitude of VDAC did not change substantially according to learning conditions.
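
The degrees of freedom of the interaction reported above follow from the design of the mixed ANOVA. The sketch below illustrates this arithmetic under the assumption of a standard mixed design (one between-subjects factor, one within-subjects factor) with no sphericity correction applied to the degrees of freedom. The function names are hypothetical, and the total sample size it returns is an inference from the reported degrees of freedom, not a figure stated in this section.

```python
def mixed_anova_df(n_groups: int, n_levels: int, n_participants: int) -> dict:
    """Degrees of freedom (effect, error) for a standard two-factor mixed ANOVA.

    n_groups: levels of the between-subjects factor (here, experiments)
    n_levels: levels of the within-subjects factor (here, reward conditions)
    n_participants: total number of participants across all groups
    """
    between_effect = n_groups - 1
    between_error = n_participants - n_groups
    within_effect = n_levels - 1
    within_error = between_error * within_effect
    return {
        "between": (between_effect, between_error),
        "within": (within_effect, within_error),
        "interaction": (between_effect * within_effect, within_error),
    }


def implied_sample_size(interaction_error_df: int, n_groups: int, n_levels: int) -> int:
    """Invert error df = (N - n_groups) * (n_levels - 1) to obtain N."""
    return interaction_error_df // (n_levels - 1) + n_groups


# Four experiments (between) by three reward conditions (within), error df = 156.
n_total = implied_sample_size(156, n_groups=4, n_levels=3)
print(n_total)
print(mixed_anova_df(4, 3, n_total)["interaction"])
```

With four experiments and three reward conditions, the interaction has (4 − 1) × (3 − 1) = 6 numerator degrees of freedom, which matches the reported F(6, 156).
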

Third, the feature-reward association persisted when the task switched from a flanker task (reward learning phase) to a visual search task (test phase). This is also consistent with previous studies showing that effects of stimulus-reward associations learned in one task transfer to another task (Anderson et al., 2012). Finally, the current study observed VDAC in a test phase in which the stimulus features previously associated with reward no longer predicted reward. Some recent studies reporting VDAC with features not defining targets (Le Pelley et al., 2015) showed effects only in the learning phase, while the reward association was still in place, leaving open the possibility that the effect is qualitatively different from, or substantially weaker than, the VDAC reported in the test phase with task-relevant features (e.g., Anderson et al., 2011a, b). In the present study, VDAC was still observed in an unrewarded test phase even when task-irrelevant stimulus features had been associated with reward, suggesting that VDAC with task-irrelevant stimulus-reward associations is qualitatively similar to that with task-relevant reward learning. Our findings support many previous studies and expand our understanding of the mechanism involved in creating stimulus-reward associations.

The training phase of Experiment 2 in the present study was analogous to the paradigm of Le Pelley et al. (2015). Le Pelley et al. (2015) showed that an effect of reward was observed when the distractor color predicted the reward outcome in the training phase. However, the current study did not obtain a reward effect in the training phase of Experiment 2. One reason for the inconsistent results could be the difference in task design. In the present study, the color bound to distractors predicted the reward outcome in the flanker task (Eriksen & Eriksen, 1974) during reward learning. Given that Le Pelley et al. (2015) showed that reward-associated features (i.e., the color of the distractor) impaired target selection in visual search, the reward-signaling flankers could have induced an RT delay due to attentional capture in the current study. However, as mentioned before, the color of the distractors might have helped target discrimination (e.g., RT facilitation) because only the distractors were colored. Hence, the design in the present study would be insensitive to the reward effect, because these two factors (i.e., RT delay by attentional capture and RT facilitation due to target discrimination) cancelled each other out during reward learning. Another possibility is the number of trials. Le Pelley et al. (2015) conducted ten blocks of 40 trials (total 400 trials) in their Experiment 1 and 36 blocks of 40 trials (total 1728 trials) in Experiment 2. In the present study, participants completed five blocks of 48 trials (total 240 trials), far fewer than in the previous study. In such a short training phase (i.e., 240 trials), the learning effect may emerge gradually over trials (we obtained a reward effect in the training phase only in Experiment 3) but may not reach statistical significance (also see Anderson et al., 2011a). Moreover, the instructions that we gave participants in the training phase could have influenced the effect of reward learning. The current study emphasized accuracy in reward learning because only correct responses were followed by a reward outcome, which may have led participants to place a higher value on accuracy than on reaction time.

Previous studies have shown that stimuli that are important for survival or wellbeing are given high attentional priority (Hodsoll, Vinding, & Lavie, 2011; Most, Chun, Widders, & Zald, 2005; Most, Smith, Cooter, Levy, & Zald, 2007; Williams, Moss, Bradshaw, & Mattingley, 2005). For example, Hodsoll et al. (2011) found that facial expressions (e.g., angry faces, happy faces) were more likely to capture visual spatial attention than neutral faces in visual search even when facial expression was not a target-defining feature. Although the target was defined by face sex (male or female) and facial expression was not relevant for target selection, facial expressions biased attentional priority. It is reasonable to direct attention to highly salient objects in a situation because humans are not able to recognize all information in an environment at once (Corbetta & Shulman, 2002; Yokoyama, Ishibashi, Hongoh, & Kita, 2011). Recent research has suggested that reward association learning induces an attentional bias to originally neutral stimuli (e.g., Anderson et al., 2011a, b). For example, target-defining features (i.e., color) that were associated with reward in one trial capture attention in subsequent trials when this stimulus feature is presented as a distractor (e.g., Hickey et al., 2010b). In addition, VDAC occurs even when the features associated with reward in a learning phase do not predict reward outcome in a subsequent test phase (e.g., Anderson et al., 2011a, b). VDAC occurs flexibly in a variety of tasks (e.g., Anderson, 2013; Chelazzi et al., 2013; Hickey et al., 2010a; Lee & Shomstein, 2014).

Reward association learning raises some issues regarding the relationship between attentional priority and VDAC. Reward learning in VDAC can be viewed as an experimental setting that simulates the acquisition of attentional priority in the everyday environment. The finding that monetary rewards increase attentional effort and result in higher accuracy than symbolic rewards (Hübner & Schlösser, 2010) is consistent with our intuitive sense of attentional priority acquisition, in that important information explicitly draws attention. Other studies showing that the effect of reward learning is observed when reward is not associated with a target-defining feature (Failing & Theeuwes, 2014) and that the association between reward and stimulus features is learned implicitly (Anderson, 2014, 2015; Anderson et al., 2013) suggest that attentional priority may be acquired implicitly without a direct link to a common everyday task. The current study goes one step further, showing that VDAC occurs even when reward is associated with task-irrelevant stimulus features. This raises the possibility that attentional priority may be acquired via implicit extraction of contingencies in contextual information.

Visual perceptual learning (VPL) occurs for both task-relevant and task-irrelevant features (e.g., Ahissar, 2001; Sasaki, Nanez, & Watanabe, 2010; Seitz et al., 2009; Shiu & Pashler, 1992). In task-relevant VPL, when two stimuli are presented in the display, VPL is only observed for the stimulus that participants voluntarily attended (Ahissar, 2001; Shiu & Pashler, 1992). In task-irrelevant VPL, the stimulus features presented in the visual field evoke reinforcement signals that induce VPL even when these features are task-irrelevant and unperceived (Seitz et al., 2009). These findings indicate that both task-demanding focused attention and stimulus-driven reinforcement signals are involved in VPL (Sasaki et al., 2010). However, it has been unclear whether reward learning and VPL rely on similar mechanisms, because in previous reward-learning studies rewards were always associated with task-relevant stimulus features (e.g., Anderson et al., 2011a, b). The current study showed effects of reward learning using task-irrelevant stimulus features, indicating that task relevance is not necessary for inducing VDAC. Similar to VPL, reward learning that induces VDAC may occur through reinforcement signals regardless of task relevance. A remaining question is whether VDAC is exclusively mediated by the mechanism underlying task-irrelevant VPL, or whether both task-relevant and task-irrelevant VPL mechanisms contribute to VDAC. Comparable effect sizes across the four experiments in the current study are consistent with the former hypothesis, but a more systematic and quantitative evaluation of effect sizes is necessary in future studies.

It has been suggested that VDAC involves a mechanism different from those of goal-directed and stimulus-driven attention (e.g., Anderson, 2013). Previous research on attentional control indicated that dorsal frontal-parietal brain regions are related to goal-directed attention, and ventral temporal-parietal brain regions are involved in stimulus-driven attention (Corbetta & Shulman, 2002; Serences et al., 2005). Recent studies have examined the neural mechanisms underlying value-based attention using functional magnetic resonance imaging (fMRI) and event-related potentials (ERPs) (Anderson, Laurent, & Yantis, 2015; Qi, Zeng, Ding, & Li, 2013). Anderson et al. (2015) found that the tail of the caudate nucleus and extrastriate cortex were activated when a stimulus previously associated with reward was presented. Qi et al. (2013) found that the N2pc component was observed earlier in trials that contained the reward-associated distractor. However, in these studies a task-relevant stimulus feature was associated with reward, and it is unclear whether the same process is involved when task-irrelevant features are associated with reward. This should be investigated in future studies.

It is important to note that in the current study reward was symbolic, and participants received a fixed amount of monetary compensation or course credit regardless of task performance. In contrast, most studies on reward association learning provide participants with monetary compensation that is proportional to the reward points earned in the experiment (e.g., Anderson et al., 2011a, b; Chelazzi et al., 2014; Failing & Theeuwes, 2014; Hickey et al., 2010a, b; Krebs et al., 2010). In those studies, variable reward was used to maintain task motivation. Hübner and Schlösser (2010) examined whether monetary reward increases attentional effort in a flanker task (Eriksen & Eriksen, 1974) and found that accuracy was higher in the performance-contingent monetary reward condition than in the fixed reward condition (i.e., symbolic reward), suggesting that monetary reward increases task motivation. However, reaction times and flanker effects (i.e., RT delay induced by incongruent flankers) did not differ between conditions, indicating that reward type (performance-contingent vs. fixed) does not modulate effects of selective attention. This is consistent with the results of the current study, in which VDAC was observed in all experiments. Therefore, a fixed amount of monetary compensation or course credit, as used in the current study, does not appear to modulate VDAC.

This study examined whether VDAC occurs when reward-associated stimulus features are task irrelevant, using a flanker task during reward learning. Across the four experiments, we found that VDAC with task-irrelevant stimulus-reward associations is comparable to the previously reported VDAC with task-relevant stimuli, in that the effect persisted in a subsequent no-reward situation. Thus, features that are unrelated to the learning task induce VDAC as long as they are associated with reward.