Introduction

It is well established that attentional processing can be affected by recent experiences. For instance, the “priming of pop-out” effect (for a review, see Kristjansson & Campana, 2010) shows that visual search for a salient target is more efficient when the target’s salient feature (e.g., its color) is repeated over trials (e.g., Maljkovic & Nakayama, 1994; for other ways in which selection history affects attentional processing, see Awh, Belopolsky, & Theeuwes, 2012). While there is consensus that attention can be modulated through experience, the extent to which future plans affect attention is less clear. A well-known example suggesting that future plans affect attention is the contingent capture effect (e.g., Folk, Remington, & Johnston, 1992), which shows that a salient distractor interferes more with target processing when its salient feature is shared with the expected target (i.e., both distractor and target are salient because of their color) than when it is not (i.e., the target is salient because of its color, whereas the distractor is salient because of a sudden onset). In other words, when participants keep a plan to respond to a future target in working memory, attention is biased toward distractors that share features with that target.

More recently, other studies have examined the interactions between working memory and attention. These studies typically combine several tasks. First, participants are asked to remember a certain stimulus. This is followed by an attention task in which this stimulus appears as an irrelevant feature. Finally, participants are asked to recall or recognize the to-be-remembered stimulus (for an overview, see Soto, Hodsoll, Rotshtein, & Humphreys, 2008). Typically, the memory task affects attentional processing. In visual search tasks, responses are faster when targets are presented within a to-be-remembered stimulus (e.g., Dowd & Mitroff, 2013; Soto, Humphreys, & Heinke, 2006) and slower when the to-be-remembered stimulus is presented as a distractor (e.g., Soto & Humphreys, 2007; Soto, Humphreys, & Heinke, 2006) or as a singleton distractor (e.g., Olivers, Meijer, & Theeuwes, 2006; but for some critical notes, see Downing & Dodds, 2004, and Woodman & Luck, 2007). Similarly, when a to-be-remembered item is used as a cue in a visual probe task (e.g., Downing, 2000; Schwark, Dolgov, Sandry, & Volkman, 2013), reaction times are faster when a target is presented at the same location as the to-be-remembered cue than when it is presented at the opposite location.

Importantly, because the to-be-remembered items were irrelevant and uninformative in the attention tasks of most of these experiments, these effects do not seem to be due to top-down attentional control. In addition, the stimuli that were held in working memory did not differ in visual salience from the other cues or distractors (with the exception of Olivers et al., 2006), but nevertheless managed to capture attention.

These findings suggest that bottom-up attentional effects can be modulated by one’s future plans (i.e., the plan to respond to a certain target). However, previous experiences are generally confounded with our plans for the future, both in real life and in the lab. This confound is also an important limitation of the studies discussed above. For instance, in Folk et al.’s study, participants experienced targets and distractors across large numbers of trials, leading some to suggest that these effects are due to participants’ selection history rather than their future plans (Belopolsky, Schreij, & Theeuwes, 2010; see also Awh et al., 2012). Similarly, participants in the visual search tasks (e.g., Olivers et al., 2006; Soto et al., 2006; Soto & Humphreys, 2007) had experience both in attending to the to-be-remembered color and in performing the memory task. In Downing’s (2000) study, the to-be-remembered stimuli were novel on each trial, but participants still had ample time to study these stimuli before the beginning of the trial.

Because of this confound, the question of whether future plans alone (i.e., without prior execution of the plans) can induce an early, automatic attentional bias remains unanswered. This is somewhat surprising, because prominent theories of attention emphasize the role of attention in the planning of future actions (e.g., Allport, 1987; Treisman, 1996). Attention is often assumed to serve as a mechanism that selects relevant information from the enormous quantity of stimuli we encounter in our environment, because our cognitive system is too limited to process all this information (e.g., Broadbent, 1958). However, Allport (1987) suggested that this selectivity might be necessary not because of our processing limitations, but because of the limited nature of our capacities to act upon our environment. When participants plan to perform a specific action, it is beneficial if they can efficiently process information relevant for performing this action and ignore irrelevant information (for an overview, see Hannus, Neggers, Cornelisen, & Bekkering, 2004). In addition, “common coding” accounts suggest that action and perception share the same representational domain (e.g., Hommel, Müsseler, Aschersleben, & Prinz, 2001; Stoet & Hommel, 2002). This implies that an action can prime a stimulus, and vice versa. More importantly, these theories also suggest that action planning influences attentional processing of action-related stimuli (e.g., Fagioli, Ferlazzo, & Hommel, 2007; Fagioli, Hommel, & Schubotz, 2007). Although these theories assume a strong relationship between attention and action, the influence of planning a new action (i.e., without prior execution) on attentional processing remains unexplored.

While the impact of future plans on attention remains to be determined, research on response-compatibility effects has demonstrated that a plan for a future action or task that is based merely on the implementation of instructions can lead to automatic response activations (Cohen-Kdoshay & Meiran, 2007; De Houwer, Beckers, Vandorpe, & Custers, 2005; Everaert, Theeuwes, Liefooghe, & De Houwer, 2014; Liefooghe, De Houwer, & Wenke, 2013; Liefooghe, Wenke, & De Houwer, 2012; Meiran & Cohen-Kdoshay, 2012; Meiran, Pereg, Kessler, Cole, & Braver, 2015; Theeuwes, Liefooghe, & De Houwer, 2014; Waszak, Pfister, & Kiesel, 2013; Wenke, Gaschler, & Nattkemper, 2007; Wenke, De Houwer, De Winne, & Liefooghe, 2015). A common assumption in this line of research is that the cognitive system can prepare itself for a future task on the basis of instructions, without any practice or experience (Meiran, Cole, & Braver, 2012). Such instruction-based preparation leads to the formation of a functional plan in working memory, which guides performance when needed, but also biases performance when the plan is irrelevant.

In a procedure that is highly relevant to our research, Liefooghe et al. (2012, 2013) asked participants to perform two tasks. First, participants received instructions for an inducer task that consisted of pressing a specific button when a green letter appeared (e.g., if Q, press left; if P, press right). Before performing this task, participants performed a diagnostic task in which they needed to decide whether letters were presented upright or in italics (e.g., press left for italics, press right for upright). After an unpredictable number of diagnostic trials, the green inducer-task probe appeared. Diagnostic trials were congruent when the diagnostic response matched the response assigned to the letter in the inducer task (e.g., a Q in italics, which required a left press on both tasks) and incongruent when the two tasks required opposite responses (e.g., an upright Q, which required a right press in the diagnostic task but was assigned a left press in the inducer task). Reaction times were faster on congruent trials than on incongruent trials, an effect referred to as the instruction-based congruency effect (IBCE).

It is important to note three striking features of this effect. First, none of the stimuli had salient features that made them stand out (i.e., bottom-up processes cannot account for these results). Second, attending to specific letters was not beneficial in the diagnostic task, suggesting that top-down processes cannot account for these findings either. Third, new stimuli were used for each run, which means that the effect could not be driven by experience. In sum, the IBCE is due merely to planning a future action on the basis of instructions. Studies also showed that the formation of a functional plan seems to be crucial for the IBCE to occur: when participants merely needed to remember the instructions without performing the task, the IBCE disappeared. Finally, and of major interest for the present purposes, the IBCE is not restricted to situations in which the inducer and diagnostic task share the same responses. For instance, Wenke et al. (2007) observed an IBCE between an inducer task, which instructed a left or right response to a letter stimulus, and a diagnostic task in which participants needed to judge the size of a stimulus. Importantly, in the diagnostic task, letter stimuli from the inducer task were presented at either the left or the right side of fixation. Even though both the identity and the location of these stimuli were irrelevant in the diagnostic task, they still affected behavior. This raises the question of the extent to which task-irrelevant instructions bias earlier processing stages, such as attention.

We used a variant of the procedures of Liefooghe et al. (2012) and Downing (2000) to examine two research questions. First, we tested whether attention is biased toward new stimuli (S) that are paired with an instructed future response (R). On the basis of the assumption that attention functions as a selection-for-action mechanism, we hypothesized that preparing to perform the S-R task should automatically affect attentional processing, even if (1) participants have no history with the stimuli; (2) the stimuli are not salient (i.e., will not “pop out” because of low-level visual features); and (3) participants do not have the goal to attend to these stimuli during the current task. Second, we examined whether this bias would still be present if instructions did not specify an S-R association but an association between two stimuli (an S-S association). If attention is mainly a selection-for-action mechanism, one would expect that attention is not biased when stimuli are paired with other stimuli instead of with a specific action, because learning an S-S association should not trigger preparatory processes for performing a response.

In line with the procedure developed by Liefooghe et al. (2012, 2013), we asked participants to perform an inducer task and a diagnostic task. Participants first viewed the instructions for the inducer task. In “Experiment 1”, they were presented with four object names, the first two of which were target objects that were associated with a response (e.g., “balcony → press A” and “bomb → press P”; see Fig. 1 for an example). The other two object names were not paired with a response (control objects). Subsequently, participants performed the diagnostic task, a version of the visual probe task that required the identification of a probe (the letter E or the letter F), a task that is commonly assumed to measure a bias in the automatic, early allocation of attention (e.g., Bradley, Mogg, Falla, & Hamilton, 1998; Posner, Snyder, & Davidson, 1980). The probe was preceded by the presentation of (irrelevant) pictures of two objects (one inducer-target object and one inducer-control object) that were referred to in the instructions for the inducer task. This yielded two types of visual probe trials: congruent trials, in which the probe appeared at the same location as the target object picture, and incongruent trials, in which the probe appeared at the opposite location. After a number of visual probe trials, one of the two target object pictures of the inducer task was presented in the center of the screen, and participants were required to give the corresponding response. It is crucial to note that the instruction screen contained only a symbolic (verbal) representation of the stimuli that were used in the visual probe task. This means that, in contrast to the study of Downing (2000), participants did not have any previous experience with the visual configuration of the stimuli, excluding the possibility that selection history could affect attentional processing.

Fig. 1 Outline of a run of the inducer task and the diagnostic task

In “Experiment 2”, the task was the same, apart from two aspects. First, in the instruction screen, the first two words were now paired with a color word (e.g., “balcony → blue”; “bomb → green”). Second, at the end of each run, one of the two target object pictures appeared in either blue or green, and participants needed to indicate whether the color was correct (i.e., in line with the information in the instruction screen) or not. Thus, in this experiment, stimuli were not paired with a specific response, which excludes the possibility that response preparation would bias attention toward the instructed stimuli. We expected that our findings would be in line with the idea of attention as a selection-for-action mechanism (e.g., Allport, 1987), in the sense that we expected an attentional bias for target objects when these objects were associated with a response (“Experiment 1”) but not when they were associated with another stimulus (i.e., a color; “Experiment 2”).

Experiment 1

Method

Participants

We tested 46 participants who were paid 10 euros or received one course credit for their participation. All participants were native Dutch-speakers. The study was conducted in accordance with the principles expressed in the Declaration of Helsinki.

Stimuli and materials

We selected 196 pictures from the Snodgrass and Vanderwart (1980) picture set to use as targets in the inducer task and as irrelevant cues in the visual probe task. Severens, Van Lommel, Ratinckx, and Hartsuiker (2005) performed a study in which participants were asked to name these pictures as quickly as possible. To avoid ambiguity, we selected only pictures for which at most three different names were given. Targets in the visual probe task were the letters E and F.

Participants were tested in a spacious room in which four computers were set up, separated by partitions. Between one and four participants were tested in each session. They were seated in front of a laptop PC with a 17-in. color monitor at a distance of approximately 45 cm. After giving informed consent, they performed the experiment. Stimulus presentation and response registration were controlled with the E-Prime software package (Schneider, Eschman, & Zuccolotto, 2002a, b). Responses were recorded with a standard AZERTY keyboard.

Procedure

The experiment consisted of 32 runs (see Fig. 1). A run started with the presentation of the instruction screen, on which four words were presented in 18-point Courier New font. The first word was paired with the instruction “press left” (the A key) and the second word with the instruction “press right” (the P key). The other two words were not paired with an instruction. Participants continued by pressing the space bar. This was followed by the presentation of a 16-point fixation cross for 1000 ms. Two boxes were presented 1.4 cm above and below fixation; each box was 4.8 cm high and 6 cm wide. Next, two object pictures (one of a target object and one of a control object) appeared for 500 ms, each inside one of the boxes. The boxes then remained blank for 30 ms, after which the probe (the letter E or F in 18-point Courier New font) appeared either at the location of the target object (congruent trial) or at the opposite location (incongruent trial). The probe remained on screen until a response was given. After an incorrect response, the word “FOUT” (“wrong”) appeared on the screen for 500 ms. Each trial was followed by an inter-trial interval that lasted between 250 and 500 ms, during which the fixation cross and the empty boxes remained on the screen. Half of the runs consisted of 8 visual probe trials; the other half consisted of 16 visual probe trials. After the last visual probe trial of a run, one of the two target objects was shown in the center of the screen until the correct response was given. After an incorrect response, the word “FOUT” was presented for 500 ms. The run ended with a blank screen that was presented for 250 to 500 ms. After completing 32 runs, participants were thanked for their participation. All stimuli were presented in white against a black background; the drawings were black on a white background.
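For readers who wish to reimplement the design, the following sketch summarizes the structure of one run in Python. The experiment itself was programmed in E-Prime, so the function, variable names, and randomization details below are illustrative assumptions; only the timing parameters, trial types, and response keys are taken from the procedure described above.

```python
import random

# Timing parameters taken from the Procedure section (in ms).
FIXATION_MS = 1000   # fixation cross
PICTURES_MS = 500    # target-object + control-object pictures
BLANK_MS = 30        # blank boxes between pictures and probe
FEEDBACK_MS = 500    # "FOUT" feedback after an error
ITI_RANGE_MS = (250, 500)

def build_run(target_objects, control_objects, n_probe_trials):
    """Assemble one run: a list of visual probe trials plus one inducer probe."""
    trials = []
    for _ in range(n_probe_trials):          # 8 or 16 per run
        congruent = random.choice([True, False])
        target_loc = random.choice(["above", "below"])
        trials.append({
            "target_object": random.choice(target_objects),
            "control_object": random.choice(control_objects),
            "target_object_location": target_loc,
            # The probe (E or F) appears at the target-object location on
            # congruent trials and at the opposite location otherwise.
            "probe": random.choice(["E", "F"]),
            "probe_location": target_loc if congruent
                              else ("below" if target_loc == "above" else "above"),
            "congruent": congruent,
            "iti_ms": random.randint(*ITI_RANGE_MS),
        })
    # Each run ends with one inducer probe: a target object shown centrally
    # until the instructed key (A = left, P = right) is pressed.
    return {"probe_trials": trials, "inducer_probe": random.choice(target_objects)}
```

For example, build_run(["balcony", "bomb"], ["chair", "lamp"], 8) would return one full run; the control-object names in this call are purely hypothetical.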

Results

We excluded the data of two participants. One participant made too many errors in the inducer task (proportion correct: M = .85), which deviated more than 2.5 standard deviations from the group mean (M = .97, SD = .04). Another participant made too many errors in the visual probe task (M = .88), which deviated more than 2.5 standard deviations from the group mean (M = .95, SD = .03).

We excluded trials for which participants gave an incorrect response on the inducer task (3% of the data) and trials for which participants’ reaction times were more than 2.5 SD slower than their mean reaction time per trial type (congruent vs. incongruent; 2% of the data). Our analysis of the proportion of correct responses on the visual probe task did not reveal any effect, F(1, 43) = 1.42, MSE = .0002477, p = .24, η² = .03. Accuracy was very high for both congruent (M = .95, SD = .03) and incongruent (M = .96, SD = .03) trials. Our analysis of the visual probe reaction times, however, yielded a significant congruency effect, F(1, 43) = 13.92, MSE = 127.2, p < .001, η² = .24. Participants responded faster on congruent trials (M = 533 ms, SD = 92) than on incongruent trials (M = 542 ms, SD = 93).
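As an illustration, the exclusion criteria and the congruency comparison reported above can be expressed in a few lines of Python (pandas/SciPy). This is a sketch only: the data frame and its column names ('subject', 'trial_type', 'rt', 'probe_correct', 'inducer_correct') are hypothetical, and the reported analysis was a repeated-measures ANOVA, which for a single two-level factor is equivalent to the paired t test shown here (F(1, N − 1) = t²).

```python
import pandas as pd
from scipy.stats import ttest_rel

# df: one row per visual probe trial, with hypothetical columns
# 'subject', 'trial_type' ("congruent" / "incongruent"), 'rt' (ms),
# 'probe_correct' (bool), and 'inducer_correct' (bool; whether the
# inducer probe of that trial's run was answered correctly).

# Keep trials from runs with a correct inducer response and, for the
# RT analysis, trials with a correct probe response.
rt_data = df[df["inducer_correct"] & df["probe_correct"]].copy()

# Exclude RTs more than 2.5 SD slower than each participant's mean,
# separately per trial type.
def trim_slow(group):
    cutoff = group["rt"].mean() + 2.5 * group["rt"].std()
    return group[group["rt"] <= cutoff]

rt_data = rt_data.groupby(["subject", "trial_type"],
                          group_keys=False).apply(trim_slow)

# Per-participant mean RTs per trial type, and the congruency comparison.
means = rt_data.groupby(["subject", "trial_type"])["rt"].mean().unstack()
t, p = ttest_rel(means["congruent"], means["incongruent"])
print(means.mean(), t, p)
```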

Experiment 2

Method

Participants

We tested 39 participants who were paid 10 euros for their participation. All participants were native Dutch-speakers. The study was conducted in accordance with the principles expressed in the Declaration of Helsinki.

Stimuli, materials, and procedure

We used the same stimuli and materials as in “Experiment 1”, with the exception of one picture (of a bear) that could not be used because of a computer error. Furthermore, in the instruction screen, words were no longer paired with a response, but with a color: the first word was paired with the color “blue” and the second word with the color “green”. The visual probe trials remained exactly the same, but at the end of a run a picture corresponding to one of the first two words appeared on the screen in either blue or green, and participants were asked to indicate whether this color was correct (i.e., in line with the instruction screen) by pressing the A (correct) or the P (incorrect) key.

Results

We excluded trials for which participants gave a response that was more than 2.5 standard deviations slower than their mean reaction time per trial type (2% of the data) and trials for which participants gave an incorrect response on the inducer task (10% of the data). There was no effect of congruency on accuracy, F < 1. Participants were very accurate on both congruent (M = .96, SD = .02) and incongruent (M = .95, SD = .02) trials. The reaction time data also did not reveal a congruency effect, F < 1. Reaction times were equally fast on congruent (M = 535 ms, SD = 80) and incongruent (M = 536 ms, SD = 77) trials.

Comparison between experiments

The results of Experiments 1 and 2 were directly compared in a 2 (congruency) × 2 (experiment) mixed ANOVA with repeated measures on the first factor. For the reaction times, the main effect of congruency was significant, F(1, 81) = 9.47, MSE = 109, p < .01, η² = .10. In contrast, the main effect of experiment was not significant, F < 1; overall response speed was thus comparable in both experiments. Importantly, the two-way interaction was significant, F(1, 81) = 6.79, MSE = 109, p < .05, η² = .08. This interaction offers statistical support for the difference between the congruency effects observed in Experiments 1 and 2. For the accuracy data, none of the effects reached significance; the largest F value was observed for the two-way interaction, F(1, 81) = 2.24, MSE = .0002427, p = .14, η² = .03.
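The cross-experiment comparison can likewise be sketched as a 2 × 2 mixed ANOVA in Python. The snippet below uses the third-party pingouin package as one possible implementation (not the software used for the reported analyses); 'agg' is assumed to hold one row per participant and trial type, with per-participant mean RTs computed as in the earlier sketch plus an experiment label.

```python
import pingouin as pg

# agg: hypothetical long-format data frame with columns
# 'subject', 'experiment' (1 or 2), 'congruency'
# ("congruent" / "incongruent"), and 'rt' (per-participant mean, ms).
aov = pg.mixed_anova(data=agg, dv="rt",
                     within="congruency", subject="subject",
                     between="experiment")
# The 'Interaction' row tests whether the congruency effect
# differs between Experiments 1 and 2.
print(aov[["Source", "DF1", "DF2", "F", "p-unc", "np2"]])
```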

Discussion

Our results support the hypothesis that a future action plan can automatically influence early attentional processes. When participants learned to associate specific stimuli with specific responses on the basis of instructions, their attention was biased toward these stimuli (“Experiment 1”). Because participants had no prior experience in attending to these stimuli or in executing the S-R task, this finding indicates that future action plans can bias automatic attentional processing even in the absence of prior execution of those plans. In addition, we did not find an attentional bias toward stimuli that were associated not with a response but with a specific stimulus property (i.e., a color; “Experiment 2”). This strengthens the idea that attention functions as a selection-for-action mechanism (Allport, 1987).

It is important to stress that the present study goes beyond previous experiments that have examined the role of working memory in attention (e.g., Dowd & Mitroff, 2013; Downing, 2000; Downing & Dodds, 2004; Schwark, Dolgov, Sandry, & Volkman, 2013; Soto et al., 2006; Soto & Humphreys, 2007), because we show that items held in working memory can bias attention even when the effects of prior exposure are ruled out. It is, however, interesting to note that our results also seem to clash with this literature. The fact that the memory task used in Experiment 2 did not produce an attentional bias seems to contrast with the finding that merely holding an item in working memory for a subsequent memory test (e.g., Olivers et al., 2006; Soto & Humphreys, 2007) can modulate attention. However, our design differs from these previous studies in two important ways. First, the memory task was more complicated than a mere recognition or recall test: it involved pairing a color with a specific stimulus. Second, more than one item/feature needed to be remembered. In light of previous research showing that an increased working memory load attenuates the attentional bias toward to-be-remembered items (e.g., Downing & Dodds, 2004; Woodman & Luck, 2007), our findings are thus not so surprising.

This makes the attentional bias toward the S-R items in Experiment 1 (in which the memory load was substantial) even more striking and suggests that the underlying mechanisms of this effect differ from those involved in the studies on working memory and attention discussed above (e.g., Soto et al., 2008). Our findings suggest that a conceptual representation of an object that is related to a future action is consolidated in such a way that it can bias attention even when working memory is burdened with other instructions, tasks, and stimuli. This relates to an issue raised by, for instance, Woodman and Luck (2007), who suggested that working memory and task set are two different constructs that differentially affect attentional processing, and it supports the notion that our manipulations go beyond those used in previous work on working memory and attention.

Furthermore, our research provides three important additions to research on the automatic effects of instructions. First, a plan that is implemented on the mere basis of instructions can not only lead to automatic response activations (e.g., Everaert et al., 2014), but can also bias attention toward the stimuli represented in that plan. While circumstantial evidence already suggested this possibility (Wenke et al., 2007), the present study offers clear-cut evidence that instructed S-R mappings can indeed affect behavior at early stages of attention allocation. Second, the present study offers additional insight into the nature of the S-R associations that a plan includes. The same concepts were instantiated in different ways in the instructions of the inducer task (object words) and in the cues of the visual probe task (object pictures). Because attention was modulated across these different instantiations, instructed S-R associations most likely rely on higher-order conceptual stimulus representations, rather than on concrete perceptual representations. This conclusion converges with the proposal of Liefooghe et al. (2012), who suggested that instructed S-R associations include only conceptual response representations. Taken together, the present study and the study of Liefooghe et al. (2012) indicate that instructed S-R associations are most probably stripped of concrete stimulus and response features during the implementation of instructions. Finally, while research on instructions has for the most part focused on instructed S-R associations, the present results offer a first distinction between instructed S-R associations and instructed S-S associations, with only the former type being able to modulate attention. Such a distinction clearly calls for a systematic comparison of different types of instructions and the automatic effects they elicit.

To sum up, our study shows that attention and action are indeed strongly related, but, more importantly, it shows that prior experience is not a necessary condition for an action-induced attentional bias to occur. At the same time, we offer additional evidence that the automatic impact of instructions is probably much broader than previously thought.