Complex cognition is determined by our ability to briefly store information and keep it in an active and accessible state. However, the storage capacity of working memory (WM; Baddeley & Hitch, 1974; Cowan, 2001) is highly limited. For instance, estimates of visual WM suggest that only about three objects can be maintained at a time (e.g., Luck & Vogel, 1997). This capacity limitation requires that only information relevant to our current goals be stored, suggesting a strong relationship between selective attention and WM (see Gazzaley & Nobre, 2012, and Kiyonaga & Egner, 2013, for recent reviews). Accordingly, there is growing interest in how attentional processes support control over the contents of WM (e.g., Liesefeld, Liesefeld, & Zimmer, 2014).

One method to examine the role of attentional filtering for WM is to present relevant, to-be-stored visual stimuli along with distractors that should be prevented from being encoded into WM. Gazzaley and colleagues provided evidence that in such situations, early visual processing is modulated by both the enhancement of task-relevant and the suppression of task-irrelevant stimuli (Gazzaley, Cooney, McEvoy, Knight, & D’Esposito, 2005). Moreover, Vogel, McCollough, and Machizawa (2005) showed that when these filtering processes fail, irrelevant material gains access to WM. Crucially, filtering efficiency proved to be a critical source of individual differences in WM performance: Low-capacity individuals stored more irrelevant information than did high-capacity individuals. Thus, filtering ability seems to be an important factor in the efficient use of the limited workspace. This idea has been further elaborated by findings showing impaired filtering in relation to WM deficits in older age (Gazzaley et al., 2008; Gazzaley, Cooney, Rissman, & D’Esposito, 2005; Jost, Bryck, Vogel, & Mayr, 2011; Sander, Werkle-Bergner, & Lindenberger, 2011).

These and other findings provide converging evidence that attention functions as a “gatekeeper” for WM, and therefore as a critical determinant of WM efficiency (e.g., Awh, Vogel, & Oh, 2006). It is typically assumed that this gatekeeping function relies on templates of the currently relevant features that specify attentional filter settings (Bundesen, 1990; Bundesen, Habekost, & Kyllingsbæk, 2005; Desimone & Duncan, 1995). Thus, it is important to understand the functionality/constraints associated with filter settings. For example, real-world cognitive functioning requires not just static selection of goal-relevant information, but also the ability to flexibly adjust the filter settings to changing requirements.

Two different theoretical pathways might lead from demands on flexibly changing filter settings to reduced WM performance. First, the efficiency of static filter settings and the dynamics of switching such settings might rely on common resources. Furthermore, the robust relationship between individual differences in WM capacity and filter efficiency implies that the negative effects of flexible switching demands would be particularly strong for individuals with low WM capacity.

A second possibility is suggested by research on the performance costs that arise in task-switching situations. A major source of such costs is stimulus-induced interference due to long-term memory representations of the task sets that are used within the same experimental context (e.g., Mayr, Kuhns, & Hubbard, 2014; Rubin & Meiran, 2005; Waszak, Hommel, & Allport, 2003). Similarly, in WM situations that require different filter settings, proactive interference from the currently irrelevant setting may make efficient filtering difficult, even when the currently relevant setting is fully implemented.

In the present work, we examined to what degree the demands of flexibly changing attentional templates actually negatively affect filtering in WM, and if so, what the potential reasons for such a flexibility deficit might be. To explore these questions, we used a “filter-switching” paradigm. As in the filtering paradigm introduced by Vogel et al. (2005), participants had to encode objects of a particular color (e.g., red) and to ignore objects in other colors. However, which color was relevant switched randomly from trial to trial, requiring dynamic adjustments of the filter settings.

We were particularly interested in the degree to which the delay activity measured by means of event-related potentials (ERPs) would indicate increased representation of the irrelevant information in the filter-switching condition. Such an effect would suggest constraints on the ability to flexibly reset the gatekeeper to WM. Furthermore, because it has already been shown that filtering efficiency in static situations is related to WM capacity, it is important to know the degree to which the ability to flexibly change the filter is also related to WM capacity.

WM and filtering performance were measured by means of the change-detection task (Luck & Vogel, 1997; Phillips, 1974), in which a varying number of visual stimuli have to be stored (without manipulating them). After a retention interval of about 1 s, memory for the stored items is tested. Given that the capacity of visual WM is on average about three to four items, increasing the number of items in the memory display beyond this range leads to an overload, and hence to a decrease in WM performance. In addition, the number of stored items can be tracked by measuring the delay activity by means of ERPs, specifically the so-called contralateral delay activity (CDA).

The CDA is a sustained negative wave measured over the posterior cortex that is largest contralateral to the memorized hemifield. Its amplitude increases with the number of representations being held in visual WM, and reaches an asymptotic limit at each individual’s specific memory capacity (Vogel & Machizawa, 2004). Thus, it provides a measure of the contents of visual WM. More importantly, via a filtering paradigm in which both relevant and irrelevant information are presented together in a memory display (first described by Vogel et al., 2005), the CDA can also be used to track the extent to which irrelevant information is stored in WM. The important question here was whether the amplitude of the CDA would also increase with the number of irrelevant items. Therefore, we compared the amplitude of a distractor condition containing two relevant (i.e., targets) plus two irrelevant (i.e., distractors) objects with the amplitudes of conditions in which only relevant objects were presented, either two or four. The rationale is as follows: If in the distractor condition irrelevant objects are perfectly excluded from being stored and only the two relevant objects are maintained, then the CDA amplitude should be similar to the amplitude in the condition with only two relevant items. If, however, filtering is inefficient and two relevant plus two irrelevant items are stored, then the amplitude should be similar to that in the condition with four relevant objects. Consequently, the relative position of the distractor condition’s CDA amplitude in comparison to the two no-distractor conditions should serve as an indicator of filtering efficiency.

We explored to what degree filtering efficiency is reduced when the filter settings need to be adjusted. In Experiment 1, we compared filtering performance as measured with the CDA amplitude in pure and mixed blocks. In mixed blocks, two different selection criteria switched in random order, whereas in the pure blocks the selection criterion remained constant. To examine the relationship between WM capacity and filter-switching ability, we also assessed WM capacity with an independent measure. This allowed us to split participants into groups of high and low capacity. Our results from Experiment 1 will show that demands on flexible filtering actually do compromise the efficiency of filtering, and that this filter-switching effect also affects individuals with good filtering abilities.

The findings of Experiment 1 left open the question of why exactly frequent filter changes produce the filtering problems. As we indicated above, one possibility is that filtering efficiency per se and dynamic filtering might rely on shared resources. The second possibility is that the problems arise from the fact that filter settings relevant in the recent past remain potent sources of interference, such that distractors that match these settings are more difficult to filter out. To distinguish between these two possibilities, in Experiment 2 we added a second distractor condition in which distractors were presented in a color that was never relevant throughout the experiment.

Method

Participants

A total of 22 students of the RWTH Aachen University participated in Experiment 1, and another 28 took part in Experiment 2. All participants were healthy, had normal or corrected-to-normal vision, and gave informed consent. Thirteen of the participants were excluded from the analyses because of extensive eye movements or other artifacts in the electrophysiological measurements. The final samples of Experiments 1 and 2 comprised data from 16 (mean age of 25 years; 14 female, two male) and 21 (mean age of 23 years; 12 female, nine male) participants.

Stimuli, task, and procedure

Measurement of filtering efficiency in Experiment 1

On each trial, participants were presented with an array of red and/or blue rectangles (each 0.41° × 1.31° of visual angle) of varying orientations (45°, 90°, 135°, and 180°), and the task was to remember the orientations of only the objects in the relevant color. Set size (i.e., the number of relevant items, either two or four targets) and distractor presence (no distractors or two distractors) were manipulated orthogonally. This resulted in four different conditions, such that on half of the trials only relevant items were presented (i.e., in either red or blue), and on the other half distracting items were presented along with the task-relevant ones (i.e., the memory display contained blue and red objects).

The critical conditions for investigating filtering efficiency by means of the CDA were the conditions set size 2, set size 4, and set size 2 + 2 distractors (in the following referred to as the distractor condition). The number of targets in the distractor condition was equal to the number of targets in the set size 2 condition, whereas the total number of items in the distractor condition was equal to the number of targets in the set size 4 condition (see Fig. 1b). Filtering efficiency was indexed by the relative position of the distractor condition’s CDA amplitude in comparison to the two no-distractor conditions: If filtering distractors is highly efficient, such that only targets are stored, then the CDA amplitude in the distractor condition should be similar to the amplitude in the set size 2 condition. If, however, filtering is inefficient, and not only targets but also distractors are stored, then the amplitude should be more similar to that in the set size 4 condition. Note that a similar rationale cannot be applied to the set size 4 + 2 distractors condition, because the design did not include a no-distractor condition with the same total number of items (i.e., set size 6). Apart from that, the CDA amplitude increase usually reaches an asymptotic limit with WM capacity, which is around three or four items (see Vogel & Machizawa, 2004). As a consequence, in the above-capacity range, as in the set size 4 + 2 distractors condition, the CDA is not sensitive to differentiating between storing targets and storing distractors. The reason to nevertheless include this condition was to obtain equal numbers for the distractor and no-distractor trials and to encourage filtering (see also Jost et al., 2011). For the sake of completeness, ERPs and behavioral data from this condition are included in Fig. 1 and in Table 1 below.
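The relative-position logic described above can be illustrated with a small sketch. The normalized index used here (1.0 when the distractor amplitude equals the set size 2 amplitude, 0.0 when it equals the set size 4 amplitude) is an assumption made for illustration; the analyses reported in this article compare amplitude differences directly rather than a normalized score. The function name and the amplitude values are hypothetical.

```python
def relative_position(amp_set2, amp_distractor, amp_set4):
    """Locate the distractor amplitude between the two no-distractor amplitudes.

    Returns 1.0 if the distractor condition matches set size 2 (only targets
    stored) and 0.0 if it matches set size 4 (targets and distractors stored).
    Amplitudes are mean CDA amplitudes in microvolts (negative values).
    """
    span = amp_set4 - amp_set2
    if span == 0:
        raise ValueError("no set-size effect: the index is undefined")
    return (amp_set4 - amp_distractor) / span


# Hypothetical amplitudes, for illustration only (not data from the study).
efficient = relative_position(-1.0, -1.2, -2.0)    # close to set size 2
inefficient = relative_position(-1.0, -1.9, -2.0)  # close to set size 4
```

A value near 1 would correspond to the pattern expected for efficient filtering, and a value near 0 to the pattern expected when distractors are stored along with the targets.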

Fig. 1

Stimulus sequence, design, and results of Experiment 1. a Example of a distractor trial, in which only blue items are to be stored, as is indicated by the color of the cue presented in advance. Note that for investigating contralateral delay activity (CDA), a bilateral display is essential (see the text for details). Memory for stored items is tested with a single probe. The task is to decide whether the orientation of the probe has changed or not. b The three conditions for which amplitudes of the CDA are compared, in order to investigate filtering efficiency (shown here for “blue is relevant”). Critical is the amplitude of the distractor condition in comparison to the two no-distractor conditions; efficient filtering is indicated by a distractor condition amplitude similar to that in the two-target condition (set size 2). In contrast, an amplitude in the distractor condition that is almost as large as that in the set size 4 condition indicates that not only the targets but also the distractors are stored, and hence inefficient filtering. c Grand average event-related potential difference waves (contralateral minus ipsilateral) in pure and mixed blocks. Negative voltage is plotted upward. Filtering efficiency is worse in the mixed blocks, that is, when filter settings switch and hence need to be adjusted.

Table 1 Experiment 1: Performance in the change-detection task

The relevant color in each trial was indicated by a color cue presented in advance (see Fig. 1a). This color cue remained the same in the pure blocks, but switched randomly in mixed blocks. Note that we tried to keep factors that could affect filtering performance (such as the color of distractors and the order of pure and mixed blocks) constant across participants.

To measure the CDA, a bilateral display is essential. This means that on both sides of the fixation cross a complete memory array was presented (i.e., two 3.61° × 6.2° rectangular regions centered 2.8° to the left and right of the central fixation cross), but only the items in one hemifield were to be remembered. This was indicated by an arrow presented in advance (see Fig. 1a). The CDA, calculated as the amplitude difference between contralateral and ipsilateral activity, allows for isolating the lateralized effects of visual WM from nonspecific bilateral activity. Consequently, the CDA reflects maintenance in visual WM (Vogel & Machizawa, 2004).
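As a minimal sketch of the contralateral-minus-ipsilateral computation, assuming that mean amplitudes per electrode are already available for attend-left and attend-right trials: the function name, the data layout, the two-pair electrode subset, and the amplitude values below are illustrative assumptions, not the analysis code used in the study.

```python
# (left-hemisphere, right-hemisphere) electrode pairs; a subset for illustration.
PAIRS = [("PO7", "PO8"), ("P7", "P8")]


def cda_amplitude(attend_left, attend_right, pairs=PAIRS):
    """Mean contralateral-minus-ipsilateral amplitude in microvolts.

    attend_left / attend_right map electrode names to the mean amplitude in
    the time window of interest for trials with that hemifield relevant.
    """
    diffs = []
    for left, right in pairs:
        # Left hemifield relevant: the right hemisphere is contralateral.
        diffs.append(attend_left[right] - attend_left[left])
        # Right hemifield relevant: the left hemisphere is contralateral.
        diffs.append(attend_right[left] - attend_right[right])
    # Average across hemispheres and electrode pairs.
    return sum(diffs) / len(diffs)


# Hypothetical amplitudes, for illustration only.
attend_left = {"PO7": -1.0, "PO8": -2.0, "P7": -0.5, "P8": -1.5}
attend_right = {"PO7": -2.5, "PO8": -1.5, "P7": -1.0, "P8": 0.0}
cda = cda_amplitude(attend_left, attend_right)
```

Because activity that is equal over both hemispheres cancels in each subtraction, only the lateralized part of the signal contributes to the result.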

Each trial began with a 200-ms colored arrow cue presented above a fixation cross, which indicated both the relevant hemifield and the relevant color. After a variable interval of 200–400 ms, the memory array was presented for 200 ms, followed by a 900-ms retention interval. Memory for the stored items was tested with a single-item probe test array in which the probe was either identical to the object presented at the same location or had changed in orientation. Participants responded by pressing one of two buttons on a handheld gamepad (right for “change” and left for “no change”), and accuracy was stressed. After the response, an intertrial interval of 2 s followed.

The testing consisted of eight blocks alternating between pure and mixed conditions after every second block (i.e., pure, pure, mixed, mixed, etc.). Each pair of consecutive pure blocks contained one block in which red was the relevant color and one block in which blue was the relevant color. All participants started with two pure blocks.

Each block contained 128 trials with equal numbers (i.e., eight trials) for the combinations of experimental condition (i.e., number of relevant and irrelevant items), relevant side of the memory array, and match of the memory and test arrays. The trial sequence was random. Moreover, in the mixed-block trials, red and blue occurred as relevant colors equally often in each of the mentioned combinations (i.e., for four trials each). Altogether, 128 trials were run for each experimental condition and block type. Prior to the testing session, participants were familiarized with the task in a practice block.
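The trial counts above can be checked with a short sketch that builds one block from the stated factors (four conditions, two relevant sides, two match types, eight trials per combination). The condition labels, the function name, and the shuffling procedure are assumptions for illustration; the original randomization routine is not described in more detail.

```python
import itertools
import random

CONDITIONS = ["set2", "set4", "set2_plus_2d", "set4_plus_2d"]
SIDES = ["left", "right"]
MATCHES = ["change", "no_change"]


def build_block(block_type, pure_color="red", reps=8, seed=0):
    """Return a shuffled list of trial dicts for one 128-trial block."""
    trials = []
    for cond, side, match in itertools.product(CONDITIONS, SIDES, MATCHES):
        if block_type == "mixed":
            # Red and blue are relevant equally often (four trials each).
            colors = ["red", "blue"] * (reps // 2)
        else:
            # Pure block: the relevant color stays constant.
            colors = [pure_color] * reps
        for color in colors:
            trials.append({"condition": cond, "side": side,
                           "match": match, "relevant_color": color})
    random.Random(seed).shuffle(trials)  # random trial sequence
    return trials


mixed_block = build_block("mixed")
pure_block = build_block("pure", pure_color="blue")
```

With four blocks per block type and 32 trials per condition in each block, this yields the 128 trials per experimental condition and block type stated above.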

Measurement of filtering efficiency in Experiment 2

Here, only mixed blocks were realized; that is, the filter criterion randomly changed across trials between “red is relevant” and “blue is relevant”. Again, distractors were presented in the currently irrelevant color: blue distractors when red was relevant, and vice versa. Moreover, we here included another distractor condition, with green distractors (i.e., in a color that was never relevant). We expected that these distractors would be easier to ignore than the red and blue ones. Again, the distractor conditions contained two targets and two distractors (in the same color), and the distractor conditions were compared with no-distractor conditions that included either two or four targets (i.e., set size 2 and set size 4). The condition with four targets + two distractors (cf. Exp. 1) was not realized here. The size and orientation of the rectangles, as well as the stimulus presentation and timing, were similar to those aspects of Experiment 1.

The experiment consisted of 14 blocks with 64 trials each and equal numbers for all combinations of experimental condition, color, relevant side, and match of memory and test arrays (i.e., two trials each). Moreover, the possible transitions from one trial to the next were also completely balanced within a block for all Condition × Color combinations (64 possible transitions in the case of eight different Condition × Color combinations). Altogether, 224 trials were run for each experimental condition.

Estimation of WM capacity

The EEG part is ideally suited to capture filtering efficiency. However, due to only small set sizes and ceiling effects, it is suboptimal for estimating individual WM capacities. Because of this, and to obtain an independent measure of WM capacity, a standard, behavioral version of the change-detection paradigm (see Luck & Vogel, 1997) was run prior to the EEG part. Here, only relevant items were presented in varying numbers: two, four, six, or eight items. The task was to maintain the color of each object (squares 0.75° × 0.75° in size). The colors were randomly selected from a set of highly discriminable colors (red, green, blue, yellow, purple, black, and white). All stimuli were presented for 200 ms within a centered 6.2° × 6.2° region on a gray background and were followed by a retention interval of 900 ms. Trials were presented in three blocks, each containing 20 trials for each set size.

The WM capacity K was estimated with a standard formula (see Cowan, 2001; Pashler, 1988; Vogel & Machizawa, 2004), that is, K = S × (H − F), where S is the set size, H is the hit rate, and F is the false alarm rate. Set sizes 4, 6, and 8 were included in this measure.
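The standard capacity formula, usually written K = S × (H − F), can be sketched as follows. The text states only that set sizes 4, 6, and 8 entered the measure; averaging K across these set sizes is an assumption made here for illustration, and the hit and false alarm rates are hypothetical.

```python
def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Capacity estimate K = S * (H - F) for a single set size."""
    return set_size * (hit_rate - false_alarm_rate)


def mean_k(rates_by_set_size):
    """Average K across set sizes (assumed aggregation).

    rates_by_set_size maps set size S to a (hit rate, false alarm rate) pair.
    """
    ks = [cowan_k(s, h, f) for s, (h, f) in rates_by_set_size.items()]
    return sum(ks) / len(ks)


# Hypothetical rates, for illustration only (not data from the study).
example_rates = {4: (0.85, 0.15), 6: (0.70, 0.25), 8: (0.60, 0.30)}
example_k = mean_k(example_rates)
```

Subtracting the false alarm rate corrects the hit rate for "change" responses given when the item was not actually in memory, so K estimates the number of items held rather than raw accuracy.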

Electrophysiological recording and analysis

The EEG was recorded from 61 Ag/AgCl electrodes inserted into an elastic cap (Easycap, Brain Products, Munich, Germany) with predefined electrode positions, according to the 10–20 System. The electrodes were referenced to the nose tip. The horizontal electrooculogram (EOG) was recorded from two electrodes placed approximately 1 cm to the left and right of the external canthi of the eyes. The vertical EOG was recorded from an electrode mounted beneath the left eye and from electrode FP1. The left or right mastoid served as the ground (counterbalanced across participants), and impedances were kept below 7 kΩ. Signals were recorded with two 32-channel DC amplifiers (Brain Amps, Brain Products, Munich, Germany), sampled at 500 Hz and low-pass filtered at 250 Hz.

Data preprocessing and ERP averaging were run with the Brain Vision Analyzer software. The signals were filtered offline with a band-pass of 0.1–30 Hz (24 dB/oct) and a 50-Hz notch-filter. Epochs (starting 100 ms before the onset of the arrow cue and lasting until the end of the retention interval) containing eye movements, blinks, and other artifacts were excluded from further analysis.

Horizontal eye movements were rejected by means of a two-step procedure suggested by Luck (2014; see also Woodman & Luck, 2003). In the first step, trials with horizontal EOG amplitudes exceeding a threshold were detected and removed by means of a semiautomatic procedure. The amplitude criterion was initially set to 25 μV (which should capture eye movements of >1.5°), but individually adjusted by visual inspection of the single-trial waveforms such that clearly visible artifacts were rejected. Note that this criterion also leads to a rather high number of false alarms when the signal is noisy. In some participants, more than 50 % of the data were rejected with this criterion (i.e., because of too many eye movements and/or too much noise). These participants were excluded.

In the second step, we computed averaged horizontal EOG waveforms for the attend-left and attend-right trials, to assess whether the ERPs were contaminated with very small systematic eye movements. The residual activity was <2 μV on average, and <3 μV for each individual participant (which corresponds to an average eye movement of less than 0.2°; Lins, Picton, Berg, & Scherg, 1993).

After rejection of eye movements, the EEG was segmented into 1,100-ms epochs starting 100 ms before the onset of the memory array and covering the whole retention interval. Remaining artifacts were detected and excluded by means of a semiautomatic procedure with the following criteria: The maximum allowed voltage step between two adjacent sampling points was 20 μV, the maximum allowed absolute difference in a segment was 100 μV, and the minimum allowed difference within 100 ms was 0.5 μV. On average, 4 % and 3 % of the trials (maximal 12 %) were excluded in Experiments 1 and 2, respectively.
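The three rejection criteria can be sketched as a simple per-channel check. This is not the Brain Vision Analyzer implementation: the function name is hypothetical, and reading the "minimum allowed difference within 100 ms" as a low-activity criterion applied to sliding 100-ms windows (50 samples at 500 Hz) is an assumption. The test signals are synthetic.

```python
import math


def is_artifact(segment, sfreq=500.0, max_step=20.0, max_range=100.0,
                min_activity=0.5, window_ms=100.0):
    """Flag one channel of one epoch (values in microvolts) as an artifact."""
    # Criterion 1: voltage step between two adjacent sampling points.
    for a, b in zip(segment, segment[1:]):
        if abs(b - a) > max_step:
            return True
    # Criterion 2: absolute difference (max minus min) within the segment.
    if max(segment) - min(segment) > max_range:
        return True
    # Criterion 3 (assumed reading): too little activity in any 100-ms window.
    win = int(round(window_ms / 1000.0 * sfreq))
    for start in range(len(segment) - win + 1):
        window = segment[start:start + win]
        if max(window) - min(window) < min_activity:
            return True
    return False


# Synthetic 1,100-ms epochs at 500 Hz (550 samples), for illustration only.
clean = [5.0 * math.sin(2 * math.pi * 10 * i / 500.0) for i in range(550)]
flat = [0.0] * 550                             # dead channel
jump = list(clean)
jump[100] += 50.0                              # sudden voltage step
drift = [150.0 * i / 549 for i in range(550)]  # slow drift over the epoch
```

In practice a check of this kind would be applied to every channel of an epoch, and the epoch would be discarded if any channel was flagged.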

In Experiment 1, ERPs were averaged for each of the four conditions and for pure and mixed blocks separately; the waveforms were based on 91 trials, on average (minimum 73 trials). In Experiment 2, ERPs were averaged for each of the four conditions (aggregated across colors) and were based on 147 trials, on average (minimum 118). All averages were corrected with a 100-ms prestimulus baseline.

As in other studies (e.g., Jost et al., 2011; Vogel et al., 2005), the CDA was computed by subtracting ipsilateral from contralateral activity, averaged across hemispheres and across occipital to parietal electrode positions (i.e., O1/O2, PO7/PO8, PO3/PO4, P7/P8, P5/P6, P3/P4, and P1/P2). As can be seen in the figures below, the CDA followed the N2pc, started around 350–400 ms after the onset of the memory array, and lasted until the end of the retention interval. To keep the number of analyzed intervals when investigating filtering efficiency and distractor effects at a minimum, we predefined a time window of interest, in which the CDA proved to be sensitive to increasing numbers of stored items. To test for these set-size effects, analyses of variance (ANOVAs) were run separately for consecutive time windows of 100-ms length starting 400 ms after the onset of the memory array (i.e., 400–500 ms, 500–600 ms, etc.). Amplitudes were then aggregated across the significant time windows for further analyses. Note that only the no-distractor conditions, set size 2 and set size 4, were included in the superordinate ANOVAs. Thus, the time intervals of interest for testing filtering efficiency and distractor effects were defined independently of the distractor conditions. For Experiment 1, the analyses revealed a time window of interest between 500 and 900 ms (see the Results section for detail).

The main hypothesis of Experiment 1 was that more irrelevant material would be stored in mixed than in pure blocks. The amplitude of the distractor condition relative to the two no-distractor conditions served as an indicator for filtering efficiency. More precisely, we expected that in pure blocks the distractor condition would be more similar to the set size 2 condition, and that in mixed blocks the distractor condition would be more similar to the set size 4 condition. To test for these differences, an ANOVA was run for the mean amplitudes of the predefined time window (i.e., 500–900 ms) including the factors Condition (set size 2, distractor, set size 4) and Block Type (pure vs. mixed blocks), followed by t tests meant to test the distractor effects for significance and to directly compare the sizes of the distractor effects between block types.
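The direct comparison of distractor effects between block types amounts to a paired t test on difference scores. A minimal sketch using only the standard library, which returns the t statistic and degrees of freedom but no p value; the function names and the per-participant amplitudes are hypothetical.

```python
import math
import statistics


def paired_t(differences):
    """Return (t, df) for a one-sample t test of the differences against 0."""
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sem = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / sem, n - 1


def distractor_effect_contrast(pure, mixed):
    """Compare the distractor effect (distractor minus set size 2) across blocks.

    pure / mixed: per-participant (set size 2, distractor) amplitudes in
    microvolts, in the same participant order.
    """
    diffs = []
    for (p_set2, p_dist), (m_set2, m_dist) in zip(pure, mixed):
        pure_effect = p_dist - p_set2
        mixed_effect = m_dist - m_set2
        diffs.append(mixed_effect - pure_effect)
    return paired_t(diffs)


# Hypothetical amplitudes for three participants, for illustration only.
pure = [(-1.0, -1.2), (-0.8, -0.9), (-1.1, -1.4)]
mixed = [(-1.0, -1.8), (-0.9, -1.6), (-1.2, -1.7)]
t_value, df = distractor_effect_contrast(pure, mixed)
```

Because the CDA is a negativity, a negative t value in this sketch means that the distractor condition moved further away from the set size 2 condition in mixed blocks than in pure blocks.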

In Experiment 2, the most important contrast was between the two distractor conditions. We expected that distractors in a color that was never relevant could be filtered out more easily. Thus, we expected to find a smaller CDA amplitude for this condition than for distractors in a color that was only currently irrelevant. The time interval of interest for this analysis was between 600 and 900 ms. Again, this interval was predefined independently of the distractor conditions—that is, by means of significant set-size effects (set size 2 vs. set size 4) measured in 100-ms intervals. Note that we here included only five of the posterior electrode pairs (i.e., O1/O2, PO7/PO8, PO3/PO4, P7/P8, and P5/P6), because two pairs (P3/P4 and P1/P2) did not show substantial set-size effects (no significant differences in any of the 100-ms length time windows between 400 and 1,100 ms).

Results

Experiment 1

CDA and filtering efficiency

Figure 1c illustrates the CDAs in the retention interval. A general pattern here is that the amplitude of the CDA (i.e., the sustained negativity starting around 350–400 ms) increased with set size. ANOVAs run for time windows of 100-ms length revealed significant set-size effects (set size 2 vs. set size 4) between 500 and 900 ms, with F values ranging from a minimum of F(1, 15) = 8.20, p = .012, to a maximum of F(1, 15) = 27.95, p < .001. This result suggests that the CDA indexed the number of active representations in visual WM (see Vogel & Machizawa, 2004), and the significant time windows between 500 and 900 ms therefore were taken as the time of interest for further analyses. The comparison of the distractor with the no-distractor conditions (i.e., the relative position of the distractor condition’s amplitude) in this time window, therefore, can be utilized to measure whether and to what degree irrelevant material was unnecessarily stored.

The direct comparison of pure and mixed blocks revealed a filtering deficit when the filter criteria switched randomly from trial to trial. Under these conditions (see the CDAs in mixed blocks on the right side of Fig. 1c), the amplitude of the distractor condition was as large as in the set size 4 condition. In contrast, in pure blocks (left side of Fig. 1c), the distractor condition’s amplitude was more similar to the set size 2 amplitude and much smaller than in the set size 4 condition. Analyses run for the time window between 500 and 900 ms confirmed this pattern. An ANOVA with the factors Block Type (pure vs. mixed blocks) and Condition (set size 2, distractor, and set size 4) revealed a significant interaction, F(2, 30) = 3.49, p = .044, GG-ε = .993, which was found to be due to the different positions of the distractor conditions relative to the no-distractor conditions: The difference between distractor and set size 2 was significantly smaller in pure than in mixed blocks, t(15) = 1.85, p = .042, and the difference between distractor and set size 4 was larger in pure than in mixed blocks, t(15) = 2.54, p = .012 (both contrasts one-tailed).

Moreover, t tests comparing the distractor and no-distractor conditions separately for the two block types confirmed this pattern. For pure blocks, the distractor condition only marginally differed from the set size 2 condition, t(15) = 1.81, p = .091, but it was significantly different from the set size 4 condition, t(15) = 2.20, p = .044. In contrast, for mixed blocks we observed a significant amplitude increase for the distractor relative to the set size 2 condition, t(15) = 5.24, p = .001, but no significant difference between the distractor condition and the set size 4 condition, t(15) = 1.50, p = .154. This pattern of results suggests that filtering efficiency was weaker in mixed than in pure blocks, and thus that more of the irrelevant material was stored when filter settings switched.

WM performance in the change-detection task

WM performance was measured as the percentage of correct responses in the change-detection task (see Table 1). As with the ERPs, the focus was on the conditions set size 2, set size 4, and set size 2 + 2 distractors. An ANOVA revealed a main effect of block type, F(1, 15) = 5.15, p = .0385; a main effect of condition, F(2, 30) = 178.84, p < .0001, ε = .7235; and, importantly, a Block Type × Condition interaction, F(2, 30) = 3.27, p = .060, ε = .8717. Performance decreased with increasing number of targets: That is, performance was, in general, poorer for the set size 4 than for the set size 2 condition, F(1, 15) = 256.15, p < .0001. This effect reflects limited capacity and did not differ for pure and mixed blocks (F < 1).

Most importantly, a switching-induced filtering deficit was present in performance: In pure blocks, performance was as good in the distractor condition as in the set size 2 condition, but in mixed blocks it was significantly poorer than in the set size 2 condition, t(15) = 3.90, p = .0014. This pattern indicates that in mixed blocks, more of the irrelevant items occupied the limited storage space. Moreover, in accordance with the ERP results, performance differences between the distractor and set size 2 conditions were significantly smaller in pure than in mixed blocks, t(15) = 2.99, p = .005, and the difference between the distractor condition and the set size 4 condition was larger in pure than in mixed blocks, t(15) = 1.73, p = .052 (both contrasts one-tailed).

Taken together, the behavioral results also provided evidence that filtering is affected when filter settings switch, and that more of the irrelevant material is encoded in WM. Note that this pattern also held when the condition with four targets + two distractors was taken into account.

Individual differences in filtering

In previous studies, it has been shown that filtering efficiency is a critical source of individual differences in WM performance (e.g., Jost et al., 2011; Vogel et al., 2005): Individuals who score low in estimates of WM capacity are also less efficient in filtering out irrelevant material. It is therefore an interesting question whether the reduction in filtering efficiency caused by switching between filter settings is also larger for low-capacity than for high-capacity individuals. We, therefore, investigated individual differences in more detail.

The pattern of individual differences in pure blocks proved to be similar to those from previous studies (see Jost et al., 2011; Vogel & Machizawa, 2004; Vogel et al., 2005). The set-size effect (i.e., the amplitude increase from set size 2 to set size 4) significantly correlated with an individual’s WM capacity, r = .628, p = .005: Individuals with lower WM scores were less efficient at storing an increased number of items, as reflected in a smaller amplitude increase from set size 2 to set size 4. Moreover, filtering was also less efficient in low-capacity individuals. Filtering scores (the difference between the distractor condition and the set size 4 condition) were smaller for low-capacity than for high-capacity individuals, r = .449, p = .040 (see also the group differences on the left side of Fig. 2).

Fig. 2

Individual differences. CDAs are shown separately for individuals with high and low capacity. As in previous studies, in pure blocks the amplitude increase from set size 2 to set size 4 is larger for high- than for low-capacity individuals. Filtering efficiency is also better in high-capacity individuals; that is, the amplitude of the distractor condition lies between those for set sizes 2 and 4, whereas for low-capacity individuals, the amplitude of the distractor condition is almost as large as in the four-target condition, which indicates inefficient filtering. Most importantly, however, both groups suffer from switching between filter settings and show inefficient filtering in the mixed blocks. Thus, filter switching does not affect low-capacity individuals more than high-capacity individuals.

Most importantly, the filtering performance of both low- and high-capacity individuals was affected in the mixed blocks. This is illustrated in Fig. 2, showing the CDA amplitudes separately for individuals with low capacity (mean capacity of 1.92 items) and individuals with high capacity (mean capacity of 3.29 items). For both groups, the amplitude of the distractor condition was as large as the amplitude of the set size 4 condition, t(7) = 1.78, p = .118, and t(7) = 0.51, p = .629, for the low- and high-capacity groups, respectively. Moreover, also for both groups, the distractor condition significantly differed from the set size 2 condition, t(7) = 4.73, p = .001, and t(7) = 2.83, p = .013, for the low- and high-capacity groups, respectively.

If anything, the effect caused by filter switching was larger for high-capacity individuals. This was due to the fact that for low-capacity individuals, filtering efficiency was weak even in pure blocks, and the distractor condition here already differed significantly from the set size 2 condition, which indicates inefficient filtering, t(7) = 1.95, p = .047. In contrast, high-capacity individuals had more “room to move”. The difference from the set size 2 condition became larger, t(7) = 2.31, p = .028, from pure to mixed blocks, whereas the difference from the set size 4 condition became smaller, t(7) = 2.11, p = .037 (all contrasts one-tailed). These results clearly do not suggest that filter switching affects low-capacity individuals more than high-capacity individuals.

In order to quantify the extent to which reduced filtering in mixed blocks affected the capacity for relevant items, we predicted WM capacity in the mixed blocks by means of the CDA-filtering efficiency scores in the retention interval (see the significant correlation between WM capacity and filtering efficiency above). The resulting regression analysis with WM capacity K (assessed in the independent measure) as criterion and filtering efficiency (amplitude difference between the set size 4 and distractor conditions in pure blocks) as predictor was K = 2.363 + .866x. Inserting the mean filtering value of the mixed blocks led to a prediction of WM capacity of 2.17 slots—and, hence, a reduction of 0.43 slots (effect size d = 0.44).
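The arithmetic behind this prediction can be retraced with a short sketch. Only the regression coefficients (2.363 and .866), the predicted capacity (2.17), and the reduction (0.43) are taken from the text. The mean filtering score of the mixed blocks is not reported in this section, so the value below is back-computed from the regression and should be read as an implied, rounded figure rather than a reported one. The function names are hypothetical.

```python
# Regression reported in the text: K = 2.363 + 0.866 * x,
# where x is the CDA filtering-efficiency score.
INTERCEPT = 2.363
SLOPE = 0.866


def predict_capacity(filtering_score: float) -> float:
    """Predicted WM capacity K for a given filtering-efficiency score."""
    return INTERCEPT + SLOPE * filtering_score


def implied_filtering_score(capacity: float) -> float:
    """Invert the regression: the score x that yields a given K."""
    return (capacity - INTERCEPT) / SLOPE


predicted_mixed = 2.17   # reported prediction for the mixed blocks
reduction = 0.43         # reported reduction in slots

# Reference capacity implied by the two reported numbers.
implied_baseline = predicted_mixed + reduction

# Mixed-block filtering score implied by the regression (NOT reported in the text).
implied_x_mixed = implied_filtering_score(predicted_mixed)

print(round(implied_baseline, 2))   # 2.6
print(round(implied_x_mixed, 3))    # -0.223
```

The implied reference value of 2.60 is close to the average of the two group means reported above (1.92 and 3.29 items, which average to about 2.61 if the groups are of equal size). This is what one would expect if the reduction was computed relative to the sample's mean capacity.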

Overall, the fact that WM capacity and static filtering efficiency are related, but WM capacity and flexible filter switching are not, is incongruent with the idea that filtering and filter switching rely on common resources.

Experiment 2

CDA and filtering efficiency

In Experiment 2, we aimed at exploring the filter deficit in mixed blocks in more detail; that is, we investigated whether filtering is in general reduced when the filter settings need to be constantly adjusted, or whether the filter deficit is restricted to stimuli in a color that is potentially relevant. Figure 3b illustrates the CDAs of the four conditions. This shows the typical pattern of a set-size effect. The CDA amplitude increased with increasing number of targets. ANOVAs for the 100-ms time windows revealed that the set-size effect was significant between 600 and 900 ms, minF(1, 20) = 4.91, p = .039, and maxF(1, 20) = 12.12, p = .002. As in Experiment 1, this time window was taken as the time of interest for the subsequent analyses.

Fig. 3

Design and results of Experiment 2. a The four conditions. Here, another distractor condition was included—that is, distractor non relevant, with green distractors. Again, participants switched between “blue is relevant” and “red is relevant.” The green distractors, therefore, were distractors in a color that was never relevant, and therefore that should be ignored more easily than distractors in a color that had been relevant in previous trials and could become relevant in following trials. b CDA waves for the four conditions. The amplitude increased with the number of to-be-stored items (see the amplitude difference between set size 2 and set size 4), but also when distractors were present. Most important, however, is the amplitude difference between the distractor conditions, with a larger amplitude for distractors in a color that was only currently irrelevant (i.e., distractor irrelevant) than for distractors in a color that was never relevant (i.e., distractor non relevant)

The most important difference, however, was that between the two distractor conditions: Analyses for the time window between 600 and 900 ms revealed that the amplitude of the CDA was larger for the condition with only currently irrelevant distractors (i.e., distractor irrelevant) than for distractors that were never relevant (distractor non relevant), t(20) = 1.74, p = .049, one-tailed. The difference between the distractor non relevant and set size 2 conditions was not significant, t(20) = 0.22, p = .827.

WM performance in the change-detection task

Table 2 illustrates performance differences across the conditions. An ANOVA including the four conditions revealed a significant main effect, F(3, 60) = 105.79, p < .0001, ε = .886. As in Experiment 1, performance significantly decreased with set size, t(20) = 15.28, p < .0001, but also when distractors were present: Both distractor conditions differed significantly from the set size 2 condition, t(20) = 2.24, p = .037, and t(20) = 3.29, p = .004, for the distractor non relevant and distractor irrelevant conditions, respectively. Importantly, we also found a small difference between the two distractor types. This effect, however, was not significant here (see Footnote 4), t(20) = 0.85, p = .4017.

Table 2 Experiment 2: Performance in the change-detection task

Taken together, the findings of Experiment 2 replicated the findings of a switching-induced filtering deficit: Distractors in a color that was currently irrelevant were stored to some extent. In addition, the results of Experiment 2 also indicated that filtering was much more efficient for distractors in a color that was never relevant throughout the experiment than for distractors in a color that was potentially relevant. This finding suggests that the filtering inefficiency in mixed blocks resulted for the most part from previous filter settings that were still potent sources of interference, allowing material that matched these criteria to become stored. We also analyzed whether the difference between the distractor conditions was related to WM capacity. However, we did not find any reliable correlation in either the CDA or the behavioral data.

Discussion

The present results shed light on the question of how attentional processes support control over the contents of WM. Specifically, we were interested in situations in which filter settings had to be flexibly adjusted. Our results indeed suggest that flexible filtering demands reduce filtering efficiency, which in turn leads to less efficient utilization of the capacity-limited workspace.

In Experiment 1, we contrasted the standard condition, in which the filtering criterion remained invariant across trials, with a filter-switching condition. The increased demands on filtering in the mixed blocks led to decreased filtering efficiency, such that distractors were stored and occupied storage space. This resembles the pattern that can be observed when high- and low-WM-capacity individuals are compared (see Jost et al., 2011, and Vogel et al., 2005, as well as the individual difference approach in the present study), suggesting that efficient utilization of the limited workspace depends not only on the individual ability to filter out irrelevant information, but also on how well filter settings can be adjusted to changing requirements. In fact, in terms of overall performance, the reduction in filtering efficiency in the mixed condition was equivalent to a drop of 0.43 WM slots, suggesting a substantial loss of available storage space. Importantly, switching between filter settings particularly affected filtering in those individuals with high capacity and good filtering performance (note that the filtering efficiency of low-capacity individuals was already weak without switching demands). A post hoc analysis of the behavioral data in the most difficult condition (i.e., four targets plus two distractors) in mixed blocks revealed that with only 70% correct responses (which is, in terms of capacity K, equivalent to 1.65 stored targets), high-capacity individuals performed as poorly as low-capacity individuals in situations without filtering demands (i.e., four targets only). Thus, switching between filter settings reduced the WM performance of high-capacity individuals to the level of low-capacity individuals.
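The correspondence between roughly 70% correct responses and about 1.65 stored targets can be illustrated with a brief sketch. This section does not state which capacity formula was used, so the sketch assumes the common estimate K = N × (hit rate − false alarm rate) from Cowan (2001), with N = 4 targets and equal numbers of change and no-change trials. Under that assumption, proportion correct PC satisfies hit rate − false alarm rate = 2 × PC − 1. The function names are hypothetical.

```python
def k_from_accuracy(set_size: int, proportion_correct: float) -> float:
    """Cowan-style K from overall accuracy.

    Assumes equal numbers of change and no-change trials, so that
    (hit rate - false alarm rate) = 2 * proportion_correct - 1.
    """
    return set_size * (2 * proportion_correct - 1)


def accuracy_from_k(set_size: int, k: float) -> float:
    """Inverse mapping: the overall accuracy that corresponds to a given K."""
    return (k / set_size + 1) / 2


# Exactly 70% correct with four targets gives K of about 1.6.
print(round(k_from_accuracy(4, 0.70), 2))

# The reported K of 1.65 corresponds to an accuracy just above 70%.
print(round(accuracy_from_k(4, 1.65), 4))
```

Under these assumptions, the reported K of 1.65 corresponds to an accuracy of about 70.6%, which is consistent with the rounded figure of 70% given in the text.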

There were two possible reasons for the filtering inefficiency in the mixed condition. First, filter switching may demand general attentional resources, which renders the actual filtering less efficient. Second, recently used filter settings may continue to bias processing. To explore the mechanisms behind the filter deficit, we compared two different distractor types in Experiment 2: distractors that were associated with the competing filter setting (as in Exp. 1), and distractors in a color that was never relevant throughout the experimental session. The CDA amplitude difference (and the behavioral data, as a trend) showed that objects in the currently irrelevant color were more likely to end up in WM than were distractors in a never-relevant color. In other words, sustained interference from previously relevant, but currently irrelevant, filter settings, and not the switching demands per se, appears to be responsible for the filtering deficit. Moreover, the data suggest that this occurs independently of individual WM capacity.

This pattern of results parallels the costs that usually occur in task-switching situations. Switching between two or more simple tasks leads to RT and error costs. With time to prepare for the upcoming task, these switch costs are usually reduced, which has been taken as evidence that some kind of active reconfiguration of the cognitive system takes place in order to perform the new task. However, even with ample time to prepare, switch costs are not eliminated completely, suggesting episodic-memory contributions to the costs (for recent reviews, see Kiesel et al., 2010; Vandierendonck, Liefooghe, & Verbruggen, 2010). In their seminal article, Allport, Styles, and Hsieh (1994) proposed that these costs are due to persisting activation of the previously activated task sets, and they coined the term task-set inertia to refer to this kind of proactive interference. Importantly, there is evidence from a number of studies that persisting activation from previous task sets interferes with responding to the actual stimulus, particularly when the stimulus has been processed before in the context of the other task (e.g., Waszak et al., 2003; Wylie & Allport, 2000). Interestingly, such costs are not restricted to switch trials, but often occur even on no-switch trials in the context of task-switch blocks (e.g., Mayr, Kuhns, & Hubbard, 2014; Rubin & Meiran, 2005). The present findings constitute an important extension of this pattern of results, because they indicate that in switching situations, interference from previous control settings affects not just the speed of responding to a given stimulus, but also the fidelity with which this stimulus is coded in WM.

The switching-induced filtering deficit is also similar to findings from visual search experiments showing that attentional selection is affected by the search history (e.g., Dombrowe, Donk, & Olivers, 2011; Maljkovic & Nakayama, 1994; Olivers & Humphreys, 2003; Wolfe, Butcher, Lee, & Hyle, 2003). For instance, in the study by Wolfe et al. (2003), search times were prolonged when the target-defining feature varied from trial to trial. Similar findings were reported by Dombrowe et al. (2011), who used a task in which participants were asked to saccade toward a target in a prespecified color. Saccades were slower and less accurate when the target color switched. Moreover, an initial preference for distractors in a color that matched the previous target color indicated that it takes a while to fully switch the attentional set. ERP data (N2pc) suggest that the repetition of either the target dimension or the feature allows a faster and more efficient allocation of attention (Eimer, Kiss, & Cheung, 2010; Töllner, Gramann, Müller, Kiss, & Eimer, 2008). In the context of the guided search model (Wolfe, Cave, & Franzel, 1989) or of other theories of attention such as dimensional weighting (Found & Müller, 1996), top-down filter settings are counteracted by lingering effects of previous selections that reduce filtering for those items that match the previous filter settings.

Recent evidence suggests that when control settings are established after longer periods of learning, long-term memory takes over, such that attentional selection is then guided by templates stored in long-term memory (Carlisle, Arita, Pardo, & Woodman, 2011; Woodman, Carlisle, & Reinhart, 2013). Thus, it is conceivable that for our pure blocks, long-term memory was responsible for holding the templates. However, in mixed blocks when switching between filter settings was required, a no-longer-relevant template needed to be replaced by a different one, which presumably takes place in WM.

The fact that in general, filtering inefficiency in mixed filter conditions reflects episodic priming from previous control settings is relevant when considering another important—and at least at first sight curious—result from Experiment 1: The reduction in filtering efficiency in the mixed condition was observed for both high- and low-capacity individuals. This is important, because both in past work (e.g., Vogel et al., 2005) and in our Experiment 1, filtering efficiency (as reflected in the CDA) has been found to be highly related to individual differences in an independently assessed measure of WM capacity. In fact, individual differences in filtering efficiency have been suggested to be a critical factor behind memory capacity (e.g., Vogel et al., 2005). It is then noteworthy that an experimental manipulation that strongly affects filtering efficiency (i.e., pure vs. mixed filtering) shows no tendency to interact with WM capacity. However, this pattern is consistent with the idea that two distinct factors determine overall filtering efficiency. The first is the top-down, goal-related implementation of an attentional filter—an ability that is related to WM capacity. The second is the fact that, due to implicit, episodic priming effects (cf. Awh, Belopolsky, & Theeuwes, 2012; Wolfe et al., 2003), past filter settings remain active and penetrate even efficient filter settings.

Conclusion

In two experiments, we provided evidence that the requirement to frequently adjust the filter settings that are responsible for preventing irrelevant material from being processed reduces the efficient utilization of capacity-limited WM. Because WM plays an important role in many cognitive tasks, any further limitation of this system, such as the limitation found here, can have important, real-life consequences. We also specified the observed switching-induced filtering deficit by showing that it is due to lingering effects of previous filter settings that allow irrelevant material that matches these settings access to WM. In this sense, implicit guidance of attention counteracts goal-directed selection, and thus constitutes an important limitation of our attention system that holds even for high-WM individuals. In showing that different selection influences determine how efficiently the limited workspace can be used, the present study contributes to our understanding of the interrelation between selective attention and WM.