Introduction

Cognitive control enables information processing and action implementation in a context-sensitive manner. To do so, a context representation should be maintained and utilized to guide behavior. Often, this context is a goal, which serves to bias behavior according to the task requirements. Since the information about the relevant goal is typically unavailable in the environment, working memory (WM) is required to maintain it. The role of WM in maintaining goal representations is well established (e.g., Kane & Engle, 2003; Miller & Cohen, 2001; Oberauer et al., 2013). Previous research on cognitive control investigated the mechanisms that support goal implementation, using experimental tasks such as Stroop (e.g., Cohen et al., 1990), flanker (Eriksen & Eriksen, 1974), and task-switching (Grange & Houghton, 2014). In these tasks, control is required to bias response selection toward a goal-appropriate feature, often by disregarding irrelevant aspects of the information. However, apart from moderating stimulus-response (S-R) associations, control is also needed to select, maintain, and update the appropriate context. These processes take place even before the stimulus is encountered (D'Ardenne et al., 2012). While this role of control is explicitly acknowledged in current theoretical models of control (see Egner, 2017, for various theoretical views), we know relatively little about the processes that support it. This is because context (or goal) selection, maintenance, and updating cannot be studied in isolation using experimental tasks that do not manipulate context updating demands, namely whether WM updating is needed or not. The present study examined the processes involved in updating WM with task-goals, using a novel, multiple-cue variant of the task-switching paradigm, in which participants needed to adhere to some of the cues but not to others.

Theoretical models of WM emphasize the conflict between its maintenance and updating functions (Frank et al., 2001; Miller & Cohen, 2001; O'Reilly, 2006). WM enables the maintenance of information in a highly accessible state, shielded from interference caused by distracting or even conflicting information. Selective updating of WM contents is the ability to modify the maintained information when needed, counteracting the robust maintenance. Cognitive control is required to balance these demands, namely, to update WM with relevant information, such as the currently relevant goal, and to shield it from irrelevant or outdated information, such as previous or completed goals. These models account for such control by assuming that WM representations are protected behind a “gate” that separates WM from perceptual information. Keeping the input-gate closed by default enables robust maintenance within WM, shielded from the ongoing flow of information. Conversely, opening the gate allows updating WM with available input.

Corroborating these models, empirical studies that manipulated the need for updating typically find that trials that involve updating are slower and more error-prone than those that do not (Ecker et al., 2010; Frischkorn et al., 2022; Kessler, 2017, 2018; Kessler & Meiran, 2006, 2008; Morris & Jones, 1990; Nir-Cohen et al., 2020; Rac-Lubashevsky & Kessler, 2016a, 2016b, 2018, 2019), and that the latency of updating increases with the number of to-be-updated items (Ecker et al., 2014; Kessler & Oberauer, 2014, 2015). These findings are in line with the notion that WM updating constitutes an “executive function” (Miyake et al., 2000), a process that is controlled, costly, non-obligatory, and involves mental effort (Westbrook et al., 2013). Moreover, switching between update and no-update trials, and vice versa, is associated with a substantial cost that was attributed to the process of opening/closing the gate to WM (Kessler & Oberauer, 2014, 2015; Nir-Cohen et al., 2020; Rac-Lubashevsky & Kessler, 2016a, 2016b; Verschooren et al., 2021).

The above theoretical and empirical picture was challenged recently by new findings from our lab (Kessler et al., 2022), showing that updating WM with a single letter is faster and more accurate than not-updating. Participants performed a choice response-time (RT) task in which the letter X or O appeared on the screen in each trial, and they had to respond to it using a right/left key press. The letter appeared within a red or blue frame to manipulate updating demands. Specifically, the participants were required to update their WM with the letter that appeared within the most recent red frame and to indicate its identity at a recall trial that appeared unexpectedly after several trials. Accordingly, trials involving a red frame required updating WM with the identity of the letter presented in them (update trials), whereas trials involving a blue frame did not require updating (no-update trials). Surprisingly, in three experiments, responses in update trials were faster and more accurate than in no-update trials. This is at odds with the idea that updating is slow and effortful. To resolve this discrepancy, we suggested that when only a single item needs to be processed and maintained in WM, its updating is seamlessly carried out as part of attending to new information. When updating is not desired, it should be overridden by the act of removal (Lewis-Peacock et al., 2018), leading to slower RTs in no-update trials. This idea of updating as a by-product of attention is in line with the theoretical view that attending to items facilitates their encoding into WM (Oberauer, 2019).

The present study aimed to extend the above phenomenon to procedural information, namely goals. A multiple-cue task-switching paradigm was used, in which a sequence of task cues is presented, followed by a single probe (Fig. 1). The cues were red or blue, indicating the update and no-update conditions. The participants were asked to judge the probe according to the most recent red cue. This required updating WM with red but not with blue cues. A choice-RT task performed on the cue identity enabled measuring RT and accuracy for cue processing. Facilitated performance in update compared to no-update trials would indicate that goals can be quickly updated into WM in an obligatory manner, a finding that stands in sharp contrast to previous results and theorizing.

Fig. 1

a Stimuli and response mapping for Experiment 1. b A demonstration of a run of trials. c A run of trials consisted of a varied number of task cues, followed by a single probe. The cue colors indicated update/no-update. The probe needed to be judged according to the most recent red cue

Experiment 1

Method

Participants

Twenty-eight psychology students from Ben-Gurion University of the Negev participated in the experiment for partial course credit. Six participants were excluded from the analysis because of diagnosed attention deficits (N = 2) or a low accuracy rate (N = 4; < 80% in the probe trials). The final sample included 22 participants (19 women; Mage = 23.64 years, SDage = 1.26 years). All the participants reported having no neurological deficits or learning disabilities, and intact color vision.

Procedure

The experiment was programmed in OpenSesame (Mathôt et al., 2012). The study was run online using JATOS (Lange et al., 2015). Participants performed the experiment online using their personal computers through their internet browsers.

The experiment was composed of 120 runs, each comprising a sequence of trials that presented task-cue shapes, followed by a digit probe. In each trial, a square or a rhombus appeared on the screen, in either red or blue, and the participants needed to respond to it with the keys L or A, respectively, using their left/right ring fingers (Fig. 1a). Square cues indicated the parity task (whether the probe digit is odd or even), and rhombus cues indicated the magnitude task (whether the probe digit is smaller or larger than 5). Upon the presentation of the probe digit, participants were required to apply the task indicated by the most recent red shape. Accordingly, they had to update their WM with the identity of the shapes that appeared in red (update condition) but not with the shapes that appeared in blue (no-update condition). The first trial in each sequence was always an update trial. The updating condition in each of the subsequent trials was chosen at random with equal probabilities for update and no-update. Each trial was terminated with the participant’s response or after 1,500 ms. The inter-trial interval was 500 ms. A sequence was composed of a minimum of two trials. Starting from the third trial, each cue trial had a 20% probability of being the last in the run, so that the participants could not predict when the digit probe would be presented. After a sequence of task cues, one of the digits 1–9 (excluding 5) was presented as a probe, and the participants were required to apply the task cued by the most recent red shape. The response keys were J and K, respectively, for even/odd, and D and S, respectively, for larger/smaller than 5, using the left/right middle and index fingers.
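The run structure described above can be sketched as a small simulation. This is an illustrative sketch only, not the authors' OpenSesame code. The function names (generate_run, trailing_no_update) are hypothetical, the equal probability of the two shapes is an assumption (the text does not state it), and the sketch starts the 20% termination check at the third cue, which is one reading of the text and yields runs of at least three cues.

```python
import random

# Cue-to-task mapping from Experiment 1 (square -> parity, rhombus -> magnitude).
CUE_TASK = {"square": "parity", "rhombus": "magnitude"}
PROBE_DIGITS = [1, 2, 3, 4, 6, 7, 8, 9]  # digits 1-9, excluding 5


def generate_run(rng, p_end=0.2):
    """Return (cues, probe, relevant_task) for one simulated run.

    cues is a list of (shape, color) tuples; red = update, blue = no-update.
    """
    cues = []
    while True:
        # Assumption: both shapes are equally likely on every trial.
        shape = rng.choice(list(CUE_TASK))
        # The first trial is always an update (red) trial; later ones are 50/50.
        color = "red" if not cues else rng.choice(["red", "blue"])
        cues.append((shape, color))
        # Assumption: from the third cue on, each cue ends the run with p = .20.
        if len(cues) >= 3 and rng.random() < p_end:
            break
    probe = rng.choice(PROBE_DIGITS)
    # The probe is judged by the task of the most recent red cue.
    last_red_shape = next(s for s, c in reversed(cues) if c == "red")
    return cues, probe, CUE_TASK[last_red_shape]


def trailing_no_update(cues):
    """Count the blue (no-update) cues that follow the most recent red cue."""
    count = 0
    for _, color in reversed(cues):
        if color == "red":
            break
        count += 1
    return count
```

The second helper returns the number of no-update trials separating the relevant cue from the probe, which is the predictor used in the probe analyses.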

The experiment started with a practice phase, composed of (a) 20 trials in which only a shape was presented, and the participants were required to respond using the L/A keys; (b) 16 trials in which a digit was presented, and the participants needed to apply the parity task; (c) same as b, but with the magnitude task; and (d) 21 runs, similar to those that appeared in the main task.

Design and analysis

The main analysis focused on RTs and error proportions (PE) for the task cues, as a function of Updating (update vs. no-update) and Update-Switch (whether the Updating condition was repeated or switched compared to the immediately previous trial). An additional analysis examined performance at the probe as a function of the number of preceding no-update trials. Error and post-error trials were removed from the RT analysis. RTs shorter than 100 ms were removed from the cue analysis. For the probe analysis, RTs shorter than 100 ms or longer than 10,000 ms were removed. Then, trials that deviated more than 2 SD from the mean of their condition within each participant were dismissed as outliers. Runs in which the response to the probe was erroneous were excluded from the cue-trials analyses, as was the first trial in each run. All analyses were carried out in R (R Core Team, 2021) using the RStudio IDE (RStudio team, 2022; version 4.1.2), with the “afex” (Singmann et al., 2021; version 1.0-1), “emmeans” (Lenth, 2022; version 1.7.2), “tidyverse” (Wickham et al., 2019; version 1.3.1), “dplyr” (Wickham et al., 2021a, b; version 1.0.7), “readr” (Wickham, Hester, & Bryan, 2021b; version 2.1.1), “tidylog” (Elbers, 2020; version 1.0.2), and “ggbeeswarm” (Clarke & Sherrill-Mix, 2017; version 0.6.0) packages.
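The two-stage RT trimming described above (absolute cutoffs first, then a 2 SD criterion within each participant-by-condition cell) can be sketched as follows. The authors ran their analyses in R; this Python version is only an illustration. The function name trim_rts and the dictionary keys are hypothetical, and the use of the sample SD computed once per cell (no iterative re-trimming) is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_rts(trials, min_rt=100, max_rt=None, sd_cutoff=2.0):
    """Apply absolute RT cutoffs, then per-cell SD trimming.

    trials: list of dicts with keys 'participant', 'condition', 'rt' (ms).
    Use max_rt=10_000 for probe trials; leave it as None for cue trials.
    """
    # Stage 1: drop RTs shorter than min_rt (and longer than max_rt, if given).
    kept = [t for t in trials
            if t["rt"] >= min_rt and (max_rt is None or t["rt"] <= max_rt)]

    # Stage 2: group the surviving trials by participant and condition.
    cells = defaultdict(list)
    for t in kept:
        cells[(t["participant"], t["condition"])].append(t)

    result = []
    for cell in cells.values():
        rts = [t["rt"] for t in cell]
        if len(rts) < 2:
            result.extend(cell)  # SD is undefined for a single trial; keep it
            continue
        m, sd = mean(rts), stdev(rts)
        # Keep trials within sd_cutoff SDs of the cell mean.
        result.extend(t for t in cell if abs(t["rt"] - m) <= sd_cutoff * sd)
    return result
```

Error trials, post-error trials, the first trial of each run, and runs with an erroneous probe response would be filtered out before calling a function like this one.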

Results

Cue-trials response time (RT)

The descriptive data are presented in Table 1. An ANOVA was conducted on mean RTs with Updating and Update-Switch as within-subject factors. The main effect of Updating was significant, reflecting shorter RTs in update trials than in no-update trials, F(1,21) = 15.58, p < .001, ηp2 = .43 (see Fig. 2). Also, the main effect of Update-Switch was significant, reflecting shorter RTs in repeat than switch trials, F(1,21) = 55.61, p < .001, ηp2 = .73. The two-way interaction was non-significant, F(1,21) = .05, p = .83, ηp2 = .002.
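As a cross-check on the effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the three F values reported above. The function name is hypothetical, and the check assumes the reported values are conventional partial eta squared.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported effect size) for the cue-trial RT ANOVA.
reported = [
    (15.58, 1, 21, 0.43),   # main effect of Updating
    (55.61, 1, 21, 0.73),   # main effect of Update-Switch
    (0.05, 1, 21, 0.002),   # Updating x Update-Switch interaction
]

for f_value, df1, df2, eta in reported:
    computed = partial_eta_squared(f_value, df1, df2)
    print(f"F({df1},{df2}) = {f_value}: computed {computed:.3f}, reported {eta}")
```

The same function handles the probe analyses by passing df_effect = 3 and df_error = 63.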

Table 1 Descriptive data for cue-trials in Experiments 1 and 2
Fig. 2

Response time (RT; ms) for update and no-update trials. Participants that were faster in update than in no-update cues appear in blue, and participants with the reversed pattern appear in red. Group means are denoted by black horizontal lines

Cue-trials error proportions (PE)

A parallel ANOVA was conducted on mean PE with Updating and Update-Switch as within-subject factors. The main effect of Updating was significant, F(1,21) = 19.56, p < .001, ηp2 = .48, as well as the main effect of Update-Switch, F(1,21) = 9.26, p = .006, ηp2 = .31. The two-way interaction was also significant, F(1,21) = 10.55, p = .004, ηp2 = .33. Whereas the no-update condition was more error-prone than update in both Update-Switch conditions, this difference was larger in repeat trials, F(1,21) = 23.74, p < .001, than in switch trials, F(1,21) = 3.61, p = .07.

Probe RT

An ANOVA was conducted on mean RTs for the probes, as a function of the number of preceding no-update trials in a run (0, 1, 2, or 3+). Probes that were preceded by 0 no-update trials are ones in which the relevant task was cued in the immediately preceding cue. The more no-update trials preceded the probe, the further away the relevant task cue was. The effect of number of preceding no-update trials was significant, F(3,63) = 8.88, p < .001, ηp2 = .30. RTs were 1,599 (SD = 433), 1,690 (SD = 408), 1,815 (SD = 553), and 1,947 (SD = 758) ms for the 0–3+ conditions, respectively. Helmert contrasts revealed a significant difference between 0 and 1–3+, t(21) = 4.18, p < .001, between 1 and 2–3+, t(21) = 3.56, p = .002, but not between 2 and 3+, t(21) = 1.43, p = .17.
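The Helmert contrasts above compare each level with the mean of all subsequent levels. One way to sketch this is to compute a contrast score per participant and test the scores against zero with a one-sample t test, which gives df = N − 1 and so matches the reported t(21). The authors used the emmeans package, and their exact settings are not given, so this is an assumption rather than their procedure. The names HELMERT and contrast_t are hypothetical, and the weights are signed so that positive values indicate slower responses with more preceding no-update trials.

```python
from math import sqrt
from statistics import mean, stdev

# Contrast weights over the four levels (0, 1, 2, 3+ preceding no-update trials).
HELMERT = {
    "0 vs 1-3+": (-1, 1 / 3, 1 / 3, 1 / 3),
    "1 vs 2-3+": (0, -1, 1 / 2, 1 / 2),
    "2 vs 3+": (0, 0, -1, 1),
}


def contrast_t(cell_means, weights):
    """One-sample t test on per-participant contrast scores.

    cell_means: one row per participant, holding that participant's mean RT
    (or PE) in each of the four conditions. Returns (t, df).
    """
    scores = [sum(w * m for w, m in zip(weights, row)) for row in cell_means]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t, n - 1


# Hypothetical data for three participants (not from the study).
example = [
    [1000, 1100, 1200, 1300],
    [900, 1050, 1100, 1150],
    [1100, 1150, 1300, 1500],
]
for label, weights in HELMERT.items():
    t, df = contrast_t(example, weights)
    print(f"{label}: t({df}) = {t:.2f}")
```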

Probe PE

A parallel ANOVA was conducted on mean PE for the probes. The effect of number of immediately preceding no-update trials was significant, F(3,63) = 6.97, p < .001, ηp2 = .25. Mean PE was 3.5% (SD = 4.8%), 6.4% (SD = 5.7%), 9.8% (SD = 11.1%), and 13.4% (SD = 13.4%) for conditions 0–3+, respectively. Helmert contrasts revealed a significant difference between 0 and 1–3+, t(21) = 3.55, p = .002, between 1 and 2–3+, t(21) = 2.54, p = .02, but not between 2 and 3+, t(21) = 1.56, p = .13.

Discussion

Experiment 1 demonstrated that task-cue updating is faster and less erroneous than no-updating, replicating our previous finding with letter stimuli (Kessler et al., 2022). The faster updating latencies were previously explained by the need for the additional process of removal in the no-update condition to overcome the default updating that takes place as part of, or because of, response selection. However, the probe-trial analysis revealed that this process is far from perfect. Specifically, the further away the presentation of the relevant (red) task-cue from the probe, the slower and more erroneous the probe performance is.

To what extent did the participants update the relevant goal in WM throughout the presentation of the task-cues? One possibility is that whenever a red task-cue appeared, they reconfigured the task-set accordingly to be prepared for the upcoming probe. Accordingly, the goal representation in WM was updated continuously throughout the run. Another possibility is that the shape of the most recent red task-cue was maintained and updated in a declarative manner. Only when the probe was presented, was the maintained shape translated to its cued task. According to this possibility, only the shape but not the goal representation was updated throughout the run, so that participants did not mentally switch between tasks during the presentation of the task cues.

To address this issue, in Experiment 2 two different task cues indicated each task. This enables us to distinguish between updating the task-cue and updating the goal throughout the run. If only the cue is maintained and updated, no difference in performance should be observed between updating a task-cue to another when they both indicate the same task and updating to a cue that indicates a different task. In other words, observing a task switch cost beyond that of alternating among task cues will indicate that at least part of the implementation of the relevant goal takes place throughout the run of task-cue presentations.

Experiment 2

Method

Participants

Thirty-three psychology students from Ben-Gurion University of the Negev participated in the experiment for partial course credit. Ten participants were excluded from the analysis because of misunderstanding the instructions, as revealed during the debriefing (N = 5), or a low accuracy rate (< 80%) in either the cue or probe trials (N = 5). The final sample included 23 participants (21 women; Mage = 23.17 years, SDage = 0.89 years). All the participants reported having no diagnosed attention deficits and intact color vision.

Procedure

Experiment 2 was similar to Experiment 1, except for using four task-cues: square, circle, pentagon, and triangle, corresponding to the keys L, A, S, and K, respectively. The shapes were presented in either red or blue. For half of the participants, red indicated updating and blue indicated not-updating. This mapping was reversed for the other half, to ensure that the effects of updating reflect the condition rather than the color itself. Since the color mapping neither produced a main effect nor interacted with any of the other variables (all ps > .29), we collapsed the data across this variable. The square and circle shapes cued the parity task, for which the responses were D and J, respectively. The pentagon and triangle shapes cued the magnitude task, for which the responses were H and F, respectively. The participants kept four fingers of each hand on the keys A–F and H–L on a standard QWERTY keyboard. They responded to the cue identity (namely, the shape) using their left/right ring and little fingers, and to the probe using their index and middle fingers. The structure of the practice phase was similar to that of Experiment 1, except for including 40 full runs instead of 21.

Results

Cue-trials RT

An ANOVA was conducted on mean RTs with Updating and Update-Switch as within-subject factors. The main effect of Updating was significant, reflecting shorter RTs in update trials than no-update trials, F(1,22) = 6.13, p = .021, ηp2 = .22 (see Fig. 2). Also, the main effect of Update-Switch was significant, reflecting shorter RTs in repeat than switch trials, F(1,22) = 35.98, p < .001, ηp2 = .62. The two-way interaction was again non-significant, F(1,22) = 2.53, p = .13, ηp2 = .10.

Cue-trials PE

Similar analyses were conducted on mean PE. An ANOVA with Updating and Update-Switch revealed a significant main effect for Update-Switch, F(1,22) = 9.01, p = .007, ηp2 = .29, but not for Updating, F(1,22) = 2.92, p = .101, ηp2 = .12. As in Experiment 1, the two-way interaction was significant, F(1,22) = 12.20, p = .002, ηp2 = .35. The no-update condition was more erroneous than update in update-repetition trials, F(1,22) = 8.70, p = .007, but the two did not differ significantly in update-switch trials, F(1,22) = 1.18, p = .29.

Task switching

We next examined the effect of switching among tasks during the sequence of cues. To this end, we examined the effect of Task-Switch, comprising three levels: no-update trials, update trials with both a cue-switch and a task-switch, and update trials with a cue-switch and a task-repetition. Trials in which the cue (and hence also the response) was repeated from the previous trial were removed from this analysis since these repetitions, which are generally fast, could not take place in the task-switch condition and hence may confound the results. With the remaining trials we conducted an ANOVA on mean RTs with Task-Switch and Update-Switch (repeat, switch) as within-subject factors. Only the main effect of Task-Switch was significant, F(2,44) = 8.60, p < .001, ηp2 = .28 (see Fig. 3). Task-switch trials (M = 889 ms, SD = 190) were slower than task-repetition trials (M = 817 ms, SD = 177), t(22) = 2.34, p = .029. Also, no-update trials (M = 924 ms, SD = 192) were slower than both, t(22) = 3.96, p < .001. These findings indicate that task switching was more costly than a mere cue/response-switch within the same task. The main effect of Update-Switch was non-significant, F(1,22) = .14, p = .712, and so was the two-way interaction, F(2,44) = 1.84, p = .171. The parallel analysis on PE did not reveal any significant effects (F(2,44) = 1.46, p = .244, F(1,22) = 1.14, p = .298, and F(2,44) = 2.44, p = .099, for the main effects of Task-Switch and Update-Switch and the two-way interaction, respectively). Notably, task-switch trials were numerically more erroneous (3.7%) than task repetitions (2.6%).
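The trial classification behind the Task-Switch factor can be sketched as follows, using the shape-to-task mapping given in the Procedure. The function name classify_trial is hypothetical. The sketch assumes that both cue repetition and task switch/repetition are defined relative to the cue shown on the immediately preceding trial. The text states this for cue repetitions but does not say so explicitly for task switches, so that part is a guess.

```python
# Shape-to-task mapping from the Experiment 2 procedure.
CUE_TASK_EXP2 = {
    "square": "parity",
    "circle": "parity",
    "pentagon": "magnitude",
    "triangle": "magnitude",
}


def classify_trial(prev_shape, shape, is_update):
    """Return the Task-Switch level of a cue trial, or None if it is excluded.

    prev_shape: cue shape on the immediately preceding trial (assumed reference).
    shape: cue shape on the current trial.
    is_update: True when the cue color signals an update trial.
    """
    if shape == prev_shape:
        return None  # exact cue (and response) repetition: removed from analysis
    if not is_update:
        return "no-update"
    if CUE_TASK_EXP2[shape] == CUE_TASK_EXP2[prev_shape]:
        return "task-repetition"  # cue switch within the same task
    return "task-switch"  # cue switch that also changes the task
```

For example, a red circle following a square counts as a task repetition with a cue switch, whereas a red triangle following a square counts as a task switch.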

Fig. 3

Mean response time (RT) by task switching in Experiment 2. Trials in which the exact cue and response were repeated were removed from the analysis. Horizontal lines represent group means

Probe RT

An ANOVA was conducted on mean RTs for the probes as a function of the number of preceding no-update trials in a run (0, 1, 2, or 3+). Replicating the findings of Experiment 1, the effect of number of preceding no-update trials was significant, F(3,66) = 8.19, p < .001, ηp2 = .27. RTs were 2,411 (SD = 676), 2,500 (SD = 780), 2,612 (SD = 734), and 2,757 (SD = 828) ms for the 0–3+ conditions, respectively. Helmert contrasts revealed a significant difference between 0 and 1–3+, t(22) = 3.64, p = .001, between 1 and 2–3+, t(22) = 3.92, p < .001, but not between 2 and 3+, t(22) = 1.59, p = .13.

Probe PE

A parallel ANOVA was conducted on mean PE for the probes. The effect of number of immediately preceding no-update trials was significant, F(3,66) = 7.52, p < .001, ηp2 = .26. Mean PE was 3.1% (SD = 3.4%), 5.9% (SD = 6.7%), 9.9% (SD = 13.2%), and 14.9% (SD = 17.6%) for conditions 0–3+, respectively. Helmert contrasts revealed a significant difference between 0 and 1–3+, t(22) = 3.42, p = .003, between 1 and 2–3+, t(22) = 3.22, p = .004, but not between 2 and 3+, t(22) = 1.66, p = .11. This pattern also replicates the findings of Experiment 1.

General discussion

The present study replicated and extended the finding of Kessler et al. (2022). Updating is quicker and more accurate than not-updating for both declarative stimuli (letters) and task-cues. Merely acting on a task-cue, as required by response selection, leads to its updating into WM. In the cued task-switching paradigm (Meiran, 1996) the cue always indicates the relevant goal. The basic finding is a highly robust task-switching cost (for reviews, see Kiesel et al., 2010; Monsell, 2003), observed by comparing performance in task-switch and task-repetition trials in a similar fashion to our analysis of Experiment 2. The multiple-cue paradigm developed here added two aspects to the paradigm. First, the inclusion of no-update trials enabled examination of selective WM updating, by which not all available information needed to be maintained in WM. Rather, the cue identity in no-update trials had to be ignored. Second, responding to the cue identity enabled measuring performance during cue processing that was not contaminated with processing the probe. Three possible outcomes could be predicted in this situation. The notion of updating as an inserted costly process that takes place only when needed implies that updating should be slower than not updating. Alternatively, participants could simply perform the choice RT task on the cue identity without updating WM throughout the run, and attempt to recall the relevant task cue only when the probe was presented. This reactive strategy (Braver, 2012) implies no difference in performance between update and no-update trials. In contrast to these alternative results, our findings show that adhering to the task cue and its maintenance in WM is faster than ignoring it, regardless of its instructed goal. It follows that updating, rather than not-updating, is the default mode of operation.

The empirical picture is somewhat more nuanced. Previous work with declarative information often showed that updating is more costly than not-updating. For example, in the reference-back task (Rac-Lubashevsky & Kessler, 2016a, 2016b), a red or blue letter appears on the screen in each trial. Participants are required to indicate whether each letter is the same as or different from the most recent red letter. Accordingly, they need to update their WM with each red letter, and not with blue letters. This paradigm robustly gives rise to an update cost, in contrast to the findings of Kessler et al. (2022) and of the present study. Kessler et al. reconciled this apparent conflict by distinguishing between updating items and item-context associations. Specifically, to respond correctly in the reference-back task, participants need to maintain both the reference (previous red letter) and the currently presented item in WM, each with its associated “role” or “context.” Updating trials, therefore, required not only updating a single maintained letter but the item-to-context binding. In their Experiment 1, Kessler et al. (2022) examined a condition in which two frames appeared on the screen. In each trial a single letter appeared in only one of them, which was colored in red (update) or blue (no-update). After several trials, the most recent letter corresponding to each frame had to be recalled. In this condition, which required forming item-context associations, updating was slower than not-updating, as “usual.” It follows that when only one item is maintained in WM, whether a letter or a task cue, updating is easy and does not depend on gating. However, the updating of item-context bindings is costly and selective.

At the empirical level, the multiple-cue task-switching paradigm developed here enables examining the process of selective updating and of ignoring instructed goals, including the behavioral moderators of this process, its brain correlates, and the associated pattern of individual differences. Returning to cognitive control, our findings suggest that updating WM with the relevant goal is not necessarily controlled or demanding. Rather, at least in some situations, it is the default mode of operation. This implies that ignoring a goal, such as when hearing an instruction given to someone else, is harder than adhering to it.