Often in our daily lives, we make errors. These errors can be relatively minor, such as typing the wrong letter in a word, or they can be potentially more disastrous, such as failing to stop for a red light at a busy intersection. In order to catch these errors and make the correct adjustments, an effective error monitoring system is required. In the current study, we examined normal variation in error monitoring and how it relates to variation in susceptibility to lapses of attention and working memory capacity. In particular, we investigated whether individual differences in lapses of attention and working memory capacity predict individual differences in error monitoring.

Working memory capacity and cognitive control

Working memory refers to our ability to actively maintain, manipulate, and retrieve task relevant information. A great deal of research has demonstrated that individual differences in working memory capacity (WMC) strongly predict performance in a number of domains from low-level attention and memory tasks to higher-level reasoning and comprehension (see Engle & Kane, 2004; Unsworth, 2016; Unsworth & Engle, 2007 for reviews). Research suggests that individual differences in WMC are partially attributable to individual differences in cognitive control (Engle & Kane, 2004). Cognitive control refers to the ability to guide processing and behavior in the service of task goals, and this ability is a fundamental aspect of the cognitive system and is thought to be important for a number of higher-level functions. Important components of cognitive control include actively maintaining task goals, selectively and dynamically updating task goals, detecting and monitoring conflict, and making adequate control adjustments in the presence of conflict (Cohen et al., 2004; Gratton et al., 2018). A wealth of studies have found that WMC measures tend to correlate quite well with measures of cognitive (or attention) control (Kane et al., 2016; McVay & Kane, 2012a; Unsworth & McMillan, 2014; Unsworth & Spillers, 2010; Unsworth et al., 2014; Unsworth et al., 2021a).

Recent research suggests that a key aspect of the WMC-cognitive control relationship is whether one can consistently apply control across trials (Unsworth, 2015). That is, trial-to-trial variability in the allocation of control seems to be critically important. High WMC individuals are better able to consistently maintain attention on task than low WMC individuals. This results in low WMC individuals experiencing more fluctuations and lapses of attention than high WMC individuals. Supporting evidence comes from a number of recent studies that have shown that low WMC individuals have more slow reaction times (RTs) and more variability in RTs during attention control tasks than high WMC individuals (McVay & Kane, 2012b; Schmiedek et al., 2007; Unsworth, 2015; Unsworth et al., 2010, 2012; Unsworth et al., 2021b), and low WMC individuals are more likely to report that they are mind-wandering and experiencing off-task thoughts than high WMC individuals (Kane et al., 2007; Kane et al., 2016; Kane et al., 2017; McVay & Kane, 2009, 2012a, b; Robison et al., 2017; Robison & Unsworth, 2015, 2018; Unsworth & McMillan, 2013, 2014, 2017; Unsworth & Robison, 2017a, b). Collectively, prior research suggests that low WMC individuals experience more trial-to-trial fluctuations in attention than high WMC individuals, suggesting that inconsistency in control is a likely reason for poorer performance seen by low WMC individuals on various tasks.

While prior research has suggested that individual differences in WMC are partially due to variation in cognitive control, most of this research has focused on individual differences in goal maintenance processes (and lapses in goal maintenance as discussed above) or conflict resolution processes (Kane & Engle, 2003; Meier & Kane, 2013, 2015; Unsworth et al., 2012). Considerably less research has examined links between WMC and error monitoring (although see Coleman et al., 2018; Miller et al., 2012; Unsworth et al., 2012). Below we provide a broad overview of error monitoring and discuss relevant studies suggesting that individual differences in WMC are related to individual differences in error monitoring and the possibility that this relation is due to differences in lapses of attention.

Error monitoring

Error monitoring (or performance monitoring more broadly) refers to the set of processes that are engaged before and after errors that allow us to detect errors and make corrective adjustments following errors. Early work by Rabbitt and others (Laming, 1979; Rabbitt, 1966; Rabbitt & Rodgers, 1977) found that responses following errors in choice reaction time tasks are much slower (and more accurate) compared with the average of other correct responses. That is, after an error, participants respond more slowly (and tend to be more accurate) on the very next trial (as well as a few subsequent trials). This effect is known as post-error slowing; it is typically explained as due to local changes in the speed-accuracy tradeoff function in which, after an error, participants adopt a more conservative strategy, sacrificing speed to ensure accurate performance. For example, based on an extensive review of the literature, Wessel (2018) has proposed an adaptive orienting theory of error monitoring. In this theory, errors trigger an orienting response as well as an adaptive shift in attention following errors, allowing for corrective actions to be taken. Thus, this theory combines prior theories suggesting that error monitoring includes both an orienting response and controlled adjustments following errors.

Additional evidence for error monitoring comes from examinations of various physiological correlates (Ullsperger et al., 2014). For example, the error-related negativity (ERN) is an event-related potential that is associated with the commission of an error (Gehring et al., 1993; see Gehring et al., 2012 for a review). Following the ERN is the error-related positivity (Pe) (Falkenstein et al., 2000), which is thought to be particularly sensitive to error awareness (see Ullsperger et al., 2010 for a review). Thus, a key aspect of error monitoring is the extent to which people are aware of their errors (Nieuwenhuis et al., 2001; Wessel et al., 2011).

Another physiological correlate of error monitoring and error awareness is pupil dilation. A great deal of research suggests that the pupil dilates in response to the demands of a task and the amount of attentional effort (or the intensity of attention) that is allocated to a task (Beatty, 1982; Beatty & Lucero-Wagoner, 2000; Kahneman, 1973), which is thought to be related to functioning of the locus coeruleus-norepinephrine system (Aston-Jones & Cohen, 2005; Gilzenrat et al., 2010; Joshi et al., 2016; Samuels & Szabadi, 2008). The pupil also responds to salient stimuli as part of the broader orienting response (Kahneman, 1973; Nieuwenhuis et al., 2011). As such, a number of studies have found greater pupillary dilation for error responses than for correct responses (Braem et al., 2015; Critchley et al., 2005; Murphy et al., 2016; Wessel et al., 2011), and error phasic pupillary responses are larger for young adults compared with older adults (Wessel et al., 2018). Importantly, the error pupillary response seems to be modulated by error awareness with larger dilations occurring for aware errors compared to unaware errors (Harsay et al., 2018; Wessel et al., 2011). Thus, error responses seem to give rise to an orienting response that is associated with physiological responses such as pupil dilation, and this response is associated with overall error awareness. Error awareness, in turn, seems to be particularly associated with the anterior insula cortex, which is a major part of the salience network (Hester et al., 2005). Additionally, pupillary dilations associated with errors are related to activity in the salience network in humans (Critchley et al., 2005) and monkeys (Ebitz & Platt, 2015). Furthermore, increased pupil dilation and increased activity in the salience network are particularly strong for aware errors, but not unaware errors.
Collectively, recent theorizing suggests that errors are particularly salient events that the salience network responds to, resulting in an orienting response to errors and subsequent adjustments in control (Ullsperger et al., 2010).

While much of the research on error monitoring has focused on what happens during and after an error, additional research has focused on what conditions precede errors. Several studies suggest that errors are preceded by reductions in activity in frontal control areas. For example, using fMRI Eichele et al. (2008) found that activity in areas important for task engagement (medial frontal cortex) began to decrease roughly 30 s before an error; at the same time, areas linked with task disengagement (default mode network) showed increased activity. Examining EEG, O’Connell et al. (2009) found that detection errors were preceded by increased alpha band activity 20 s before an error, and this was followed by decreased frontal P3 and contingent negative variation. Similarly, Padilla et al. (2006) found that errors were preceded by reduced contingent negative variation and reduced ERPs during visual processing. Padilla et al. (2006) suggested that errors were due to lapses in attention. Consistent with fMRI results, these EEG results suggest that some errors are preceded by reduced activity linked with task engagement. Thus, some errors are likely the result of lapses of attention.

Furthermore, lapses of attention and task engagement seem to be linked to error monitoring abilities and error awareness. For example, Shalgi et al. (2007) suggested that error awareness is strongly tied to the sustained attention system such that unaware errors reflect lapses of attention. That is, participants must be in a highly attentive state to perceive correctly that an error has occurred. Indeed, Shalgi et al. (2007) noted that participants indicated that they were unaware of their errors, because they failed to pay attention. Similarly, examining error awareness in TBI patients, O’Keeffe et al. (2007) suggested that TBI patients had impairments in error awareness and these deficits were likely due to drifts in attention where the TBI patients missed that the error had occurred. Thus, they suggested that sustained attention abilities are required for the accurate monitoring of errors and that drifts of attention can lead to not only decrements in performance (errors), but also to decrements in error monitoring. Examining adults with ADHD, O’Connell et al. (2009) found a significant negative correlation between error awareness and reaction time variability (a marker of the frequency of lapses of attention). In interpreting the results, O’Connell et al. suggested that “an error will only be detected consciously if the participant is in a sufficiently attentive state such that contextually appropriate stimulus–response or goal mappings are highly activated” (p. 1156).

More recently, Harsay et al. (2018) suggested that aware errors were related to increased task engagement and increased alertness, whereas unaware errors (error blindness) were associated with increased task disengagement. In support of this, Harsay et al. (2018) found that unaware errors were associated with increased activity in the default mode network and decreased task-related activations, had smaller phasic pupil dilations, and were preceded by larger baseline pupil diameters. These results are consistent with some prior research, which has suggested that lapses of attention are associated with larger pretrial baseline pupil diameters (Konishi et al., 2017; Unsworth & Robison, 2016; van den Brink et al., 2016). Whereas unaware errors seemed to be related to lapses of attention and task engagement, aware errors seemed to occur when the participants were overall engaged in the task (and these errors are likely due to failures in response inhibition; O’Connell et al., 2008). Thus, there seems to be evidence suggesting that error awareness is associated with sustained attention abilities and task engagement wherein unaware errors might reflect fluctuations in attention. Further evidence for the role of task engagement in error monitoring comes from several individual differences investigations of error processing. For example, Tops and Boksem (2010) suggested that the ERN reflects both trait and state differences in engagement. In support, Tops and Boksem (2010) found that personality measures linked with motivation and engagement were positively related to the ERN amplitude. These results should be interpreted with caution, however, given that the sample size for the ERN correlations was very small (N = 24). In another individual differences study, Larson and Clayson (2011) found that the ERN was correlated with an executive/attention composite and suggested that the ERN is a likely indicator of task engagement.
Collectively, prior research suggests that error monitoring and error awareness are likely associated with task engagement both within and between participants, such that higher levels of task engagement result in better overall error monitoring and greater error awareness.

The present study

A first main goal of the current study was to examine relations between WMC and error monitoring. As noted above, while a great deal of research has been done examining associations between WMC and aspects of cognitive control, less research has been done examining associations between WMC and error monitoring. To our knowledge, only two prior studies have examined potential relations between WMC and physiological indicators of error monitoring (Miller et al., 2012; Coleman et al., 2018). Miller et al. (2012) examined WMC differences in error monitoring by having 12 high WMC individuals and 12 low WMC individuals perform a version of the Simon task while recording EEG. Miller et al. found that high WMC individuals demonstrated higher ERN and Pe on error trials than low WMC individuals, suggesting better error monitoring and error awareness for high than for low WMC individuals. Similarly, Coleman et al. (2018) had 25 high and 25 low WMC individuals perform a variant of the flanker task in which either accuracy or speed was stressed. Coleman et al. found that high WMC individuals demonstrated larger ERNs than low WMC individuals across conditions. Additionally, high WMC individuals demonstrated larger Pe components than low WMC individuals, and this difference was greater in the accuracy stressed condition than in the speed stressed condition. Coleman et al. (2018) suggested that these results were indicative of WMC differences in error detection with high WMC individuals having a more robust error detection system and better error awareness than low WMC individuals. These results provide important initial evidence suggesting that WMC is related to error monitoring abilities. Although these initial results are encouraging, additional research is needed to replicate and extend these findings. In particular, one issue with these prior studies is that they relied on relatively small sample sizes for individual differences research. 
Additionally, both studies relied on an extreme groups approach in which only participants from the top and bottom of the WMC distribution were selected for the EEG experiment. Although extreme-groups studies have their place, particularly in early stages of empirical investigation (indeed, we have published a number of extreme-groups studies), it is always desirable to subsequently test the effects found in extreme-groups studies with a full range of participants. To demonstrate the robustness of these relations it is important to have a much larger sample size (necessary to find small relations) and to examine the full range of participants. Thus, a first main goal of the present study was to examine relationships between WMC and error monitoring. Participants performed multiple WMC measures along with a Stroop task. Pupil diameter was continuously recorded during the Stroop task to examine pupillary responses to errors.

A second main goal of the current study was to examine the possible role of lapses of attention in individual differences in error monitoring. As noted above, lapses of attention and individual differences in lapses seem to be important for error monitoring abilities. Likewise, individual differences in WMC are strongly linked to task engagement and lapses of attention with low WMC individuals having deficits in consistently maintaining task engagement compared to high WMC individuals (Unsworth et al., 2021b). Thus, we expect to see that individual differences in lapses of attention are related to both WMC and error monitoring. Here lapses of attention are operationalized as particularly slow reaction times in the Stroop task along with self-reports of off-task thinking (e.g., mind-wandering). A great deal of research has suggested that the slowest reaction times in various attention control tasks (and in particular in the Stroop) are partially reflective of individual differences in lapses of attention and task disengagement (Cheyne et al., 2009; Coyle, 2003; Jackson et al., 2012; Kane & Engle, 2003; Larson & Alderton, 1990; Leth-Steensen et al., 2000; McVay & Kane, 2012b; Tse et al., 2010; Unsworth, 2015; Unsworth et al., 2010, 2012; Unsworth et al., 2021b; Weissman et al., 2006; West, 2001; West & Alain, 2000a, b). As such, a specific prediction is that the relation between WMC and error monitoring should largely be accounted for by shared variance with lapses of attention. That is, low WMC individuals are more likely to periodically disengage from the task than high WMC individuals, leading to lapses of attention. On trials where errors occur this should result in a blunted error pupillary response indicative of reduced error monitoring. However, if relations between WMC and error monitoring reflect differences in error detection per se, then individual differences in lapses of attention should not account for the relation between WMC and error monitoring. 
Thus, the current study provides a means of explicitly testing whether associations between WMC and error monitoring are due to shared variance with task engagement/disengagement.

A third main goal of the present study was to explicitly examine error awareness. The prior work by Miller et al. (2012) and Coleman et al. (2018) suggested that WMC is related to error awareness given relations between WMC and Pe. However, participants did not indicate their awareness of errors in either study. In order to examine possible relations between WMC and error awareness in Experiment 2 of the current study, we had participants perform the Stroop task. After each response, they had to indicate whether the prior response was a correct response or an error. This should provide an explicit measure of error awareness that should be related to the error pupillary response, WMC, and lapses of attention. Furthermore, this should allow for an examination of differences between aware and unaware errors consistent with prior research (Harsay et al., 2018; Wessel et al., 2011).

To examine these issues, we conducted two individual differences experiments in which participants performed multiple WMC measures and versions of the Stroop task. Participants’ pupils were continuously monitored during the Stroop to examine variation in error phasic pupillary responses as an indicator of error monitoring abilities.

Experiment 1

In Experiment 1, we examined relationships among WMC, error monitoring (error pupillary responses), and lapses of attention. Participants performed a version of the Stroop task while pupil diameter was continuously monitored. Periodically during the Stroop task participants were presented with thought probes asking them to classify their immediately preceding thoughts. These off-task thought reports, along with the slowest 20% of reaction times in the Stroop, were taken as our measures of lapses of attention. Based on prior research, it was predicted that WMC should be related to error monitoring but that individual differences in lapses of attention should largely account for this relationship.

Method

Participants

A total of 175 participants were recruited from the subject pool at the University of Oregon. Participants were 63.4% female with an average age of 19.48 years (SD = 2.17). Data from 157 participants were complete for the Stroop task, and only those participants are examined. Participants received course credit for their participation. Each participant was tested individually in a laboratory session that lasted approximately 2 h. We tested participants over two full academic quarters, using the end of the second quarter as our stopping rule for data collection. We determined that a minimum sample size of 120 participants would be sufficient to find correlations of 0.25, with power of 0.80, and alpha set at 0.05 (two-tailed). Note that some of the data have been reported in Unsworth and Robison (2017b). The purpose of that study was to examine relationships among WMC, attention control, and pupillary responses. None of the critical error pupillary response data were examined in that study, and none of the specific hypotheses regarding error monitoring were tested.
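As a rough check on the sample-size reasoning above, the N needed to detect a correlation can be approximated with the Fisher z transformation. This is only a sketch under that approximation; the source does not state which power procedure was used, and the function name `n_for_correlation` is hypothetical. With r = 0.25, power = 0.80, and two-tailed alpha = 0.05, the approximation gives a value slightly above the stated minimum of 120.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, power=0.80, alpha=0.05):
    """Approximate N needed to detect correlation r (two-tailed test).

    Assumption: Fisher z approximation, N = ((z_alpha + z_power) / atanh(r))^2 + 3.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-tailed alpha
    z_power = NormalDist().inv_cdf(power)          # quantile matching the desired power
    fisher_z = math.atanh(r)                       # Fisher-transformed effect size
    return math.ceil(((z_alpha + z_power) / fisher_z) ** 2 + 3)


n_required = n_for_correlation(0.25)
```

Exact power routines in dedicated software can differ from this approximation by a few participants.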

Materials and procedure

After signing informed consent, all participants completed operation span, symmetry span, reading span, psychomotor vigilance task, antisaccade, Stroop, Ravens Advanced Progressive Matrices, letter sets, syllogisms, and a visual working memory filtering task. All tasks were administered in the order listed above.

Working memory capacity (WMC) tasks

Operation span

Participants solved a series of math operations while trying to remember a set of unrelated letters (Unsworth et al., 2005). Participants were required to solve a math operation; after solving the operation, they were presented with a letter for 1 s. Immediately after the letter was presented, the next operation was presented. At recall, participants were asked to recall letters from the current set in the correct order by clicking on the appropriate letters. For all of the span measures, items were scored correct if the item was recalled correctly from the current list. Participants were given practice on the operations and letter recall tasks only, as well as two practice lists of the complex, combined task. List length varied randomly from three to seven items, and there were two lists of each list length for a maximum possible score of 50. The score was total number of correctly recalled items.

Symmetry span

Participants recalled sequences of red squares within a matrix while performing a symmetry-judgment task (Unsworth et al., 2009). In the symmetry-judgment task, participants were shown an 8 × 8 matrix with some squares filled in black. Participants decided whether the design was symmetrical about its vertical axis. The pattern was symmetrical half of the time. Immediately after determining whether the pattern was symmetrical, participants were presented with a 4 × 4 matrix with one of the cells filled in red for 650 ms. At recall, participants recalled the sequence of red-square locations by clicking on the cells of an empty matrix. Participants were given practice on the symmetry-judgment and square recall task as well as two practice lists of the combined task. List length varied randomly from two to five items, and there were two lists of each list length for a maximum possible score of 28. We used the same scoring procedure as we used in the operation span task.

Reading span

While trying to remember an unrelated set of letters, participants were required to read a sentence and indicate whether or not it made sense (Unsworth et al., 2009). Half of the sentences made sense, while the other half did not. Nonsense sentences were created by changing one word in an otherwise normal sentence. After participants gave their response, they were presented with a letter for 1 s. At recall, participants were asked to recall letters from the current set in the correct order by clicking on the appropriate letters. Participants were given practice on the sentence judgment task and the letter recall task, as well as two practice lists of the combined task. List length varied randomly from three to seven items, and there were two lists of each list length for a maximum possible score of 50. We used the same scoring procedure as we used in the operation span and symmetry span tasks.

Stroop

Prior to each trial, there was a 2-s baseline period with black “+ + + + +” in the center of a white background screen to determine baseline pupil diameter (luminance = 208 lx). Following this, there was a blank screen for 250 ms, 500 ms, or 1,000 ms randomly distributed across trials. Participants were then presented with a color word (red, green, or blue) in one of three different font colors (red, green, or blue: average luminance = 214 lx), which remained onscreen until response. The participants’ task was to indicate the font color via key press (red = 1, green = 2, blue = 3). Participants were told to press the corresponding key as quickly and accurately as possible. After responding, participants were presented with a blank screen for 1,500 ms to allow for the pupillary response to unfold. Participants received 15 trials of response mapping practice and 6 trials of practice with the real task. Participants then received 100 real trials. Of these trials, 67% were congruent such that the word and the font color matched (i.e., red printed in red) and the other 33% were incongruent (i.e., red printed in green). Twelve thought probes were randomly presented after incongruent trials. Our behavioral indicator of lapses of attention was the average reaction time for the slowest 20% of correct trials across all trials. We utilized all trials given that prior research has suggested that lapses likely occur on both congruent and incongruent trials and these overall measures of lapses tend to correlate with cognitive abilities (Kane et al., 2016; Tse et al., 2010; Unsworth, 2015). Specifically, each individual’s correct RTs (both congruent and incongruent) were rank ordered from fastest to slowest. Next, these rank-ordered responses were placed into five bins (quintiles) such that 20% of each individual’s responses were placed into each bin. The slowest RT bin (quintile 5) was taken as a behavioral measure of lapses of attention (Footnote 1).
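The quintile-based lapse measure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script; the function name and the example RTs are hypothetical, and the rule for handling trial counts that are not divisible by five is an assumption, since the text does not specify it.

```python
import math


def slowest_quintile_mean(correct_rts):
    """Mean RT of the slowest 20% of one participant's correct trials."""
    if not correct_rts:
        raise ValueError("need at least one correct RT")
    ranked = sorted(correct_rts)  # rank order from fastest to slowest
    # Assumption: when the trial count is not divisible by 5,
    # the slowest bin holds ceil(n / 5) trials.
    bin_size = max(1, math.ceil(len(ranked) / 5))
    slowest_bin = ranked[-bin_size:]  # quintile 5
    return sum(slowest_bin) / len(slowest_bin)


# Hypothetical correct RTs (ms), congruent and incongruent trials pooled
example_rts = [520, 610, 580, 700, 655, 590, 1200, 640, 930, 605]
lapse_score = slowest_quintile_mean(example_rts)  # mean of the two slowest RTs
```

Pooling congruent and incongruent trials before ranking follows the description in the text; error trials are excluded before the function is called.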

Thought probes

During the Stroop task, participants were presented with 12 thought probes randomly after incongruent trials, asking them to classify their immediately preceding thoughts. The thought probes asked participants to press one of five keys to indicate what they were thinking just prior to the appearance of the probe. Specifically, participants saw:

Please characterize your current conscious experience.

  1. I am totally focused on the current task.

  2. I am thinking about my performance on the task.

  3. I am distracted by sights/sounds/temperature or by physical sensations (hungry/thirsty).

  4. I am daydreaming/my mind is wandering about things unrelated to the task.

  5. I am not very alert/my mind is blank.

During the introduction to the task, participants were given specific instructions regarding the different categories. Response 1 was considered on-task. Response 2 measures task-related interference and was not included in the analyses. Responses 3–5 were considered as off-task thinking. Prior research has demonstrated that the different off-task probes are correlated at the individual differences level and that variance common to the various off-task probes is what is important for the relation between WMC and attention control (Unsworth & McMillan, 2014). Thus, responses 3–5 were combined into a single off-task measure.
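The scoring rule above (responses 3 to 5 combined into one off-task measure) can be sketched as a simple proportion. The function name and example responses are hypothetical, and the choice of denominator is an assumption: the text says response 2 was not analyzed but does not say whether those probes were dropped from the denominator.

```python
def off_task_proportion(probe_responses):
    """Proportion of thought probes answered 3, 4, or 5 (off-task)."""
    valid = [r for r in probe_responses if r in (1, 2, 3, 4, 5)]
    if not valid:
        raise ValueError("no valid probe responses")
    off_task = sum(1 for r in valid if r >= 3)  # responses 3-5 are off-task
    # Assumption: the denominator is all answered probes, including response 2.
    return off_task / len(valid)


# Twelve hypothetical probe responses from one participant
probes = [1, 1, 3, 4, 1, 2, 5, 1, 1, 4, 1, 2]
score = off_task_proportion(probes)
```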

Eye tracking

For the Stroop task (and the other attention control tasks), participants were tested individually in a dimly lit room. Pupil diameter was continuously recorded binocularly at 120 Hz using a Tobii T120 eyetracker, integrated in a 17-inch TFT monitor. Data from each participant’s left eye were used. Participants were seated approximately 60 cm from the screen. Missing data points due to blinks, off-screen fixations, and/or eyetracker malfunction were removed (roughly 9.1% of the overall data in Experiment 1 and 16.3% in Experiment 2). Pretrial baseline responses were computed as the average pupil diameter during the fixation screen (2,000 ms) for each task. At the suggestion of a reviewer, we examined both stimulus-locked and response-locked phasic pupillary responses. Stimulus-locked phasic pupillary responses were corrected by subtracting out baseline pupil diameter and were time locked to when the stimulus was presented on a trial-by-trial basis for each participant. Specifically, Stroop phasic responses were time locked to the appearance of the colored word. Phasic pupillary responses were also time locked to the response on a trial-by-trial basis. Results for both phasic pupillary responses are presented. To examine the time course of the phasic pupillary responses, the pupil data were averaged into a series of 20-ms time windows following stimulus onset for each trial. The dependent measure was the peak task-evoked response. Specifically, we computed the maximum pupillary response following stimulus onset (or response) within each trial and then averaged these maximum values across trials for each participant. These peak task-evoked responses were then examined for correct congruent, correct incongruent, and error trials.
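The peak task-evoked response described above can be sketched for a single trial as follows. This is an illustrative reading of the procedure, not the authors' pipeline: the function names and data layout are hypothetical, and taking the maximum over the 20-ms window means (rather than over raw samples) is an assumption.

```python
def peak_phasic_response(baseline, samples, window_ms=20):
    """Peak baseline-corrected pupil change for one trial.

    baseline: pupil diameters from the fixation period (None = missing sample).
    samples: (time_ms, diameter) pairs measured from the time-locking event.
    """
    base_vals = [d for d in baseline if d is not None]
    if not base_vals:
        return None  # no usable baseline for this trial
    base_mean = sum(base_vals) / len(base_vals)
    windows = {}
    for t_ms, d in samples:
        if d is None or t_ms < 0:
            continue  # drop missing samples (blinks, tracker loss)
        windows.setdefault(int(t_ms // window_ms), []).append(d - base_mean)
    if not windows:
        return None
    # Assumption: the peak is the largest 20-ms window mean
    return max(sum(v) / len(v) for v in windows.values())


def mean_peak(trials):
    """Average the per-trial peaks across a participant's (baseline, samples) trials."""
    peaks = [p for p in (peak_phasic_response(b, s) for b, s in trials) if p is not None]
    return sum(peaks) / len(peaks) if peaks else None


# Hypothetical trial: flat 3.0 baseline, dilation peaking in the 20-40 ms window
trial_baseline = [3.0, None, 3.0]
trial_samples = [(0, 3.0), (8, 3.1), (17, 3.1), (25, 3.3), (33, 3.5), (42, None), (50, 3.2)]
peak = peak_phasic_response(trial_baseline, trial_samples)
```

The same function serves stimulus-locked and response-locked analyses; only the event that defines time zero changes.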

Results and discussion

Consistent with much prior research, there were significant Stroop effects for both RT (M_Incongruent = 824.44, SD = 199.42; M_Congruent = 659.52, SD = 175.19; M_Stroop = 164.92, SD = 107.86), t(156) = 19.16, p < 0.001, d = 1.52, and accuracy (M_Incongruent = 0.93, SD = 0.06; M_Congruent = 0.97, SD = 0.04; M_Stroop = 0.04, SD = 0.06), t(156) = 10.05, p < 0.001, d = 0.80, in which incongruent trials were slower and less accurate than congruent trials (Footnote 2). Overall accuracy was high (M = 0.95, SD = 0.04) with participants committing 4.69 (SD = 3.87, range 0–26) errors on average and 84.7% of participants committing at least two errors (Footnote 3).
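The reported RT statistics are consistent with a paired-samples t test on difference scores, with the effect size computed as the mean difference divided by the standard deviation of the differences. That reading is an assumption, since the text does not name the formula, and the function name below is hypothetical. Under it, the summary values for the RT Stroop effect reproduce the reported t and come close to the reported d.

```python
import math


def paired_t_and_dz(mean_diff, sd_diff, n):
    """Paired-samples t and Cohen's d_z from difference-score summary statistics."""
    d_z = mean_diff / sd_diff                 # standardized mean difference
    t = mean_diff / (sd_diff / math.sqrt(n))  # equivalently d_z * sqrt(n)
    return t, d_z


# RT Stroop effect as reported: M = 164.92, SD = 107.86, N = 157
t_rt, d_rt = paired_t_and_dz(164.92, 107.86, 157)
```

Here `t_rt` is about 19.16 and `d_rt` about 1.53, against the reported 19.16 and 1.52; the small gap in d is within what rounding of the summary values could produce.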

Turning to the pupillary responses, we next examined whether stimulus-locked error pupillary responses would be larger than phasic responses for correct congruent and incongruent trials as has been seen previously. Peak responses for correct congruent, correct incongruent, and errors were submitted to a repeated measures ANOVA. Note that these analyses consist of only 119 participants given that not all participants made an error and given that some participants did not have clean phasic pupillary responses for errors (due to missing data and blinks). There was a main effect of response, F(2, 236) = 16.44, MSE = 0.01, p < 0.001, partial η² = 0.12. As shown in Fig. 1a, errors demonstrated the largest phasic pupillary responses (M = 0.16, SD = 0.18), followed by incongruent trials (M = 0.11, SD = 0.08), and then congruent trials (M = 0.09, SD = 0.06). Specifically, phasic pupillary responses for incongruent trials were larger than congruent trials, t(148) = 4.36, p < 0.001, d = 0.36, and error phasic pupillary responses were larger than both incongruent, t(118) = 3.00, p = 0.003, d = 0.28, and congruent, t(118) = 4.74, p < 0.001, d = 0.43, phasic pupillary responses. Note, the waveforms are presented for visualization purposes. Similar results were obtained when examining response-locked pupillary responses (Fig. 1b). Specifically, phasic pupillary responses for incongruent trials were larger than congruent trials, t(147) = 3.67, p < 0.001, d = 0.30, and error phasic pupillary responses were larger than both incongruent, t(118) = 4.62, p < 0.001, d = 0.42, and congruent, t(118) = 5.95, p < 0.001, d = 0.55, phasic pupillary responses.

Fig. 1

(a) Change in pupil diameter for stimulus-locked correct congruent, correct incongruent, and error trials in Experiment 1. (b) Change in pupil diameter for response-locked correct congruent, correct incongruent, and error trials in Experiment 1. Shaded areas reflect one standard error of the mean

Next, we examined correlations among the different measures of interest. Descriptive statistics for all of the measures are shown in Table 1. The measures had generally acceptable values of internal consistency and most of the measures were approximately normally distributed with values of skewness and kurtosis under the generally accepted values (i.e., skewness < 2 and kurtosis < 4). The correlations are shown in Table 2 (Footnote 4). Note that these correlations are Spearman rhos rather than the typical Pearson correlations, because there were two potential outliers present for the error pupillary responses, and Spearman’s rho is better suited for dealing with outliers than Pearson correlations (de Winter et al., 2016). See supplemental materials for scatter and density plots for relations with stimulus-locked error pupillary responses. The individual WMC measures were correlated, the two lapses measures were correlated, and similar overall relations were demonstrated for stimulus-locked and response-locked pupillary responses.
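To illustrate why a rank-based correlation is less sensitive to outliers, the sketch below computes Spearman's rho as a Pearson correlation on average ranks and compares the two coefficients on a small made-up data set with one extreme value. The data and function names are hypothetical and are not taken from the study.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores; the last y value is an extreme outlier.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 100]
print(pearson(x, y), spearman(x, y))
```

In this toy example the ordering of the scores is perfectly preserved, so rho is unaffected by the outlier while the Pearson coefficient is pulled well below 1.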

Table 1 Descriptive statistics and reliability estimates for all measures in Experiment 1
Table 2 Correlations among all measures in Experiment 1

To further test these relationships and to examine whether lapses of attention account for the relationship between WMC and error monitoring, we next created WMC and lapse composites. Specifically, to ensure that any relationships with WMC and lapses were not due to idiosyncratic task effects and to ensure that we were measuring the broad constructs, we computed composite scores. That is, single tasks represent a combination of construct variance along with task-specific method variance (Wittmann, 1988). Thus, to ensure that true abilities are being measured, one should use several tasks designed to tap the ability of interest. Therefore, a composite WMC score was computed for each participant by z-transforming each complex span task. Then, these z-scores were averaged together for each participant. Similar to the WMC composite, we created a lapse composite by z-transforming both the slowest 20% of trials in the Stroop and off-task measure from the Stroop. These z-scores were averaged together for each participant. The WMC and lapse composites were correlated (rs =  − 0.31, p < 0.001). WMC was correlated with both stimulus-locked (rs = 0.21, p = 0.020) and response-locked (rs = 0.22, p = 0.018) error pupillary responses. Similarly, the lapse composite was correlated with both stimulus-locked (rs =  − 0.27, p = 0.004) and response-locked (rs =  − 0.25, p = 0.007) error pupillary responses. The stimulus-locked and response-locked pupillary responses were strongly correlated (rs = 0.80, p < 0.001).
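The composite procedure described above (z-transform each task across participants, then average the z-scores within each participant) can be sketched as follows. The scores and names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one measure across participants (sample SD assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(*measures):
    """Average the z-scores of several measures for each participant.

    Each argument is a list with one score per participant, in the same
    participant order.
    """
    z_columns = [z_scores(column) for column in measures]
    return [mean(row) for row in zip(*z_columns)]


# Hypothetical span scores for three participants (not study data).
ospan = [10, 20, 30]
symspan = [5, 10, 15]
rspan = [2, 4, 6]
wmc_composite = composite(ospan, symspan, rspan)
print(wmc_composite)
```

The same function would apply to the lapse composite, provided both lapse indicators are scored in the same direction (higher values meaning more lapses).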

Having demonstrated relationships between WMC, the lapse composite, and the error pupillary responses, we next examined how WMC and lapses would account for variation in the error pupillary responses and whether lapses would largely account for the relation between WMC and error monitoring. Therefore, we ran a simultaneous regression in which the WMC and lapse composites predicted the error pupillary responses. As shown in Table 3, the measures accounted for 9% of the variance in the error pupillary responses. Importantly, only the lapse composite accounted for unique variance in error pupillary responses. Thus, these results suggest that the relationship between WMC and error monitoring was largely accounted for by shared variance with lapses of attention. Similar results were seen (Table 4) when examining the response-locked pupillary responses.
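For a regression with two predictors, the standardized coefficients and R² can be written in terms of the three pairwise correlations. The sketch below plugs in the Spearman correlations reported above for the stimulus-locked responses in Experiment 1. This is only an approximation under stated assumptions: the regression itself was presumably run on participant-level scores, the correlations come from slightly different subsamples, and the coefficients in Table 3 are not reproduced in the text. The function name is hypothetical.

```python
def two_predictor_regression(r_y1: float, r_y2: float, r_12: float):
    """Standardized betas and R^2 for y regressed on predictors 1 and 2.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta_1 * r_y1 + beta_2 * r_y2
    return beta_1, beta_2, r_squared


# Experiment 1, stimulus-locked error pupillary response as the outcome.
# Predictor 1 = WMC composite, predictor 2 = lapse composite.
beta_wmc, beta_lapse, r2 = two_predictor_regression(
    r_y1=0.21, r_y2=-0.27, r_12=-0.31
)
print(round(beta_wmc, 2), round(beta_lapse, 2), round(r2, 2))
```

With these inputs, R² comes out at roughly 0.09, in line with the 9% of variance reported above, and the lapse coefficient is larger in magnitude than the WMC coefficient.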

Table 3 Simultaneous regression predicting stimulus-locked error pupillary responses in Experiment 1
Table 4 Simultaneous regression predicting response-locked error pupillary responses in Experiment 1

Overall, Experiment 1 suggested that error phasic pupillary responses were larger than phasic pupillary responses for correct trials consistent with prior research. These error phasic pupillary responses were related to WMC and to indicators of lapses of attention. Importantly, the relationship between WMC and error monitoring was accounted for by shared variance with lapses of attention, suggesting that low WMC individuals have poorer error monitoring abilities than high WMC individuals, in part, because low WMC individuals experience more fluctuations in attention across trials, resulting in temporary reductions in error monitoring.

Experiment 2

The purpose of Experiment 2 was to replicate and extend the results from Experiment 1. Specifically, given the somewhat weak relation between WMC and error monitoring in Experiment 1, we wanted to replicate this finding. Additionally, in Experiment 2, we wanted to assess the potential associations between error awareness and WMC and lapses of attention. As noted previously, error awareness has been postulated to be related to overall levels of task engagement, with aware errors resulting in larger phasic pupillary responses than unaware errors. As such, we wanted to examine whether WMC would be related to error awareness and whether lapses of attention would account for this relation. To examine this, participants performed the same three complex WMC tasks from Experiment 1 and a version of the Stroop task. The version of the Stroop task used in the current experiment was the same as what was used in Experiment 1 with the following exceptions. First, we increased the number of trials to 150 to increase the reliability of the results and increase the possibility of error trials. Second, we changed the proportion congruency to 80–20 in order to make it more difficult to maintain the task goal and potentially increase the number of errors made (Kane & Engle, 2003). Finally, we included an assessment of error awareness where after every trial participants had to indicate whether the prior response was correct or incorrect. Because of the inclusion of the error awareness assessment after each trial, we took out the thought probes assessing off-task thinking and only relied on the slowest reaction times in the Stroop as the indicator of lapses of attention.

Method

Participants

A total of 126 participants were recruited from the subject pool at the University of Oregon. Participants were 58.2% female with an average age of 19.51 (SD = 1.75). Data from 122 participants were complete for the Stroop and the three complex span tasks. Participants received course credit for their participation. Each participant was tested individually in a laboratory session lasting approximately 1.5 h. We tested participants over two full academic quarters until we had a minimum of 120 participants, consistent with Experiment 1.

Materials and procedure

After signing informed consent, all participants completed operation span, symmetry span, reading span, delayed free recall, Stroop, and antisaccade. All tasks were administered in the order listed above.

Working memory capacity (WMC) tasks

Same as Experiment 1.

Stroop

Same as Experiment 1 with the following exceptions:

Participants received 150 trials. Of these trials, 80% were congruent such that the word and the font color matched (i.e., red printed in red) and the other 20% were incongruent (i.e., red printed in green). Following each trial, participants saw a screen asking if the last response was correct. They pressed a key labeled Y for correct responses and a key labeled N for incorrect responses. The screen remained until response, at which point the baseline screen for the next trial appeared.

Eye tracking

Same as Experiment 1.

Results and discussion

Similar to Experiment 1, and consistent with prior research, there were significant Stroop effects for both RT (M Incongruent = 953.45, SD = 237.56; M Congruent = 739.14, SD = 178.33; M Stroop = 214.31, SD = 119.17), t(121) = 19.86, p < 0.001, d = 1.80, and accuracy (M Incongruent = 0.94, SD = 0.05; M Congruent = 0.98, SD = 0.02; M Stroop = 0.04, SD = 0.05), t(121) = 8.45, p < 0.001, d = 0.77, in which incongruent trials were slower and less accurate than congruent trials. Overall accuracy was high (M = 0.97, SD = 0.02) with participants committing 3.89 (SD = 3.48, range 0–16) errors on average, and 72.1% of participants committing at least two errors.

Turning to the pupillary responses, we next examined whether stimulus-locked error pupillary responses would be larger than phasic responses for correct congruent and incongruent trials. Peak responses for correct congruent, correct incongruent, and errors were submitted to a repeated measures ANOVA. Note that these analyses consist of only 109 participants given that not all participants made an error and given that some participants did not have clean phasic pupillary responses for errors (due to missing data and blinks). There was a main effect of response, F(2, 216) = 74.77, MSE = 0.01, p < 0.001, partial η2 = 0.41. As shown in Fig. 2a, errors demonstrated the largest phasic pupillary responses (M = 0.23, SD = 0.19), followed by incongruent trials (M = 0.08, SD = 0.08), and then congruent trials (M = 0.07, SD = 0.07). Consistent with Experiment 1, phasic pupillary responses for incongruent trials were larger than congruent trials, t(119) = 2.62, p = 0.01, d = 0.24, and error phasic pupillary responses were larger than both incongruent, t(108) = 8.76, p < 0.001, d = 0.84, and congruent, t(108) = 8.89, p < 0.001, d = 0.85, phasic pupillary responses. Note, the waveforms are presented for visualization purposes. Similar results were obtained when examining response-locked pupillary responses (Fig. 2b). Specifically, phasic pupillary responses for incongruent trials were larger than congruent trials, t(118) = 4.78, p < 0.001, d = 0.44, and error phasic pupillary responses were larger than both incongruent, t(108) = 6.18, p < 0.001, d = 0.59, and congruent, t(108) = 8.07, p < 0.001, d = 0.77, phasic pupillary responses.

Fig. 2

(a) Change in pupil diameter for stimulus-locked correct congruent, correct incongruent, and error trials in Experiment 2. (b) Change in pupil diameter for response-locked correct congruent, correct incongruent, and error trials in Experiment 2. Shaded areas reflect one standard error of the mean

We also examined possible differences between aware and unaware errors. As noted previously, prior research has suggested that aware errors generate a larger phasic response than unaware errors (Harsay et al., 2018; Wessel et al., 2011). To see whether we replicated this effect, we compared error phasic responses for trials where participants correctly classified the trial as an error (aware error) to trials where participants classified the error trial as being correct (unaware error). Overall error awareness was high with participants correctly classifying 91% (SD = 18) of their errors. Thus, only 26 participants had both aware and unaware errors with clean phasic responses and were available for this analysis. Replicating prior research, aware errors resulted in a larger phasic pupillary response (M = 0.20, SD = 0.12) than unaware errors (M = 0.04, SD = 0.16), t(25) = 3.97, p = 0.001, d = 0.83. Harsay et al. (2018) also found that unaware errors were preceded by larger baseline pupil diameters than aware errors and suggested that this finding was consistent with the notion that unaware errors were linked with task disengagement similar to lapses of attention (Konishi et al., 2017; Unsworth & Robison, 2016; van den Brink et al., 2016). To see whether similar results were obtained in the current experiment, we examined pupil diameter during the pretrial baseline period for aware and unaware errors. The pretrial baselines were z-score normalized within each participant to correct for individual differences in pupil diameter, and we compared the z-scored baselines for aware and unaware errors within participants. Only 30 participants were available for this analysis. Consistent with Harsay et al. (2018), unaware errors were associated with larger pretrial baselines (M = 0.34, SD = 0.75) than aware errors (M =  − 0.11, SD = 0.38), t(29) = 3.11, p = 0.004, d = 0.61. Thus, consistent with prior research, unaware errors were associated with larger pretrial baseline pupil diameters and smaller phasic pupillary responses than aware errors, suggesting that unaware errors are associated with task disengagement and lapses of attention.
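The within-participant normalization of pretrial baselines can be sketched as below for a single participant. The sketch assumes that baselines are standardized across all of that participant's trials before averaging by trial type; the text does not specify which trials entered the normalization, and the data, labels, and function names are hypothetical.

```python
from statistics import mean, stdev


def zscore_within(baselines):
    """Z-score one participant's pretrial baselines across their own trials."""
    m, s = mean(baselines), stdev(baselines)
    return [(b - m) / s for b in baselines]


def mean_z_by_label(baselines, labels):
    """Mean z-scored baseline for each trial label within one participant."""
    groups = {}
    for z, label in zip(zscore_within(baselines), labels):
        groups.setdefault(label, []).append(z)
    return {label: mean(values) for label, values in groups.items()}


# Hypothetical baseline pupil diameters (arbitrary units) for one participant.
baselines = [4.0, 4.2, 4.1, 4.6, 3.9]
labels = ["correct", "aware", "correct", "unaware", "correct"]
by_label = mean_z_by_label(baselines, labels)
print(by_label)
```

Each participant with both error types would then contribute one aware and one unaware value to a paired comparison across participants.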

Next, we examined correlations among the different measures of interest. Descriptive statistics for all of the measures are shown in Table 5. The correlations are shown in Table 6. Similar to Experiment 1, these correlations are Spearman rhos, because there were two potential outliers present for the error awareness data. Both participants had error awareness scores of zero and only committed two errors. Thus, it seems likely that these reflect real instances of error blindness. Furthermore, the error awareness measure was negatively skewed. Overall similar results were obtained when using Pearson correlations and when excluding these two participants. See supplemental materials for scatter and density plots for relations with stimulus-locked error pupillary responses. As shown in Table 6, the individual WMC measures were correlated, and error awareness tended to correlate with the slowest 20% of trials in the Stroop and with the error pupillary responses. Additionally, overall similar correlations were demonstrated for stimulus-locked and response-locked pupillary responses.

Table 5 Descriptive statistics and reliability estimates for all measures in Experiment 2
Table 6 Correlations among all measures in Experiment 2

Similar to Experiment 1, we next examined these relations by forming composites for WMC and the error measures. Specifically, we created a similar WMC composite as Experiment 1. Because the error pupillary response and the error awareness measures were correlated, we similarly created a z-score composite for error monitoring. Because there was only one lapse measure in this experiment, we were unable to create a lapse composite. The WMC composite and the lapse measure (slowest 20% of trials in the Stroop) were correlated (rs =  − 0.28, p = 0.002). Both WMC (rs = 0.19, p = 0.040) and the lapse measure (rs =  − 0.42, p < 0.001) correlated with error awareness. WMC was correlated with stimulus-locked error pupillary responses (rs = 0.24, p = 0.012) but not quite with the response-locked (rs = 0.16, p = 0.091) error pupillary responses. The lapse measure was correlated with both stimulus-locked (rs =  − 0.44, p < 0.001) and response-locked (rs =  − 0.40, p < 0.001) error pupillary responses. Error awareness was correlated with both stimulus-locked (rs = 0.28, p = 0.003) and response-locked (rs = 0.34, p < 0.001) error pupillary responses. Both WMC (rs = 0.24, p = 0.012) and the lapse measure (rs =  − 0.44, p < 0.001) correlated with the error monitoring composite.

Having demonstrated relations between WMC, the lapse measure, and error pupillary responses (in particular stimulus-locked pupillary responses), we next examined how WMC and lapses would account for variation in the error monitoring composite and whether lapses would largely account for the relation between WMC and error monitoring similar to Experiment 1. Therefore, we ran a simultaneous regression in which the WMC composite and the lapse measure predicted error monitoring. As shown in Table 7, the measures accounted for 21% of the variance in the error monitoring composite. Importantly, only the indicator of lapses accounted for unique variance in error monitoring. Thus, these results suggest that the relation between WMC and error monitoring was accounted for by shared variance with lapses of attention. Similar results were seen (Table 8) when examining the response-locked pupillary responses.

Table 7 Simultaneous regression predicting the error monitoring composite with stimulus-locked error pupillary responses in Experiment 2
Table 8 Simultaneous regression predicting the error monitoring composite with response-locked error pupillary responses in Experiment 2

Overall, Experiment 2 suggested that error phasic pupillary responses were larger than phasic pupillary responses for correct trials consistent with prior research and Experiment 1. These error phasic pupillary responses were related to WMC, error awareness, and lapses of attention. WMC was also related to error awareness such that high WMC individuals were better able to identify errors than low WMC individuals. Thus, WMC was related to overall error monitoring abilities. Importantly, consistent with Experiment 1, the relationship between WMC and error monitoring (error phasic pupillary responses and error awareness) was accounted for by shared variance with lapses of attention, suggesting that variation in the ability to consistently maintain task engagement was critical for the WMC to error monitoring relation.

Combined analysis

Given the similar results in the two experiments, we further examined the data via a combined cross-experimental analysis. This was performed to better examine potentially small relations among the measures with a larger combined sample with more power. Specifically, in the combined sample (N = 279 for WMC and Stroop slowest 20% of RTs; N = 228 for the pupillary measures) we had sufficient power to detect correlations of rs = 0.18 or larger. We again relied on Spearman rhos to account for potential outliers. The WMC composite and the lapse measure (slowest 20% of trials in the Stroop) were correlated (rs =  − 0.34, p < 0.001). WMC was correlated with both the stimulus-locked (rs = 0.21, p = 0.001; Fig. 3a) and the response-locked (rs = 0.17, p = 0.012) error pupillary responses. The lapse measure was likewise correlated with both stimulus-locked (rs =  − 0.23, p < 0.001; Fig. 3b) and response-locked (rs =  − 0.22, p < 0.001) error pupillary responses.
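The sensitivity statement above can be approximated with the Fisher z transformation. The sketch below assumes a two-tailed alpha of 0.05 and power of 0.80, neither of which is stated in the text, and it uses the normal approximation for a Pearson correlation, so values for Spearman's rho may differ slightly. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def min_detectable_r(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest correlation detectable with the given n, alpha, and power.

    Uses the Fisher z approximation, where the standard error of
    atanh(r) is 1 / sqrt(n - 3).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = NormalDist().inv_cdf(power)
    return math.tanh((z_alpha + z_power) / math.sqrt(n - 3))


# Sample sizes reported for the combined analysis.
r_pupil = min_detectable_r(228)
r_behavioral = min_detectable_r(279)
print(round(r_pupil, 3), round(r_behavioral, 3))
```

Under these assumed settings, the smaller pupillary sample (N = 228) yields a minimum detectable correlation close to the 0.18 stated above, and the larger behavioral sample yields a slightly smaller one.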

Fig. 3

(a) Scatter and density plots for relation between working memory capacity (WMC) and stimulus-locked error pupillary (StErrPupS) responses in the combined data. (b) Scatter and density plots for relation between the slowest 20% of trials in Stroop (StSlow) and stimulus-locked error pupillary (StErrPupS) responses in the combined data

We next examined how WMC and lapses would account for variation in the error pupillary responses and whether lapses would account for the relation between WMC and error monitoring. Therefore, we ran a simultaneous regression in which WMC and an indicator of lapses (slowest 20% of RTs) predicted the error pupillary responses. As shown in Table 9, the measures accounted for 7% of the variance in the error pupillary responses. In this analysis, both WMC and the lapse measure accounted for unique variance in error pupillary responses. Generally, similar results were seen (Table 10) when examining the response-locked pupillary responses, except that here only the lapse measure accounted for unique variance. Overall, these results suggest that the relation between WMC and error monitoring was largely accounted for by shared variance with lapses of attention.

Table 9 Simultaneous regression predicting stimulus-locked error pupillary responses in the combined data
Table 10 Simultaneous regression predicting response-locked error pupillary responses in the combined data

General discussion

In two experiments, we examined relationships between individual differences in WMC and error monitoring abilities. In both experiments, WMC was related to the size of the error phasic pupillary response, suggesting that high WMC individuals are better at error monitoring than low WMC individuals. These results corroborate prior EEG results (Coleman et al., 2018; Miller et al., 2012), suggesting that high WMC individuals have better error monitoring abilities than low WMC individuals. As such, these results extend prior research examining relations between WMC and cognitive control by demonstrating that individual differences in WMC are related to error monitoring abilities in addition to goal maintenance and conflict resolution abilities (Kane & Engle, 2003; Meier & Kane, 2013, 2015; Unsworth et al., 2012). The current results also extend prior research demonstrating a relation between WMC and error monitoring by demonstrating an explicit association between WMC and error awareness. Specifically, in Experiment 2 when assessing error awareness, we found that high WMC individuals are more aware of their errors than low WMC individuals. This notion of differences in error awareness was suggested by prior results from Miller et al. (2012) and Coleman et al. (2018) who found that WMC was related to Pe (a putative index of error awareness), but these prior studies did not explicitly test for differences in error awareness. The current results suggest that individual differences in WMC are indeed related to variation in error awareness. These results are reminiscent of prior research which has shown that high WMC individuals are better at classifying errors in free recall tasks than low WMC individuals (Unsworth & Brewer, 2010). Finding similar differences in error awareness in the Stroop task suggests that WMC is related to broad error awareness abilities that likely cut across multiple tasks. Overall, the current results are consistent with prior research suggesting that individual differences in WMC are related to error monitoring abilities.

Furthermore, consistent with prior research, the current results suggest that error monitoring abilities seem particularly tied to variation in task engagement. As noted previously, a number of studies have suggested that error monitoring is associated with task engagement both within and between individuals (Harsay et al., 2018; O’Connell et al., 2009; Shalgi et al., 2007). Specifically, when individuals experience a lapse of attention, error monitoring abilities are reduced. When participants seem fully engaged in the task, error monitoring abilities are working appropriately. Furthermore, this seems particularly tied to error awareness, with aware errors being associated with behavioral, neural, and physiological markers of task engagement and unaware errors being associated with markers of lapses of attention. As such, Harsay et al. (2018) suggested that “error awareness corresponds to engagement whereas error blindness corresponds to disengagement” (p. 7). Harsay et al. further suggested that “error blindness can be argued to reflect disengagement from the task, perhaps due in part to attentional lapses” (p. 8). Consistent with these notions, we found that unaware errors were associated with smaller phasic pupillary responses and larger baseline pupil diameters than aware errors. These results are consistent with prior research which has found that lapses of attention are similarly associated with smaller phasic pupillary responses and larger baseline pupil diameters (Konishi et al., 2017; Unsworth & Robison, 2016; van den Brink et al., 2016). Thus, these results are consistent with the notion that unaware errors are associated with lapses of attention. Furthermore, at the inter-individual level we found that error monitoring and error awareness were associated with behavioral measures of lapses in both experiments (combined with self-reported off-task thinking in Experiment 1), suggesting that individuals who demonstrated deficits in error monitoring tended to have more fluctuations in attention than individuals with superior error monitoring abilities. Thus, at both the intra- and inter-individual levels, the current evidence suggested that error monitoring was associated with fluctuations in attention.

Given the relationships between WMC and error monitoring as well as the relationships between lapses of attention and error monitoring, we also tested the hypothesis that the relationship between WMC and error monitoring would be due to shared variance with lapses. As noted previously, much prior research has suggested that the relation between WMC and aspects of cognitive control is due to fluctuations in control leading to a greater frequency of periodic drifts in task engagement for low WMC individuals compared with high WMC individuals. Thus, we hypothesized that the relationship between WMC and error monitoring might be due to individual differences in fluctuations in attention. Consistent with this hypothesis, we found in both experiments that markers of lapses of attention largely accounted for the relation between WMC and error monitoring. These results suggest that individuals need to be in an attentive state to fully engage the error monitoring system and catch errors. Low WMC individuals are more likely than high WMC individuals to experience lapses, which not only hurt performance but also reduce error monitoring. That is, periodic drifts in task engagement likely not only result in performance issues (longer and more variable RTs) but also result in downstream effects where error monitoring is disrupted. As such, the current results suggest that WMC is related to error monitoring abilities, but this association is largely due to shared variation in the ability to stay consistently engaged with the current task. Future research is needed to examine the extent to which the WMC-error monitoring relation is due specifically to variation in lapses of attention, or is potentially due to variation in broader cognitive control abilities of which individual differences in preventing lapses of attention are just a subset.

While the current results suggest that individual differences in WMC and lapses of attention are related to error monitoring abilities, it is important to note that one main limitation of the current study is that the overall number of errors participants made in each experiment was very low. Specifically, in both experiments participants committed only about four errors on average. Thus, few error trials contributed to the analyses. In Experiment 2, we attempted to increase the number of errors by increasing the overall number of trials and increasing proportion congruency. However, these changes actually resulted in slightly fewer errors in Experiment 2 than in Experiment 1 (3.89 vs. 4.69 errors on average). It is not entirely clear why fewer errors were observed, but it is possible that by including the error awareness measure after each trial, we slowed the overall pacing of the task, resulting in better goal maintenance and, hence, fewer errors (De Jong et al., 1999; Jackson & Balota, 2013). In both experiments, even with very few trials, the error phasic pupillary response was robust, indicating a much larger phasic response than on correct incongruent or congruent trials and replicating prior research. Furthermore, the error phasic pupillary response had moderate reliability in each experiment. These results are consistent with prior error monitoring research, which has suggested that the ERN and Pe are robust and reliable with as few as six trials contributing to the analysis (Olvet & Hajcak, 2009). Nevertheless, more error trials are preferred to assess the stability of the results. As such, future research is needed to examine these relations in variants of the Stroop with larger numbers of trials in order to increase the overall number of error responses and better assess the reliability of the error phasic pupillary measure and the replicability of the current results. Furthermore, future research is needed utilizing other tasks that have been used to examine error monitoring (e.g., flanker, stop-signal) to examine the replicability and generalizability of the results.

Conclusions

Collectively, the current results suggest that individual differences in WMC are related to error monitoring abilities. This relationship seems to be largely driven by shared variation in lapses of attention such that low WMC individuals are unable to consistently maintain attention on the current task resulting in more lapses of attention compared to high WMC individuals. These lapses of attention result in performance deficits and reduced error monitoring. Overall, these results are consistent with the notion that individual differences in WMC are related to individual differences in broad cognitive control abilities and are specifically related to error monitoring abilities. Furthermore, the results suggest that error monitoring abilities are related to individual differences in lapses of attention. By combining experimental, differential, and physiological techniques, we will be better able to clarify error monitoring processes and for whom these processes are deficient.