
The role of attention control in complex real-world tasks


Working memory capacity is an important psychological construct, and many real-world phenomena are strongly associated with individual differences in working memory functioning. Although working memory and attention are intertwined, several studies have recently shown that individual differences in the general ability to control attention are more strongly predictive of human behavior than working memory capacity. In this review, we argue that researchers would therefore generally be better served by studying the role of attention control rather than memory-based abilities in explaining real-world behavior and performance in humans. The review begins with a discussion of relevant literature on the nature and measurement of both working memory capacity and attention control, including recent developments in the study of individual differences in attention control. We then selectively review existing literature on the role of both working memory and attention in various applied settings and explain, in each case, why a switch in emphasis to attention control is warranted. Topics covered include psychological testing, cognitive training, education, sports, police decision-making, human factors, and disorders within clinical psychology. The review concludes with general recommendations and best practices for researchers interested in conducting studies of individual differences in attention control.

A robust finding from the cognitive psychology literature is that the amount of information humans can temporarily hold in consciousness at any given time is severely limited (e.g., Atkinson & Shiffrin, 1968; Broadbent, 1958; Cowan, 2001; Jahanshahi et al., 2008; Lachman et al., 1979; Miller, 1956; Shannon & Weaver, 1949). Miller et al. (1960) coined the term working memory to distinguish between the passive holding of information (short-term memory) versus memory involved in planning and carrying out behavior in the service of ongoing mental activity. The concept of working memory was therefore developed to refer to the more controlled and adaptive aspects of information processing. More formally, working memory can be defined as a limited capacity system that allows the temporary storage, manipulation, and maintenance of information in performing complex cognitive tasks (cf. Baddeley, 2000) and can be conceived as a description of how short-term memory is used while under cognitive load.

Baddeley and Hitch (1974; also see Baddeley, 1992) later developed the tripartite model of working memory, which heavily influenced subsequent research on working memory and human cognition more broadly. In this model, working memory consists of two storage systems—the phonological loop for verbal information and visuospatial sketchpad for visual/spatial information—and the central executive. The central executive is a flexible system that is responsible for the processing and manipulation of information, such as integrating information from the phonological loop and visuospatial sketchpad and connecting short-term memory to long-term memory (Baddeley, 2000). The central executive is also particularly important for managing the flow of information in situations of high cognitive load or demand in which capacity limits are exceeded and selective attention is required (e.g., Fukuda et al., 2016; Unsworth et al., 2004).

Working memory capacity

Definition and assessment

The functional capability of one’s working memory system is known as working memory capacity and is often indexed in terms of the number of units of information an individual can hold in primary memory at a given time (often referred to as span) while under cognitive load. Working memory capacity tasks are similar to simple span measures of short-term memory, in which a series of to-be-remembered stimuli are presented followed by immediate recall, but working memory measures must necessarily involve additional cognitive demand (Table 1; see Conway et al., 2005; Oberauer, 2005). The purpose of the additional demand is to prevent the respondent from rehearsing to-be-remembered information, thereby requiring them to maintain and manage the memoranda through more controlled processing, thus engaging the central executive. In complex span tasks (Fig. 1), this additional demand takes the form of a secondary processing (distractor) task, based on the storage-plus-processing working memory hypothesis laid out by Baddeley and Hitch (1974). But the additional demand can come in other forms. For example, in working memory updating tasks, cognitive load is imposed via the requirement to continuously add new information into primary memory at the expense of previously relevant, but now irrelevant, information (see Fig. 2 for two examples of updating tasks).

Table 1 Commonly used working memory capacity tasks
Fig. 1

Complex span tasks. In each task, the storage and processing components are independent. After 2–9 trials of storage + processing are presented, the examinee is asked to recall the memoranda in the order in which they were presented. Performance is often scored using the partial span score, which is simply the total number of items recalled in the correct position (see Conway et al., 2005). Pictures are to scale
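The partial span score described in the caption above can be sketched in a few lines. This is a minimal illustration only, assuming that responses are aligned to serial positions and that an omitted item is recorded as None; the function names and the example trial are hypothetical and are not taken from Conway et al. (2005).

```python
def partial_span_score(presented, recalled):
    """Count the items recalled in their correct serial position for one trial."""
    return sum(1 for target, response in zip(presented, recalled) if target == response)


def total_partial_score(trials):
    """Sum the position-correct items over a list of (presented, recalled) trials."""
    return sum(partial_span_score(presented, recalled) for presented, recalled in trials)


# Hypothetical trial: five letters to remember, one omission (None) and one error.
presented = ["F", "K", "P", "Q", "T"]
recalled = ["F", "K", None, "Q", "R"]
trial_score = partial_span_score(presented, recalled)  # F, K, and Q are in position
```

Under this scoring, a respondent earns credit for every item placed correctly even when the full list is not recalled perfectly.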

Fig. 2

Working memory updating tasks. a In running span, to-be-remembered stimuli are presented sequentially, and the respondent is asked to recall the last n of them in order. Here, n = 3 and the correct response is “6, 1, 3.” b In this example of mental counters, the respondent begins each trial by imagining the number “555.” Then, a series of cues quickly flash above or below lines that correspond to each counter, indicating that the respondent should update the original number by adding (flash above line) or subtracting (flash below line) 1 from the appropriate counter. In this example trial of Set Size 3, the correct response is “554.” Presentation times in updating tasks are generally quick to prevent rehearsal. Not to scale
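The correct response in each updating task follows mechanically from the stimulus stream, which the sketch below tries to make concrete. It is an illustration under stated assumptions: the digit stream and the cue sequence are hypothetical (the caption reports only the correct answers, not the full stimuli), they were chosen so that the results match the caption's answers, and the cue encoding as (counter index, "above" or "below") is an invented convenience.

```python
def running_span_answer(stream, n):
    """Return the last n items of the stream, in presentation order."""
    return list(stream[-n:])


def mental_counters_answer(start, cues):
    """Apply +1 ("above") or -1 ("below") updates to the cued counters.

    start: list of starting values, e.g. [5, 5, 5] for "555".
    cues:  list of (counter_index, direction) pairs, hypothetical encoding.
    """
    step = {"above": 1, "below": -1}
    counters = list(start)
    for index, direction in cues:
        counters[index] += step[direction]
    return counters


# Hypothetical digit stream whose last three items match the caption's answer.
last_three = running_span_answer([8, 2, 6, 1, 3], n=3)

# Hypothetical cue sequence that ends on the caption's answer of 5, 5, 4.
final_counters = mental_counters_answer(
    [5, 5, 5], [(0, "above"), (0, "below"), (2, "below")]
)
```

In both tasks the respondent must drop information that was relevant a moment ago, which is the source of the cognitive load described above.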

Psychological importance

Working memory capacity has become an important construct in cognitive psychology as it has been repeatedly shown that it is a broad and domain-general ability (e.g., Kane et al., 2004) that correlates with a wide range of cognitive abilities and real-world behaviors, often above and beyond other predictors. Further, many psychopathologies and situational phenomena are marked by deficits in working memory capacity, and it is often suggested that higher levels of working memory capacity provide protective effects. The list of abilities and phenomena associated with working memory capacity is long and includes language and reading comprehension (Daneman & Carpenter, 1980; Daneman & Merikle, 1996), quantitative ability (Turner & Engle, 1989), acquisition of native language (Gathercole & Baddeley, 1989) and second language (Wen, 2015), following directions (Engle et al., 1991), reasoning ability and fluid intelligence (Engle et al., 1999; Kyllonen & Christal, 1990), attention control (Draheim et al., 2021; Shipstead et al., 2015) and everyday attention failures (Unsworth et al., 2012), long-term memory (McCabe, 2008), rejection of false memories (Leding, 2012), accuracy of eye-witness testimony (Jaschinski & Wentura, 2002), prospective memory (Brewer et al., 2010), multitasking (Redick et al., 2019), task switching (Draheim et al., 2016), emotion regulation (Schmeichel et al., 2008), performance after interruptions (Foroughi, Werner, et al., 2016c; Westbrook et al., 2018), performance during extreme sleep deprivation (Lopez et al., 2012), anxiety (Moran, 2016), depression (Nikolin et al., 2021), stress (K. Klein & Boals, 2001), schizophrenia (Forbes et al., 2009), posttraumatic stress disorder (Shaw et al., 2009), Alzheimer’s (Rosen et al., 2002), stereotype threat (Schmader & Johns, 2003), and alcoholism (Finn et al., 2002).

To provide some context for the popularity of assessing working memory capacity, our lab website has links to several versions of the complex span tasks for download. They have been downloaded thousands of times and independently translated into more than 15 languages despite the tasks requiring access to proprietary software (E-Prime) and an expensive license. According to Google Scholar, as of August 22, 2021, the methodological review and guide to measuring working memory capacity (Conway et al., 2005) has been cited more than 3,000 times, and the article that introduced the computerized version of the operation span task (Unsworth et al., 2005) has more than 2,000 citations. Further, the first two papers showing that complex span performance correlates with reading comprehension and scores on the Scholastic Aptitude Test combine for more than 12,000 citations (Daneman & Carpenter, 1980; Turner & Engle, 1989). Clearly, there is great interest in discovering and explaining associations between working memory and behavior.

Connecting working memory capacity, fluid intelligence, and attention control

One of the most notable features of working memory capacity is its substantial correlation with fluid intelligence, which is the ability to reason in novel situations (Engle et al., 1999). The precise magnitude of this relationship has been the subject of debate, but the two constructs typically share at least half of their variance at the latent level (see Kane et al., 2005; Oberauer et al., 2005). The relationship is sometimes considered to be causal in that individuals with higher levels of working memory capacity can better store and maintain representations that allow for generating and testing of hypotheses in fluid intelligence tasks (e.g., Chuderski et al., 2012; Kali, 2007; Kyllonen & Christal, 1990; Shah & Miyake, 1996; Unsworth et al., 2014; Verguts & De Boeck, 2002). But Engle (2002) proposed that attentional mechanisms largely account for individual differences in both working memory capacity and fluid intelligence, and therefore attentional mechanisms are the primary and causal reason for the relationship between the two constructs (see Barrett et al., 2004; Engle, 2002; Engle et al., 1999; Heitz et al., 2005, 2006; Shipstead et al., 2016; Unsworth & Engle, 2005; also see Burgoyne et al., 2019; Wiley et al., 2011). This executive attention view of individual differences in working memory capacity is notably compatible with K. Kovacs and Conway’s (2016) recent and novel account of intelligence called process overlap theory; for example, Conway et al. (2021) stated, “A central claim [of process overlap theory] is that domain-general executive attention processes play a critical role in intelligence, acting as a central bottleneck on task performance and a constraint on development of domain-specific cognitive abilities” (p. 2).

To map the executive attention theory of individual differences in working memory capacity to the Baddeley and Hitch (1974) model, the most important aspect of working memory is the central executive component and not the storage systems. Initial evidence for this claim came largely from extreme-groups studies showing that individuals who perform poorly on working memory capacity tasks also perform worse on attention-demanding tasks that have minimal memory demands. For example, individuals with a larger working memory capacity are better able to avoid looking at a flashing distractor in their peripheral vision to catch a target on the opposite side of the screen (Unsworth et al., 2004), quicker to narrow the size of their visual lens to only include stimuli in a target area of space (Heitz & Engle, 2007), better able to ignore their name in a dichotic listening task when it appears in the to-be-ignored channel (Conway et al., 2001), better at filtering distracting color words in Stroop tasks (Kane et al., 2001), and less likely to experience attentional lapses (McVay & Kane, 2009).

In a recent elaboration on the executive attention view, Shipstead et al. (2016) proposed that maintenance and disengagement represent core top-down attentional mechanisms through which working memory capacity relates to fluid intelligence (Fig. 3a). According to this hypothesis, maintenance and disengagement are both necessary in performing working memory capacity and fluid intelligence tasks, but measures of working memory and fluid intelligence place differential demands on one or the other (Fig. 3b). Working memory tasks place more demand on maintenance, and fluid intelligence tasks place more demand on disengagement (see also Mashburn et al., 2020). While the mechanisms of maintenance and disengagement are in opposition to one another, they work in tandem to facilitate goal-directed behavior. As such, the strong relationship between working memory capacity and fluid intelligence can be explained by their common reliance on a top-down executive attention system, which regulates both maintenance and disengagement. This top-down executive attention system is how attention control is implemented, which we define as the general ability to engage in goal-directed behavior via (1) maintaining goal-relevant behavior and information, particularly in the face of distraction and interference, and (2) filtering or otherwise blocking irrelevant and inappropriate information and behavior. Therefore, what distinguishes working memory capacity from attention control is that the former places more emphasis on specifically maintaining information in primary memory, whereas attention control refers to how limited-capacity domain-general attention is applied to the management of goal-directed behavior, which may (or may not) involve the maintenance of multiple pieces of information (also see Martin, Mashburn, & Engle, 2020).

Fig. 3

Maintenance/disengagement hypothesis. In the maintenance/disengagement hypothesis, top-down signals in the form of attention control organize maintenance and disengagement around a particular goal (a). The relative emphasis on maintenance and disengagement in carrying out these top-down goals relies largely on the nature and demands of the to-be-performed task (b). The pie charts show the hypothetical relative proportion of total performance variance attributable to specific processes. For example, working memory tasks will require more maintenance than disengagement, whereas fluid intelligence tasks will require more disengagement than maintenance. The mental counters task (illustrated in Fig. 2b) may correlate about as strongly with both fluid intelligence and working memory capacity measures because maintenance and disengagement are required to roughly the same degree. The percentages are for illustrative purposes only and should not be considered veridical

In the maintenance/disengagement hypothesis, attention control is the commonality between working memory capacity and fluid intelligence and therefore the primary reason that these constructs (and perhaps all higher-order cognitive abilities) are related is due to their reliance on top-down executive attention (see Burgoyne & Engle, 2020; also see Conway et al., 2021; Rueda, 2018, for similar views). It should therefore be the case that, in most situations, attention control is a better indicator of one’s overall cognitive capability than working memory capacity and/or fluid intelligence alone. In other words, knowledge of an individual’s ability to control their attention should explain more variation in higher-order cognitive behaviors and performance than either working memory capacity or fluid intelligence.

Indeed, several independent lines of research support the theoretical position that attention control underlies higher-order cognition and therefore working memory capacity’s broad predictive powers can be largely attributed to attentional factors rather than the maintenance of information specifically (e.g., Draheim et al., 2021; Fukuda et al., 2016; S. Gray et al., 2017; McCabe et al., 2010; McVay & Kane, 2012a; Rueda, 2018; Tsukahara et al., 2020). Some of these studies involve measuring various abilities and testing whether attention control can mediate (or account for) the relationships between other cognitive abilities at the latent level. Examples include Draheim et al. (2021), a large-scale correlational study of 396 students and community members in which the authors found that the strong relationship between working memory capacity and fluid intelligence was no longer statistically significant when accounting for the shared variance between the two constructs and attention control (Fig. 4). Similarly, Tsukahara et al. (2020) found in each of two independent datasets that the relationships between sensory discrimination ability and working memory capacity—and sensory discrimination ability and fluid intelligence—were both completely accounted for by attention control. McVay and Kane (2012a) found that the relationship between working memory capacity and reading comprehension was no longer statistically significant after accounting for the shared variance with mind wandering and other attention control measures. And, finally, Frith et al. (2021) reported results suggesting that individual differences in attention control are the reason for the relationship between fluid intelligence and creativity. Partially based on these findings, we argue that attention control should demonstrate greater predictive power for cognitive behavior and real-world outcomes than working memory capacity or even fluid intelligence.

Fig. 4

Attention control mediating the working memory capacity-fluid intelligence relationship. Structural equation model from Draheim et al. (2021) showing that the relationship between working memory capacity and fluid intelligence is not statistically significant when attention control is added as a mediator. The numbers on each path between the constructs can be likened to correlations between the latent (unobserved) abilities, with the path between working memory capacity and fluid intelligence representing the relationship between the two abilities that was left over after the contribution of attention control was partialled out. Each construct was measured with three tasks, which are shown here along with their respective factor loadings (likened to the correlation between each task and its corresponding construct). Note that this study included 10 total attention control tasks, and full mediation of the working memory capacity-fluid intelligence relationship was present with multiple combinations of accuracy-based attention control measures, but not in models involving reaction time-based attention control tasks. N = 396
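The logic of "partialling out" a third variable can be illustrated with a small simulation. The sketch below is not the structural equation model from Draheim et al. (2021): it uses observed (not latent) scores, entirely synthetic data, and an assumed loading of 0.7, and it relies on the standard first-order partial correlation formula. It simply shows that when two abilities are related only through a common cause, their correlation shrinks toward zero once that cause is controlled.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing what each shares with z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic data (an assumption for illustration): attention control (ac) is the
# only common cause of working memory capacity (wmc) and fluid intelligence (gf).
rng = random.Random(0)
n_people = 5000
loading = 0.7                          # assumed, not an estimate from any study
residual = math.sqrt(1 - loading ** 2)
ac = [rng.gauss(0, 1) for _ in range(n_people)]
wmc = [loading * a + residual * rng.gauss(0, 1) for a in ac]
gf = [loading * a + residual * rng.gauss(0, 1) for a in ac]

r_wmc_gf = pearson(wmc, gf)            # sizeable zero-order correlation
r_wmc_gf_given_ac = partial_corr(r_wmc_gf, pearson(wmc, ac), pearson(gf, ac))
```

In this toy setup the zero-order correlation is substantial while the partial correlation is close to zero, which is the pattern that a full mediation result describes. A latent-variable analysis additionally corrects for measurement error, which this sketch does not attempt.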

Attention control or long-term memory?

Our argument is that individual differences in working memory capacity are primarily attentional in nature and that attention control mostly accounts for working memory capacity’s strong relationship with an array of cognitive behaviors and phenomena. But there are competing theories, and it is feasible that working memory capacity has a unique relationship to some aspects of cognitive behavior which cannot be fully accounted for by attention control (e.g., more lower-level processing which depends more on short-term memory capacity than goal maintenance). Examples of phenomena which may be driven more so by individual differences in short-term and working memory capacity than attention control may include rate of encoding into long-term memory (Fukuda & Vogel, 2019), long-term associative learning (G. Jones & Macken, 2018), and creation of false memories (e.g., Peters et al., 2007). Some models and theories of working memory therefore place less emphasis on the role of attention (see Adams et al., 2018, for a review) and more on other processes, most notably long-term memory, which is often closely linked with working memory in many theoretical and descriptive models of working memory (e.g., Baddeley, 2000; Cantor & Engle, 1993; Cowan, 1988, 2008, 2017; Lewis-Peacock & Postle, 2008; Oberauer, 2002; Ruchkin et al., 2003). A well-known example is Cowan’s embedded process model, in which working memory is effectively information in long-term memory that is both activated and within the focus of attention (Cowan, 1988, 1999, 2017; also see Oberauer, 2002; Ruchkin et al., 2003). Baddeley (2000) also revised the Baddeley and Hitch (1974) multicomponent model of working memory by adding a fourth component—the episodic buffer—to incorporate the role of long-term memory in working memory (see Fig. 5). 
Baddeley stated that the episodic buffer “comprises a limited capacity system that provides temporary storage of information held in a multimodal code, which is capable of binding information from the subsidiary systems, and from long-term memory, into a unitary episodic representation” (p. 417). Baddeley’s motivation for adding the episodic buffer was to account for more complex aspects of cognition in working memory and emphasize integration of the subsystems rather than their isolation. For example, several studies showed that individuals with short-term memory deficits also displayed long-term memory deficits, suggesting a stronger interaction between long-term memory and short-term memory than assumed under the original model (also see Burgess & Hitch, 2005).

Fig. 5

Updated multicomponent model of working memory from Baddeley (2000). Baddeley updated the multicomponent model to include the episodic buffer, in large part to account for the contributions of long-term memory to working memory. It is noteworthy that, according to Baddeley, the episodic buffer is controlled by the central executive as are the original two storage systems (visuospatial sketchpad and phonological loop), and that the central executive is synonymous with attention control

Individual differences research supports the view that one particularly important aspect of working memory capacity is the ability to perform a controlled search of activated information contained in long-term memory (e.g., Mogle et al., 2008; Unsworth & Engle, 2005; Unsworth & Spillers, 2010). For example, Unsworth and Engle (2005) proposed a dual-component model which postulated that limitations in working memory (i.e., individual differences) arise due to individual differences in (1) the ability to actively maintain information in primary memory (involving both short-term memory capacity and attention control) and (2) the ability to search for and retrieve information from secondary memory. To elaborate on the second component, Unsworth and Engle assumed that capacity limitations result in lost and displaced items from the contents of primary memory, thus necessitating controlled search and retrieval from long-term (secondary) memory to recover them. Individual differences in working memory capacity therefore arise partially due to individual variation and limitations in the ability to perform this recovery, for example, differing ability to initially encode items in long-term memory or combat proactive interference. These sources of individual variation were argued to be at least partially independent from attention control, and in a test of this idea, Unsworth et al. (2014) found that a full mediation of the working memory capacity-fluid intelligence relationship was possible, but only when primary memory capacity, attention control, and secondary memory were all included in the model as mediators, and that these three factors each had an independent contribution to the working memory capacity-fluid intelligence relationship (see also Unsworth & Spillers, 2010). 
They therefore concluded that individual differences in attention control, memory capacity, and long-term memory retrieval were jointly responsible for producing individual differences in working memory capacity and thus the likely mechanisms behind the strong criterion validity of working memory capacity measures.

While many researchers have linked working memory and long-term memory and shown that long-term memory processes are an important aspect of individual differences in working memory capacity, it is not clear whether the ability to search and recover information from secondary memory into working memory is independent or separable from the ability to control attention more generally. For example, in discussing the episodic buffer being added to the multicomponent model, Baddeley (2000) stated that the buffer was assumed to be controlled by the central executive, which he noted was “an attentional control system” (p. 418), and that attentional mechanisms were largely responsible for the binding and integration of information in the episodic buffer specifically and working memory more generally. This is largely consistent with our lab’s view and supported by our recent finding that attention control fully accounted for the working memory capacity-fluid intelligence relationship (Draheim et al., 2021). Further, we would argue that the most parsimonious reason other studies (e.g., Unsworth et al., 2014; Unsworth & Spillers, 2010) have failed to find a full mediation with attention control alone lies in methodological considerations regarding how attention control is typically measured, which is the focus of the next section.

Assessing individual differences in attention control

Even though attention control has been identified as a central and important ability for human cognition (e.g., Broadbent, 1957; Engle, 2002; Norman & Shallice, 1986; Posner & Snyder, 2004), the role of working memory capacity has largely been studied and emphasized more so for explaining real-world behaviors, performance, phenomena, and outcomes. This is likely because the claim that attention control is the most predictive marker or driver of cognitive performance remains contentious, largely owing to how difficult it is for researchers to establish a strong and coherent factor of attention control and related mechanisms (Draheim et al., 2021; Friedman & Miyake, 2004, 2017; Hedge et al., 2018; Rey-Mermet et al., 2018; Rey-Mermet et al., 2019; Rouder et al., 2019). We have argued that this difficulty has not been because of theoretical or substantive reasons, but instead because most measures of attention control are psychometrically poor (see Draheim et al., 2019; Draheim et al., 2021). In this section, we discuss the challenges investigators face when trying to assess individual differences in attention control and describe some recent work aimed at addressing these challenges.

Challenges with measuring attention control

Perhaps the most widely used working memory capacity measures, complex span, were theoretically motivated tasks specifically developed for individual differences research (Daneman & Carpenter, 1980; Kane et al., 2004; Turner & Engle, 1989). They work well for this purpose because between-subjects variance is sufficiently large and reliability is high, producing strong individual differences and therefore adequate statistical power to detect correlations of interest (see Cronbach, 1957, for the differences between the experimental and individual differences approach). Subsequent research on individual differences in working memory capacity was successful thanks to the availability of psychometrically strong and validated tasks, and it was soon found that working memory capacity was strongly associated with intelligence and myriad other abilities. On the other hand, investigators studying individual differences in attention control were not afforded the same luxury. Studies of attention control often failed to show shared variance among tasks designed to measure inhibitory processing (e.g., Earles et al., 1997; Friedman & Miyake, 2004; Kramer et al., 1994). Because of researchers' repeated failures to establish a coherent factor using attention tasks, some have questioned whether the same inhibitory and attentional mechanisms are involved in performing different tasks (e.g., Hedge et al., in press; Rouder et al., 2019; Rouder & Haaf, 2019). Most notably, Rey-Mermet et al. (2018; Rey-Mermet et al., 2019) found little evidence of a common cross-task attentional ability and argued that it was time for researchers to simply stop thinking about “inhibition” as a unitary concept. In other words, they concluded that a general ability to control attention does not exist.

At present, the issues of measurement of attention and whether attention control is a psychometric construct are topics of debate among researchers. This is further complicated in that studies of attention control often operationalize it in terms of “inhibition,” and it is not entirely clear if what is labeled as inhibition is equivalent to what we and others call attention control (i.e., jingle-jangle fallacies; Kelley, 1927; also see Conway et al., 2021, for terminology confusion in this area). On the surface, it would appear so because (a) many of the same tasks are traditionally used to measure these abilities (such as Stroop, flanker, antisaccade; see Fig. 6), (b) inhibition appears to be a common way to conceptualize broad attention control (e.g., Rey-Mermet et al., 2018; von Bastian et al., 2020), and (c) some authors seem to use attention and inhibition interchangeably (e.g., Friedman & Miyake, 2004). On the other hand, it may be that what is commonly called inhibition is a specific and much narrower ability than attention control (cf. Draheim et al., 2021; Friedman & Miyake, 2017). To facilitate discussion, throughout this article we assume that what is called inhibition is similar enough to our view of attention control—namely, because the same tasks are being used to measure an underlying ability, and inhibition is often considered one of the most important functions or facets of the more general attention control. But it should be noted that the term inhibition does appear to be overly used in the literature, and likely reflects a wide variety of measurement tasks and mechanisms (see Bjork, 1989). We also argue strongly that attention control is much broader than inhibition. For example, one aspect of attention control that may not be encompassed by a narrower conceptualization of inhibition is the ability to maintain current task goals and avoid attentional lapses. 
Such lapses can be internally driven via intrusive and task-unrelated thoughts which disrupt task performance (known as mind wandering). Research suggests that individual differences in mind wandering are distinct from, but correlated with, susceptibility to external distraction (e.g., Unsworth & McMillan, 2014) and correlated with both working memory capacity and fluid intelligence (Kane & McVay, 2012; McVay & Kane, 2012a; Unsworth & McMillan, 2014). Mind wandering researchers often emphasize that attentional lapses in the form of intrusive and task-unrelated thoughts are a sustained attention failure which results in extremely slow reaction times and therefore poor performance (known as the worst performance rule, see Löffler et al., 2021; McVay & Kane, 2012b; Welhaf et al., 2020). Another aspect of attentional lapses may be more on the micro level—for example, when performing an antisaccade task (see Fig. 6a) the respondent has a very brief window to execute the appropriate saccade away from the distractor and toward the target on the other side of the screen. Any attentional lapse, even for a fraction of a second, can have a deleterious effect on performance if said lapse occurs at a critical part of the trial (i.e., just before or as the distractor appears). We hypothesize that part of the reason for our finding from Draheim et al. (2021) that attention control fully mediated the working memory capacity-fluid intelligence relationship was that our attention control battery involved measures which heavily tapped the ability to avoid not only macro-level attentional lapses but also micro-level ones, thereby allowing higher-ability participants to apply intensive attentional resources to the trials as needed.

Fig. 6

Attention control tasks. a In this version of the antisaccade task (Hutchison, 2007), the respondent is asked to start each trial by looking at the center of the screen. A distractor appears on one side of the screen and then a target letter (Q or O) appears for only 100 ms on the other side. The respondent is asked to identify the letter. Accuracy rate is the dependent variable for this version, although some versions are scored using reaction time, difference scores (in reaction time or accuracy, using prosaccade trials as baseline), or eye-tracking with no behavioral response. For added effect, the distractor may blink several times while on screen. b In the color Stroop task, a color word is presented, and the respondent is asked to indicate the color of the ink in which the word is printed. On congruent trials, the word and ink color match. On incongruent trials, they do not. The dependent variable is usually the difference in reaction time between incongruent and congruent (baseline) trials, although accuracy differences are sometimes used. c In the arrow flanker task, five arrows are presented, and the respondent is asked to indicate which direction the central arrow is pointing. On congruent trials, the central arrow and flanking arrows point in the same direction, whereas on incongruent trials the flanking arrows point in the opposite direction. As with Stroop tasks, the dependent variable is usually the difference in reaction time between incongruent and congruent trials, although accuracy differences may be used. Not to scale

The reliability paradox

Despite the disagreements and ambiguities over the nature of attention control or inhibition, there is a growing consensus among researchers that the measures commonly used to assess these constructs (i.e., Stroop and flanker) typically reflect very little shared attention-related variance. Specifically, the reliability of attention control tasks is often inadequate, intercorrelations are weak, a unitary factor is difficult to establish in latent analyses, and performance does not correlate as strongly as expected with other cognitive indicators (see Hedge et al., 2018; Hedge et al., in press; Paap & Sawi, 2016; Rey-Mermet et al., 2018; Rey-Mermet et al., 2019; Rouder et al., 2019; Rouder & Haaf, 2019; von Bastian et al., 2020).

At the core of this problem is that individual differences in attention have historically been assessed using paradigms born out of the experimental approach, the epitome of which is the Stroop task (Stroop, 1935; Fig. 6b). In the basic version of this task, the test-taker is asked to resist the automaticity of reading a color word to instead name the color of ink in which the word is printed. For example, they see “RED” in blue ink and are asked to respond by pressing a key corresponding to the response “BLUE.” One of the most robust and almost universal findings from this paradigm is known as the Stroop effect, which is that responses are slower and more error-prone when the color word is incongruent with the color of ink it is printed in (e.g., “RED” in blue ink), as opposed to when the two are congruent (e.g., “RED” in red ink). The Stroop task and similar measures such as flanker (Fig. 6c; Eriksen & Eriksen, 1974) and Simon (Simon & Wolf, 1963), collectively known as conflict or interference tasks, have a rich history within the experimental literature and are successful in experimental research (see Footnote 1) for the same reason that makes them poorly suited to individual differences research: the minimization of between-subjects variance (see Cronbach, 1957; Draheim et al., 2019; Hedge et al., 2018). The typical finding that tasks which exhibit strong and robust experimental effects fail to produce strong and reliable individual differences was dubbed “the reliability paradox” by Hedge et al. (2018), and there are several reasons for this phenomenon. We have argued that the primary reasons why popular experimental tasks are typically poorly suited to correlational research are the use of difference scores and reaction times to assess task performance in many widely used measures of attention, as well as the failure of researchers to account for speed–accuracy interactions (see Draheim et al., 2019, for an extensive review and analysis of this problem).

Difference scores

The logic of using difference scores follows the Donders (1969) subtraction methodology, in which the difference is taken between two related but different variables to separate out cognitive processes (or, sometimes, to assess change over time), with one variable serving as the baseline or control. This method has been criticized for not isolating processes of interest as well as believed (e.g., Sternberg, 1969; Verhaeghen & De Meersman, 1998). Further, while difference scores can maximize statistical power when comparing performance at the group level, making them particularly useful in experimental designs (Overall & Woodward, 1975; but see Mashburn et al., 2020), they are clearly suboptimal for individual differences pursuits (see Cronbach & Furby, 1970; Draheim et al., 2016; Draheim et al., 2019; J. R. Edwards, 2001; Hedge et al., 2018; Lord, 1956; Paap & Sawi, 2016). This is because subtraction effectively removes the (generally strong) correlation between the two component scores, which, by nature, is reliable variance, thereby increasing the overall proportion of unreliable variance in the resulting difference score. The extent to which reliability is lost depends primarily on (1) the reliability of the component scores, and (2) how strongly the component scores are correlated. Critically, the stronger the correlation of the component scores, the lower the reliability of the resulting difference. With attention tasks in which the different trial types (conditions) are highly similar, correlations are generally quite strong and effect sizes are rather small (generally around 50 ms; see Rouder et al., 2019), and therefore using difference scores is a clear problem. For example, von Bastian et al. (2020) reviewed 76 individual differences studies and found an average reported reliability of just .63 for attention tasks when the dependent variable was in the form of a difference or contrast.
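To make the two dependencies concrete, the classical test theory expression for the reliability of a difference score can be sketched in a few lines. This is a minimal illustration rather than an analysis from any of the studies cited: it assumes the two component scores have equal variances, and the function name and the input values (component reliabilities of .90, component correlations of .50 and .80) are hypothetical.

```python
def difference_score_reliability(rel_x: float, rel_y: float, r_xy: float) -> float:
    """Reliability of the difference D = X - Y under classical test theory.

    Assumes X and Y have equal variances (a simplifying assumption made
    for this illustration only).
    """
    if not -1.0 < r_xy < 1.0:
        raise ValueError("r_xy must lie strictly between -1 and 1")
    # Average component reliability minus the shared (correlated) variance,
    # rescaled by the variance that remains after subtraction.
    return ((rel_x + rel_y) / 2.0 - r_xy) / (1.0 - r_xy)


# Hypothetical inputs: both trial types are measured with reliability .90.
moderate = difference_score_reliability(0.90, 0.90, r_xy=0.50)
strong = difference_score_reliability(0.90, 0.90, r_xy=0.80)
print(f"component correlation .50 -> difference reliability {moderate:.2f}")
print(f"component correlation .80 -> difference reliability {strong:.2f}")
```

With these illustrative inputs, the reliability of the difference falls from .80 to .50 as the component correlation rises from .50 to .80, even though each component is itself measured with a reliability of .90.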

Reaction time and the relationship between speed and accuracy

Another factor that we argue contributes to problems with attention measures is the use of reaction time, specifically regarding individual differences in speed–accuracy emphasis (see Draheim et al., 2018; Heitz, 2014; Luce, 1986; Wickelgren, 1977). Individuals naturally differ on the extent to which they prioritize speed and accuracy (e.g., Forstmann et al., 2011; Starns & Ratcliff, 2010), and diffusion modeling studies from Hedge and colleagues have shown that individual differences in speed–accuracy emphasis are correlated across attention tasks even in young adults (Hedge et al., 2019; Hedge et al., in press). At issue is that most attention measures are scored not only using difference scores, but difference scores in reaction time, as is the case with conflict tasks in which the dependent variable is usually the difference in reaction time between incongruent and congruent trial performance. This is problematic in that the underlying processes reflected by reaction times have been shown to be multiply determined and not process pure (e.g., Hedge et al., 2019; Miller & Ulrich, 2013; Verhaeghen & De Meersman, 1998). Further, accuracy rates are often completely ignored and unaccounted for in these measures, meaning that respondents with the same general ability to control their attention will score differently on, say, a Stroop or flanker task if they have differences in baseline speed–accuracy emphasis and/or if they adjust their speed–accuracy emphasis during the task. Scores will also be affected by the extent to which any speed–accuracy relationships and interactions are systematically related to cognitive ability, for example if higher ability individuals slow down on a task after making an error to minimize the chances of committing subsequent errors (e.g., Draheim et al., 2016). To that end, Hedge et al. 
(in press) argued that performance in the most popular class of attention tasks (conflict tasks such as Stroop, flanker, and Simon) is not a reflection of attention-specific mechanisms but is instead contaminated with variance attributable to processing speed and response cautiousness. As such, research using measures of attention control to answer questions about individual differences has stagnated relative to research on working memory capacity due to a lack of reliable and valid measures of attention control.

To summarize, we believe that applied research has emphasized the role and importance of working memory capacity more so than attention control for two primary reasons. The first is simply inertia; early research on individual differences in working memory capacity was successful in establishing it as a broad and domain-general construct and other researchers were quick to expand this work when it was shown that working memory capacity correlated substantially with intelligence. Second, the availability of several psychometrically sound and accuracy-based measures of working memory capacity facilitated this surge of research. In contrast, assessing individual differences in attention control has not been so straightforward. Researchers interested in individual differences in attention control adopted established paradigms from the experimental literature, which proved to be poorly suited for correlational research for a variety of reasons. In our estimation, the lack of psychometrically strong attention control measures has undoubtedly stunted theoretical advancements in this area and likely set research back decades, just as Friedman and Miyake (2004) predicted could happen if improved measures were not developed. However, the problems are being addressed, and the possibility exists that assessing individual differences in attention will soon be as streamlined as assessing differences in other notable abilities such as working memory capacity and fluid intelligence.

Overcoming the challenges

Despite the controversy over measurement of attention control, we argue that there are reasons to be optimistic. One reason is that a pair of psychometrically sound attention control tasks has existed for some time. The first, the antisaccade task, is an instructionally simple yet highly difficult task that requires the respondent to look away from a flashing distractor on one side of the screen to instead catch a target on the opposite side before it is masked (see Hutchison, 2007; Fig. 6a). If the respondent looks in the direction of the distracting stimulus for even a moment, they will be unable to identify the target. This design is effective because animals are evolutionarily wired to look at something flashing in their environment, as this suggests movement which could indicate the presence of either danger or food (e.g., Howard & Holcombe, 2010). Therefore, the respondent must override or otherwise inhibit this strong evolutionarily engrained behavior to look toward the distractor, a quintessential example of a situation in which the control of attention is required. Another paradigm, visual arrays, is a change detection task in which stimuli are very briefly flashed on the screen and then reappear after a short delay (Fig. 7). In a typical visual arrays task, one of the stimuli changes in some manner from the first display to the second on half the trials, and the respondent’s job is to judge whether something has changed. Performance on visual arrays is usually transformed into a capacity (k) score, which is an estimate of how many items the individual can hold in primary memory (see Cowan et al., 2005).

Fig. 7

Visual arrays task. In this version of the task (see Shipstead et al., 2014), an array of rectangles briefly appears, disappears, and then reappears with one rectangle probed with a white dot. The respondent is asked to indicate whether this probed rectangle changed orientation from the initial display. Accuracy performance is typically converted into a capacity (k) score to estimate how many items the respondent can hold in primary memory. The trial shown is Set Size 3, and so 100% accuracy on a series of such trials would produce a k score of 3, whereas 50% (chance) performance would produce a k score of 0. a No distractors present (nonselective visual arrays). b Respondent is cued to attend to only a subset of the to-be-presented stimuli, and distractors are presented with the targets (selective visual arrays). Not to scale
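The two k values given in the caption can be reproduced with a short calculation. The sketch below assumes the single-probe formula commonly used for this kind of change detection task, k = N × (hit rate + correct rejection rate − 1); the text cites Cowan et al. (2005) for k scores but does not spell out the computation, so the formula choice and the function name are assumptions made for illustration.

```python
def capacity_k(set_size: int, hit_rate: float, correct_rejection_rate: float) -> float:
    """Capacity estimate for a single-probe change detection task.

    hit_rate: proportion of change trials correctly reported as "changed".
    correct_rejection_rate: proportion of no-change trials correctly
        reported as "unchanged".
    Assumed formula: k = N * (H + CR - 1).
    """
    return set_size * (hit_rate + correct_rejection_rate - 1.0)


# Set Size 3, as in the trial shown in Fig. 7.
perfect = capacity_k(3, hit_rate=1.0, correct_rejection_rate=1.0)
chance = capacity_k(3, hit_rate=0.5, correct_rejection_rate=0.5)
print(perfect, chance)  # 3.0 0.0
```

Because half of the trials involve a change, overall accuracy equals the average of the hit rate and the correct rejection rate, so any pattern of responding that yields 50% accuracy gives k = 0 under this formula.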

Psychometrically speaking, the appeal of the antisaccade and visual arrays tasks is that they are entirely accuracy-based measures that do not involve difference scores for measuring performance. Because speeded responding is not required, the aforementioned issues with reaction time are avoided (see Draheim et al., 2019; Draheim et al., 2021). Chiefly, individual differences in speed–accuracy emphasis and processing speed should be minimally impactful on the overall accuracy score. These desirable characteristics are shared with many successful measures of working memory capacity and fluid intelligence, and thus the reliability and criterion validity of antisaccade and visual arrays approach those of measures such as complex span and matrix reasoning tasks (see Draheim et al., 2021). Unfortunately, many versions of antisaccade are employed: some variants involve reaction time and/or difference scores, and characteristics of the task (namely, visual angle and presentation timings) need to be properly tuned to produce sufficient individual variation. As such, antisaccade tasks only sometimes display high reliability, strong inter-correlations and factor loadings, and strong correlations to other cognitive measures (see Appendix B of Rey-Mermet et al., 2018; also see Hutton & Ettinger, 2006, for a review of antisaccade). As for visual arrays, this paradigm is often thought to be a measure of visual working memory capacity, which is evident in that performance is usually transformed into capacity (k) scores (Cowan et al., 2005). 
This classification is sensible given the task involves holding target stimuli in primary memory, but a growing body of research supports our contention that individual differences in performance of some types of visual arrays tasks are due more to attentional factors than memory (e.g., Balaban et al., 2019; Cusack et al., 2009; Draheim et al., 2021; Fukuda et al., 2016; Fukuda & Vogel, 2011; Martin et al., 2021; Shipstead & Engle, 2013; Souza & Oberauer, 2015; Vogel et al., 2005; Wheeler & Treisman, 2002). Critically, visual arrays tasks can be broken down into two categories, selective and nonselective (see Fig. 7). Nonselective visual arrays do not involve distractors and therefore place no demand on filtering irrelevant stimuli. On the other hand, selective versions of visual arrays include distractors, and generally a prompt before each trial indicates which subset of the to-be-presented stimuli should be selected and which subset should be ignored. For example, the respondent might see “BLUE” just before the presentation of the first array consisting of both red and blue stimuli, which means they should only attend to the blue items and ignore the red ones. While the case can be made that individual differences in nonselective visual arrays performance are jointly attributable to working memory and attentional mechanisms, what is clearer is that individual differences in selective visual arrays are primarily attentional in nature (cf. Martin et al., 2021). 
Several studies have shown that the additional filtering demand produces large individual differences attributable to selective attention, as performance is greatly reduced for individuals who do not attend to the selection cue (also possibly due to mind wandering or lack of ability to sustain attention) and/or are unable to properly select the target stimuli and ignore/filter the irrelevant stimuli (see Draheim et al., 2021; Draheim & Engle, 2021; Fukuda et al., 2016; Martin et al., 2021; Vogel et al., 2005). Additionally, Fukuda and Vogel (2009, 2011) found that individual differences in selective visual arrays performance were in part due to individual differences in the ability to recover from attentional capture. Fukuda and Vogel (2011) noted the similarities of resisting attentional capture in visual arrays with the demand to override prepotent eye movements toward distractors in the antisaccade, and we would argue both tasks place a strong demand on avoiding micro-level attentional lapses to intensely focus attention at a critical point in each trial. Supporting this, we recently found in Draheim et al. (2021) that antisaccade and selective visual arrays performance correlated strongly (r = .45), loaded onto an attention factor in the .60–.70 range, and had statistically equivalent correlations to working memory capacity and fluid intelligence composite scores. These were very strong effect sizes for two very different tasks, and much larger than the typical correlations of around r = .10–.20 and factor loadings below .40 for Stroop and flanker tasks (e.g., Friedman & Miyake, 2004; Rey-Mermet et al., 2018; Rouder et al., 2019; Rouder & Haaf, 2019).

Another reason for optimism is the recent and ongoing work to create and validate new measures of attention control. In Draheim et al. (2021) we argued that developing new and modified tasks was a straightforward way to tackle the challenges in assessing attention control (cf. Friedman & Miyake, 2004). We reasoned that avoiding difference scores and either controlling for accuracy (using adaptive procedures) or pushing all performance variance into accuracy (by making reaction time irrelevant) would increase the chances of the task having psychometric properties on par with measures of working memory capacity and fluid intelligence. We administered a combination of ten existing, new, and modified attention tasks. Included among these were modified versions of the Stroop and flanker tasks that involved adaptive response deadlines or presentation times and were designed to assess how quickly one could respond or how brief the presentation of the target stimulus could be at the same level of accuracy for each participant. We also included an accuracy analog of the psychomotor vigilance task, a reaction time-based sustained attention task that asks the respondent to press a key as soon as a timer on the screen begins counting up from zero. While we found that the two strongest attention measures (according to the criteria we outlined in the paper; see Footnote 2) were the pure accuracy versions of antisaccade and selective visual arrays tasks, the accuracy analog of the psychomotor vigilance task was just behind them, and the three modified Stroop and flanker tasks were also clear improvements over their reaction time and difference score counterparts. To quantify the relative improvements, performance in the antisaccade and visual arrays tasks each had about five times as much reliable and predictive variance as the traditional Stroop and flanker tasks, and the three modified Stroop and flanker tasks had around three times as much. 
By using these more reliable measures of attention control, we found that attention control fully mediated the relationship between working memory capacity and fluid intelligence at the latent level (Fig. 4). Importantly, we only found this full mediation when the attention control factor was composed of tasks with accuracy-based dependent variables that did not involve difference scores (such as antisaccade, selective visual arrays, adaptive Stroop and flanker tasks, and a sustained attention measure). If traditional reaction time measures (psychomotor vigilance, Stroop, and flanker) were included, full mediation did not occur, just as it did not occur for Unsworth and Spillers (2010) and Unsworth et al. (2014), for whom four of their seven attention control tasks involved difference scores. We therefore argue that the discrepancy in the results between our study and the studies of Unsworth and colleagues was due to how attention control was assessed. One caveat is that the full mediation of the working memory capacity-fluid intelligence relationship by attention control found in Draheim et al. (2021) was novel, and so it has yet to be established that this finding can be replicated, ideally across labs and with different populations and diverse tasks for the relevant constructs.
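The idea behind an adaptive response deadline, mentioned above for the modified Stroop and flanker tasks, can be illustrated with a simple staircase. The sketch below is a generic weighted up-down rule and is not necessarily the procedure used in Draheim et al. (2021), whose details are not given here; the function names, step sizes, and millisecond bounds are all hypothetical.

```python
def update_deadline(deadline_ms: float, correct: bool,
                    step_down_ms: float = 30.0, step_up_ms: float = 90.0,
                    floor_ms: float = 150.0, ceiling_ms: float = 2000.0) -> float:
    """Return the response deadline for the next trial.

    A correct, in-time response shortens the deadline (harder);
    an error or a missed deadline lengthens it (easier).
    """
    change = -step_down_ms if correct else step_up_ms
    return min(ceiling_ms, max(floor_ms, deadline_ms + change))


def target_accuracy(step_down_ms: float, step_up_ms: float) -> float:
    """Accuracy at which the expected deadline change is zero.

    Solves p * step_down = (1 - p) * step_up for p.
    """
    return step_up_ms / (step_up_ms + step_down_ms)


# Hypothetical run: three correct responses followed by one error.
deadline = 1000.0
for was_correct in [True, True, True, False]:
    deadline = update_deadline(deadline, was_correct)

print(deadline)                     # 1000.0
print(target_accuracy(30.0, 90.0))  # 0.75
```

Under a rule of this kind every participant is pushed toward the same accuracy level (75% with these hypothetical step sizes), so the deadline at which the staircase settles, rather than a reaction time difference, can serve as the dependent variable.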

The central claim of this article is that these recent developments in the understanding (e.g., Shipstead et al., 2016; Tsukahara et al., 2020) and measurement (e.g., Draheim et al., 2021; Martin et al., 2021) of attention control provide a solid foundation from which to argue that attention control is more important than working memory capacity for explaining higher-order cognitive performance, both in and outside the laboratory (see also Burgoyne & Engle, 2020; Mashburn et al., 2020). In the following sections, we outline several areas of research in which working memory has been identified as an important predictor of real-world outcomes. For each, we provide explicit reasoning for why we think attention control better explains these phenomena.

Review of working memory and attention in the real world

In the following review of specific areas of applied research, we encourage the reader to keep in mind that the quality and nature of measurement may vary substantially across studies. Researchers often employ different tasks to measure the same construct, use the same tasks but give a different label to the underlying ability, use only one task but frame their findings as if they had measured a construct, administer too few trials to too few participants (see Rouder & Haaf, 2019), do not have a representative sample, and so on. In the following review we will generally grant that constructs have been assessed with some validity so as not to distract from the overall argument, but on occasion it will be necessary for us to mention methodological considerations to properly evaluate a study and the authors’ conclusions.

A recurring theme throughout this literature review is that working memory and attention are intertwined (Engle, 2002) and therefore difficult to disentangle. It is commonplace for researchers to hypothesize that behaviors are driven by attentional mechanisms but use working memory tasks (e.g., operation span) to index “executive attention” or as a proxy for attentional mechanisms. This practice is understandable given that many theories of working memory involve attention as a central component (see Baddeley, 1992; Engle, 2002). But because attention tasks with minimal storage demands (e.g., measures of attention control) are often less reliable and predictive than, say, complex span tasks, researchers who used working memory tasks as a proxy for attention were more likely to find significant results than researchers who instead used the traditional, and flawed, measures of attention control. This is relevant throughout this review, and we encourage the reader to keep in mind that individual differences studies of attention control and inhibition have historically relied on psychometrically poor measures, and so correlations involving these tasks are expected to be highly attenuated and thus conclusions drawn by the investigators using these measures may minimize the role of attention control in favor of other abilities (often working memory capacity and/or intelligence; see Footnote 3).
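The attenuation described above follows from the classical relationship between observed and true correlations, in which the observed correlation equals the true correlation multiplied by the square root of the product of the two measures' reliabilities. The sketch below illustrates that relationship; the function names and the numerical values (a true correlation of .60 and reliabilities of .85 and .40) are hypothetical and are not taken from any study reviewed here.

```python
import math


def attenuated_correlation(true_r: float, rel_x: float, rel_y: float) -> float:
    """Observed correlation expected when both measures are unreliable."""
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_correlation(observed_r: float, rel_x: float, rel_y: float) -> float:
    """Estimate of the true correlation after correcting for unreliability."""
    return observed_r / math.sqrt(rel_x * rel_y)


# Hypothetical case: the same underlying relationship (r = .60) with an
# outcome measured at reliability .85, assessed once with a reliable
# predictor (.85) and once with an unreliable one (.40).
reliable_task = attenuated_correlation(0.60, rel_x=0.85, rel_y=0.85)
unreliable_task = attenuated_correlation(0.60, rel_x=0.40, rel_y=0.85)
print(f"reliable predictor:   r = {reliable_task:.2f}")
print(f"unreliable predictor: r = {unreliable_task:.2f}")
```

In this illustration the observed correlation drops from about .51 to about .35 purely because the predictor is measured less reliably, which is the pattern to keep in mind when comparing studies that used working memory tasks with studies that used traditional attention tasks.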

Education, learning, and child development

Working memory is a powerful explanatory tool for children’s learning, classroom performance, and overall academic achievement, as illustrated by the opening lines of Chapter 5 in Dehn (2008):

“Working memory capacity is more highly related to . . . learning, both short-term and long-term, than is any other cognitive factor” –P. Kyllonen

Educational and psychological research on working memory (e.g., Gathercole et al., 2006; Swanson et al., 1990) over the past 20 years has repeatedly affirmed the hypothesis that working memory processes underlie individual differences in learning ability. Working memory is required whenever anything must be learned. (p. 92)

Indeed, there is a solid basis for these claims and some research has shown that working memory capacity is an even better predictor of early academic achievement than psychometric intelligence (e.g., Alloway & Alloway, 2010; Cockcroft, 2015). The formation of new concepts and accumulation of information involves the manipulation of information and eventual storage into long-term memory, which requires information passing through working memory. Working memory has thus historically been viewed as a portal to long-term memory, particularly in early stages of development when learning is most important due to low levels of knowledge and automatized skills (Cowan, 2014; Forsberg et al., 2021). An individual with low working memory capacity will struggle to combine capacity, speed, knowledge, and strategies necessary for problem solving, inference making, and learning of complex skills and concepts (Alloway, 2006; Cowan, 2014; Halford et al., 1998; Reid, 2009). It is therefore not surprising that working memory capacity predicts an array of behaviors important for learning and classroom performance, such as reading comprehension (Daneman & Carpenter, 1980), reasoning ability (Kyllonen & Christal, 1990), direction following (Engle et al., 1991), long-term memory retrieval (Brewer & Unsworth, 2012), and language acquisition (Gathercole & Baddeley, 1989). To quantify these relationships, according to the Woodcock-Johnson III Technical Manual (McGrew & Woodcock, 2001), working memory capacity has an aggregate correlation over r = .50 with eleven specified achievement clusters for children and adolescents in the domains of reading, writing, comprehension, reasoning, and mathematics.

Working memory is therefore often implicated as a cause of learning disabilities, and students with low working memory capacity generally perform poorly in classroom settings (e.g., Cowan, 2014; Gathercole et al., 2006; Sabol & Pianta, 2012). Reduced working memory capacity is viewed as a causal source of the co-occurrence of inattentive behavior and working memory problems (Diamond, 2005; Gathercole et al., 2008) and may mediate the (negative) relationship between trait anxiety and academic performance (Owens et al., 2008). Some researchers argue that working memory limitations are the primary cause of learning disabilities (see Dehn, 2008), whereas another argument is that both working memory capacity and fluid intelligence contribute to learning disabilities because they work together to support problem solving in facilitation of learning and educational achievement (Cockcroft, 2015; Cowan, 2014). If we interpret this argument with the lens of the maintenance/disengagement framework (Shipstead et al., 2016), then it would be expected that attention control is the primary driving force behind learning and, by extension, scholastic achievement.

It is often hinted or implied that scholastic achievement and learning difficulties have an attentional origin. Working memory is sometimes fractionated into different components, such as visuospatial working memory, verbal working memory, and executive working memory (Dehn, 2008; see Footnote 4). Executive working memory refers to the central executive system of the Baddeley and Hitch (1974) model and is synonymous with executive attention/attention control. It has been argued that executive working memory is by far the best predictor of learning ability. For example, Dehn (2008) stated, “Research has consistently found students with specific learning difficulties to be most deficient in the executive processing components of working memory (Swanson et al., 1990)” (p. 96) and “Executive-loaded working memory tasks provide the best discrimination between children with and without learning disabilities (Henry, 2001)” (p. 96). Supporting these assertions, Gathercole and Pickering (2000) tested 6- and 7-year-old children and reported that scores on their central executive subtests predicted performance on arithmetic, vocabulary, and literacy a year later above and beyond scores on the phonological or visuospatial subtests.

There is also extensive and diverse research more explicitly outlining the role of attention in learning and academic achievement. For example, it has been shown that higher levels of anxiety result in worse academic performance, and working memory capacity is thought to mediate this relationship (Owens et al., 2008). But another hypothesis is that anxiety produces specific attentional deficits, such as the propensity for a student high in anxiety to divide attention, devoting attentional resources to task-irrelevant thoughts and behaviors (see Beilock, 2007). A systematic review by Polderman et al. (2010) found that attentional factors were strong correlates of academic achievement even after controlling for intelligence, socioeconomic status, and comorbid disorders. Similarly, Steinmayr et al. (2010) found that scores on a sustained attention test moderated the relationship between intelligence and grades in high school students, and that error rates on this sustained attention test predicted overall school performance above and beyond intelligence. In a longitudinal study, Rhoades et al. (2011) reported that kindergarteners’ attention level was a strong mediator of the relationship between their preschool emotional knowledge and their first-grade academic achievement, even after accounting for socioeconomic status and verbal skills. Bull and Espy (2006) reported that inhibitory control substantially predicted mathematical performance in preschoolers even when age, verbal intelligence, and maternal education were factored out, and inhibition also accounted for 12% of variability in mathematical skills in preschoolers above and beyond working memory capacity. Finally, St Clair-Thompson and Gathercole (2006) administered a battery of executive functioning tasks to middle school-aged children and found that working memory capacity and inhibition each uniquely predicted various measures of scholastic achievement and attainment in mathematics, science, and English.

Other scholars have argued that reading comprehension is heavily dependent on attention, specifically the ability to discard previously relevant but now irrelevant information (e.g., Carretti et al., 2009; De Beni & Palladino, 2000; Savage et al., 2006). Similarly, difficulties in mathematics and problem solving may arise less from working memory factors than from limitations in the ability to filter or otherwise block irrelevant information (e.g., Passolunghi et al., 1999; Passolunghi & Siegel, 2001). In a review on the role of working memory in learning disabilities, Swanson and Siegel (2011) stated that individuals with learning disabilities often have severe attentional deficits in the form of general inhibitory functioning, sustained attention, selective attention, divided attention, and switching attention. Their argument was that, relative to age-matched controls, individuals with learning disabilities struggle to allocate attentional resources on high demand tasks, struggle to maintain information in the face of attention-capturing events or general distraction, are more likely to report irrelevant nontarget words in a recall task, and are worse at selectively attending to the relevant features of primary and secondary tasks when put into a divided-attention situation. Finally, Fenesi et al. (2015) argued that researchers interested in education and learning often too strongly emphasize the short-term storage aspects of the Baddeley and Hitch (1974) multicomponent model. They instead suggested an increased focus on the role of attention control (based on the executive attention view of individual differences in working memory; Engle, 2002) and long-term memory (based on the embedded process model; Cowan, 1988, 1999) in education research; the former is precisely what we also argue here. One example provided by Fenesi et al. is that individuals with attention problems (such as attention-deficit/hyperactivity disorder; ADHD) have difficulties with reading comprehension, which can be misinterpreted as a working memory issue and therefore misdiagnosed as dyslexia, resulting in parents and teachers attempting to correct a suspected reading problem when the underlying issue is instead attentional in nature. Another example they offered was in the domain of mathematical performance, in which individuals with worse attention control are more drawn to superficially relevant, garden-pathing, and distracting aspects of the problem, particularly in word problems, hence affecting their ability to stay on track and solve the problem at hand. Highlighting relevant parts of a mathematical problem was shown to help students focus their attention and improve mathematical performance in those with attentional problems (Kercood & Grskovic, 2009).

Still, work in the areas of academic learning and achievement generally examines the role of working memory and storage-based deficits for learning difficulties and poor academic performance, with generally less emphasis on the more fundamental deficits in attentional abilities (unless the deficits are severe enough to be considered pathological, such as with ADHD). Potential remediations are therefore designed to address and target working memory more so than attentional deficits. These include working memory training, teaching of mnemonic and other memory strategies, reducing the working memory demands of classroom activities and assignments, and providing regular positive feedback (see Alloway, 2006; Cockcroft, 2015; Cowan, 2014; St Clair-Thompson et al., 2010). While targeted approaches such as memory-related strategy training and reducing the working memory load of material may be helpful, attention-based remediations would be expected to have more generalizable benefits. Attention deficits, such as those in individuals with ADHD, are more generalized because they manifest as a global problem with maintaining goal-directed thoughts, information, and behaviors as well as blocking inappropriate and/or now-irrelevant ones, whereas working memory deficits may involve more specific issues with maintaining, processing, and/or storing information. Because we argue that attention control is the basis for both working memory capacity and fluid intelligence, researchers, educators, and parents may find more success if they focus on the underlying attentional deficits in students. Attentional-specific interventions are more likely to produce broad and farther-reaching benefits for the child.

Cognitive training

The repeated demonstration that working memory capacity is correlated with a host of other cognitive behaviors has resulted in widespread testing of the hypothesis that training or otherwise boosting one’s working memory capacity ought to result in long-term improvements to general cognitive and intellectual functioning. The idea of cognitive training is certainly not new and was argued to be ineffective over a century ago (e.g., James, 1890; Woodworth & Thorndike, 1901), but researchers now had a new realm in which to test it, operating under the assumption that working memory capacity is a causal source of individual differences in other domains such as reading comprehension, mathematical skills, language ability, and fluid intelligence (cf. Melby-Lervåg et al., 2016). The holy grail of working memory training is to establish far-transfer to measures of intelligence, that is, training-induced improvement on untrained and novel tasks that tap a different ability. Some studies purported to find just that (e.g., Jaeggi et al., 2008; Klingberg et al., 2002), resulting in a flurry of scientific interest and funding devoted to working memory training as well as commercial brain training products that purport to improve cognitive functioning by way of working memory training.

Unfortunately, systematic reviews and meta-analyses have consistently shown a lack of evidence for far-transfer to intelligence after training neurotypical individuals (Dougherty et al., 2016; Melby-Lervåg et al., 2016; Melby-Lervåg & Hulme, 2013; Schwaighofer et al., 2015; Shipstead et al., 2010; Shipstead et al., 2012a, b; Soveri et al., 2017). It is often observed that the relatively few studies that report far-transfer to intelligence have severe methodological limitations (see Dougherty et al., 2016; T. L. Harrison et al., 2015; Melby-Lervåg et al., 2016; Rodas & Greene, 2021; Shipstead et al., 2010; Shipstead et al., 2012a, b; Simons et al., 2016), and that any training-induced improvements are typically short lived and limited to the tasks that were directly trained or to highly similar tasks (e.g., Melby-Lervåg & Hulme, 2013; Soveri et al., 2017). In other words, the most robust finding is that people get better on the tasks they practice but not much else. As a result, researchers are highly skeptical that working memory training or brain training products can improve general cognitive functioning (see Simons et al., 2016). For example, Stojanoski et al. (2021) surveyed more than 8,000 individuals regarding their use of brain training programs and then administered a battery of cognitive tasks to each. They found no relationship between any of those cognitive measures and participant engagement (use and duration) in brain training, even for the most committed brain trainers and those who fully expected it to work. Despite the overwhelming evidence against working memory training, it still receives a good deal of interest from researchers, and companies offering “brain training” programs continue to be highly successful.

Why working memory training does not work

A central challenge for the working memory training hypothesis is that, for training to be effective, it must first be shown that working memory capacity can indeed be improved through training, sometimes referred to as moderate or intermediate transfer (Harrison et al., 2015; von Bastian et al., 2013). That is, effortful and intensive practice and/or training on a subset of working memory tasks should lead to robust and lasting improvements in performance on another subset of untrained working memory tasks, and improvements must not be due to the application of highly specific strategies common across the two subsets of tasks (cf. Shipstead et al., 2012a, b). There is little reason to believe that the broad ability of working memory capacity can be improved after training, and thus evidence for moderate transfer is sporadic and inconsistent among the relatively few studies that properly assess it; some report strong moderate transfer (e.g., Holmes et al., 2009), some report moderate transfer for a subset of tasks but not others (e.g., T. L. Harrison, Shipstead, et al., 2013b), and many report no moderate transfer (see Shipstead et al., 2012a, b; Simons et al., 2016).

Another challenge is that, even if moderate transfer could readily be achieved, it must also be the case that working memory has a causal influence on other cognitive abilities such as fluid intelligence. As noted by Shipstead et al. (2016), it was not clear why working memory capacity and fluid intelligence were so strongly related, yet the underlying assumption that working memory capacity determined fluid intelligence was rarely questioned (cf. Burgoyne et al., 2019; T. L. Harrison et al., 2015). As such, some scholars have been critical of the hypothesis that a more efficient working memory system causes higher performance on fluid intelligence tasks due to the ability to maintain more information in the form of partial solutions, hypotheses, and subgoals (e.g., Burgoyne et al., 2019; Cusack et al., 2009; Unsworth & Engle, 2005; Wiley et al., 2011). Alternatives therefore need to be considered. As discussed in the introduction, the theoretical framework that we are operating with is specifically that limitations in both working memory capacity and fluid intelligence are primarily due to individual differences in attention control.

Training narrower abilities and skills

While reviews have consistently found that working memory training is ineffective, it should be possible to teach and train specific strategies, which could result in desirable outcomes to the extent that those strategies are shared across different tasks and cognitive domains (see Bailey et al., 2008, for a discussion of the strategy affordance hypothesis). Ideally, individuals would also be able to modify and improve their strategies and apply them to novel situations, further increasing the benefit of strategy training. Several studies support this notion (e.g., Dunning & Holmes, 2014; Paas, 1992; Turley-Ames & Whitfield, 2003; Uttal et al., 2013; also see Ceci & Papierno, 2005), including some from our lab. In a training study that expectedly found no evidence of far-transfer, Foster et al. (2017) noted that there was evidence that spatial abilities were improved after training for individuals of higher ability, as variance in spatial ability increased after twenty days of training. This finding is consistent with Uttal et al.’s (2013) meta-analysis of 217 studies showing that the average effect size of spatial training was just shy of half a standard deviation and that training-induced transfer to untrained spatial tasks is routinely reported. In a study aimed at assessing the role of proactive interference in working memory training and transfer, Redick et al. (2019) reported strategy-specific benefits on transfer in tasks specifically involving letter stimuli. In another training study, T. L. Harrison, Shipstead, et al. (2013b) found that individuals who trained on tasks involving retrieval from secondary (long-term) memory performed better than controls on both complex and simple span tasks. They offered two possible explanations: either working memory training improved a component of working memory (such as secondary memory), or participants developed strategies that were applicable to some tasks but not others, thus resulting in sporadic training effects. 
In a nontraining study of strategy discovery and implementation in immediate free recall, T. L. Harrison, Hertzog, and Engle (2013a) reported two interesting findings. First, most participants were able to implement a particular organizational recall strategy after being informed of it. Second, the relationship between working memory capacity and memory recall was stronger after participants were informed about the organizational strategy than when no specific strategy instructions or information were given. This study further supported the viability of strategy-specific training and showed that such training may be particularly useful for higher ability individuals who are able to successfully execute said strategy and generalize it to other tasks (see also Ceci & Papierno, 2005).

In most of the above examples of strategy training, the goal was to train individuals on strategies specifically related to either general memory capacity (short term and long term) or modal information (verbal or visuospatial). This sort of memory training should still be viewed with skepticism, as it has not been established that training effects will transfer to other areas or produce real-world significance (e.g., Stieff & Uttal, 2015). This is summed up quite well in a pessimistic review of memory strategy training in children by Bjorklund et al. (1997), who argued that training children was often successful in teaching a strategy, but children rarely showed any measurable benefit of said training. We instead propose that training of strategies of an attentional nature has a higher chance to succeed than training ones related to memory capacity or modalities, as the prospect of chunking or mnemonic training resulting in widespread cognitive improvements remains dubious. There are a handful of studies showing benefits of training and interventions that are more targeted towards attention and self-regulation than capacity, but less attention has been given to this area from researchers who study individual differences in executive functioning. We think this area warrants further research, particularly given theoretical and methodological advances in the realm of individual differences in attention.

The potential for attention-based training

In clinical psychology, attention-based interventions and training are often shown effective by way of identifying individuals with a specific vulnerability and targeting remediation for that particular domain. This will be discussed in more detail in later sections, but relevant here is that attention-based training has also been shown to be effective in neurotypical individuals, most notably by Verbruggen and colleagues (Lawrence et al., 2015; Porter et al., 2018; Porter et al., 2021; Stevens et al., 2015; Verbruggen et al., 2012). In a novel and adaptive gambling procedure, Verbruggen et al. (2012) offered participants monetary incentives for successful gambling decisions. On each trial of the gambling task, participants could choose to bet six different amounts, and participants were informed that their bets would be less successful the more they gambled. Before the gambling task, participants were trained for 30 minutes on a separate identification task in which they were occasionally (25% of total trials) prompted to either withhold responding just before they would normally execute an identification response (stop group) or make an additional keypress and then their normal identification response (double-response group). For the stop group, the authors reasoned that, “occasionally stopping motor responses should induce a general state of cautiousness that may propagate across cognitive domains. When preparing to stop, people make proactive adjustments and become more cautious in executing motor response” (p. 807). They further elaborated that any overlap between mechanisms regulating motor cautiousness and controlling gambling behavior could result in increased cautiousness transferring to other domains, such as encouraging risk-averse behavior. After training and a short break, the two training groups and a control group (which received no training) performed 84 trials of a gambling task. 
As predicted, they found that the stop group took 10%–15% less monetary risk on the gambling task than did the double-response and control groups (that is, the stop-group made more cautious bets). They also found that training-induced cautiousness effects were still present when the gambling task was administered two hours after training. These findings indicate that even brief motor cautiousness training can influence gambling decisions, resulting in less risky gambling behavior. They suggested that future studies should examine these sorts of effects when using other forms of cautiousness and inhibition, for instance in the realm of speed–accuracy trade-offs. Of note is that working memory training studies often involve tens of hours of training on a multitude of cognitively demanding tasks, whereas Verbruggen et al.’s (2012) training was modest in terms of duration and demand. As such, we would expect larger training effects with more intensive and sustained training efforts, perhaps using a variety of procedures designed to improve different types of attentional behavior. To that end, it is not surprising that a follow-up study by Verbruggen et al. (2013) found that these training effects virtually disappeared 24 hours after training. However, more optimistically, a series of experiments by Stevens et al. (2015) replicated the training benefits induced by requiring participants to occasionally withhold responses, and importantly found that it was a consistent and reliable (but relatively small) effect that generalized to an untrained task. In a different sort of design, Porter et al. (2018; Porter et al., 2021) reported evidence that children make healthier food decisions when given food-specific inhibition training, and Lawrence et al. (2015) similarly reported that inhibition training resulted in healthier food-related decisions that were associated with self-reported weight loss six months after training. 
Still, this is a relatively unexplored area, and so the efficacy of attention-specific training is yet to be firmly established or refuted.

In summary, despite the vast resources devoted to working memory training this century, properly conducted studies and reviews clearly fail to show that working memory capacity can be increased or that subsequent far-transfer (particularly to intelligence) is possible. Narrowing the scope to memory-specific training such as chunking and mnemonic strategies offers some promise, but evidence is scant. Conversely, we argue that successful training and interventions within the realm of attention, if attainable, have the potential to produce farther-reaching positive benefits than memory-specific training. Several studies have provided tentative evidence in support of this view, and increased focus on attentional interventions and improved methods could provide a breakthrough in the coming years. It does, however, need to be stressed that we are by no means suggesting that it is possible to train and improve attention control at a broad or construct level, as there is no convincing evidence to suggest that this is attainable given current methods and scientific knowledge. Rather, our argument is that identifying and training various attention-related aspects such as impulsive behaviors is a more principled approach and more promising than the ones taken in many working memory training studies, given both our claim that attention control is a more fundamental aspect of cognitive performance and that some research supports the efficacy of attentional-based interventions. In this vein, we now turn to the clinical literature and some successful efforts to improve cognitive functioning in populations with specific deficits associated with symptoms of psychological disorders.

Psychological distress

Psychological distress has been identified as a leading cause of disability, morbidity, mortality, and economic burden (U.S. Burden of Disease Collaborators et al., 2018). Clinical researchers have identified deficits in “cognitive control” as a potential risk factor for a wide variety of psychological disorders (Barch, 2005; Gotlib & Joormann, 2010; Harvey et al., 2004; Keilp et al., 2013; Mathews & MacLeod, 2005). Koster et al. (2017) defined cognitive control as “executive processes that allow information processing and behavior to vary adaptively over time depending on current goals” (p. 80), similar to our view of attention control, but they conceptualized it as involving all three of the executive functions from the Miyake et al. (2000) framework (inhibition, shifting, and updating). According to Conway et al. (2021), cognitive control is a “broad construct that refers to the regulation of information processing during goal-directed behavior” (p. 7), and they also classified attention control as referring to individual differences in cognitive control. In the following sections, we will refer to the broader concept of cognitive control and then narrow it down to working memory and attentional mechanisms more specifically.

We discuss the role of cognitive control in clinical disorders organized by the latent class structure identified by Caspi et al. (2014). They used confirmatory factor analysis to assess the structure of psychopathology in a large sample (N = 1,037). They identified a tripartite factor structure, consisting of internalizing disorders (including anxiety disorders and depression), externalizing disorders (including substance use disorders and conduct disorder), and thought disorders (including schizophrenia, bipolar disorder, and obsessive-compulsive disorder). Similar to the general factor of intelligence (g), the authors also identified a higher-order p factor, which is defined as a broad (transdiagnostic) risk factor for psychological distress that was associated with greater impairment in function, increased heritability, more adverse childhood experiences, and more compromised brain function during early development (particularly deficits in self-control and emotion regulation). Further, the authors reported significant correlations between p and a low self-control factor in childhood (r = .26), working memory as measured by the WAIS-IV in adulthood (r = −.18), and mental control measured by the Wechsler Memory Scale (3rd ed.; WMS-III) in adulthood (r = −.20). These factors were all significantly associated with internalizing, externalizing, and thought disorders when p was not included in the model but were no longer significant when p was added into the model.
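
To put the reported correlations with p on a common footing, it can help to square each coefficient, which under the usual linear interpretation gives the proportion of variance two measures share. The short sketch below does only that arithmetic on the three values quoted above. The dictionary labels are our shorthand rather than terms from Caspi et al. (2014).

```python
# Correlations with the p factor as quoted in the text.
correlations_with_p = {
    "low childhood self-control": 0.26,
    "adult working memory (WAIS-IV)": -0.18,
    "adult mental control (WMS-III)": -0.20,
}

# Squaring r gives the proportion of shared variance (sign is discarded).
shared_variance = {name: r ** 2 for name, r in correlations_with_p.items()}

for name, r2 in shared_variance.items():
    r = correlations_with_p[name]
    print(f"{name}: r = {r:+.2f}, shared variance = {r2:.1%}")
```

On this reading, each measure shares well under a tenth of its variance with p. Associations of that size are statistically reliable in a sample this large but modest in magnitude.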

Importantly, evidence in support of the p factor highlights the value of identification of underlying risk factors that may explain individual differences in transdiagnostic risk for psychopathology. Deficits in cognitive control may represent one such transdiagnostic risk factor. For example, difficulties in regulation of attention and concentration are listed as diagnostic criteria for several disorders in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; American Psychiatric Association, 2013; see Table 2 for a list of disorders that include difficulties related to concentration and attention among the diagnostic criteria). Additionally, several researchers have routinely found an association between cognitive control processes and many different disorders (i.e., demonstration of multifinality), including psychotic disorders and suicide (Barch, 2005; Gotlib & Joormann, 2010; Harvey et al., 2004; Keilp et al., 2013; Mathews & MacLeod, 2005). These findings support the view that dysregulation of cognitive control processes represents an intermediate phenotype, or heritable trait, that confers risk or resilience for the development of psychological distress (Nolen-Hoeksema & Watkins, 2011). Notably, the relationship between cognitive control and psychological distress is likely to be bidirectional, in that symptoms of psychological disorders negatively impair cognitive functioning and may maintain symptoms over time in addition to conveying risk for the initial development of symptoms.

Table 2 Disorders with symptoms related to impaired attention

Improvement in cognitive control may therefore represent a mechanism through which therapeutic interventions for a variety of disorders contribute to improvement in symptoms of psychological distress. For example, cognitive training programs have successfully been applied to the treatment of several different disorders, including schizophrenia, ADHD, mood disorders, anxiety disorders, and substance use disorders (see Keshavan et al., 2014, for a review). As such, researchers are increasingly exploring how interventions designed to improve cognitive control may be used to enhance therapy. The current section reviews evidence in support of the role of cognitive control in the development, maintenance, and treatment of psychological distress across disorders, including internalizing disorders, externalizing disorders, thought disorders, and neurodevelopmental disorders. We will also consider the role of cognitive control in the context of minority stress and racial trauma. Particular emphasis will be dedicated to internalizing disorders, which are among the most prevalent disorders (Kessler et al., 2005) and have robust literatures devoted to cognitive processing. We will argue that better understanding of the role of attention control, specifically, may enhance theoretical models and treatments for psychological disorders.

Internalizing disorders

Internalizing disorders, including anxiety disorders and depression, are characterized by behavioral overcontrol (M. Kovacs & Devlin, 1998) and emotion dysregulation (Aldao et al., 2010; Hostinar & Cicchetti, 2019). Emotion regulation refers to the ability to interact with emotions in a way that is consistent with personal goals (Gross, 2015). Cognitive processes are thought to play a central role in emotion regulation (Gotlib & Joormann, 2010; Joormann & Vanderlind, 2014). For example, effective cognitive emotion regulation, comprised of selective attention control and cognitive reappraisal abilities, has been identified as a source of resilience that may prevent the onset of psychopathology (Troy & Mauss, 2011). Cognitive dysfunction, on the other hand, may predispose individuals to engage in maladaptive emotion regulation strategies (e.g., emotional suppression) and impair the ability to engage in adaptive emotion regulation strategies (e.g., cognitive restructuring; Campbell-Sills et al., 2014; Joormann & Gotlib, 2010; Ochsner & Gross, 2005; Schmeichel & Tang, 2015). As such, assessment of attention control may enhance our understanding of how emotion dysregulation contributes to the development and maintenance of symptoms of internalizing disorders (and vice versa).

Improved cognitive control may be the mechanism through which interventions shown to effectively treat symptoms of internalizing disorders also improve emotion regulation. Transdiagnostic treatments for internalizing disorders, such as the Unified Protocol (Barlow et al., 2010), are thought to promote adaptive emotion regulation and improve symptoms via interventions that promote cognitive reappraisal, behavioral exposure, and mindfulness. Cognitive reappraisal is considered to be an adaptive emotion regulation strategy during which individuals make more realistic evaluations of situations, thoughts, and core beliefs. For example, a catastrophic thought, “If I don’t finish my part of the manuscript on time my first-author spouse will divorce me,” may be reappraised to be more realistic and/or helpful, “If my contribution to the manuscript is late, it is highly likely that my first-author spouse will be understanding, and the likelihood is low that it will negatively impact our relationship in the long term.” Cognitive reappraisal is considered to be a primary mechanism through which cognitive behavioral therapy, one of the most effective forms of treatment for internalizing disorders, contributes to symptom improvement (Smits et al., 2012). Notably, cognitive reappraisal may depend on cognitive control capabilities, particularly attentional processes (Gotlib & Joormann, 2010). For example, research has shown that larger working memory capacity was associated with greater ability to engage in cognitive reappraisal (e.g., Schmeichel et al., 2008). Exposure therapy, which has been shown to be effective in treatment of anxiety disorders, posttraumatic stress disorder, obsessive-compulsive disorder, and eating disorders, improves symptoms through promotion of “inhibitory learning,” which we will argue reflects attention control (Craske et al., 2014). 
Finally, mindfulness-based interventions have also been shown to be effective for treatment of internalizing disorders (Khoury et al., 2013). Mindfulness involves focusing attention to the present moment with acceptance and nonjudgment (see Vago & David, 2012 for a review of neurobiological mechanisms of mindfulness). Improvements in working memory capacity may represent a mechanism through which mindfulness contributes to improved emotion regulation (Corcoran et al., 2010; Vago & David, 2012). For example, research shows that mindfulness training may offer protection from declines in working memory capacity (measured by operation span task) which, in turn, prevents exacerbation in negative affect following exposure to a highly stressful environment (predeployment interval among active military personnel; Jha et al., 2010). We would argue that inclusion of measures of attention control could further clarify the mechanistic role of cognitive functioning in the relationship between therapeutic interventions and improvement in symptoms. Along these lines, several studies have demonstrated an association between worse antisaccade performance and symptoms of anxiety and depression (including trait symptoms and induced symptoms; reviewed by Ainsworth & Garner, 2013). Some studies have even reported improvements in attention control as a result of treatment response (Crevits et al., 2005; Malsert et al., 2012). The following section will explore the role of cognitive control, working memory capacity, and attention control in the development and treatment of anxiety disorders and depression.

Anxiety disorders

“Matt” presented to treatment as part of his performance improvement plan after receiving a poor evaluation at work. He indicated his work-related difficulties began after receiving a new assignment to a highly stressful and competitive environment. Matt noted his supervisor was a “micromanager” who would criticize every mistake of his. He began experiencing symptoms of generalized anxiety disorder which included frequent and excessive worrying about a variety of domains (e.g., finances, work, his health, loss of a loved one), that coincided with difficulty falling asleep, muscle tension, being easily fatigued, irritability, feeling “on edge,” and difficulty concentrating. Though he had since transferred to a more supportive working environment, he continued to worry constantly. Matt stated he started experiencing moments of his mind going blank and being unable to complete simple tasks. Meta-worries (worrying about the fact that he was worrying too much) exacerbated his symptoms and made it more difficult for him to focus on his work and more prone to making mistakes, which reinforced his worry. This scenario, loosely based on clinical experiences of the third author, highlights the dynamic interaction between cognitive control and symptoms of anxiety disorders.

Several theories of anxiety disorders indicate that impairments in cognitive control may lead to the development of symptoms of anxiety disorders (e.g., Mathews & MacLeod, 2005). For example, evidence from longitudinal research indicates that reactive cognitive control strategies may increase risk for later development of anxiety symptoms, whereas proactive or more goal-directed cognitive control may prevent symptoms (Troller-Renfree et al., 2019). In addition, Bredemeier and Berenbaum (2013) found that n-back performance predicted self-reported worry measured at a later time point even after controlling for worry measured at Time 1. Further, Stout and Rokke (2010) reported a significant association between performance on a selective visual arrays task and self-report measures of anxiety, rumination, and depression for those with low working memory capacity but not for those with high working memory capacity.

Cognitive processes have also been shown to maintain symptoms over time; for example, people with social anxiety disorder commonly engage in postevent processing following a social encounter. Postevent processing involves mental rehearsal of a past social encounter that emphasizes attention to potential threat or embarrassment (see Wong, 2016, for a review). An experimental manipulation of postevent processing conducted by Vassilopoulos and Watkins (2009) demonstrated that the nature of postevent processing, concrete or abstract, contributed to symptoms of psychological distress, such that more abstract thinking exacerbated symptoms and more concrete thinking (reflective of more cognitive control) minimized them.

Notably, symptoms of anxiety may also impair cognitive control. For example, Visu-Petra et al. (2014) reported evidence that self-reported anxiety predicted performance on a digit span task measured 9 months later. Further, several experimental studies using anxiety induction strategies (Trier Social Stress Test and affective video clips) have demonstrated greater impairment in performance on working memory measures among participants in anxiety-induction conditions relative to controls (J. R. Gray & Braver, 2002; Oei et al., 2006; Qin et al., 2009; Schoofs et al., 2008). As such, the relationship between anxiety and cognitive control is likely bidirectional.

Reduced working memory capacity is often identified as one of the primary cognitive deficits exhibited in individuals with anxiety. However, some researchers have argued that cognitive symptoms of anxiety are best explained by impairments in attention control more broadly (e.g., Berggren & Derakshan, 2013). A meta-analysis by Moran (2016) provided the strongest evidence of this. Moran found that self-reported symptoms of anxiety were related to impairments in working memory capacity across a wide variety of measures, but the overall effect size was relatively small (g = −.33). On the other hand, the effect size was more than twice that when analysis was restricted to studies examining the relationship between anxiety and attention control (i.e., filtering efficiency; g = −.70). Another meta-analysis by Shi et al. (2019) also demonstrated a significant negative relationship between anxiety and attention control measured by a variety of tasks, including those with limited reliability such as the Stroop task (g = −.58). Further supporting the relationship between anxiety and attention control, several studies have shown that increased cognitive load impairs performance among anxious individuals to a greater degree relative to nonanxious individuals specifically when the secondary task requires activation of attentional processes (Eysenck et al., 2005; Hayes et al., 2008; Rapee, 1993; Stefanopoulou et al., 2014).
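
To make the effect sizes above more concrete, the following minimal sketch shows one common way Hedges' g is computed for a two-group comparison. It is an illustration only and not the procedure used in any of the cited meta-analyses: the function name, the group labels, and all input values are hypothetical, and the small-sample correction uses the usual approximation. With the anxious group entered first, a negative g indicates poorer performance in that group.

```python
import math


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference (group 1 minus group 2) with a small-sample correction."""
    # Pool the two group variances, weighting each by its degrees of freedom
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    # Cohen's d: raw mean difference in pooled standard deviation units
    d = (mean_1 - mean_2) / math.sqrt(pooled_var)
    # Approximate correction factor that shrinks d slightly in small samples
    correction = 1 - 3 / (4 * (n_1 + n_2) - 9)
    return correction * d


# Hypothetical accuracy scores (proportion correct); NOT data from any cited study.
# Group 1 = high-anxiety participants, group 2 = low-anxiety participants.
g = hedges_g(mean_1=0.78, sd_1=0.10, n_1=40, mean_2=0.85, sd_2=0.10, n_2=40)
print(round(g, 2))  # prints -0.69
```

In this hypothetical case, a difference of seven percentage points in accuracy, with a standard deviation of ten percentage points in each group, corresponds to an effect of roughly the size reported for filtering efficiency.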

Theories highlighting the role of attentional processes on symptoms of anxiety disorders have informed the development of cognitive treatments. For example, many researchers have argued that biased attention towards threatening stimuli may contribute to symptoms of anxiety disorders (Armstrong & Olatunji, 2012; Bar-Haim et al., 2007; Cisler et al., 2009; Cisler & Koster, 2010; Mathews & MacLeod, 2005; Pergamin-Hight et al., 2015; Van Bockstaele et al., 2014; Weierich et al., 2008; but see Kruijt et al., 2019). Based on the assumption that attentional vigilance towards threat contributes to symptoms of anxiety disorders, researchers have developed interventions designed to train attention away from threatening stimuli, termed attention bias modification (see C. MacLeod et al., 2002). Unfortunately, these types of attention bias modification programs have demonstrated limited efficacy on symptoms (Beard et al., 2012; Cristea et al., 2015; Fodor et al., 2020; Mogoaşe et al., 2014; Van Bockstaele et al., 2014). Some researchers have used an emotional variant of the antisaccade task to clarify the nature of attention bias in anxiety (Chen et al., 2014; Derakshan et al., 2009; Jazbec et al., 2005; Liang, 2021; Reinholdt-Dunne et al., 2012; Wieser et al., 2009). Findings from these studies highlighted that attention bias in anxiety is multifaceted and more dynamic than hypervigilance towards threat. As such, attention bias modification programs that simply train attention away from threat may fail to adequately target the specific attentional mechanisms that contribute to symptoms of anxiety. Consistent with this view, Mogg and Bradley’s (2016) review highlighted that attention bias modification interventions may be improved by better targeting top-down processing and goal-directed inhibitory control (i.e., attention control) rather than simply training attentional avoidance of threat. 
For example, positive search training interventions instruct participants to search for positive images and ignore threatening image distractors (Dandeneau et al., 2007; De Voogd et al., 2014; Waters et al., 2013; Waters et al., 2015; Waters et al., 2016). Studies utilizing positive search training paradigms have demonstrated better efficacy relative to typical attentional avoidance training paradigms (Mogg & Bradley, 2016). These findings highlight how nuanced understanding of the specific cognitive processes that contribute to symptoms of psychological distress can better inform therapeutic interventions.

Exposure therapy is one of the most effective treatments for anxiety disorders and may promote reduction in symptoms by enhancing attention control. Exposure therapy helps people to overcome their fears by repeatedly facing them, in a variety of contexts, and without engaging in any subtle avoidance behaviors (i.e., safety behaviors). For example, a person with a specific phobia of spiders may confront their fear by engaging in progressively fear-inducing exposure sessions (e.g., looking at images and videos of spiders, walking up to a spider web and taking a picture of it, holding a spider in their hand), repeatedly and in a variety of locations and emotional states. They will also need to refrain from engaging in any safety behaviors, such as imagining the spider is not real or dissociating (mentally checking out) that may limit their ability to learn that the feared outcome is not as likely to happen, nor is it as insurmountable, as they expect. This new safety learning competes with initial fear learning and is termed “inhibitory learning” (Craske et al., 2014). Arguably, attention control is required to promote inhibitory learning (by refraining from engaging in safety behaviors) such that those with greater attention control capabilities may be more likely to benefit from exposure therapy, or they may benefit from it more quickly. The inhibitory learning that develops through exposure therapy may also promote attention control when someone with an anxiety disorder is faced with a feared stimulus, allowing for inhibition of a fear response. In support of this view, research shows that exposure therapy leads to increased activation of prefrontal brain regions associated with cognitive control and decreased activation of the amygdala and other brain regions associated with fear learning and threat-response (Bishop, 2007; Sotres-Bayon et al., 2006). 
Further, medications such as D-cycloserine and methylene blue that have been shown to enhance new safety learning during exposure (Mataix-Cols et al., 2017; Zoellner et al., 2017) have also been shown to repair working memory deficits in quinolinic acid hippocampal-lesioned rats (Schuster & Schmidt, 1992).

A common theme across these findings is that attention control in the presence of emotionally threatening stimuli may represent a mechanism in the development, maintenance, and treatment of anxiety disorders. Reliable measurement of deficits in attention control may be useful for identifying those at risk for development of later psychopathology (Hutton & Ettinger, 2006) and could potentially be used to inform preventative interventions. Further, interventions that promote more adaptive attention control strategies have promise to enhance treatments.

Depression


Symptoms of depression are similarly theorized to develop as a result of deficits in cognitive control (e.g., Siegle et al., 2007). In support of this view, longitudinal research of 4,192 participants in the National Longitudinal Study of Adolescent to Adult Health found that better working memory capacity was associated with decreased risk for later development of depressive symptoms (Crandall et al., 2018). Similarly, Kertz et al. (2016) found that preschool-age children who demonstrated poorer cognitive control on self-report measures of shifting and inhibition abilities consistently reported greater levels of depression and anxiety in subsequent assessments over the course of seven and a half years. Rodman et al. (2020) also provide support for the role of cognitive control, including evaluation of neurobiological mechanisms, in conveying risk and resilience for later development of depression among youths with a history of childhood maltreatment. Prospective studies have similarly found that deficits in cognitive control predict future symptoms of depression (Demeyer et al., 2012; Pe et al., 2016; Zetsche & Joormann, 2011). In addition, deficits in cognitive control have been identified among those at higher risk of developing depression (Joormann et al., 2007).

If deficits in cognitive control represent an underlying risk factor for depression, we would also expect these deficits to persist following recovery from a major depressive episode, particularly among those who experience recurrent episodes. Some studies have found that cognitive control deficits persist following recovery from a depressive episode (Demeyer et al., 2012; Joormann, 2004; Joormann & Gotlib, 2007; Levens & Gotlib, 2015; Paelecke-Habermann et al., 2005; Vanderhasselt & De Raedt, 2009). Other studies find evidence of deficits in cognitive control among participants who are currently depressed but not among those whose symptoms have remitted (Gotlib & Cane, 1987; Hedlund & Rude, 1995; Merens et al., 2008; Quigley et al., 2020). Reliable measures of attention control may help clarify these conflicting findings. For example, Vanderhasselt and De Raedt (2009) did not detect group differences in reaction times or error rates on a Stroop task between participants who had never experienced depression and those with a history of depression. Measurement of conflict-related modulation abilities using event-related potentials, however, reflected greater impairments among formerly depressed participants, particularly for those with multiple recurrences of major depressive episodes.

Meta-analytic studies further support the association between depression and deficits in cognitive control. A recent meta-analysis identified a small overall effect (g = −0.31) that was stronger with age (Dotson et al., 2020). Further, a meta-analysis of the association between depression and n-back performance revealed depressed participants performed worse relative to controls across varying levels of cognitive load (Nikolin et al., 2021). Of note, greater deficits in cognitive control, particularly with regard to attention control and working memory capacity, have been shown to be associated with history of suicide attempts among a sample of individuals with major depressive disorder or bipolar I disorder (Keilp et al., 2013). As such, improving cognitive control is likely important for improving the efficacy of current treatments for depression (Roiser et al., 2012; Siegle et al., 2007) and may even play a role in preventive interventions (Ronold et al., 2019). In addition, measurement of cognitive control may inform who is more likely to benefit from treatment. For example, research conducted by Tozzi et al. (2020) has supported the role of functional connectivity associated with response inhibition as a predictive biomarker for response to antidepressant treatment of major depressive disorder.

Limited cognitive control in depression is thought to be the result of hyperaccessibility of negative emotional content in working memory due to poor inhibition (i.e., impairments in attention control; see Gotlib & Joormann, 2010, for a review). Irrelevant, negative emotional stimuli may subsequently be more difficult for depressed individuals to suppress or intentionally forget (Hertel & Gerstle, 2003; Joormann & Gotlib, 2008; Power et al., 2000; Yang et al., 2016). Notably, there are robust literatures supporting biased recall of negative information (Mathews & MacLeod, 2005) and overgeneral autobiographical memory in depression (Williams et al., 2007). Hyperaccessibility of negative emotional stimuli in working memory may contribute to these negative biases in long-term memory (Gotlib & Joormann, 2010).

Rumination, or the tendency to engage in repetitive, self-focused, negative thinking patterns, has been identified as a key mechanism through which deficits in cognitive control contribute to symptoms of depression (De Raedt & Koster, 2010; Joormann & D'Avanzato, 2010; Joormann & Vanderlind, 2014; Mor & Daches, 2015). Rumination has been shown to be a proximal risk factor for a variety of disorders, including anxiety, depression, substance use disorders, and eating disorders (Nolen-Hoeksema & Watkins, 2011). Rumination is stable over time and has been shown to predict the onset of major depressive episodes (Nolen-Hoeksema et al., 2008). Rumination is also considered to be “an intensely attention-demanding process” (Hertel, 2004, p. 187). The relationship between rumination and attention control has been reviewed by Whitmer and Gotlib (2013) and H. Roberts et al. (2017). Both reviews highlight several studies that have demonstrated correlations between rumination and difficulties inhibiting or disengaging from irrelevant information. Whitmer and Gotlib describe an attentional scope model of rumination, noting that “individual differences in attentional scope influence how likely individuals are to ruminate when they experience a negative mood” (p. 1053). As such, “attentional scope” is considered to represent a mechanism through which symptoms of depression relate to rumination. Roberts et al. note that valid measurement of inhibition is needed to clarify the nature of cognitive deficits in rumination, including examination of causality. We have argued that the antisaccade task may represent one such measure. Interestingly, De Lissnyder et al. (2011) found that reaction times on the antisaccade task were slower in individuals prone to rumination but not in depressed individuals. The authors suggested that attention control may therefore be particularly important for informing underlying vulnerability to depression.

The relationship between rumination and impairments in attention control may be bidirectional. For example, rumination induction has been associated with more stereotyped counting responses in a random number generation task (Watkins & Brown, 2002) and with impaired performance on a standard Stroop task (Philippot & Brutoux, 2008). In contrast, evidence from an experiment that used cognitive bias modification to train participants to either engage in inhibition or to not engage in inhibition suggests that impairments in attention control may lead to rumination, which subsequently contributes to negative biases in long-term memory (Daches et al., 2019). Notably, the negative effects of rumination on memory may be reversed when participants are directed to complete a task that facilitates attention control during encoding (Hertel & Rude, 1991), even when presented with rumination inducing stimuli during encoding (Hertel et al., 2012). Attention training has also been shown to protect against rumination-related biases in long-term memory (Daches et al., 2019). These findings indicate that attention-targeted interventions have the potential to improve, and possibly prevent, symptoms of depression.

Similar to attention bias modification in treatment for anxiety disorders, cognitive bias modification and other training programs have been applied to the treatment of depression. A meta-analysis of cognitive training paradigms for depression (including data from nine randomized trials) found small to moderate overall effects on reduction of symptoms (g = .43–.72) as well as moderate-to-large effects on various measures of executive functioning (g = .67–1.05; Motter et al., 2016). A systematic review of cognitive control training paradigms for depression conducted by Koster et al. (2017) highlighted how different approaches to cognitive control training may enhance treatment effectiveness. Their review identified three factors which were associated with better training efficacy, including far transfer of training (e.g., reduction of symptoms and known risk factors of depression): (1) multiple sessions of cognitive control training, (2) inclusion of participants demonstrating deficits in cognitive control at baseline, and (3) incorporation of emotional stimuli in training paradigms. For example, the authors reviewed two studies conducted with clinical samples that demonstrated that cognitive control training improved emotion regulation and symptoms of depression (Siegle et al., 2007; Siegle et al., 2014). These improvements corresponded with better connectivity between prefrontal regions of the brain and the amygdala and, importantly, were maintained across time. Koster et al. (2017) also speculated that attention control may represent an essential component for maximizing the effectiveness of cognitive control training on symptoms: “Within training it seems key that individuals are engaged with training that demands activating frontal areas such as the [dorsolateral prefrontal cortex] which are implicated in attention control, while ignoring task-unrelated stressful thoughts” (p. 89). 
Taken together, research on these training paradigms demonstrates the applicability of attention control in the prevention and treatment of emotion dysregulation (e.g., rumination) and subsequent symptoms of depression.

Posttraumatic stress disorder

Posttraumatic stress disorder (PTSD) is unique in that the diagnosis is dependent on experiencing a traumatic event. Among those who experience trauma, relatively few go on to develop clinically significant symptoms (McFarlane, 2000). Several factors have been identified that may help explain why some people develop PTSD and others do not, including experience of traumatic events that are chronic, unpredictable, and perceived as uncontrollable (Mineka & Zinbarg, 2006). Cognitive factors may also inform risk and resilience to PTSD, and to that end several studies have established a link between cognition (measured prior to exposure to trauma) and resilience to PTSD following exposure to trauma (e.g., Brewin et al., 2000; Jha et al., 2010; Kremen et al., 2007; Macklin et al., 1998). Further, Kremen et al. (2007) found that genetic factors fully accounted for this cognitive risk or resilience. Notably, many of these studies focused on intelligence and so they offer limited insight into the underlying processes that convey risk and resilience for PTSD. Assessment of attention control may be important for clarifying risk and resilience since attention control is possibly the link between intelligence and PTSD.

Studies exploring the more specific relationship between PTSD and cognitive control have revealed mixed findings (e.g., B. L. Gillie & Thayer, 2014). Evaluation of deficits in attention control may help clarify the relationship between symptoms of PTSD and impairments in cognitive control. For example, Leskin and White (2007) found positive associations for inhibition tasks among participants with PTSD relative to controls, but not for measures of task-switching, alerting, or orienting.

The most effective treatments for PTSD include those that incorporate exposure therapy (Cusack et al., 2016). As was described above in the anxiety disorder section, exposure therapy is thought to improve symptoms through development of new safety learning that promotes “inhibitory learning” (Craske et al., 2014). Cognitive training programs also show promise for treatment of PTSD. For example, Schweizer et al. (2017) found that adolescents diagnosed with PTSD who completed affective working memory training demonstrated improved attention control, decreased symptoms, and increased utilization of adaptive emotion regulation strategies. Targeting attention control may also be important for early psychological interventions designed to protect against the development of PTSD following exposure to trauma, which, to date, have demonstrated limited efficacy (N. P. Roberts et al., 2019).

Racial trauma and minority stress

Early life stress such as childhood abuse, more commonly experienced in marginalized groups, has been shown to predict working memory impairments in adulthood regardless of clinical status (see Goodman et al., 2019, for a systematic review and meta-analysis). Similarly, sexual abuse, more commonly experienced by those who identify as women (Smith et al., 2018), and particularly by those who identify as transgender (Stotzer, 2009), has also been shown to correlate with working memory impairments independently of symptoms of PTSD (Blanchette & Caparos, 2016). More nuanced understanding of factors that contribute to psychological distress beyond social categories such as race or gender is necessary to facilitate culturally informed theories and treatment. For example, experiences of stigma and discrimination have been theorized to contribute to negative mental health outcomes (Meyer, 2003) and have been associated with several health-related indicators of chronic stress including cardiovascular health (Panza et al., 2019) and accelerated aging (Carter et al., 2019). Notably, researchers have also reported an association between chronic stress and deficits in working memory capacity (G. W. Evans & Schamberg, 2009; Mika et al., 2012). In addition, researchers have reported evidence of negative effects of discrimination on cognition, including impaired performance on a Stroop task (Bair & Steele, 2010; Salvatore & Shelton, 2007). These findings highlight the importance of considering how factors that negatively impact minoritized groups (such as racial trauma, discrimination, and minority stress) affect cognitive functioning and, in turn, increase risk for symptoms of psychopathology.

Stereotype confirmation concerns, defined as fear that one’s social behaviors will elicit judgments consistent with common stereotypes about a social group to which they belong (Contrada et al., 2001), represent one such potential mechanism that may explain health and mental health disparities among minoritized groups. The construct of stereotype confirmation concerns is similar to stereotype threat, defined as a reduction in task performance when a stereotype about an individual’s social group is made salient (Steele, 1997), but stereotype confirmation concerns are considered to be more enduring whereas stereotype threat is acute. Notably, working memory capacity has been shown to mediate the effect of stereotype threat on performance (Aronson et al., 1999; Schmader & Johns, 2003; Steele, 1997; Steele & Aronson, 1995). In addition, results from a survey of 353 adults who identified as lesbian, bisexual, or gay indicated that experiences of discrimination, concealment, and internalized homophobia were positively associated with psychological distress, which, in turn, was significantly related to self-reported impairments in working memory capacity (P. C. Jones, 2017). The synergistic effect of intersecting identities is also important to consider. For example, the effect of microaggressions on working memory may be compounded by multiple salient identities that may contribute to stereotype threat simultaneously (C. Harrison & Tanner, 2018), as when a Black, immigrant, young woman who is a first-generation college student is told by her academic advisor, a White American middle-aged man, that she should not get her hopes up for applying to medical school (multiple identities may be relevant in this situation). Measures of attention control may represent valuable tools for exploration of the role of intersectionality in increased risk for psychological distress.
In addition, validation of psychological distress that results from discrimination in the form of microaggressions is theorized to lessen cognitive load and may facilitate rupture repair between a therapist and a client following a microaggressive communication by the therapist (Gaztambide, 2012). Identification of evidence-based strategies to promote resilience in response to stigma and discrimination, both within and outside of therapeutic contexts, is sorely needed (Metzger et al., 2021). Psychometrically strong measures of attention control may reduce adverse impact compared with more commonly used cognitive measures (also see Burgoyne et al., 2021), and could potentially serve as valuable benchmarks of effectiveness for such interventions. We expand on this point in the upcoming section on psychological testing.

Externalizing disorders

Disinhibition, defined by deficits in self-regulation, is considered to represent a core vulnerability factor for externalizing disorders, including conduct and antisocial disorders, substance use disorders, and risky behaviors (see Mullins-Sweatt et al., 2019, for a review). Impulsive behaviors characteristic of people with externalizing disorders may result, in part, from deficits in the ability to keep potential future consequences of behavior in mind (Barkley, 2001; Finn, 2002). As such, working memory is implicated in behavioral disinhibition (Bogg & Finn, 2010; Finn et al., 2002; Grégoire et al., 2012), and has been hypothesized to represent a mechanism through which personality traits, such as disinhibition, contribute to externalizing behaviors and psychopathologies (Finn, 2002). Young et al. (2009), however, found that response inhibition (measured by the antisaccade task, stop-signal task, and Stroop) was a better predictor for externalizing symptoms relative to working memory updating and task shifting, highlighting the value of evaluating attention control.

Finn’s (2002) cognitive-motivational theory identifies underlying mechanisms, including working memory, that may increase risk for the development of alcohol-use disorder. Based on this theory, attention control may represent a mechanism for impulsivity/novelty seeking (difficulties resisting strong appetitive urges) and low harm avoidance (impairment in behavioral inhibition in response to punishment). In support of this view, worse performance on a go/no-go task was observed among individuals with early onset alcoholism who exhibited antisocial traits, but not among those without antisocial traits (Finn et al., 2002). Poor attention control was also associated with increased impulsivity/novelty seeking and low harm avoidance. Additionally, improvements in working memory capacity have been observed following abstinence and treatment for substance use disorder (Bell et al., 2017; Vonmoos et al., 2014), and working memory training has been argued to be effective in treatment for substance use disorder (Bickel et al., 2011; Brooks et al., 2017; Verdejo-Garcia, 2016). Effect sizes, however, have been moderate, and cognitive interventions for substance use disorders may benefit from greater emphasis on attention control (Verdejo-Garcia, 2016).

Thought disorders

Findings from Caspi et al. (2014) demonstrated that thought disorders are best explained by individual differences in the p factor, which was strongly correlated with measures of cognitive control. Therefore, cognitive control may play a particularly important role in understanding the development and treatment of these disorders. In support of this view, findings across more than 40 studies consistently show that people with schizophrenia tend to make more errors on the antisaccade task relative to healthy controls, including those with recent onset of symptoms and those who have never received pharmacological treatment (e.g., Harris et al., 2006; Hutton & Ettinger, 2006; Kleineidam et al., 2019; Radant et al., 2007). In fact, Hutton and Ettinger (2006) have argued that difficulties in performing antisaccade tasks may serve as an endophenotype, or indicator of genetic risk, for development of schizophrenia. Consistent with this view, impaired performance on the antisaccade task has been observed among non-disordered biological relatives of those with schizophrenia (Calkins et al., 2004) as well as those considered to be at clinical high risk for symptoms of psychosis (Nieman et al., 2007). Additionally, deficits in attention control and working memory have been associated with increased genetic risk for development of schizophrenia in a twin study conducted by Cannon et al. (2000). Measures of attention control may also be valuable for informing evaluation of pharmacological interventions for thought disorders by serving as benchmarks for improvement (Hutton & Ettinger, 2006; Lesh et al., 2011).
For example, improved symptoms of schizophrenia from antipsychotic medication have been associated with improved Stroop task performance as well as functional connectivity in the anterior cingulate cortex (Cadena et al., 2019), a brain region associated with various attentional mechanisms such as conflict monitoring, error monitoring, and goal-directed behavior more generally (e.g., Devinsky et al., 1995; Weissman et al., 2003).

Neurodevelopmental disorders

Neurodevelopmental disorders, such as autism and especially ADHD, are uniquely characterized by deficits in cognitive functioning, including working memory and attention. Habib et al.’s (2019) meta-analysis of 34 studies exploring performance on working memory tasks among individuals with autism spectrum disorder reflected significant impairment in working memory capacity compared with healthy controls while controlling for age and intelligence. Effect sizes ranged from moderate (d = .56) to very large (d = 1.45). Researchers have also found evidence of impaired attention control among individuals with autism spectrum disorder (Minshew et al., 1999). Notably, recent work by Hendry et al. (2020) delineates how measurement of attention control may serve as an important indicator of risk for development of symptoms of autism spectrum disorder, ADHD, and functional impairments among infants.

Researchers have posited that impairments in working memory capacity and “response inhibition” may represent endophenotypes for ADHD (Aron & Poldrack, 2005; Castellanos & Tannock, 2002; Crosbie et al., 2008; McAuley et al., 2014; Nigg et al., 2018; Vaurio et al., 2009). Despite a pattern of mixed findings, meta-analytic studies evaluating group differences in performance on measures of working memory capacity revealed significant and sizable effects in both children (d = .69–.74; Kasper et al., 2012), and adults (d = .49–.55; Alderson et al., 2013). Alderson et al. (2013) discussed factors that may have contributed to the pattern of mixed findings: “A more rigorous operational definition that emphasizes attentional shifts between stimuli and the processing component of the task might have improved power to predict between-study [effect size] heterogeneity” (p. 298). This highlights the potential value that an increased focus on attention control could have for this literature. For example, treatment studies evaluating the effects of methylphenidate (a stimulant medication commonly used to treat ADHD) on attention control found evidence of improved antisaccade performance (C. Klein et al., 2002; O'Driscoll et al., 2005). In contrast, the effect of methylphenidate on attention control among nondisordered individuals is negligible (g = .20; Ilieva et al., 2015). In sum, incorporation of measures of attention control has the potential to advance our understanding of the development and treatment of neurodevelopmental disorders.

Summary of attention control and psychological distress

To summarize, across various presentations of psychological distress, measurement of attention control shows great promise in advancing understanding of risk and resilience as well as informing treatment. A particular strength of this approach is the broad, transdiagnostic applicability of attention control across disorders, suggesting that interventions targeting this construct may offer widespread benefit. Future research should explore the extent to which different therapeutic interventions contribute to improvements in attention control, how well benefits are maintained over time, and how improvements in attention control contribute to symptoms, functioning, and quality of life. We also argue that early identification of risk and preventative interventions may be made more feasible through assessment of attention control during early development and among high-risk populations. Cognitive training interventions targeting attention control may subsequently be utilized to prevent psychological distress prior to reaching clinical significance.

Psychological testing

Predicting real-world success

Psychological testing is often used in the real world to inform important decisions such as school admittance; employee hiring and promotion; military personnel selection, placement, and enrollment into training programs; family court rulings; and criminal culpability (e.g., Amrein & Berliner, 2002; Erickson et al., 2007; Hartmann et al., 2003; Heilbrun, 1992; Nwafor & Adesuwa, 2014). A consistent finding in this area is that general mental ability (namely, psychometric intelligence) is the best predictor of academic performance, training success, job performance, and career potential (Bosco et al., 2015; Gottfredson, 1986; Kuncel et al., 2004; Ree & Earles, 1992; Schmidt et al., 2016; Schmidt & Hunter, 1998; Song et al., 2010), which is why many personnel selection tests place heavy demands on accumulated knowledge (crystallized intelligence) and reasoning ability (fluid intelligence).

Some researchers have advocated for the use of working memory tests either instead of or in addition to measures of intelligence for predicting relevant outcomes including academic achievement (Alloway & Alloway, 2010; Aronen et al., 2005; Cockcroft, 2015), performance in air traffic control (Ackerman & Beier, 2007), and multitasking ability (a proxy for job performance; Colom et al., 2010; Hambrick et al., 2010; König et al., 2005; Redick et al., 2016). Lemonaki et al. (2021) reported that reduced working memory capacity, but not intelligence, was associated with job burnout, and moreover that working memory capacity mediated the negative relationship between burnout and job performance.

The effectiveness of attention control measures specifically has received less emphasis in the realm of psychological testing. Nevertheless, studies have shown that attention control is a good predictor of mathematical ability in preschoolers (e.g., Bull & Espy, 2006), predicts scholastic attainment and achievement about as well as working memory capacity (St. Clair-Thompson & Gathercole, 2006), and accounts for variance in multitasking ability above and beyond both performance on the Armed Services Vocational Aptitude Battery (ASVAB; a widely used selection test administered by the United States Military) and intelligence tasks (Martin, Mashburn, & Engle, 2020).

Test fairness and adverse impact in high-stakes testing

Test fairness

A notable finding in psychological testing is that scores on many cognitive tests differ among various groups of individuals (Roth et al., 2001). This is of particular concern when scores are substantially lower for individuals of a protected group (based on sex/gender, race, ethnicity, etc.) and in high-stakes testing, in which the examinee’s test scores are likely to have significant and direct personal consequences (Amrein & Berliner, 2002; Burgoyne et al., 2021; Nunnally, 1964). As such, not only should standards of test reliability and validity be higher when tests are used in high-stakes situations (Nunnally, 1964), but practitioners need to be especially cognizant of the extent to which their measures may be inequitable for certain populations (Ceci & Papierno, 2005). When test scores have substantial subgroup differences, using them for selection purposes can result in adverse impact, which means that some individuals are disproportionately selected over others based on race, ethnicity, sex/gender, religion, or other protected status (see Burgoyne et al., 2021; Schmitt et al., 1997; Zedeck, 2010).

The reasons for subgroup differences and adverse impact are debated and likely multifaceted but are thought to be largely due to systematic inequalities and societal marginalization. As a result, some groups of people have, on average, lower socioeconomic status (Ryan & Siebens, 2012) and therefore less access to social resources such as quality schooling and education, supplemental instruction, nutrition, healthcare, and other opportunities for learning and enrichment (Bradley & Corwyn, 2002; Burgoyne et al., 2021; Outtz & Newman, 2010). Systematic inequalities and marginalization may also result in mistrust of authority, stereotype threat, test reluctance, and test suspicion, all of which could manifest as decreased motivation or ability to perform well on a psychological test (Arthur Jr. et al., 2002; Chan, 1997; Chan et al., 1997; B. D. Edwards & Arthur Jr., 2007; Hausknecht et al., 2004; Spencer et al., 2016; Steele & Aronson, 1995). Further, psychological tests are often normed and validated using largely homogeneous and nonminority samples (Graham, 1992; Okazaki & Sue, 1995). Personnel selection tests are believed to be sensitive to systematic inequalities because performance to some extent relies on things such as accumulated knowledge (crystallized intelligence), language ability, and acculturated learning, of which individuals with lower socioeconomic status will generally have less due to opportunity and circumstance (Burgoyne et al., 2021; Ployhart & Holtz, 2008; R. D. Roberts et al., 2000).

Attention control measures may improve test fairness and reduce adverse impact

Unfortunately, psychological testing cannot immediately solve great societal injustices. But it is still imperative from a legal, moral, and practical (e.g., economic) standpoint that researchers and practitioners strive to minimize the extent to which these differences are reflected in test scores, and consequently the extent to which tests result in adverse impact (see Burgoyne et al., 2021; Ceci & Papierno, 2005). Critical reviews and examinations have found that one of the most effective strategies for combating subgroup differences, and thus adverse impact, is to use noncognitive selection methods such as personality assessments (e.g., integrity and conscientiousness), biographical data, and structured interviews, as these methods can reduce or even eliminate subgroup differences and may improve prediction of job performance when used in addition to cognitive tests (Bobko et al., 1999; Newman & Lyon, 2009; Ployhart & Holtz, 2008; Pulakos & Schmitt, 1996; Roth et al., 2001; Schmidt et al., 2016; Schmitt et al., 1997; Sinha et al., 2011). However, adding noncognitive tests is not universally practiced because it increases administration time and because some practitioners are concerned that this approach reduces overall predictive validity (the so-called diversity-validity dilemma; see Campion et al., 2001; Ployhart & Holtz, 2008). Another way to improve test fairness is to more carefully select which cognitive tests are used, an approach that can be easily combined with noncognitive methods. Specifically, some scholars have advocated for the use of tests that assess fluid abilities (including working memory capacity and attention control) over existing selection tests, as tasks of fluid cognition rely less on factors that correlate with socioeconomic status and instead are purer measures of cognitive potential (Bosco et al., 2015; Burgoyne et al., 2021; Held et al., 2014; Hough et al., 2001; Nelson, 2003).

A notable study in this area was Bosco et al. (2015), who tested a large sample of undergraduates and bank employees and found that “executive attention” (combined performance on operation span, reading span, and flanker) predicted supervisor ratings and a simulated job-performance task as well as the Wonderlic Personnel Test (an intelligence test often used for selection purposes). Further, subgroup differences in scores between Black and White participants were around 40% smaller for the executive attention scores compared with Wonderlic performance. Another study found that attention control predicted multitasking ability above and beyond fluid intelligence and ASVAB scores while reducing Black-White score differences, again by around 40%, compared with the differences observed in ASVAB (Martin, Mashburn, & Engle, 2020; see also Burgoyne et al., 2021, for unreported analyses).
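To illustrate why a reduction of roughly 40% in a subgroup score difference matters for selection, the sketch below converts a standardized mean difference into a ratio of selection rates under a common cutoff. It is a hypothetical illustration rather than a reanalysis of either study. It assumes normally distributed scores with equal variances, and the baseline difference of 1.0 standard deviation, the 30% selection rate, and the function name are all assumed values chosen only for the example.

```python
from statistics import NormalDist


def selection_rate_ratio(d: float, higher_group_rate: float) -> float:
    """Ratio of the lower-scoring group's selection rate to the higher-scoring
    group's rate when both groups face the same cutoff score.

    Assumptions (illustrative only): scores are normal with unit variance in
    both groups, and the group means differ by d standard deviations.
    """
    # Cutoff that selects `higher_group_rate` of the higher-scoring group.
    cutoff = NormalDist().inv_cdf(1.0 - higher_group_rate)
    # Share of the lower-scoring group (mean shifted down by d) above the cutoff.
    lower_group_rate = 1.0 - NormalDist(mu=-d, sigma=1.0).cdf(cutoff)
    return lower_group_rate / higher_group_rate


BASELINE_D = 1.0                       # hypothetical baseline difference, not from the source
REDUCED_D = BASELINE_D * (1.0 - 0.40)  # "around 40% smaller", as described in the text
SELECTION_RATE = 0.30                  # hypothetical share of the higher-scoring group selected

for label, d in (("baseline test", BASELINE_D), ("reduced-difference test", REDUCED_D)):
    ratio = selection_rate_ratio(d, SELECTION_RATE)
    print(f"{label}: d = {d:.2f}, selection-rate ratio = {ratio:.2f}")
```

A ratio of 1.0 would mean both groups are selected at the same rate. Under the stated assumptions, shrinking the score difference moves the ratio closer to 1.0, which is the sense in which smaller subgroup differences reduce adverse impact.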

When considering the goal of reducing subgroup differences, attention measures seem to have a major advantage over working memory measures: simplicity. Consider the operation span task, in which instructions and practice alone take more than 9 minutes on average in the standard administration (see Table 5 from Foster et al., 2015) and involve explaining multiple aspects of the task: the primary task (keep the to-be-recalled items in mind); the response screen and how to properly report the items (click the letters in the serial positions in which they occurred, use the “blank” button to skip a letter in the sequence, and use the “clear” button to reset the response); the secondary task (answer true or false to an arithmetic question); the time limit for responding to the secondary task (and what happens if examinees do not respond in time); and the requirement that examinees maintain at least 85% accuracy on the secondary task for their scores to be considered valid. Working memory tasks therefore often have strong linguistic demands and can be sensitive to modality and language effects (Mohapatra & Laures-Gore, 2021), strategy use (McNamara & Scott, 2001), and a host of other potential abilities and factors, depending on the task.

On the other hand, attention tasks usually have simple instructions such as “look away from the flash on the screen,” “indicate the color of the ink this word is in,” “indicate which direction this central arrow is pointing,” or “did this object change from when you saw it previously?” which place relatively few additional demands on the examinee. Consider the antisaccade task, which is simple to explain, easy to understand, and places virtually no demand on the respondent other than to retain and execute the instruction to look away from the flash. The task usually has just two response options (e.g., “O” or “Q”), but performance can be assessed purely through eye-tracking, therefore requiring no response from the participant other than a shift in eye gaze. Given this simplicity, and the maintenance/disengagement framework from which we are operating here, performance on attention control tasks should be a more direct index of cognitive functioning (Mashburn et al., 2020; Shipstead et al., 2016) and maximally invariant to factors known to exacerbate subgroup differences, such as language skills, accumulated and acculturated knowledge, and strategy use (Bosco et al., 2015; Burgoyne et al., 2021; Kyllonen, 2002). Finally, it is plausible that attention measures would be less likely to induce stereotype threat and testing hesitancy because they do not have the same stigma attached to them as intelligence or memory-based tasks.

Applied psychology (human factors)

Applied psychology, sometimes known as engineering psychology or human factors, uses the methods and knowledge of experimental psychology to discover fundamental knowledge of human capabilities and limitations as they interact with technology. This knowledge is informative in refining theories of human cognition, but also in designing complex systems that are safer and easier to use and, to a lesser extent, in determining selection criteria for operators.

Because the focus of most human factors research is to inform the design of systems and environments around typical human capability and limitations, most studies rely on the experimental approach. In the research that takes an individual differences approach, working memory is one of the most studied predictors of performance in real-world tasks (e.g., Chen & Terrence, 2009), as the need to maintain information for short periods of time while under cognitive load is highly relevant for complex, dynamic task situations. The potential role of attention control, as we have defined it, has received less interest (see Footnote 6). Despite the lack of research examining the potential importance of the psychometric construct of attention control in human factors, we submit that (1) extant positive relationships between working memory and complex task performance are due more so to attentional factors than memory, and (2) more variance in complex performance would be predicted if future studies examined attention control over working memory capacity.

In the following sections, we review the relatively few notable studies linking working memory to different kinds of complex task performance in various domains, and then we elaborate further on the hypothesized role of attention control.

Use of automation and autonomous systems

Automation occurs when a machine carries out a task that was once performed by a human. For example, the blind spot warning systems in vehicles monitor areas that are difficult or impossible for the driver to view. Without this system, the driver would incur the additional workload of maintaining a higher level of alertness. Another form of automation is the advanced decision support system, which can help users with complex decision-making tasks by integrating and summarizing information; examples include tools that help consumers choose the best insurance plan based on a multitude of factors. Prior research has shown that there are strong individual differences in people’s ability to successfully use automation (de Visser et al., 2010). A key goal in some recent work has been to try to understand the factors that contribute to these individual differences.

Working memory capacity has been identified as one of these important factors (Ahmed et al., 2014; Chen & Terrence, 2009; de Visser et al., 2010; McKendrick et al., 2014; Pak et al., 2017; Parasuraman et al., 2012; Rovira et al., 2017; Saqer & Parasuraman, 2014). In these kinds of studies, participants are usually paired with automation to carry out a workload intensive task, such as looking for targets in a simulated battlefield or calculating the best targets based on many conditions. The automated systems in these studies vary in their reliability, or the extent to which they are correct, and so human intervention is sometimes needed to avoid error. De Visser et al. (2010) found that working memory capacity strongly predicted performance on an automation task involving unmanned aerial vehicle targeting, and the authors reasoned that this relationship was because individuals with higher working memory capacity can better detect when the automation fails and then carry out the task manually. Conversely, individuals with lower working memory capacity are less able to detect automation errors and thus less likely to attempt to override the system, and, even if they do, are less capable of manually carrying out the task. In another study, Rovira et al. (2017) had participants engage in a targeting task where they had to determine how to pair simulated battalion units to targets. The optimal targets were based on several criteria (e.g., proximity to self, proximity to headquarters, value). The task, if carried out manually, was workload intensive but participants were given automation that could help them make the decision in different ways (i.e., simple integration support or complex decision-making support). They found that those with higher working memory capacity benefited from any kind of automation support, whether it was reliable or unreliable. However, individuals with lower working memory levels benefited only from reliable automation. 
Finally, converging evidence from neurogenetics has shown that individuals who possess genes thought to be related to worse working memory functioning showed greater impairments in performance when using unreliable automation (consistent with Rovira et al., 2017) than those who had genes associated with higher working memory capacity (Parasuraman et al., 2012).

However, other studies have failed to find a relationship between automation performance and working memory capacity. For example, Pak et al. (2017) tried to replicate the role of working memory capacity in automation performance in older (aged 60+) individuals. Participants interacted with an automated system of varying reliability and performed the operation span task. Contrary to previous findings with younger adults, they found that when the automation failed, working memory capacity was not associated with performance; older adults with higher working memory capacity performed as poorly as those with lower working memory capacity. These discrepant findings may simply be due to methodological differences between the studies (e.g., use of a single task to measure working memory capacity), but another possibility raised by Pak et al. is that individual differences in automation performance are influenced more by individual differences in disengagement than maintenance or other memory factors. More specifically, Pak et al. argued that a mechanism related to the attentional aspects of working memory, perhaps the ability to suppress irrelevant information or consciously direct attention, is more age sensitive. When interacting with automation, the automatic response is to follow the automation’s recommendation or allow it to take over (Mosier & Skitka, 1999). However, when automation fails, the operator must first detect that the automation has failed and then manually carry out the task. Detecting the automation failure, and then inhibiting an automatic response (i.e., proceeding with the automation’s recommendation), likely requires a certain degree of engagement and attention control. This notion is supported by a recent correlational study that found antisaccade performance was moderately correlated (r = .32) with the ability to detect when automation has failed (Foroughi et al., 2019). 
Clearly, further research is needed to clarify the roles of both working memory capacity and attention control in automation, and we would expect this research to reveal that attention control strongly predicts how individuals interact with automation.

Complex task performance

In complex and dynamic task environments, such as aviation, surface transportation, and healthcare, people must keep track of rapidly changing conditions to determine the best course of action. Air traffic control operators must maintain high levels of vigilance while communicating with pilots, while pilots must monitor instruments, the external environment, and aircraft controls. Similarly, the same processes are crucial for maintaining driving performance and navigating obstacles to reach one’s destination (Johannsdottir & Herdman, 2010). In these and many other complex and dynamic environments (e.g., healthcare; Gaba et al., 1995), performance is critically dependent on the operator’s ability to perceive, comprehend, and use information from the external environment to project or predict future states.

In human factors, a concept often implicated as being critical for performance in complex and dynamic environments is that of situation awareness, which comprises three processes: perception, comprehension, and projection (Endsley, 1995). Operators are said to have higher levels of situation awareness depending on how successfully they perceive their environment (Level 1), comprehend and understand these perceptions (Level 2), and anticipate and project the environment’s future state (Level 3). Unsurprisingly, individual differences in cognition are expected to affect the probability of successfully achieving different levels of situation awareness. Perceptual and attentional abilities are primarily thought to apply to Level 1 situation awareness; long-term memory retrieval is thought to affect Level 2 (comprehension); and Level 3 is thought to be sensitive to working memory capacity as it requires holding and processing a large amount of information simultaneously to calculate or predict the near future of the system (Endsley, 1995). Working memory (specifically the storage/retrieval component; Durso & Gronlund, 1999) has been called a bottleneck in the development of situation awareness (Gonzalez & Wimisberg, 2007), and thus has elevated prominence in the literature, whereas attention is typically thought to primarily influence Level 1 (Endsley, 1995). Traditionally, situation awareness is measured via memory probes; for example, a pilot may be asked to recall the configuration of warning lights as it appeared just before the display was obscured, or to report where aircraft were located on a display just before the probe. For this reason, many studies that examined and found a link between situation awareness and working memory have used storage as an explanation for the association (e.g., Johannsdottir & Herdman, 2010). 
It has therefore been difficult to localize whether situation awareness is associated primarily with memory components of working memory, or the ability to control attention more generally.

Gutzwiller and Clegg (2013) wanted to resolve some of these problems to more precisely measure the relationship between working memory and situation awareness (specifically, Levels 1 and 3). In their study, participants engaged in a firefighting management simulation where their task was to assign and dispatch fire engines to various fires. The dynamic simulation required participants to track the truck locations, wind direction, fire locations, and water refill areas. Level 1 situation awareness (perception of cues) was assessed by asking participants to report the location of current fires, truck locations, and wind direction. Although Level 1 situation awareness is thought to rely on perception, this method of measurement of situation awareness is memory-reliant. Level 3 situation awareness was measured as performance in predicting the location of future fires. Predictive performance was measured more indirectly (and in a less memory-reliant manner) by examining how often and for how long participants viewed critical fire engines (those that were relevant to a future, preprogrammed fire), and how long those trucks remained idle. Finally, participants completed various complex span tasks. The researchers found that working memory capacity was related to Level 3 situation awareness but not Level 1. The explanation for the relationship, according to situation awareness theory, is that higher working memory capacity enhances decision-making. This study demonstrated a clear relationship between working memory and situation awareness (see also Durso et al., 2006). However, the direct relationship between situation awareness and attention control was not tested. Indeed, most research into the cognitive components of situation awareness has focused on the role of working memory (e.g., Sohn & Doane, 2004). We would argue that, given the nature of situation awareness, especially Level 3, the ability to control attention would be more predictive than working memory capacity.

Other research has attempted to more directly link performance in dynamic situations with cognitive abilities, bypassing the concept of situation awareness. The hypothesis that working memory capacity plays an important role in flight performance has been supported by Wang et al. (2018) regarding pilot training success and Lopez et al. (2012) for predicting flight performance of expert pilots. The Lopez et al. study is particularly interesting because it is one of the rare studies to include direct measures of both working memory capacity and attention control. In their study, they had United States military pilots perform simulated flights during a period of 35 continuous hours of sleep deprivation (N = 10, which is a clear limitation). They then compared how well a computerized aviation test, operation span, and the psychomotor vigilance task (an intentionally boring and nonengaging measure of sustained attention) predicted performance in the flight simulator. While none of the expert pilots crashed the plane in the simulated flights, there were individual differences in performance levels among the pilots in the last 15 hours of sleep deprivation. They found that operation span and the psychomotor vigilance task were much more predictive of flight performance than the computerized aviation test, accounting for an impressive 58% of the total variance in flight performance in the sleep-deprived pilots. Of note is that both measures predicted unique variance in flight performance (15% for psychomotor vigilance; 11% for operation span). Lopez et al. concluded:

Performance of highly experienced and skilled pilots on the [sic] one of the most complicated and sophisticated flight simulators available was predicted by two simple activities: (a) recalling a string of letters while doing simple arithmetic and (b) noticing a change in a visual display. (p. 32)

With a larger sample and optimal measures (ideally, multiple measures for each construct), we would expect attention control to explain much more variance in these sorts of studies than working memory capacity.
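The variance figures reported by Lopez et al. (2012) can be unpacked with simple arithmetic. In a regression with two predictors, the total variance explained equals each predictor's unique contribution plus the portion the two predictors share. The sketch below assumes that the 58% figure refers to operation span and the psychomotor vigilance task taken jointly, as the description above suggests; the function name is hypothetical.

```python
def shared_variance_pct(total_pct: int, unique_a_pct: int, unique_b_pct: int) -> int:
    """Percentage points of explained variance that two predictors have in common.

    Uses the two-predictor partition: total = unique_a + unique_b + shared.
    """
    return total_pct - unique_a_pct - unique_b_pct


# Figures quoted in the text for the sleep-deprived pilots.
TOTAL_EXPLAINED = 58      # both tasks together (assumed reading of the 58% figure)
UNIQUE_VIGILANCE = 15     # psychomotor vigilance task
UNIQUE_OPERATION_SPAN = 11

shared = shared_variance_pct(TOTAL_EXPLAINED, UNIQUE_VIGILANCE, UNIQUE_OPERATION_SPAN)
print(f"Variance shared by the two tasks: {shared}%")  # prints 32%
```

On this reading, 32 of the 58 percentage points are common to the two tasks rather than unique to either one, although a sample of 10 pilots is far too small to draw firm conclusions from the split.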

Driving a vehicle on the road is one of the most common and accessible dynamic multitasking situations. A good deal of research has documented associations between working memory capacity and aspects of driving performance (e.g., Louie & Mouloua, 2019; Ross et al., 2015; Wood et al., 2016). Specifically, Louie and Mouloua (2019) examined how distraction in a naturalistic driving scenario was associated with working memory capacity. They found that working memory capacity seemed to moderate the effects of distraction on driving performance, but the authors stopped short of presenting a causal mechanism for explaining what aspect of working memory might be related to driving performance. Other researchers have more directly argued that individual differences in attention are important for predicting driving behavior. For example, Mäntylä et al. (2009) suggested that underdeveloped prefrontal-mediated executive control functions among teenagers might explain their relatively high crash rates compared with adults. They had high school students complete six tasks measuring the executive functions described by Miyake et al. (2000; updating, mental set shifting, and inhibition), and the participants were then placed into a low-fidelity driving simulator. They found that only updating, which included an n-back task, predicted driving performance (lane-keeping stability), which is unsurprising given that shifting and inhibition were measured primarily with difference scores and using tasks that have consistently been shown to be poor for correlational purposes (e.g., Draheim et al., 2019; Friedman & Miyake, 2017; Hedge et al., 2018; Paap & Sawi, 2016). In another study, Wood et al. (2016) theorized that failures in goal maintenance and inability to inhibit distraction may affect the ability to detect hazards while driving. 
They had participants perform a hazard perception task and a secondary (distraction) task and found that those who performed worse on operation span were unable to maintain performance on the hazard perception task and self-reported more attention lapses. They attributed their findings to individual differences in attention control.

Finally, another important area of research concerns the factors that contribute to individuals making mistakes in dynamic and fast-paced environments, such as some workplaces. It has been shown that distractions in the form of interruptions have deleterious effects on postinterruption performance in workplace environments (e.g., Foroughi et al., 2014) and at great human cost: for example, interruptions are thought to be a direct cause of many fatal medical errors and aviation disasters (e.g., Anthony et al., 2010; Latorella, 1996; Trbovich et al., 2010; Weick, 1990; Westbrook et al., 2018). An important topic in this area is what role cognition has in both susceptibility to interruptions and ability to recover after interruptions occur. Attention control has been linked to resistance to interruption in work environments in a few studies (e.g., Tams et al., 2015, and a preregistered study, Mirhoseini et al., 2020), but more research has been devoted to the role of working memory capacity (e.g., Foroughi, Barragán, & Boehm-Davis, 2016a; Foroughi, Malihi, & Boehm-Davis, 2016b; Foroughi, Werner, et al., 2016c; T. Gillie & Broadbent, 1989; Westbrook et al., 2018). These studies found support for the idea that working memory capacity has a protective effect against interruptions. For example, Foroughi, Barragán, and Boehm-Davis (2016a) found that working memory capacity accounted for about 12% of the variance in errors in manual data entry following an interruption. While the short-term storage of information may be an important aspect of task resumption, it is more likely that the management of attention and place keeping would play a crucial role and explain more variance in performance following interruptions.

To summarize, we believe there is ample opportunity in the field of human factors to better understand the relationship between attention control and performance in complex and dynamic settings. The use of traditional complex span measures of working memory has shed light on the importance of the cognitive mechanism of maintenance but has limited the conclusions that can be drawn about the role of disengagement in complex task performance. As we have argued, we believe that attention control will be a better, more parsimonious predictor of performance even for complex tasks.

Performance in sports

Highly skilled sports performance is often assumed to be automatic (i.e., System 1 processing; Kahneman, 2011) through well-learned and overpracticed behavior and thus relatively free from the constraints of higher-order cognition (J. S. B. T. Evans & Stanovich, 2013). Furley and colleagues have instead argued that sports performance is dependent on several processes such as planning, imagery, tactical decision-making, skill acquisition, and performing under pressure, all of which are fundamentally affected by individual differences in working memory capacity, notably the attentional aspects more so than memory (Furley & Memmert, 2010; Furley & Wood, 2016; Wood et al., 2016). Supporting the role of working memory in motor movements, Buszard et al. (2016) showed that working memory differences were associated with differential strategy use in learning a novel tennis task. Specifically, those with larger working memory capacity tended to use a verbal-analytical strategy that aided in learning the novel motor task.

But even after a movement is well-learned, working memory and attention seem to play a role in athletic performance. Within the field of sports psychology, researchers have observed the phenomenon of “choking under pressure,” which is when an athlete performs worse than usual due to some stressor or pressure (e.g., time pressure, prospect of evaluation/observation, pivotal point in the match, or a high-stakes game). The effect is thought to be due to stress and anxiety that cause intrusive thoughts that disrupt performance (Baumeister, 1984), but it has also been explained by athletes devoting cognitive resources (i.e., attending) to previously automatized skills (Masters & Maxwell, 2008). Thus, if motor performance is reliant on working memory, disruptive activities that deplete working memory capacity should affect sports performance. It also leads to the ironic consequence that those with higher working memory capacity exhibit greater performance decrements when under pressure compared with individuals with lower working memory capacity (Beilock & Carr, 2005; Buszard et al., 2013; Sattizahn et al., 2016). This is likely because individuals with better cognitive functioning are usually able to devote additional cognitive resources to other activities that aid in their performance, but when they choke under pressure those additional cognitive resources are diverted to skills and actions that were previously automatic, thus bringing their performance levels down to the levels of individuals with lower cognitive functioning. Several studies have suggested that choking under pressure is primarily an attentional phenomenon. For example, Englert and Oudejans (2014) studied 53 semiprofessional tennis players and found that the relationship between anxiety and choking under pressure was fully explained by self-reported distraction. 
Choking under pressure is also observed in academic settings, and attentional factors have been proposed as the reason that individuals with anxiety perform worse in academic testing situations (see Beilock, 2007).

Working memory and attention are thought to play an important role in performing during fast-paced team sports as well. Athletes in team settings must divide cognitive resources among multiple relevant aspects of the game, ignore distractions, and manage their most immediate goals. Such tactical decision-making has been proposed to rely on the ability to control attention. To test this, Furley and Memmert (2012) took still images in basketball games where a player was holding a ball and had basketball experts rate the best next course of action for the player (i.e., to whom to pass the ball) in each of the images. The participants (basketball players) then viewed each image and were asked to determine the best course of action (shoot, dribble, or pass). During the task, distracting auditory messages were presented through headphones. They found that players who performed worse on counting span were also worse at determining the best action in the task. The authors concluded that this demonstrated the importance of the enhanced ability to filter out distracting information in a complex sport task. In their second experiment, the authors sought to test how enhanced ability to select among competing response options would influence tactical decision-making. Pictures were taken from hockey games where a player was holding the puck in an offensive situation with many decision options to take next. As before, hockey experts rated each image and determined the best decision for each picture. The participants then viewed the images and had to quickly decide to shoot, pass, or make a solo effort (i.e., stickhandle). For some of the trials, a time-out was simulated by preceding the hockey image with tactical information about the upcoming image/play (a recommendation for which decision to take). Critically, the recommendation was valid only two-thirds of the time. They found that individuals who performed better on operation span were also better able to adapt their decisions to new information (correctly managing competing response options), whereas those who scored poorly on operation span tended to blindly follow the time-out advice regardless of whether it was the correct course of action. In addition, when the time-out information suggested an incorrect course of action, participants who had high operation span scores were less likely to blindly follow the erroneous advice.

To summarize, several studies have reported that the ability to resist distraction, resist conflict, and accommodate new information can translate to better sports performance, which is consistent with the idea of goal maintenance and distractor disengagement (Shipstead et al., 2016). However, studies in this area often use complex span tasks when their hypotheses are based on the control of attention, often use only a single measure to index cognitive ability, and often rely on extreme-groups designs. The field of sports psychology is therefore a prime candidate to benefit from the recent advances in understanding and measuring individual differences in attention control, which would help determine to what degree sports performance is influenced by attention-specific factors as opposed to limitations in working memory and storage. In most cases, we would anticipate attention control to be more predictive of athletic performance.

Police decision-making

Working memory and other cognitive variables are likely to play a role in many dimensions of criminal justice such as policing (e.g., Kleider et al., 2009), interrogation (e.g., Maldonado et al., 2018), jury decision-making (e.g., Goldinger et al., 2003), and even sentencing (e.g., Moore et al., 2008). Relevant, given the current American milieu, is when police officers must decide whether to use lethal force on a suspect. The decision to use a firearm is extremely time-limited and stressful, with multiple cues that can inform or distract. An insight by Kleider and Parrott (2009) was that most research had focused on situational cues to explain shoot/no-shoot behaviors rather than dispositional or cognitive factors. They theorized that individual differences in the ability to control attention (resist automatic responses to shoot) might be an important predictor of tendencies to shoot civilians who turned out to be unarmed and little threat to the officer. To test this hypothesis, Kleider and Parrott showed college students pictures of a male holding either a gun or a neutral object and asked them to make shoot/no-shoot decisions by pressing a key on a keyboard. They found that participants with lower operation span scores were more likely to shoot when the targets were holding neutral objects. This supported their hypothesis, but it again must be noted that they used a (single) working memory task whereas the hypothesis was that attentional mechanisms were responsible. Subsequent research has also shown that in a similar speeded computer-based shooting task, individuals with higher working memory capacity are more sensitive in detecting a weapon and are better able to adjust their response criterion to changing conditions compared with those with lower working memory capacity (Brewer et al., 2016).
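The terms sensitivity and response criterion in the Brewer et al. (2016) finding come from signal detection theory. As a minimal sketch, assuming the standard equal-variance Gaussian model (the studies above may have computed these indices differently), sensitivity (d′) and criterion (c) can be derived from hit and false-alarm counts in a shoot/no-shoot task. The function name, the log-linear correction, and the example counts are illustrative assumptions, not values from any cited study.

```python
from statistics import NormalDist


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    A "hit" is shooting an armed target; a "false alarm" is shooting an
    unarmed target. The log-linear correction (add 0.5 to each count) keeps
    the rates away from 0 and 1, where the z-transform is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # negative = bias to shoot
    return d_prime, criterion


# Hypothetical counts for two participants (50 armed, 50 unarmed trials each).
d_a, c_a = sdt_indices(hits=45, misses=5, false_alarms=5, correct_rejections=45)
d_b, c_b = sdt_indices(hits=45, misses=5, false_alarms=20, correct_rejections=30)
print(f"Participant A: d' = {d_a:.2f}, c = {c_a:.2f}")
print(f"Participant B: d' = {d_b:.2f}, c = {c_b:.2f}")
```

In this illustration both participants shoot armed targets equally often, but Participant B also shoots more unarmed targets, which shows up as both lower sensitivity and a more liberal (negative) criterion.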

Wood et al. (2016) noted that the bulk of the research in police decision-making was conducted in laboratory conditions using abstract experimental tasks, and so they tested whether findings could be replicated in a more complex and realistic situation in which such abilities would seem to be important: the decision to shoot while in a highly stressful situation. In their study, participants with either high or low working memory capacity were presented with a Stroop-like task embedded within a targeting task. Participants were presented with a color word along with four colored bullseye targets, one in each corner of the screen. Their task was to use a toy gun to shoot the bullseye whose color matched the centrally presented word while ignoring the color the word was printed in. The researchers also manipulated the level of anxiety (to simulate a more threatening environment) by falsely informing some participants that their performance would be shared with everyone in the study. In addition, those in the high-threat condition were told that if they made a mistake (i.e., shot the wrong target), they would be shot with the toy gun by the experimenter. The researchers found that the higher working memory capacity group had better shooting accuracy than the low working memory capacity group (in both congruent and incongruent trials) but also that the low working memory capacity group’s performance was more negatively affected by anxiety than that of the high working memory capacity group.

The studies described above support the notion that controlled attention is crucial in shooting behavior (for a review, see Kleider-Offutt et al., 2016). Currently, however, most studies have been carried out in laboratory settings and with participants who were not police officers. In addition, virtually all studies have used various measures of working memory capacity to examine the link between attention control and shooting decisions. Just as with sports psychology, this provides an interesting opportunity for future research to use more refined and specific measures of attention control. Based on the laboratory studies discussed, the split-second decision to shoot or not shoot seems to depend on the shooter maintaining their primary goal and disengaging from distracting cues, and less on storage components of working memory capacity.

Concluding remarks

This article began with a description of the constructs of working memory capacity and attention control and the importance of both in explaining real-world behavior. Decades of research have established that individual differences in working memory capacity are clearly important for explaining real-world phenomena, but we believe it is time for applied researchers to devote more resources to, and focus more on, the role of attention control, which, in many cases, mediates the relationship between the task or ability being studied and working memory capacity. Future research will hopefully uncover which cognitive behaviors and real-world phenomena turn out to be more driven by attention control than working memory and, equally important, whether memory-specific factors can account for some phenomena above and beyond the role of attention.

Importantly, the present review was not exhaustive and covered only a few select areas within applied research. We believe that many other areas and topics not covered here that invoke working memory as an explanatory ability could also benefit substantially from reorienting to attention control explanations. For example, even after accounting for personality and demographic factors, individual differences in working memory and verbal memory have been shown to predict individual differences in COVID-19 social-distancing behaviors (O'Shea et al., 2021; Xie et al., 2020) and vaccine hesitancy (Batty et al., 2021). Of note is that Xie et al. (2020) used nonselective visual arrays to assess working memory capacity, and so it is an open question as to whether their results could be attributed more so to working memory capacity or attention control.

We also highlighted recent developments in the understanding of the nature of attention control which motivated this shift in emphasis, as well as recent developments on the measurement of attention which can facilitate the shift. We conclude by offering recommendations for investigators interested in assessing the role of attention in their own research.

General recommendations and best practices for conducting individual differences research

Note that most of these are broadly applicable to any individual differences pursuit, although they are particularly relevant for correlational research on attention control (for similar and partially overlapping recommendations for assessing intelligence in the laboratory, see Ackerman & Hambrick, 2020).

  1. Maximize between-subjects variance (and therefore, reliability).

     a. Use measures with large effect sizes (e.g., Rouder et al., 2019).

        i. A large effect size means more variance. Studies on individual differences in attention control often fail because some reaction time measures have effect sizes on the order of 10–50 ms (see Earles et al., 1997).

     b. Sample from a broad population and consider corrections for restriction of range (Ackerman & Hambrick, 2020; Wiberg & Sundström, 2009).

        i. Many individual differences studies produce weak or null associations because the population is heavily restricted in range of abilities on or related to the construct of interest (e.g., Tsukahara & Engle, 2021). Conversely, in some cases it may be that restriction of range overestimates true effect sizes (see Wiseman, 1967). It is therefore important to sample from as broad a range as possible, ideally in terms of ability as well as background and demographic factors.

     c. Match difficulty with sample ability (e.g., McBee, 2010).

        i. Floor and ceiling effects restrict variance and therefore result in attenuated correlations. Ensure that tasks are sufficiently difficult for your population to bring their performance below ceiling, but not so difficult that many are at chance/floor performance.

     d. Avoid difference scores (e.g., Cronbach & Furby, 1970; Draheim et al., 2019; Hedge et al., 2018; Paap & Sawi, 2016).

        i. Difference scores, particularly those in attention measures such as Stroop, Simon, and flanker, tend to be low in effect size, low in reliability, and result in weak correlations. Only use difference scores if necessary and/or you are confident that effect sizes and reliability are sufficient. Even then, keep in mind that performance in the baseline condition still likely involves construct-relevant variance, and so taking a difference between two conditions will remove variance of interest and further attenuate correlations.

     e. Administer enough trials to enough participants (see Rouder & Haaf, 2019; Rouder et al., 2019).

        i. Although dependent on a number of factors, correlations do not typically stabilize until around 200–250 participants (e.g., Schönbrodt & Perugini, 2013). But equally important is how reliable the measures are, which is related to the number of data points (i.e., trials) obtained from each measure. More trials are generally better, but if a measure requires hundreds of trials to achieve sufficient reliability, a concern is whether factors such as fatigue, overpractice, lack of motivation, and so forth will come into play, thus reducing validity. It is therefore important to use paradigms and measures which can reliably assess individual differences with relatively few trials. In tasks such as antisaccade and visual arrays, 50–100 trials are usually adequate. For typical versions of Stroop and flanker, even hundreds of trials may be insufficient to achieve satisfactory reliability (Rouder et al., 2019).

     f. Account for speed–accuracy relationships (Draheim et al., 2019; Hedge et al., in press; Heitz, 2014; Wickelgren, 1977).

        i. Speed and accuracy interact in several ways and can be a significant confound in a study. Depending on the nature of individual differences in speed and accuracy in a particular study, the presence of speed–accuracy relationships can either artificially increase power (and therefore cast doubt on findings) or decrease power. Avoid using solely reaction time or accuracy as the dependent variable if the other is also relevant for performing (e.g., most Stroop tasks). Do not assume that a simple instruction to “perform as quickly and accurately as possible” is sufficient to eliminate individual differences in speed–accuracy emphasis. Consider using integrated speed–accuracy measures (e.g., Hughes et al., 2014; Liesefeld & Janczyk, 2019; Vandierendonck, 2017, 2018, 2021), adaptive procedures which hold constant either accuracy or reaction time (Leek, 2001), nonbehavioral tasks, or tasks in which only either reaction time or accuracy is important for performing.

     g. Avoid extreme groups designs if possible.

        i. Extreme groups designs are still popular because fewer participants are required and they allow for group comparisons (i.e., ANOVA-based tests). However, they have several noteworthy limitations (see Preacher et al., 2005). If possible, use full-range correlational designs and a sufficient number of participants.

  2. Use multiple measures for each underlying ability.

     a. A task is not a construct.

        i. Administering a single task does not mean that the underlying ability/construct has been measured, as performance in any single task is determined by multiple sources, including many construct-irrelevant factors. Do not frame results as though “attention control” or “working memory capacity” have been measured if only antisaccade or operation span were used. Use latent analysis, factor scores, and/or composite scores when possible.

     b. Ensure predictor and outcome variables are adequately measured.

        i. The psychometric properties of outcome measures are often given less consideration than those of the predictor measures. If possible, have multiple outcome measures as well as predictors.

     c. Diversify your construct.

        i. If possible, broadly sample each of your constructs. It is better to have a mix of modalities (e.g., verbal, spatial, auditory) and include different paradigms to measure an ability.

     d. Less is more.

        i. It is better to reliably measure a small number of abilities than to measure an array of them poorly.

  3. Properly report.

     a. Define and operationalize your constructs.

        i. Be clear about what you are measuring, or at least what you think you are measuring. Introduce each concept/ability with a formal definition if appropriate.

     b. Properly report and calculate reliability (Green et al., 2016; Parsons et al., 2019).

        i. Too often, simple reliability estimates are not reported in correlational studies. It is critical to both report reliability estimates when possible and ensure that reliability calculations match how tasks are scored. For example, do not report internal consistency of incongruent trial performance for a Stroop task if the dependent variable is the difference in performance between incongruent and congruent trials.

     c. Report enough information about your measures.

        i. Report enough information for the reader to understand the measure and compare and contrast with similar measures. Not all “antisaccade” or “Stroop” trials are equivalent. This should include number of trials, nature and extent of practice, presence or absence of feedback, response deadlines, testing environment, how the dependent variable was calculated, and so on.

     d. When comparing two correlations, test whether they are statistically different from one another (e.g., Steiger, 1980).

        i. Researchers consistently fail to conduct and report a proper test of whether numerically different correlations are statistically different. Another common mistake is to assume that two correlations are different because one is statistically different from 0 and the other is not. For example, in a typical study, a statistically significant correlation of r = .20 will not be statistically different from a statistically nonsignificant correlation of r = .15. Note that the more independent two correlations are from the third (shared) variable, the larger the difference between the two correlations of interest needs to be to achieve a statistically significant difference.

     e. Do not equate statistical significance with meaningfulness (cf. Ackerman & Hambrick, 2020).

        i. With sufficient sample size, even weak correlations will be significant. What constitutes a meaningful correlation depends on the variables being assessed and theoretical considerations (look to the literature to gauge this). In behavioral individual differences studies of cognitive performance, a correlation under r = .20 (less than 4% of variance accounted for) is not likely to be meaningful, regardless of statistical significance.

     f. When discussing correlations, it is often useful to frame them in terms of variance explained.

        i. Although correlations should be reported, generally the more informative metric is how much variance the correlation represents. Because variance explained is the square of the correlation coefficient, numerical differences between correlations are more meaningful and impactful with larger correlations. As an example, the numerical difference between r = .40 and r = .50 is one-third that of the numerical difference between correlations of r = .00 and r = .30, but both cases represent the same difference in terms of explained variance (9%).

        ii. Note that the coefficient of reliability is a direct estimate of the percentage of true variance (see Hoyt, 1941; Kuder & Richardson, 1937). As a result, a reliability coefficient is the same as the estimated proportion of reliable variance (e.g., a reliability of .50 means that 50% of the variance is estimated to be reliable, and not 25% as one might assume).

     g. When possible, analyze and report how much variance is explained above and beyond other variables.

        i. Similar to accounting for the placebo effect in medical and clinical studies, the utility of a measure is not simply how much variance is accounted for in an outcome, but rather how much incremental variance is accounted for above and beyond other predictors.

  4. Consider opportunity cost and practicality.

     a. Administering tests requires resources, and often significant financial resources if administered for real-world purposes. Choosing which tests to employ requires consideration of more factors than just the amount of variance explained in the outcome variable. For example, if Measure A requires 10 minutes of administration time and correlates with an outcome of interest at r = .60 (36% variance), and Measure B takes an hour to administer and adds just 3% incremental variance explained to the same outcome, it may be better to administer Measure A alone because of the cost in time and money, with minimal sacrifice in validity. But for high-stakes testing and/or if small improvements in prediction can lead to significant cost savings (e.g., selecting personnel for long and expensive training programs; see Held et al., 2014), Measure B may be preferred.

     b. Also consider requirements of administering certain tests. One possible reason that intelligence tests are widely used for selection is that they are often convenient to administer. Many can be easily administered in multiple formats (computerized, adaptive, pencil-and-paper, verbally, mobile, etc.). Most measures of working memory capacity and attention control have limitations, such as requiring computerized testing because of the need for precise presentation timings, accurate recording of response times, controlling for visual angle, and so on.
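To make the recommendation on restriction of range (1b) concrete, the following is a minimal sketch of one classic correction, Thorndike's Case II formula for direct restriction on the predictor. Whether this particular correction is the one the cited works would recommend for a given design is an assumption, and the function name and numbers are purely illustrative. The formula also assumes a linear, homoscedastic relationship, so it should be treated as an estimate rather than a recovered true value.

```python
import math


def correct_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II correction for direct range restriction.

    r_restricted:    correlation observed in the restricted sample
    sd_unrestricted: predictor SD in the broader population
    sd_restricted:   predictor SD in the restricted sample
    """
    u = sd_unrestricted / sd_restricted  # ratio of unrestricted to restricted SD
    return (r_restricted * u) / math.sqrt(1 + r_restricted**2 * (u**2 - 1))


# Hypothetical numbers: r = .30 observed in a sample whose predictor SD is 10,
# drawn from a population whose predictor SD is 15.
corrected = correct_range_restriction(0.30, 15.0, 10.0)
print(round(corrected, 3))
```

When the two standard deviations are equal, the function returns the observed correlation unchanged, which is a quick sanity check on the implementation.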

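As an illustration of the integrated speed–accuracy measures mentioned in recommendation 1f, here is a minimal sketch of a balanced integration score in the spirit of Liesefeld and Janczyk (2019): standardized proportion correct minus standardized mean reaction time. Standardizing across participants with the sample standard deviation is an assumption made here for simplicity, and the function names and data are hypothetical.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize values using the sample mean and sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def balanced_integration_scores(mean_rts, prop_correct):
    """One score per participant: z(proportion correct) - z(mean RT).

    Higher scores indicate better performance (more accurate and/or faster).
    """
    z_pc = zscores(prop_correct)
    z_rt = zscores(mean_rts)
    return [pc - rt for pc, rt in zip(z_pc, z_rt)]


# Hypothetical data: mean RT (ms) and proportion correct for four participants.
rts = [500, 610, 480, 700]
accuracy = [0.98, 0.97, 0.80, 0.99]
for i, score in enumerate(balanced_integration_scores(rts, accuracy), start=1):
    print(f"Participant {i}: BIS = {score:.2f}")
```

The third hypothetical participant is the fastest but least accurate, and the fourth is the slowest but most accurate; a score based on reaction time or accuracy alone would rank them very differently, which is the confound the recommendation warns about.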

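For recommendation 3d, the sketch below computes Williams's t, one test commonly used to compare two dependent correlations that share a variable. Treating it as the procedure intended by the Steiger (1980) citation is an assumption, and the function name and input values are hypothetical. The statistic is referred to a t distribution with n − 3 degrees of freedom.

```python
import math


def williams_t(r_xy, r_xz, r_yz, n):
    """Williams's t for H0: corr(x, y) == corr(x, z) in one sample of size n.

    x is the shared variable; r_yz is the correlation between the two
    non-shared variables. Returns (t, df) with df = n - 3.
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    r_bar = (r_xy + r_xz) / 2
    numerator = (n - 1) * (1 + r_yz)
    denominator = 2 * ((n - 1) / (n - 3)) * det + r_bar**2 * (1 - r_yz) ** 3
    t = (r_xy - r_xz) * math.sqrt(numerator / denominator)
    return t, n - 3


# Hypothetical case: r = .20 vs. r = .15 with the same outcome, n = 200,
# and the two predictors correlated at .50 with each other.
t_stat, df = williams_t(0.20, 0.15, 0.50, 200)
print(f"t({df}) = {t_stat:.2f}")

# Same comparison, but with uncorrelated predictors: the statistic shrinks,
# so a larger difference would be needed to reach significance.
t_indep, _ = williams_t(0.20, 0.15, 0.00, 200)
print(f"t({df}) = {t_indep:.2f}")
```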
  1. By “successful” we mean the group-level effects are easy to replicate across experiments and labs, performance on them can be altered in predictable ways with manipulation, and studies using these tasks have advanced the theoretical understanding of human cognition. See MacLeod (1991) for a review of reliable experimental effects found using the Stroop paradigm.

  2. The criteria we used to assess the relative strength of each attention task were as follows: (1) reliability (both internal consistency and test–retest), (2) average correlation with all other attention measures, (3) average factor loading when testing all possible tri-indicator combinations of the ten attention tasks, and (4) average correlation with Z-score composites of working memory capacity (three complex span) and fluid intelligence (Raven’s, number series, and letter sets).

  3. Intelligence is also widely used in applied research as both a predictor and an outcome. However, because the scope of the present article is generally limited to the literature using working memory and attention as explanatory or predictor variables for real-world behavior, we will refer to intelligence only when necessary to facilitate the conversation regarding working memory and attention.

  4. While this practice may be questioned due to the body of research establishing the domain-generality of working memory capacity (e.g., Chein et al., 2011; Colom & Shih, 2004; Kane et al., 2004), evidence and utility for domain-specificity have been reported (e.g., Demir et al., 2014; Mackintosh & Bennett, 2003; Shah & Miyake, 1996). T. L. Harrison et al. (2015) noted that while the bulk of the reliable performance variance in working memory tasks reflects a domain-general and unitary underlying ability, roughly one-third of the variance in each task can be attributed to narrower abilities. It should also be noted that short-term memory tasks appear to involve more modality-specific and domain-specific processing than working memory tasks.

  5. Rueda and colleagues (Pozuelos et al., 2019; Rueda et al., 2005; Rueda et al., 2012) have also demonstrated transfer using attention-based training in young children. The generalizability of these studies is difficult to evaluate due to methodological factors. Specifically, Rueda et al. (2005, 2012) had fewer than 20 participants in each of their training groups, and Pozuelos et al. had around 30 participants in each of their three groups (two training, one active control). Pozuelos et al. did report far transfer to a fluid intelligence measure after attention training (but not in the control group). Their effects were larger for the training group that received metacognitive assistance from a trainer in addition to the attention training. However, attention training (without the metacognitive assistance) did not result in transfer to their measures of verbal intelligence, composite intelligence, or working memory.

  6. “Attentional control” is a term widely used in human factors studies (e.g., Chen & Barnes, 2012; Levulis et al., 2018), but in these studies it is usually measured as a self-reported preference for multitasking and other related concepts (Derryberry & Reed, 2002). Our view of attention control is much broader and is more akin (but not completely synonymous) to what is variously known as cognitive control, self-regulation, inhibitory control, or executive functioning. Attentional control is also a term used outside of human factors, and in some contexts it is close to the present definition of attention control.