The objective of the present article is to address the issue of the relationship between working memory (WM) capacity and within-task intraindividual variability (IIV) in reaction time (RT) tasks, investigating whether this relationship might vary across the lifespan. In this introduction, we will briefly present our approach to WM—in conformity with the demand of the editor of this special issue—and then discuss why a potential relationship can be hypothesized between WM and IIV and whether individual differences in WM could be related to IIV.

As a number of researchers interested in developmental and/or individual differences in WM, we consider the performance in WM tasks to be determined by a set of underlying attentional processes. These processes, such as the activation and inhibition of relevant informational units, are considered to become more efficient with age, accounting for part of cognitive development in children, and to vary in efficacy across individuals. We consider WM to be an index of what has frequently, and rather loosely, been labeled general resources. WM capacity, while remaining severely limited, increases with age both because of an increase in neurobiological mechanisms and because the use of resources becomes more efficient with experience. There is here a major point of departure between cognitive experimentalists, on the one hand, and developmentalists and differentialists, on the other hand. The first group is interested in the functioning and underlying processes of WM and suggests the existence of a WM system; the latter researchers are primarily interested in the capacity of WM, its changes with age and across individuals, and its predictive character with respect to other, more complex aspects of cognition. According to that second perspective, the processes and mechanisms underlying WM are not necessarily specific to a WM system. Baddeley and Hitch’s (1974) initial WM model, which proved so fruitful in yielding an enormous amount of experimental studies, is the leading model of the first type of approach, whereas Pascual-Leone’s (e.g., 1987) approach and, more generally, neo-Piagetian models are representatives of the latter perspective. A common ground for these two lines of research and theorizing might not seem easy to identify; it has, however, been suggested that the central executive (CE) component of the Baddeley and Hitch model, which is explicitly defined as an attentional component, provides such possibilities (Baddeley, 1986; Baddeley & Hitch, 1994). 
The further developments proposed by Baddeley (e.g., 1993) in the definition of the CE have brought this component very close to the North American tradition of WM research, as well as to neuropsychological studies of executive functions (see de Ribaupierre, 2000). Furthermore, Baddeley (2000) suggested that the introduction of the episodic buffer provides a better basis for tackling the more complex aspects of executive control in WM. Nevertheless, Baddeley and Hitch (2000) considered that their model and Pascual-Leone’s are independent views and should continue “developing alongside each other,” but might “in due course reach a common solution” (see also de Ribaupierre & Bailleux, 2000; Pascual-Leone, 2000). We share this vision, but consider that these two approaches can already be combined, at least from an empirical perspective.

The present article falls within the developmental and individual-differences perspective; in particular, we do not view WM as an independent cognitive system, but consider that WM tasks index attentional processes that are also at work in other tasks. Indeed, more specific to a developmental and individual-differences perspective like ours is the proposition that WM reflects the interplay of activation and inhibition mechanisms shared with other cognitive functions, and that differences in WM capacity reflect differences in those mechanisms. These common mechanisms might therefore explain the close relationships that WM entertains with executive functions and with other broad cognitive functions, such as fluid intelligence (Gf). Neo-Piagetians, in particular Pascual-Leone, were precursors not in proposing the existence of a set of attentional processes labeled WM (e.g., Broadbent, 1958), but rather in suggesting that cognitive development is in large part explained by an increase in WM capacity (Case, 1978, 1985; Case, Kurland, & Goldberg, 1982; Dasen & de Ribaupierre, 1987; de Ribaupierre, 1997; de Ribaupierre & Bailleux, 1994, 2000; Pascual-Leone, 1970, 1987, 2000, 2011; Pascual-Leone & Goodman, 1979; Pascual-Leone & Johnson, 2005). Among the neo-Piagetians, Pascual-Leone offered the most detailed account of how attentional capacity, as measured by WM tasks, is determined (e.g., Pascual-Leone, 1987). He suggested that three mechanisms act in combination: (1) M power, a mechanism boosting the activation weight of a number of informational units (chunks or schemes); (2) an inhibition mechanism (I) that serves to deactivate, or diminish, the saliency of misleading or distracting stimuli; and (3) executive schemes that serve to direct M and I activation (or deactivation, as concerns I) onto the relevant schemes. 
Other mechanisms have also been postulated in the complete and complex picture that Pascual-Leone has given of cognitive development, but the interplay of these three mechanisms is the crucial one from a developmental and differential point of view. All three mechanisms change with development in terms of both size and efficacy, with executive schemes being more closely dependent on previous experience than are M and I, which rely on neurobiological substrates. Individual differences, within a given developmental level, are to be explained mainly by differences in inhibition and executive schemes.

Among the various models that have been proposed more recently in cognitive psychology, Cowan’s (2005) model is in our view very close to those neo-Piagetian models, as well as to the perspective that we adopt in the present article. For Cowan (2005), WM is not a system structurally different from long-term memory. It contains a small amount of information that must be kept available for use in cognitive tasks, consisting of both automatically and attentionally activated information, the latter of which is in the focus of attention. Cowan (2001) has suggested, on the basis of a number of task analyses, that the upper limit of the focus of attention is relatively invariant across situations (a proposal very close to the neo-Piagetian one) and amounts to three or four items in adults (vs. a maximum of seven independent items in Pascual-Leone’s theory, and four complex units in Case’s model). For Cowan et al. (2005), measures of this capacity correlate well with intelligence and aptitude tests, and the size of the focus of attention increases with age.

From a differential perspective, Engle and his collaborators have empirically shown for a number of years that high-span individuals, as assessed by WM tasks, differ from low-span individuals in terms of attentional resources, in particular as concerns their efficacy in inhibiting irrelevant information and maintaining the task goals in mind (Conway, 1996; Conway & Engle, 1994; Engle, Cantor, & Carullo, 1992; Engle, Conway, Tuholski, & Shisler, 1995). Engle and colleagues have also shown, in young adults, that WM tasks are general, rather than specific, in nature and correlate with—or probably even determine—performance in terms of other cognitive functions, such as executive functions or fluid intelligence (Conway, Kane, & Engle, 2003; Engle, 2002; Engle, Kane, & Tuholski, 1999).

Lifespan and cognitive-aging researchers have also emphasized the important role of WM in older adulthood. In all studies, older adults present lower WM capacity than do young adults (Park et al., 2002; Salthouse, 1990, 1992, 1994, 2010); this shows particularly in complex, and/or nonverbal WM tasks. To our knowledge, no detailed account of how and why WM capacity decreases with age in adulthood has been presented, which would provide a counterpart to the neo-Piagetian models of the growth of WM during childhood. Hasher and colleagues (Hasher & Zacks, 1988; Stoltzfus, Hasher, & Zacks, 1996) have suggested that a decrease of the efficacy of inhibition might be responsible for lower WM performance. They have provided abundant evidence of less efficient inhibition in older adults, but have not empirically demonstrated the link between WM and inhibition. WM is also related to other cognitive functions in older adulthood, in particular with processing speed, known to be one of the most universal functions that declines with age. For Salthouse, for instance, WM is related to both processing speed and fluid intelligence (Kail & Salthouse, 1994; Salthouse & Pink, 2008; but see also Fry & Hale, 1996, 2000). Yet, once processing speed is controlled, age differences in WM are greatly, if not totally, decreased (Salthouse, 1996, 2010). Similarly, age differences across the lifespan (in children, young adults, and older adults) in WM tasks have been shown to be largely accounted for by age differences in processing speed and inhibition (Demetriou et al., 2013; de Ribaupierre, 2002; de Ribaupierre, Fagot, & Lecerf, 2011).

The empirical link between WM and processing speed therefore seems well established, at least with respect to age differences. The studies that have best demonstrated this link are those that used relatively complex processing-speed tasks, such as Salthouse’s (1992) tasks, in which time is limited and the number of correct items is counted. Fewer studies have relied on RTs in experimental cognitive tasks and related them to performance in psychometric tasks. There have been very few theoretical explanations for this link. The relation between WM and processing speed might depend on a common cause, which may have a biological substrate (Baltes, Staudinger, & Lindenberger, 1999). It can also be theorized that speed and activation power, as the latter has been proposed by Pascual-Leone, are two sides of the same coin, as we have suggested elsewhere (e.g., de Ribaupierre, 2000; de Ribaupierre et al., 2011): Faster processing allows for more items to be maintained/processed in WM during a short time frame, leading to a larger span that can be measured in a WM task; thus, an increase in speed, due for instance to more cerebral connections, would drive span. Conversely, a larger WM capacity, which itself is probably due to a larger number of simultaneously activated connections, allows more items to be processed simultaneously, leading to a faster RT. In this case, span accounts for speed. It is, at least currently, not possible to disentangle this relation and establish which is the cause and which is the effect (more activation or faster speed of activation).

All of the studies presented above focused on “average” performance; that is, for each participant, the mean RTs and/or mean number of correct responses were recorded. However, several experiments have shown more recently that intraindividual variability (IIV) in RT or accuracy tasks can provide important information about cognitive functioning. Following this line of research, the present article focuses on IIV in processing speed (that is, on moment-to-moment fluctuations or oscillations within a given task), which has been demonstrated to constitute a reliable marker of cognitive health. Interest in IIV has greatly increased in recent years and developed in various directions, notably in psychometrics and in aging studies. Thus, Jensen (2006), who has been a long-time proponent of using cognitive speed to understand intelligence, has insisted on the theoretical importance of IIV. For Jensen, IIV is more important, from a theoretical standpoint, than measures of central tendency, and is to be considered as a more basic variable than the median response time, even though both indices are very highly correlated. He argued that this is because the hypothesized cause of IIV is the oscillation of a neural excitatory potential: A slower oscillatory rate (longer period) has been demonstrated to result in a larger standard deviation (IIV). Jensen also elaborated on the advantage of faster rates (i.e., less variance in RTs) in higher cognitive processes; in particular, due to the rapid decay of information, a rapid rate puts less strain on WM capacity, limiting the risk of the breakdown of information, which in turn causes errors.

Higher IIV has indeed been observed in pathological subgroups showing lower cognitive functioning (Bodling, Denney, & Lynch, 2012; Burton, Strauss, Hultsch, Moll, & Hunter, 2006; de Frias, Dixon, Fisher, & Camicioli, 2007; Hultsch, MacDonald, Hunter, Levy-Bencheton, & Strauss, 2000; Murtha, Cismaru, Waechter, & Chertkow, 2002; Shammi, Bosman, & Stuss, 1998; Stuss, Pogue, Buckle, & Bondar, 1994). Numerous studies have focused on older adults, and a few on children. Basically, age differences in IIV present the inverse pattern of the U-curve observed for mean general cognitive functioning and WM across the lifespan, with children and older adults showing greater IIV than do young adults (Li et al., 2004; Williams, Hultsch, Strauss, Hunter, & Tannock, 2005; Williams, Strauss, Hultsch, & Hunter, 2007). For Rabbitt, Osman, Moore, and Stollery (2001), the mean RT is essentially a summary of variability in RTs; they consider that IIV is more basic than mean speed. These authors also observed that greater variability was associated with lower fluid intelligence (Gf). There is still some controversy with respect to age differences in older adults. Note, however, that those studies that have reported a lack of age differences in IIV either have relied on accuracy scores rather than RTs in experimental tasks (e.g., Robertson, Myerson, & Hale, 2006) or have studied variability in a task repeated within a short period of time; in the latter case, IIV was computed on relatively few occasions and/or on the number of correct responses within a limited time rather than on RTs (e.g., Salthouse, 2007; Salthouse & Berish, 2005; Salthouse & Nesselroade, 2010).

The theoretical explanation of greater IIV has not yet been well articulated. As we just mentioned, both Jensen (2006) and Rabbitt (Rabbitt et al., 2001) suggested that variability is more basic than the mean response rate. Individual differences in IIV seem more predictive of other individual differences. This could also be because extreme responses (i.e., a more skewed distribution) better distinguish individuals from one another; the worst-performance rule, according to which intelligence (g) is predicted better by worse-performance than by better-performance trials, goes in the same direction (Coyle, 2003; see also Fernandez, Fagot, Dirk, & de Ribaupierre, 2014). This does not yet explain IIV, but does demonstrate that it should no longer be neglected. Hultsch and colleagues have proposed that IIV reflects fluctuations in attentional processes (Hultsch & MacDonald, 2004; Hultsch, MacDonald, & Dixon, 2002; Hultsch et al., 2000; Hultsch, Strauss, Hunter, & MacDonald, 2008; Strauss, MacDonald, Hunter, Moll, & Hultsch, 2002). This attentional-lapse hypothesis is currently the one that has received the most attention, providing one more reason for examining the relation between IIV and WM.

Because we were interested in comparing age differences in IIV across the lifespan and across various tasks, we conducted a study with children, young adults, and older adults, using a number of RT tasks varying in complexity, as well as WM tasks. In the present article, we address the question of whether WM capacity is related to IIV and/or to the mean in RT tasks, and whether this relation varies over the lifespan. To achieve this goal, two complementary sets of analyses were used: (a) an “extreme-group design,” on the model of a number of studies conducted by Engle and collaborators, in which low-span and high-span individuals are compared, and (b) a multivariate design, in which the relations between RT tasks (mean and IIV in RTs) and WM tasks were analyzed. In the first approach, we distinguished high-span versus low-span participants within each age period (childhood, young adulthood, and older adulthood) on the basis of two WM tasks. Specifically, we were interested in determining whether high-span individuals would be less variable than low-span individuals. Indeed, if both WM capacity and IIV depend on attentional resources, one would expect that those individuals with a larger WM capacity would present fewer attentional lapses; they would then be less variable from trial to trial. In other words, one would expect high-span individuals to present a smaller IIV in RT tasks. This would also account for age differences in IIV. However, because RT tasks are often relatively simple, differences between high- and low-span individuals might be found only in more complex tasks, which exert a stronger constraint on attentional resources. In the second approach, using both hierarchical regression analyses and structural equation modeling, we asked whether IIV and mean RT both relate to WM, again within each age period, but this time relying on the total sample. 
The question amounts to asking whether these two indices are partly, or perhaps fully, redundant with respect to variance in WM.

The sample consisted of five age groups: younger children (9–10 years of age), older children (11–12 years), young adults (20–30 years), young-old adults (60–69 years), and old-old adults (70 years and above), to whom two WM tasks and six experimental RT tasks were administered. For both sets of analyses (comparison of high- vs. low-span individuals, or multivariate analyses on the entire sample), IIV in the RT tasks was computed on the basis of the intraindividual standard deviation (iSD), controlled for age group; this control made it possible to focus on individual differences in IIV within each age group, independently of the large age differences in RT tasks (Hultsch et al., 2002; see the Method section). We hypothesized that low-span individuals would be more variable than high-span individuals in the RT tasks, but that the difference would be more marked in the children and older adults than in the young adults, and in the more difficult conditions (e.g., choice RT tasks) than in the easier condition (simple RT task). The multivariate analyses, in which all of the tasks were analyzed together, were more exploratory. The question, in this type of analysis, is not the magnitude of IIV, but rather how the three sets of variables organize the individuals, and whether this organization varies with age.

Method

Participants

The sample consisted of 100 children from 9 to 10 years of age (mean 9.5 years ± 0.50; 38 females, 62 males), 101 children from 11 to 12 (mean 11.50 years ± 0.50; 54 females, 47 males), 137 young adults from 18 to 30 (mean 21.71 years ± 2.53; 117 females, 20 males), 117 young-old adults from 59 to 69 (mean 64.82 years ± 2.68; 90 females, 27 males), and 102 old-old adults from 70 to 89 (mean 76.15 years ± 4.65; 75 females, 27 males). The total sample thus consisted of 557 participants. Because multivariate analyses were also conducted, only the individuals who had a complete set of data were retained. The final sample was therefore composed of 539 individuals. The characteristics of this sample are provided in Table 1.

Table 1 Sample characteristics

Children were recruited from urban primary schools in Geneva with the authorization of the Department of Public Instruction of the Canton of Geneva. The young adults were undergraduate psychology students at the University of Geneva, participating for course credit. The older adults were volunteers recruited from the community, either from the University of the Third Age of Geneva or through newspaper and association advertisements for pensioners. All participants were native French speakers or were fluent in French (more than 5 years of practice) and had normal or corrected-to-normal vision.

The study was approved by the ethics committee of the Faculty of Psychology and Educational Sciences of the University of Geneva. All adult participants gave written informed consent, and the older adults received a small amount of money as compensation for their transportation costs.

Material and procedure

Both young and older adults were administered the same tasks in our laboratory during two sessions, one week apart. An additional session was sometimes necessary for the older adults to complete the entire battery of tasks (see Footnote 1). Sessions lasted about 1.5 h. Children were evaluated in a quiet room at school, during school hours, and were administered the same tasks as the adults over four or five sessions lasting about 45 min, distributed over two to three weeks. All tasks were individually administered on a Dell computer using E-Prime (Schneider, Eschman, & Zuccolotto, 2002), in the same order for all participants, to allow for the study of inter- and intraindividual variability.

Processing speed tasks

Six tasks of varying complexity were used to assess speed of processing: a simple reaction time (SRT) task; two choice reaction time tasks—line comparison (LI) and cross-square (CS); two complex visual processing tasks—letter comparison (LC) and digit–symbol (DI); and one task classically used as a measure of resistance to interference—the Stroop color-word (ST) task.

In the SRT task, adapted from Hultsch et al. (2002), participants were asked to press a button as quickly as possible with their preferred hand when a stimulus appeared in one of five different positions on the screen. The stimulus appeared after a fixation point, which had been presented for a variable duration (500, 800, 1,100, 1,400, or 1,700 ms) pseudorandomly distributed across the trials. The CS task is a choice reaction time task also adapted from Hultsch et al. (2002), in which participants were presented with two groups of three crosses, on the left and right parts of the screen, respectively. They had to rapidly detect on which side one of the crosses changed into a square. In the LI task, adapted from Vernon (1987), participants were presented with two vertical lines of different lengths, one on the left part of the screen and the other on the right part of the screen. They had to rapidly decide on which side the longer line was located. In the DI task, adapted from Salthouse (1992), participants had to determine whether a number–symbol pair was similar to a reference matrix presented in the upper part of the screen. The SRT, LI, and CS tasks consisted of 120 items distributed in five blocks of 24 items. The DI task consisted of 144 items distributed into five blocks (four blocks containing 30 trials, and one with 24 trials). In the LC task, adapted from Salthouse and Prill (1987), participants were required to decide whether two series of letters (consisting of six or nine consonants, LC6 or LC9) were identical or not. Three blocks of 20 trials were given for each condition (LC6 and LC9).

For all of these tasks, participants responded using a two-button device (either Yes/No or Left/Right), except on the SRT task, which necessitates only a one-button response. The rate of administration depended on the participant’s response speed. The stimuli remained on the screen until the response was given. As soon as the response was given, the following trial began.

In the ST task (Stroop, 1935), participants were required to name the color in which words or signs were written. A voice-key was used to measure reaction times, and the experimenter rated the response as either correct or wrong. The trials consisted of French color words (“vert,” “jaune,” “rouge,” and “bleu”) and signs (““““, ++++, □□□□, ****) that were presented in green, yellow, red, or blue color. This resulted in three different conditions: neutral (signs printed in any color, STn), congruent (color name printed in the same color, STc), and incongruent (the color name and the presentation color were different, STi). There were 18 blocks of 24 trials, resulting in a total of 432 test trials (144 in each condition). Neutral, congruent, and incongruent items were counterbalanced within each block. In order to compute the short-term reliability of the task, and to reduce the participant’s burden, the Stroop task was administered over two sessions. In the present study, the three conditions are considered measures of processing speed.

Additionally, fluid intelligence (Raven’s Progressive Matrices; Raven, 1938) and crystallized intelligence (Mill Hill Vocabulary Test; Deltour, 1993) were assessed.

Working memory assessment

Two tasks were used for WM assessment: one verbal (the reading span test, RSpan; de Ribaupierre & Bailleux, 1995; de Ribaupierre, Ghisletta, & Lecerf, 2006; Robert, Borella, Fagot, Lecerf, & de Ribaupierre, 2009, adapted from Daneman & Carpenter, 1980) and one visuospatial (the matrices task; de Ribaupierre et al., 2006; Lecerf & de Ribaupierre, 2005). Each task contained two phases: (a) span assessment and (b) administration of ten trials at the span level and ten trials at the span + 1 level. The first phase was used to classify participants as high- or low-span individuals.

In the RSpan task, participants were instructed to read a series of sentences presented sequentially on the computer screen and to decide whether or not each sentence was semantically correct. In parallel, they were asked to memorize the final word of each sentence and to orally recall all of them at the end of the series. Each word was presented only once throughout the task. In the span assessment phase, three trials of different levels of complexity (from two to six sentences) were presented in an ascending procedure. The procedure stopped whenever the participant failed all three trials at a given complexity level. The span level was defined as the last level at which at least two trials were entirely correct. The score used to determine whether individuals were low- or high-span was the total number of correctly recalled words; the maximal score was 60 words. In the second phase, 20 trials were administered: ten at the span level and ten at the span + 1 level. The task was repeated one week later, so that the total number of trials was 40 (20 by span level). The mean number of correctly recalled words for each level of difficulty was used as the score.
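
As an illustration, the span assessment rule described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the scoring program actually used: the function name `assess_span` and the input layout are hypothetical, "failing" a trial is taken to mean that the trial was not entirely correct, words recalled at the level where the procedure stops are assumed to count toward the total, and a span of 0 is returned when no level is passed.

```python
def assess_span(results):
    """Score an ascending span assessment.

    results: dict mapping complexity level -> list of three trials,
             each trial being (n_correct_items, entirely_correct).
    Returns (span_level, total_correct_items).
    Hypothetical helper; the stop and counting rules are assumptions.
    """
    span_level = 0      # assumption: 0 when no level is passed
    total_correct = 0
    for level in sorted(results):
        trials = results[level]
        # Assumption: items recalled at every administered level count.
        total_correct += sum(n for n, _ in trials)
        n_perfect = sum(1 for _, ok in trials if ok)
        if n_perfect >= 2:
            # Last level with at least two entirely correct trials.
            span_level = level
        if n_perfect == 0:
            # All three trials failed: the procedure stops here.
            break
    return span_level, total_correct


example = {
    2: [(2, True), (2, True), (2, True)],
    3: [(3, True), (3, True), (2, False)],
    4: [(3, False), (2, False), (4, True)],
    5: [(3, False), (2, False), (1, False)],   # all failed, stop
    6: [(6, True), (6, True), (6, True)],      # never administered
}
span, total = assess_span(example)
```

With three trials at each level from two to six sentences, a participant who recalls everything obtains 3 × (2 + 3 + 4 + 5 + 6) = 60 words, which matches the maximal score given in the text.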

The matrices task was also composed of two phases. In the span assessment phase, a 5 × 5 grid containing blackened cells was presented to the participants. The distribution of the cells was random, except that obvious and/or compact patterns were avoided. Participants had to recall the position of the black cells and to replace them on a blank grid by touching the screen. Three trials of different complexity level (from two to eight black cells) were presented in an ascending procedure. The procedure stopped whenever the participant failed all three trials at a given complexity level. As for the RSpan task, the span level was defined as the last level at which at least two trials were entirely correct. The score used to determine whether individuals were low- or high-span was the total number of correctly recalled positions; the maximal score was 81 positions. The subsequent phases comprised two experimental conditions, the simple matrices task and the double matrices task. The simple matrices task was similar to the one used for span assessment (i.e., recall the position of the blackened cells). In the double matrices task, words were placed in the cells, and participants had to recall both the words and their positions. No word was repeated throughout the task. Recall was oral for the words, and by touching the screen for the positions. For the simple matrices (positions only), ten trials were administered at the span level and ten at the span + 1 level, and for the double matrices (words and positions) tasks, ten trials were administered at the span + 1 level and ten trials at the span + 2 level. 
The score was the mean number of correctly retained items for each difficulty level, keeping words and positions separate; that is, there were four position scores (the number of correctly retained positions in the simple and in the double matrices task, for each span level) and two word scores (the number of correctly retained words in the double matrices task, at the levels span + 1 and span + 2).

Practice trials were given before each task, and WM performance was recorded by the experimenter on an ad-hoc protocol for both tasks.

Analyses

Data preparation for the processing speed measures (RTs)

Only RTs associated with correct responses were considered. In addition, RTs were trimmed as follows: Extremely fast responses (RTs below 150 ms for the SRT, LI, and CS tasks; 200 ms for ST and AT tasks; and 500 ms for the LC and DI tasks) and extremely slow responses (RTs above 1,000 ms for the SRT task; 1,500 ms for the LI and CS tasks; 2,000 ms for the inhibition tasks; 5,000 ms for the LC task; and 12,000 ms for the DI task) were removed. Cleaning of the data resulted in an average loss of 1.06 % of the data (0.003 %–3.79 %, depending on the task and the age group).
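
The trimming rule can be illustrated with a short Python sketch. The bounds are those given in the text; everything else is an assumption made for illustration: the names `LOWER_MS`, `UPPER_MS`, and `trim_rts` are hypothetical, the Stroop (ST) task is assumed to be one of the "inhibition tasks" with the 2,000-ms upper bound, the AT task is left out because it is not described in this section, and RTs exactly equal to a bound are assumed to be kept.

```python
# Lower and upper bounds in milliseconds, as listed in the text.
LOWER_MS = {"SRT": 150, "LI": 150, "CS": 150, "ST": 200, "LC": 500, "DI": 500}
# Assumption: ST counts as an "inhibition task" (upper bound 2,000 ms).
UPPER_MS = {"SRT": 1000, "LI": 1500, "CS": 1500, "ST": 2000,
            "LC": 5000, "DI": 12000}


def trim_rts(task, trials):
    """Keep RTs of correct responses that fall within the task's bounds.

    task:   task label, e.g. "SRT"
    trials: iterable of (rt_ms, correct) pairs
    Hypothetical helper; boundary values are assumed to be retained.
    """
    low, high = LOWER_MS[task], UPPER_MS[task]
    return [rt for rt, correct in trials if correct and low <= rt <= high]


kept = trim_rts("SRT", [(120, True), (150, True), (400, True),
                        (400, False), (1000, True), (1001, True)])
```

In this example the 120-ms response (too fast), the 1,001-ms response (too slow), and the incorrect response are dropped, so three RTs remain.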

Both the intraindividual mean (iM) and the intraindividual standard deviation (iSD) were computed for each task separately, and for each condition when relevant. iMs and iSDs were then residualized for the age group effect in order to explore the effect of WM abilities on processing speed independently of the group’s level of performance. The data were then standardized and transformed into T-scores.
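
A minimal Python sketch of these steps is given below, under explicit assumptions. Because age group is a categorical variable, residualizing on it amounts to subtracting the age-group mean; the sketch further assumes that standardization is done within each age group, that the sample (n − 1) standard deviation is used, and that T-scores follow the usual convention T = 50 + 10z. The function names are hypothetical.

```python
from statistics import mean, stdev


def i_mean_sd(rts):
    """Intraindividual mean (iM) and standard deviation (iSD) of one
    participant's trimmed RTs in one task or condition."""
    return mean(rts), stdev(rts)


def t_scores_by_group(values, groups):
    """Residualize scores on age group and convert them to T-scores.

    values: one score (iM or iSD) per participant
    groups: age-group label of each participant (hypothetical labels)
    Assumptions: residual = score minus group mean; z-scores use the
    within-group sample SD; each group has at least two participants
    and some variability.
    """
    out = [None] * len(values)
    for g in set(groups):
        idx = [i for i, label in enumerate(groups) if label == g]
        group_vals = [values[i] for i in idx]
        m, s = mean(group_vals), stdev(group_vals)
        for i in idx:
            out[i] = 50 + 10 * (values[i] - m) / s
    return out


i_sds = [400, 500, 600, 800, 1000, 1200]
ages = ["young"] * 3 + ["old"] * 3
t = t_scores_by_group(i_sds, ages)
```

In this toy example the two groups differ greatly in raw variability, yet both yield T-scores of 40, 50, and 60: the age-group difference is removed and only the rank of each individual within his or her group remains.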

Group comparisons

Low/high-span assessment

A composite score, taking into account both verbal and visuospatial components, was computed to assess WM span as follows: The total number of correctly recalled words in the RSpan task (span assessment phase) and the total number of correctly recalled positions in the matrices task (span assessment) were standardized, transformed into T-scores, and then averaged into a composite score. Terciles were computed within each age group for this composite score, to determine low-span (N = 176 across the entire sample), middle (N = 213), and high-span (N = 179) individuals. Among the low-span individuals, only those who were low-span in one of the two WM span tasks and either low or middle in the other span task (as determined by another tercile analysis) were retained for further analysis (see Footnote 2). The same procedure was used with the high-span individuals: Only those who were high in one of the two WM span tasks and either high or middle in the other were kept for the analyses. The final sample consisted of 167 low-span individuals and 159 high-span individuals. The low- and high-span individuals were equivalent with respect to most other characteristics. They differed, nevertheless, on three variables: Low-span individuals presented lower scores in fluid intelligence (Progressive Matrices) and in the vocabulary test (used only in adults) than did high-span individuals; low-span individuals were also significantly younger than the high-span individuals in the children and young-adult groups.
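
The classification just described can be sketched as follows for a single age group. This is an illustrative sketch only: the function names are hypothetical, the two totals are assumed to be standardized within the age group, tercile cut points are taken from `statistics.quantiles` with its default method, and scores equal to the lower cut point are assumed to fall in the low tercile. None of these choices is specified in the text.

```python
from statistics import mean, stdev, quantiles


def t_scores(xs):
    """T-scores (50 + 10z) using the sample standard deviation."""
    m, s = mean(xs), stdev(xs)
    return [50 + 10 * (x - m) / s for x in xs]


def tercile_labels(xs):
    """Label each score 'low', 'middle', or 'high' (assumed cut rule)."""
    low_cut, high_cut = quantiles(xs, n=3)
    return ["low" if x <= low_cut else "high" if x > high_cut else "middle"
            for x in xs]


def retained_label(composite, task_a, task_b):
    """Return 'low' or 'high' if the participant is kept, else None.

    Kept: extreme on the composite, in the same extreme tercile on at
    least one task, and never in the opposite extreme on the other.
    """
    tasks = {task_a, task_b}
    if composite in ("low", "high") and composite in tasks \
            and tasks <= {composite, "middle"}:
        return composite
    return None


def classify_span(rspan_totals, matrices_totals):
    """Classify the participants of one age group."""
    composite = [(a + b) / 2 for a, b in
                 zip(t_scores(rspan_totals), t_scores(matrices_totals))]
    return [retained_label(c, a, b) for c, a, b in
            zip(tercile_labels(composite),
                tercile_labels(rspan_totals),
                tercile_labels(matrices_totals))]


groups = classify_span([10, 20, 30, 40, 50, 60], [15, 25, 35, 45, 55, 65])
```

In this toy group of six, the two lowest and the two highest scorers are retained as low- and high-span, respectively, and the two middle scorers are excluded. A participant who was low on the composite but high on one of the two tasks would also be excluded.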

Analyses of the span effect

Analyses of variance were carried out on (a) the intraindividual mean level of performance (iM) and (b) the intraindividual standard deviation (iSD) in these nine processing-speed conditions, with Age Group (five groups) and WM Span (high and low) as between-subjects factors (2 × 5 design). Because we hypothesized that span effects would be different across the lifespan, planned comparisons testing the span effect in each age group were conducted for each task/condition.

Multivariate analyses (see Footnote 3)

As we mentioned in the introduction, multivariate analyses were carried out to test whether the relations between WM span, on the one hand, and the mean level of speed or the intraindividual variability in speed, on the other hand, were similar in the different age groups. This type of analysis differed from the analysis of the span effect in two ways. First, the entire sample was taken into consideration, rather than only those participants included in two of the three terciles. Second, this was a multivariate, correlational analysis, asking whether the organization of the variables was the same in the different age groups; it did not address the magnitude of IIV. An additional question was whether the mean speed and standard deviation were overlapping or independent in how they accounted for differences in WM performance. Such analyses are rather complex and extensive. Therefore, for reasons of space, only a restricted set of them is presented here. The analyses presented below were all conducted on the T-scores for each task, that is, on scores residualized for age group and standardized by age group. This was in keeping with the general objective of this article, to analyze individual differences within each of the three large age periods considered. Two types of analyses were used.

First, hierarchical regression analyses were conducted on the WM scores, taken as dependent variables. In this analysis, the second phases of the WM tasks were analyzed. The eight WM scores were grouped as follows: (1) reading span (average of two scores), (2) positions in the matrices tasks (average of four scores), and (3) words in the double matrices task (average of two scores). This grouping was supported by correlational and factor analyses (see also the structural equation modeling analyses, below). Two models were compared in which the WM score was regressed on the two RT indicators by an Enter method (using SPSS 22): a model in which the mean RT was entered first and the standard deviation entered second, and a model in which the order was reversed. The R² change was then analyzed.
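The logic of comparing the two entry orders can be sketched as follows. This is a simplified illustration, not the SPSS procedure itself: it assumes a single iM predictor and a single iSD predictor (rather than blocks of several RT scores), uses the closed-form R² for two predictors, and relies on hypothetical function names and made-up T-scores.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r2_changes(wm, im, isd):
    """R² and R² change for both entry orders (iM first vs. iSD first)."""
    r_m, r_s, r_ms = pearson(wm, im), pearson(wm, isd), pearson(im, isd)
    # Closed-form R² of a regression on two predictors.
    r2_full = (r_m**2 + r_s**2 - 2 * r_m * r_s * r_ms) / (1 - r_ms**2)
    return {
        "r2_full": r2_full,
        "iM_first": r_m**2, "delta_iSD": r2_full - r_m**2,
        "iSD_first": r_s**2, "delta_iM": r2_full - r_s**2,
    }

# Hypothetical T-scores for eight participants.
wm = [60, 55, 52, 50, 48, 45, 40, 58]
im = [45, 48, 50, 52, 51, 55, 60, 44]
isd = [44, 50, 47, 53, 55, 52, 61, 46]
out = r2_changes(wm, im, isd)
```

When the two predictors are strongly correlated, each R² change tends to be small whichever predictor is entered second, which is the pattern of overlap that such a comparison is designed to reveal.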

Second, confirmatory factor analyses and structural equation modeling were performed, in which all of the RT mean scores and the RT SD scores were used as the observed variables. The same 26 scores (eight WM scores and 18 RT scores) were used in structural equation modeling, using Lisrel 8.72 (Jöreskog & Sörbom, 2005). More details on these analyses will be provided in the Results section.

Results

The results will be described in two parts: (a) group comparison of the high- and low-span individuals for each of the RT tasks, and (b) multivariate analyses conducted on the entire sample.

Group comparison

Raw data in milliseconds concerning both the means and IIV are presented in the Appendix, in Tables 5 (iMs) and 6 (iSDs). Figure 1 illustrates the raw results for one task, DI.

Fig. 1 Digit–symbol task: (a) Mean reaction times (in milliseconds) and (b) intraindividual standard deviations (iSD, in milliseconds) by age groups, as a function of working memory capacity (high/low span). Error bars represent standard deviations

Mean level of performance

The results of the 2 × 5 ANOVAs carried out on the mean level of performance showed a significant span effect in the nine tasks/conditions [SRT: F(1, 325) = 13.00, p < .001, ηp² = .040; CS: F(1, 325) = 14.44, p < .001, ηp² = .044; LI: F(1, 325) = 14.40, p < .001, ηp² = .044; DI: F(1, 325) = 44.68, p < .001, ηp² = .124; LC6: F(1, 325) = 14.82, p < .001, ηp² = .045; LC9: F(1, 325) = 4.91, p = .027, ηp² = .015; STn: F(1, 325) = 14.81, p < .001, ηp² = .045; STc: F(1, 325) = 7.14, p = .008, ηp² = .022; STi: F(1, 325) = 15.13, p < .001, ηp² = .046]. These results show that low-span individuals were slower than high-span individuals across all the tasks. Because the data were residualized for age group, no main effect of age was found. The results also displayed a significant Age × Span interaction in the digit–symbol task [F(4, 352) = 2.49, p = .044, η² = .031], indicating that the span effect was significant only for children and older adults (all ps < .01), but not for young adults. The Age × Span interactions were not significant in the other tasks. However, planned comparisons showed consistent results: Whatever the condition, span effects were not significant for young adults. In contrast, in younger children and old-old adults (>70 years old), a span effect was almost systematically found. In the youngest children, as well as in the oldest adults, low-span individuals were slower than high-span individuals. The results were less regular in the older children and in the young-old adults. The detailed results of the planned comparisons are presented in Table 2 (first panel).

Table 2 Summary of span effects on processing speed

Intraindividual variability

Similar results were found for IIV. The main effect of span was significant in all tasks [SRT: F(1, 325) = 16.51, p < .001, ηp² = .050; LI: F(1, 325) = 14.24, p < .001, ηp² = .043; CS: F(1, 325) = 12.84, p < .001, ηp² = .039; DI: F(1, 325) = 30.05, p < .001, ηp² = .087; LC6: F(1, 325) = 12.05, p = .001, ηp² = .037; STn: F(1, 325) = 20.77, p < .001, ηp² = .062; STc: F(1, 325) = 10.20, p = .002, ηp² = .031; STi: F(1, 325) = 17.31, p < .001, ηp² = .052], with the exception of the nine-letters condition in the LC task [F(1, 357) = 1.45, p = .229, ηp² = .005]. These findings indicate that low-span individuals were more variable in their RTs than were high-span individuals. A significant Age × Span interaction was also observed in the digit–symbol task and in the neutral condition of the Stroop task [F(4, 325) = 2.48, p = .044, ηp² = .033, and F(4, 325) = 3.05, p = .017, ηp² = .037, respectively]. This interaction was not significant in the other tasks. However, as is illustrated in Fig. 1, planned comparisons showed a significant span effect on RT variability mostly in younger children (9–10 years old) and in old-old adults, but also in older children and young-old adults. Young adults did not show any span effect on RT variability (see Table 2, second panel, for the detailed results).

Multivariate analyses

Regression analyses

The objective of these regression analyses was to assess whether the mean RT (iM) and the variability in RTs (iSD) accounted for different parts of the variance in the WM tasks. Three WM composite scores were used as dependent variables, and two models were tested for each age group, shown in Table 3: (a) All of the iM RTs were entered as a block of variables (i.e., nine variables), followed by all of the iSD RTs (nine variables); and (b) all of the iSD RTs were entered, followed by the iM RTs. The results showed that both types of scores explained a moderate amount of the variance observed in the WM scores (significant R² values in all analyses), varying between 7 % and 25 %, depending on the age group and the task. The amount of variance explained was slightly lower in the young adults (R² ranging from 7 % to 22 %) than in both the children (R² ranging from 14 % to 25 %) and the older adults (R² ranging from 14 % to 22 %). In most cases, iM and iSD accounted for similar amounts of variance and showed high overlap. Their joint contribution was usually slightly higher than either score alone. However, only in a few cases was the difference significant, in particular for positions.

Table 3 Regression analyses: R² and R² change values by age group, variable, and model

Structural equation modeling

A large number of analyses were conducted with Lisrel, and several models were tested. They are not all reported here, for the sake of space. Preliminary confirmatory factor analyses were used, to assess the possibility of defining latent variables for each of the groups of variables (i.e., RT iM, RT iSD, and WM scores), either on the raw scores or on the T-scores residualized for age, and within each age group. These supported the feasibility of structural equation modeling analyses. For the RT variables, we observed that the best solution consisted of three correlated factors: simple RT tasks (SRT, line comparison, and cross-square tasks), complex RT tasks (digit–symbol and letter comparison tasks), and the Stroop task (three conditions). This grouping was the best solution for both the iM and iSD scores, so that one might wonder whether these two types of scores could be grouped. Yet, analyzing iM and iSD in a single analysis showed that it was preferable to keep either six latent variables (three correlated variables for the iM and three correlated variables for the iSD scores) or two “general” factors, grouping all iM scores on the one hand, and all iSD scores on the other hand, rather than combining iM and iSD. As concerns the WM scores, the best model turned out to be one defining three correlated latent variables: RSpan, words in the matrices double task, and positions in both the simple and the double matrices. This organization into three factors for each of the types of variable was preferable in each of the age groups.

In order to analyze the relationship between all of the variables, we decided to adopt a model postulating a single factor for each type of variable (iM, iSD, and WM), even though this organization was not optimal. Keeping three latent variables for each type would evidently have made the between-variable relationships too complex to analyze. Thus, the model retained and tested for each age group was based on three “general” factors: mean RT performance, IIV in RTs, and mean WM. It was contrasted with alternative models, defining either one, two, or three factors for all of the variables, and varying in a number of other characteristics, such as in how the manifest variables were equalized and whether the error variances correlated for the same condition.

Figure 2 presents the model finally retained and tested in each age group; in this model, the speed scores were correlated, and WM scores were regressed on the speed scores, in conformity with our hypothesis. The results presented here (see Table 4 for the values) are based on the T-scores, implying that all age effects have been controlled for. This model did not present a very good fit, which was to be expected: As we just mentioned, we simplified the structure and defined a single factor per type of variable. Moreover, we found high collinearity among the speed scores, so it was already a good sign that the analyses did converge. The organization depicted in Fig. 2 was the best model in all age groups. This implies that the mean and the IIV should be considered distinct variables, even though they are very highly correlated (ranging from .80 to .90). Grouping all of the RT variables into a single latent variable rather than distinguishing iM and iSD scores yielded a significantly worse fit. As can be seen, the results were very similar for all of the age groups. The variance in WM was not fully accounted for by the RT latent variables; it is interesting to note that RT iSD related more strongly to WM than did RT iM. Other analyses conducted on the raw scores showed, in contrast, that mean RT accounted for a large part of the age effects. Incidentally, the same model was also used to test the relation of these three variables with Raven’s Progressive Matrices (PM). PM was regressed on the WM latent variable (Beta parameter), itself regressed on RT iM and RT iSD (see Panel b of Table 4). These results turned out to be very similar to those obtained for the three sets of variables, and the Beta parameter from WM to PM varied from .39 to .42, depending on the age group. This was also true for the loadings (“saturations”) of all manifest variables on the latent variables (Lambda parameters), which varied from the .40s to the .90s, depending on the task. Even though the results were so similar for the three age groups, a fully invariant model in which all parameters were kept identical for each age group turned out to fit much worse, or even did not converge.
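For readers who prefer an algebraic summary, the three-factor structure described in this section can be written schematically as follows. This is only a sketch in generic notation: the symbols β_M, β_SD, φ, λ_j, ζ, and ε_j are labels chosen here for illustration and are not taken from the Lisrel specification itself, and details such as equality constraints or correlated error variances are omitted.

```latex
\begin{aligned}
\mathrm{WM} &= \beta_{M}\,\mathrm{iM} + \beta_{SD}\,\mathrm{iSD} + \zeta
  && \text{(WM factor regressed on the two speed factors)} \\
\operatorname{Cov}(\mathrm{iM}, \mathrm{iSD}) &= \phi
  && \text{(the two speed factors are allowed to correlate)} \\
x_j &= \lambda_j\, F_{k(j)} + \varepsilon_j
  && \text{(each observed score } x_j \text{ loads on its own factor } F_{k(j)}\text{)}
\end{aligned}
```

Here iM, iSD, and WM stand for the three “general” latent variables, and F_{k(j)} denotes whichever of the three the observed score x_j belongs to.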

Fig. 2 Structural equation modeling: Model retained for between-task relationships. Analyses were run on T-scores on the total sample, residualized for age. SRT, simple reaction time task; LI, line comparison task; CS, cross-square task; DI, digit–symbol task; LC, letter comparison task (LC6 = six letters, LC9 = nine letters); ST, Stroop task (STN = Stroop neutral, STC = Stroop congruent, STI = Stroop incongruent conditions); RS, reading span task; SP, matrices simple task, positions; DP, matrices double task, positions; DW, matrices double task, words. sd = intraindividual standard deviation; m = mean score; N, span level; N1, span + 1 level

Table 4 Structural equation modeling analyses: Summary of results

Discussion

The objective of this article was to address the relationships between speed (RTs) and WM capacity. In particular, we asked whether both the average response speed (mean RT) and intraindividual variability (IIV) in processing speed relate to WM capacity, whether this relationship is stronger for one indicator or the other, and whether it varies across the lifespan. Scores were residualized for age group (and standardized to allow for grouping the scores, when relevant), so as to focus on individual differences within each large age period. Two perspectives were adopted to analyze the results, and will also be used to structure the discussion. First, a group approach was used, in which high-span and low-span individuals were compared within each age group. Participants were selected using a tercile method, and those belonging to the middle tercile were not included. On the basis of the literature, our hypothesis was that low-span individuals would not only be slower than high-span individuals, but probably more variable as well. Indeed, it has been shown that low-span and high-span individuals differ in terms of attentional resources (e.g., Engle, 2002). Because IIV is considered to reflect lapses in attentional control, it should be greater in individuals with fewer attentional resources (i.e., in low-span individuals and/or in children and older adults). We also hypothesized that IIV would be larger in more complex tasks. In a second approach, relying on regression and structural equation modeling, and this time including all of the participants (rather than only the high- and low-span ones), we investigated whether both the mean and IIV in RTs were related to WM capacity. These analyses were more exploratory. First, we did not know whether it would be possible to fit a model with the two types of indices, given their high collinearity. Second, we did not have a strong a priori hypothesis. Indeed, the question now becomes whether the mean or the standard deviation is a better predictor of WM capacity, and whether these two indices are redundant.

Group comparison

The results supported our main hypotheses concerning the role of span and its variation with age. Note first that the raw RTs showed the usual age differences, whether in the entire sample or in the span groups: Young adults were faster and less variable than both children and older adults, as would be expected. These age differences demonstrate the need to control for the level of the RTs (via control of age group, in the present article), in order to disentangle IIV from the mean, as well as individual differences from developmental differences (e.g., Hultsch et al., 2000). As we mentioned, analyses of variance were then conducted on the residualized (for age group) and standardized scores (in T-scores). Consistent with our hypothesis, a span effect was observed in most tasks, indicating that high-span individuals were both faster and less variable than low-span individuals. This distinction between high-span and low-span participants was observed only in children and older adults, but not in young adults. Moreover, a difference was present in almost all conditions in the youngest and oldest groups; a significant span effect was less systematically observed in older children and young-old adults. The span effect can thus be summarized by saying that the younger or the older the participants, the larger the difference between low- and high-span individuals. Once again, it should be stressed that, because age group was controlled for, the differences reflect within-age-group individual differences, and not age differences.

Because we were interested in assessing whether IIV varies with complexity, we conducted a rough task analysis of complexity. Our hypothesis was that the nine experimental conditions would address two types of complexity, which might not be exclusive: the quantity of information to process, and the misleadingness of information. This distinction is akin to the two mechanisms supposed to be at work in WM tasks, in particular by Pascual-Leone (1987): M power (or the focus of attention, in Cowan’s model) and inhibition. The first type of complexity allows for roughly ordering most tasks in three groups, of ascending complexity: (1) simple reaction time (SRT), (2) line comparison (LI) and cross-square (CS), and (3) digit–symbol (DI) and letter comparison, which can itself be divided into two subgroups (LC6 and LC9). The sixth task, namely the Stroop task, has to be considered separately. It is not a complex task; indeed, it is not difficult to name the color of an item, and might not even require controlled attention. The task is, however, misleading, because the prepotent response (i.e., reading) has to be suppressed, and a less familiar, although easy, response (i.e., naming the color) has to be activated. Obviously, only the incongruent condition makes the Stroop task difficult; however, because trials of the three conditions were randomly distributed across the task, the congruent and neutral conditions also required controlled resources. Yet, this was not the same kind of controlled attention that was required in the first three groups of tasks. Our hypothesis was thus that IIV would increase with complexity, and/or with misleadingness.

The mean raw RTs supported this analysis: RTs were relatively short for SRT, somewhat longer for LI and CS, much longer for DI, and longest for LC6 and LC9, with a clear distinction between the latter two conditions. The Stroop task was somewhat more difficult than LI and CS, but easier than DI, and presented the expected interference effect (i.e., RTs that were longer in the incongruent than in the neutral and congruent conditions). This grouping was also supported by confirmatory factor analyses conducted on the entire sample (again on T-scores), with the difference that SRT was grouped with two other tasks to allow for identification of a latent variable. So, the best model, whether for the mean scores or the standard deviations, consisted of three latent variables: simple processing speed (SRT, LI, and CS), complex speed (DI, LC6, and LC9), and the three Stroop conditions.

In contrast, the hypothesis that task complexity should additionally contribute to the differences between high- and low-span individuals was not supported by the findings: There were no more differences between high- and low-span individuals in the more complex tasks than in the simpler ones, in any of the age groups. When a span effect was observed in an age group, it did not systematically appear in the more complex task. For instance, the most difficult condition of the LC task (i.e., the condition in which one had to compare two series of nine letters rather than two series of six letters) did not significantly differentiate high from low span, even in young children; only the older low-span adults were slower (but not more variable) in this task. When the two groups differed significantly in an easy task (e.g., SRT), they were not systematically also more variable in more complex ones.

One can ask why more complex tasks did not discriminate high-span from low-span individuals more than did simpler tasks. Of course, our coarse analyses might be erroneous; task analyses are always subject to controversy because of the rather high degree of inference they require. Yet, the raw RTs, which may be considered as some kind of proxy for the difficulty of the tasks (although RTs are far from being unequivocal), broadly supported our analyses. It might also be the case that the complexity of the tasks is not a good criterion to distinguish low- from high-span individuals. More probably, however, the difference in complexity between the tasks used in the present study was not large enough to tax attentional resources differently in the two groups of individuals. Moreover, the difference among the RT tasks might be addressing another type of attention than attentional resources, in the sense of Cowan’s focus of attention or Pascual-Leone’s M capacity (i.e., the attention required to keep a certain quantity of information simultaneously available in short-term memory and to effortfully process it). After all, the type of cognitive processing required by the LC or the DI task is perhaps not more complex than the processing required in the CS task. One could thus speculate that the major difference between the various RT tasks used here was not in terms of attentional resources only. This hypothesis would require another set of tasks to be administered. Yet, attentional resources are nonetheless required in all of these tasks, as attested by the difference observed between high and low span in a number of age groups. This is not an all-or-none matter: Low-span individuals do not fail in these RT tasks, nor do they show more errors, as they probably would in other, more complex cognitive tasks. They are just slower or more variable. Once again, it is interesting to note that, among young adults, the high-span individuals did not have an advantage over low-span individuals. A finer, and purely descriptive, analysis of the scores and raw times showed that the low-span young adults were generally as good as, or even better than, the high-span individuals in the other age groups. It thus cannot simply be concluded that high-span individuals are usually better than low-span individuals.

Multivariate analyses

Multivariate analyses shed further light on the relationship between WM capacity and the mean and/or IIV in RTs, by analyzing several or even all tasks jointly, and using all participants from the same age period. These analyses showed first that, as would be expected, RTs did relate to performance on the WM tasks, but did not account for all of the variance. Regression analyses, conducted on three different composite scores (reading span, positions in the matrices task, and words in the matrices task), showed that the mean RT and IIV in RTs accounted for about the same amounts of variance and did not systematically bring an increment in variance explained, relative to each other; only in some cases did the addition of mean RT to IIV bring a small supplementary proportion of variance. Together, the two scores accounted for a slightly larger part of the variance in children and older adults (between 20 % and 25 %) than in young adults, and in the matrices positions than in reading span tasks. These differences were not very large, though.

Structural equation modeling analyses confirmed this relationship. Interestingly, they showed that both mean RT and IIV could be entered simultaneously in the model; although the fits were not very good, they were moderate. Moreover, the two scores were not redundant. That is, the fit was better if two factors (latent variables, one for the mean speed and one for IIV) were distinguished, rather than all of the RT scores being combined into a general speed factor; the mean factor and the IIV factor were nevertheless very strongly correlated. Finally, and more surprisingly, the model that proved most satisfactory in each age group was identical. We did not have precise a priori hypotheses as to between-group differences in the relationships, but, given the results obtained in the high-span versus low-span comparisons, we would have predicted that IIV would play a more important role in children and older adults than in young adults. Instead, in all three age groups, IIV seemed to play a more important role than mean RT (except when analyzing the total sample): The β (Beta) values were close to .30 for IIV, but lower for mean RT. A model with invariant parameter values across the age groups (i.e., measurement invariance) was not adequate, which is a sign of slight differences between them.

In sum, our results showed that, in most tasks, low-span individuals were both slower and more variable than high-span individuals among children and older adults, but not among young adults. The concept of attentional resources, which was discussed in the introduction, might account for this link between higher/lower performance in WM span tasks, on the one hand, and IIV in processing-speed tasks, on the other.

Note, however, that low-span individuals were not only more variable, but also slower. They also tended to obtain lower scores on the other tasks that we used, such as fluid intelligence or vocabulary. Thus, our data do not point to a specific link between WM capacity and IIV. In other words, IIV does not seem to bring much additional information, beyond that provided by the mean; of course, the argument could be reversed: The mean does not bring much additional information on top of IIV. Multivariate analyses showed convergent results: The mean RT and IIV do seem to relate more or less similarly to WM capacity, with analyses in children and older adults pointing to a slightly larger role for IIV in accounting for WM capacity. Further studies are obviously needed, using other, more diversified RT tasks that would tax more attentional resources than did the ones used in the present study, to determine whether the RT mean and IIV provide redundant or complementary information. Our present, provisional interpretation is that the closer the task demands are to the limits of the participants’ attentional resources (as is probably the case in children and older adults), the more variable the performance may be. That is, the complexity of the tasks and their effects on individual differences need to be analyzed relative to the participants’ levels of functioning: A complex task might generate a larger difference between individuals when it is at or beyond the limits of their resources, but not necessarily when it can be mastered relatively easily by all participants. This hypothesis will need to be tested with more complex tasks that will tax more of the participants’ attentional resources than any of the tasks that were used here.