Introduction

Several studies have shown considerable individual differences in processing speed and working memory performance in humans (e.g., Ho et al. 1988; Neubauer et al. 2000; Luciano et al. 2001; Polderman et al. 2006). According to the limited capacity hypothesis (Jensen 1998; Vernon 1987, 1989), processing speed and working memory are inherently linked, because faster information processing facilitates access to information that is held in the working memory system (Baddeley 1992; Baddeley and Hitch 1974) before it is lost through decay or interference. As working memory is crucial to complex information processing, measures of both processing speed (PS) and working memory retrieval speed (WMS) are thought to predict performance on general cognitive tasks (Jensen 1998; Salthouse and Babcock 1991). Various studies have confirmed that measures of PS covary with measures of general cognitive ability (McGue et al. 1984; Vernon 1987, 1989; Baker et al. 1991; Rijsdijk et al. 1998).

Several paradigms exist to investigate the sources of individual differences in processing speed in the context of working memory performance. A classic paradigm for assessing information processing speed is Sternberg’s Memory Scanning task (SMS-task) (Sternberg 1966, 1969). In this task, subjects are presented with sets of digits, which need to be stored in memory. The set length ranges from 1 to 5 digits and sets are presented in randomized order. During the presentation of this list, subjects are required to press a ‘home’ button. After a warning signal a target digit is presented and subjects have to decide whether the target digit is part of the set by pressing a ‘yes’ or a ‘no’ button as fast as possible. A distinction can be made between positive trials (when the target was actually present in the remembered set) and negative trials (when the target was not present in the remembered set).

Performance on this task is hypothesized to consist of three phases: an encoding phase, in which information is encoded in short term memory (STM), a maintenance phase, in which this information is kept active in STM, and a retrieval or reaction phase, in which stored information is retrieved from STM and acted upon. Classically, reaction time in the SMS task increases with increasing set size (Sternberg 1966). The mean reaction time for a set-size of one (or the intercept of the line that can be drawn to graph the increase in reaction time as a function of set size; referred to here as processing speed (PS)) is assumed to reflect basic processing speed and (pre)motoric processes. The increase in reaction time as a function of set size (or the slope of the line; referred to here as working memory retrieval speed (WMS)) is thought to reflect the time required to retrieve an item from short term memory, and usually varies around 40 ms (Sternberg 1966).
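The intercept and slope described above can be illustrated with an ordinary least-squares line through the mean reaction times per set size. The sketch below is only an illustration under stated assumptions: the reaction times are hypothetical (chosen to give a slope of 40 ms per item), and the function name `fit_line` is not taken from any analysis software used in this study. Note that the numerical value of the intercept depends on where the zero point of the set-size axis is placed, so the predicted value at a set size of one is reported as well.

```python
def fit_line(set_sizes, mean_rts):
    """Ordinary least-squares line through (set size, mean RT) points."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts))
    slope = sxy / sxx                      # increase in RT per additional item (WMS)
    intercept = mean_y - slope * mean_x    # value of the line at set size 0
    return intercept, slope


# Hypothetical mean reaction times (ms) for set sizes 1-5; not data from this study.
set_sizes = [1, 2, 3, 4, 5]
mean_rts = [440.0, 480.0, 520.0, 560.0, 600.0]

intercept, slope = fit_line(set_sizes, mean_rts)
predicted_rt_set_size_one = intercept + slope * 1  # basic speed at the smallest set (PS)
```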

In the Sternberg-paradigm, a distinction is made between decision time (DT), defined as the time between target stimulus onset and home button release, and movement time (MT), defined as the time between home button release and pressing the target button. DT is thought to reflect the time a subject needs to decide whether the target stimulus is part of the set, while MT is thought to reflect the time a subject needs to physically move the hand from one button to the next. Some studies analyze DT and MT as a single composite measure reflecting overall reaction time (i.e., DT + MT), whereas others analyze DT and/or MT as separate measures. The focus in the present study is on DT.

Various studies have shown that increase in RT due to set size does not differ between conditions in which the target digit is part of the memorized set (positive trials) and conditions in which the target digit is not part of the set (negative trials) (Burle and Bonnet 2000). According to Sternberg (1966), this suggests that information in working memory is stored serially and is exhaustively scanned before a decision is made (but note various discussions on the serial versus parallel storage: e.g., Townsend 1971, 1972, 1990; Townsend and Ashby 1983; Atkinson et al. 1969). Responses to negative trials are on average considerably slower than responses to positive trials, regardless of set size (Sternberg 1966).

Few studies have investigated genetic and environmental sources of individual differences in PS and WMS. McGue et al. (1984) administered the SMS-task to a small sample of twin pairs reared apart (34 MZ and 13 DZ twin pairs). Although MZ correlations ranged from .16 to .37 in different conditions and DZ correlations were generally lower, the presence of genetic influences could not be detected due to the small sample size (McGue 1989). Neubauer et al. (2000) administered the SMS-task in a sample of 169 MZ and 131 DZ twin pairs. For PS, a moderate heritability of 23% was reported. Shared environmental factors were not significant. For WMS, neither additive genetic factors nor shared environmental factors were significant. Polderman et al. (2006) administered a memory task comparable to the SMS-task to 12-year-old twins and their siblings (97 MZ and 80 DZ twin pairs, and 55 siblings of these twins). In this study, heritability estimates of 51% for PS and of 43% for WMS were reported. However, for WMS both a model with additive genetic factors and unique environmental factors, and a model with shared and unique environmental factors showed a good fit to the data.

The results of these previous studies show that it is difficult to establish reliable twin correlations for PS and WMS parameters in the SMS task. Yet, the origin of individual differences in basic PS and WMS is of theoretical interest as these parameters supposedly reflect basic psychological functions. Reliable estimation of the twin correlations may be hindered by measurement error. Both PS and WMS are based on reaction time measures, and these may show considerable fluctuation due to, e.g., short dips in concentration. More reliable estimates could be obtained by increasing the number of measurements. Another possibility, however, is to explicitly model the measurement error such that twin correlations can be estimated for PS and WMS parameters that are corrected for measurement error (i.e., measurement error is partialled out). This latter option is feasible if the SMS task data are subjected to a growth curve model (McArdle 1988; Meredith and Tisak 1990), in which the variance that is not explained by the Intercept, Linear Slope and (if required) Quadratic Slope factors is separated off in the form of freely estimated residuals.

The aim of the present study is to investigate the genetic and environmental causes of individual differences in PS (the intercept in the growth curve model) and WMS (the slope in the growth curve model), while explicitly modeling measurement error and accounting for sex and age effects. An extended twin design, including twin pairs and additional siblings, is used, resulting in an effective sample size of 623 participants (Posthuma and Boomsma 2000).

Method

Participants

The sample consisted of MZ and DZ twin pairs and their non-twin siblings. Participants were recruited from the Netherlands Twin Registry (Boomsma et al. 2002, 2006). All subjects participated in an ongoing study on the genetics of cognition and brain functioning in adults. Participants were paid around 30 dollars (€22,-) if they completed the 4.5 h test session.

Participants were invited to the VU University Amsterdam to complete a psychological test battery including the Sternberg Memory Scanning Task (SMS-task) (Sternberg 1966, 1969).

SMS-task data were available from 302 families (726 participants in total). After outlier correction and elimination of incorrect answers (see the task description for more details), 623 subjects of 293 families remained (267 men, 356 women). This sample consisted of 221 MZ twins (93 complete pairs), 238 DZ twins (91 complete pairs), and 164 siblings. Age of these participants ranged from 13 to 70 years with a mean of 36.72 (SD = 12.56). Effects of sex and age (Z-scores) were modeled on the means of the Intercept, Linear slope, and Quadratic slope (see “Statistical Analyses” section). Note that this implies that the effects of age and sex are partialled out, after which genetic variance decomposition is carried out on the remaining (residual) variance.

Task and instruments

Sternberg memory scanning task

The task consisted of 100 experimental trials with 10 conditions, i.e., 5 set sizes of 1–5 stimuli each, for both positive and negative trials. Prior to the experimental trials, 12 practice trials were presented. In each trial a random sequence of 1–5 digits was presented on a 240 × 180 mm, 60.1 Hz computer screen, and had to be memorized during 1,000 milliseconds (ms). The task was semi-self-paced, in the sense that subjects were required to press the home button to start each new trial. In case the home button was not pressed, the task started automatically after a randomized inter trial interval of 400 or 800 ms. After the stimulus display, a 500 ms fixation indicated the end of the list to be memorized, followed by a randomized interval of 200 or 600 ms. Subsequently the target stimulus was presented for a maximum of 500 ms. After target presentation, subjects had to release the home button and press one of the two response buttons as fast as possible: the ‘yes’ (left) button in case the target was part of the memorized list (positive trials), or the ‘no’ (right) button in case the target was not part of the memorized list (negative trials). The maximum response time, measured from the onset of the target stimulus, was 1,500 ms. The sequences were semi-randomized; each subject was provided with the same sequences in the same order, which was unpredictable to the subjects. For the positive trials, the position of the target digit within the memorized sequence (e.g., first digit of the sequence, second digit of the sequence, etc.) was completely randomized (i.e., not blockwise).

The focus in this study was on DT (DT+ for positive trials, DT− for negative trials), defined as the time between the moment the target digit appeared on the screen and the moment the home-button was released. DT is thought to reflect the time a subject needs to decide whether or not the target digit is included in the memorized set.

Outlying DT scores were coded as missing. Scores were considered outlying if they deviated more than 3 SD from the particular subject’s mean (within-subject outlier detection) or more than 3 SD from the sample mean (between-subject outlier detection). When subjects had fewer than 80% correct answers within a set (i.e., 8 out of 10 trials), entire set scores for this subject were coded as missing (29.42% of the data; including complete data sets of 22 subjects). When less than 70% of the total number of trials (i.e., 70 out of 100 trials) was correct and non-outlying, the entire set of data for that subject was coded as missing (81 subjects, 11.16%). Note that there is an overlap of 7.84% between the first and second criterion. Sets with fewer than 80% correct items that were not previously eliminated as a result of the 70% criterion were recoded as missing within the remaining dataset. Final analyses were based on 623 subjects (221 MZ twins, 238 DZ twins and 164 siblings, ~86% of the original sample). Table 1 shows error rates for men and women separately for the full, unselected sample (i.e., all subjects for whom SMS-task data were available, N = 726, 315 men, 411 women) and for the selected sample (i.e., subjects who had at least 80% correct/non-outlying trials in each condition, and minimally 70% correct/non-outlying trials overall; N = 623, 267 men, 356 women).
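The screening rules for a single subject can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as it was actually implemented: the function name `screen_subject`, the trial layout, and the demo data are hypothetical; the within-subject mean and SD are assumed to be computed over all correct trials of the subject; and the between-subject outlier step is omitted because it requires the full sample.

```python
from statistics import mean, stdev


def screen_subject(trials, sd_limit=3.0, min_condition_correct=0.8, min_overall_valid=0.7):
    """Return {condition: mean DT or None} for one subject, or None if the subject is dropped.

    `trials` is a list of dicts with keys 'condition', 'dt' (ms) and 'correct' (bool).
    """
    correct_dts = [t["dt"] for t in trials if t["correct"]]
    if len(correct_dts) < 2:
        return None  # too few correct trials to compute a mean and SD
    centre, spread = mean(correct_dts), stdev(correct_dts)

    def is_valid(trial):
        # Valid = answered correctly and within sd_limit SD of the subject's own mean.
        return trial["correct"] and abs(trial["dt"] - centre) <= sd_limit * spread

    n_valid = sum(is_valid(t) for t in trials)
    if n_valid < min_overall_valid * len(trials):
        return None  # fewer than 70% valid trials: all data of this subject are missing

    result = {}
    for condition in sorted({t["condition"] for t in trials}):
        in_condition = [t for t in trials if t["condition"] == condition]
        n_correct = sum(t["correct"] for t in in_condition)
        valid_dts = [t["dt"] for t in in_condition if is_valid(t)]
        if n_correct < min_condition_correct * len(in_condition) or not valid_dts:
            result[condition] = None  # fewer than 80% correct: condition coded as missing
        else:
            result[condition] = mean(valid_dts)
    return result


# Hypothetical subject: condition "pos1" is fully correct, "pos2" has only 7 of 10 correct.
demo_trials = (
    [{"condition": "pos1", "dt": 400.0, "correct": True} for _ in range(10)]
    + [{"condition": "pos2", "dt": 500.0, "correct": True} for _ in range(7)]
    + [{"condition": "pos2", "dt": 520.0, "correct": False} for _ in range(3)]
)
demo_means = screen_subject(demo_trials)
```

In this hypothetical case the subject is retained (17 of 20 trials are valid), but the second condition is coded as missing because only 70% of its trials were answered correctly.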

Table 1 Error rates for full and selected sample, and men and women separately

Statistical analyses

To model the increase in DT (DT− and DT+) resulting from the increase in memory load due to increasing set size, a standard non-linear growth curve (nLGC) model was fitted (McArdle 1988; Meredith and Tisak 1990). That is, for both DT+ and DT−, we first calculated the mean decision time across the valid trials of each set size (1–5) within each subject. Next, these 2 × 5 mean scores were used as indicators for the nLGC-models for DT+ and DT−, respectively. As the increase in DT as a function of set size is not necessarily linear, we included both a linear and quadratic slope. Thus, the nLGC-models included three factors (intercept, linear and quadratic slope). Usually a linear slope can be modeled by fixing path coefficients from the latent factor to the measurements at, e.g., 1, 2, 3, 4. A quadratic slope can then be coded as 1, 4, 9, 16. However, this introduces collinearity between the linear and quadratic slopes. We therefore used standard orthogonal polynomials to code the three latent factors. The first factor, for which all factor loadings were fixed to .447, represents the Intercept. The second factor, for which the factor loadings were fixed to −.632, −.316, 0, .316, and .632, respectively, represents the Linear slope. The mean of this Linear slope factor represents (a linear transformation of) the linear rate of increase in decision time for the entire sample, while the variance of this factor represents the variation around this Linear slope. The third factor, for which the factor loadings were fixed to .535, −.267, −.535, −.267, and .535, respectively, represents the Quadratic slope. Note that in contrast to, e.g., repeated measures ANOVA, where the Intercept, Linear slope and Quadratic slope are fixed parameters (i.e., no variance), the Intercept, Linear slope, and Quadratic slope in the nLGC-model are considered random effects as their variance is indicative of the individual differences, which can be decomposed into genetic or environmental factors.
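The fixed factor loadings reported above correspond to the standard polynomial contrasts for five equally spaced points, each scaled to unit length. The following sketch reproduces them and checks their orthogonality; the helper names `normalise` and `dot` are introduced here for illustration only, and the comparison with the naive 1, 2, 3, 4 coding uses the example values mentioned in the text.

```python
from math import sqrt


def normalise(vector):
    """Scale a contrast vector to unit length."""
    length = sqrt(sum(v * v for v in vector))
    return [v / length for v in vector]


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


# Standard integer polynomial contrasts for five equally spaced set sizes.
intercept = normalise([1, 1, 1, 1, 1])     # all loadings 1/sqrt(5)
linear = normalise([-2, -1, 0, 1, 2])      # divided by sqrt(10)
quadratic = normalise([2, -1, -2, -1, 2])  # divided by sqrt(14)

# Naive coding from the text, for comparison: 1..4 and its squares are strongly collinear.
naive_linear = [1, 2, 3, 4]
naive_quadratic = [1, 4, 9, 16]
naive_cosine = dot(naive_linear, naive_quadratic) / sqrt(
    dot(naive_linear, naive_linear) * dot(naive_quadratic, naive_quadratic)
)
```

Rounded to three decimals, the three vectors equal the loadings given in the text, and all pairwise inner products are zero, whereas the naive linear and quadratic codes are almost perfectly aligned.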

We first verified whether a linear (excluding the Quadratic slope) or a non-linear model described the SMS-task data adequately by fitting LGC- and nLGC-models in Mplus 5 (Muthén and Muthén 1998–2007; see Footnote 1) to the individual subjects’ data while taking into account familial relatedness between subjects. Effects of age (Z-scores) and sex (coded 0 for men and 1 for women) were modeled on the means of the Intercept, Linear slope, and Quadratic slope factors. Subsequently, in a genetic model, observed variation in the latent factors is decomposed into additive genetic effects (A), dominance genetic effects (D) or shared environmental effects (C) and non-shared environmental components (E) by fitting the nLGC-model to the family data using the Mx program (Neale et al. 2003). C includes all environmental influences that render family members more alike, while E includes all environmental influences that create differences between members of the same family. E is always specified in the model as it also includes measurement error. The effects of C and D are confounded when only data from twins and siblings are available. Disentangling the separate contributions of C and D requires data from, e.g., twins reared apart, half-siblings, or non-biological relatives reared together (Posthuma et al. 2003). As such data were not available for the present study, the analyses were confined to ADE- or ACE models. When the observed correlation between DZ twins and between siblings is about half the size of the correlation observed in MZ twins or larger, dominance effects are assumed absent and ACE models are deemed most suitable. When, however, the correlation between DZ twins and between siblings is substantially smaller than half the MZ correlation, dominance effects are likely to be present (although not necessarily statistically significant) and ADE models are deemed more suitable.
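The rule of thumb for choosing between an ACE and an ADE model can be written as a simple comparison of the DZ/sibling correlation with half the MZ correlation. The sketch below is a simplification: the function name `suggest_model` is hypothetical, and a strict cut-off at exactly half the MZ correlation is assumed, whereas the text speaks of "about half" and "substantially smaller". The input values are the Intercept correlations reported in the Results for positive and negative trials.

```python
def suggest_model(r_mz, r_dz):
    """Rule of thumb from twin correlations: 'ADE' if r_dz < r_mz / 2, otherwise 'ACE'."""
    return "ADE" if r_dz < 0.5 * r_mz else "ACE"


# Intercept correlations reported for positive trials (MZ .44, DZ/siblings .15)
# and for negative trials (MZ .31, DZ/siblings .16).
model_positive = suggest_model(r_mz=0.44, r_dz=0.15)
model_negative = suggest_model(r_mz=0.31, r_dz=0.16)
```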

A full ADE model is illustrated in Fig. 1 for one subject. The lower part of the figure shows the nLGC model where the Intercept (I), Linear slope (L), and Quadratic slope (Q) are derived from 5 observed measures (M1–M5). Parameters ε1 to ε5 denote the residuals, i.e., the parts of the observed measures M1–M5 that are not explained by the nLGC model. The upper part of the figure shows the univariate variance decomposition of the variance of the Intercept. The variance of the Intercept is modeled via parameters a, d and e. Sibling data were included in the analyses when available. MZ twin pairs share 100% of their additive genetic and dominance effects, so correlations between these variance components are fixed to 1. DZ twins and sibling pairs share on average 50% of their additive and 25% of their dominance genetic effects, so correlations between these components are fixed to .5 and .25, respectively (Neale and Cardon 1992; Posthuma et al. 2003). Correlations between the E-components are by definition fixed to 0 in MZ twins, DZ twins and regular siblings, as these components include all sources of variation that result in differences between family members. When an ACE model is fitted to the data, variance of the Intercept is modeled via parameters a, c and e. As MZ twins, DZ twins, and siblings all by definition share 100% of their familial environment, correlations between the C-components are fixed to 1 between all family members.
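The covariance structure implied by these fixed correlations can be made explicit for one latent factor and one pair of relatives. The sketch below is an illustration only, not the Mx script used for the analyses: the function name `expected_pair_covariance` is hypothetical, and the path coefficients are obtained by taking square roots of the standardised variance proportions reported in the Results for the full ADE model of the Intercept in positive trials (A = 16%, D = 28%, E = 56%).

```python
from math import sqrt


def expected_pair_covariance(a, d, e, relation, c=0.0):
    """Expected 2x2 covariance matrix of one latent factor for a pair of relatives."""
    # Correlations between the A and D components: 1 and 1 for MZ, .5 and .25 otherwise.
    r_a, r_d = {"MZ": (1.0, 1.0), "DZ": (0.5, 0.25), "SIB": (0.5, 0.25)}[relation]
    variance = a**2 + d**2 + c**2 + e**2
    # C components correlate 1 in all pairs; E components correlate 0 and drop out.
    covariance = r_a * a**2 + r_d * d**2 + 1.0 * c**2
    return [[variance, covariance], [covariance, variance]]


# Standardised path coefficients derived from the reported proportions (assumption:
# total variance scaled to 1, so covariances can be read as correlations).
a, d, e = sqrt(0.16), sqrt(0.28), sqrt(0.56)
mz = expected_pair_covariance(a, d, e, "MZ")
dz = expected_pair_covariance(a, d, e, "DZ")
```

Under these assumptions the implied MZ correlation is .16 + .28 = .44 and the implied DZ/sibling correlation is .08 + .07 = .15, in line with the twin correlations reported for the Intercept in positive trials.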

Fig. 1

Path diagram of a non-linear growth curve model where the Intercept (I), Linear slope (L), and Quadratic slope (Q) are derived from 5 observed measures (M1–M5) for one twin pair. Parameters ε1 to ε5 denote the residuals of the observed measures M1–M5. The variance of the Intercept is decomposed into additive genetic effects (A), dominance genetic effects (D) and unique environmental effects (E). The variance of the Intercept is modeled via parameters a, d, and e. Between subjects, correlations between additive genetic effects (A) are fixed to 1 for MZ twins and to .5 for DZ twins and regular siblings, correlations between dominance genetic effects (D) are fixed to 1 in MZ twins and to .25 in DZ twins and regular siblings, while correlations between unique environmental effects (E) are fixed to 0 in all groups. Sex and age effects were modeled on the means of the Intercept, Linear slope, and Quadratic slope. Correlations between the three latent factors I, L and Q are theoretically possible (not drawn)

Raw data likelihood procedures were used to allow for partial missingness.

A series of nested (increasingly more restricted) models was fitted to the raw data, in which parameters were fixed to zero to test for their significance. The fit of the nested models was compared to the fit of less restricted models by χ²-difference tests. If the χ²-difference test is significant, then the constraints imposed on the nested models are not tenable. If the χ²-difference test is not significant, the nested, more parsimonious model is to be preferred. A criterion level α of .05 was adopted for all tests.
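The decision rule for nested models can be sketched as follows. The functions `chi2_sf` and `restriction_rejected` are hypothetical helpers written for this illustration, not part of Mx or Mplus; the tail probability uses the closed-form expressions that hold for integer degrees of freedom. The two example values are χ²-differences reported in the Results for the positive trials.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df (closed form)."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!.
        terms = [half**i / gamma(i + 1) for i in range(df // 2)]
        return exp(-half) * sum(terms)
    # Odd df: start from erfc(sqrt(x/2)) and add the half-integer terms.
    terms = [half ** (i - 0.5) / gamma(i + 0.5) for i in range(1, (df - 1) // 2 + 1)]
    return erfc(sqrt(half)) + exp(-half) * sum(terms)


def restriction_rejected(chi2_difference, df_difference, alpha=0.05):
    """True if the constraints of the nested model are not tenable at level alpha."""
    return chi2_sf(chi2_difference, df_difference) < alpha


# Reported comparisons for positive trials.
drop_cross_covariances = restriction_rejected(14.77, 9)  # Model 2 vs. Model 1
fix_linear_slope = restriction_rejected(24.93, 3)        # Model 4 vs. Model 3
```

In the first comparison the restriction is retained (the more parsimonious model is preferred); in the second it is rejected.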

Descriptive statistics were calculated using SPSS (2004).

Results

Descriptive statistics

Table 2 summarizes the age-adjusted means and standard deviations of the ten mean scores for men and women, separately. Note that these means are close to the means reported earlier by Sternberg (1966) and McGue et al. (1984).

Table 2 Age-adjusted means and standard deviations for the ten mean scores for DT+ and DT− (in ms) for men and women separately

Paired t-tests showed that the mean decision time for negative trials was always higher than the mean decision time for the positive trials (P < .001).

Model fitting: positive trials

Phenotypic analyses

Phenotypic model fitting was carried out while taking into account familial relatedness. The nLGC-model described the DT+ data well (Comparative Fit Index (CFI) = .99, Standardized Root Mean Square Residuals (SRMR) = .02, see Schermelleh-Engel et al. (2003) for guidelines for evaluating the fit of structural equation models; see Footnote 2). Although a linear growth curve model, excluding the Quadratic factor, also described the data adequately (CFI = .99, SRMR = .02), the difference in fit between these models was significant (χ²(5) = 70.93, P < .001). The non-linear model including both a Linear and Quadratic slope was therefore used in the modeling of the family data.

The model fitting results for family data are presented in Table 3 (Models 1–10). We started out by fitting an nLGC-model to the data, with sex and age effects on the means of the three factors Intercept, Linear slope, and Quadratic slope (Model 1). In this model, residual variances were constrained to be equal across siblings (ε1 to ε5 in Fig. 1) and the variances and covariances of DZ twins were constrained to equal those of siblings. All residual variances in Model 1 were significantly different from zero (for all residuals, χ²(1) > 79.00, P < .001).

Table 3 Model fitting results for decision time positive (DT+)

In Model 2, we fixed all the covariances between Intercept, Linear slope and Quadratic slope (i.e., cross-factor covariances) to zero within subjects as well as between subjects. The model fit did not deteriorate significantly (Model 2 vs. Model 1: χ²(9) = 14.77, ns), implying that the three growth curve factors did not intercorrelate significantly.

Fixing the variance of the Quadratic slope and the covariances between the Quadratic slopes of family members to zero also did not result in a significant drop in fit (Model 3 vs. Model 2: χ²(3) = 2.32, ns), suggesting that the Quadratic slope should be interpreted as a fixed factor. The Linear slope could not be interpreted as a fixed factor (Model 4 vs. Model 3: χ²(3) = 24.93, P < .001), and neither could the Intercept (Model 5 vs. Model 3: χ²(3) = 3121.69, P < .001).

Sex effects on the Intercept, Linear slope, and Quadratic slope could all be dropped from the model without significantly deteriorating the fit (Model 6 vs. Model 3: χ²(3) = 1.95, ns). Age effects on the Intercept, Linear slope, and Quadratic slope could not all be dropped from the model without significantly deteriorating the fit (Model 7 vs. Model 6: χ²(3) = 84.54, P < .001). The effect of age on the Intercept was significant (Model 7a vs. Model 6: χ²(1) = 79.15, P < .001), but the age effects on the Linear and Quadratic slopes were not (Model 7d vs. Model 6: χ²(2) = 3.24, ns). The age effect on the Intercept was estimated at 2.86, suggesting a substantial increase in the mean of the Intercept factor with every standard deviation increase in age.

In this model, considerable variance was observed in the Intercept (Var(Intercept) = 43.58), implying that this is indeed a random effect, i.e., there are substantial individual differences in Intercept scores. The variance of the Linear slope, however, was considerably smaller (Var(Slope) = .59). As stated earlier, the variance of the Quadratic slope could be fixed to zero.

For the Intercept, the MZ twin correlation was estimated at .44 (CI 95%: .23–.59), while the correlation between Intercept scores of DZ twins, including regular siblings, was .15 (CI 95%: .04–.26; see Footnote 3). The twin correlations for the Linear slope were estimated at .08 for the MZ twins (CI 95%: −.76 to .95) and .73 for the DZ twins (CI 95%: .26–1.00). Note that due to the large CIs, these correlations were not statistically different from each other. Because the variance of the Linear slope was very small, it was not decomposed into genetic and environmental components.

For the Intercept, however, the higher MZ twin correlation, compared to the DZ correlation, suggests the presence of genetic influences and as the MZ twin correlation is more than twice as high as the DZ correlation, genetic dominance is implicated. We therefore fitted an ADE model to the Intercept in DT+ data.

Genetic analysis

In Model 8 (Table 3), the variance of the Intercept factor was decomposed into additive genetic influences (A), variation due to genetic dominance (D) and unique environmental influences (E). In this full model, A explained 16% of the variance (CI 95%: 0–54%), D explained 28% of the variance (CI 95%: 0–63%), and E explained 56% of the variance (CI 95%: 41–77%). Fixing the dominance effects of the Intercept factor to zero (i.e., AE model) did not result in a significant deterioration of the fit (Model 9 vs. Model 8: χ²(1) < 1, ns), but fixing both the additive genetic effects and the dominance effects to zero (i.e., E model) did (Model 10 vs. Model 8: χ²(2) = 21.16, P < .001; see Footnote 4). The AE model is thus the preferred model, with additive genetic effects accounting for 38% of the individual differences in the Intercept (CI 95%: 21–57%), while 62% of the observed variance was due to unique environmental effects (CI 95%: 48–79%).

Model fitting: negative trials

Phenotypic analyses

The nLGC-model described the DT− data well (CFI = 1.00, SRMR = .02). Similar to the positive trials we found that although a linear growth curve model, excluding the Quadratic slope factor, described the data adequately (CFI = .99, SRMR = .03), the linear model fitted significantly worse than the non-linear model (χ²(5) = 41.75, P < .001), and the non-linear model was therefore used in the analyses of the family data.

The results for DT− were very similar to those observed for DT+ and are summarized in Table 4 (Models 1–11). In Model 1, a standard nLGC-model was fitted, in which sex and age effects were modeled on the means of the three factors, residual variances were constrained to be equal across siblings, and the variances and covariances of DZ twins were constrained to equal those of regular siblings. All residual variances in Model 1 were significantly different from zero (for all residuals, χ²(1) > 47.00, P < .001).

Table 4 Model fitting results for decision time negative (DT−)

In Model 2, we fixed all the covariances between Intercept, Linear slope and Quadratic slope (i.e., cross-factor covariances) to zero within subjects as well as between subjects. The model fit did not deteriorate significantly (Model 2 vs. Model 1: χ²(9) = 14.48, ns), implying that the three growth curve factors did not intercorrelate significantly.

Fixing the variance of the Quadratic slope and the covariances between the Quadratic slopes of family members to zero also did not result in a significant drop in fit (Model 3 vs. Model 2: χ²(3) = 2.08, ns), suggesting that the Quadratic slope should be interpreted as a fixed factor. The Linear slope could not be interpreted as a fixed factor (Model 4 vs. Model 3: χ²(3) = 21.34, P < .001), and neither could the Intercept (Model 5 vs. Model 3: χ²(3) = 3435.61, P < .001).

Sex effects on the Intercept, Linear slope, and Quadratic slope could all be dropped from the model without significantly deteriorating the fit (Model 6 vs. Model 3: χ²(3) = 4.90, ns).

Age effects on the Intercept, Linear slope, and Quadratic slope could not all be dropped from the model without significantly deteriorating the fit (Model 7 vs. Model 6: χ²(3) = 99.30, P < .001), and subsequent submodels showed that the effect of age was significant for all three factors (see Models 7a–7c, Table 4). The age effect on the Intercept was estimated at 2.76, suggesting a substantial increase in the mean of the Intercept factor with every standard deviation increase in age. The age effect on the Linear slope was estimated at −.18, suggesting a decrease in the difference in decision time between young and older subjects with increasing set size. The age effect on the Quadratic slope was estimated at −.20, suggesting a decrease in the curvilinear effect with age.

In this model, considerable variance was observed in the Intercept (Var(Intercept) = 45.13), implying that this is indeed a random effect, i.e., there are substantial individual differences in Intercept scores. The variance of the Linear slope was considerably smaller (Var(Slope) = .56).

For the Intercept, the MZ twin correlation was estimated at .31 (CI 95%: .08–.50), while the correlation between Intercept scores of DZ twins, including regular siblings, was .16 (CI 95%: .06–.27). For the Linear slope, the MZ twin correlation was estimated at .57 (CI 95%: −.09 to 1.00), while the correlation between Linear slope scores of DZ twins, including regular siblings, was −.14 (CI 95%: −.51 to .23). As with the DT+ scores, the twin correlations for the Linear slope showed no sign of the presence of familial effects, genetic or common environmental, and the Linear slope variance was therefore not decomposed into genetic and environmental components. For the Intercept, however, the higher MZ twin correlation, compared to the DZ correlation, suggests the presence of genetic influences. As the MZ twin correlation is about twice as high as the DZ correlation, genetic dominance is presumed absent. We therefore fitted an ACE model to the DT− data.

Genetic analysis

In Model 8 (Table 4), the variance of the Intercept factor was decomposed into additive genetic influences (A), variation due to common environmental effects (C) and unique environmental influences (E). In this full model, A explained 30% of the variance (CI 95%: .00–.50), C explained 1% of the variance (CI 95%: .00–.27), and E explained 69% of the variance (CI 95%: .54–.92). Fixing the common environmental effects of the Intercept factor to zero (i.e., AE model) did not result in a significant deterioration of the fit (Model 9 vs. Model 8: χ²(1) < 1, ns), and neither did fixing the additive genetic effects to zero (i.e., CE model: Model 10 vs. Model 8: χ²(1) = 1.59, ns). However, fixing both effects to zero did (i.e., E model: Model 11 vs. Model 8: χ²(2) = 15.29, P < .001), suggesting that familial effects are present, but that the study lacks power to distinguish between an AE and a CE model. As the AE model is preferred over the CE model (based on AIC), the AE model is the preferred, most parsimonious, model, with additive genetic effects accounting for 32% of the individual differences in the Intercept of DT− (CI 95%: .16–.51), and unique environmental effects accounting for 68% of the individual differences (CI 95%: .54–.86).

Multivariate analyses

As the statistical power to detect genetic and environmental effects may benefit from a multivariate design (Schmitz et al. 1998), we also analyzed the positive and negative trials simultaneously, resulting in a 6-variate model (Intercept, Linear slope and Quadratic slope, for the positive and negative trials, respectively). As in the previous analyses, all cross-factor covariances (i.e., between the Intercept factors, Linear slope factors, and Quadratic slope factors, respectively) could be constrained to zero (χ²(36) = 50.44, ns), and the Quadratic slopes could be considered fixed factors (χ²(11) = 14.25, ns). For the covariates age and sex, the pattern of effects was similar to the pattern observed in the previous analyses. The correlation between the two Linear slope factors was estimated as larger than 1 (Heywood case, most likely due to the small variances and the large SEs). To obtain an interpretable model, we constrained this correlation to 1, all twin correlations to be equal across the two Linear slope factors, and the cross-trait-cross-twin correlation to equal the twin correlation, even though this resulted in a slight deterioration of the model fit (χ²(5) = 14.55, P < .01). The MZ twin correlation for this collapsed Linear slope factor was .23 (CI 95%: −.22 to .63), and the DZ correlation .18 (CI 95%: −.06 to .43). Note that neither of these correlations is significantly different from 0, implying a predominant presence of unique environmental effects. The correlation between the Intercept factors for positive and negative trials was estimated at .95 (CI 95%: .94 to .96). The twin correlations were as follows: Intercept positive trials: MZ: .45 (CI 95%: .28 to .56), DZ: .15 (CI 95%: .08 to .26); Intercept negative trials: MZ: .32 (CI 95%: .21 to .46), DZ: .16 (CI 95%: .08 to .27); cross-trait-cross-twin: MZ: .37 (CI 95%: .23 to .48), DZ: .15 (CI 95%: .08 to .25).

Genetic analyses showed that for the collapsed Linear slope factor, additive genetic effects and common environmental effects could be dropped from the model (for both χ²(1) < 1, ns), i.e., the variance in the Linear slope was purely unique environmental in nature. Cholesky decomposition was used to decompose the (co)variance of the two Intercept factors, with the Intercept factor for positive trials modeled as first factor. The additive genetic (A) and common environmental (C) specifics of the Intercept factor for negative trials were not significant (χ²(2) < 1, ns), while the unique environmental specific was (χ²(1) = 31.51, P < .001). In addition, the part of the covariance between Intercept for positive trials and Intercept for negative trials modeled via C (i.e., the cross path) was not significant (χ²(1) = 3.06, ns), while the cross paths for A and E were significant (χ²(1) = 15.35, P < .001, and χ²(1) = 532.92, P < .001, respectively). In sum, this implies that all genetic influences between the Intercept factors are shared, and that the covariance between the two Intercept factors is genetic as well as unique environmental in nature.
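Why a non-significant specific factor implies fully shared influences can be seen from the covariance matrix implied by a bivariate Cholesky decomposition. In the sketch below, the function names and path values are hypothetical and serve only as an illustration: `p11` and `p21` are the paths from the first latent component to the first and second trait (the cross path), and `p22` is the path of the component specific to the second trait.

```python
from math import sqrt


def cholesky_covariance(p11, p21, p22):
    """Covariance matrix implied by lower-triangular paths [[p11, 0], [p21, p22]]."""
    return [
        [p11 * p11, p11 * p21],
        [p11 * p21, p21 * p21 + p22 * p22],
    ]


def correlation(cov):
    """Correlation implied by a 2x2 covariance matrix."""
    return cov[0][1] / sqrt(cov[0][0] * cov[1][1])


# Hypothetical path values, chosen only for illustration.
with_specific = cholesky_covariance(0.6, 0.5, 0.3)
without_specific = cholesky_covariance(0.6, 0.5, 0.0)

r_with_specific = correlation(with_specific)
r_without_specific = correlation(without_specific)
```

When the specific path is fixed to zero, the implied correlation between the two components equals 1, which corresponds to the conclusion that the genetic influences on the two Intercept factors are entirely shared.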

Sample selection

In the present study, only trials that were answered correctly were included in the analyses. Outliers (beyond ±3 SD) were eliminated, entire sets were eliminated when fewer than 80% of the answers within a set were correct (i.e., 8 out of 10 trials), and entire subject scores were eliminated when fewer than 70% of the subject’s data were valid (i.e., 70 out of 100 trials). This rigorous data cleaning left us with ~86% of the original sample. All analyses were also run using different selection criteria (e.g., eliminating entire subject scores when the overall error rate was >10%), but the general results remained very similar, confirming the robustness of the results presented here.
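The three exclusion rules can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually used: the trial layout (dicts with `condition`, `rt`, and `correct` keys), the function name `clean_subject`, the order in which the rules are applied, and the use of an externally supplied reference mean and SD per condition for the outlier rule are all hypothetical, since the text does not specify them.

```python
def clean_subject(trials, ref, sd_cut=3.0, min_set_pct=80, min_subject_pct=70):
    """Apply the three exclusion rules to one subject's trials.

    trials: list of dicts with keys 'condition', 'rt', 'correct' (assumed layout).
    ref:    dict mapping condition -> (mean_rt, sd_rt) for the outlier rule
            (assumption: the reference distribution is not stated in the text).
    Returns the retained trials, or None when the whole subject is excluded.
    """
    by_condition = {}
    for trial in trials:
        by_condition.setdefault(trial["condition"], []).append(trial)

    kept = []
    for condition, cond_trials in by_condition.items():
        # Only correctly answered trials enter the analyses.
        correct = [t for t in cond_trials if t["correct"]]
        # Drop the entire set when fewer than 80% of its answers are correct.
        if 100 * len(correct) < min_set_pct * len(cond_trials):
            continue
        mean_rt, sd_rt = ref[condition]
        # Drop outliers lying more than sd_cut SDs from the reference mean.
        kept.extend(t for t in correct if abs(t["rt"] - mean_rt) <= sd_cut * sd_rt)

    # Drop the subject when fewer than 70% of all presented trials remain valid.
    if 100 * len(kept) < min_subject_pct * len(trials):
        return None
    return kept


# Hypothetical subject: 5 set sizes x 2 trial types x 10 trials = 100 trials.
conditions = [(size, kind) for size in range(1, 6) for kind in ("pos", "neg")]
reference = {c: (500.0, 50.0) for c in conditions}
subject = [{"condition": c, "rt": 500.0, "correct": True}
           for c in conditions for _ in range(10)]
subject[0]["rt"] = 1000.0  # one extreme decision time, removed by the outlier rule
print(len(clean_subject(subject, reference)))
```

Percentages are compared with integer arithmetic so that the 8-out-of-10 and 70-out-of-100 boundaries are not affected by floating-point rounding.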

Discussion

In the current study, the Sternberg Memory Scanning (SMS) task was administered to twins and their non-twin siblings, to investigate the etiology of variation in the Intercept (assumed to reflect basic processing speed) and the linear Slope (assumed to reflect time required to retrieve an item from memory) parameters of this task. A distinction was made between positive trials (target stimulus is part of the set) and negative trials (target stimulus is not part of the set), and the SMS-data were subjected to a non-linear growth curve (nLGC) model. Such a model accommodates measurement error, which provides more reliable operationalisations of the Intercept, Linear slope, and Quadratic slope than difference scores do.
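As a sketch of the model structure described here, the growth curve for the decision time $y_{is}$ of subject $i$ at set size $s$ can be written as below. The symbols and the time coding $\lambda_s = s - 1$ (chosen so that the Intercept corresponds to set size 1) are our assumptions for illustration; the factor loadings actually used in the analyses may differ.

```latex
y_{is} = I_i + L_i\,\lambda_s + Q_i\,\lambda_s^{2} + \varepsilon_{is},
\qquad \lambda_s = s - 1, \quad s = 1, \dots, 5
```

The residual $\varepsilon_{is}$ has a freely estimated variance at each set size, which separates measurement error from the individual differences in $I_i$ and $L_i$. A difference score such as $y_{i5} - y_{i1}$, by contrast, carries the error of both measurements.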

Sex effects were absent for both positive and negative trials. For the positive trials, age effects were only significant for the Intercept, with older subjects requiring more time to decide whether or not a stimulus was part of the target set than younger subjects. For the negative trials, age effects were significant for Intercept, Linear slope and Quadratic slope. The age effects on the slope parameters were negative, suggesting a decrease in the difference in decision time between younger and older subjects with increasing set size. This could be related to the finding that older subjects were slower to begin with. The phenomenon that the magnitude of the reaction to a manipulation or treatment depends on someone’s initial status or performance level is often referred to as the law of initial values (see e.g., Campbell 1981). In the present case, increasing the level of complexity of the task had smaller effects on the speed of subjects who started out slower. Overall, subjects reacted faster to positive trials than to negative trials, regardless of set size. This finding is in line with previous results (e.g., Sternberg 1966).

Although previous studies using selected samples (i.e., encephalitic mental retardates, senior citizens, mnemonists, etc.) have reported large variances in the Linear slope parameter (Cavanagh 1972; Sternberg 1975; Hunt 1980; MacLeod et al. 1978), the small variance of the Linear slope in the current study is comparable to the findings of Neubauer et al. (2000) in another sample of healthy adults. In our study, the variance of the Quadratic slope could be fixed to zero, i.e., the quadratic effect did not differ between individuals. Note that previous studies did not model quadratic effects to describe the increase in retrieval time with increasing set size. The significant quadratic effects in the present study may be due to the large age-range of our sample.

Twin correlations suggested that the variation in the Linear slope (denoting WMS) of both positive and negative trials was not familial. The finding that twin correlations for WMS are small and close to zero is in line with previous studies (e.g., McGue et al. 1984; Neubauer et al. 2000). As measurement error is accommodated in the nLGC model in the form of freely estimated residual variances, the predominance of unique environmental effects for the WMS parameter cannot simply indicate an abundance of noise in the psychometric measurement of WMS. It is, however, possible that more than 10 trials per condition (or even more than 20 if negative and positive trials are combined) are required for a reliable estimate of WMS (i.e., smaller standard errors). Whether more trials would indeed result in a more stable estimate of WMS can be tested by administering an extended version of the SMS-task. The predominance of unique environmental effects for WMS does not necessarily preclude a genuine biological phenomenon. Possibly, working memory retrieval speed depends on the connections formed in the brain following experience, which is not necessarily familial in nature (e.g., van Ooyen and van Pelt 1994, but see also Eroglu et al. 2009). Alternatively, the WMS parameter might mainly depend on the ‘strategy’ subjects use while conducting the SMS-task (i.e., serial versus parallel storage and processing of information), and this choice of strategy may not be familial either. Finally, we would like to note again that the variance of the Linear slope (WMS) was very small to begin with (~.6, i.e., >70 times smaller than the variance of the Intercept (PS)), which greatly reduces statistical power and thus complicates reliable genetic decomposition. In contrast, twin correlations for the Intercept (PS) suggested familial influences.
Genetic analyses of PS showed that additive genetic influences explained 38% of the observed individual differences in positive trials and 32% of the observed individual differences in negative trials, while non-shared environmental influences (E) explained 62% and 68% of the individual differences, respectively. Furthermore, our multivariate models showed that the same genetic effects affected PS for positive and negative trials. For the positive trials, dominance genetic effects were not statistically significant, even though the MZ twin correlations were clearly more than twice as high as the DZ correlations. It is noteworthy, however, that power studies have shown that for intermediate levels of heritability, the statistical power to resolve dominance genetic effects can be quite poor when only data from twins and siblings are available (Eaves 1969; Martin et al. 1978). Moreover, the confidence intervals of the twin correlations were broad, further complicating the distinction between ACE and ADE models. All in all, the present findings are comparable to those reported in previous studies (McGue et al. 1984; Neubauer et al. 2000; Luciano et al. 2001; Polderman et al. 2006).
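The remark about dominance follows from the expected twin correlations under the standard twin models. Under an ACE model, rMZ = a² + c² and rDZ = ½a² + c²; under an ADE model, rMZ = a² + d² and rDZ = ½a² + ¼d². An MZ correlation more than twice the DZ correlation therefore points toward ADE. The Python sketch below solves these equations directly from a pair of correlations. It is a textbook moment approximation with a hypothetical function name, not the maximum-likelihood modeling used in this study (which also included sibling data and covariates), so it is not expected to reproduce the reported estimates of 38% and 32%.

```python
def moment_estimates(r_mz, r_dz):
    """Solve the expected twin correlations for standardized variance components."""
    e2 = 1.0 - r_mz  # unique environment (including measurement error)
    if r_mz > 2.0 * r_dz:
        # ADE: r_mz = a2 + d2 and r_dz = a2 / 2 + d2 / 4
        a2 = 4.0 * r_dz - r_mz
        d2 = 2.0 * r_mz - 4.0 * r_dz
        return {"model": "ADE", "a2": a2, "d2": d2, "e2": e2}
    # ACE: r_mz = a2 + c2 and r_dz = a2 / 2 + c2
    a2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    return {"model": "ACE", "a2": a2, "c2": c2, "e2": e2}


# Twin correlations reported for the Intercept factors in the 6-variate model.
print(moment_estimates(0.45, 0.15))  # positive trials
print(moment_estimates(0.32, 0.16))  # negative trials
```

For the positive trials the point estimates fall on the ADE side, whereas for the negative trials the MZ correlation is exactly twice the DZ correlation. Given the broad confidence intervals around these correlations, such point estimates cannot by themselves distinguish the two models.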

A few limitations of this study should be noted. The age-range in our sample was broad, ranging from 13 to 70 years. However, the number of subjects younger than 20 or older than 60 was small (39 in total), and re-analyses of the data without these 39 subjects showed that the general conclusions remained unaltered.

In the present study, 12 practice trials were presented, followed by 10 trials for each set size. Presenting more trials per condition is certainly advisable for reliable parameter estimation. Low reliability will result in an underestimation of possible genetic influences, as the heritability of a trait can never exceed the reliability. We can therefore not rule out the possibility that our finding that variation in the WMS parameter is non-familial is partly due to the limited number of trials we presented per condition. However, in other studies in which more trials were administered, heritability estimates for WMS were also not statistically significantly different from zero (e.g., McGue et al. 1984: 15 practice trials, and 3 conditions of 30 trials each, 50% of trials are positive; Neubauer et al. 2000: 6 practice trials, and 3 conditions of 16 trials each, 50% of trials are positive).
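The ceiling that reliability places on heritability follows from classical test theory, assuming that measurement error is uncorrelated within twin pairs and therefore ends up in the unique environmental component. With $r_{xx}$ denoting the reliability of the observed score and $h^2_{\text{true}} = \sigma^2_A / \sigma^2_{\text{true}}$ (symbols ours):

```latex
h^2_{\text{obs}}
  = \frac{\sigma^2_A}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}
  = r_{xx}\, h^2_{\text{true}} \le r_{xx},
\qquad
r_{xx} = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}
```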

The ‘working memory model’ as proposed by Baddeley and Hitch (1974) comprises multiple components, i.e., the phonological loop, the visuo-spatial sketchpad, and the central executive system, the latter covering various executive functions such as inhibition, shifting, and updating (Baddeley 1992; Miyake et al. 2000; Friedman et al. 2008). The working memory retrieval speed (WMS), as indicated by the slope parameter of the SMS task, represents only a small part of the full working memory system as envisioned by Baddeley and Hitch. How retrieval speed as operationalized in this study (i.e., the linear slope of the SMS task) relates to other executive functions, such as updating (i.e., the dynamic revision of the content of memory in light of new, relevant information, or the ability to store and process information simultaneously), is still unclear and merits further research.

The present study has several advantages compared to previous studies. First, a distinction was made between positive and negative trials. Although the genetic decomposition turned out to be comparable across positive and negative trials, significant mean differences were observed, and age effects were more pronounced for negative trials. Second, in this study the focus was on decision time only, rather than collapsing decision time and movement time into overall reaction time. Third, rather than using difference scores, a latent growth curve model was used to model the increase in decision time resulting from increasing memory set size, allowing the explicit accommodation of measurement error in the statistical model.

In summary, sex effects were absent on the SMS-task, while age did affect performance, especially on the negative trials. Although genetic influences on working memory speed could not be detected, mainly due to small individual differences, this study showed moderate heritability of processing speed. This suggests that genetic influences on working memory are more likely to act upon processing speed (basic processing speed and (pre)motoric processes) than on working memory speed (i.e., the speed with which an item is retrieved from short term memory).