Over the age of 60 normal healthy adults start to show changes in cognition. Many cognitive functions, which have been slowly improving up to this point, start to plateau, and then decline (Schaie et al. 2004). These changes occur at the same time in life as a number of other challenges and stressors, for example changes in employment and increasing physical health problems. Whether cause or effect, these changes may have significant impact on quality of life despite not constituting dementia. Research into this area is timely as this population is swelling rapidly, the ‘baby boomers’ having started to turn 65 in 2011 (Daffner 2010). Populations show increasing variability in cognitive performance as they age, suggesting that there are factors that enable some individuals to preserve their functions better than others (McDaniel et al. 2007).

Twin studies can investigate the extent to which variation in a population is due to genes or environment. To date, there is sparse evidence that intra-individual differences in cognitive change with age are genetically determined. This stands in contrast to cognitive performance at a single time point, which is highly heritable (Lee et al. 2010; McArdle and Plassman 2009; McGue and Christensen 2002; Reynolds et al. 2005). Evidence for a genetic contribution to change with age includes heritability of brain morphology changes over a four-year follow-up in older males (Pfefferbaum et al. 2004). In addition, using growth curve models, Reynolds et al. have shown that there is heritability of quadratic change in many cognitive tests (Reynolds et al. 2005). There is some evidence that such intra-individual quadratic change is more important than linear change in predicting conversion to dementia states (Small and Backman 2007). Intriguingly, McArdle and Plassman, investigating word recall longitudinally, found a genetic contribution to rate of change before age 74, but predominantly environmental factors driving change after this age (McArdle and Plassman 2009), a finding in common with some other systems in ageing (Steves et al. 2012).

It is well established that processing speed and reaction times are most sensitive to ageing within a normal population (Salthouse 2010). Computerized tests of speed of inspection, decision, and reaction are now possible, yet so far there have been no published longitudinal quantitative genetic studies of reaction time, and only one cross sectional study (Finkel and McGue 2007).

We aimed to describe the quantitative genetics of a battery of computerised tests which include a number of speed measures, including reaction time, choice reaction time, correct inspection time and correct decision time. Our hypothesis was that we would find significant heritability of cognitive ageing with a battery rich in such speed measures.

Methods

Study participants

This study utilised a subset population of the Twins UK registry of female volunteer twins who were recruited in media campaigns from 1992 onwards. The initial campaigns focused on diseases affecting older women predominantly, so the cohort in 1999 consisted of 96 % women, mainly in mid-life to old age. Because of this, and because age-related quantitative traits may have different aetiologies in men and women (Weiss et al. 2006), only women were recruited into this study, which is part of the Healthy Aging Twin Study (Moayyeri et al. 2012; Spector and Williams 2006). Ethical approval for the present study was obtained from Guy’s and St Thomas’ Research Ethics Committee.

In 1999 there were a large number of individuals (2,768) over the age of 43 on the register (this lower age cut-off was arbitrarily chosen to capture the greatest numbers). During 1999, all twins visiting the unit were asked to complete the Cambridge Automated Neuropsychological Test battery (CANTAB). There were no exclusion or selection criteria at this stage. In total, 488 individuals (all women) underwent the CANTAB battery as part of their study visit and their characteristics, compared to the cohort as a whole, are shown in Table 1 and discussed below. The aim of this study was to focus on ‘normal’ ageing and so predefined exclusion criteria for the follow-up study in 2009 were (i) death of one twin, (ii) significant cerebral pathology in one twin and (iii) withdrawal from the study. In 2009, 401 of these met study entry criteria (Fig. 1) and were invited back for repeat testing. Of these, 324 (127 MZ, 197 DZ) were tested, constituting a 66 % follow-up rate of the original 488. Zygosity was assessed by use of single nucleotide polymorphism arrays.

Table 1 Occupational classification, age and verbal IQ: comparison with general population and whole cohort
Fig. 1 Study participants

Between 1993 and 1999 all participants in the cohort were asked to complete two separate occupational questionnaires, both of which scored their occupations during their life using the British Registrar General’s scale. Where individuals changed occupation, their paid occupation of longest duration was chosen. Where individuals had answered both questionnaires, the more recent was used (Table 1).

Table 1 also reports the proportions of employed women in each occupational group in 2010. While there are differences in age, cohort and employment classification between our cohort and the general population, these data show that our cohort of volunteers consists mainly of the middling occupations, and under represents both professional and unskilled occupation categories. However, those in this study, and those unselected and lost to follow-up, are broadly similar.

With regard to age, the means for those selected in 1999 and the unselected are close. There was no significant difference in age between those followed up in 2009 and those lost to follow-up. The National Adult Reading Test (NART), an estimate of verbal IQ, was 114 in the 1999 study population and not statistically significantly different in those lost to follow-up.

Study visits

Informed written consent was obtained for the study which was named ‘Hearing, Learning and Memory’, in an attempt to minimise the impact of stereotypes about ageing (McDaniel et al. 2007). In both visits, twins underwent separate, contemporaneous testing. The NART (24) and CANTAB battery were delivered in the same order (as detailed below), using the same script in 1999 and 2009. Participants were asked to refrain from caffeine for 2 h and alcohol for 12 h before the tests, to continue regular medications, but avoid as-required sedatives for 2 days before the tests. In 1999 additional measures of physical function were also collected as well as detailed questionnaire data. In 2009, the Mini Mental State Examination (MMSE) (Folstein et al. 1975), Geriatric Depression Scale (GDS) (Yesavage et al. 1982), self-reported quality of life (WHOQOL-Bref) (The WHOQOL Group 1994) among other self-reported questionnaires were collected. Both the 1999 and the 2009 studies were conducted in the same institution, with the exception that in 2009 in order for physically frailer individuals to still be included in the study, 14 individuals were tested in their own homes, situated throughout the UK.

Cognitive measures: CANTAB battery

The CANTAB is a series of tests completed on a touch sensitive screen, and using a response button, which has been standardised using a large elderly population (Robbins et al. 1994). The advantages of using this battery are: (i) that it is largely automated and the scripts given to the operator are standardised, thereby reducing operator influence on performance; (ii) it includes multiple measures of processing speed sensitive to age (Der and Deary 2006; Schaie et al. 2004; Verhaeghen and Salthouse 1997); (iii) while it is “computerised,” it was originally developed in the 1980s and its operation is very basic, requiring no knowledge or familiarity with computers; (iv) each test includes several parallel trials which should increase robustness (Salthouse 2012). However, it contains no verbal cognitive measures. Brief descriptions follow but these tests have been described in detail (Robbins et al. 1994).

Motor screening

This is a training procedure to relax the subject and introduce them to the equipment. It screens for problems with vision, motor function and comprehension. A series of crosses appear on the screen in different positions and the subject is asked to touch them.

Pattern recognition memory (PRM)

The subject is asked to remember simple patterns presented one at a time. They are then presented with the familiar pattern and a colour matched pattern of similar complexity. They are asked, “touch the one which you saw before”. The task involves elementary comparison of only two choices, and the accuracy rate is high (87 % in our sample), so this task gives a measure best described as inspection time. Outcome measure: mean latency of a correct response over 24 trials (lowest number of correct trials was 14 in our sample).

Delayed matching to sample (DMS)

The subject is shown a complex visual pattern made of four coloured sub-elements, and, after a brief delay, four other patterns, one of which is the correct one. She is instructed, “touch the one you saw before”. Again, no suggestion is given to the subject that speed is the measure of interest. There are three practice trials, after which no further instruction is given. In this test the patterns are moderately complex and there are multiple choices and varying delays. The accuracy rate is high, at 84 % in our sample. For these reasons, it is best described as a measure of decision time. Outcome measure: mean correct latency over 40 trials (lowest number of correct trials was 23 in our sample).

Paired-associates learning (PAL)

This is an episodic memory task. The subject is presented with up to eight individual shapes within eight boxes differently located on the screen. She is then asked to locate the box corresponding to each shape, presented one after the other. Until all responses are correct, the locations are shown again and the subject asked to repeat the location. Outcome measure: number of errors made, adjusted if the participants could only correctly match fewer than eight boxes.

Spatial span (SSP)

This is a computerised version of the Corsi Blocks task assessing spatial working memory capacity. White boxes on the screen change colour in a sequence. The subject is asked to repeat the sequence back. Outcome measure: longest sequence remembered correctly, given up to three attempts.

Spatial working memory (SWM)

The test begins with a number of coloured squares (boxes) on the screen. The subject searches for blue tokens, which are hidden one at a time, behind the boxes. She must not revisit previously filled boxes. The aim of this test is that, by a process of elimination, the subject should find one blue ‘token’ in each of the boxes and use them to fill up an empty column at the side. If the subject completes the trial, finding all tokens, the number of boxes increases to a maximum of eight. The first three-box trial is used to explain the test and is not included in the measure. This tests the subject’s ability to retain spatial information and to manipulate remembered items in working memory. It is a self-ordered task that also assesses heuristic strategy, and it is considered to measure frontal lobe and ‘executive’ function (Owen et al. 1990). Outcome measure: number of ‘between’ errors made (boxes opened which previously contained a token).

Reaction time (simple (RTIS) and five-choice (RTIFC))

This measures the speed of response to a visual target (a yellow spot appearing on the screen). The subject places her finger on a button (press pad) and is asked to touch the target on the screen with the same finger, as fast as she can. She is instructed, “Don’t let go of the button until you see the spot.” The measure is taken between stimulus presentation and the moment her finger leaves the press pad, thus eliminating most of the movement time involved. Trials are aborted and repeated if the subject removes their finger before the presentation of the target. Targets are presented with varying pre-stimulus duration. There are practice stages containing a minimum of 30 trials. Outcome measures: simple reaction time (RTIS), where the stimulus is predictable in the centre of the screen (mean of 8 correct trials), and five-choice reaction time (RTIFC), where the stimulus could be in one of five evenly spaced locations (mean of 8 correct trials).

Analysis and statistical methods

Two types of analysis were conducted, cross sectional analysis at each time point, and analysis of change. The Stata 11 statistical package was used for all transformations, factor analysis and regression modelling (StataCorp 2009).

For the cross sectional analysis at each time point, raw scores were transformed as appropriate. Table 2 gives descriptive statistics of the raw and transformed scores. In addition, a general battery measure was produced using the first un-rotated factor from exploratory factor analysis of all seven transformed measures at each time point separately. This method is frequently used to determine a general factor ‘g’ (Harris and Deary 2011). These cross-sectional first factors captured 75 and 76 % of the shared variance in 1999 and 2009 respectively. Standard univariate structural equation modelling of the contribution of genetic effects and environments to variability in these data was carried out using the open-source Mx software (Neale 1994). The stability of these effects over time was then analysed using a bivariate Cholesky decomposition.
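The logic behind univariate twin modelling can be illustrated with a much simpler moment-based calculation. The sketch below is not the maximum-likelihood Mx model fitted in this study; it uses Falconer's approximation, in which the ACE model implies a twin-pair correlation of A + C for MZ pairs and ½A + C for DZ pairs on a standardised trait. The function name and the input correlations are hypothetical and chosen only for illustration.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Moment-based ACE estimates from twin-pair correlations.

    Under the ACE model for a standardised trait:
        r_MZ = A + C
        r_DZ = 0.5 * A + C
    Solving these two equations gives the estimates below. This is a
    simplification of structural equation modelling, not a replacement.
    """
    a = 2.0 * (r_mz - r_dz)   # additive genetic share of variance
    c = 2.0 * r_dz - r_mz     # shared-environment share
    e = 1.0 - r_mz            # unique environment plus measurement error
    return {"A": a, "C": c, "E": e}


# Hypothetical correlations (assumption, not data from this study).
estimates = falconer_ace(r_mz=0.60, r_dz=0.30)
print(estimates)
```

When the DZ correlation is half the MZ correlation, as in this hypothetical input, the shared-environment term vanishes, which corresponds to the AE pattern.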

Table 2 Descriptive statistics of raw test scores

For analysis of change, we first conducted longitudinal analysis of the change in each cognitive measure. As is universally the case in follow-up studies, there was a significant relationship between raw difference score and baseline score consistent with regression to the mean (Barnett et al. 2005), and mainly for this reason the reliability of raw difference scores is low. This effect is alleviated by using ANCOVA to define change, as is routinely used in the analysis of clinical trial data (Twisk and Proper 2004). For each cognitive measure we used the raw difference scores adjusted for baseline by fitting the baseline test result as a linear covariate. This gives a measure indicating the change in score over 10 years assuming all individuals started at the same baseline performance. It is mathematically the same as fitting the baseline as a covariate with the follow-up measurement as the outcome variable (Twisk 2003), which is the definition of cognitive ageing used by others (Deary et al. 2012). For all tests a positive change score translates as ‘improvement’ over 10 years, and a negative score is decline (scores for SSP have been reversed to allow this) (Table 3).
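The equivalence between the two formulations of baseline adjustment can be checked numerically. The following is a minimal sketch on simulated scores (not study data), assuming that the adjusted change score is taken as the residual from an ordinary least squares fit; the helper `ols_residuals` is a hypothetical name introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated scores for illustration only (assumption, not study data).
n = 200
baseline = rng.normal(50.0, 10.0, size=n)
followup = 0.7 * baseline + rng.normal(10.0, 5.0, size=n)


def ols_residuals(y, x):
    """Residuals from regressing y on an intercept and x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


# Change adjusted for baseline: (follow-up - baseline) regressed on baseline.
adjusted_change = ols_residuals(followup - baseline, baseline)

# Follow-up adjusted for baseline: follow-up regressed on baseline.
adjusted_followup = ols_residuals(followup, baseline)

# The fitted baseline slopes differ by exactly 1, so the residuals coincide.
print(np.allclose(adjusted_change, adjusted_followup))
```

By construction the adjusted scores are uncorrelated with baseline, which removes the dependence of raw difference scores on starting level.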

Table 3 Descriptive statistics of change scores

Univariate structural equation models were then fitted to the raw adjusted change scores to determine the contribution of genetic effects and environments to variability in these change measures.

Secondly, in the analysis of change, we attempted to capture change shared between all the tests in the battery. Cross sectional studies show some support for the concept that age-related influences on a wide range of cognitive variables are shared (Verhaeghen and Salthouse 1997). We have further hypothesised that change in cognition over time in the same individual in different tests should have a large common or shared component which is greatest in older individuals. This might be due to an underlying factor or factors, such as degenerative change, which affect change in all the test scores for an individual over time. Indeed, there is significant positive correlation between the change scores, and each is associated with age (Table 4). Using Stata 11, we performed exploratory factor analysis of the seven adjusted delta scores (communalities set to 1, no rotation). This is a similar method to that which is used to identify general cognitive ability at a single time point (Harris and Deary 2011), but the difference is that here we apply it to change scores. To our knowledge (and to the knowledge of Cambridge Cognition, the suppliers of the software), there has been no previous factor modelling applied to change in CANTAB scores. For these reasons, exploratory factor analysis was appropriate and the principal-component factor method in Stata was used. This produced three factors with eigenvalues >1 (the four largest eigenvalues were 1.76, 1.31, 1.12 and 0.86). Factor 1 explained 25 % of the variance and is strongly associated with age (standardized beta −0.066, p < 0.001; a scatter plot against age is shown in Fig. 2). This factor is referred to as the age-related change (ARC) factor in further analysis. Factors 2 and 3 were not associated with age and explained 19 % and 16 % of the variance respectively. Factor loadings are reported in Table 5, showing that the greatest loadings for the ARC factor are on tests with a speed component. While further rotation produces factors more easily interpretable as reaction time, decision time and working memory (data not shown), these are all equally associated with age and none of them supports the heuristic of an age-related change factor.
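The extraction step can be sketched as an eigen-decomposition of the correlation matrix, which is what a principal-component factor solution with communalities fixed at 1 and no rotation amounts to. The code below is an illustrative reconstruction on simulated data, not the Stata procedure used in the study; the function name and the simulated scores are assumptions.

```python
import numpy as np


def principal_component_factors(data):
    """Unrotated principal-component factors from the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]            # largest eigenvalue first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    loadings = eigenvectors * np.sqrt(eigenvalues)   # unrotated factor loadings
    proportion = eigenvalues / eigenvalues.sum()     # eigenvalues sum to the number of variables
    return eigenvalues, loadings, proportion


# Simulated change scores for seven tests sharing one common influence
# (assumption, for illustration only).
rng = np.random.default_rng(1)
common = rng.normal(size=(300, 1))
scores = 0.5 * common + rng.normal(size=(300, 7))

eigenvalues, loadings, proportion = principal_component_factors(scores)
retained = int(np.sum(eigenvalues > 1.0))            # eigenvalue > 1 rule
print(retained, proportion[:3].round(2))

# With seven variables, an eigenvalue of 1.76 corresponds to 1.76 / 7,
# i.e. about 25 % of the total variance.
print(round(1.76 / 7, 2))
```

The same arithmetic links the other reported eigenvalues to their variance shares: 1.31 / 7 is about 19 % and 1.12 / 7 is about 16 %.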

Table 4 Pairwise correlations between change measures
Fig. 2 Relationship between age and ARC factor score. y-axis: first principal component of factor analysis of all the change scores adjusted for baseline performance. Scores below 1 indicate declining function over 10 years. x-axis: age at second time point

Table 5 Factor loadings for factor analysis of change scores

Finally, using phenotypic information collected on these twins in 1999, we tested whether the ARC was associated with known risk factors for cognitive change (Hendrie et al. 2006): vascular risk factors and physical performance level. For this we used multiple linear regression (generalised estimating equation model, adjusting for family structure (Carlin et al. 2005)).

Results

Study population characteristics

The study population was well educated and had relatively high self-reported socio-economic status in 2009 (Table 6). The average age at first testing was 55.7 (SD 7.32, range 43–73), and at the second testing 65.7 (see Fig. 2). The average MMSE score at the end of the study was 29, similar to a normal ‘healthy’ population, and a significant minority gave a history of depressive illness (17 %) and current GDS > 5 (7 %). 18 % were on a medication identified as being potentially psychoactive, such as opiate analgesia or antidepressants (Table 6). Such potential confounders were included as covariates in the regression models, but as there were no significant differences between monozygotic and dizygotic twins in these population characteristics, they were not included in the twin modelling.

Table 6 2009 Cohort characteristics

Cross sectional heritability estimates

In almost all measures, the heritability estimates slightly increased over the 10 years of follow-up (Fig. 3; Supplementary Table S1). The heritability estimates from a general battery factor (first factor, or ‘g’) at each time point were 55 % in 1999, and 76 % in 2009 (Table 7). The factor loadings for the tests were slightly different for 1999 and 2009, especially with regard to DMS and PRM. Measurement invariant scores were computed using the factor loadings from 2009 data. Heritability of these composite measures was almost identical to the estimates in Table 7 (best models AE, with A estimates 75 % (63–83 %) in 2009 and 54 % (36–68 %) in 1999). As is the case in twin studies of many ageing traits, most of the best models were AE models, and the estimated effect of shared environment was minimal. MZ and DZ correlations are found in Supplementary Table S2.

Fig. 3 Cross sectional heritability estimates for each cognitive measure. Best fitting model heritability point estimates for each of the tasks and 95 % confidence intervals in the two waves of testing. The best models were CE for PAL errors, RTIS or SWM between errors in 1999

Table 7 Heritability of ‘g’ or general cognitive ability factor (first principal component of all transformed measures) calculated in 1999 and 2009

Bivariate analysis

Bivariate modelling of test results at both time points for each individual produced estimates for the variance contributions of A, C and E similar to those reported in the univariate analysis, again showing marginally higher heritabilities at the later time point (see Supplementary Table S3). For the individual measures, models equating these estimates did not have significantly worse fit compared to the saturated models, but the trend over all of them is notable. In the case of the general battery measure, equating the estimates for the two time points trended towards a worse fit (p = 0.078). The correlation (r) of the genetic factors contributing to variance across time was very high (0.92–1.0) in all but one of the tests (RTIS) indicating that the genetic factors were largely the same over the 10-year period. Interestingly, E, which includes environmental factors unique to that individual and measurement error, was also, in all cases, significantly correlated between the time points, indicating substantial stability in the environmental factors underpinning variance in cognitive performance over the 10-year period as well. This correlation is strongest in the episodic memory measure PAL (45 %).
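For readers unfamiliar with the bivariate Cholesky approach, the genetic correlation across time points follows directly from the estimated path coefficients. The sketch below shows the calculation under the usual parameterisation, with a lower-triangular matrix of additive genetic paths; the path values are hypothetical and are not estimates from this study.

```python
import numpy as np

# Hypothetical lower-triangular Cholesky paths for the additive genetic
# component (row 1: 1999, row 2: 2009). Illustrative values only.
paths_a = np.array([[0.70, 0.00],
                    [0.75, 0.20]])

# Genetic covariance matrix implied by the paths: A = L L^T.
cov_a = paths_a @ paths_a.T

# Genetic correlation between the two time points.
r_g = cov_a[0, 1] / np.sqrt(cov_a[0, 0] * cov_a[1, 1])
print(round(float(r_g), 3))
```

A small second-row, second-column path relative to the cross-time path, as in this hypothetical matrix, yields a genetic correlation close to 1, meaning that few new genetic influences enter at the later time point. The same calculation applied to the E paths gives the cross-time correlation of unique environmental influences.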

Longitudinal study results

The heritability estimates for change in cognition over time (absolute differences in scores adjusted for baseline performance) were, as expected, lower than the cross sectional estimates, but there was still a significant effect of a genetic factor in 5 of the 7 tests (Table 8). The age-related change factor computed from all the change scores together was significantly heritable at 47 % (4–62 %), but in the other two factors heritability estimates did not reach significance. While we do not feel age adjustment is appropriate, because relevant data associated with age would be discarded (see “Discussion and conclusion”), for completeness we have supplied age-adjusted data for the three factors underlying the change scores in Supplementary Table S4, where the heritability of the age-related change factor is reduced to 20 % (0–41 %). Likewise, estimates of heritability for rotated factor solutions were lower: 13 % (0–46 %) for reaction time, 36 % (0–57 %) for decision time and 13 % (0–55 %) for working memory (Supplementary Table S5).

Table 8 Heritability of change measures

Using phenotypic information collected on these twins in 1999, we tested whether the ARC was associated with known risk factors for cognitive change (Hendrie et al. 2006): vascular risk factors and physical performance (here indicated by leg power on the Nottingham Power Rig). We found that, as well as being significantly negatively associated with age, the ARC factor was also negatively associated with history of diabetes (p < 0.001), systolic blood pressure (p = 0.018) and leg power (p < 0.001) at baseline, despite controlling for age, education and occupation status. However, no significant association was found with body mass index. The ARC factor was strongly positively associated with quality of life (WHOQOL-Bref) at follow-up (p = 0.002), again controlling for age, education and occupation status.

To check whether the heritability in our population is likely to be biased by the relatively higher socioeconomic mix of the cohort, we conducted ad hoc subgroup analysis. Heritability of the ARC factor was calculated in the 27 twin pairs where both twins classified themselves as professional/managerial in 2009 and in the 78 twin pairs where both classified themselves as non-professional/managerial. Supplementary Table S6 shows that in our sample the non-professional subgroup had a higher heritability of the ARC factor (66 %, 19–82 %).

Discussion and conclusion

This study uses the same cognitive tests in the same older women, performed under the same protocol 10 years apart, and its main finding was significant heritability of change in cognition over the 10 years of the study in 5 of the 7 tests. Moreover, 47 % of the variance in the age-related change factor was found to be attributable to genetic factors. This factor was also predicted by history of diabetes, systolic blood pressure and physical fitness. This lends face validity to this factor as a measure of important cognitive change, and echoes other literature showing the importance of vascular risk factors and physical activity in contributing to longitudinal cognitive change (Hendrie et al. 2006).

Estimates for the heritability of each measure were all marginally higher at the second time point. While for each individual measure models equating estimates for the time points were not significantly different, for the general measure (first factor) such a model trended towards a worse fit, and the consistent trend over all of the tests is apparent, suggesting a definitive difference would have been seen with a larger sample size. The factor loadings of the general measure were slightly different so their definition was not strictly consistent. Computation of scores for each time point based on the factor loadings in 2009 yielded almost identical heritability estimates, indicating that any difference in heritability of ‘g’ is not likely to be due to measurement non-invariance between the two time points.

Heritability of change in variables showing increasing heritability with age is found in some other organ systems in ageing studies, for example motor and lung function (Finkel et al. 2003). Increasing heritability of cognitive function with age would be in keeping with early studies and those focusing on the age range from adolescence into adulthood (Deary et al. 2009), but contrary to more recent studies focusing on older age groups (Lee et al. 2010), and may reflect the relatively young age range of this cohort. Of course, factors differing between the two time points include age, but also include any other development happening within the cohort in this time period.

Heritability of change in cognition is contrary to what has been found in several other twin studies. The Longitudinal Study of Aging Danish Twins (LSADT) repeatedly showed substantial heritability for the intercept parameter and no significant heritability for the slope, testing in waves every 2 years for up to 10 years (McGue and Christensen 2002, 2007). The Swedish Adoption Twin Study of Aging (SATSA) followed 798 non-demented individuals (including 268 complete twin pairs with two or more time points) in four waves spanning over 13 years, measuring WAIS-R subtests and a standard Swedish ability battery. Numerous studies of this dataset have consistently shown genetic influences on the level of cognitive functioning but only slight, or insignificant, influence on the linear slope of change. For example, the heritability of the first principal component measure of general cognitive ability from the SATSA was estimated to be 91 % for the intercept, but only 1 % for the slope (Reynolds et al. 2005). The lack of finding for heritability for linear change has been noted to be an unexpected failure (McGue and Johnson 2007). However, Reynolds’ study did find substantial heritability for quadratic change ranging from 33 to 75 % for non-verbal tests. Our study with only two time points cannot measure quadratic change, but the finding of heritability of the first factor of change in all tests (ARC) in the current study provides some replication for this finding, in that this measure itself changes with age. There is evidence that accelerating change may be more important in predicting cognitive morbidity and mortality (Small and Backman 2007). Indeed, accelerating memory decline was seen to be greater in non-demented individuals with specific genetic risk for dementia (TOMM40 L and APOE e4) (Caselli et al. 2012).

There is good evidence that studying speed is of value in cognitive ageing. Age related variance in a range of cognitive variables has been noted to be shared with speed, showing in particular, a close relationship between speed, working memory and fluid intelligence (Verhaeghen and Salthouse 1997). Choice reaction time was found to be significantly associated with mortality in the same sample followed up to age 70, and a stronger predictor of death than general intelligence (Deary and Der 2005a, b). Salthouse contended that age-related increases in variability, which are classically found in computerized tests of speed, are not seen in paper and pencil tests (Salthouse 2000).

There appear to be some differences in the quantitative genetics of speed and working memory measures compared to other cognitive measures. Cross sectional data show that, contrary to other cognitive measures in advanced age, speed measures continue to increase in heritability from 70 to 80 years of age (Lee et al. 2010). The Swedish Adoption Twin Study of Aging (SATSA) longitudinal study shows heritability of change in some cognitive domains, specifically block design, card rotations and digit span (Reynolds et al. 2005). The WWII veterans’ twin study in the United States has shown that over 9–13 years there is substantial heritability of decline in perceptual speed (48 % at 13 years), but not over 4 years (Lessov-Schlaggar et al. 2007). These studies measured perceptual speed using paper and pencil tests, most commonly the Digit Symbol Substitution Test (DSST), a timed code substitution test.

The CANTAB battery used in the present study was rich in speed measures, and we report their heritability. Interestingly, individual speed measures (and rotated factors separating speed components) were not as heritable as the ARC factor. This would suggest that the heritability for the ARC in the present study is either not related to speed, or related to a speed element which is shared between reaction time, decision time and working memory.

There are some limitations to this study. There were only two time-points, which limits analysis of change, but the interval between testing is likely to be long enough to avoid significant learning effect. The cohort in this study ranges from mid-life to early old age, so findings reflect developments in late adulthood or the “third age” (Baltes 1997). In later old age, the relative contribution of genetic factors to variation in cognitive ageing may decrease (McArdle and Plassman 2009). The possibility of pre-clinical protopathic bias was minimised by visiting frailer participants at follow-up, and the response rates overall were good.

The study population was volunteer women only, who are generally educated and mainly represent the middling occupations (intermediate and skilled workers). The study sample was unselected from within the whole cohort population and its demographics are similar to the cohort as a whole. In terms of the implications of the socio-economic status of the cohort for heritability estimates, there has been work in younger individuals suggesting a gene × environment interaction between socio-economic status and genetic factors (such that there is higher heritability of cognitive abilities in more educated and higher socio-economic groups). However, other significant recent work contradicts this (Hanscombe et al. 2012). The only study of which we are aware in adult women suggests no moderating effect of additive genetic effects on cognition in different socio-economic backgrounds (van der Sluis et al. 2008). Indeed, subgroup analysis of our study found that the lower socioeconomic group (non-professional/managerial) had higher heritability of the ARC factor than the professional group. This would suggest that any bias introduced by having a cohort of relatively higher socioeconomic status would lead to our under-estimating heritability of age-related change. Nevertheless, the domain of this study is older volunteering women, and further studies would be needed to ascertain whether the findings also apply to less educated and unskilled groups, and in particular, to men.

CANTAB testing is very basic to perform, having been devised in the 1980s. It does not require any computer/technology knowledge, so we did not anticipate that recent computer use would give a significant operative advantage. We did not, however, prospectively ask the subjects about their computer use/familiarity, so are unable to test whether variation in experience/familiarity with computers could, to some extent, drive the measured change. The CANTAB battery did not include verbal measures, so the general battery factors created will not necessarily be synonymous with other measures of general cognitive ability (Deary et al. 2009). However, there are some advantages to not including verbal or arithmetic measures. Firstly, there is a body of literature suggesting that speed measures using verbal, lexical or arithmetic information may be less sensitive to the effects of age than speed measures using non-verbal, non-lexical measures (as in this battery) (Salthouse 2000). Secondly, the effects of education/occupation may be reduced in non-verbal as compared with verbal/numerical measures.

We did not feel adjustment for baseline age was appropriate in this case. There was a significant relationship between the change scores and age, but controlling for this would entail discarding change associated with age, and this is specifically of interest to this study (time is part of the causal chain mediating change in cognition). Age does however constitute a shared environment of twins, and so can lead to overestimating the effect of shared environment. In this case, C was never found to be significant and Supplementary Table S4 shows that age adjustment does not significantly alter the estimates of C.

In conclusion, we found substantial heritability of age-related change in cognition. This constitutes the second paper reporting that genetic factors may have more influence on accelerating change, as opposed to linear change. Importantly, this sort of change, which may be more predictive of subsequent morbidity, may be a better phenotype in the search for genetic pathways involved in cognitive ageing.