Clinically significant cognitive decline resulting in mild cognitive impairment (MCI) affects approximately 15–20% of adults aged 65 or older, influencing 5.9 million Americans who later develop dementia due to Alzheimer’s disease (AD) (Alzheimer’s Association, 2019). AD leads to brain atrophy, initially in medial temporal lobe structures of the hippocampus (Henneman et al., 2009; Lockhart & DeCarli, 2014). The specific biomarkers associated with an AD diagnosis are beta amyloid (Aβ) plaques and tau neurofibrillary tangles, which can be measured via positron emission tomography (PET) or in cerebrospinal fluid (CSF) (Jack et al., 2018). Together, Aβ, tau, and neurodegeneration form the AT(N) criteria of the National Institute on Aging and Alzheimer’s Association (NIA-AA) Research Framework and represent the three biological markers of neuropathology that indicate the presence and severity of AD (Jack et al., 2018). In the absence of an effective treatment strategy, it is important to identify factors that can slow the progression to dementia, especially since delaying the onset of dementia yields notable public health savings (Brookmeyer et al., 1998; Zissimopoulos et al., 2014) and helps maintain quality of life.

Cognitive reserve (CR) may be one mechanism through which individuals are protected against clinically significant cognitive decline even in the presence of neuropathology (Stern, 2002; 2009; Cabeza et al., 2018; Stern et al., 2020). The concept is based on the notion that sociobehavioral proxies such as education, intellectually engaging occupation, and various other activities help build more resilient neuronal networks that shield cognitive function even as AD biomarkers point to progressing neuropathology (Stern, 2012). CR is expected to moderate the association between neuropathology and cognitive performance; that is, individuals with high CR show greater resilience against AD-related neuropathology (Arenaza-Urquijo & Vemuri, 2018; 2020).

Research has identified two common operationalizations of CR—one which uses the actual sociobehavioral proxies and the other which uses residual variance approaches to estimate CR (Menardi et al., 2018; Nilsson & Lövdén, 2018; Wang et al., 2019; Stern et al., 2020). The latter quantifies CR by calculating the difference between actual and predicted cognitive performance on neuropsychological tests, where predicted performance is estimated relative to underlying neuropathology (Nilsson & Lövdén, 2018). Both operationalizations of CR indicate that higher levels of CR are associated with a reduced risk for dementia progression, whether simultaneously accounting for underlying AD neuropathology (Reed et al., 2010; Zahodne et al., 2015) or not (Allegri et al., 2010; Andel et al., 2005; Clouston et al., 2015; Dekhtyar et al., 2019, 2015; Karp et al., 2004; Kröger et al., 2008; Marioni et al., 2012; Mazzeo et al., 2019). The lack of a uniform measurement of CR is considered a major shortcoming by some (Menardi et al., 2018; Nilsson & Lövdén, 2018).
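The residual variance idea can be illustrated with a minimal sketch, assuming a single neuropathology marker and ordinary least squares; published implementations typically use several markers and latent-variable models. The function name and all data values below are hypothetical.

```python
from statistics import mean

def residual_reserve(cognition, pathology):
    """Residuals from a simple least-squares regression of cognition on one pathology marker."""
    mx, my = mean(pathology), mean(cognition)
    sxx = sum((x - mx) ** 2 for x in pathology)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pathology, cognition))
    slope = sxy / sxx
    intercept = my - slope * mx
    # A positive residual means performing better than the pathology level predicts
    return [y - (intercept + slope * x) for x, y in zip(pathology, cognition)]

# Hypothetical values: an atrophy score (higher = worse) and a memory test score
atrophy = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3]
memory = [52.0, 47.0, 49.0, 40.0, 43.0, 35.0]
reserve = residual_reserve(memory, atrophy)
```

Under this sketch, a participant with a positive value in `reserve` would be treated as having higher CR than a participant with the same atrophy score and a negative value.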

Focus of the Current Review

Associations between CR and levels of AD neuropathology have consistently shown that individuals with higher CR are able to endure greater levels of neuropathology than individuals with low CR before cognitive deficits or clinical impairment become apparent (Bartres-Faz & Arenaza-Urquijo, 2011; Hoenig et al., 2017; Menardi et al., 2018; Rentz et al., 2017; Stern, 2009; Stern et al., 2020). However, when studying incident dementia, many researchers investigating the effect of CR on risk of dementia progression do not include measures of neuropathology in their assessment, thereby limiting a thorough test of the CR hypothesis (Stern et al., 2020). A recent review assessed prospective longitudinal studies to describe the associations between CR, AD biomarkers, and cognitive/clinical outcomes in participants who were cognitively normal at baseline (i.e., preclinical AD-dementia) (Soldan et al., 2018). However, its focus on how CR was related to multiple outcomes, including onset of clinical symptoms of MCI, changes in cognition, and changes in AD biomarkers, precluded a quantitative examination and limited conclusions regarding the effect of CR on dementia progression.

To adequately assess the CR hypothesis, we identified longitudinal cohort studies through a systematic review and meta-analysis to assess the extent to which CR is protective against incident MCI or dementia after controlling for AD-related structural pathology and biomarkers. A second goal was to examine whether operationalizations of CR (residual of cognition after accounting for neuropathology vs. CR proxies like education or occupation) yield different outcomes in terms of the CR-incident dementia relationship. To operationalize CR, we chose to focus on composite proxies of CR rather than single indicators. CR is an abstract concept that inherently involves multiple factors. Therefore, composite proxies are likely a better representation of CR than single factors. We hypothesized that CR would protect against dementia progression controlling for AD-related structural pathology and biomarkers. Further, based on some previous research (Reed et al., 2010; Zahodne et al., 2013), we expected that residual variance may be more strongly related to dementia incidence than CR measured with a composite of common proxies.


Literature Search

Embase, PsycINFO, PubMed, and Web of Science were searched for relevant articles through February 2020. An updated search in September 2020 identified no additional relevant studies. Database searches included natural language terms searched in the title and abstract (PsycINFO and PubMed), topic (Web of Science; which includes the title, abstract, author keywords, and keywords plus), or the title, abstract, and keywords (Embase). Further, relevant controlled vocabulary for search terms was included where applicable (i.e., Emtree, MeSH terms, and APA Thesaurus of Psychological Index Terms). Natural search terms included the topics of CR, progression, AD-related structural pathology and biomarkers, and mild cognitive impairment or Alzheimer’s disease. These four topics were combined with the AND operator. Each of the four aforementioned topics had a search string that was combined with the OR operator. Where applicable, asterisks were used to generate articles using different forms of relevant words (e.g., progress* would yield both progressing and progression). CR terms included: cognitive reserve, cognitive capacity, brain reserve, neural reserve, brain maintenance, and residual variance. Progression terms included: transition, cognitive decline, cognitive deterioration, progress*, conver*, neurodegeneration, risk, incident, and longitudinal. We use the term progression to indicate a change in diagnosis from cognitively intact or mild impairment to a later diagnostic stage. AD-related structural pathology and biomarkers terms included: magnetic resonance imaging, MRI, grey matter, gray matter, white matter, positron emission tomography, PET, beta amyloid, and tau. Cognitive impairment terms included: mild cognitive impairment, MCI, Alzheimer*, AD, dement*, mild neurocognitive disorder, and major neurocognitive disorder. The full search strategy is available as Supplemental Table 1 in Online Resource 1.
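The Boolean structure described above (OR within each topic, AND across the four topics) can be sketched as follows. The term lists are copied from the text; the quoting rule and the function name are assumptions, and database-specific field tags and controlled vocabulary are omitted, so this is not the exact strategy given in Supplemental Table 1.

```python
# Natural-language terms for the four topics, as listed in the search description
TOPICS = {
    "cognitive_reserve": [
        "cognitive reserve", "cognitive capacity", "brain reserve",
        "neural reserve", "brain maintenance", "residual variance",
    ],
    "progression": [
        "transition", "cognitive decline", "cognitive deterioration",
        "progress*", "conver*", "neurodegeneration", "risk", "incident",
        "longitudinal",
    ],
    "pathology": [
        "magnetic resonance imaging", "MRI", "grey matter", "gray matter",
        "white matter", "positron emission tomography", "PET",
        "beta amyloid", "tau",
    ],
    "impairment": [
        "mild cognitive impairment", "MCI", "Alzheimer*", "AD", "dement*",
        "mild neurocognitive disorder", "major neurocognitive disorder",
    ],
}

def build_query(topics):
    """Join terms with OR inside each topic, then join the topics with AND."""
    groups = []
    for terms in topics.values():
        # Quote multi-word phrases so they are searched as phrases (assumed convention)
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)

query = build_query(TOPICS)
```

Printing `query` gives a single string with four parenthesized OR groups joined by AND, which would still need adapting to each database's field syntax.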

Selection of Studies

Identified studies (N = 1,077) were first assessed for duplicate records. After removing duplicates (n = 452), 625 records were screened for inclusion based on their title and abstract. After excluding articles that did not match the following inclusion and exclusion criteria (n = 564), 61 articles were then assessed by full text review. Articles were included if they were published in English, had a longitudinal study design investigating risk of progression to incident dementia (either MCI, AD-dementia, or all-cause dementia), included a measure of CR (either residual variance or composite proxy), included a structural (volumetric) measure of the brain that could index AD-related structural pathology (e.g., total gray matter, hippocampus) or AD-related biomarkers (i.e., Aβ, tau), and reported hazard ratios (HRs) that could be included in our meta-analysis.

Specific exclusion criteria included: wrong study design (i.e., cross-sectional studies, case studies, reviews, meta-analyses, editorials, book chapters, commentaries); gray literature (i.e., conference abstracts); animal studies; a focus on other neurological conditions (Parkinson’s disease, Huntington’s disease, epilepsy, stroke, traumatic brain injury, multiple sclerosis, amyotrophic lateral sclerosis, multiple system atrophy, or normal pressure hydrocephalus); an exclusive focus on varieties of dementia other than AD-dementia, given their limited prevalence and different etiology (frontotemporal dementia, vascular dementia, or dementia with Lewy bodies; i.e., studies reporting incident all-cause dementia were included, but studies reporting exclusively on incident vascular dementia were excluded); a focus on mental health conditions that could influence cognition (e.g., depression, anxiety, schizophrenia, bipolar disorder, or post-traumatic stress disorder); a focus other than dementia (e.g., post-surgery delirium); or a focus on cognitive decline rather than the onset of a clinical diagnosis of MCI or dementia. Two authors jointly reviewed the nine selected articles according to the inclusion and exclusion criteria and reached 100% agreement upon their inclusion in the systematic review and meta-analysis (see Fig. 1 for the PRISMA flow chart; Moher et al., 2009). To assess the robustness of our selection criteria, the second author independently reviewed a random selection (n = 10) of full-text articles to verify agreement with their respective inclusion or exclusion (100% agreement).

Fig. 1

Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) chart illustrating the process for final selection of articles. Search terms included: cognitive reserve, cognitive capacity, brain reserve, neural reserve, brain maintenance, residual variance, transition, cognitive decline, cognitive deterioration, progress*, conver*, neurodegeneration, risk, incident, longitudinal, magnetic resonance imaging, MRI, grey matter, gray matter, white matter, positron emission tomography, PET, beta amyloid, tau, mild cognitive impairment, MCI, Alzheimer*, AD, dement*, mild neurocognitive disorder, and major neurocognitive disorder

Study Quality Assessment

Quality of selected studies was assessed with the Newcastle–Ottawa Scale (Wells et al., 2019) for cohort studies, whereby studies are rated based on criteria related to selection, comparability, and outcome. Criteria pertaining to selection include the representativeness of the exposed and unexposed cohorts, ascertainment of exposure, and assessment of incidence (not just prevalence) of the outcome. Criteria for comparability pertain to adjustment for possible confounding. Criteria for outcome include information regarding assessment of the outcome, adequate length of follow-up to incidence, and the description of any differences in follow-up availability between the exposed and unexposed cohorts.

Data Extraction

Data from studies meeting inclusion criteria were extracted and reviewed for accuracy by two authors. Data regarding study characteristics included: sample size, source of study sample, length of follow-up, and demographic variables of study participants (age, gender, race/ethnicity, and education). Data were also extracted about the diagnostic criteria used to make an MCI or dementia diagnosis, the measure of CR, what type of AD neuropathology was controlled for, study outcomes, and the hazard ratio and 95% confidence intervals (CIs) associated with a one unit increase in CR. Where hazard ratios and 95% confidence intervals were unavailable, the authors reached out to the corresponding author of each study to obtain these estimates.
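Pooling hazard ratios generally requires each estimate on the log scale together with a standard error. The text does not state how standard errors were obtained, so the following is only a sketch of a common back-calculation, assuming the 95% CI is symmetric around the log hazard ratio. The function name and the example values are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def log_hr_and_se(hr, ci_lower, ci_upper):
    """Convert a hazard ratio and its 95% CI to a log hazard ratio and standard error."""
    log_hr = math.log(hr)
    # The CI spans 2 * Z_95 standard errors on the log scale
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)
    return log_hr, se

# Hypothetical study estimate: HR = 0.60, 95% CI [0.40, 0.90]
log_hr, se = log_hr_and_se(0.60, 0.40, 0.90)
```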


Nine prospective cohort studies were included in the meta-analytic results (Hohman et al., 2016; Petkus et al., 2019; Pettigrew et al., 2017; Soldan et al., 2013, 2015; Udeh-Momoh et al., 2019; van Loenhoud et al., 2017, 2019; Xu et al., 2019). To reduce variability between studies, we extracted the hazard ratios and corresponding confidence intervals associated with high CR at baseline after controlling for relevant structural (Petkus et al., 2019; Pettigrew et al., 2017; Soldan et al., 2015; van Loenhoud et al., 2017, 2019) or biomarker (Hohman et al., 2016; Soldan et al., 2013; Udeh-Momoh et al., 2019; Xu et al., 2019) covariates. Some studies examined the interaction of CR by neuropathology in addition to the main effect of CR in relation to risk of progression to dementia (Pettigrew et al., 2017; Soldan et al., 2013, 2015; Udeh-Momoh et al., 2019). To home in on the specific effect of CR on risk of progression, we chose to include the main effects of CR on risk of progression from these studies rather than the interaction. The two types of CR measurements differed in terms of their use of markers of AD neuropathology, which were used as covariates in studies using proxies and directly in the calculation of CR in studies using the residual approach. While both fixed- and random-effects models were fitted for transparency, the random-effects estimates should be given greater consideration due to the substantial differences in study methodology (i.e., controlling for structural characteristics versus biomarker characteristics, using composite proxy versus residual variance approaches for CR).
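The difference between the two models can be sketched with inverse-variance weighting, where a between-study variance of zero gives the fixed-effect estimate and a positive value gives random-effects weights. This is a generic illustration, not the authors' R code; the function name, the input values, and the between-study variance of 0.10 are hypothetical.

```python
import math

def pool(log_hrs, ses, tau2=0.0):
    """Inverse-variance pooled hazard ratio with a 95% CI.

    tau2 = 0 gives the fixed-effect estimate; tau2 > 0 adds between-study
    variance to every study's variance, giving random-effects weights.
    """
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_hrs)) / total
    se_pooled = math.sqrt(1.0 / total)
    lower = pooled - 1.959964 * se_pooled
    upper = pooled + 1.959964 * se_pooled
    # Back-transform from the log scale to hazard ratios
    return math.exp(pooled), math.exp(lower), math.exp(upper)

# Hypothetical log hazard ratios and standard errors for three studies
log_hrs = [math.log(0.40), math.log(0.55), math.log(0.90)]
ses = [0.15, 0.20, 0.25]
fixed = pool(log_hrs, ses)
random_effects = pool(log_hrs, ses, tau2=0.10)
```

Adding between-study variance evens out the weights and widens the interval, which is why random-effects estimates are usually preferred when study methods differ.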

Between-study variance was estimated with τ², with larger values suggesting greater between-study variance. The proportion of between-study heterogeneity not solely caused by sampling error was estimated with I²; Higgins and Thompson classify 25% as low heterogeneity, 50% as medium heterogeneity, and 75% as high heterogeneity (Higgins & Thompson, 2002). I² is preferable to formal tests of statistical heterogeneity when sample sizes are small as this statistic is less affected by power. The statistical significance of the between-study heterogeneity was estimated with Cochran’s Q (p < 0.05 suggests statistically relevant between-study heterogeneity). Publication bias was estimated visually with a funnel plot and statistically with Egger’s Test of the Intercept (p < 0.05 indicates substantial funnel plot asymmetry and concern of publication bias). However, since both Cochran’s Q and Egger’s Test are underpowered when sample sizes are small, as in our study of nine articles, prioritizing the results of I² and the funnel plot will convey between-study heterogeneity and publication bias, respectively. Sub-analyses were carried out by the CR approach (i.e., one model for composite proxy and one model for the residual variance approach) and are fully accessible in the online supplemental materials (Online Resource 1). Meta-analysis was conducted using R 3.6.1.
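The heterogeneity statistics named above can be computed from study-level log hazard ratios and standard errors, as in the sketch below. The text does not say which between-study variance estimator was used, so the DerSimonian-Laird estimator here is an assumption; the function name and the input values are hypothetical.

```python
import math

def heterogeneity(log_hrs, ses):
    """Return Cochran's Q, I-squared (in percent), and a DerSimonian-Laird tau-squared."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_hrs)) / total
    # Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    # I-squared: share of total variability beyond what sampling error explains
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird moment estimator (assumed; not stated in the text)
    c = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2

# Hypothetical inputs for three studies
log_hrs = [math.log(0.40), math.log(0.55), math.log(0.90)]
ses = [0.15, 0.20, 0.25]
q, i2, tau2 = heterogeneity(log_hrs, ses)
```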


Description of Studies

Nine longitudinal cohort studies were included (Hohman et al., 2016; Petkus et al., 2019; Pettigrew et al., 2017; Soldan et al., 2013, 2015; Udeh-Momoh et al., 2019; van Loenhoud et al., 2017, 2019; Xu et al., 2019). Four studies used the residual variance approach to measure CR (Hohman et al., 2016; Petkus et al., 2019; van Loenhoud et al., 2017, 2019), where CR was estimated from variance in cognitive performance or structural brain integrity. Five of the studies used a composite proxy approach to measure CR (Pettigrew et al., 2017; Soldan et al., 2013, 2015; Udeh-Momoh et al., 2019; Xu et al., 2019), with the variables comprising the composite proxy including intelligence tests, years of education, occupation, intracranial volume (ICV), cognitive activities, and social activity in late life in various combinations. Markers of AD neuropathology varied somewhat across the nine studies and included measures such as gray matter volume, CSF Aβ, and cortical thickness. See Table 1 for information extracted from the studies and Table 2 for results.

Table 1 Information Extracted from Included Studies Focused on Dementia Progression
Table 2 Main Predictor and Outcome Variables from Included Studies

Residual Variance Approach

Two studies calculated CR as the residual variance in cognitive performance after accounting for relevant AD-related structural pathology (Petkus et al., 2019) or biomarkers (Hohman et al., 2016). Hohman and colleagues (2016) calculated CR as cognitive resilience, a latent construct defined as the residual between Aβ and tau and memory and executive function performance. Participants who were cognitively intact and those with MCI at baseline were combined in the analysis to assess their risk of progression to either MCI (from intact cognition) or dementia (from intact cognition or MCI). Petkus and colleagues (2019) defined CR with both domain-specific cognitive categories (i.e., attention, verbal memory, figural memory, language, and spatial) and a general CR construct which was defined as a latent variable underlying the domain-specific CR components. In separate analyses, they assessed progression from normal cognition to MCI or from normal cognition to dementia. Both Hohman and colleagues (2016) and Petkus and colleagues (2019) found that their measure of CR was associated with a reduced relative risk of progression to either MCI or dementia.

Two studies operationalized CR as the difference between observed and expected brain volume given level of cognitive performance (van Loenhoud et al., 2017, 2019). Both van Loenhoud and colleagues (2019) and van Loenhoud and colleagues (2017) used a measure of global cognitive performance when defining CR (i.e., the Alzheimer’s Disease Assessment Scale-cognitive subscale [ADAS-Cog] and an average of standardized neuropsychological tests including the domains of memory, executive functioning, attention, language, and visuospatial, respectively). Both studies measured risk of progression from cognitively intact to MCI or AD-dementia in a single analysis. The former study by van Loenhoud and colleagues (2019) also included results stratified by baseline diagnostic stage, with similar results to their overall findings. Whereas one found higher CR associated with reduced relative risk of progression to MCI or AD-dementia (van Loenhoud et al., 2019), the other found that higher CR was associated with an increased relative risk of progression to MCI or AD-dementia (van Loenhoud et al., 2017), presumably because of differences in disease stage between participants in both studies.

Composite Proxy Approach

Three studies used the same variables to calculate the composite proxy score for CR, included participants who had normal cognition at baseline, and had the outcome as clinical symptom onset (Pettigrew et al., 2017; Soldan et al., 2013, 2015). Two of these studies controlled for structural measures (mean cortical thickness of AD vulnerable regions [e.g., the entorhinal cortex] (Pettigrew et al., 2017); baseline levels and atrophy of the medial temporal lobe (Soldan et al., 2015)) whereas the other controlled for the CSF biomarkers Aβ, phosphorylated tau, total tau, and their combination measured at baseline and over time (Soldan et al., 2013). Each of these studies found that higher CR was related to a reduced relative risk of clinical symptom onset. One study calculated CR as a proxy from variables representing engagement across the lifespan (Xu et al., 2019). They controlled for Aβ and tau present post-mortem and found that high CR was associated with a reduced relative risk of dementia progression. Finally, one study investigated whether a composite proxy of CR was associated with reduced relative risk of progression to MCI or AD-dementia from normal cognition (Udeh-Momoh et al., 2019). They controlled for Aβ and cortisol levels in their analyses but did not find that CR was significantly associated with progression to MCI and AD-dementia. However, for the group who had the highest risk for progression (i.e., those with high cortisol levels and abnormal Aβ), high CR was associated with a reduced relative risk of progression.
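A composite proxy can be formed in several ways, and the included studies differ in which indicators they combine and how. As a minimal sketch only, the example below averages z-scores across indicators, which is one common choice and an assumption here; the indicator names and values are hypothetical.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of values to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def composite_proxy(indicators):
    """Average the z-scores across indicators for each participant."""
    standardized = [zscores(column) for column in indicators.values()]
    return [mean(person) for person in zip(*standardized)]

# Hypothetical indicators for four participants
indicators = {
    "education_years": [12, 16, 18, 10],
    "vocabulary_score": [48, 55, 61, 40],
    "reading_score": [30, 38, 41, 25],
}
cr_composite = composite_proxy(indicators)
```

Standardizing first keeps an indicator with a large numeric range from dominating the composite.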


Both the fixed-effect and random-effects models revealed a significant effect of higher CR on progression to MCI or dementia after controlling for structural or biomarker factors (Fig. 2; fixed-effect pooled-HR: 0.46 [0.42, 0.51], p < 0.001; random-effects pooled-HR: 0.53 [0.35, 0.81], p = 0.003). This association was highly variable across studies (Q = 66.63, p < 0.001; I² = 88.0% [79.4%, 93.0%]; τ² = 0.371), though no substantial concern of publication bias was found (Fig. 3; Egger’s Test: p = 0.22). We also conducted the meta-analysis with the Duval and Tweedie trim-and-fill procedure (Duval & Tweedie, 2000) and found results similar to our original analysis, with a slightly smaller hazard ratio, but confidence interval limits that overlap (random-effects pooled-HR: 0.41 [0.25, 0.68]; see Online Resource 1: Supplemental Fig. 1 for forest plot and Supplemental Fig. 2 for funnel plot). Given the nature of pooling several types of methodologies together (e.g., differences in calculating CR, controlling for biomarkers versus structural characteristics), the results of the random-effects model are more appropriate.
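The reported heterogeneity can be cross-checked by hand, assuming the standard Higgins and Thompson definition of the statistic and k = 9 studies (8 degrees of freedom). The second line converts the random-effects hazard ratio into the percentage risk reduction quoted in the discussion.

```latex
\begin{align*}
I^2 &= \frac{Q - (k - 1)}{Q} \times 100\%
     = \frac{66.63 - 8}{66.63} \times 100\% \approx 88.0\% \\
\text{risk reduction} &= (1 - \mathrm{HR}) \times 100\%
     = (1 - 0.53) \times 100\% = 47\%
\end{align*}
```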

Fig. 2

Forest plot conveying the risk of progression to MCI or all-cause dementia. Petkus et al. (2019), Pettigrew et al. (2017), Soldan et al. (2015), and van Loenhoud et al. (2017; 2019) controlled for structural indicators of Alzheimer’s disease such as hippocampal volume. Hohman et al. (2016), Soldan et al. (2013), Udeh-Momoh et al. (2019), and Xu et al. (2019) controlled for biomarkers of Alzheimer’s disease such as Aβ or tau. Further, Hohman et al. (2016), Petkus et al. (2019), and van Loenhoud et al. (2017; 2019) examined cognitive reserve using the residual variance approach, whereas Pettigrew et al. (2017), Soldan et al. (2013; 2015), Udeh-Momoh et al. (2019), and Xu et al. (2019) used the composite proxy approach

Fig. 3

Funnel plot of the included studies to estimate publication bias. The long-dotted line is the fixed-effect model estimate and the short-dotted line is the random-effects model estimate. Egger’s Test of the Intercept: p = 0.22


Fixed-effect models were used for sub-analyses due to the especially small sample size and the uniformity in the CR approach, though random-effects models are also reported in the supplemental files. Among studies that used the composite proxy approach (Supplemental Fig. 3 in Online Resource 1), the fixed-effect model (pooled-HR: 0.52 [0.46, 0.60]) was statistically equivalent to the full model. Among studies that used the residual variance approach (Supplemental Fig. 4 in Online Resource 1), the fixed-effect model (pooled-HR: 0.38 [0.33, 0.45]) was statistically equivalent to the full model. However, the 95% confidence intervals of the two approaches do not overlap. This pattern suggests that while both measurements of CR reveal a protective effect against incident MCI or dementia, the residual variance approach yields a stronger effect (62% versus 48% reduction in risk, p < 0.001).

Supplemental Analyses

Most studies examined risk of progression to a later diagnostic stage from normal cognition or MCI in a combined hazard ratio. However, four studies (Petkus et al., 2019; Pettigrew et al., 2017; Soldan et al., 2013, 2015) investigated risk of progression from normal cognition to MCI, and we included these in an additional sub-analysis. Results indicate that CR was associated with a reduced relative risk of MCI (fixed-effect HR = 0.43 [0.37, 0.50]; Supplemental Fig. 5 in Online Resource 1). Thus, results are consistent when assessing progression to either MCI or dementia.

Finally, we also conducted a sensitivity analysis excluding the Xu and colleagues (2019) study since they measured AD biomarkers post-mortem rather than at baseline as the other studies did. Results were not changed by exclusion of the study (data not shown).

Quality of Studies

The Newcastle–Ottawa Scale (Wells et al., 2019) was used to assess the quality of the included studies (Table 3). Overall, quality of the included studies was high, as evidenced by complete star assignment in the Selection, Comparability, and Outcome sections. The exposure of interest for the review was CR and the outcome of interest was incident MCI or dementia. Both the exposed and unexposed cohorts were taken from the same community in each of the studies, all studies controlled for age and at least one additional variable in analyses, and most provided adequate information on the verification of the outcome of interest and relevant follow-up information on the cohorts. Only two studies did not have full star assignment (van Loenhoud et al., 2017; Xu et al., 2019). Therefore, results of the current review and meta-analysis were not likely influenced by the quality of included studies.

Table 3 Quality of Studies According to the Newcastle–Ottawa Scale


We set out to assess whether CR was associated with MCI or dementia progression in studies that tested the CR hypothesis while including measures of AD neuropathology, and how different operationalizations of CR were related to risk of incident MCI or dementia. As expected, our systematic review and meta-analysis provided consistent evidence that higher CR was associated with a lower relative risk of MCI or dementia progression above and beyond AD-related structural pathology and biomarkers, cutting the risk by almost half (47%). Overall, these results indicate that CR delays the onset of MCI and dementia in the presence of AD neuropathology and, consequently, provides potential targets for preventative interventions. Our results illustrating the protective effect of CR on dementia progression may also underestimate the effect, as the sample-specific estimations of CR could have included a limited number of participants who have low CR.

The concept of CR suggests that individual differences in expected level of cognitive performance due to levels of neuropathology can be attributed to a dynamic process that imparts neural protection (Stern, 2009). Further, CR is conceptualized to be a summative factor influenced by the accumulation of differing experiences across the lifetime (Stern, 2009). In all, the concept of CR is inherently abstract and cannot be measured directly, which lends it to multiple operationalizations. The two common operationalizations of CR—CR as a proxy of common risk factors and CR as a residual variance of cognitive performance after AD neuropathology is accounted for—reflect attempts to tap into the CR concept as both a static and dynamic entity. Thus, both operationalizations of CR are a combination of factors that are stable and dynamic.

In line with our second hypothesis, we found a stronger effect for the residual variance approach in comparison to the composite proxy approach, although the difference was rather insubstantial, particularly considering the often more distant nature of the CR proxy measurement (48% reduction in risk overall) compared to the concurrent nature of the measurement of residual variance (62% reduction in risk overall). This finding suggests that, although quantifying CR differently, both the residual variance approach and the proxy approach exert a strong effect on MCI and dementia progression. In particular, the finding that proxy measures reduced relative risk of MCI and dementia by almost half even after at least partial control over AD neuropathology underscores their utility in terms of population-based efforts to reduce incidence of dementia by encouraging the use of factors represented among CR proxy variables in the everyday lives of middle-aged and older adults.

Advantages and Disadvantages of Different CR Operationalizations

The advantages and disadvantages of both operationalizations of CR should be noted (see Jones et al., 2011; Nilsson & Lövdén, 2018 for more comprehensive reviews). CR proxies can be easily measured in epidemiological research settings via self-report measures that can incorporate a range of lifetime experiences. Further, as tangible aspects of lifetime experiences, proxies can be promoted as points of intervention to delay dementia progression. Proxies (education, occupational characteristics, leisure activities, etc.) have also been frequently used as measures of CR and have been shown to be associated with better cognitive outcomes even when relatively high levels of neuropathology are present (Stern et al., 2020), providing evidence for their construct validity as measures of CR.

One disadvantage of proxies is that they may be related to one another through something other than CR (i.e., their shared variance may reflect another construct) (Stern et al., 2020) and can be related to cognitive performance through pathways other than CR, for example, better management of health conditions that may influence cognitive aging such as diabetes (Jones et al., 2011); therefore, including them as measures of CR may not accurately represent the CR concept. Proxies of CR may also qualitatively differ by cohort or geographic region. Additional caution should be used when examining CR as proxies since they could be subject to reverse causation (i.e., individuals reducing their engagement with elements of proxies early in a clinical diagnosis, such as withdrawing from social interactions or reducing their engagement with cognitively stimulating activities) or, when represented as a summative proxy, may miss unique associations between CR and impairment (Stern et al., 2020). Finally, proxies often take on a static nature (e.g., early-life education), which prevents assessment of changes in CR that may be related to dementia progression (though some proxies such as engagement in social or physical activities are dynamic).

The residual variance approach has the potential of greater construct validity of CR than the proxy approach since the residual variance approach is a quantitative estimate of the discrepancy between predicted and actual cognitive performance given neuropathology. However, in practice, studies usually do not account for all aspects of AD-related neuropathology. By quantifying the latent nature of CR, the residual variance approach can also account for bias present in individual proxy indicators (Jones et al., 2011); although using a latent variable approach to combine proxies would similarly account for this bias. The residual variance approach incorporates both static and dynamic aspects of CR (Stern et al., 2020) allowing for assessment of changes in this indicator to assess changes in CR over time.

The original approach of calculating CR as residual variance was to identify the residual in memory performance (Reed et al., 2010; Zahodne et al., 2015), given that declines in episodic memory are commonly observed as the first cognitive changes in AD-related impairment. This approach is potentially limited as it only assesses a single domain of cognition and does not fully capture CR across other cognitive domains. Further, the residual variance approach shows particularly high levels of variation in variables included in its composition (Stern et al., 2020), leading to substantial variability between studies, which may play a role in inconsistent results. Studies also often include few indicators of structural integrity (Oschwald et al., 2019), possibly limiting the amount of variance explained by brain variables in cognitive performance. Due to the limited number of brain markers included in the calculation, the residual variance approach could include many unmeasured brain and other confounding variables within the CR calculation (Stern et al., 2020; Reed et al., 2010; Zahodne et al., 2013, 2015). This measurement imprecision influences the construct validity of the residual variance approach as an operationalization of CR. Future research needs to refine and expand the residual variance approach to incorporate more complete and precise measures of biomarkers and brain variables that predict cognitive performance so that confounding factors remaining in the CR calculation can be removed. Both operationalizations of CR need to account for level of neuropathology in order to accurately assess the CR concept (Stern et al., 2020), representing a potential challenge to research settings that do not have the equipment needed to measure neuropathology.

Assessing Measures of Alzheimer’s Neuropathology

A potential source of between-study variability in our meta-analytic results is that we considered volumetric indicators and biomarkers of AD together rather than separately. Although the presence of Aβ and tau indicates underlying neuropathology characteristic of AD, as does the presence of structural neurodegeneration, these markers manifest in a lagged manner or at different stages along the AD continuum (Jack et al., 2018). Further, gray matter atrophy is not unique to AD; it can result from other neurodegenerative conditions and also occurs during the aging process. Therefore, measuring only biomarkers or only structural neurodegeneration may not fully explain which older adults could experience a progression to dementia.

Contradictory and Null Findings

In our review, one study (van Loenhoud et al., 2017) reported contradictory results (i.e., high CR associated with increased risk of progression) and another study reported null findings (Udeh-Momoh et al., 2019). Van Loenhoud and colleagues (2017) suggested that the reason for this discrepancy with typical findings could be the short follow-up period over which they tracked dementia progression. Specifically, they indicated that participants in their study may have been more advanced in their progression to dementia, which would result in a faster decline for individuals with high CR (Stern, 2009). Although not controlling for neuropathology, others (Mazzeo et al., 2019) have found a similar result, such that high CR was related to a lower risk of progression from subjective cognitive decline to MCI, but a higher risk of progression from MCI to dementia for apolipoprotein E4 carriers.

Regarding the null findings, because Udeh-Momoh and colleagues (2019) included participants who had available biomarker information (e.g., cortisol), they had a much smaller sample than most of the other studies. Therefore, their lack of an effect for CR could have resulted from low statistical power. However, they did find that high CR was related to reduced risk of dementia progression in the group of participants at greatest risk for progression (Udeh-Momoh et al., 2019).

Alternative Study Designs Measuring CR

Although CR is a heavily investigated research area, few studies look at the association between CR and dementia incidence when controlling for AD neuropathology, and even fewer investigate this question prospectively using incident cases. Of studies that have not controlled for AD neuropathology when examining the association between CR and dementia incidence, some have found similar effects (Pettigrew et al., 2013) whereas others have found weaker effects of CR on dementia progression (Dekhtyar et al., 2019, 2015; Clouston et al., 2015). However, conclusions regarding the CR concept are limited in these studies as the mechanism through which CR is purported to operate is not included. Rather, these studies may be better conceptualized as studies investigating risks associated with dementia instead of providing evidence for CR.

Several studies were excluded from our meta-analysis because they examined CR and dementia status cross-sectionally (e.g., Garibotto et al., 2008; Lopez et al., 2016; Osone et al., 2016; 2015; Tokuchi et al., 2014). Some were in line with our findings (Garibotto et al., 2008; Tokuchi et al., 2014), though some suggested that CR was not related to dementia status (Lopez et al., 2016). Overall, these studies have less bearing on conclusions about dementia risk than longitudinal cohort studies that assess risk of progression to dementia over time. Others have investigated dementia progression longitudinally but used different models, such as latent difference score models (Zahodne et al., 2015), relative risk ratios (Reed et al., 2010), or standardized log odds (Zahodne et al., 2013), with findings consistent with our results.

Several studies were excluded for using education as the sole proxy for CR with proportional hazard models (Albert et al., 2018; Pyun et al., 2017; Roe et al., 2011; Sorensen et al., 2019; Vemuri et al., 2011). Consistent with prior literature (Nilsson & Lövdén, 2018), we support the notion that CR should be operationalized as something greater than years of education, since a one-unit increase in years of education is likely qualitatively different from a one-unit increase in CR when measured as a composite proxy or through the residual approach. Using only education as a measure of CR may be especially problematic when examining cross-cultural differences where the number of years of education varies drastically, or when studies are affected by cohort effects (e.g., education levels of older adults who grew up during World War II in occupied countries). There is also research on AD neuropathology and MCI or dementia progression that includes education in some role other than a variable of interest, mainly as a covariate. Including these types of studies was beyond the scope of this systematic review and meta-analysis, and their omission is a limitation of the current work. Thus, future research should assess the relationship between education alone and dementia incidence when controlling for AD neuropathology.


This review was based on a relatively small sample of studies, highlighting that, despite a long line of research studies testing CR, few have taken the step of accounting for AD neuropathology—a crucial factor in establishing CR. Several limitations stemming from the small sample of studies should be noted. Meta-regression was not carried out due to the small sample size, but should be considered in the future. Three (Pettigrew et al., 2017; Soldan et al., 2015, 2013) of the five studies using the composite proxy approach analyzed the same sample with identical calculations of the composite score. In spite of this limitation, these studies looked at different aspects of AD neuropathology, thus generating support for the protective effect of CR against MCI or dementia progression when controlling for both AD-related structural pathology and biomarkers.

Additional study limitations should be noted. Due to the variety of definitions of CR, the current review may have missed relevant studies. Additionally, many of the articles reviewed for inclusion were focused on cognitive decline instead of progression to dementia. Other studies that focused on dementia progression reported odds ratios or relative risk ratios, or used regression-based techniques to predict dementia progression. Although excluding these studies limited our sample size, by focusing solely on hazard ratios we were able to show the relative risk of dementia progression at any point in time associated with CR, which is of greater clinical utility. We only included studies written in English, which could limit generalizability to non-English speaking countries. Further, there was limited racial and ethnic diversity in included studies and some studies did not include racial information in their reports, also limiting generalizability.
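For readers less familiar with hazard ratios, the interpretation above can be sketched with a generic proportional hazards model. The symbols here are introduced only for illustration and are not taken from the included studies: $h_0(t)$ is an unspecified baseline hazard, $N$ stands for a measure of neuropathology, and $\beta$ and $\gamma$ are regression coefficients.

```latex
h(t \mid \mathrm{CR}, N) = h_0(t)\,\exp(\beta\,\mathrm{CR} + \gamma N),
\qquad
\mathrm{HR}_{\mathrm{CR}}
  = \frac{h(t \mid \mathrm{CR}+1, N)}{h(t \mid \mathrm{CR}, N)}
  = e^{\beta}
```

Because $h_0(t)$ cancels in the ratio, the hazard ratio for a one-unit increase in CR does not depend on $t$ under this model, which is the sense in which it describes relative risk at any point during follow-up.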

Our results also combined studies that examined transitions from normal cognition to MCI or dementia with studies that examined transitions from normal cognition or MCI at baseline to incident dementia. Future research in this area should measure the association between CR and dementia progression separately for normal cognition and MCI, as the direction of the progression risk can switch once a clinical threshold has been crossed (i.e., a reduced risk when a pre-clinical state of cognitive impairment is the baseline, but an increased risk when MCI is the baseline; e.g., Mazzeo et al., 2019; Myung et al., 2017); however, some studies still show a reduced risk of progression from MCI to dementia with high CR (Allegri et al., 2010). Further, the extent to which this change in risk is influenced by level of neuropathology should also be examined. Examining the risk of progression across different levels of prodromal and clinical diagnoses will better inform how environmental factors influence progression depending on where along the AD continuum participants lie.

Relatedly, follow-up time varied considerably across studies (i.e., from an average of two to almost twelve years). As individuals in each of the studies could have been at different points of clinical progression (and decline is more rapid once onset has occurred for those with high CR; Stern, 2009), the differences in follow-up time could have also contributed to our between-study variability. Finally, the current investigation was limited to structural brain measures and CSF pathology, with one study assessing biomarkers post-mortem rather than prospectively. Fruitful areas for future research could also include measures of vascular biomarkers of pathology and how they relate to CR and dementia progression.

Implications and Future Research

We hope that our results spur this burgeoning area of research by incentivizing research groups to develop prospective cohort studies. First, there appears to be little research investigating CR’s influence on incident dementia while taking into account AD neuropathology. At the same time, the concept of CR revolves around the notion that adverse effects of AD neuropathology can be reduced by greater CR. In this context, future research should continue to address the hypothesis that the influence of CR on cognitive and dementia outcomes is modified by the extent of AD neuropathology. For this purpose, longitudinal research that includes measures of neuropathology and lifespan variables, in addition to proper assessment of cognition and dementia status, is needed. Second, most of the current research included only baseline measurements of neuropathology. To further refine knowledge in this area, it is important to test the hypothesis that change in AD neuropathology may better explain the relationship between brain integrity, CR, and dementia progression.

Third, many of the proxy measurements represented a static measurement of CR, defined by achievement in years of education or a baseline cognitive task, for example. Thus, future studies should examine whether representing CR with environmental factors that can change over time (e.g., social or intellectual engagement, change in cognitive function) strengthens or weakens the interaction among CR, AD neuropathology, and dementia progression. Relatedly, future research should assess how reductions in these proxies as a result of social distancing orders in response to the COVID-19 pandemic, such as reductions in social activity, may have long-term implications for dementia incidence. Fourth, identifying how aspects of the environment may protect against dementia progression above the effect of neuropathology (that is, identifying the mechanisms through which environmental factors influence cognition), and which combination of environmental factors is ideal for delaying dementia onset, can lead to more refined guidelines and interventions aimed at promoting healthy cognitive aging. Fifth, much of the research included in the current review was from a rather homogeneous group (i.e., mostly white, highly educated participants). Future research should test whether the interaction of CR, AD neuropathology, and MCI or dementia progression applies to ethnically and racially diverse older adults. Additionally, research should also investigate how CR relates to dementia progression in the context of the novel resistance/resiliency framework proposed by Arenaza-Urquijo and Vemuri (2018; 2020). That is, research should assess whether CR, specifically CR proxies, directly influences accumulation of AD neuropathology. Finally, our results suggest that brain function is only partially dependent on underlying neuropathology. Determining the genetic, biological, and psychosocial characteristics of individuals who experience greater resilience in terms of cognitive performance in the face of neuropathology remains a key challenge for reducing the burden of dementia on society.