Background

Florbetapir positron emission tomography (PET) amyloid positivity (A+) is a biomarker for fibrillar amyloid associated with a high likelihood of progression to Alzheimer’s disease (AD) dementia [1]. β-Amyloid (Aβ) accumulation has been linked to brain atrophy [1,2,3] and cognitive decline in AD [1,2,3,4,5,6,7]. However, findings have been mixed regarding whether and how A+ relates to cognitive dysfunction or hippocampal volume (HV) across the spectrum of normal cognition (NC), early mild cognitive impairment (eMCI), and AD [4, 8,9,10,11,12,13].

Some inconsistencies in the extant literature relate to disease stage included and concurrent versus longitudinal assessment of outcomes. For example, on one hand, although A+ may be detectable in NC, it may not be meaningfully related to concurrent cognition until development of eMCI [2, 14]. On the other hand, A+ may be strongly linked to retrospective longitudinal decline in NC but less predictive of future decline as the disease progresses to eMCI [4]. Other potentially explanatory factors include differences in sample size, type of Aβ measurement (e.g., binary positivity versus continuous measures of Aβ load), hippocampal segmentation and correction methodology [15,16,17], and type of outcome measure employed (e.g., screening measures such as overall scores on the Mini Mental State Examination [MMSE] versus more specific memory measures).

Although traditionally treated as a confound, sex differences in the relationship of A+ to cognition and HV may also explain conflicting findings. Recent investigations have revealed sex effects on hippocampal atrophy in normal aging, eMCI, and AD, though one study showed this only before controlling for Aβ levels [18,19,20]. Researchers have also shown that women’s established verbal memory advantage over men [20,21,22,23] appears to function as a form of sex-specific cognitive reserve, affording women equal or better cognitive performance compared with men via compensation despite positive biomarkers for AD, including mild to moderate levels of hippocampal atrophy [24] or fluorodeoxyglucose (18F-FDG)-PET hypometabolism [25]. Mechanisms of sex effects on hippocampal atrophy and cognition remain unclear, but as recent reviews and studies suggest, the etiology may include a complex interaction of effects of sex hormones; genetics (e.g., apolipoprotein E ε4 [APOEε4] carrier status); and psychosocial (e.g., differences in stress or coping), demographic (e.g., education), and lifestyle (e.g., exercise, smoking, and alcohol use) factors [20, 26]. Whether sex-specific reserve in cognition is seen in the face of A+ remains unclear. It is also unknown whether sex-specific hippocampal reserve exists for women with A+.

We examined whether sex moderates the effect of A+ on verbal learning and memory and HV in individuals with NC and eMCI. We hypothesized that sex would moderate the relationship of A+ and diagnosis with cognition such that women with NC would show a memory-related advantage over men that persists despite A+ and that women with eMCI would lose that advantage. We further hypothesized that sex would moderate the relationship of A+ and diagnosis with HV such that women with A+ and NC would show neural robustness in hippocampal integrity but that women with eMCI would lose that advantage.

Methods

Participants

The Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a longitudinal, multisite AD biomarker study (www.adni-info.org). The present investigation includes participants enrolled in the Alzheimer’s Disease Neuroimaging Initiative second cohort (ADNI2) and the Alzheimer’s Disease Neuroimaging Initiative Grand Opportunity Cohort (ADNI-GO) who had amyloid PET imaging at baseline and cognitive testing at screening (n = 742). Of included participants, 526 had screening visit HVs that met University of California, San Francisco (UCSF), quality control standards (UCSF Freesurfer Methods Quality Control [http://adni.loni.usc.edu/]). Participants were NC ADNI2 participants (n = 285) and participants with eMCI (ADNI2, n = 329; ADNI-GO, n = 128). ADNI required participants with NC to have MMSE scores of 24–30, a Clinical Dementia Rating (CDR) of 0, and no memory complaints. ADNI defined eMCI as including MMSE [27] scores of 24–30, CDR [28] of 0.5, CDR Memory box score of 0.5 or greater, objective memory loss as assessed by education-adjusted scores on the Wechsler Memory Scale Logical Memory II test (raw scores 9–11 for 16 or more years of education, 5–9 for 8–15 years of education, 3–6 for 0–7 years of education), subjective memory complaint, and not meeting criteria for dementia [29].

Hippocampal image processing

Fully processed HV and total intracranial volume (TIV) numerical values were downloaded from ADNI, with methods described in the UCSF Freesurfer Methods Quality Control document (www.adni.loni.usc.edu). In brief, magnetic resonance imaging (MRI) scans were obtained at baseline according to a standardized protocol (http://adni.loni.usc.edu/methods/mri-analysis/mri-acquisition/). Nonaccelerated T1-weighted images (multiplanar reconstruction or inversion recovery-spoiled gradient recalled acquisition in steady state) in Neuroimaging Informatics Technology Initiative format were preprocessed by the Mayo Clinic (gradient warping, scaling, B1 correction, and N3 inhomogeneity correction). Freesurfer (version 5.1; documented and freely available for download online [http://surfer.nmr.mgh.harvard.edu/]) was employed for motion correction and averaging [30] of multiple volumetric T1-weighted images (when more than one was available), removal of nonbrain tissue using a hybrid watershed/surface deformation procedure [31], automated Talairach transformation, segmentation of the subcortical white matter and deep gray matter volumetric structures (including hippocampus) [32, 33], intensity normalization [34], tessellation of the gray matter-white matter boundary, automated topology correction [35, 36], and surface deformation following intensity gradients to optimally place the gray/white and gray/cerebrospinal fluid borders at the location where the greatest shift in intensity defines the transition to the other tissue class [37, 38]. Visual quality control assessment of images was performed at UCSF (see www.adni.loni.usc.edu).

In the present analyses, we employed HVs that passed UCSF-established quality control thresholds. We examined left and right HVs separately rather than a mean volume across hemispheres. This decision was based on significant hemisphere effects in tests conducted prior to the primary analyses described in the Statistical methods section (i.e., analysis of variance with sex, A+, and their interaction as between-subjects factors and hemisphere as a within-subject factor, predicting left and right hemisphere volumes). The results revealed significant effects of hemisphere on volume in NC [F(4,179) = 12.30, p = 0.001] and eMCI [F(4,347) = 25.87, p < 0.001] as well as a significant sex-by-A+-by-hemisphere interaction in eMCI [F(4,347) = 4.12, p = 0.043]. We adjusted left and right HVs for TIV according to procedures set forth by Mormino and colleagues [39]. In brief, adjusted hippocampal volume (aHV) was calculated according to the formula aHV = raw HV − β(TIV − mean TIV). Mean TIV values were defined separately for NC and eMCI; the mean TIV for the appropriate diagnostic group was subtracted from each individual’s TIV. This difference was multiplied by the regression coefficient (β) obtained from a regression of TIV predicting HV in the appropriate diagnostic group. Finally, we calculated aHV by subtracting this product from the raw left and right HVs for each participant.
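The residual correction described above can be sketched in a few lines. This is a minimal illustration only, assuming that β is the slope of a simple least-squares regression of HV on TIV within a single diagnostic group; the function names and the toy numbers are hypothetical and are not ADNI data.

```python
from statistics import mean

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def adjust_hv(raw_hv, tiv):
    """Return aHV = raw HV - beta * (TIV - mean TIV) for one diagnostic group."""
    beta = ols_slope(tiv, raw_hv)   # regression of TIV predicting HV
    mean_tiv = mean(tiv)            # group-specific mean TIV
    return [hv - beta * (t - mean_tiv) for hv, t in zip(raw_hv, tiv)]

# Toy values in mm^3, invented for illustration only.
tiv = [1400000.0, 1500000.0, 1600000.0, 1700000.0]
left_hv = [3400.0, 3650.0, 3700.0, 4050.0]
left_ahv = adjust_hv(left_hv, tiv)
```

By construction, the adjusted volumes keep the group mean of the raw volumes but are no longer linearly related to TIV within that group; in this sketch the same function would be run separately for each hemisphere and each diagnostic group.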

Florbetapir PET image processing

We downloaded fully processed florbetapir PET binary positivity/negativity values from ADNI, where full protocols are also described (www.adni.loni.usc.edu). Florbetapir synthesis, image acquisition, and processing are additionally described in prior publications [4, 40, 41]. In brief, amyloid PET images were acquired at a variety of sites (4 × 5-minute frames obtained 50–70 minutes postinjection). Images were realigned; averaged; resliced to 1.5-mm³ voxel size; smoothed to 8-mm FWHM; and coregistered to baseline native space structural MRI scans, which were segmented and parcellated with Freesurfer version 5.3.0 to define cortical gray matter regions of interest (frontal, anterior/posterior cingulate, lateral parietal, lateral temporal) [39, 40]. A+ was determined by extracting weighted cortical retention means (regional standardized uptake value [SUVr]) from these regions, calculating average SUVr, and dividing by the cerebellar SUVr as a reference [40, 41]. In the present analyses, we used binary A+, employing the recommended SUVr threshold of 1.11 for cross-sectional analyses [40, 41].
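As a rough illustration of how a binary amyloid call follows from the composite SUVr, the sketch below averages regional uptake, normalizes by the cerebellar reference, and applies the 1.11 cutoff. It is a simplification under stated assumptions: the regional mean is unweighted, a value exactly at the cutoff is treated as positive, and all names and numbers are hypothetical rather than taken from the ADNI pipeline.

```python
from statistics import mean

SUVR_THRESHOLD = 1.11  # cross-sectional cutoff cited in the text

def composite_suvr(regional_uptake, cerebellum_uptake):
    """Average regional cortical uptake, then normalize by the cerebellar reference."""
    # Assumption: unweighted mean; the source describes weighted regional means.
    return mean(list(regional_uptake.values())) / cerebellum_uptake

def is_amyloid_positive(suvr, threshold=SUVR_THRESHOLD):
    """Binary A+ call; counting a value exactly at the cutoff as positive is an assumption."""
    return suvr >= threshold

# Invented uptake values for one hypothetical participant.
uptake = {
    "frontal": 1.30,
    "cingulate": 1.40,
    "lateral_parietal": 1.20,
    "lateral_temporal": 1.10,
}
suvr = composite_suvr(uptake, cerebellum_uptake=1.00)
a_positive = is_amyloid_positive(suvr)
```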

Apolipoprotein E carrier status

We downloaded apolipoprotein E (APOE) genotype data fully processed from ADNI (adni.loni.usc.edu). A binary variable was created, coding all individuals as APOE ε4 carriers (heterozygotes, n = 251; homozygotes, n = 53) or noncarriers.

Clinical and cognitive measures

Cognitive outcome measures consisted of total learning and delayed free recall performance scores on the Rey Auditory Verbal Learning Test (RAVLT) [42]. We calculated total RAVLT learning scores by adding the five learning trial scores. We included modified total performance score on the Montreal Cognitive Assessment (MoCA) [43] as a measure of baseline cognitive status. To create a MoCA score that did not include a measure of memory performance, points earned for delayed list word recall were excluded from the total score, resulting in a maximum score of 25.
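The two derived scores described above amount to simple arithmetic, sketched here for concreteness. The function names and example values are hypothetical; the only assumptions are those stated in the text, namely that total learning is the sum of the five learning trials and that the modified MoCA removes the delayed word recall points (up to 5) from the total.

```python
def ravlt_total_learning(trial_scores):
    """Sum the five RAVLT learning trial scores."""
    if len(trial_scores) != 5:
        raise ValueError("expected exactly five learning trial scores")
    return sum(trial_scores)

def modified_moca(total_score, delayed_recall_points):
    """MoCA total with the delayed word recall points removed (maximum 25)."""
    if not 0 <= delayed_recall_points <= 5:
        raise ValueError("delayed recall points must be between 0 and 5")
    return total_score - delayed_recall_points

# Invented scores for one hypothetical participant.
learning = ravlt_total_learning([5, 7, 9, 10, 11])
moca_mod = modified_moca(total_score=26, delayed_recall_points=3)
```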

Statistical methods

All analyses were performed using IBM SPSS Statistics software (IBM, Armonk, NY, USA) and the PROCESS macro [44, 45]. Mann-Whitney U tests were performed to examine group differences in demographic control variables. Four separate moderation regression analyses were performed to examine whether sex and diagnosis moderated main effects of amyloid status on RAVLT learning and delayed free recall scores as well as left and right HVs. For all analyses, we treated A+ as an independent variable, diagnosis as a moderator, and sex as a secondary moderator. Modified MoCA scores, age at screening visit, education, and APOE ε4 carrier status were included as covariates. All continuous covariates were mean-centered; dichotomous covariates were zero-centered.
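To make the model structure concrete, the sketch below builds the predictors for one participant in a three-way moderation model: three main effects, three two-way products, one three-way product, and four covariates. It is not the PROCESS macro; the 0/1 codings, function names, and values are assumptions for illustration. The resulting 11 predictors are in line with the numerator degrees of freedom reported for the overall models.

```python
from statistics import mean

def mean_center(values):
    """Subtract the sample mean from each value of a continuous covariate."""
    m = mean(values)
    return [v - m for v in values]

def design_row(a_pos, dx, sex, covariates):
    """Predictors for a three-way moderation model: main effects, products, covariates."""
    return [
        a_pos, dx, sex,                      # main effects
        a_pos * dx, a_pos * sex, dx * sex,   # two-way interactions
        a_pos * dx * sex,                    # three-way interaction
        *covariates,                         # MoCA, age, education, APOE e4
    ]

# Hypothetical 0/1 codings: a_pos (1 = A+), dx (1 = eMCI), sex (1 = female).
ages = mean_center([68.0, 72.0, 76.0])
row = design_row(a_pos=1, dx=1, sex=0, covariates=[ages[0], 0.5, -1.0, 1])
```

The coefficient on the three-way product is the term that tests whether sex moderates the joint effect of A+ and diagnosis.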

For each of the four moderation analyses, outlying and influential data points were defined as those that failed two of the following three thresholds: (1) Cook’s D [D > 4/(n − k − 1)], where n = number of participants in the analysis and k = number of predictors; (2) leverage as defined by (2k + 2)/n, where n = number of participants in the analysis and k = number of predictors; and/or (3) Mahalanobis value greater than the chi-square cutoff at p < 0.001 (df = 6). On the basis of these criteria, one participant was excluded for the RAVLT learning analysis, none were excluded for the RAVLT delay analysis, and two were excluded for each of the HV analyses. We carefully inspected data for all participants whose data failed a single threshold measure in order to ensure no operator error created outlying data points, as well as to ensure that data points did not appear to belong to a different population.
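The exclusion rule above can be sketched as follows. This is a minimal illustration, assuming that diagnostics (Cook's D, leverage, and squared Mahalanobis distance) have already been computed by the statistics package, that "failed two of three" means at least two, and that the chi-square critical value for p < 0.001 with df = 6 is the standard table value of about 22.46. The function names and the example values of n and k are hypothetical.

```python
def cooks_d_cutoff(n, k):
    """Cook's D threshold 4 / (n - k - 1)."""
    return 4 / (n - k - 1)

def leverage_cutoff(n, k):
    """Leverage threshold (2k + 2) / n."""
    return (2 * k + 2) / n

# Chi-square critical value for p < 0.001 with df = 6 (table value, rounded).
MAHALANOBIS_CUTOFF = 22.46

def is_outlier(cooks_d, leverage, mahalanobis, n, k):
    """Flag a case that exceeds at least two of the three thresholds."""
    failures = [
        cooks_d > cooks_d_cutoff(n, k),
        leverage > leverage_cutoff(n, k),
        mahalanobis > MAHALANOBIS_CUTOFF,
    ]
    return sum(failures) >= 2

# Hypothetical diagnostics for two cases with n = 742 and k = 11.
flag_two_failures = is_outlier(0.02, 0.05, 10.0, n=742, k=11)
flag_one_failure = is_outlier(0.02, 0.01, 10.0, n=742, k=11)
```

A case that fails only one threshold is not excluded by this rule, which matches the separate manual inspection of single-threshold failures described above.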

Results

Demographics

Of 742 participants, 48.4% were women, 344 were A+, 304 were APOE ε4 allele carriers, and 457 were diagnosed with eMCI. The average age of the sample was 71.59 years (SD 6.98) and ranged from 55 to 91 years. Additional demographics by diagnosis, sex, and Aβ status are displayed in Table 1.

Table 1 Sample characteristics by diagnosis, sex, and amyloid status

Mann-Whitney U test results showed that, for NC, men were significantly older (p = 0.01) and more educated (p < 0.001) than women. For eMCI, men were also significantly older (p = 0.02) and more educated (p < 0.001) than women. There were no sex differences in modified MoCA score or APOE ε4 carrier status for NC or eMCI.

Mann-Whitney U tests showed that, for NC, those with A+ were significantly older (p < 0.001), less educated (p = 0.03), and more often APOE ε4 carriers (p < 0.001) than those with florbetapir positron emission tomography amyloid negativity (A−). No differences based on A+ were observed in modified MoCA scores for those with NC. For eMCI, those with A+ were significantly older (p < 0.001), had lower modified MoCA scores (p = 0.001), and were more often APOE ε4 carriers (p < 0.001) than those with A−. No differences were observed for education.

Sex moderation of amyloid status and diagnosis effects on verbal learning and free recall

The overall model with A+, diagnosis, sex, and their interactions predicting verbal learning was significant [F(11,727) = 48.26, p < 0.001, R² = 0.39]. A three-way interaction showed that sex significantly moderated the effects of diagnosis and A+ on verbal learning [t(727) = −2.25, p = 0.02]. Parsing this interactive effect indicated that women were impacted differently by A+, depending on diagnosis. In particular, women with A+ eMCI, but not those with A− eMCI, showed poorer learning [A+ eMCI, t(727) = −3.65, p < 0.01]. Similar A+ effects were not seen in women with NC [t(727) = −0.18, p = 0.85]. In contrast, A+ impacted men similarly regardless of diagnosis, with A+ showing trends toward poorer learning [NC, t(727) = −1.94, p = 0.05; eMCI, t(727) = −1.66, p = 0.10]. These findings held after controlling for age, education, modified MoCA score, and APOE ε4 carrier status. See Table 2 and Fig. 1a.

Table 2 Summary of regression moderation analyses for Rey Auditory Verbal Learning Test Verbal Learning and Free Recall memory test outcomes (n = 742)
Fig. 1 Sex moderation of diagnosis and amyloid status effects. Sex moderates effects of diagnosis and florbetapir positron emission tomography amyloid positivity (A+) on verbal learning (a) and marginally moderates effects on verbal delayed recall (b) and right hippocampal volume (HV; d), but it does not moderate effects on left HV (c). Specifically, learning and memory scores appear robust to A+ effects in women with normal cognition (NC; a, b). Women with prodromal AD (A+ early mild cognitive impairment [eMCI]) lose this advantage (a, b). In contrast, A+ impacts men’s verbal learning and memory scores comparably across NC and eMCI (a, b). Sex shows no moderating effect for left HV (c), but individuals of both sexes with eMCI show smaller left HV than individuals with NC. Sex marginally moderates the relationship of A+ and diagnosis with right HV, such that women with NC showed no effect of A+ on HV and women with prodromal AD lost that advantage in neural integrity (d). A−, florbetapir positron emission tomography amyloid negativity; AVLT, Auditory Verbal Learning Test. Rey AVLT scores are group means. HV units are derived via correction for total intracranial volume.

The overall model with A+, diagnosis, sex, and their interactions predicting verbal delayed recall was also significant [F(11,729) = 32.94, p < 0.001, R² = 0.27]. Analyses showed trends toward a three-way interaction, suggesting that sex marginally moderated the effects of diagnosis and A+ on verbal delayed recall [t(729) = −1.18, p = 0.07]. Again, women with A+ eMCI, but not those with A− eMCI, showed poorer delayed recall [t(729) = −3.64, p < 0.01], and no A+ effect was seen in women with NC [t(729) = −0.96, p = 0.34]. There was no effect of A+ on delayed recall for men with NC or eMCI [NC, t(729) = −1.36, p = 0.17; eMCI, t(729) = −1.08, p = 0.28]. Findings are controlled for age, education, modified MoCA score, and APOE ε4 carrier status. See Table 2 and Fig. 1b.

Sex moderation of amyloid status and diagnosis effects on hippocampal volume

The overall model with A+, diagnosis, sex, and their interactions predicting left HV was significant [F(11,512) = 14.55, p < 0.001, R² = 0.25]. However, the three-way interaction of A+, diagnosis, and sex was not significant [t(512) = −0.35, p = 0.73], indicating no moderating effects of sex or diagnosis on A+ effects for the left HV. Main effects indicated that diagnosis related to smaller left HV across sexes, with men and women with eMCI showing smaller left HV than men and women with NC [t(512) = −5.88, p < 0.001]. Main effects of A+ on left HV showed a trend toward smaller left HV in men and women with A+ NC and eMCI [t(512) = −1.69, p = 0.09]. There was not a significant main effect of sex on left HV [t(512) = −1.30, p = 0.19]. See Table 3 and Fig. 1c.

Table 3 Summary of regression moderation analyses for left and right hippocampal volume outcomes (n = 526)

The overall model with A+, diagnosis, sex, and their interactions predicting right HV was significant [F(11,511) = 18.00, p < 0.001, R² = 0.28]. The three-way interaction of A+, diagnosis, and sex showed a trend, suggesting that sex marginally moderated the effects of diagnosis and A+ on right HV [t(511) = −1.17, p = 0.09]. Parsing this interactive effect indicated that women were again impacted differently by A+ depending on diagnosis, with women with A+ eMCI, but not those with A− eMCI, showing smaller right HV [t(511) = −2.71, p < 0.01]. There was no association of A+ with right HV in women with NC [t(511) = −0.77, p = 0.44]. For men, there was a trend toward A+ relating to smaller right HV in A+ eMCI and not A− eMCI [t(511) = −1.75, p = 0.08]. No relationship was observed between A+ and right HV in men with NC [t(511) = −1.58, p = 0.11]. See Table 3 and Fig. 1d.

Discussion

In the present study, we examined the moderating effects of sex on the impact of diagnosis and A+ on verbal learning and memory and HV. The main finding was that sex moderated the effects of A+ and diagnosis on verbal learning. In addition, we showed that sex marginally moderated the effects of A+ and diagnosis on verbal delayed recall and that sex marginally moderated the effects of A+ and diagnosis on right HV. In contrast, no sex moderation effects were observed for left HV.

With respect to cognition, our findings specifically suggest that women’s advantage over men in verbal learning (and, to a lesser extent, delayed recall) was robust to A+ in NC. Moreover, in eMCI, only women with A+, and not those with A−, showed poorer learning and, to a lesser extent, poorer delayed recall. These effects were observed after accounting for baseline cognitive status, age, education, and APOE ε4 carrier status. We conceptualize these findings as consistent with A+ eMCI representing a prodromal AD stage and A− eMCI representing suspected non-Alzheimer’s pathophysiology (SNAP). Our findings are consistent with literature showing better verbal memory performance in women [24] and positing a cognitive or memory reserve advantage for women with fewer prodromal AD traits (i.e., amnestic eMCI but moderate to large HV), but not with more prodromal AD traits (i.e., amnestic eMCI, dementia diagnosis, and small HV) [25, 26, 46]. Our findings are also partially consistent with a very recent study showing that women with low to moderate Aβ burden (but not high Aβ burden) had better verbal delayed recall than men and that this effect was specific to MCI versus NC or AD. A moderating effect of sex shown in the present study may help to explain some conflicting findings in the extant literature because, depending on sample size and diagnostic stage included, collapsing across sexes may lead to masked or exaggerated findings.

The present result showing that sex moderates the effect of A+ and diagnosis on learning and memory also has implications for clinical diagnosis of AD in women. Specifically, as has been suggested in the past [24, 25], memory reserve in women could delay prodromal AD diagnosis even in the face of positive biomarkers such as A+. However, the present results suggest that longitudinal assessment of the potentially steeper decline in memory for women between NC and prodromal AD, which is absent in SNAP, or combining measures of memory with other biomarkers, possibly with an approach that places heavier weight on biomarkers such as A+ early on, could increase diagnostic accuracy. This finding could also be relevant for development of therapeutics for AD, both with respect to inclusion criteria for trials (e.g., guidelines including a memory or learning score deficit requirement could exclude women with preclinical AD unintentionally) as well as outcome measures (e.g., the differing trajectories of memory decline in men and women could either exaggerate or mask important findings, depending on group composition, if sex is not considered).

With respect to HV, our present findings suggest moderating effects of sex for right HV. Similar to the pattern of results for the cognitive data, there was no relationship of A+ with right HV in women with NC. For women with eMCI, those with prodromal AD, but not SNAP, showed smaller right HV. The pattern in men was weaker and not significant, but it was similar. No moderating effects of sex were found for left HV.

Taken together, these findings may suggest that women have a neural reserve at the level of the hippocampus, such that hippocampal integrity is robust to effects of A+ in preclinical stages in women. Importantly, the present results do not imply that women have larger hippocampi and thus more volume to lose. Instead, they would suggest that neural reserve could be defined as a robustness to neurodegeneration, beginning at similar neural volume as men, when adjusting for TIV. Replication in even larger samples, as well as in samples of clinic-typical patients, will be important for understanding whether this is a true example of sex-specific neural reserve or whether findings would be significant in men with larger cohorts. If the latter were true, it might alternatively suggest that A+ is sensitive to concurrent HV loss in early clinical disease stages but not in NC. Larger cohort replication might also be helpful in determining whether the present lateralized findings might be consistent or whether bilateral effects would emerge. Certainly, left hemisphere effects might be expected, given the literature showing that women with NC and women with eMCI have stronger verbal memory [20, 24], and our lateralized results deserve further investigation.

Of note, in the present HV analysis, we intentionally employed a residual correction methodology for TIV [39], based on our specific sample composition as well as on guidelines recently published [17]. Previous work has suggested that a major source of variability in literature describing assessment of HV sex differences may be lack of [15], or differing methods for [16], correcting HV for total brain or intracranial volume. Use of a deliberate statistical approach taking sex into account at all levels may help to reduce or explain contradictory results in the literature relating A+ and sex to HV across NC and eMCI. Further research is needed to determine what pattern of sex moderation may exist at AD dementia stages at which women have been shown to have more rapid trajectories of decline [18, 46].

Strengths of the present study include use of a large, well-characterized study sample employing the prodromal AD diagnosis and rigorous control of potential confounding variables. Limitations include lack of longitudinal analysis, which could help to clarify causality, and use of a smaller cohort of individuals with HV data. It was also beyond the scope of the present analysis to explore ways in which HV may itself be a moderator of cognitive decline or to further probe HV at the level of subfields [47].

Future research is warranted on the longitudinal implications of these findings, as is replication in a larger cohort. In particular, because the present analysis employs clinically defined diagnostic stage groups in which men and women would be expected to express similar clinical symptoms, fine-grained examination of when exactly—or how much—pathology such as amyloid burden leads to cognitive, atrophic, and clinical symptom expression is needed. In addition, further validation and exploration of the currently used modification of the MoCA eliminating the memory component will also be important. Finally, examining the moderating effect of sex on other outcome measures, including hippocampal subfields, nonverbal memory, and resting state functional MRI, may be interesting.

Conclusions

The present study shows that sex moderates the relationship of A+ and diagnosis with verbal learning performance and marginally moderates the effect of A+ and diagnosis on verbal delayed recall and right hemisphere HV. Whereas women with NC show learning and memory scores that are robust to A+ effects, women with prodromal AD lose this advantage; in contrast, A+ affects men’s learning and memory scores weakly or not at all, and comparably across NC and eMCI. For right HV, the marginal sex moderation effect showed that women with NC had no effect of A+ on HV. Women with prodromal AD, but not those with SNAP, lost that advantage in right HV neural integrity; effects among men remain unclear. Further study of sex effects in prodromal AD and AD dementia has the potential to lead to clinical developments that increase diagnostic accuracy at early stages, as well as to increase the accuracy of treatment group formation and outcome assessment when developing novel therapeutics.