Background

Mild cognitive impairment (MCI) is often thought of as a transitional stage between normal cognitive aging and dementia, including Alzheimer’s disease (AD) [1, 2]. However, MCI is linked to substantial biological heterogeneity [3, 4]. Although the annual rate of conversion from MCI to AD is estimated at approximately 10–12% [5,6,7], not all individuals diagnosed with MCI show progressive decline, and clinical outcomes vary: many remain at the MCI level or revert to a cognitively normal (CN) state [8, 9]. Despite how common it is, the heterogeneity of MCI in terms of cognitive trajectories and progression to AD is currently not well understood, which hinders further progress in clinical practice and research.

To date, there are no effective therapies for the disease, and most clinical trials aiming to slow or halt the conversion from MCI to dementia have failed [10]. One possible reason for these widespread failures is that the heterogeneous nature of MCI has been neglected, with all individuals treated as if they were the same. In line with this notion, a recent simulation study showed that individuals with early AD demonstrated substantially varying rates of cognitive decline, even when randomization procedures were applied at the start of the study [11]. One potential way to facilitate the development of effective therapies is therefore to identify and remove subgroups of MCI individuals who have relatively normal cognitive performance and a low rate of progression to AD [12]. The authors of that work suggest that the inclusion of subjects with “disease-free” MCI may attenuate the potential beneficial effects of treatment.

Several previous studies have used subtyping approaches to disentangle the heterogeneity of MCI in an unbiased manner [4, 13,14,15,16,17,18,19]. To delineate cognitive subtypes, several investigators have applied data-driven, specifically cluster-analytic, approaches to cross-sectional neuropsychological assessment data. Delano-Wood et al. were among the first to provide evidence that distinct subgroups of MCI can be empirically derived from cognitive data, using a clinical sample of 70 [20]. Subsequent studies applying empirical methods to neuropsychological test scores have identified multiple MCI subgroups in clinic-based [21,22,23,24,25,26,27,28], community-based [29, 30], and clinical trial [12] samples. However, given the progressive nature of cognitive aging and AD, considering the longitudinal progression rather than just a snapshot of the current state may provide a more comprehensive picture of the prodromal stage of the disease. Xie et al. [31] used group-based trajectory modeling (GBTM) to identify five longitudinal cognitive trajectories on the Mini-Mental State Examination (MMSE) in subjects with MCI. In three other studies, by Lee et al. [32], Kim et al. [33], and Kim et al. [34], GBTM was applied to identify cognitive trajectories on the Clinical Dementia Rating Sum of Boxes (CDR-SB) in subjects with MCI. The study by Lee et al. identified two cognitive trajectories (fast decliners and slow decliners), while the other two investigations identified three (stable, slow decliners, and faster decliners). Despite the prominent role of the Alzheimer’s Disease Assessment Scale-Cognitive Subscale 13 (ADAS-Cog-13) in evaluating the efficacy of antidementia medications [35], only one study has attempted to delineate the heterogeneity of ADAS-Cog-13 trajectories, in 238 participants with amyloid-positive MCI [36]. The three clusters identified in that study showed different cognitive trajectories over time. However, no studies have attempted to uncover MCI heterogeneity based on longitudinal ADAS-Cog-13 scores in both amyloid-positive and amyloid-negative subjects. The inclusion of all MCI subjects regardless of amyloid status is clinically relevant, particularly for clinical trials targeting pathobiological pathways other than the amyloid pathway. Additionally, further validation of the identified cognitive trajectories against other neuropsychological tests, neurodegeneration, in vivo AD pathologies, and clinical progression may enhance the robustness of the resulting clusters.

Thus, by applying a data-driven, longitudinal clustering analysis approach, we investigated whether distinct cognitive trajectories could be derived within the Alzheimer’s Disease Neuroimaging Initiative (ADNI) MCI cohort and, if present, assessed the associations of trajectory membership with longitudinal changes in all major AD biomarkers.

Methods

ADNI database

Data used in the present study were obtained from the ADNI database. The ADNI study was launched in 2003, and its primary goal has been to measure the clinical progression of MCI and early AD by utilizing numerous markers, such as clinical, neuropsychological, and imaging assessments. Recruitment procedures for the ADNI cohort have been described at the following website: www.loni.usc.edu/ADNI, and the ADNI eligibility criteria can be found at the following website: www.adni-info.org/Scientists/ADNIStudyProcedures.html.

Participants

Our study focuses on the 936 individuals who were diagnosed with amnestic MCI at baseline and had at least one follow-up assessment (with ADAS-Cog-13 administration) within the next 5 years. These follow-up time points were included regardless of the diagnostic status of the subject at these visits. Follow-up visits beyond 5 years after baseline were not included in the analyses. An MCI diagnosis was assigned to an individual who met the following criteria: memory complaint, memory impairment as verified by the Logical Memory II subscale (delayed paragraph recall) from the Wechsler Memory Scale-Revised (WMS-R), MMSE [37] score between 24 and 30 (inclusive), CDR [38] score of 0.5, essentially preserved activities of daily living, and absence of AD or dementia. These ADNI MCI criteria are largely consistent with criteria commonly used in clinical trials, such as the Petersen criteria [6].

Each ADNI participant or authorized representative provided written informed consent and the institutional review board of each participating ADNI site approved the ADNI study. This project was also submitted for review to the institutional review board of Wenzhou Seventh People’s Hospital. However, given that this study did not involve contact with human subjects and used de-identified data, the institutional review board of Wenzhou Seventh People’s Hospital determined this study did not require review.

Neuropsychological tests

ADNI participants underwent a comprehensive battery of neuropsychological assessments during visits. Four primary cognitive tests were selected in this analysis. We included the ADAS-Cog-13 [39], which examines 13 aspects of cognitive function (range: 0–85, higher scores represent more severe cognitive impairment). MMSE, one of the most popular brief screening cognitive tests, was included as a measure of global cognition. The ADNI Memory composite score was derived from MMSE, logical memory, ADAS-Cog (3 versions), and Rey Auditory Verbal Learning Test (2 versions) [40]. The ADNI executive function composite score was developed from Clock Drawing, Digit Span Backwards, Category Fluency, Trails A and B, and Wechsler Adult Intelligence Scale-Revised Digit Symbol Substitution. Both the ADNI Memory and Executive function composite scores have been validated in previously published studies [41, 42].

Structural magnetic resonance imaging (MRI) measures

The procedure for MRI acquisition has been described previously [43]. Temporal lobe atrophy, particularly in the hippocampal formation and entorhinal cortex, has been considered to be the biological alteration most proximal to the onset of cognitive impairment [44]. Additionally, ventricular enlargement is thought to be a valid measure of clinical progression in individuals with MCI or AD [45]. In this study, we therefore focus on three structural MRI markers: hippocampal, entorhinal cortex, and ventricular volumes. To adjust for sex differences in head size, each marker was normalized by intracranial volume using the following formulas:

$$Adjusted\;hippocampal\;volume\;(aHV)\;=\;hippocampal/intracranial\;volume\;\times\;10^3$$
$$Adjusted\;entorhinal\;cortex\;volume\;(aEV)\;=\;entorhinal\;cortex/intracranial\;volume\;\times\;10^3$$
$$Adjusted\;ventricular\;volume\;(aVV)\;=\;ventricular/intracranial\;volume\;\times\;10^3$$
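
As a minimal illustration of the ratio adjustment above, the three formulas share one computation. The function name and the example volumes below are hypothetical and are not taken from the ADNI pipeline; both volumes are assumed to be in the same units.

```python
def icv_adjusted_volume(regional_volume: float, intracranial_volume: float) -> float:
    """Ratio-adjusted volume: regional volume / intracranial volume x 10^3.

    Both inputs must be in the same units (e.g., mm^3). The function name
    is illustrative only.
    """
    if intracranial_volume <= 0:
        raise ValueError("intracranial volume must be positive")
    return regional_volume / intracranial_volume * 1e3


# Hypothetical example values (not ADNI data).
hippocampal_mm3 = 7000.0
intracranial_mm3 = 1_400_000.0
aHV = icv_adjusted_volume(hippocampal_mm3, intracranial_mm3)
```

The same call applies to entorhinal cortex and ventricular volumes to obtain aEV and aVV.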

PET imaging measures

The cerebral metabolic rate for glucose was determined by [18F] fludeoxyglucose (FDG) positron emission tomography (PET). A standard procedure for image pre-processing can be found at the following website: http://adni.loni.usc.edu/methods/pet-analysis/pre-processing/. Previous ADNI studies developed a “MetaROI” of the brain regions that demonstrate hypometabolic alterations among subjects with MCI and AD [46, 47]. These brain regions included the bilateral posterior cingulate, the right and left angular gyri, and the right and left inferior temporal gyri. Global standardized uptake value ratios (SUVRs) were calculated by averaging FDG uptake across the MetaROI and dividing by uptake in the pons and cerebellum.

Brain amyloid deposition was examined by [18F] florbetapir (AV45) PET as shown in http://www.adni-info.org. The mean AV45 uptake was determined within four regions, including frontal, lateral parietal, anterior/posterior cingulate, and lateral temporal regions. Global SUVRs were calculated by averaging across four brain regions and dividing by the whole cerebellum.
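
The SUVR computations described above reduce to a mean over target regions divided by uptake in a reference region. The sketch below assumes that regional uptake values are already available as plain numbers; the function name and the example values are hypothetical and do not reproduce the ADNI processing pipeline.

```python
from statistics import fmean
from typing import Sequence


def global_suvr(target_uptakes: Sequence[float], reference_uptake: float) -> float:
    """Mean uptake across target regions divided by reference-region uptake."""
    if not target_uptakes:
        raise ValueError("at least one target region is required")
    if reference_uptake <= 0:
        raise ValueError("reference uptake must be positive")
    return fmean(target_uptakes) / reference_uptake


# Hypothetical AV45 uptake values for the frontal, lateral parietal,
# cingulate, and lateral temporal regions, with whole cerebellum as reference.
av45_suvr = global_suvr([1.2, 1.0, 1.4, 1.2], reference_uptake=1.0)
```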

Genetic/CSF-based biomarkers

APOE (gene map locus 19q13.2) genotypes of the study participants were extracted from the ADNI database. Participants with no APOE ε4 allele were classified as APOE4 non-carriers, while participants with at least one APOE ε4 allele were categorized as APOE4 carriers. The levels of CSF Aβ42, total-tau (t-tau), and phosphorylated-tau at threonine 181 (p-tau) were determined by the Department of Pathology & Laboratory Medicine and Center for Neurodegenerative Disease Research, Perelman School of Medicine, University of Pennsylvania. The multiplex xMAP Luminex platform and Innogenetics INNO-BIA AlzBio3 immunoassay reagents were used [48].

Clustering of longitudinal trajectories of cognitive performance

Given that ADAS-Cog is the gold standard for evaluating the efficacy of antidementia medications [49], ADAS-Cog-13 (a modified version of ADAS-Cog that includes more cognitive tasks than the original one and is thought to be more sensitive to cognitive impairment at the earliest stages of dementia) was used as our primary cognitive outcome. Only participants with at least one follow-up assessment of ADAS-Cog-13 were included in this analysis. To identify distinct longitudinal cognitive profiles, a non-parametric k-means longitudinal clustering method in the R package “kml” [50] was applied to detect trajectories of cognitive decline over a 5-year follow-up period. Before the clustering analysis, missing values were imputed by the “Copy Mean” method [50]. Briefly, the basic idea of this method is to impute missing values either using linear interpolation or last occurrence carried forward (LOCF) and then add a variation to make the individual trajectory similar to the “shape” of the overall sample’s average trajectory. We built the models for 1 to 8 clusters and selected the 4-cluster solution based on the Bayesian information criterion (BIC) [51] and the elbow method. A visual representation of the elbow method was created (see Additional file 1: Fig. S1). The raw individual trajectories and resultant 4-cluster trajectories are demonstrated in Fig. 1.
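
The clustering idea can be sketched in a few lines. The code below is not the “kml” implementation: it is a simplified Python illustration, under stated assumptions, that fills missing visits by linear interpolation (carrying the nearest observed value to the edges) and then runs plain k-means with Euclidean distance on whole trajectories. It omits the Copy Mean adjustment toward the average trajectory, the repeated restarts, and the model selection step; all function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple


def interpolate_missing(traj: Sequence[Optional[float]]) -> List[float]:
    """Fill None by linear interpolation; edges take the nearest observed value."""
    observed = [i for i, v in enumerate(traj) if v is not None]
    if not observed:
        raise ValueError("trajectory has no observed values")
    filled = list(traj)
    for i, value in enumerate(traj):
        if value is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = traj[right]
        elif right is None:
            filled[i] = traj[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = traj[left] + weight * (traj[right] - traj[left])
    return filled


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_trajectories(
    trajs: List[List[float]], k: int, n_iter: int = 100, seed: int = 0
) -> Tuple[List[int], List[List[float]]]:
    """Plain k-means on equal-length trajectories; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(t) for t in rng.sample(trajs, k)]
    labels: Optional[List[int]] = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(t, centers[c]))
            for t in trajs
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [t for t, lab in zip(trajs, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

In practice the result of k-means depends on the starting centers, so an implementation would typically be rerun from several random starts for each candidate number of clusters before comparing solutions.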

Fig. 1

Cognitive trajectories based on ADAS-Cog-13 from baseline to 5 years. Overall, 936 participants with MCI were included in this analysis, each with at least 2 ADAS-Cog-13 data points over a 5-year follow-up period. Four cognitive trajectories were identified by the longitudinal k-means cluster analysis. The solid blue, long-dash green, dash orange, and dot-dash red lines represent clusters 1, 2, 3, and 4, respectively. The thin gray lines represent individual cognitive trajectories. Abbreviation: ADAS-Cog-13, Alzheimer’s Disease Assessment Scale-Cognitive Subscale 13

Statistical analyses

At baseline, we used the R statistical software v4.1.2 [52] to explore the relationships between cluster membership and demographics, APOE4 status, neuropsychological evaluations, structural MRI assessments, PET imaging markers, and CSF AD pathologies. The differences between clusters were assessed with analysis of variance (ANOVA) for continuous variables and Pearson’s χ² tests for categorical variables. When group differences were detected with ANOVA or Pearson’s χ² tests, we used pairwise t-tests or χ² tests in post hoc analyses and corrected for multiple testing using the false discovery rate (FDR) correction [53]. Group comparisons are also shown visually in Fig. 2.
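
To make the multiple-testing step concrete, the sketch below implements the Benjamini-Hochberg step-up procedure, which is the most common FDR correction. That the cited FDR method [53] corresponds to this procedure is an assumption here, and the function name and example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four pairwise comparisons.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

A pairwise difference would then be declared significant when its adjusted p-value falls below the chosen threshold (for example, 0.05).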

Fig. 2

Baseline characteristics by cluster. Differences between clusters were assessed with ANOVA. When group differences were detected with ANOVA, we used pairwise t-tests in post hoc analyses and corrected for multiple testing using the FDR correction. Abbreviations: MCI, mild cognitive impairment; MMSE, Mini-Mental State Examination; ADAS-Cog-13, Alzheimer’s Disease Assessment Scale-Cognitive Subscale 13; aVV, adjusted ventricular volume; aEV, adjusted entorhinal cortex volume; aHV, adjusted hippocampal volume; FDG, fludeoxyglucose; SUVRs, standardized uptake value ratios; Aβ42, β-amyloid; t-tau, total tau; p-tau, phosphorylated tau

Linear mixed-effects models were used to examine the associations of cluster membership with the longitudinal change in cognition, structural MRI markers, PET imaging markers, and CSF AD biomarkers over up to 5 years from baseline. Eleven models were created for the following dependent variables: MMSE, memory composite score, executive function composite score, aHV, aEV, aVV, FDG SUVRs, AV45 SUVRs, CSF Aβ42, t-tau, and p-tau. Time since baseline (years), clusters, and their interaction were included as fixed effects. Age, gender, years of education, APOE4 status, and their interactions with time were included as covariates. All models included a random intercept for each participant. The model equations are as follows:

$$Y_{change}\;\sim\;Clusters^\ast time\;+\;Age^\ast time\;+\;Gender^\ast time\;+\;Education^\ast time\;+\;APOE4\;status^\ast time$$

where Y_change is the change in each dependent variable from baseline.
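
For readers who prefer explicit notation, the formula above can be written out as follows. This is a sketch under stated assumptions: the symbols are introduced here for illustration, the stable MCI cluster is taken as the reference category (consistent with the comparisons reported relative to stable MCI), and the random intercept described in the text is shown as $b_i$.

$$Y_{change,ij}\;=\;\beta_0\;+\;\beta_1 t_{ij}\;+\;\sum_{k=2}^{4}\left(\gamma_k\;+\;\delta_k t_{ij}\right)C_{ik}\;+\;\sum_{m}\left(\theta_m\;+\;\phi_m t_{ij}\right)X_{im}\;+\;b_i\;+\;\varepsilon_{ij}$$

Here $i$ indexes participants and $j$ indexes visits, $t_{ij}$ is the time since baseline in years, $C_{ik}$ indicates membership of participant $i$ in cluster $k$, $X_{im}$ are the covariates (age, gender, years of education, and APOE4 status), $b_i$ is the participant-specific random intercept, and $\varepsilon_{ij}$ is the residual error. The cluster × time coefficients $\delta_k$ capture how the slope of each cluster differs from that of the reference cluster.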

Additionally, to further understand the slope differences of these four cluster groups, pairwise comparisons between clusters using the estimated marginal means (EMMs) were performed and the FDR method was used to correct for multiple testing.

Kaplan-Meier curves were constructed to show the rate of conversion to dementia in the four clusters, and pairwise log-rank tests were performed to compare the survival curves. For subjects who converted, follow-up duration was defined as the number of years from baseline to the visit at which dementia was diagnosed. Subjects who did not convert to dementia during follow-up were censored at their last visit.
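
The Kaplan-Meier estimate itself is a running product over event times: at each time with at least one conversion, the survival probability is multiplied by one minus the fraction of at-risk subjects who converted. The sketch below illustrates this with hypothetical data; the function name is illustrative, and censored subjects are counted as at risk up to and including their censoring time.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    durations: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    durations: years from baseline to conversion or censoring.
    events: True if the subject converted to dementia, False if censored.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        n_events = sum(1 for d, e in zip(durations, events) if e and d == t)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up data for five subjects (not ADNI data).
curve = kaplan_meier([1, 2, 2, 3, 4], [True, False, True, True, False])
```

For these five subjects the estimate steps down at years 1, 2, and 3, and the subjects censored at years 2 and 4 contribute to the at-risk counts without triggering a step.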

Results

Findings of longitudinal k-means cluster analysis

As illustrated in Fig. 1, MCI subjects were assigned into the following clusters according to their cognitive trajectories: (1) cluster 1 with stable cognitive performance (we called this cluster “stable MCI,” n = 255, 27%), (2) cluster 2 with a mild cognitive decline (we called it “mild decliners,” n = 336, 35%), (3) cluster 3 with a moderate cognitive decline (we called it “moderate decliners,” n = 240, 26%), and (4) cluster 4 with a steep cognitive decline (we called it “aggressive decliners,” n = 105, 11%).

Baseline cluster characteristics

Table 1 shows the demographic and clinical characteristics by cluster. For age, the participants in the mild, moderate, and aggressive decliners groups were older than those in the stable MCI group, while no other pairwise difference was found. For education, the participants in the mild, moderate, and aggressive decliners groups were less educated than those in the stable MCI group, while no other pairwise difference was significant. For gender, the mild decliners group had a lower female proportion than the stable MCI group, while no other pairwise difference was observed. For APOE4 status, the participants in the moderate decliners and aggressive decliners groups had a higher proportion of APOE4 carriers than those in the stable MCI and mild decliners groups, while no other pairwise difference was found. For follow-up duration, the stable MCI group had a longer follow-up duration than all other groups, and the aggressive decliners group had a shorter follow-up duration than the mild decliners and moderate decliners groups. Regarding neuropsychological performance, all cognitive tests (MMSE, ADAS-Cog-13, memory composite score, and executive function composite score) showed significant differences between clusters. Regarding structural MRI assessments, four groups showed significant differences, with the exception of equivalent levels between the mild decliners and moderate decliners groups on aVV. For brain glucose metabolism, the moderate decliners and aggressive decliners groups did not differ in FDG SUVRs, while all other pairwise differences were significant. For both brain amyloid PET and CSF Aβ42, all pairwise differences were significant, with the exception of comparable levels between the moderate decliners and aggressive decliners groups. Regarding CSF tau pathologies, all pairwise differences were significant, with the exception of comparable levels between the moderate decliners and aggressive decliners groups on CSF p-tau. 
These group differences are also displayed in Fig. 2 for visual inspection.

Table 1 Cluster characteristics at baseline

Cluster membership and longitudinal changes

The results from the linear mixed-effects models examining the associations between cluster membership and longitudinal changes in cognition, neurodegeneration, and CSF AD pathologies are displayed in Table 2 and Figs. 3 and 4.

Table 2 Summary of linear mixed-effects models
Fig. 3

Forest plots showing the effect difference relative to stable MCI. Circles represent coefficients (as shown in Table 2), and horizontal dark lines represent the 95% confidence intervals. Abbreviations: MMSE, Mini-Mental State Examination; aVV, adjusted ventricular volume; aEV, adjusted entorhinal cortex volume; aHV, adjusted hippocampal volume; FDG, fludeoxyglucose; SUVRs, standardized uptake value ratios; Aβ42, β-amyloid; t-tau, total tau; p-tau, phosphorylated tau

Fig. 4

Cluster membership and longitudinal changes in all major AD biomarkers over a 5-year follow-up. Intercepts and slopes of the four clusters come from linear mixed-effects models. There were significant differences between all four clusters in the amount of change in MMSE, memory, executive function, and aVV (A–D). All pairwise differences in the rates of decline in aEV, aHV, and FDG-PET were significant, with the exception of comparable rates between clusters 3 and 4 (E–G). However, the four clusters exhibited similar rates of change in Aβ-PET, CSF Aβ42, t-tau, and p-tau proteins (H–K). Abbreviations: MMSE, Mini-Mental State Examination; aVV, adjusted ventricular volume; aEV, adjusted entorhinal cortex volume; aHV, adjusted hippocampal volume; FDG, fludeoxyglucose; SUVRs, standardized uptake value ratios; Aβ42, β-amyloid; t-tau, total tau; p-tau, phosphorylated tau

For the models involving cognitive assessments (MMSE, the ADNI memory composite score, and the ADNI executive function composite score; see Table 2, Figs. 3A–C and 4A–C), the clusters (mild decliners, moderate decliners, and aggressive decliners) × time interactions were all significant such that the mild decliners, moderate decliners, and aggressive decliners groups had steeper slopes (i.e., faster cognitive decline) compared to the stable MCI group. To further understand the group differences in slope, post hoc analyses using the FDR correction were performed. The group differences in slope were significant in all pairwise comparisons (all FDR-adjusted p < 0.0001).

For the aVV model (see Table 2, Figs. 3D and 4D), the clusters × time interactions were all significant such that the mild decliners, moderate decliners, and aggressive decliners groups had steeper slopes on aVV (i.e., faster ventricular enlargement) compared to the stable MCI group. To further understand the group differences in slope, post hoc analyses using the FDR correction were performed. The group differences in slope were significant in all pairwise comparisons (all FDR-adjusted p < 0.0001).

For the aEV and aHV models (see Table 2, Figs. 3E, F and 4E, F), the clusters × time interactions were all significant such that the mild decliners, moderate decliners, and aggressive decliners groups had steeper slopes (i.e., faster entorhinal atrophy and faster hippocampal atrophy) compared to the stable MCI group. To further understand the group differences in slope, post hoc analyses using the FDR correction were performed. All pairwise differences in slope were significant, with the exception of comparable levels between the moderate decliners and aggressive decliners groups on entorhinal atrophy (coefficient: − 0.00551, SE: 0.01223, FDR-adjusted p: 0.6526) and on hippocampal atrophy (coefficient: 0.0117, SE: 0.0087, FDR-adjusted p: 0.1798).

For the FDG SUVRs model (see Table 2, Figs. 3G and 4G), the clusters × time interactions were all significant such that the mild decliners, moderate decliners, and aggressive decliners groups had steeper slopes on FDG SUVRs (i.e., faster decline in brain glucose metabolism) compared to the stable MCI group. To further understand the group differences in slope, post hoc analyses using the FDR correction were performed. All pairwise differences in slope were significant, with the exception of comparable levels between the moderate decliners and aggressive decliners groups on FDG SUVRs (coefficient: 0.0020, SE: 0.0030, FDR-adjusted p: 0.4963).

For the AV45 SUVRs model (see Table 2, Figs. 3H and 4H), the aggressive decliners × time term, but not the mild decliners × time or moderate decliners × time term, was significant, indicating that the aggressive decliners group rather than the mild decliners or moderate decliners group differed in slopes on AV45 SUVRs compared to the stable MCI group. When corrected by the FDR method, however, post hoc analyses did not find any pairwise differences in slopes (all FDR-adjusted p > 0.05).

For the CSF Aβ42 model (see Table 2, Figs. 3I and 4I), the clusters × time interactions were not significant, suggesting that the mild decliners, moderate decliners, and aggressive decliners groups did not differ in slopes on CSF Aβ42 compared to the stable MCI group. Furthermore, post hoc analyses also did not find any pairwise differences in slopes (all FDR-adjusted p > 0.05).

For the CSF t-tau model (see Table 2, Figs. 3J and 4J), the aggressive decliners × time term, but not the mild decliners × time or moderate decliners × time term, was significant, indicating that the aggressive decliners group rather than the mild decliners or moderate decliners group had a steeper slope on CSF t-tau compared to the stable MCI group. When corrected by the FDR method, however, post hoc analyses did not find any pairwise differences in slopes (all FDR-adjusted p > 0.05).

For the CSF p-tau model (see Table 2, Figs. 3K and 4K), the clusters × time interactions were not significant, suggesting that the mild decliners, moderate decliners, and aggressive decliners groups did not differ in slopes on CSF p-tau compared to the stable MCI group. Furthermore, post hoc analyses also did not find any pairwise differences in slopes (all FDR-adjusted p > 0.05).

Progression to dementia

Of the 936 subjects, 307 (32.8%) progressed to dementia within a 5-year follow-up period. Kaplan-Meier curves showing the rate of conversion to dementia are presented in Fig. 5. A log-rank test found significant cluster differences in survival curves (χ²(3) = 462; p < 0.001). All pairwise comparisons with the FDR correction were significant (FDR-adjusted p < 0.001). Regarding the type of dementia, 294 of the 307 (95.8%) who converted to dementia were diagnosed with AD dementia. Thirteen subjects (4.2%) progressed to a non-AD dementia (4 frontotemporal dementia, 3 primary progressive aphasia, 1 progressive supranuclear palsy, 1 vascular dementia, 1 Shy-Drager syndrome, 1 semantic dementia, 1 with Parkinson’s disease and Lewy body dementia features, 1 other CNS disorder).

Fig. 5

Kaplan-Meier survival curves showing the rate of progression to dementia in the four clusters. All clusters differed significantly from one another. Abbreviation: MCI, mild cognitive impairment

Sensitivity analyses

To examine whether our MRI results are robust to a different ICV adjustment approach, we used the residual approach to obtain adjusted MRI volumes [54]. Coefficients and 95% confidence intervals of the three linear mixed-effects models are summarized in Additional file 1: Fig. S2. Compared to the results in Fig. 3D–F, the patterns of coefficients and 95% confidence intervals of these three models remained unchanged.
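
One common form of the residual approach regresses each regional volume on ICV and subtracts the part of the volume predicted by a subject’s deviation from the mean ICV. The sketch below illustrates that form only; whether the cited method [54] estimates the slope in the full sample or in a reference group is not stated here, so fitting on the full sample is an assumption, and the function name and example values are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence


def residual_adjusted_volumes(
    volumes: Sequence[float], icvs: Sequence[float]
) -> List[float]:
    """Adjusted volume = raw volume - slope * (ICV - mean ICV).

    The slope comes from an ordinary least-squares regression of volume
    on ICV, fitted here on all supplied subjects (an assumption).
    """
    if len(volumes) != len(icvs):
        raise ValueError("volumes and icvs must have the same length")
    mean_icv = fmean(icvs)
    mean_vol = fmean(volumes)
    sxx = sum((x - mean_icv) ** 2 for x in icvs)
    if sxx == 0:
        raise ValueError("ICV values must not all be identical")
    sxy = sum((x - mean_icv) * (y - mean_vol) for x, y in zip(icvs, volumes))
    slope = sxy / sxx
    return [y - slope * (x - mean_icv) for x, y in zip(icvs, volumes)]


# Hypothetical volumes that scale exactly with head size: after adjustment,
# all three subjects end up with the same volume.
adjusted_volumes = residual_adjusted_volumes([5.0, 6.0, 7.0], [1000.0, 1200.0, 1400.0])
```

Unlike the ratio method, this adjustment removes the linear association with ICV by construction, which is why it is often used as a robustness check.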

Discussion

The current study had four key findings. First, there is substantial heterogeneity in disease progression among MCI subjects, even though they were intentionally recruited as an independent clinical entity often thought of as the prodromal stage of dementia [6]. Second, we identified a considerable portion of subjects with MCI (cluster 1) who showed very little cognitive decline over 5 years of follow-up. This group of individuals exhibited a remarkably benign-looking biomarker profile. Third, individuals in clusters 2 and 3 demonstrated relatively mild and moderate cognitive decline trajectories, respectively. Fourth, a subgroup of individuals with MCI (cluster 4) exhibited an aggressive cognitive decline trajectory and was characterized by a pronouncedly abnormal biomarker profile.

We found substantial differences between the four clusters in clinical characteristics and longitudinal changes in major AD biomarkers, indicating that our data-driven clustering method categorized MCI subjects into biologically and clinically different subgroups. Cluster 1, comprising about 27% of our study sample, showed a nearly non-existent rate of change in the ADAS-Cog-13 over a 5-year follow-up period. This subgroup had remarkably better baseline performance on the ADAS-Cog-13, MMSE, memory composite, and executive function scores; a more healthy-looking biomarker profile; and a substantially slower change in clinical progression relative to the rest of the MCI group. In agreement with this finding, previously published studies conducted in a population-based sample [30] and the ADNI data set [55] also identified a very similar MCI subgroup, which had less impaired baseline performance on cognitive tasks and a substantially lower risk of clinical progression compared to other MCI groups. This subgroup is especially relevant in clinical trials because the inclusion of “disease-free” MCI subjects may minimize the potential to detect the beneficial effects of treatment for MCI [12]. Edmonds and colleagues suggest that the identification and removal of this subgroup could maximize the capability to observe the treatment effects of new therapeutics in clinical trials involving subjects with MCI [12]. Better stratification of MCI populations before recruitment in clinical trials may help increase the chances of observing efficacy and will contribute to the development of more efficient study designs.

Cluster 2, the largest MCI subgroup identified in this study (n = 336, 35%), initially performed worse (indicated by a larger intercept in Fig. 1) on ADAS-Cog-13 and exhibited slightly faster cognitive deterioration over time relative to cluster 1. Moreover, levels of AD-associated biomarkers differed significantly between clusters 1 and 2 (Fig. 2). Namely, at baseline, participants in cluster 2 had higher levels of CSF tau pathologies and lower levels of CSF Aβ42 compared to those in cluster 1. These findings are in accordance with earlier longitudinal studies showing that abnormal CSF biomarker profiles predict conversion from MCI to AD with high accuracy [56, 57]. We also found more impaired cognitive performance, as evaluated with the MMSE, memory, and executive function composite scores, in cluster 2 compared to cluster 1 (Fig. 2). These two subgroups demonstrated pronounced differences in Aβ-PET SUVRs, a biomarker of amyloid accumulation in the brain, and FDG-PET SUVRs, a marker of neurodegeneration and synaptic dysfunction [58]. Our findings are consistent with a previous PET imaging study, which suggests that Aβ deposition is a robust predictor of clinical progression from MCI to AD [59], with amyloid changes occurring long before the start of cognitive decline [60]. Likewise, the finding that cluster 1 had higher levels of FDG-PET SUVRs (representing less severe neurodegeneration) relative to cluster 2 agrees with the observation that higher levels of FDG-PET SUVRs are associated with remaining cognitively stable among individuals with MCI [61].

Clusters 3 (n = 240, 26%) and 4 (n = 105, 11%) exhibited substantially steeper cognitive deterioration over time compared to clusters 1 and 2 (Fig. 1), with participants in the cluster 4 group showing the most aggressive cognitive decline trajectory. These findings may have a critical impact on potentiating clinical trials involving MCI subjects based on the predicted magnitude of change in cognition (e.g., ADAS-Cog-13) over time. For example, it is likely that future trials may attempt to enroll those subjects who would be predicted to fall into the cluster 4 group since the inclusion of these aggressive cognitive decliners may enhance the probability of success in clinical trials and lead to a significant gain of power to observe treatment effects. However, this approach should be conducted with caution because cluster 4 is a relatively small group (n = 105, 11%). The inclusion of only those subjects in cluster 4 would likely hinder clinical trials, as participants in other subgroups (89%) would be excluded. Furthermore, future studies should focus on the development of statistical models predicting cluster membership using baseline demographics and clinical characteristics to facilitate the recruitment process for clinical trials.

The amyloid cascade hypothesis of AD postulates that the pathologic process initiates with amyloid deposition (as measured by CSF Aβ42 and Aβ-PET), followed by changes in CSF tau proteins, then changes in FDG-PET and structural MRI, followed by cognitive symptoms [44, 62]. Our results largely support this conceptual model. Specifically, our linear mixed-effects models with four cognitive trajectories as the independent variable and all major AD biomarkers as dependent variables (Fig. 4) found that four distinct cognitive decline trajectories (i.e., different rates of cognitive decline) had comparable rates of changes in Aβ-PET, CSF Aβ42, and tau proteins (Fig. 4H–K) but exhibited significantly different rates of changes in structural MRI and FDG-PET (Fig. 4D–G) at the MCI stage of dementia, in accordance with the notion that cognitive decline is only loosely coupled with changes in Aβ-PET and CSF Aβ42 [63, 64], but is tightly accompanied by changes in markers of neurodegeneration [65,66,67] at the later stages of the disease (e.g., MCI and AD dementia). For instance, previous studies found that among patients who were experiencing a rapid cognitive decline (e.g., AD patients), rates of MRI changes were correlated with cognitive deterioration, while rates of brain amyloid deposition were not [64, 68]. In addition, Vemuri and colleagues found that among subjects with MCI, correlations with cognitive measures were strong with MRI volumes but were not significant with levels of CSF tau [69]. As expected, participants with four distinct cognitive decline trajectories identified by the cluster technique based on longitudinal ADAS-Cog-13 data also exhibited different rates of other cognitive outcomes (i.e., MMSE, memory, and executive function; Fig. 4A–C), and all pairwise differences in rates of cognitive decline were significant. 
Somewhat unexpectedly, we observed a significant difference between cluster 3 and cluster 4 in the rate of change in aVV (i.e., widening ventricles; Fig. 4D), but not in aEV, aHV, or FDG-PET (Fig. 4E–G), even though all of these imaging markers are regarded as markers of neurodegeneration. This discrepancy may arise because, relative to medial temporal atrophy (i.e., aEV and aHV) or hypometabolism on FDG-PET, the enlargement of ventricles is considered to be a more downstream event and is more strongly coupled with change in global cognition over time [70]. It is also likely that several factors not examined in the present study, such as cognitive reserve, brain resilience, and other brain neuropathologies (e.g., vascular damage, Lewy bodies), contribute to the difference in the rate of cognitive decline between clusters 3 and 4 [71, 72].

This study has several limitations. First, we observed highly variable individual trajectories on the ADAS-Cog-13 among individuals with MCI (i.e., the thin gray lines in Fig. 1). The cluster analysis in this study should therefore be regarded as exploratory rather than confirmatory. We acknowledge that a larger sample size in each subgroup, particularly cluster 4, would be needed to yield more robust and generalizable findings. However, our linear mixed-effects models with cluster membership as the independent variable and other cognitive outcomes (i.e., MMSE, memory, and executive function) as dependent variables produced a very consistent pattern of cognitive trajectories (Fig. 4A–C), further supporting the notion that the four trajectories identified in the cluster analysis were stable and robust. Second, we did not incorporate other AD biomarkers in the clustering process, since our primary study goal was to examine the heterogeneity in cognitive decline, and the ADAS-Cog-13 is the predominant assessment used to track disease progression in AD clinical trials [49]. Third, changes in AD biomarkers over long periods are non-linear [62] but were modeled as linear in our linear mixed-effects models. Nevertheless, over a shorter period, changes in AD biomarkers can be modeled as linear functions since such non-linearity seems to be minimal [73]. Fourth, the ADNI memory composite score was derived from several memory assessments, including the memory tasks of the ADAS-Cog. This may introduce some degree of circularity since the ADNI memory composite score partly overlaps with the ADAS-Cog-13.

In conclusion, we identified four distinct cognitive decline trajectories of MCI and further characterized changes in all major AD biomarkers over time for each subgroup. Our findings highlight the importance of considering the heterogeneity of MCI when recruiting participants in clinical trials, thus potentially contributing to better trial design and more precise personalized medicine.