Key Summary Points

Why carry out this study?

To assess treatment effects in clinical trials for mild cognitive impairment (MCI) due to Alzheimer’s disease (AD), the FDA recommended the use of a composite measure. Following this recommendation, Eisai developed and validated the Alzheimer’s Disease Composite Score (ADCOMS).

Although the psychometric properties of the ADCOMS have been established, the threshold for clinical meaningfulness has yet to be identified. The objective of this study was to identify that threshold.

What was learned from the study?

Based on a triangulation evaluating the range of estimates for meaningful change, the following cut-off points for meaningful change were proposed: 0.05 for MCI and 0.10 for dementia.

By establishing the thresholds for clinically meaningful change, researchers can more confidently use the ADCOMS as an outcome measure for clinical trials in MCI due to AD and for dementia.

Introduction

Alzheimer’s disease (AD) is the most common type of major neurocognitive disorder, in which the death of brain cells leads to cognitive decline and memory loss [1]. The underlying pathology of AD begins years before the appearance of clinical symptoms. Research is ongoing to better understand the genetic, molecular, and clinical mechanisms of the disease in order to characterize early disease and prevent its progression to AD [2, 3]. Mild cognitive impairment (MCI) is an early stage impairment that precedes AD. Patients with MCI due to AD experience limited cognitive symptoms, but these impairments are mild by definition, and patients are able to function fairly normally [4, 5]. Even though early deficits in cognition may be subtle, it is important to identify and treat them to hopefully delay or prevent more severe neurodegeneration [6, 7].

Prior to the Food and Drug Administration’s (FDA) approval of aducanumab in 2021 [8], no new treatments had been approved by the FDA since 2003. Within that period of time, more than 200 investigational programs failed or were abandoned because of several factors, including treating patients after AD had already progressed, inappropriate treatment dosage, endpoints that lacked sensitivity to change, a lack of understanding of the pathophysiology of AD, and trial design issues [9]. Because early disease has limited and variable symptomatology, it is critical that the measurement of clinical outcomes in trials for AD treatments is sensitive to the specific symptoms and possible treatment effects in the early stages of disease. In its 2013 guidance for developing treatments for early stage AD, the FDA suggested that a composite measure may be the most appropriate way to measure clinical outcomes in a clinical trial for MCI due to AD [10].

Following the FDA’s recommendations, Eisai developed the Alzheimer’s Disease Composite Score (ADCOMS), which combines components of widely used clinical scales that have been shown to be the most responsive to treatment effects [11]. The tool was developed for use in early AD (i.e., MCI and mild AD dementia) clinical trials. The ADCOMS tool was initially developed and validated using data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [12] and the placebo groups from three clinical trials in patients with MCI [11]. To select the items for inclusion in the ADCOMS tool, a linear decline model was developed to depict disease progression over 12 months using all the items from available clinical scales in these datasets. Items with the highest combined sensitivity to change over time were selected for the final tool, which includes four items from the Alzheimer’s Disease Assessment Scale-Cognitive Subscale (ADAS-Cog) [13], two from the Mini-Mental State Examination (MMSE) [14], and all six items from the Clinical Dementia Rating Scale Sum of Boxes (CDR-SOB) [15].

Like the FDA, the EU/US Clinical Trials in Alzheimer’s disease (CTAD) task force also recommends a composite measure for pre-dementia trials and has pointed to the ADCOMS tool as an appropriate measure that incorporates both cognitive and functional elements [16]. The ADCOMS tool has demonstrated improved sensitivity to clinical decline over other individual scores like the ADAS-Cog, CDR-SOB, and MMSE [17], and in a prodromal AD randomized trial, it was able to identify cognitive and functional benefits, suggesting that it is sensitive to treatment changes [18]. Although these psychometric properties of the ADCOMS have been established, the threshold for ascertainment of the clinical meaningfulness of disease progression using the ADCOMS has yet to be identified. Therefore, the objective of this study was to evaluate the clinical meaningfulness threshold of the ADCOMS among patients with MCI due to AD and mild AD. A secondary objective of this work was to compare the threshold between APOE ε4 allele carriers and non-carriers. The APOE ε4 allele increases the risk of AD and is associated with earlier age of dementia onset [19].

Methods

Materials

The following three data sources were used to conduct this study.

Alzheimer’s Disease Neuroimaging Initiative (ADNI) data

Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of the ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure and predict the progression of MCI and early AD. For up-to-date information, see www.adni-info.org. ADNI is a global, prospective, observational cohort study intended to develop clinical, imaging, genetic, and biochemical biomarkers for early detection and monitoring of AD [12]. The ADNI datasets used for this analysis (ADASSCORES, CDR, MMSE and BLCHANGE) were downloaded in December 2020 and included the following ADNI phases: ADNI1, ADNI2, and ADNI GO. The data selected for Visit 1 included the data at screening and baseline (month 0); the data selected for Visit 2 included data at month 12.

National Alzheimer’s Coordinating Center (NACC) data

The NACC is a prospective, longitudinal study that maintains a large database of standardized clinical and neuropathological research data collected from the NIA-funded Alzheimer’s Disease Centers (ADCs) (funded by the National Institute on Aging [U24 AG072122]). The clinical variables contained in the NACC Uniform Data Set version 3.0 (UDS 3.0) were downloaded in December 2020 (N = 38,137). The following visits were selected: days from the initial visit (NACCFDYS) were either 0 for Visit 1 or between 335 and 395 for Visit 2, and the visit number (NACCVNUM) was limited to Visit 1 and Visit 2.
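The visit selection rule described for the NACC data can be sketched as a simple filter. This is only an illustration: the record layout (a dictionary per visit) is hypothetical, the window bounds are assumed to be inclusive, and pairing the day criterion with the matching visit number is one reading of the text rather than a confirmed detail of the authors' procedure. Only the variable names NACCFDYS and NACCVNUM come from the text.

```python
# Hypothetical record layout: one dict per NACC visit.
# NACCFDYS = days from the initial visit, NACCVNUM = visit number (names from the text).

def select_visit(record):
    """Return 1 or 2 if the record qualifies as Visit 1 or Visit 2, otherwise None."""
    days = record["NACCFDYS"]
    visit_number = record["NACCVNUM"]
    if visit_number == 1 and days == 0:
        return 1
    # Assumption: the 335 to 395 day window is inclusive at both ends.
    if visit_number == 2 and 335 <= days <= 395:
        return 2
    return None


# Invented example records, for illustration only.
records = [
    {"NACCFDYS": 0, "NACCVNUM": 1},
    {"NACCFDYS": 360, "NACCVNUM": 2},
    {"NACCFDYS": 500, "NACCVNUM": 2},  # outside the 12-month window
    {"NACCFDYS": 730, "NACCVNUM": 3},  # later visit, not used
]
selected = [select_visit(r) for r in records]
```

A window of roughly 12 months plus or minus one month keeps the NACC follow-up interval comparable to the month 12 visits used for the other two data sources.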

Combined Legacy Dataset

The third data source is the combined legacy dataset that was originally used to develop the ADCOMS [11]. It contains pooled data from four sources: (1) a data collection wave from the observational study, ADNI (ADNI-1; ADNI-MCI, downloaded on 20 May 2010); (2) the placebo group from the Alzheimer’s Disease Cooperative Study (ADCS), a randomized, double-blind, placebo-controlled trial to evaluate the safety and efficacy of vitamin E and donepezil HCl (Aricept) to delay clinical progression from MCI to AD (ADCS-MCI); (3) the placebo group of a 1-year, multicenter, randomized, double-blind, placebo-controlled evaluation of the efficacy and safety of donepezil hydrochloride in patients with MCI (E2020-A001-412 NCT00293176, or the MCI-412 study); (4) the placebo group from the Hippocampus Study: comparative effect of donepezil 10 mg/day and placebo on clinical and radiological markers (E2020-A001-412 NCT00293176 or the MCI-415). The sample sizes for the legacy dataset were as follows: n = 405 from ADNI, n = 264 from ADCS-MCI, n = 388 from MCI-412, and n = 103 from MCI-415. The two visits for this dataset were baseline and month 12.

All procedures performed in these studies involving human participants were in accordance with the ethical standards of the institutional review board and with the 1964 Helsinki Declaration and its later amendments. Permissions were obtained to access and use the data from the ADNI and NACC databases. Participants who were categorized as having clinical impairment at baseline but as cognitively normal at the 12-month visit were excluded from the analyses (N = 7 for ADNI; N = 122 for NACC).

To score the ADCOMS, variables from the MMSE, CDR-SOB, and ADAS-Cog are required. The ADAS-Cog was not collected in the NACC. Therefore, the ADCOMS was imputed for the NACC at baseline and month 12 using the following procedure: ADCOMS was first calculated using the ADNI dataset with the MMSE, CDR-SOB, and ADAS-Cog. Then, ADCOMS was predicted in the ADNI using only the items available in the NACC (MMSE items 1 and 7, CDR-SOB items). The regression weights obtained from that model were then applied to predict the ADCOMS in the NACC data.
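The imputation procedure above (fit a model in ADNI using only the items also available in NACC, then apply the resulting weights to NACC) can be sketched as follows. This is a minimal illustration under stated assumptions: the text says only that "regression weights" were obtained, so ordinary least squares with an intercept is assumed here; the two predictors and all numbers are invented toy values, not the actual MMSE or CDR-SOB items; and the helper names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(features, target):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, features):
    """Apply the fitted intercept and slopes to new feature rows."""
    return [weights[0] + sum(w * v for w, v in zip(weights[1:], f)) for f in features]


# Toy "ADNI" data: two hypothetical predictors shared with "NACC",
# plus the full ADCOMS score computed from all items (invented values).
adni_features = [(0, 0.5), (1, 1.0), (2, 2.5), (1, 3.0), (3, 4.0)]
adni_adcoms = [0.10 + 0.02 * a + 0.05 * b for a, b in adni_features]

# Step 1: fit in ADNI using only the shared items.
weights = fit_weights(adni_features, adni_adcoms)

# Step 2: apply the ADNI weights to the shared items in "NACC".
nacc_features = [(1, 1.5), (2, 3.0)]
nacc_imputed = predict(weights, nacc_features)
```

Fitting in the dataset where the full score is observed and transferring the weights relies on the relationship between the shared items and the ADCOMS being similar in both cohorts.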

The overall study analysis population included the subjects with non-missing ADCOMS at baseline and the 12-month visit. After selecting participants with non-missing ADCOMS at Visit 1 and Visit 2, the final sample size for the study population was n = 3675 for NACC, n = 714 for ADNI, and n = 966 for the legacy dataset. It should be noted that two of the datasets may contain overlapping patients (ADNI dataset N = 714 and the ADNI component of the legacy dataset N = 358). The legacy ADNI data were downloaded in 2010 using ADNI-1 and ADNI-MCI, while the current ADNI data were downloaded in 2020 using ADNI-1, ADNI-2, and ADNI-GO. Based on the size of the two samples, the overlap could account for at most about 50% of the current ADNI sample (358 of 714).

Methods

We used multiple methods to derive the threshold for clinical meaningfulness (anchor-based method, ROC analysis, and distribution-based method). The anchor-based method used the CDR-SOB as the anchor, with sensitivity analyses conducted using the MMSE and ADAS-Cog as alternative anchors. A CDR-SOB change ≥ 0.5 served as the meaningful decline cut-off for the MCI group and a CDR-SOB change ≥ 1 for the dementia group. These cut-offs were based on findings from the Andrews et al. (2019) study, where the minimum threshold for clinically meaningful decline was defined as 0.5 on the CDR-SOB [20]. Between visits in the Andrews et al. study, patients with MCI had an average increase of 0.98 on the CDR-SOB, and patients with dementia had an average increase of 1.63 on the CDR-SOB. Because we are looking here for the minimum values that define decline, and the cut-off had to be a value that the CDR-SOB can register within an individual patient (the scale is scored in 0.5-point increments), 0.5 and 1.0 were selected as the cut-offs for the MCI and dementia stages, respectively. The mean changes in ADCOMS based on the CDR-SOB thresholds for meaningful change were evaluated using t-tests for both MCI and dementia.
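The anchor-based comparison can be sketched as follows: patients are split by whether their CDR-SOB change reaches the cut-off, and the ADCOMS change is then compared between the two groups. The data are invented, the function names are hypothetical, and Welch's unequal-variance t statistic is an assumption, since the text does not state which form of t-test was used.

```python
from math import sqrt
from statistics import mean, variance

# Cut-offs for meaningful decline on the CDR-SOB, as stated in the text.
MCI_CUTOFF = 0.5
DEMENTIA_CUTOFF = 1.0


def split_by_anchor(cdr_sob_change, adcoms_change, cutoff):
    """Split ADCOMS changes into (decline, no_decline) using the CDR-SOB anchor."""
    pairs = list(zip(cdr_sob_change, adcoms_change))
    decline = [a for c, a in pairs if c >= cutoff]
    no_decline = [a for c, a in pairs if c < cutoff]
    return decline, no_decline


def welch_t(x, y):
    """Welch's t statistic for two independent samples (assumed test variant)."""
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


# Invented 12-month change scores for six hypothetical MCI patients.
cdr_sob_change = [0.0, 0.0, 0.5, 1.0, 0.0, 1.5]
adcoms_change = [0.01, -0.02, 0.12, 0.20, 0.03, 0.25]

decline, no_decline = split_by_anchor(cdr_sob_change, adcoms_change, MCI_CUTOFF)
mean_difference = mean(decline) - mean(no_decline)
t_statistic = welch_t(decline, no_decline)
```

The mean difference between the two anchor groups is the quantity carried into the triangulation; the t statistic only indicates whether the groups differ beyond chance.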

The ROC method compared the model-based predictions for different ADCOMS cut-offs using the CDR-SOB pre-defined groups (≥ 0.5 for the MCI group and ≥ 1 for the dementia group). The ADCOMS cut-offs were selected using the highest values of Youden’s index (sensitivity + specificity − 1, a measure of the balance between sensitivity and specificity).
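Cut-off selection by Youden's index can be sketched as below. This is a simplification: it classifies patients directly on their ADCOMS change rather than on model-based predictions, it treats a change at or above the cut-off as a predicted decline (an assumption), and the data and candidate cut-offs are invented.

```python
def youden_index(adcoms_change, declined, cutoff):
    """Return sensitivity + specificity - 1 for one candidate cut-off."""
    pairs = list(zip(adcoms_change, declined))
    true_pos = sum(1 for s, d in pairs if d and s >= cutoff)
    false_neg = sum(1 for s, d in pairs if d and s < cutoff)
    true_neg = sum(1 for s, d in pairs if not d and s < cutoff)
    false_pos = sum(1 for s, d in pairs if not d and s >= cutoff)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity + specificity - 1


def best_cutoff(adcoms_change, declined, candidates):
    """Pick the candidate cut-off with the highest Youden's index."""
    return max(candidates, key=lambda c: youden_index(adcoms_change, declined, c))


# Invented data: ADCOMS change and whether the CDR-SOB anchor indicated decline.
adcoms_change = [0.00, 0.02, 0.04, 0.06, 0.08, 0.12]
declined = [False, False, False, True, True, True]
candidates = [0.03, 0.05, 0.07]

chosen = best_cutoff(adcoms_change, declined, candidates)
```

In this toy example the groups are perfectly separated at 0.05, so that candidate reaches the maximum index of 1; real data overlap, and several neighbouring cut-offs can tie, which is consistent with ranges such as 0.02–0.03 being reported.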

Distribution-based methods to evaluate meaningful change were also used, including three statistics: the ¼ standard deviation (SD), ½ SD at baseline, and the standard error of measurement (SEM). The SEM was computed by multiplying the SD of the measure by the square root of one minus its reliability (in this analysis, the reliability coefficient used was the intra-class correlation coefficient [ICC]). The ICC was computed on a stable population, defined using the following criteria: ‘No change’ response for impairment/worsening in non-memory cognitive domains on neuropsychological testing in the legacy dataset, no subject-reported decline in memory at both the baseline and month 12 visits for the NACC, and a ‘No’ [change] response on the clinician global impression of change (CGIC) at month 12 for the ADNI.
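The three distribution-based statistics follow directly from the baseline SD and the reliability coefficient, as in this short sketch. The input values are invented for illustration, the function names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption.

```python
from math import sqrt
from statistics import stdev


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability); the text uses the ICC as the reliability."""
    return sd * sqrt(1.0 - reliability)


def distribution_thresholds(baseline_scores, icc):
    """Return the quarter-SD, half-SD, and SEM estimates for a set of baseline scores."""
    sd = stdev(baseline_scores)  # assumption: sample SD with n - 1 in the denominator
    return {
        "quarter_sd": 0.25 * sd,
        "half_sd": 0.5 * sd,
        "sem": standard_error_of_measurement(sd, icc),
    }


# Worked example with invented values: SD = 0.10 and ICC = 0.75
# give SEM = 0.10 * sqrt(0.25) = 0.05.
example_sem = standard_error_of_measurement(0.10, 0.75)

# Invented baseline ADCOMS scores for five hypothetical patients.
thresholds = distribution_thresholds([0.10, 0.20, 0.25, 0.30, 0.40], icc=0.75)
```

Because the SEM shrinks as reliability rises, it is only as trustworthy as the stability criterion used to define the population on which the ICC is computed.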

The anchor-based, distribution-based, and ROC methods described above were used for triangulation to derive preliminary thresholds for clinically meaningful change on the ADCOMS for MCI and dementia stages. In addition, we considered APOE ε4 status within MCI and dementia groups to determine whether carrier vs. non-carrier status would alter the thresholds for meaningful change. Probability density function (PDF) plots and cumulative distribution function (CDF) plots by CDR-SOB groups were used to provide supportive evidence for the appropriateness of the proposed thresholds.

Results

There were 5355 participants in the overall analysis sample who had ADCOMS (actual or imputed) at baseline and month 12. Sociodemographic and clinical characteristics are summarized in Table 1.

Table 1 Sociodemographic and clinical characteristics of the study sample by data sources

The mean ADCOMS scores at baseline were 0.22 for the combined legacy dataset (range 0.03–0.58), 0.29 for NACC (range 0.04–2.01), and 0.25 for ADNI (range 0.00–1.17). Across the datasets, the mean ADCOMS scores at baseline were 0.22 for MCI patients and 0.64 for dementia patients (Table 2). There were minor floor effects limited to the normal cognition group (0.4%) and no ceiling effects. The sample consisted mainly of patients with MCI and dementia, although there were some in the NACC and ADNI who were considered to have normal cognition for their age. The combined legacy dataset did not include any dementia patients. The NACC data had fewer MCI patients than dementia patients, while the ADNI dataset had more MCI patients than dementia patients (MCI: combined legacy dataset N = 966; NACC N = 686; ADNI N = 249; dementia: combined legacy dataset N = 0; NACC N = 1253; ADNI N = 159).

Table 2 ADCOMS scores by clinical group at baseline

Validity

The convergent and divergent validities of the ADCOMS were examined by evaluating its correlation with the CDR-SOB, MMSE, ADAS-Cog, and Geriatric Depression Scale (GDS) (Table 3). The results demonstrated that the correlation of the ADCOMS with corresponding neurocognitive assessments was strong (r > 0.50), both at baseline and for the change from baseline to 12 months, with a few exceptions: MMSE in the MCI group at baseline, MMSE in the MCI group change over time, and the ADAS-Cog in both MCI and dementia groups' change over time. The correlation with the CDR-SOB was the strongest, suggesting that the CDR-SOB would make the best anchor for evaluation of the clinical meaningfulness of the ADCOMS. The correlation with the GDS, a measure of depression, was weak and non-significant, providing further confirmation that the ADCOMS is a measure of cognition, not depression.

Table 3 Correlation of ADCOMS with outcome measures

Establishing Clinical Meaningfulness of the ADCOMS

The ability of the ADCOMS to detect change was evaluated as the FDA recommended, namely by comparing changes in ADCOMS to changes in the CDR-SOB, MMSE, and ADAS-Cog that indicated that the patient’s cognition had declined [21]. Only the CDR-SOB results are shown (Table 4). The ADCOMS demonstrated statistically significant differences between the MCI groups defined by a ≥ 0.5 change on the CDR-SOB in all three datasets (p < 0.0001). The ADCOMS was also able to detect change among dementia patients (based on a ≥ 1.0 change on the CDR-SOB). Across datasets, the mean difference between MCI patients with a meaningful decline and those with no meaningful decline (as defined by scores on the CDR-SOB) ranged from 0.17 to 0.20 (Table 4). The mean difference between dementia patients with a meaningful decline and those with no meaningful decline ranged from 0.26 to 0.33. Similar results were observed when subjects were stratified by APOE ε4 carrier status, with the ADCOMS able to detect change within both groups (MCI patients: Table 5; dementia patients: Table 6). However, change from baseline tended to be numerically larger in APOE ε4 carriers among MCI patients. This trend of larger changes in APOE ε4 carriers was not consistently maintained for dementia patients.

Table 4 Anchor-based analysis: ADCOMS change from baseline to month 12 by CDR-SOB change score responder group, stratified by MCI and dementia group
Table 5 Anchor-based analysis: ADCOMS change from baseline to month 12 by CDR-SOB change score responder group, stratified by APOE ε4 carrier status for MCI patients
Table 6 Anchor-based analysis: ADCOMS change from baseline to month 12 by CDR-SOB change score responder group, stratified by APOE ε4 carrier status for dementia patients

To further evaluate thresholds for meaningful change, we evaluated sensitivity, specificity, and Youden’s index from an ROC curve analysis. For MCI patients, the threshold values with the best Youden’s indices from the ROC curve analyses were 0.05 for the combined legacy dataset, 0.02–0.03 for NACC, and 0.05 for ADNI. For dementia patients, the values with the best Youden’s indices were 0.11 for NACC and 0.13 for ADNI.

We also conducted the ROC analysis by APOE ε4 status in patients with MCI and dementia. For MCI patients who were APOE ε4 carriers, the threshold values with the best Youden’s indices from the ROC curve analyses were 0.09, 0.06, and 0.05 for the combined legacy dataset, 0.02–0.03 for NACC, and 0.08 for ADNI. For MCI patients who were non-carriers, the values with the best Youden’s indices were 0.03–0.05 for the combined legacy dataset, 0.02–0.03 for NACC, and 0.04 for ADNI. For dementia patients who were carriers, the value with the best Youden’s index was 0.11 for both NACC and ADNI, while for non-carriers, the best values were 0.06 and 0.11 for NACC and 0.13 for ADNI (Supplemental Table 1).

Table 7 presents all of the above outcomes as well as several distribution-based statistics and the median change in ADCOMS to aid in the process of selecting the best threshold. The resulting thresholds were evaluated in CDF (Figs. 1 and 2) and PDF plots (Supplemental Figures). The dashed vertical lines in Figs. 1 and 2 reflect the proposed thresholds of 0.05 and 0.10 for MCI and dementia, respectively.

Table 7 Triangulation table for meaningful change for MCI and dementia patients
Fig. 1

CDF plot of the change in ADCOMS score from baseline to month 12 by CDR-SOB for MCI patients. The dotted line indicates the proposed ADCOMS threshold for meaningful change for MCI (0.05). ADCOMS Alzheimer’s Disease Composite Score, CDF cumulative distribution function, CDR-SOB Clinical Dementia Rating Scale Sum of Boxes, MCI mild cognitive impairment

Fig. 2

CDF plot of the change in ADCOMS score from baseline to month 12 by CDR-SOB for dementia patients. The dotted line indicates the proposed ADCOMS threshold for meaningful change for dementia (0.10). ADCOMS Alzheimer’s Disease Composite Score, CDF cumulative distribution function, CDR-SOB Clinical Dementia Rating Scale Sum of Boxes

The SEM for MCI ranged from 0.04 to 0.05 across the datasets (Table 7), and it was in a similar range for calculations based on 0.25 SD and 0.5 SD. For dementia patients, the SEM ranged from 0.08 to 0.10 across the datasets.

The median change in the ADCOMS among those who had meaningful decline was 0.14 for MCI patients and 0.26 for dementia patients.

Based on a triangulation evaluating the range of estimates for meaningful change, the following cut-off points for meaningful change were proposed: 0.05 for MCI and 0.10 for dementia. To determine this threshold, the ROC curve analysis was prioritized. However, the distribution-based estimates also provided ranges consistent with the thresholds selected. The mean differences and median values were de-emphasized. The median values reflect the median change in those in the decline and no-decline groups and therefore represent the mid-point of changes, not the threshold that is associated with meaningful change. Accordingly, they were given less weight in the triangulation. The same principle applies to the mean differences. The results of the APOE ε4 subgroup analyses provided roughly similar results, suggesting that the same thresholds for meaningful change can be used in these subgroups.

PDF and CDF plots were produced to supplement the triangulation result and evaluate how well the proposed thresholds performed in separating those with cognitive decline from those without decline (again stratified by meaningful change on the CDR-SOB) for the combined datasets.

Both the PDF and CDF plots show separation between the decline and no-decline groups at the proposed minimal clinically important differences for MCI (0.05) and dementia (0.10) (CDF plots: Fig. 1 for MCI and Fig. 2 for dementia; PDF plots in the supplemental materials).
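The separation that a CDF plot displays at a given threshold can be summarized numerically as the proportion of each group whose change falls at or above it. The sketch below uses invented change scores and hypothetical function names; only the 0.05 MCI threshold comes from the text.

```python
def ecdf(changes, x):
    """Empirical cumulative distribution: share of changes less than or equal to x."""
    return sum(1 for c in changes if c <= x) / len(changes)


def proportion_at_or_above(changes, threshold):
    """Share of patients whose ADCOMS change reaches the threshold."""
    return sum(1 for c in changes if c >= threshold) / len(changes)


MCI_THRESHOLD = 0.05  # proposed threshold for MCI, from the text

# Invented 12-month ADCOMS changes for two hypothetical CDR-SOB groups.
decline_group = [0.04, 0.08, 0.12, 0.20]
no_decline_group = [-0.03, 0.00, 0.02, 0.06]

share_decline = proportion_at_or_above(decline_group, MCI_THRESHOLD)
share_no_decline = proportion_at_or_above(no_decline_group, MCI_THRESHOLD)
separation = share_decline - share_no_decline
```

A larger gap between the two proportions at the threshold corresponds to a wider vertical distance between the two CDF curves at the marked line.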

Discussion

This article builds upon previous research which has established that the ADCOMS is a valid and reliable tool. A major strength is that the overall study sample was derived from three different sources, including the ADNI, NACC, and a legacy dataset made up of data from four studies (the placebo groups from three MCI studies as well as an earlier cut from the ADNI database, which also contained MCI patients). Across these disparate sources of information, using a large sample of patients who ranged in severity from MCI to dementia, the results were very consistent.

The ADCOMS was specifically developed to improve sensitivity to treatment effects by choosing items from well-known measures that were more sensitive to change than other components of the measures. In fact, the ADCOMS has previously demonstrated improved sensitivity to clinical decline when compared to traditional measures for AD, such as the ADAS-Cog [11]. The ADCOMS demonstrated sensitivity to both clinical decline and treatment effect in MCI populations. However, until now, thresholds for clinical meaningfulness were not evaluated. By defining thresholds for clinically meaningful change on the ADCOMS tool, this work is an important step in allowing researchers to interpret clinical findings and estimates of treatment effects in early stage AD trials.

Based on the triangulation evaluating the range of anchor-based and distribution-based estimates and the scores for participants at the relevant timepoints, the following thresholds for minimal clinically meaningful changes on the ADCOMS have been proposed: MCI: 0.05; dementia: 0.10.

Despite the large sample size and use of well-established and published databases, there are several important limitations to our study. One is that we used retrospective databases that were designed for purposes other than instrument validation. Accordingly, we were limited to the measures that were already included in the databases. It might have been beneficial to have other anchors to use for evaluation of meaningful change or convergent and divergent validity. Although we also evaluated several clinician global ratings and concluded that they were not useful for the current analysis, it might have been useful to compare the ADCOMS results to other cognitive assessments such as the Montreal Cognitive Assessment (MoCA) [22]. Furthermore, since the ADCOMS was derived from the measures we used to validate the scale and establish meaningful thresholds for change, the relationships between the underlying concepts might have been overestimated, thus leading to lower estimates for the thresholds of meaningful change. Future research on meaningful change using the ADCOMS should include additional anchors to ensure broader validity.

A final limitation is that the NACC dataset does not include the ADAS-Cog tool, so the ADCOMS had to be imputed as described above in the study methods. However, the model that was used to impute the ADCOMS in the NACC data fit the data from the ADNI and combined legacy databases very well.

Conclusion

This paper builds upon previous research which has established that the ADCOMS is a valid and reliable tool. Establishing thresholds for clinically meaningful change is a critical step in the psychometric validation process that ensures researchers are able to interpret clinical findings. This work on establishing thresholds for clinically meaningful change on the ADCOMS will enable researchers to more readily use the ADCOMS as an outcome measure for clinical trials in both MCI due to AD and dementia.