Introduction

Clinical trials are increasingly focused on treating Alzheimer’s disease (AD) during early, asymptomatic, or pre-dementia stages. Differentiating those likely to progress in clinical diagnosis, from normal cognition to mild cognitive impairment (MCI) or from MCI to dementia/AD, from those unlikely to progress is challenging. Recruiting non-progressors into trials leads to reduced statistical power for detecting efficacy (1–3). As biomarkers are not necessarily correlated with clinical outcomes, using biomarkers alone as screening tools might not be optimal for trial enrichment, may be costly, and is susceptible to sample selection bias because of the invasive nature of biomarker assessments. The importance of accurate identification of progressors is expected to increase because of the recent FDA approval of aducanumab for MCI and mild dementia due to Alzheimer’s disease. The Centers for Medicare and Medicaid Services (CMS) decided to cover treatment costs with this monoclonal antibody only for Medicare beneficiaries enrolled in approved studies (https://www.cms.gov/medicare-coverage-database/view/ncacal-decision-memo.aspx?proposed=N&ncaid=305). Since this agent has a relatively high risk of amyloid-related imaging abnormalities (ARIA) (4), accurately predicting and enrolling those at higher risk of clinical progression into trials is critical for avoiding unnecessary treatments.

In the United States, National Institutes of Health (NIH)-funded Alzheimer’s Disease Centers are required to use a uniform assessment approach (Uniform Data Set [UDS]) and upload these data to the National Alzheimer’s Coordinating Center (NACC) (https://www.naccdata.org). We previously used clinical variables from NACC UDS Version 2 (V2) and a machine learning approach to identify a set of variables predicting normal to MCI conversions within 4 years (5). UDS Version 3 (V3) replaced V2 in 2015 (6, 7). In the current study, we updated our analyses by using V3 and added further analyses. These additional analyses include: 1) using shorter follow-up durations (2 years and 3 years) to predict conversions from MCI to a clinical diagnosis of AD and 2) generating a user-friendly calculator that estimates the probability of progression from subject-specific values of the small set of selected variables. Since patients interested in enrolling in anti-amyloid and other drug trials are likely to be referred to Alzheimer’s Disease Centers (ADCs) for in-depth diagnosis, pragmatic trials can be proposed by including participants who visited ADCs. The current study aims to provide the probability of conversion in clinical diagnosis with high accuracy using the NACC UDS V3 data. The calculator may be useful for pre-screening potential trial participants with high probabilities of clinical progression before pursuing higher-cost biomarker assessments, such as an assessment of amyloid burden by positron emission tomography (PET). The calculator can also be used for study enrichment in other trials and to plan patients’ follow-up.

Methods

Data

NACC data

The NACC at the University of Washington maintains a repository of the UDS collected from participants in all of the National Institute on Aging (NIA)-funded Alzheimer’s Disease Centers (ADC) in the United States. There are over 30 past and present ADCs. The UDS consists of data collection protocols administered systematically to participants enrolled in each ADC (8–16). Participants are recruited, enrolled, and followed on an annual basis, generating center-specific longitudinal cohorts. These participants include individuals with clinical syndromic diagnoses of normal cognition (NC), MCI or cognitive impairments not meeting clinical MCI criteria, and dementia of various etiologies, including AD. Each AD Center enrolls participants in a NACC research cohort according to center-specific priorities. In general, most participants come from clinician referrals, self-referral by patients or family members, active recruitment through community organizations, and volunteers who wish to contribute to research. Most centers also enroll volunteers with normal cognition. NACC participants are not an epidemiologically based sample of the U.S. population with or without dementia. They are best regarded as a referral-based or volunteer case series selected based on each center’s research focus. Consent is obtained at the individual ADCs, as approved by their Institutional Review Boards (IRBs). The UDS data include demographics, medical history, medication use, physical and neurological exam findings, clinical ratings of dementia severity [Clinical Dementia Rating (CDR®) Dementia Staging Instrument] (17), and neuropsychological test scores. Systematic guidelines for clinical diagnosis are based on the most up-to-date published diagnostic research criteria (8, 18–20). Further information on the NACC database may be found at: https://www.alz.washington.edu. Current analyses use data downloaded on March 12, 2021.

Analytical Approaches

We have two independent objectives in this study: MCI progression prediction and AD progression prediction. In the MCI progression prediction, we aimed to differentiate those who converted to a clinical diagnosis of MCI (or AD without having a diagnosis of MCI) within 4 years from baseline normal cognition from those who maintained normal cognition for at least 4 years. For this prediction task, a cognitively normal subject converting to MCI at any time within the 4-year observation window is a positive case, and a cognitively normal subject remaining normal for at least 4 years is a negative case. Subjects who were still cognitively normal when lost to follow-up before year 4 were excluded. We also examined the transition to a specific subtype of MCI, amnestic MCI (aMCI), again with a 4-year observation window. Due to the relatively small sample size of non-amnestic MCI subjects in this cohort, we did not examine the transition to non-amnestic MCI (naMCI). In predicting the progression to a clinical diagnosis of AD, we aimed to differentiate between those converting to AD from MCI and those maintaining MCI status, using two follow-up durations: observation windows of 2 years and 3 years. We used these durations because the majority of clinical trials recruiting MCI participants complete follow-ups within 36 months.
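The labeling rule for the MCI progression task can be sketched in a few lines of Python. This is a minimal illustration and not the study's code: the function name `label_nc_progression`, the representation of a subject as a list of (years since baseline, diagnosis) pairs, and the diagnosis strings are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Each visit is (years_since_baseline, diagnosis). The diagnosis strings
# "NC", "MCI", and "AD" are hypothetical codes chosen for this example.
Visit = Tuple[float, str]


def label_nc_progression(visits: List[Visit], window: float = 4.0) -> Optional[int]:
    """Return 1 (converter), 0 (stable), or None (excluded from the task)."""
    visits = sorted(visits)
    if not visits or visits[0][1] != "NC":
        return None  # only subjects who are cognitively normal at baseline
    # Positive case: MCI (or AD without a prior MCI diagnosis) within the window.
    for years, dx in visits[1:]:
        if years <= window and dx in ("MCI", "AD"):
            return 1
    # Negative case: still normal at a visit at or beyond the end of the window.
    if any(years >= window and dx == "NC" for years, dx in visits):
        return 0
    # Otherwise the status at the end of the window is unknown, e.g. the
    # subject was normal at last contact but lost to follow-up before year 4.
    return None
```

The same pattern would apply to the MCI to AD task by changing the baseline diagnosis, the target diagnosis, and the window to 2 or 3 years.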

Data preparation

Considerable time and effort were devoted to preparing the data for supervised modeling. Baseline numerical patient characteristics were included. Codes that indicate reasons for missingness (e.g., 99, −4) were recoded by creating a dummy variable for each type of missingness when that missingness was considered to contribute to the prediction. After data clean-up and preparation procedures and determination of cognitive status at each assessment time point (see Supplement for more information), we applied the models listed below.
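As a rough illustration of the missingness recoding described above, the following Python sketch splits one raw column into a cleaned column plus one dummy variable per missingness code. The codes 99 and −4 come from the text; the reasons attached to them, the function name, and the column name in the usage note are assumptions for the example only.

```python
from typing import Dict, List, Optional

# Codes that flag missingness. 99 and -4 are the examples given in the text;
# the reason labels attached to them here are assumptions for illustration.
MISSING_CODES: Dict[int, str] = {99: "unknown", -4: "not_available"}


def encode_with_missing_dummies(name: str, values: List[int]) -> Dict[str, List[Optional[int]]]:
    """Split one raw column into a cleaned column plus one dummy per missing code."""
    encoded: Dict[str, List[Optional[int]]] = {
        # Real values are kept; missingness codes are blanked out.
        name: [None if v in MISSING_CODES else v for v in values]
    }
    for code, reason in MISSING_CODES.items():
        # 1 where the raw value equals this missingness code, else 0.
        encoded[f"{name}_missing_{reason}"] = [int(v == code) for v in values]
    return encoded
```

For a hypothetical column `SCORE` with raw values `[25, 99, -4, 30]`, the function returns the cleaned values `[25, None, None, 30]` together with the indicator columns `SCORE_missing_unknown` and `SCORE_missing_not_available`.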

Feature selection method & Classification method

We evaluated predictions in terms of sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUC). In this study, classifiers included Support Vector Machine (SVM), Logistic Regression (LR), and Random Forests. We compared the performance of univariate feature selection methods, including Information Gain (InfoGain), chi-squared test (Chi2), Fisher Score, and analysis of variance (ANOVA). Additionally, we used embedded methods, including LASSO and Decision Tree, which incorporate feature selection within the classifier construction and determine the feature importance simultaneously (joint feature selection property).
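To make the idea of univariate feature selection concrete, the sketch below scores each feature with a one-way ANOVA F statistic against the class label and keeps the highest-scoring features. It is written in plain Python for readability and is only an illustration of the technique; the function names and the dictionary-of-columns data layout are assumptions, and the study's actual implementation is not shown in the text.

```python
from typing import Dict, List, Sequence


def anova_f_score(values: Sequence[float], labels: Sequence[int]) -> float:
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups: Dict[int, List[float]] = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = sum(values) / n
    # Between-group sum of squares: spread of class means around the grand mean.
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    # Within-group sum of squares: spread of values around their own class mean.
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def top_k_features(columns: Dict[str, Sequence[float]], labels: Sequence[int], k: int) -> List[str]:
    """Rank features by F statistic (highest first) and keep the top k names."""
    ranked = sorted(columns, key=lambda name: anova_f_score(columns[name], labels), reverse=True)
    return ranked[:k]
```

A univariate method such as this scores every feature independently of the classifier, which is what distinguishes it from the embedded methods (LASSO, Decision Tree) mentioned above.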

Different numbers of clinical variables (henceforth called “features”) were examined (2, 5, 10, 15, 20, 25, 30, 35, and 40) to compare model performance. The number of features was fixed at the point where AUC did not improve significantly after including additional features. In our pipeline of model performance evaluation, datasets were divided into test and training datasets. We examined each combination of classifier and feature selection method independently by 5-fold cross-validation on the training datasets to automatically select the best combination for our final model according to average validation performance. The whole assessment process was repeated 10 times to compute the average performance across the 10 repetitions. The final model was tested on the test datasets and the corresponding performance metrics were reported.
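The stopping rule for the number of features can be sketched as follows. The text does not state how a significant improvement was judged, so this example substitutes a fixed tolerance on the mean cross-validated AUC; both the tolerance value and the function name are assumptions, not the study's criterion.

```python
from typing import Dict


def choose_feature_count(mean_auc: Dict[int, float], tolerance: float = 0.005) -> int:
    """Smallest feature count that no larger count beats by more than `tolerance`.

    `mean_auc` maps a candidate number of features to the AUC averaged over
    cross-validation folds and repetitions. The tolerance is an assumed
    stand-in for a formal test of significant improvement.
    """
    counts = sorted(mean_auc)
    for i, k in enumerate(counts):
        # Best AUC reachable by adding more features than k.
        best_later = max((mean_auc[j] for j in counts[i + 1:]), default=mean_auc[k])
        if best_later - mean_auc[k] <= tolerance:
            return k
    return counts[-1]
```

In practice the input would hold one averaged AUC for each candidate count (2, 5, 10, ..., 40), computed on the training data only so that the test data stay untouched until the final evaluation.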

Diagnosis

Cognitive statuses assigned during annual visits included normal cognition, MCI, aMCI, and AD. This information was extracted from the NACC data and used to determine progressions (i.e., from normal to MCI and from MCI to AD). The variables used to determine the cognitive status including etiology are described in detail in Supplement A and Supplement Table 1.

Table 1 Sample size for each scenario: NC2MCI, NC2aMCI, MCI2AD_2, and MCI2AD_3

Results

Table 1 summarizes the sample sizes used for each prediction. The largest sample size (number of unique participants) was for examining predictors of MCI to AD transition within 2 years (MCI2AD_2) because we could use samples with only 2 years of follow-up. The smallest sample size was for examining predictors of normal cognition to aMCI. The number of participants who progressed to MCI within 4 years fell from 408 to 183 once we limited MCI incidence to aMCI.

Figure 1 shows that 20 clinical variables was the best candidate setting for our prediction model in the cross-validation process. That is, adding features beyond 20 did not improve performance significantly. Therefore, the performance metrics were assessed using 20 clinical variables in the following model constructions.

Figure 1

AUC of prediction models over different numbers of features

Prediction model performance

Tables 2 and 3 show the models’ test performance for each progression: normal cognition to MCI within 4 years (NC2MCI), normal cognition to amnestic MCI within 4 years (NC2aMCI), MCI to AD within 2 years (MCI2AD_2), and MCI to AD within 3 years (MCI2AD_3). The AUC was lower for predicting MCI transition, at around 70%, than for predicting AD transition, at around 80%. For example, for the NC2aMCI transition, the combination using Random Forest as a classifier and ANOVA as a feature selection method reached the highest AUC, with an average of 74.6% accuracy, 76.4% specificity, 71.6% sensitivity, and 74.0% AUC. For the MCI2AD_3 transition, the combination using Random Forest as a classifier and ANOVA as a feature selection method reached the best performance, with an average of 82.1% accuracy, 75.9% specificity, 85.1% sensitivity, and 80.5% AUC. Overall, accuracy and AUC were the highest for predicting MCI to AD transition within 2 years (85% for accuracy as well as sensitivity and specificity).

Table 2 Overall Performance of Prediction Models for Each Transition
Table 3 Selected Clinical Variables for Each Transition Model (Variable Names are Those Used in the National Alzheimer’s Coordinating Center (NACC) Data Dictionary)

Selected Variables

Table 3 shows the variables selected for each transition model. Clinician’s judgment based on the neuropsychological examination (the variable named COGSTAT in the NACC UDS V3 Data Dictionary) was selected as one of the 20 variables for all 4 transitions. Supplemental Table 2 shows the distribution of responses for each variable among participants who transitioned and those who did not. Supplemental Table 2 can be read as follows: for example, the variable COGSTAT (clinician’s judgement) was more likely to be coded as “1: better than normal for age” or “2: normal for age” at baseline for those who remained normal, and the proportion coded as “3: one or two test scores abnormal” or “4: three or more scores are abnormal or lower than expected” increases as the time to conversion shortens from 3 years to 2 years, implying that those who converted to AD within a shorter duration showed more impairment at baseline, as expected. Transition to AD within 3 years was predicted mostly by neuropsychological tests and CDR (memory, community affairs, and judgement subitems), while transition to AD within 2 years was predicted more by variables indicating whether biomarkers including FDG-PET, amyloid PET, and CSF were assessed. We retained the response category indicating that these assessments were not conducted or were unknown as a potential predictor, instead of removing it, because the missingness itself could be informative for predictions. Among those who progressed from MCI to AD within 2 years, 16% received an FDG-PET assessment, while only 4% of those who did not transition to AD received the assessment (Supplemental Table 2.d). Interestingly, for the MCI2AD_2 transition, whether the participant smoked cigarettes in the last 30 days was selected as one of the predictors: 96% of those who remained MCI did not smoke, compared with 72% of those who transitioned to AD.

Probability threshold which optimizes prediction for pre-screening or study enrichment

Although a higher estimated probability means the subject is more likely to transition, fewer subjects are selected as the probability threshold increases. For example, based on our results, subjects with an estimated transitional probability of 0.9 (90%) or above for the NC2MCI transition have a positive predictive value of 96.6%, indicating that they are very likely to transition to MCI within 4 years. However, only a fraction of our sample (25.3% of the total samples used for NC2MCI prediction) has an estimated transitional probability of 0.9 or above. There is a trade-off between how sure we would like to be in selecting those who progress in clinical symptoms and what proportion of subjects in the sampling pool can be selected for pre-screening or trial enrichment. We provide examples of the sensitivity/specificity associated with each probability and the percentage of subjects selected (i.e., prevalence) in our total sampling pool in Figure 2 (a–d). Examples of how to interpret the figures are described in the footnote.

Figure 2

Estimated transitional probabilities and the associated sensitivity/specificity/positive predictive values with the proportion of subjects with sample estimated probabilities (based on the classifier Logistic Regression and number of features at 20)

Note: For example, in (A), if we select subjects with a predicted probability ranging from 0.7 to 1.0, we can achieve a sensitivity of 87.3% and a positive predictive value of 93.7%, and 41.4% of participants in the test datasets have this probability. On the other hand, if we select subjects with a predicted probability of 0.9 and above, we can achieve a sensitivity of 99.1% and a positive predictive value of 96.6%, but only 25.3% of participants in the test datasets have this probability.
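The trade-off between predictive value and the share of subjects selected can be explored with a short Python sketch such as the one below. It assumes that a subject is selected when the predicted probability is at or above the threshold, which is our reading of the figure rather than a definition stated in the text; the function name and the returned keys are likewise illustrative.

```python
from typing import Dict, Sequence


def threshold_summary(probs: Sequence[float], labels: Sequence[int], threshold: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and fraction selected at one probability cutoff."""
    # Assumption: a subject is "selected" when the predicted probability >= threshold.
    selected = [p >= threshold for p in probs]
    tp = sum(s and y == 1 for s, y in zip(selected, labels))
    fp = sum(s and y == 0 for s, y in zip(selected, labels))
    fn = sum((not s) and y == 1 for s, y in zip(selected, labels))
    tn = sum((not s) and y == 0 for s, y in zip(selected, labels))
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
        "fraction_selected": (tp + fp) / len(probs),
    }
```

Evaluating the function over a grid of thresholds (for example 0.5, 0.6, ..., 0.9) on held-out predictions gives the kind of summary needed to choose a cutoff for pre-screening or enrichment.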

Calculator

We developed a calculator that provides conversion probabilities from person-specific information on the selected (in this study, 20) variables used in the NACC UDS V3. For each transition model, the hyperparameters validated by the previous cross-validation were used to generate the prediction model. In order to generate probability output, we used only logistic regression as the classifier in the calculator. We applied the bootstrap to the classifier and dataset to estimate the mean value of the feature weights (i.e., the coefficient for each feature). With these weights, we were able to estimate the probability for each new subject by using his or her information for the selected variables. The Excel calculator can be downloaded at: https://www.shorturl.at/nHOY2. To illustrate how to use the calculator, Figure 3 shows a screenshot of the Excel sheet that estimates the probability of conversion from normal cognition to MCI within 4 years. Please note that the pull-down menu functionality is available once the file is downloaded.
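The computation behind such a calculator can be sketched as a standard logistic model whose coefficients are averaged over bootstrap refits. The code below is an illustration under that assumption and is not the contents of the Excel file; the function names, the dictionary layout, and the handling of the intercept are hypothetical.

```python
import math
from typing import Dict, List


def mean_weights(bootstrap_fits: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each coefficient over the bootstrap refits of the classifier."""
    names = bootstrap_fits[0].keys()
    return {n: sum(fit[n] for fit in bootstrap_fits) / len(bootstrap_fits) for n in names}


def conversion_probability(weights: Dict[str, float], intercept: float, subject: Dict[str, float]) -> float:
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum_j w_j * x_j)))."""
    # `subject` must hold a numeric value for every selected variable.
    z = intercept + sum(w * subject[name] for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))
```

Each bootstrap refit would contribute one dictionary of coefficients; categorical responses chosen from a pull-down menu would need to be converted to the numeric coding used when the model was fitted before being passed in as `subject`.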

Figure 3

Example of how to use the calculator

In the above example (screenshot), the Excel sheet to estimate the probability of conversion from normal cognition to MCI within 4 years was selected. In this example, the NACCCOGF variable (Clinician Judgment of Symptoms) was highlighted, and its response categories were shown in the pull-down menu. The variable names used in the calculator are the same as those used in the NACC code book so that the study coordinator can easily identify the targeted variable. The code book is located at: https://files.alz.washington.edu/documentation/uds3-rdd.pdf. In this example, given the data entered into this calculator, the estimated probability that the subject converts to MCI within 4 years is 89.1%. This probability is accurate with 95% of the simulated samples, implying that we have high confidence that this subject will convert to MCI within 4 years.

Discussion

The main aim of this study is to generate prediction models (prediction in terms of clinical diagnosis) and a calculator to identify those at high risk of progressing from normal cognition to MCI, and from MCI to AD, within relatively short follow-up durations. We used an observation window of 4 years for normal to MCI conversion and 2 and 3 years for MCI to AD conversions, i.e., common durations of follow-up in pharmacological trials. Our user-friendly calculator can be used for preselecting subjects for further assessment of in-vivo biomarkers or for enriching study participants in various trials. The clinical variables selected are all derived from the data collected uniformly across all NIH-funded Alzheimer’s Disease Centers in the United States (UDS V3). In tables and supplemental materials, we included the variable names used in the NACC Data Dictionary, which is downloadable from the website (https://www.naccdata.org), so that researchers can replicate or expand our studies. Although the subjects used for this study are not from an epidemiological cohort and are not representative of the general elderly population in the US, these subjects are likely to represent those interested in participating in AD and related dementia (ADRD) clinical trials.

Similar to our previous study using an older UDS version (UDS V2) (5), neuropsychological test results played important roles in predictions: MoCA (global cognition), Craft Story (learning and delayed memory), Animal Fluency (language, language-based executive functions), Trail Making A (psychomotor speed), Multilingual Naming Test (naming), and Benton Figure Delayed Drawing (visuospatial) are important predictors. Detailed descriptions of these tests have been published previously (29). As expected, participants’ reports of their subjective cognitive decline (DECSUB) played a role in predicting transitions to MCI, but informants’ reports were more important for predicting transitions to AD, consistent with prior studies of these transitions (21).

An intriguing finding is that the variables consistently predicting transitions involved clinicians’ judgements of participants’ degree of cognitive (either memory or overall) impairments. This variable was consistently selected as an important predictor, both in our previous study (2), and in other studies (22). Clinicians’ judgements synthesize information that might not be necessarily reported in the UDS, such as the way respondents answer questions, facial expressions and speech utterances, as well as detailed information collected from informants/partners. Although we do not believe that machine learning-based approaches can fully replace clinical judgements, it might be possible to develop algorithms that resemble clinical judgement using information not reported in standard paper-pencil formats. For example, a large number of studies using speech characteristics (linguistic and acoustic) showed promising results in differentiating early-stage MCI subjects from those with normal cognition (23–27). Additionally, information about longitudinal medical history free from recall bias might be supplemented by Electronic Health Records (EHR), using modeling approaches similar to those used in this and other studies (28).

Many approaches have been proposed to predict conversions using only clinical variables, as in our current study (30–34). The analytical approaches and the model performance have been extensively discussed in each study. We would like to emphasize that the choice of a prediction model also depends on the nature of the data the developed model will be applied to in practice, and its usefulness depends on the available data. For example, Bernier et al. used longitudinal “slopes” of cognitive decline and their deviation from the normative decline to predict conversions (33, 34). Using only one global cognitive test is advantageous, but their chart requires at least two data points to obtain the slopes. Our calculator uses data obtained in NACC UDS V3, which takes about 90 minutes or longer to complete and requires trained assessors and clinicians, i.e., it is demanding. Therefore, our approach is practical when patients come from ADCs or memory clinics where NACC UDS is routinely used. We anticipate that patients who are interested in enrolling in anti-amyloid and other drug trials are likely to be referred to memory clinics. The trial study coordinator can enter the data into the calculator to estimate the likelihood of clinical progression within a short interval for study enrichment and/or pre-screening for biomarker assessments.

There are limitations to this study. Each AD Center enrolls participants in the NACC research cohort according to center-specific priorities. The cohort used in this study is not a representative group of the elderly population in the USA. The participants are skewed towards higher income and education strata. Certain racial and ethnic groups in the general population at risk of MCI or AD were also underrepresented. The generalizability of our results is limited and the probability calculator may not be applicable to other cohorts. We used linear models and thus captured only the linear relationships between the predictor variables and target variables in this study.

Conclusion

Drawing on the most recent version (UDS V3) of uniformly collected data from all NIH-funded Alzheimer’s Disease Centers in the United States, we identified a small set of clinical variables which strongly predict transitions from normal cognition to MCI within 4 years and from MCI to AD within 2 and 3 years, updating our previous work (5). We developed a user-friendly conversion probability calculator that can be used for clinical trial enrichment and/or pre-screening for subsequent assessment of biological markers.