Mini Mental State Examination and Logical Memory scores for entry into Alzheimer’s disease trials

Chapman, Kimberly R.; Bing-Canar, Hanaan; Alosco, Michael L.; Steinberg, Eric G.; Martin, Brett; Chaisson, Christine; Kowall, Neil; Tripodis, Yorghos; Stern, Robert A.

doi:10.1186/s13195-016-0176-z

Research
Open access
Published: 22 February 2016

Volume 8, article number 9 (2016)

Alzheimer's Research & Therapy

Kimberly R. Chapman^1,
Hanaan Bing-Canar^1,
Michael L. Alosco^1,2,
Eric G. Steinberg^1,
Brett Martin^1,3,
Christine Chaisson^1,3,4,
Neil Kowall^1,2,5,6,
Yorghos Tripodis^1,4 &
Robert A. Stern^1,2,7,8 (ORCID: orcid.org/0000-0002-5008-077X)

Abstract

Background

Specific cutoff scores on the Mini Mental State Examination (MMSE) and the Logical Memory (LM) test are used to determine inclusion in Alzheimer’s disease (AD) clinical trials and diagnostic studies. These screening measures have known psychometric limitations, but no study has examined the diagnostic accuracy of the cutoff scores used to determine entry into AD clinical trials and diagnostic studies.

Methods

ClinicalTrials.gov entries were reviewed for phases II and III active and recruiting AD studies using the MMSE and LM for inclusion. The diagnostic accuracy of MMSE and LM-II cutoffs used in AD trials and diagnostic studies was examined using 23,438 subjects with normal cognition, mild cognitive impairment (MCI), and AD dementia derived from the National Alzheimer’s Coordinating Center database.

Results

MMSE and LM cutoffs used in current AD clinical trials and diagnostic studies had limited diagnostic accuracy, particularly for distinguishing between normal cognition and MCI, and MCI from AD dementia. The MMSE poorly discriminated dementia stage.

Conclusions

The MMSE and LM may result in inappropriate subject enrollment in large-scale, multicenter studies designed to develop therapeutics and diagnostic methods for AD.

Background

Alzheimer’s disease (AD) clinical trials and diagnostic studies are responsible for the testing and development of therapeutics and diagnostic methods for AD. These large-scale, multicenter studies must have strict inclusion criteria to accurately identify and discriminate normal cognition, mild cognitive impairment (MCI), and AD dementia and recruit the population of interest to facilitate internal and external validity. This, however, is no straightforward task. Although there have been great gains in the development of biomarkers for the accurate in vivo diagnosis and early detection of AD (e.g., lumbar puncture, positron emission tomography) [1–3], these are invasive procedures typically conducted following initial screening methods. Instead, investigators in AD clinical trials and diagnostic studies often initially rely on brief cognitive screening tests to detect cognitive impairment and classify patients, using a variety of research-derived cut scores, as having normal cognition, MCI, or dementia. The Mini Mental State Examination (MMSE) [4] and the Wechsler Memory Scale (WMS) Logical Memory (LM) test [5] are two screening measures commonly used to determine inclusion in these studies.

The use of the MMSE and LM in AD clinical trials and diagnostic studies to ascertain diagnostic status and determine inclusion may be methodologically problematic. Numerous studies have demonstrated the psychometric limitations of the MMSE, such as large ceiling and floor effects, and sensitivity to practice effects [6–8]. The utility of the MMSE in detecting MCI and AD dementia is indeed limited [9–11]. Perneczky et al. [12] examined the correspondence between the MMSE and Clinical Dementia Rating (CDR) scores and found the MMSE lacked accuracy in the identification of patients with MCI or mild AD dementia.

Scores on the delayed recall dimension of LM (LM-II) can also lack diagnostic utility when administered in isolation. LM-II is associated with significant learning biases [13], and practice effects may undermine its detection of impairment, particularly among potential AD trial subjects who have had repeated exposure to LM. For example, LM has been administered annually to all participants in the National Institutes of Health (NIH)-funded Alzheimer’s Disease Centers (ADCs), as part of the National Alzheimer’s Coordinating Center (NACC) Uniform Data Set (UDS), for approximately 20 years [14]. The consortium of these centers is an important source of enrollment for AD clinical trials. The ability of the LM relative to other tests to accurately detect AD has also been questioned [15], and healthy older adults frequently demonstrate impairments on LM retention [16]. Performance on LM may also be relatively more sensitive to executive dysfunction than episodic memory [17].

Given the diagnostic and psychometric limitations of the MMSE and LM, many AD clinical trials and diagnostic studies in which these instruments are used to determine eligibility may be inappropriately including or excluding subjects. This could influence the reliability and validity of study outcomes due to sampling biases. The recent phase III bapineuzumab and solanezumab trials both included the MMSE as part of study entry criteria, and both failed to meet primary efficacy endpoints [18, 19]. (In fact, no new compounds for the treatment of AD have been approved by the U.S. Food and Drug Administration since 2003.) Although research from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) has been at the forefront of the development of diagnostic methods and biomarkers for AD, ADNI also relies on MMSE and LM scores to determine eligibility.

The extent to which MMSE and LM scores are being used as eligibility criteria in AD clinical trials and diagnostic studies is unclear, and no study has examined whether the cutoffs used in these studies accurately correspond to AD spectrum clinical diagnoses. The purpose of the present study was twofold: (1) to identify all active and recruiting phases II and III AD clinical trials and diagnostic studies in the United States to determine the extent to which the MMSE and LM are used as eligibility criteria and to identify the cutoff scores used to ascertain AD diagnostic category; and (2) to exploit the large NACC database to determine the correspondence between MMSE and LM cutoff scores used in current clinical trials and diagnostic studies and AD spectrum diagnoses made by multidisciplinary diagnostic conference teams. The MMSE is often used to determine dementia severity in clinical and research settings, and past work [12] suggests this may be problematic due to the weak correspondence between the MMSE and the CDR (the gold standard for rating dementia severity), particularly at the mild end of the disease spectrum. Therefore, in the present study we sought, as a secondary aim, to replicate and expand upon the previous smaller-scale study on the MMSE and CDR [12] by examining their correspondence in the large NACC dataset.

Methods

Search criteria

We first examined the extent to which the MMSE and LM are used in AD clinical trials and diagnostic studies, as well as the cutoff scores employed in these studies. To do so, all phases II and III recruiting and active AD trials were identified in the ClinicalTrials.gov database. The search was limited to U.S. trials that listed “Alzheimer’s disease” as a keyword. Inclusion criteria as they pertained to MMSE and LM and their cutoffs were obtained from the inclusion description under “Eligibility.”

Subjects

The diagnostic accuracy of MMSE and LM cutoffs used in AD trials and diagnostic studies was tested using subjects in the NACC database diagnosed with normal cognition (n = 10,741), MCI (n = 5883), and AD dementia (n = 6814). The NACC, established by the National Institute on Aging in 1999 to promote collaborative AD research, is a publicly accessible, longitudinal database of standardized clinical data gathered from 34 past and present ADCs across the United States. The regional ADCs are based in university medical centers, and recruitment is carried out via neurology referrals and community outreach. Each year beginning in 2005, the ADCs have contributed standardized cognitive, behavioral, and functional data for each participant to a UDS that now forms the NACC-UDS database. For full descriptions of the NACC-UDS, please refer to publications by Weintraub et al. [14], Beekly et al. [20, 21], and Morris et al. [22]. Before engagement in the research registry, written informed consent was obtained from all study participants or their legally authorized representatives. All aspects of the study adhered to necessary ethical guidelines and were approved by each local ADC’s human subjects review board.

A formal data request to NACC for this study was approved (proposal ID 606), and data were provided on 28 September 2015. The sample was restricted to initial visits of subjects between the ages of 50 and 100 years with a diagnosis of normal cognition, MCI, or primary possible or probable AD dementia. Baseline evaluations for the current sample occurred between 2005 and 2015. Data queried included the UDS cognitive test battery (see below for version), diagnostic status, CDR score, and demographic variables. The sample was further restricted to those who completed the English version of the MMSE. See Table 1 for study variables.
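
The sample restrictions described above can be sketched as a simple record filter. This is only an illustration: the field names are hypothetical (they are not NACC-UDS variable names), the age bounds are assumed to be inclusive, and "primary possible or probable AD dementia" is collapsed into a single label.

```python
from dataclasses import dataclass

# Hypothetical record layout for illustration; not the NACC-UDS schema.
@dataclass
class Visit:
    subject_id: str
    visit_number: int    # 1 = initial visit
    age: int             # age in years at the visit
    diagnosis: str       # "normal", "MCI", or "AD dementia"
    mmse_language: str   # language in which the MMSE was administered

ELIGIBLE_DIAGNOSES = {"normal", "MCI", "AD dementia"}

def is_eligible(v: Visit) -> bool:
    """Apply the restrictions from the text: initial visit, age 50-100
    (assumed inclusive), eligible diagnosis, English-language MMSE."""
    return (
        v.visit_number == 1
        and 50 <= v.age <= 100
        and v.diagnosis in ELIGIBLE_DIAGNOSES
        and v.mmse_language == "English"
    )

# Made-up records, not NACC data.
visits = [
    Visit("a", 1, 72, "MCI", "English"),
    Visit("a", 2, 73, "MCI", "English"),          # follow-up visit, dropped
    Visit("b", 1, 45, "normal", "English"),       # younger than 50, dropped
    Visit("c", 1, 81, "AD dementia", "Spanish"),  # non-English MMSE, dropped
    Visit("d", 1, 66, "normal", "English"),
]
sample = [v for v in visits if is_eligible(v)]
print([v.subject_id for v in sample])
```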

Table 1 NACC sample characteristics

Diagnostic categories

For the current NACC sample, 23.0 % of neurological diagnoses were made by a single clinician and 77.0 % of the diagnoses were assigned through multidisciplinary diagnostic consensus conferences composed of neurologists, neuropsychologists, geriatricians, and geriatric psychiatrists. Consensus diagnoses were made following presentation and discussion of all examinations, UDS (and other) test findings (including neuroimaging and other biomarkers, if available), and psychosocial and medical history. At the time of data collection for this study, AD dementia was diagnosed on the basis of the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer’s Disease and Related Disorders Association criteria [23]. MCI diagnosis was based on criteria defined by Winblad et al. [24].

Measures

Mini Mental State Examination

The MMSE is a 30-item assessment of global cognitive status that taps into domains such as orientation, concentration, attention, verbal learning (without delayed recall), naming, and visuoconstruction [4]. Despite its weaknesses, the MMSE has long been used to detect and monitor dementia progression.

Logical Memory test

The LM subtest of the WMS-R is a standardized assessment of narrative episodic memory [5]. A short story is orally presented, and the examinee is asked to recall the story verbatim (immediate recall). Approximately 20 or 30 min later, free recall of the story is again elicited (delayed recall). Of the NACC sample, 11,569 subjects were administered the UDS cognitive battery version 1 and 11,869 were given version 2. Between 2005 and 2007, two version 1 examinations were administered (1.1 and 1.2); version 1.1 used a 30-min delay before story recall and version 1.2 used a 20-min delay. The UDS version 2 retained the 20-min delay. For all UDS versions, LM Story A delayed recall was used, and the only difference between version 1.1 and versions 1.2 and 2 was the delay interval. Of note, we were unable to distinguish between subjects who received versions 1.1 and 1.2, but the difference in delay interval has been shown not to be associated with the number of units recalled [14].

Clinical Dementia Rating

The CDR is a widely used, valid, and reliable tool for staging dementia severity [25–27]. Specifically, the CDR is standardized for multicenter use, has demonstrated good interrater reliability and criterion validity, and has been shown to predict neuropathology [26]. In fact, even without an informant, recent work in community-dwelling elderly shows the CDR exhibits strong internal consistency (Cronbach’s α 0.83–0.84) and good interrater reliability (0.95 for global rating) and test-retest reliability (κ = 0.80 for global rating) [28]. The CDR assesses the extent of a person’s impairment in six domains: memory, orientation, judgment/problem-solving, community affairs, home and hobbies, and personal care. An algorithm is used to create an overall rating of impairment severity: 0 (no dementia), 0.5 (questionable dementia), 1.0 (mild dementia), 2.0 (moderate dementia), or 3.0 (severe dementia). Typically, a score of 0.5 is given to individuals with a diagnosis of MCI [25].

Statistical analyses

Subjects were excluded for missing data on the MMSE or LM that was due to physical, cognitive, or behavioral (including refusal) problems that interfered with testing. Receiver operating characteristic (ROC) curves were examined to evaluate the accuracy [area under the ROC curve (AUC)] of the MMSE and LM cutoffs used in AD clinical trials and diagnostic studies (results presented below) in distinguishing the diagnostic groups. The MMSE and LM-II were transformed to binary variables (i.e., above and below the identified cutoff) and served as the test variable. An AUC value of 0.75 was considered to be clinically meaningful [29]. ROC curve analyses were repeated with MMSE and LM as continuous test variables to obtain the sensitivity and specificity values of the cutoffs used in AD clinical trials and diagnostic studies. Positive and negative predictive values (PPVs and NPVs, respectively) were then calculated to determine the diagnostic accuracy of the cutoffs in the NACC sample. The prevalence of MCI and AD dementia used for the calculation of PPV and NPV was based on the prevalence of MCI and AD dementia in this sample and, in some instances, varied according to the age or educational group to which analyses were restricted.
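
As a minimal sketch of the quantities named in this paragraph, PPV and NPV can be derived from sensitivity, specificity, and prevalence using Bayes' rule, and the AUC of a dichotomized test (a ROC curve with a single operating point) reduces to the mean of sensitivity and specificity under the trapezoidal rule. The function names and the input values below are illustrative assumptions, not the software or the figures used in the study.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

def binary_auc(sensitivity: float, specificity: float) -> float:
    """Trapezoidal AUC of a ROC curve through (0, 0), (1 - spec, sens), (1, 1)."""
    return (sensitivity + specificity) / 2.0

# Illustrative inputs only; these are not values reported in the paper.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.90, prevalence=0.35)
auc = binary_auc(sensitivity=0.80, specificity=0.90)
print(round(ppv, 3), round(npv, 3), round(auc, 2))
```

Because prevalence enters both formulas, the same cutoff can yield different PPVs and NPVs in age or education subgroups whose prevalence differs, which is why the prevalence used here varied with the subgroup analyzed.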

κ statistics were used to examine the level of agreement between the MMSE and CDR groups. MMSE score ranges of 30, 29–26, 25–21, 20–11, and 10–0 have previously been shown to map onto CDR scores of 0, 0.5, 1.0, 2.0, and 3.0, respectively, and thus were used in this study to define no, questionable, mild, moderate, and severe dementia [12]. Standard convention was used in the interpretation of κ values in terms of level of agreement [30].
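
The MMSE-to-CDR mapping and the agreement statistic can be sketched as follows. The text does not specify how κ was computed (for example, whether it was calculated per stage or weighted), so the generic unweighted Cohen's κ below is an assumption, and the scores are made up for illustration.

```python
from collections import Counter

def mmse_to_cdr_stage(mmse: int) -> float:
    """Map an MMSE total (0-30) onto the CDR stage bands quoted in the text."""
    if not 0 <= mmse <= 30:
        raise ValueError("MMSE total must be between 0 and 30")
    if mmse == 30:
        return 0.0   # no dementia
    if mmse >= 26:
        return 0.5   # questionable dementia
    if mmse >= 21:
        return 1.0   # mild dementia
    if mmse >= 11:
        return 2.0   # moderate dementia
    return 3.0       # severe dementia

def cohens_kappa(rater_a, rater_b) -> float:
    """Unweighted Cohen's kappa for two equal-length label sequences.
    Undefined (division by zero) when chance agreement equals 1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Made-up scores, not NACC data.
mmse_scores = [30, 28, 24, 18, 9, 27]
cdr_scores = [0.0, 0.5, 0.5, 2.0, 3.0, 0.0]
mmse_stages = [mmse_to_cdr_stage(s) for s in mmse_scores]
kappa = cohens_kappa(mmse_stages, cdr_scores)
print(mmse_stages, round(kappa, 3))
```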

Results

Prevalence of MMSE and LM as inclusion criteria in AD clinical trials and diagnostic studies

There were 111 phases II and III AD trials and diagnostic studies that were listed as recruiting or active. Of those 111, 64 (57.7 %) used the MMSE as an eligibility criterion, including the randomized controlled treatment trials “Effect of Passive Immunization on the Progression of Mild Alzheimer’s Disease: Solanezumab (LY2062430) Versus Placebo” (sponsored by Eli Lilly and Company) and “Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease (A4 Study)” (sponsored by Eli Lilly and Company, Alzheimer’s Disease Cooperative Study collaborator). The major multisite diagnostic study, Alzheimer’s Disease Neuroimaging Initiative 2 (ADNI 2) (funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and private sector contributions facilitated by the Foundation of the National Institutes of Health), was also found to use the MMSE. MMSE cutoffs ranged from 3–14 to >27, but ≤24 and/or ≤26 most commonly defined the clinical spectrum of AD. A majority of studies that use a cutoff of 26 primarily target subjects with MCI or mild AD dementia.

Seven recruiting and active AD clinical trials use LM to determine eligibility, and five of the seven use both the MMSE and LM. All trials use the delayed recall score of LM (i.e., LM-II). Notable trials include the “Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease (A4 Study)” and “A Placebo-controlled, Double-blind, Parallel-group, Bayesian Adaptive Randomization Design and Dose-Regimen-find Study to Evaluate Safety, Tolerability and Efficacy of BAN2401 in Subjects with Early Alzheimer’s Disease” (sponsored by Eisai Inc.). The WMS version used was inconsistent: the BAN2401 trial uses the WMS-IV, but other trials use the WMS-R or the WMS-III. ADNI 2 also uses LM-II (WMS-R) to determine study eligibility. ADNI 2 and the A4 and BAN2401 trials use both the LM-II and MMSE.

Regarding LM-II cutoffs, the A4 trial uses 6–18 to define asymptomatic AD. The BAN2401 trial cutoffs are age-adjusted and include 50–64 years: ≤15; 65–69 years: ≤12; 70–74 years: ≤11; 75–79 years: ≤9; and 80–90 years: ≤7. ADNI 2 implements the following LM-II education-based cutoffs:

16+ years of education: normal ≥9; early MCI = 9–11; AD ≤8
8–15 years of education: normal ≥5; early MCI = 5–9; AD ≤4
0–7 years of education: normal ≥3; early MCI = 3–6; AD ≤2
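
The age-adjusted BAN2401 cutoffs quoted above lend themselves to a simple lookup, sketched here under the assumption that the age bands are inclusive at both ends. The function names are hypothetical. The ADNI 2 education-based bands are not encoded because, as listed, the normal and early MCI ranges share boundary scores.

```python
# (lowest age, highest age, LM-II cutoff) as quoted in the text for BAN2401.
BAN2401_LM2_CUTOFFS = [
    (50, 64, 15),
    (65, 69, 12),
    (70, 74, 11),
    (75, 79, 9),
    (80, 90, 7),
]

def ban2401_lm2_cutoff(age: int):
    """Return the LM-II cutoff for an age, or None outside the quoted bands."""
    for low, high, cutoff in BAN2401_LM2_CUTOFFS:
        if low <= age <= high:
            return cutoff
    return None

def meets_ban2401_lm2(age: int, lm2_score: int) -> bool:
    """True when the LM-II score is at or below the age-adjusted cutoff."""
    cutoff = ban2401_lm2_cutoff(age)
    return cutoff is not None and lm2_score <= cutoff

print(ban2401_lm2_cutoff(72), meets_ban2401_lm2(67, 12), meets_ban2401_lm2(67, 13))
```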

AUC, PPV, and NPV for MMSE cutoff scores

AUC

On the basis of the above-described search results, we tested the accuracy of MMSE scores ≤24 and/or ≤26 in distinguishing normal cognition from AD dementia and MCI, respectively, and then MCI from AD dementia. Table 2 presents AUC values. The accuracy of an MMSE ≤26 was suboptimal for MCI, but an MMSE of ≤24 was adequate for detecting AD dementia.

Table 2 AUC values for MMSE and LM-II cutoffs used in AD clinical trials and diagnostic studies in NACC subjects

PPV and NPV: normal cognition versus MCI

Table 3 provides PPVs and NPVs for MMSE cutoff scores used in AD clinical trials and diagnostic studies. Of note, we provide a range of MMSE cutoff scores other than ≤24 and/or ≤26 to determine the accuracy of other scores and to potentially facilitate decision-making regarding optimal cutoff score use. The MMSE score of 26 yielded a PPV and NPV of 64.6 % and 68.2 %, respectively, suggesting this cutoff is associated with a >35 % chance that NACC subjects may be inaccurately classified as having MCI or AD dementia. MMSE scores <26 increased in PPV and declined in NPV.
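
The ">35 %" figure follows directly from the complement of the reported PPV, and the same arithmetic applies to the NPV. The complements are read here as the share of subjects on each side of the cutoff whose NACC diagnosis disagrees with the test result:

```latex
\[
1 - \mathrm{PPV} = 1 - 0.646 = 0.354,
\qquad
1 - \mathrm{NPV} = 1 - 0.682 = 0.318
\]
```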

Table 3 PPV and NPV for AD clinical trial and diagnostic study MMSE cutoff scores for normal cognition versus MCI and AD dementia in NACC

PPV and NPV: normal cognition versus AD dementia

In the comparison between normal cognition and AD dementia, there was a PPV of 97.0 % for the MMSE cutoff of ≤24, but the NPV was only 80.8 %. The NPV suggests there is a 19.2 % chance that a NACC subject scoring above this cutoff nonetheless has AD dementia.

PPV and NPV: MCI versus AD dementia

Relative to normal cognition versus AD dementia, the PPV and NPV for the MMSE cutoff of 24 were lower for distinguishing between MCI and AD dementia (90.1 % and 63.8 %, respectively).

MMSE and CDR score agreement

The level of agreement between the MMSE and CDR scores improved across the AD spectrum (p < 0.001 for all). Using MMSE cutoffs previously validated to discriminate across CDR scores [12], κ values were the worst for questionable dementia or MCI (κ = 0.15, slight agreement). There was fair agreement for normal cognition (κ = 0.37), as well as mild (κ = 0.27) and moderate (κ = 0.33) dementia. There was moderate agreement for severe dementia (κ = 0.48).
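
The agreement labels used above are consistent with the widely used Landis and Koch bands for κ. Whether reference [30] uses exactly these bands is an assumption here, so the sketch below should be read as one plausible reading of the convention rather than the authors' procedure.

```python
def kappa_agreement_label(kappa: float) -> str:
    """Label a kappa value using the Landis and Koch bands (assumed convention)."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# Kappa values as reported in the text.
reported = {
    "normal cognition": 0.37,
    "questionable dementia or MCI": 0.15,
    "mild dementia": 0.27,
    "moderate dementia": 0.33,
    "severe dementia": 0.48,
}
labels = {stage: kappa_agreement_label(k) for stage, k in reported.items()}
for stage, label in labels.items():
    print(f"{stage}: {label}")
```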

AUC, PPV, and NPV for LM-II cutoff scores

AUC

The accuracy of nearly all LM-II cutoff scores was clinically inadequate for distinguishing between normal cognition and MCI (AUC <0.75) and was worst for the BAN2401 trial cutoff of ≤15 in 50–64-year-olds (AUC 0.61). A similar pattern emerged between MCI and AD dementia. In terms of normal cognition versus AD dementia, the AUC for the LM-II cutoff of ≤15 in the BAN2401 trial for 50–64-year-olds was only 0.64; the AUC for all remaining LM-II cutoffs was >0.75.

PPV and NPV: normal cognition versus MCI

Tables 4, 5, and 6 present PPVs and NPVs for LM-II cutoff scores for the A4 and BAN2401 trials and ADNI 2. The LM-II cutoffs were associated with a high probability of misclassification relative to the NACC diagnosis. The age-adjusted LM-II cutoff of ≤15 used in the BAN2401 trial resulted in only a 33.1 % probability that NACC subjects aged 50–64 years who scored at or below the cutoff actually had MCI. NPV was lowest (72.8 %) for the LM-II cutoff used for 80–90-year-olds. There was a similar pattern for the remaining age categories. Regarding ADNI 2 cutoffs that defined MCI, the PPV reached as low as 52.2 % for NACC subjects with 16+ years of education. The NPV was <50.0 % for all LM-II cutoffs among NACC subjects with 0–7 years of education.

Table 4 PPV and NPV for AD clinical trial and diagnostic study LM-II cutoff scores for normal cognition versus MCI in NACC

Table 5 PPV and NPV for AD clinical trial and diagnostic study LM-II cutoff scores for normal cognition versus AD dementia in NACC

Table 6 PPV and NPV for AD clinical trial and diagnostic study LM-II cutoff scores for MCI versus AD dementia in NACC

PPV and NPV: normal cognition versus AD dementia

The PPV for an LM-II score of 15 in 50–64-year-old NACC subjects was 37.2 %, and it remained similar for the other age categories. In terms of ADNI 2 cutoffs for individuals with 16+ years of education, there was a 17.2 % chance that NACC subjects who did not have AD dementia scored at or below the ADNI 2 defined score for AD dementia. NPV was >90.0 % for almost all AD dementia LM-II cutoffs, although the cutoff for NACC subjects with 0–7 years of education had an NPV of 58.4 %.

PPV and NPV: MCI versus AD Dementia

PPV was 53.9 % for the LM-II cutoff score of 15 in the 50–64-year-old group and 48.2 % for the cutoff of 12 in the 65–69-year-old group. Almost all other PPVs were below 70.0 %. NPV reached as low as 64.0 % for the LM-II score of 2 in the 0–7 years of education group.

Discussion

Nearly 60 % of currently active and recruiting phases II and III AD clinical trials and diagnostic studies rely on MMSE scores to determine inclusion, and several use LM-II test scores. MMSE and LM-II cutoffs used to determine eligibility were associated with a high probability of inaccurate diagnostic classification in the >23,000 NACC subjects with normal cognition, MCI, and AD dementia. In the NACC sample, the MMSE and LM-II cutoff scores lacked diagnostic accuracy for the identification of MCI, and LM-II cutoffs poorly distinguished AD dementia from MCI (AUC <0.75). It was the consistently low PPVs and NPVs for MMSE and LM-II cutoff scores across all diagnostic classifications that were most alarming, however. The PPV was only 64 % for the MMSE cutoff often used to define MCI (or early AD dementia) in AD trials and diagnostic studies. The PPVs and NPVs were remarkably low for LM-II cutoffs and spanned the entire AD spectrum, with many PPVs and NPVs below 50 %, and as low as 33 % and 42 %, respectively. Given NACC’s large sample and representativeness of the target clinical trial population (NACC is an important recruitment source for AD trials), there is a strong possibility that many of the multicenter studies in which investigators are testing AD therapeutic and diagnostic methods include subjects from a nontarget population. Such inappropriate sampling could potentially lead to biased or inaccurate results.

The psychometric limitations of the MMSE and LM (e.g., large ceiling and floor effects, learning biases) limit their diagnostic accuracy. The MMSE and LM are highly influenced by demographic factors (e.g., age, education) [11, 31–33]. Many patients with MCI and mild AD dementia can perform within the “normal” range, and cognitively intact individuals can frequently score within the impaired range (healthy older adults commonly exhibit impaired retention on the WMS) [16]. Subjects in AD trials and diagnostic studies are likely to have had repeated exposure to the MMSE and LM, given that they are among the most widely used measures in the clinical management of AD and are included in many research registries (e.g., NACC ADCs). Both instruments are sensitive to practice effects and subsequent inflated scores; in fact, past work has shown that subjects with MCI from ADNI demonstrated practice effects only for LM [34]. It is the psychometric weaknesses of the MMSE and LM-II that underpin their lack of utility as stand-alone diagnostic measures, and their isolated use to determine inclusion into AD trials and diagnostic studies could lead to inappropriate sampling and affect the validity and reliability of study results [18, 19].

There was an overall lack of agreement between MMSE and CDR scores among NACC subjects. The CDR is considered the gold standard for staging dementia severity, and the MMSE appears to lack validity in detecting and discriminating across the various disease stages, particularly at the mild end of the spectrum [35]. This has significant economic, clinical, and research implications, given that clinical trials rely on the MMSE to distinguish between subjects with normal cognition, MCI, and AD dementia when determining inclusion. Moreover, national guidelines in parts of the world use the MMSE to define dementia severity to guide pharmacological intervention [36, 37].

There is a need to improve upon current study inclusion methods for AD clinical trials and diagnostic studies. The continued use of existing brief, inexpensive methods for determining entry into AD trials may lead to inaccurate study findings, including failure to meet endpoints, because of inappropriate inclusion of subjects into the trials, rather than lack of efficacy of the compounds being studied. With the tremendous amount of time and financial resources devoted to the development of new and cutting-edge AD therapeutics and diagnostic methods, it may be time to set aside this “penny-wise and pound-foolish” approach to the selection of screening and/or selection instruments and devote adequate time and resources to the development of rigorous new measures that may be expensive and time-consuming, but would increase the accurate detection of the appropriate population and do justice to the science at hand. This would require development of specific tests based on the specific goal of the screening, focusing on cognitive (or other) domains, with appropriate variability (diminished floor and ceiling effects), extensive normative and clinical data, and attention to cultural and language differences. Alternatively, there are existing instruments and methodology that could be implemented that may facilitate appropriate enrollment. For example, a recent study in ADNI found that diagnostic certainty of MCI was optimal when subjects scored more than 1 standard deviation below the normative reference on two episodic memory measures (LM-II and Auditory Verbal Learning Test delayed recall) [38]. If AD clinical trials and diagnostic studies continue to rely on single screening measures due to their time- and cost-effectiveness, we encourage the use of more stringent criterion cutoffs [39] and of the methodology employed in this paper (i.e., calculation of PPV and NPV) to identify the best cutoffs for accurate diagnostic classification using existing measures. However, instead of relying on the MMSE, more comprehensive screening tools, such as the Montreal Cognitive Assessment (MoCA), may be helpful. The MoCA has been shown to have greater predictive ability for dementia and smaller ceiling effects than the MMSE [40, 41]. In sum, the specific solution for optimizing inclusion methods for AD trials is unclear, but alternatives need to be considered, including the possibility of abandoning current practice and developing new methods.

The present study is not without limitations. The generalizability of the PPVs and NPVs is restricted to the NACC sample, a convenience research sample. However, this concern is attenuated, given that many clinical trials and diagnostic studies recruit from NACC sites. The cross-sectional data may have precluded robust diagnostic status within the NACC sample. Longitudinal studies are needed to clarify the limitations of the MMSE and LM across diagnostic severity groups, including their sensitivity to change and the role of practice effects, particularly in the context of recent work from the NACC dataset that shows most pronounced practice effects over time (relative to other measures from the UDS) on tasks of semantic and episodic memory [42].

Conclusions

The use of MMSE and LM-II scores to determine eligibility for AD clinical trials and diagnostic studies may lead to inappropriate inclusion or exclusion of subjects. Such biases in sample selection could translate to misleading results from trials testing the efficacy of new AD treatments, or diagnostic studies examining methods and biomarkers for the detection of AD.

Abbreviations

AD:: Alzheimer’s disease
ADC:: Alzheimer’s Disease Center
ADNI:: Alzheimer’s Disease Neuroimaging Initiative
AUC:: area under the receiver operating characteristic curve
CDR:: Clinical Dementia Rating
LM:: Logical Memory
MCI:: mild cognitive impairment
MMSE:: Mini Mental State Examination
MoCA:: Montreal Cognitive Assessment
NACC:: National Alzheimer’s Coordinating Center
NPV:: negative predictive value
PPV:: positive predictive value
ROC:: receiver operating characteristic
SD:: standard deviation
UDS:: Uniform Data Set
WMS:: Wechsler Memory Scale

References

Palmqvist S, Zetterberg H, Mattsson N, Johansson P, Minthon L, Blennow K, et al. Detailed comparison of amyloid PET and CSF biomarkers for identifying early Alzheimer disease. Neurology. 2015;85:1240–9.
Khan TK, Alkon DL. Alzheimer’s disease cerebrospinal fluid and neuroimaging biomarkers: diagnostic accuracy and relationship to drug efficacy. J Alzheimers Dis. 2015;46:817–36.
Lautner R, Palmqvist S, Mattsson N, Andreasson U, Wallin A, Pålsson E, et al. Apolipoprotein E genotype and the diagnostic accuracy of cerebrospinal fluid biomarkers for Alzheimer disease. JAMA Psychiatry. 2014;71:1183–91.
Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”: a practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975;12:189–98.
Wechsler D. WMS-R: Wechsler Memory Scale–Revised. San Antonio, TX: Psychological Corporation; 1987.
Franco-Marina F, García-González JJ, Wagner-Echeagaray F, Gallo J, Ugalde O, Sánchez-García S, et al. The Mini-Mental State Examination revisited: ceiling and floor effects after score adjustment for educational level in an aging Mexican population. Int Psychogeriatr. 2010;22:72–81.
Galasko D, Abramson I, Corey-Bloom J, Thal LJ. Repeated exposure to the Mini-Mental State Examination and the Information-Memory-Concentration Test results in a practice effect in Alzheimer’s disease. Neurology. 1993;43:1559–63.
Spencer RJ, Wendell CR, Giggey PP, Katzel LI, Lefkowitz DM, Siegel EL, et al. Psychometric limitations of the Mini-Mental State Examination among nondemented older adults: an evaluation of neurocognitive and magnetic resonance imaging correlates. Exp Aging Res. 2013;39:382–97.
Lonie JA, Tierney KM, Ebmeier KP. Screening for mild cognitive impairment: a systematic review. Int J Geriatr Psychiatry. 2009;24:902–15.
Arevalo‐Rodriguez I, Smailagic N, Roqué i Figuls M, Ciapponi A, Sanchez‐Perez E, Giannakou A, et al. Mini‐Mental State Examination (MMSE) for the detection of Alzheimer’s disease and other dementias in people with mild cognitive impairment (MCI). Cochrane Database Syst Rev. 2015;3:CD010783.
Mitchell AJ. A meta-analysis of the accuracy of the Mini-Mental State Examination in the detection of dementia and mild cognitive impairment. J Psychiatr Res. 2009;43:411–31.
Perneczky R, Wagenpfeil S, Komossa K, Grimmer T, Diehl J, Kurz A. Mapping scores onto stages: Mini-Mental State Examination and Clinical Dementia Rating. Am J Geriatr Psychiatry. 2006;14:139–44.
Schnabel R. Overcoming the challenge of re-assessing logical memory. Clin Neuropsychol. 2012;26:102–15.
Weintraub S, Salmon D, Mercaldo N, Ferris S, Graff-Radford NR, Chui H, et al. The Alzheimer’s Disease Centers’ Uniform Data Set (UDS): the neuropsychological test battery. Alzheimer Dis Assoc Disord. 2009;23:91–101.
Derby CA, Burns LC, Wang C, Katz MJ, Zimmerman ME, L’Italien G, et al. Screening for predementia AD: time-dependent operating characteristics of episodic memory tests. Neurology. 2013;80:1307–14.
Johnson DK, Storandt M, Balota DA. Discourse analysis of logical memory recall in normal aging and in dementia of the Alzheimer type. Neuropsychology. 2003;17:82–92.
Tremont G, Halpert S, Javorsky DJ, Stern RA. Differential impact of executive dysfunction on verbal list learning and story recall. Clin Neuropsychol. 2000;14:295–302.
Doody RS, Thomas RG, Farlow M, Iwatsubo T, Vellas B, Joffe S, et al. Phase 3 trials of solanezumab for mild-to-moderate Alzheimer’s disease. N Engl J Med. 2014;370:311–21.
Salloway S, Sperling R, Fox NC, Blennow K, Klunk W, Raskind M, et al. Two phase 3 trials of bapineuzumab in mild-to-moderate Alzheimer’s disease. N Engl J Med. 2014;370:322–33.
Beekly DL, Ramos EM, van Belle G, Deitrich W, Clark AD, Jacka ME, et al. The National Alzheimer’s Coordinating Center (NACC) database: an Alzheimer disease database. Alzheimer Dis Assoc Disord. 2004;18:270–7.
Beekly DL, Ramos EM, Lee WW, Deitrich WD, Jacka ME, Wu J, et al. The National Alzheimer’s Coordinating Center (NACC) database: the Uniform Data Set. Alzheimer Dis Assoc Disord. 2007;21:249–58.
Morris JC, Weintraub S, Chui HC, Cummings J, DeCarli C, Ferris S, et al. The Uniform Data Set (UDS): clinical and cognitive variables and descriptive data from Alzheimer Disease Centers. Alzheimer Dis Assoc Disord. 2006;20:210–6.
McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan E. Clinical diagnosis of Alzheimer’s disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology. 1984;34:939–44.
Winblad B, Palmer K, Kivipelto M, Jelic V, Fratiglioni L, Wahlund LO, et al. Mild cognitive impairment–beyond controversies, towards a consensus: report of the International Working Group on Mild Cognitive Impairment. J Intern Med. 2004;256:240–6.
Hughes CP, Berg L, Danziger WL, Coben LA, Martin RL. A new clinical scale for the staging of dementia. Br J Psychiatry. 1982;140:566–72.
Morris JC. Clinical dementia rating: a reliable and valid diagnostic and staging measure for dementia of the Alzheimer type. Int Psychogeriatr. 1997;9:173–6.
Morris JC. The Clinical Dementia Rating: current version and scoring rules. Neurology. 1993;43:2412–4.
Nyunt MS, Chong MS, Lim WS, Lee TS, Yap P, Ng TP. Reliability and validity of the Clinical Dementia Rating for community-living elderly subjects without an informant. Dement Geriatr Cogn Dis Extra. 2013;3:407–16.
Fan J, Upadhye S, Worster A. Understanding receiver operating characteristic (ROC) curves. CJEM. 2006;8:19–20.
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.
Spering CC, Hobson V, Lucas JA, Menon CV, Hall JR, O’Bryant SE. Diagnostic accuracy of the MMSE in detecting probable and possible Alzheimer’s disease in ethnically diverse highly educated individuals: an analysis of the NACC database. J Gerontol A Biol Sci Med Sci. 2012;67:890–6.
Tombaugh TN, McIntyre NJ. The Mini‐Mental State Examination: a comprehensive review. J Am Geriatr Soc. 1992;40:922–35.
Abikoff H, Alvir J, Hong G, Sukoff R, Orazio J, Solomon S, et al. Logical Memory subtest of the Wechsler Memory Scale: age and education norms and alternate-form reliability of two scoring systems. J Clin Exp Neuropsychol. 1987;9:435–48.
Goldberg TE, Harvey PD, Wesnes KA, Snyder PJ, Schneider LS. Practice effects due to serial cognitive assessment: implications for preclinical Alzheimer’s disease randomized controlled trials. Alzheimers Dement (Amst). 2015;1:103–11.
Perneczky R. The appropriateness of short cognitive tests for the identification of mild cognitive impairment and mild dementia [in German]. Aktuelle Neurol. 2003;30:114–7.
National Institute for Clinical Excellence (NICE). Guidance on the use of donepezil, rivastigmine and galantamine for the treatment of Alzheimer’s disease [Technology Appraisal Guidance]. London: NICE; 2001.
National Collaborating Centre for Mental Health. Dementia: the NICE-SCIE guideline on supporting people with dementia and their carers in health and social care. National Clinical Practice Guideline Number 42. London: British Psychological Society and Royal College of Psychiatrists; 2007. http://www.scie.org.uk/publications/misc/dementia/dementia-fullguideline.pdf?res=true. Accessed 23 January 2016.
Callahan BL, Ramirez J, Berezuk C, Duchesne S, Black SE, Alzheimer’s Disease Neuroimaging Initiative. Predicting Alzheimer’s disease development: a comparison of cognitive criteria and associated neuroimaging biomarkers. Alzheimers Res Ther. 2015;7:68.
Jak AJ, Bondi MW, Delano-Wood L, Wierenga C, Corey-Bloom J, Salmon DP, et al. Quantification of five neuropsychological approaches to defining mild cognitive impairment. Am J Geriatr Psychiatry. 2009;17:368–75.
Hsu JL, Fan YC, Huan YL, Wang J, Chen WH, Chiu HC, et al. Improved predictive ability of the Montreal Cognitive Assessment for diagnosing dementia in a community-based study. Alzheimers Res Ther. 2015;7:69.
Trzepacz PT, Hochstetler H, Wang S, Walker B, Saykin AJ; Alzheimer’s Disease Neuroimaging Initiative. Relationship between the Montreal Cognitive Assessment and Mini-Mental State Examination for assessment of mild cognitive impairment in older adults. BMC Geriatr. 2015;15:107.
Gavett BE, Ashendorf L, Gurnani AS. Reliable change on neuropsychological tests in the Uniform Data Set. J Int Neuropsychol Soc. 2015;21:558–67.


Acknowledgments

The NACC database is funded by the National Institute on Aging (NIA) under NIH grant U01 AG016976. NACC data are contributed by the NIA-funded ADCs under grants P30 AG019610 (Eric Reiman, MD, principal investigator), P30 AG013846 (Neil Kowall, MD, principal investigator), P50 AG008702 (Scott Small, MD, principal investigator), P50 AG025688 (Allan Levey, MD, PhD, principal investigator), P30 AG010133 (Andrew Saykin, PsyD, principal investigator), P50 AG005146 (Marilyn Albert, PhD, principal investigator), P50 AG005134 (Bradley Hyman, MD, PhD, principal investigator), P50 AG016574 (Ronald Petersen, MD, PhD, principal investigator), P50 AG005138 (Mary Sano, PhD, principal investigator), P30 AG008051 (Steven Ferris, PhD, principal investigator), P30 AG013854 (M. Marsel Mesulam, MD, principal investigator), P30 AG008017 (Jeffrey Kaye, MD, principal investigator), P30 AG010161 (David Bennett, MD, principal investigator), P30 AG010129 (Charles DeCarli, MD, principal investigator), P50 AG016573 (Frank LaFerla, PhD, principal investigator), P50 AG016570 (David Teplow, PhD, principal investigator), P50 AG005131 (Douglas Galasko, MD, principal investigator), P50 AG023501 (Bruce Miller, MD, principal investigator), P30 AG035982 (Russell Swerdlow, MD, principal investigator), P30 AG028383 (Linda Van Eldik, PhD, principal investigator), P30 AG010124 (John Trojanowski, MD, PhD, principal investigator), P50 AG005133 (Oscar Lopez, MD, principal investigator), P50 AG005142 (Helena Chui, MD, principal investigator), P30 AG012300 (Roger Rosenberg, MD, principal investigator), P50 AG005136 (Thomas Montine, MD, PhD, principal investigator), P50 AG033514 (Sanjay Asthana, MD, FRCP, principal investigator), and P50 AG005681 (John Morris, MD, principal investigator). MLA is supported by NIH postdoctoral fellowship grant T32-AG06697.

Role of the funding source

This study was funded by NIA grants P30 AG013846 and U01 AG016976. The funding sources provided data and salary support for some of the authors. However, these funding sources did not play any role in the study design, analysis and interpretation of data, the writing of the report, or the decision to submit the manuscript for publication.

Author information

Authors and Affiliations

Boston University Alzheimer’s Disease and CTE Center, 72 East Concord Street, Suite B7800, Boston, MA, 02118, USA
Kimberly R. Chapman, Hanaan Bing-Canar, Michael L. Alosco, Eric G. Steinberg, Brett Martin, Christine Chaisson, Neil Kowall, Yorghos Tripodis & Robert A. Stern
Department of Neurology, Boston University School of Medicine, Boston, MA, 02118, USA
Michael L. Alosco, Neil Kowall & Robert A. Stern
Data Coordinating Center, Boston University School of Public Health, Boston, MA, 02118, USA
Brett Martin & Christine Chaisson
Department of Biostatistics, Boston University School of Public Health, Boston, MA, 02118, USA
Christine Chaisson & Yorghos Tripodis
Department of Pathology, Boston University School of Medicine, Boston, MA, 02118, USA
Neil Kowall
Neurology Service, VA Boston Healthcare System, Boston, MA, 02130, USA
Neil Kowall
Department of Neurosurgery, Boston University School of Medicine, Boston, MA, 02118, USA
Robert A. Stern
Department of Anatomy & Neurobiology, Boston University School of Medicine, Boston, MA, 02118, USA
Robert A. Stern


Corresponding author

Correspondence to Robert A. Stern.

Additional information

Competing interests

RAS has received research funding from the National Football League (NFL), the NFL Players Association, and Avid Radiopharmaceuticals, Inc. (Philadelphia, PA, USA). RAS is a member of the Mackey-White Committee of the NFL Players Association. RAS is a paid consultant to Athena Diagnostics/Quest Laboratories (Marlborough, MA, USA), Amarantus BioScience Holdings, Inc. (San Francisco, CA, USA), and Avanir Pharmaceuticals, Inc. (Aliso Viejo, CA, USA). RAS receives royalties for published neuropsychological tests from Psychological Assessment Resources, Inc. (Lutz, FL, USA), as well as compensation for expert legal opinion. All of the other authors declare that they have no competing interests.

Authors’ contributions

KRC participated in study conception and design, assisted with acquiring the data, and participated in analysis and interpretation of data as well as drafting and revision of the manuscript. HBC participated in study conception and design and drafting and revision of the manuscript. MLA assisted with acquiring the data, participated in study conception and design, helped with conducting the statistical analyses, and participated in interpretation of data and drafting and revision of the manuscript. EGS participated in analysis and interpretation of data and drafting and revision of the manuscript. BM assisted with acquiring data and participated in analysis and interpretation of data and drafting and revision of the manuscript. CC participated in analysis and interpretation of data and drafting and revision of the manuscript. NK participated in analysis and interpretation of data and drafting and revision of the manuscript. YT participated in study conception and design, performed the statistical analyses, and participated in data interpretation and drafting and revision of the manuscript. RAS participated in study conception and design, analysis and interpretation of data, and drafting and revision of the manuscript. All authors read and approved the final manuscript and agree to be accountable for all aspects of the work.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.


About this article

Cite this article

Chapman, K.R., Bing-Canar, H., Alosco, M.L. et al. Mini Mental State Examination and Logical Memory scores for entry into Alzheimer’s disease trials. Alz Res Therapy 8, 9 (2016). https://doi.org/10.1186/s13195-016-0176-z


Received: 21 December 2015
Accepted: 13 January 2016
Published: 22 February 2016
DOI: https://doi.org/10.1186/s13195-016-0176-z
