Background

Dementia can be caused by different pathological processes that are often difficult to distinguish clinically, particularly in the early stages of the condition. Alzheimer’s disease (AD) is the most frequent, followed by vascular dementia (VaD), mixed pathology and dementia with Lewy Bodies [1]. Identification of causes of dementia soon after symptom onset is critical, because appropriate treatment of some causes of dementia can slow or halt its progression or enable symptomatic treatment where appropriate [2]. Acetylcholinesterase inhibitors may improve symptoms of Alzheimer’s disease, while dementia with a vascular pathology can be treated by addressing the vascular risk factors e.g. prescribing low dose aspirin or similar medication [3]. Failure to distinguish VaD from AD may lead to inappropriate treatment.

Autopsy is the reference standard for differential diagnosis of dementia in a research context. In clinical practice, and in research that does not follow patients until death, diagnostic criteria consisting of a combination of patient medical history, cognitive function assessment and imaging findings are often used. These include the National Institute of Neurological Disorders and Stroke and Association Internationale pour la Recherche et l’Enseignement en Neurosciences (NINDS-AIREN) criteria for VaD [4] and the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA) criteria for AD [5]. Neuroimaging is increasingly regarded as an essential part of the diagnostic work-up of a patient with dementia. Magnetic Resonance Imaging (MRI) has been advocated as the preferred imaging method in clinical guidelines [6], despite being more costly and (in some health systems) less readily available than computed tomography (CT).

Previously, neuroimaging was used to exclude abnormalities such as normal pressure hydrocephalus, tumours and subdural hematoma [7], but it is increasingly used to identify features consistent with the pathology of dementia subtypes such as cerebrovascular changes. The accuracy of MRI and CT, and whether MRI is superior to CT, in detecting a vascular component to dementia in clinical cohorts of patients with VaD, combined AD and VaD (“mixed dementia”), and AD remain unclear. We conducted a systematic review and meta-analysis to investigate this question.

Methods

We produced a protocol for the review (available from the authors on request) detailing the proposed review methods.

Literature search

We searched MEDLINE, EMBASE, BIOSIS, Science Citation Index, ZETOC, NTIS, Dissertation Abstracts, and the GrayLit network from database inception to February 2011 for published and unpublished studies. We combined terms for each imaging test (“Magnetic Resonance Imaging” OR “mri” OR “Computed Tomography” OR “ct scan$”) with terms for the target conditions (“Alzheimer Disease” OR “Vascular Dementia” OR “multi-infarct dementia”). We did not use a methodological search filter to identify diagnostic accuracy studies, because such filters may result in omission of relevant studies [8, 9]. No language restrictions were applied.

Study selection

Studies that assessed the accuracy of MRI and/or CT (index tests) for the detection of cerebrovascular changes in patients with VaD, AD or mixed dementia (target conditions) against an appropriate reference standard were eligible for inclusion. Eligible reference standards for VaD and AD included: autopsy; NINCDS-ADRDA [5] for AD; NINDS-AIREN [4] for VaD; Diagnostic and Statistical Manual of Mental Disorders (DSM) editions DSM-III [10], DSM-III-R [11] and DSM-IV [12]; State of California AD Diagnostic and Treatment Centre Criteria (ADDTC) [13]; and ICD-10 [14]. Any reported reference standard for mixed dementia was eligible. All subtypes of VaD (e.g. multi-infarct dementia; subcortical vascular ischemic dementia; and Binswanger’s dementia) were included in the VaD group. Studies had to report 2x2 performance data for one or more of the following cerebrovascular imaging findings: general infarcts; lacunar infarcts; non-lacunar infarcts; white matter hyperintensities (WMH); periventricular hyperintensities (PVH); basal ganglia hyperintensities (BGH); or a ‘global assessment’ finding, such as the presence of two or more findings. Two reviewers independently screened titles and abstracts. Full papers were assessed by one reviewer and checked by another; disagreements were resolved through consensus or referral to the review team.

Data extraction and quality assessment

Data extraction and quality assessment were completed by one reviewer and checked by a second; disagreements were resolved through discussion or referral to a third reviewer. We extracted data on: inclusion/exclusion criteria, included patients, CT and MRI technical and operator details, reference standard, imaging finding, definition of a positive imaging finding, numbers of patients in each patient group (VaD, mixed dementia, AD or other diagnosis), and number of patients with positive imaging findings in each group. The patient groups were dichotomised as VaD or mixed dementia compared to AD or other diagnoses. This allowed construction of 2x2 tables of test performance, separately for each imaging finding assessed. Study quality was assessed using the Cochrane Collaboration’s adaptation of the QUADAS tool [15].

Statistical analyses

We calculated sensitivity, specificity, and the diagnostic odds ratio (DOR) of MRI and CT for the detection of VaD or mixed dementia, for each 2x2 table. We plotted estimates of sensitivity and specificity from individual studies in summary receiver operating characteristic (SROC) space, separately for each imaging finding. We conducted separate analyses for studies that did and did not use autopsy as the reference standard. Summary sensitivity and specificity were estimated using the bivariate/HSROC meta-analysis models when enough studies (usually at least four) reported on the same imaging finding [16]. If too few studies were available to permit use of these models (for example, because the estimation procedure did not converge), univariate random-effects meta-analysis was carried out. We investigated the utility of different MRI and CT imaging findings to rule in or rule out a diagnosis of VaD or mixed dementia, by deriving positive and negative likelihood ratios from summary estimates of sensitivity and specificity. We used standard random-effects meta-analysis [17] to estimate summary DORs for each imaging finding, separately for MRI and CT, and then used meta-regression to calculate ratios of DORs (RDORs) comparing MRI with CT. We also estimated RDORs comparing MRI and CT in studies that reported direct comparisons of the two techniques. Estimates of the between-study variance τ² were used to quantify heterogeneity. There were insufficient included studies to allow assessment of reporting bias.
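As a minimal sketch of the per-table measures described above (not the Stata code used for the review), the following Python computes sensitivity, specificity and the DOR from a single 2x2 table. The function name and the counts are hypothetical, the confidence interval uses the usual log-scale approximation for an odds ratio, and the code assumes that no cell of the table is zero.

```python
import math

def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity, specificity and DOR (with 95% CI) for one 2x2 table.

    tp / fn: VaD or mixed dementia patients with / without the imaging finding.
    fp / tn: AD or other-diagnosis patients with / without the imaging finding.
    Assumes every cell is non-zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    dor = (tp * tn) / (fp * fn)
    # Standard error of log(DOR): square root of the summed reciprocal cell counts.
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lower = math.exp(math.log(dor) - 1.96 * se_log_dor)
    upper = math.exp(math.log(dor) + 1.96 * se_log_dor)
    return sensitivity, specificity, dor, (lower, upper)

# Hypothetical counts, for illustration only (not taken from any included study).
sens, spec, dor, dor_ci = accuracy_from_2x2(tp=40, fp=20, fn=10, tn=80)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} DOR={dor:.1f}")
```

The DOR summarises accuracy as a single number, which is why it suits meta-regression, but it does not distinguish a sensitive test from a specific one; that is the reason sensitivity and specificity are also reported separately.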

We assessed the impact of patient spectrum (QUADAS item 1) and incorporation bias (QUADAS item 6) on diagnostic accuracy using meta-regression to calculate RDORs comparing the DOR in studies that were rated “no” or “unclear” with those rated “yes” on these QUADAS items, separately for MRI and CT. In these analyses, we selected one set of 2x2 data from each study on the basis of the following hierarchy: (1) global assessment, (2) white matter hyperintensities, (3) lacunar infarcts, (4) periventricular hyperintensities, (5) any other imaging finding. All analyses were done using Stata™ version 11, using the metan, metandi and metareg commands [18–20].

Results

The searches identified 19,669 titles and abstracts; 38 studies (4377 patients, range 23 to 683) were included in the review (Figure 1). Twenty-six studies (37 sets of 2x2 data) assessed CT, 16 (33 sets of 2x2 data) assessed MRI; 4 evaluated both CT and MRI and thus provided direct comparisons between the two techniques. Twenty studies were prospective cohorts, 6 were retrospective cohorts and 12 were case–control studies (Table 1). Publication dates ranged from 1986 to 2010.

Figure 1 Flowchart of systematic review process.

Table 1 Number of studies assessing each imaging method, according to study design and reference standard

Seven studies used autopsy as reference standard; all others used clinical criteria with or without imaging findings. VaD was confirmed by NINDS-AIREN (13 studies), DSM-III or DSM-III-R (16), and ICD-10 (1). Reference standards used to define AD were NINCDS-ADRDA (24 studies), DSM-III or DSM-III-R (6) and ICD-10 (1). Six studies included mixed dementia patients: 2 used DSM-III-R, 2 used ADDTC, 1 ICD-10, 1 Hachinski Ischemic Score and 1 history and examination as reference standard. Mean age, where reported, ranged from 66 years to 85 years and was generally higher in autopsy than non-autopsy studies. Individual study demographics and results are shown in Additional file 1.

The main limitations of the included studies were the potential for biased selection of patients and incorporation bias. Most studies (61%) did not enrol an appropriate patient spectrum, defined as patients with suspected dementia in whom the diagnosis had not been confirmed. There was a risk of incorporation bias in 23 (61%) of the non-autopsy studies, because the reference standard included the imaging findings. Other QUADAS items were classified as adequate or unclear in the majority of studies (Figure 2).

Figure 2 Proportions of studies rated as “yes”, “no” or “unclear” for each QUADAS item.

Overall findings

There was substantial variation in estimates of accuracy reported in individual studies (Additional file 1). Figure 3 shows individual study estimates plotted in SROC space, separately for each imaging finding with different symbols according to imaging method (MRI or CT) and reference standard (autopsy or non-autopsy). These figures suggest that autopsy studies accounted for more of the outlying estimates than non-autopsy studies, although there was no clear association with either sensitivity or specificity. Data from both direct and indirect comparisons suggested that MRI was more specific than CT with variable effects on sensitivity. The most specific imaging finding on both MRI and CT was general infarcts, but sensitivity was very heterogeneous for this finding. Non-lacunar infarcts also showed reasonable specificity, with heterogeneous sensitivity. None of the findings had consistently high sensitivity. The most sensitive imaging finding appeared to be basal ganglia hyperintensities, but specificity was more variable and this finding was only assessed in five studies. White matter hyperintensities was the most commonly assessed finding, but results were heterogeneous.

Figure 3 Summary ROC plots showing imaging findings from individual studies, according to imaging method and reference standard. Solid lines join MRI and CT results from the same study (direct comparisons).

Autopsy studies

Six autopsy studies assessed CT and one assessed MRI (Table 2). White matter hyperintensities was the only imaging finding assessed on both MRI and CT; none of the studies reported a direct comparison between the two techniques. Based on three studies assessing white matter hyperintensities, the RDOR comparing CT (n = 2) with MRI (n = 1) was 0.28 (95% CI 0 to 55849), p = 0.42.

Table 2 Summary estimates of diagnostic accuracy from autopsy studies, according to imaging finding and method

Non-autopsy studies

We compared the DORs in studies that incorporated imaging findings in the reference standard with those that did not, and in studies that enrolled a selected sample of patients with those that did not (Table 3). Each study contributed one set of 2x2 data to these analyses, based on the hierarchy described earlier. There was weak evidence that the accuracy of CT was overestimated in studies in which incorporation bias was present (RDOR 3.97, 95% CI 0.68 to 23.2; p = 0.12). There was little evidence for an association between incorporation bias and MRI (p = 0.88), or between biased selection of patients and CT (p = 0.95) or MRI (p = 0.21). Based on these findings, we did not exclude studies with limitations in patient selection or at risk of incorporation bias from subsequent analyses.
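To illustrate why an RDOR of 3.97 with such a wide interval counts only as weak evidence, the sketch below back-calculates an approximate standard error and p-value from the reported ratio and confidence limits. It assumes a normal approximation on the log scale, and the function name is hypothetical. The meta-regression software may base its inference on a different reference distribution, so the result is expected to be close to, not identical to, the reported p = 0.12.

```python
import math

def p_from_ratio_ci(ratio, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided p-value for a ratio measure,
    recovered from its 95% CI under a log-normal approximation."""
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se_log
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se_log, z, p

# RDOR for incorporation bias in CT studies, as reported in the text.
se_log, z, p = p_from_ratio_ci(3.97, 0.68, 23.2)
print(f"SE(log RDOR)={se_log:.2f} z={z:.2f} p={p:.2f}")
```

Because the interval spans both a substantial reduction and a more than twenty-fold increase in the DOR, the point estimate alone says little about the size of any bias.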

Table 3 Comparisons of diagnostic odds ratios according to presence or absence of incorporation and selection bias, for each imaging method

Table 4 shows summary estimates of sensitivity, specificity and positive and negative likelihood ratios for each imaging method and finding. Neither the individual imaging findings, nor the global assessment criteria, were found to have consistently high sensitivity. The most widely reported imaging finding was white matter hyperintensities. For CT (11 studies) summary sensitivity and specificity were 71% (95% CI 53% to 85%) and 55% (95% CI 44% to 66%). Corresponding figures for MRI (6 studies) were 95% (95% CI 87% to 98%) and 26% (95% CI 12% to 50%). This finding therefore had limited utility in ruling in or ruling out a diagnosis of VaD or mixed dementia. General infarcts was the most specific imaging finding on both MRI (96% (95% CI 94% to 97%)) and CT (96% (95% CI 93% to 98%)) with little heterogeneity (τ² = 0). Corresponding positive likelihood ratios were also relatively high (LR+ 13.08 (95% CI 7.64 to 22.4) for MRI and 12.22 (95% CI 5.59 to 26.7) for CT). However, sensitivity was low and showed substantial heterogeneity for both MRI (53% (95% CI 36% to 70%); τ² = 0.21) and CT (52% (95% CI 22% to 80%); τ² = 1.66).
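The relationship between the summary estimates and the likelihood ratios can be sketched in a few lines of Python. The function name is hypothetical, and the inputs are the rounded percentages quoted in the text for general infarcts on MRI, so the output only approximates the reported LR+ of 13.08, which was derived from unrounded estimates.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative

# General infarcts on MRI: rounded summary sensitivity 53%, specificity 96%.
lr_pos, lr_neg = likelihood_ratios(0.53, 0.96)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")  # LR+ is about 13.25, LR- about 0.49
```

A high LR+ alongside an LR- of roughly 0.5 is the pattern expected from a finding that is highly specific but only moderately sensitive: its presence shifts the odds towards VaD or mixed dementia considerably, while its absence shifts them away only slightly.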

Table 4 Summary estimates of diagnostic accuracy from non-autopsy studies, according to imaging finding and method

MRI was found to have greater accuracy than CT for six of the seven imaging findings assessed (Table 5) with RDORs ranging from 1.78 (95% CI 0.11 to 28.2) for periventricular hyperintensities to 2.68 (95% CI 0.33 to 22.0) for lacunar infarcts. However, confidence intervals were wide and evidence for an association was weak (p-values ranged from 0.15 to 0.64). The four studies that reported direct comparisons of MRI and CT supported the results from the indirect comparisons. However, RDORs were smaller for most imaging findings (range 1.12 to 1.86) with the exception of one study of global assessment (RDOR 14.81, 95% CI 1.73 to 127.14) and one of periventricular hyperintensities (RDOR 5.08, 95% CI 0.46 to 55.70).

Table 5 Summary estimates of diagnostic odds ratios from non-autopsy studies, according to imaging finding and method, and comparison of the diagnostic accuracy of the two methods

Discussion

In this systematic review, we searched nearly 20,000 titles and abstracts in order to identify 38 studies that investigated the diagnostic accuracy of MRI or CT for detecting a vascular component to dementia. Only four of these studies assessed both imaging methods. Included studies were generally small and many were at high risk of bias due to the potential for biased selection of patients and the possibility that test results were incorporated into the reference standard. However, there was little evidence that these sources of bias affected estimates of accuracy. Only seven studies used autopsy as the reference standard, and their results were heterogeneous. Among the 31 studies that used a non-autopsy reference standard, no individual imaging finding was assessed in a majority of studies, and results were heterogeneous. White matter hyperintensities were the most frequently assessed imaging finding, but based on summary estimates of sensitivity and specificity this finding had limited utility for ruling in or ruling out a diagnosis of VaD or mixed dementia. The presence of general infarcts showed the greatest potential for ruling in a diagnosis of VaD or mixed dementia, but none of the findings appeared sufficiently sensitive to rule out a diagnosis of AD. Comparative analyses suggested that MRI may be more accurate than CT for distinguishing vascular or mixed dementia from Alzheimer’s disease and other conditions, but confidence intervals on estimated ratios of diagnostic odds ratios were wide.

We performed a comprehensive search, without language restrictions, to identify both published and unpublished literature: thus it is unlikely that relevant studies have been missed. We employed systematic review methods to minimise bias and errors during study selection, data extraction and quality assessment and used the most rigorous methods of meta-analysis for diagnostic accuracy data. We made both direct and indirect comparisons of the accuracy of CT and MRI, but were limited by the substantial between-study heterogeneity and small number of studies that directly compared the imaging methods. We assessed study quality using accepted criteria for diagnostic accuracy studies and investigated the effects of potential sources of bias in the analysis. Most of the included studies did not enrol an appropriate patient spectrum, which we defined as patients with symptoms of dementia in whom the diagnosis had not been confirmed. In practice, MRI and CT will have most clinical value if used at a relatively early stage in the diagnostic work-up of patients with symptoms of dementia, in order to help reach a definitive diagnosis and begin appropriate treatment early in the course of disease. Studies that did not assess MRI and/or CT in this context may produce less applicable, or biased, estimates of diagnostic accuracy: for example if they used a case–control design where cases already have a confirmed diagnosis of dementia subtype, or if they were conducted in patients with a longer duration of illness.

We stratified the analysis based on whether studies used an autopsy or non-autopsy reference standard. Because only a small number of autopsy studies were available, the impact of the type of reference standard on estimated diagnostic accuracy could not be evaluated with precision: no more than three autopsy studies assessed any individual imaging finding and most findings were assessed in only one or two studies. Although we considered autopsy to be the least biased reference standard, there is a potential risk of disease progression bias as there will be a time lapse between the imaging and the autopsy examination. This means that some patients may not have had VaD or AD when they were assessed by MRI/CT but have developed one of these conditions before they died. This has the potential to affect estimates of sensitivity and specificity, depending on whether the original reference standard is more likely to wrongly classify patients as VaD or AD. There was a risk of bias due to incorporation of test results in the reference standard in many studies that used a non-autopsy reference standard. There was a suggestion that incorporation bias resulted in greater diagnostic accuracy for studies of CT but this was not found for studies of MRI. We would expect incorporation bias to increase agreement between the index test and reference standard, leading to inflated estimates of sensitivity and specificity [21].

In the United States, the use of either CT or MRI as part of the diagnostic work-up of a dementia patient is recommended [22]. The UK National Institute for Health and Clinical Excellence (NICE) guidelines on dementia diagnosis state that structural imaging should be used in the assessment of suspected dementia to exclude other cerebral pathologies and to help establish the subtype diagnosis [6]. MRI is referred to as the preferred method to detect subcortical vascular changes, although it is acknowledged that CT could also be used. A 1988 narrative review by Joyce and Lishman [23], which discussed 9 studies, concluded that neither CT nor MRI are reliable in the differential diagnosis of AD and VaD.

Both CT and MRI technology have developed considerably in the time since the majority of the included studies were conducted. For example, helical CT with multiplanar reconstruction is now routinely used and has higher image resolution than the CT scans evaluated in the included studies. Modern CT may be considered preferable to MRI because it is quicker and much cheaper to buy and run, it is more comfortable for the patient and there are fewer contraindications to its use. It can be reconstructed in the coronal plane for direct visual assessment of hippocampal volume. These factors should be weighed against increased exposure to ionising radiation with CT. In the future, fluorodeoxyglucose (FDG) - positron emission tomography (PET) may be useful in predicting decline in normal subjects and individuals with mild cognitive impairment [24]. Abeta-PET appears most useful in distinguishing AD from other dementias, although it has recently been suggested that a combination of Abeta- and FDG-PET may be more accurate. However, neither of these techniques is widely available in many hospital settings [25].

New diagnostic accuracy studies are needed to compare the utility of the latest generation of MRI and CT techniques in detecting a vascular component to dementia. The design of studies should aim to avoid the weaknesses of the studies located for this review. They should assess both MRI and CT in the same group of patients with symptoms of early dementia. Study size should be large enough to allow precise estimates of relative diagnostic accuracy. The reference standard should consist of accepted diagnostic criteria, without incorporating imaging findings, ideally supplemented by autopsy confirmation. Global assessment criteria for MRI and CT, based on the most useful individual imaging findings that are indicative of a vascular component to dementias, should be established, and their diagnostic accuracy quantified.

Conclusions

This comprehensive, systematic literature review has shown that, despite its longstanding and widespread use, there is no strong evidence to suggest that MRI is more accurate than CT in identifying cerebrovascular changes in autopsy-confirmed and clinical cohorts of VaD, AD, and ‘mixed dementia’. There is a need for new, large, high quality studies comparing state of the art CT with MRI in patients with symptoms of early dementia.

Funding

The review was funded by the United Kingdom Medical Research Council (Grant Code G0801405). GW was partly funded by the NIHR Biomedical Research Centre Programme, Oxford.