Background

Biliary obstruction may be due to a variety of causes including choledocholithiasis, tumours, and trauma, including injury after gall bladder surgery, with choledocholithiasis being the most common cause. The prevalence of gallstones in England and Wales was 182 per 10,000 person years at risk. The incidence rate was 8 per 10,000 person years at risk for 1991–1992 [1]. Patients with suspected biliary obstruction present with abnormal liver function and symptoms such as jaundice, pale-coloured stools, dark urine, itching, abdominal pain in the upper right quadrant, fever, nausea and vomiting. Endoscopic ultrasonography (EUS) is the first-line imaging investigation in patients with jaundice or right upper-quadrant pain [2]. Although EUS is non-invasive, quick and inexpensive, it is highly operator and patient dependent.

Endoscopic retrograde cholangiopancreatography (ERCP) is currently the 'gold standard' for the diagnosis of biliary obstruction. It is one of several invasive direct cholangiography techniques. However, it is an imperfect diagnostic tool and other procedures may be more appropriate gold standards for diagnosis in the future [3]. Magnetic resonance cholangiopancreatography (MRCP) is an alternative to diagnostic ERCP for imaging the biliary tree and investigating biliary obstruction. MRCP was developed in 1991 and techniques are continuing to improve. A major feature of MRCP is that it is not a therapeutic procedure, while in contrast ERCP is used for both diagnosis and treatment. MRCP also does not have the small but definite morbidity and mortality associated with ERCP. The use of MRCP in diagnosing biliary obstruction may avoid the use of unnecessary invasive procedures such as ERCP.

Indications for the use of MRCP include: unsuccessful or contraindicated ERCP; patient preference for non-invasive imaging; patients considered to be at low risk of having pancreatic or biliary disease; patients where the need for therapeutic ERCP is considered unlikely; and those with a suspected neoplastic cause for pancreatic or biliary obstruction [4]. No patient preparation is required for MRCP and sedation is not usually required. MRCP is particularly useful where ERCP is difficult, hazardous or impossible. It is also an important option for patients with failed ERCPs. ERCP and MRCP have different contraindications allowing them to be used as complementary techniques.

In order to determine the sensitivity and specificity of MRCP compared to ERCP, a systematic review was undertaken to identify all relevant studies comparing the two techniques using clearly defined inclusion and exclusion criteria. This paper therefore compares the findings of MRCP with diagnostic ERCP for the investigation of biliary obstruction, using accuracy statistics. We also report study quality, population characteristics and suspected conditions. The paper summarises the key clinical points reported in a recent Health Technology Assessment Monograph [5]. Since the most common cause of biliary obstruction is choledocholithiasis, we have concentrated mainly on the diagnosis of this condition.

Methods

We searched 13 electronic databases, including Medline, Embase and the Cochrane Controlled Trials Register, from inception to January 2003. Reference lists of relevant articles were hand searched and various health services research-related resources were consulted via the Internet. Population search terms such as biliary, biliary tract, bile, gallbladder and choledocholithiasis were combined with intervention terms such as magnetic resonance imaging, MRI and non-invasive diagnostic imaging. The search strategy is described in detail elsewhere [5]. No language or study/publication-type restrictions were applied to the searches. Inclusion criteria were adult patients with suspected biliary obstruction or dilatation, as defined by the individual studies, having MRCP and ERCP for diagnostic purposes. Outcome measures included sensitivity, specificity and likelihood ratios in different patient groups, acceptability to patients and adverse effects. The ERCP test results were assumed to be a true 'gold standard' diagnosis, although even the results of this test may be subject to error and thus not represent the patient's true condition. Only English language papers were selected. Studies involving pancreatic ductal system abnormalities were excluded, as were those not including a comparison of MRCP with diagnostic ERCP. Other exclusion criteria were: papers published before 1995; comparison of MRCP with failed or unsuccessful ERCP; studies where MRCP results informed the decision to proceed to ERCP; and retrospective study design. Excluded studies were documented together with reasons for exclusion [5]. Data were extracted by one researcher using a standardised data extraction form and checked by another. Full details of the review process are described elsewhere [5].

The studies were assessed using quality criteria for diagnostic or screening tests [6]. These criteria comprise 14 components of study quality, including: appropriate spectrum of patients, selection criteria, independent assessment of test results, verification bias (whether all patients had both tests), and reporting of uninterpretable results and withdrawals.

Standard tests for heterogeneity were conducted [7]. Point estimates and 95% confidence intervals for summary statistics (sensitivity, specificity, likelihood ratio) were calculated for each study and presented graphically with forest plots. In meta-analyses of diagnostic results one must also consider variation introduced by changes in the diagnostic threshold; studies may use different thresholds to define positive and negative test results. In the event of heterogeneity and/or variation in the diagnostic threshold, studies were summarised graphically with a Summary Receiver Operating Characteristic (SROC) curve, using the Littenberg and Moses (L-M) method to estimate a best fitting SROC curve [8]. The L-M method consists of linear regression of the log diagnostic odds ratio, D, against a measure of diagnostic threshold, S, to produce estimates of the parameters a and b from the regression equation, D = a + bS. D and S are calculated from the logits of the true positive rate (TPR) and false positive rate (FPR): D = logit(TPR) - logit(FPR) and S = logit(TPR) + logit(FPR). We computed the sensitivity and specificity values from the SROC at the mean, minimum and maximum values of the S parameter, to indicate a central value and associated band with which the data were compatible.
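The regression step of the L-M method can be illustrated with a minimal sketch. It assumes the usual Littenberg-Moses definitions, D = logit(TPR) - logit(FPR) and S = logit(TPR) + logit(FPR), an unweighted least-squares fit, and a 0.5 continuity correction for zero cells; the review does not state which weighting or correction was used, so these choices are assumptions. The function names and the 2x2 tables are hypothetical and are not data from the included studies.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def moses_d_s(tp, fp, fn, tn, correction=0.5):
    """Return (D, S) for one study's 2x2 table.

    Assumption: `correction` is added to every cell so that zero cells
    do not produce infinite logits.
    """
    tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    tpr = tp / (tp + fn)  # true positive rate (sensitivity)
    fpr = fp / (fp + tn)  # false positive rate (1 - specificity)
    d = logit(tpr) - logit(fpr)  # log diagnostic odds ratio
    s = logit(tpr) + logit(fpr)  # measure of diagnostic threshold
    return d, s


def fit_sroc(tables, correction=0.5):
    """Unweighted least-squares fit of D = a + b*S across studies."""
    points = [moses_d_s(*t, correction=correction) for t in tables]
    n = len(points)
    mean_d = sum(d for d, _ in points) / n
    mean_s = sum(s for _, s in points) / n
    sxx = sum((s - mean_s) ** 2 for _, s in points)
    sxy = sum((s - mean_s) * (d - mean_d) for d, s in points)
    b = sxy / sxx
    a = mean_d - b * mean_s
    return a, b


# Hypothetical 2x2 tables (TP, FP, FN, TN); not data from the included studies.
example_tables = [(45, 2, 5, 48), (30, 1, 2, 67), (18, 4, 6, 72), (52, 3, 1, 44)]
a, b = fit_sroc(example_tables)
print(f"a = {a:.3f}, b = {b:.3f}")
```

A slope b close to zero indicates that the diagnostic odds ratio does not change appreciably with threshold, in which case the fitted curve is symmetric.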

Results

Out of a total of 1437 potentially relevant studies, 28 studies were identified that directly compared MRCP with diagnostic ERCP [9–36]. An additional study was identified that covered patient satisfaction [37]. Study selection is outlined in Figure 1. The quality of studies was variable. Results are shown in Table 1.

Figure 1

Summary of study selection.

Table 1 QUADAS quality assessment checklist applied to MRCP studies

In only one study did all selected patients have both MRCP and diagnostic ERCP [31], indicating potential verification bias. Thirteen studies [10–12, 14, 15, 17, 18, 25, 28–30, 34, 36] reported adequate blinding and only six [10, 18, 22, 27, 29, 34] reported information on agreement of MRCP results for more than one investigator. Nine studies [10, 12, 17, 20, 22, 25, 28, 29, 32] gave no information on other diagnostic tests and most studies did not adequately report inclusion and exclusion criteria.

Seven studies [9, 17, 21, 22, 25, 26, 34] reported results comparing MRCP to final diagnosis (including ERCP and other test results), but only four of these reported data comparing both MRCP and ERCP with final diagnosis. The remaining 21 reported results comparing MRCP with diagnostic ERCP. Three [10, 17, 19] of the 28 studies did not provide enough information to calculate sensitivity, specificity and likelihood ratios. Accuracy was assessed separately for each condition (choledocholithiasis, malignancy, dilatation, obstruction and stricture). The results of the remaining 25 studies are shown in Table 2. Table 3 describes study characteristics. Studies comparing MRCP and ERCP with final diagnosis are shown in Table 4.

Table 2 The 25 included studies
Table 3 Study characteristics
Table 4 Studies comparing MRCP and ERCP with final diagnosis

Assessment of effectiveness by condition: choledocholithiasis

One of the most common causes of biliary obstruction is choledocholithiasis: indeed, 18 out of the 28 studies (64%) were for this condition. For this reason we concentrate mainly on the analysis of these studies. Of the 18 studies reporting results for choledocholithiasis, 15 [11, 13–15, 18, 20, 21, 26, 27, 29–31, 33, 35, 36] reported adequate data for analysis. Figure 2 shows a scatterplot of sensitivity vs. 1-specificity for the 15 studies reporting choledocholithiasis.

Figure 2

Scatterplot of sensitivity vs. 1-specificity of MRCP test for diagnosing choledocholithiasis with Littenberg-Moses Summary ROC curve and actual data (n = 15 studies).

Two of these studies stand out as having sensitivities somewhat lower than the other 13 studies [11, 36]. Sensitivities and specificities along with 95% CI for these estimates for the 15 choledocholithiasis studies are presented in Figures 3 and 4.

Figure 3

Forest plot of estimated sensitivities of MRCP test for diagnosing choledocholithiasis (n = 15 studies).

The sensitivity for the 15 studies of choledocholithiasis ranged from 0.50 to 1.00 while specificity ranged from 0.83 to 1.00. The positive likelihood ratio ranged from 5.44 to 47.72 and the negative likelihood ratio ranged from 0.00 to 0.51. All of the confidence intervals overlap, although the confidence intervals for some studies are wide. Again, two studies [11, 36] have point estimates in Figure 2 that are clearly different from the other 13 studies, suggesting that these two studies are outliers. There is also some evidence of statistically significant heterogeneity between studies (Figures 3 and 4), which suggested that the computation of an SROC curve was the most appropriate way to pool the results of studies.
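For readers less familiar with these accuracy statistics, the following sketch shows how sensitivity, specificity and the two likelihood ratios follow from a single study's 2x2 table against the reference test. The Wilson score interval is used here purely for illustration; the review does not state which interval method was applied, so that choice, the function names and the example counts are all assumptions.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (assumed method)."""
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_statistics(tp, fp, fn, tn):
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    # LR+ is undefined when specificity is exactly 1; reported here as infinity.
    lr_pos = sens / (1.0 - spec) if spec < 1.0 else math.inf
    lr_neg = (1.0 - sens) / spec if spec > 0.0 else math.inf
    return {
        "sensitivity": sens,
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": spec,
        "specificity_ci": wilson_interval(tn, tn + fp),
        "lr_positive": lr_pos,
        "lr_negative": lr_neg,
    }


# Hypothetical counts (TP, FP, FN, TN); not taken from any included study.
stats = accuracy_statistics(45, 5, 5, 45)
for name, value in stats.items():
    print(name, value)
```

A negative likelihood ratio of 0.00, the lower end of the range reported above, corresponds to a study with no false negatives, that is, a sensitivity of 1.00.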

Figure 4

Forest plot of estimated specificities of MRCP test for diagnosing choledocholithiasis (n = 15 studies).

Summary ROC curves for diagnosis of choledocholithiasis

Figure 2 also shows the parameter estimates and the results of the Littenberg-Moses method for the estimation of a summary best fitting ROC curve. This curve shows the relationship between sensitivities and specificities across the 15 studies. The estimated coefficient for S was 0.057 [CI: -0.25 to 0.42; p = 0.506]; this non-significant result suggests that there is no reliable statistical evidence that the diagnostic odds ratio changes with threshold.

There is no unique joint summary estimate of sensitivity and specificity suitable for use in clinical practice from this plot. Table 5 shows values of sensitivity and specificity off the fitted ROC curve to demonstrate the range of values that the data are compatible with.

Table 5 Estimates of diagnostic odds ratio (DOR), sensitivity and specificity from the regression of D = a + bS for Littenberg-Moses Summary ROC curve for MRCP test diagnosis of choledocholithiasis (n = 15 studies) for a range of values of diagnostic threshold (S)
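The way sensitivity and specificity are read off the fitted curve for a given value of S can be sketched as follows. The sketch assumes the standard Littenberg-Moses definitions of D and S, from which logit(sensitivity) = (D + S)/2 and logit(1 - specificity) = (S - D)/2. The slope b = 0.057 is the estimate reported in the text; the intercept a and the values of S used below are hypothetical placeholders, not the values underlying Table 5.

```python
import math


def expit(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def sroc_point(a, b, s):
    """Return (D, sensitivity, specificity) on the curve D = a + b*S at threshold S."""
    d = a + b * s
    sensitivity = expit((d + s) / 2.0)   # logit(TPR) = (D + S) / 2
    specificity = expit((d - s) / 2.0)   # logit(FPR) = (S - D) / 2, specificity = 1 - FPR
    return d, sensitivity, specificity


A_HYPOTHETICAL = 5.0   # assumed intercept, for illustration only
B_REPORTED = 0.057     # S coefficient reported in the text

# Hypothetical minimum, mean and maximum values of S.
for s in (-3.0, -1.0, 1.0):
    d, sens, spec = sroc_point(A_HYPOTHETICAL, B_REPORTED, s)
    print(f"S = {s:+.1f}: DOR = {math.exp(d):.1f}, "
          f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Moving along the curve towards larger S trades specificity for sensitivity, which is why a single joint summary estimate cannot be read from the plot.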

Assessment of effectiveness by other conditions: malignancy, dilatation, obstruction and stricture

For malignancy (three studies [9, 18, 22]), sensitivity ranged from 0.81 to 0.94 and specificity from 0.92 to 1.00. Positive likelihood ratios ranged from 10.12 to 43 and negative likelihood ratios ranged from 0.15 to 0.21. However, from the results presented in Table 2 it is apparent that the results for malignancy are much less reliable than those for the other conditions presented. The sensitivity for dilatation (five studies [11, 12, 20, 25, 28]) ranged from 0.87 to 1.00 and the specificity from 0.91 to 1.00. For obstruction (three studies [11, 18, 27]), sensitivity ranged from 0.91 to 1.00 and specificity from 0.91 to 1.00. Sensitivity for stricture (two studies [24, 33]) was 1.00 and specificity ranged from 0.98 to 0.99.

Adverse events and satisfaction with procedures

Apart from claustrophobia, none of the 28 studies reported any adverse events associated with MRCP. Six studies [9, 11, 13, 27, 31, 34] reported adverse effects associated with ERCP, including pancreatitis, bleeding and pain. Two [10, 15] reported that no adverse events had occurred and 20 gave no information at all regarding adverse events. Claustrophobia associated with MRCP was reported in ten studies [9, 11, 16, 18, 19, 24, 26, 27, 30, 33].

The separate study [37] dealing with patient satisfaction found that most patients preferred MRCP, although there were still some patients who preferred ERCP. Almost half of the patients in this small study complained of claustrophobia associated with MRCP, although very few (5.9%) refused MRCP for this reason.

Discussion

This systematic review shows that MRCP stands up well to comparison with diagnostic ERCP for the diagnosis of many biliary abnormalities. From the small number of comparative studies with final diagnosis, it appears that ERCP is an adequate reference standard for choledocholithiasis, with sensitivities and specificities above 89%; however, the results for malignancy were much less reliable. The limited evidence on patient satisfaction shows that patients prefer MRCP to diagnostic ERCP. The results of our review are similar to those found by Romagnuolo et al [38], who in their meta-analysis showed high levels of sensitivity and specificity for demonstrating the level and presence of biliary obstruction.

The main advantage of MRCP is that it avoids the significant morbidity and mortality that may be associated with diagnostic ERCP [39]. Reported complication rates of diagnostic ERCP are 5–6% and mortality figures range from 0.01% [36] to 0.89% [40]. Therapeutic ERCP has a complication rate of 4–10% [41]. Diagnostic ERCP has the potential to allow a therapeutic procedure to be performed immediately, but its indiscriminate use will result in an increasing proportion of patients in whom such intervention is found to be unnecessary. If preliminary tests, such as EUS or computed tomography, clearly indicate the need for therapeutic ERCP, then the use of diagnostic MRCP is probably unwarranted. Those patients with a high probability of choledocholithiasis on the basis of EUS investigations usually proceed directly to ERCP. These issues are elaborated in Bravo et al [42].

There were no reported adverse events associated with MRCP, other than claustrophobia. However, in certain circumstances MRCP cannot be performed due to contraindications to Magnetic Resonance Imaging, e.g. in patients with cardiac pacemakers or cochlear implants. Severe claustrophobia may make patients intolerant of the procedure.

Limitations of this study

Overall, the quality of the studies was variable. In only one study did all selected patients have both MRCP and diagnostic ERCP. The reasons why all patients in the other studies did not receive both investigations were not clear. In 21 of the studies the stated comparison was with ERCP, while in the other seven studies comparison was with final diagnosis, making comparisons between all studies difficult.

We can consider three ways of categorising a patient: their true condition, the diagnosis and the test results [43]. We have calculated the sensitivity and specificity of MRCP in relation to diagnosis by ERCP, but we do not necessarily know that the diagnosis is always correct. ERCP is not a perfect gold standard, so differences in diagnosis between MRCP and ERCP may be due not to MRCP giving an incorrect result, but rather to ERCP giving an incorrect result. We have therefore evaluated MRCP's ability to predict the diagnosis of choledocholithiasis rather than the patient's true disease status, and any errors in the ERCP reference test may lead to either underestimates or overestimates of MRCP's accuracy.

There are several problems associated with using summary ROC curves. For example, although the production of a summary ROC curve does allow the computation of a summary estimate of diagnostic performance, the results cannot be directly applied to clinical practice.

Our results indicate that MRCP is accurate for diagnosis of biliary abnormalities compared to diagnostic ERCP, within the limits of the available data. Good quality studies, particularly randomised controlled trials (we found no comparative clinical trials of the two techniques in our review), are needed comparing MRCP with diagnostic ERCP to final diagnosis, stating inclusion/exclusion criteria and relevant patient characteristics. These studies need to include the full range of target conditions, in particular the differentiation of benign and malignant strictures and the impact on management and outcome. Studies are also needed comparing MRCP with final diagnosis where ERCP is unsuitable or impossible. More research is also needed in the area of patient satisfaction and ways to reduce problems with claustrophobia.

Conclusion

MRCP is comparable to ERCP as a diagnostic investigation for biliary abnormalities. Results were particularly favourable for choledocholithiasis and less so for malignancy. Limited information on patient satisfaction found that patients prefer MRCP to ERCP. The use of MRCP in suitable patients reduces the need for diagnostic ERCP, which is associated with significant morbidity and mortality.