Introduction

Magnetic resonance imaging (MRI) has been shown to be highly sensitive for the detection and characterization of abdominal lesions including those in the liver, pancreas, and kidney. Therefore, MRI often has direct implications for patient management (surgical vs. nonsurgical treatment) [1]. MRI is increasingly used in clinical routine to differentiate between malignant and nonmalignant renal tumors and is considered a complementary modality to computed tomography (CT) and ultrasonography (US). The use of contrast agents in renal MRI is well established and offers additional advantages for the diagnostic assessment of renal diseases. The use of gadopentetate dimeglumine (0.5 M Gd-DTPA, Magnevist®, Bayer Schering Pharma) in combination with fat-saturation (FS) techniques has been shown to be superior to unenhanced MR imaging for detection and characterization of renal lesions [2]. Furthermore, 0.5 M gadopentetate dimeglumine may also allow the detection of functional kidney derangement [3, 4] and selective direct imaging of the entire urinary tract [5].

Various 0.5 M nonspecific extracellular gadolinium chelates are commercially available and have been investigated for contrast-enhanced (CE)-MRI of different body areas. It is widely assumed that these contrast agents have a comparable efficacy, although only limited results from direct comparative studies are available. These 0.5 M gadolinium chelates belong to the first generation of nonspecific extracellular Gd-chelates. However, a second generation of commercially available MR contrast agents, e.g., 0.5 M Gd-chelates with weak protein-binding properties (Gd-BOPTA, MultiHance®, Bracco) or the highly concentrated 1.0 M macrocyclic Gd-chelate (gadobutrol, Gadovist®, Bayer Schering Pharma), are increasingly used in clinical routine following approval.

Due to its more compact bolus profile, 1.0 M gadobutrol is well suited for dynamic imaging, such as first-pass MR angiography and perfusion imaging [6–14]. It is also well suited for standard MRI techniques (e.g., imaging of focal disease in different body regions) with efficacy comparable to that of 0.5 M extracellular Gd-chelates, as the concentration of the contrast agent has no impact on the extracellular distribution, including the distribution in lesions. Gadobutrol is therefore expected to provide diagnostic information for classifying benign and malignant lesions that is comparable to that provided by 0.5 M Gd-chelates.

Since 2006 gadolinium-based contrast agents have increasingly been suspected to be one potential trigger for a rare but serious condition, called nephrogenic systemic fibrosis (NSF) [15, 16]. NSF has only been seen in patients with severe renal impairment, most of whom underwent dialysis, and in isolated cases in patients with acute renal failure [17].

The purpose of this phase III clinical study was to investigate the diagnostic efficacy of gadobutrol-enhanced MRI for the classification of known or suspected focal renal lesions as compared with 0.5 M gadopentetate dimeglumine. To test for noninferiority of 1.0 M gadobutrol-enhanced MRI in comparison with 0.5 M gadopentetate dimeglumine-enhanced MRI, the diagnostic accuracy rates for both contrast agents were compared with a predefined standard of truth (SOT).

Material and methods

Study design and population

In order to meet the criteria of evidence-based medicine, the clinical trial was performed as a multicenter, randomized, single-blind, interindividually controlled parallel group study in a routine patient population. Patients with known or suspected renal lesions who had undergone or were scheduled for CT evaluation within 1 month before or after the MRI examination were eligible for the study. Thus each patient underwent both computed tomography (CT) and contrast-enhanced MRI of the kidney. For the contrast-enhanced MRI, patients were randomized to receive either gadobutrol or gadopentetate dimeglumine (1:1 randomization). Written informed consent was obtained from each patient before they entered the study, and the institutional review boards of all involved centers approved the study.

Exclusion criteria were: age under 18 years, general contraindications to MRI, clinical instability, surgery or intervention within 4 weeks or scheduled for the 28-h safety follow-up period, acute renal failure, pregnancy, and lactation. Patients who had received any investigational drug or systemic kidney tumor therapy within the previous 14 days or within 15 half-lives of the drug and patients who had received any contrast medium within 24 h before or who were scheduled to do so within 28 h of study injection were also excluded. Additionally, patients with renal lesions that could be sufficiently diagnosed by diagnostic procedures other than CT or MRI (e.g., cysts) were excluded.

Contrast agents

1.0 M gadobutrol (Gd-BT-DO3A, Gadovist®, Bayer Schering Pharma AG, Berlin, Germany) is a gadolinium-based hydrophilic, neutral (nonionic) macrocyclic contrast agent. The steady-state volume of distribution indicates a predominantly extracellular distribution. The terminal half-life is approximately 1.5 h in healthy subjects [18]. Gadobutrol has a high complex stability owing to the kinetic stability characteristic for macrocyclic agents. The kinetic dissociation half-life is t1/2 = 24 h (pH = 1), which can be extrapolated to a t1/2 of over 1,000 years at pH = 7.4 [19]. The T1-relaxivity (r1) is 5.2 ± 0.3 l mmol−1 s−1, the T2-relaxivity (r2) is 6.1 ± 0.3 l mmol−1 s−1 (in plasma, at 1.5 T and 37°C) [20]. Gadobutrol has shown excellent renal tolerance in patients suffering from renal impairment or those with end-stage renal failure under hemodialysis treatment [21–23]. As of December 2007 there are no reports on NSF in association with the administration of gadobutrol in the peer-reviewed literature.

0.5 M gadopentetate dimeglumine (Gd-DTPA, Magnevist®, 0.5 mol Gd l−1, Bayer Schering Pharma AG, Berlin, Germany), a standard extracellular paramagnetic MR contrast agent, served as the reference agent with a proven excellent efficacy and safety profile [24]. The T1- and T2-relaxivities in plasma (at 1.5 T and 37°C) are somewhat lower than those of gadobutrol (r1 = 4.1 ± 0.2 l mmol−1 s−1, r2 = 4.6 ± 0.8 l mmol−1 s−1) [20]. As of December 2007 there are reports on NSF in association with the administration of gadopentetate dimeglumine in the peer-reviewed literature [25, 26]. Since June 2007 gadopentetate dimeglumine has been contraindicated in Europe for use in patients suffering from severe renal impairment (GFR < 30 ml min−1), i.e., after this study was conducted.

MRI sequences

MRI examinations comprising precontrast, dynamic contrast-enhanced, and delayed contrast-enhanced sequences were obtained at a field strength of 1.5 T on MR systems from different manufacturers.

The precontrast sequence (transversal orientation) included a T2-TSE- and a T1-SE- or T1-GE-weighted sequence without FS covering both kidneys (slice thickness 5–8 mm, matrix ≥256 × 192, field of view (FOV) ≤400 mm). In addition, a T1 SE or GE sequence with fat suppression was performed with sequence parameters that were comparable except for the matrix (≥256 × 128).

In both groups, contrast agents were injected as an intravenous bolus at a dose of 0.1 mmol Gd kg−1 bw with an MR-compatible power injector. To allow for comparable gadolinium delivery rates, the injection speed was adapted to the concentration of the injected contrast agent, e.g., 1.0 ml s−1 for the 1.0 M gadobutrol and 2.0 ml s−1 for the 0.5 M gadopentetate dimeglumine. A dynamic contrast-enhanced T1-weighted sequence covering both kidneys (coronal orientation) was acquired before and 20–30, 50–60, and 160–180 s following the injection. GE or multiplanar turbo-GE sequences with or without fat suppression or alternatively 3D GRE sequences could be applied during a breath-hold period (slice thickness ≤5 mm without gap, matrix ≥256 × 160, FOV ≤400 mm; in the case of 3D sequences with a reconstruction interval of 5 mm).
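The relation between dose, concentration, and injection speed described above can be checked with a short calculation. The following is a minimal sketch, not part of the study protocol: the function name, the dictionary keys, and the 70 kg body weight are illustrative assumptions. Only the dose (0.1 mmol Gd per kg), the concentrations (1.0 M and 0.5 M), and the flow rates (1.0 and 2.0 ml per second) are taken from the text.

```python
def injection_plan(weight_kg, concentration_mol_per_l, flow_ml_per_s,
                   dose_mmol_per_kg=0.1):
    """Return injected volume, injection duration, and Gd delivery rate.

    Hypothetical helper for illustration only.
    1 mol/l is numerically equal to 1 mmol/ml, so the concentration can be
    used directly as mmol of Gd per ml of contrast agent.
    """
    total_gd_mmol = dose_mmol_per_kg * weight_kg
    volume_ml = total_gd_mmol / concentration_mol_per_l
    duration_s = volume_ml / flow_ml_per_s
    gd_rate_mmol_per_s = concentration_mol_per_l * flow_ml_per_s
    return {
        "volume_ml": volume_ml,
        "duration_s": duration_s,
        "gd_rate_mmol_per_s": gd_rate_mmol_per_s,
    }


# Assumed example patient of 70 kg (not a value from the study).
gadobutrol = injection_plan(70, concentration_mol_per_l=1.0, flow_ml_per_s=1.0)
gadopentetate = injection_plan(70, concentration_mol_per_l=0.5, flow_ml_per_s=2.0)

for name, plan in (("gadobutrol", gadobutrol), ("gadopentetate", gadopentetate)):
    print(name, {key: round(value, 2) for key, value in plan.items()})
```

Under these assumptions the 0.5 M agent requires twice the volume of the 1.0 M agent, while the doubled flow rate keeps both the injection duration and the gadolinium delivery rate the same in the two groups.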

Delayed contrast-enhanced T1-weighted sequences (transversal acquisition, with and without FS) were obtained 4–5 min following the injection.

Standard of truth procedure

A CT examination to establish the correct diagnosis of the renal lesions was required for all patients; this was regarded as the standard of truth (SOT). Both nonenhanced and contrast-enhanced CT images (CE-CT) (preferably spiral CT) of both kidneys had to be performed within 4 weeks before or following the MRI procedure. A bolus injection of ≥80 ml nonionic contrast agent was required, preferably injected by a power injector. CE-CT had to be obtained during a time frame of 1–5 min postinjection with a slice thickness of up to 8 mm (preferably ≤5 mm). Where CT was indeterminate regarding the classification of lesions, additional tests or surgery were requested to establish diagnosis. If surgery (e.g., nephrectomy, partial/heminephrectomy, enucleation) or surgical exploration was performed and respective histopathological results were available, these results replaced the CT diagnosis as SOT for the respective lesions.

Evaluation

MR images

MR image datasets were evaluated in a blinded reading by three independent radiologists with expertise in abdominal MRI as an off-site, central evaluation. Neither the radiologists nor their institutions were involved in the on-site part of the study. The blinded reading was divided into three separate sessions: (1) precontrast (T1 and T2) MRI, (2) contrast-enhanced MRI, (3) combined precontrast (T1 and T2) and contrast-enhanced MRI. In order to reduce bias, readers were given no information concerning patient population, indication/clinical question, center-related information, or details of the study protocol. An interval of at least 3 weeks between each of the three reading sessions was fixed to avoid recall bias. The readers had to detect and classify the renal lesions as either malignant, benign, not assessable, or no lesion for MRI.

CT images (SOT)

Blinded reading of the CT studies was performed as an off-site, central evaluation by one independent certified radiologist with expertise in abdominal CT imaging. The radiologist and his respective institution were not involved in the clinical part of the study. The reader was provided with clinical information, e.g., age and sex, reason for referral, genitourinary medical and surgical history, but was blinded to center-related information, details of the study protocol, and the clinical imaging results. All identified lesions had to be classified as malignant or benign in order to be included for any SOT-based analysis.

Lesion tracking

The blinded reading of the MRI and CT examinations was complemented with a lesion-tracking session by an independent radiologist experienced in abdominal imaging, using only the renal maps. The objective of the lesion tracking was to guarantee an unambiguous assignment of the lesions to the respective SOT diagnosis. The blinded reader had to compare the renal maps derived from the SOT procedures with those from the blinded off-site and the clinical on-site readings of the MR images. This procedure was performed to ensure that each lesion was assigned correctly on the renal maps from both SOT and MR images.

Statistical analysis

The primary efficacy variable was the noninferiority of 1.0 M gadobutrol to 0.5 M gadopentetate dimeglumine in terms of accuracy, using a 95% confidence interval approach on a per-lesion basis for the per protocol set (PPS), i.e., for all patients who concluded the study without major protocol deviations. Diagnostic accuracy is an overall measure of diagnostic efficacy, which determines the probability that a test result reflects the true disease state of a patient. The lesion classification resulting from study MRI was compared with that of the SOT in order to determine the classification outcome (TP, FN, FP, TN). Lesions not classified by SOT were excluded from these classifications and from all other evaluations. The accuracy rate for lesion classification of the MRI was defined as the number of lesions with TP or TN ratings divided by the total number of SOT lesions [27].
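The per-lesion classification and the accuracy rate defined above can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names and the example lesion list are hypothetical, "malignant" is treated as the positive class, and MRI ratings other than malignant or benign are rejected because the text does not state how "not assessable" or "no lesion" ratings entered the outcome categories.

```python
from collections import Counter


def classify_outcome(mri_rating, sot_rating):
    """Map one lesion's MRI rating and SOT rating to TP, FN, FP, or TN.

    Assumption: 'malignant' is the positive class. Other ratings raise an
    error because their handling is not specified in the text.
    """
    allowed = {"malignant", "benign"}
    if mri_rating not in allowed or sot_rating not in allowed:
        raise ValueError("only 'malignant' and 'benign' ratings are handled")
    if sot_rating == "malignant":
        return "TP" if mri_rating == "malignant" else "FN"
    return "FP" if mri_rating == "malignant" else "TN"


def accuracy(lesions):
    """Accuracy = (TP + TN) / total number of SOT-classified lesions."""
    counts = Counter(classify_outcome(mri, sot) for mri, sot in lesions)
    return (counts["TP"] + counts["TN"]) / len(lesions)


# Hypothetical lesions as (MRI rating, SOT rating) pairs, not study data.
example_lesions = [
    ("malignant", "malignant"),  # TP
    ("benign", "benign"),        # TN
    ("malignant", "benign"),     # FP
    ("benign", "malignant"),     # FN
    ("benign", "benign"),        # TN
]
print(round(accuracy(example_lesions), 2))
```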

Descriptive statistics (N, mean, standard deviation, median, minimum, and maximum) were calculated for quantitative variables, and frequency counts and percentages by category were given for qualitative variables. Two-sided 95% confidence intervals (CIs) were calculated where appropriate.

Noninferiority was statistically evaluated by testing, at a significance level of α = 0.025, the null hypothesis $H_0\colon \mathrm{accuracy}_{\mathrm{gadobutrol}} - \mathrm{accuracy}_{\mathrm{gadopentetate}} \le -\Delta$ against the alternative hypothesis $H_1\colon \mathrm{accuracy}_{\mathrm{gadobutrol}} - \mathrm{accuracy}_{\mathrm{gadopentetate}} > -\Delta$, where Δ describes the acceptable noninferiority threshold. The result for an ‘average reader’ was calculated from the results of the three blinded readers for each group. Noninferiority was prospectively defined as a difference of less than Δ = 10% for the individual readers and the average reader. In addition, increases in diagnostic accuracy from precontrast MRI to the combined precontrast and postcontrast MRI were determined. For this evaluation, a difference of less than Δ = 4% was prospectively defined as the noninferiority threshold for the individual readers and the average reader.
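The confidence interval approach to this hypothesis test can be sketched as follows. The text does not state which interval method was used, so the simple Wald interval for a difference of two independent proportions is an assumption made here for illustration; it also ignores any correlation between lesions of the same patient. The function name and the lesion counts are hypothetical. The lower limit of a two-sided 95% interval corresponds to a one-sided test at α = 0.025.

```python
import math


def noninferiority_wald(correct_test, n_test, correct_ref, n_ref, margin,
                        z=1.959964):
    """Check noninferiority of the test agent with a Wald interval.

    Assumed method (not confirmed by the text): two-sided 95% Wald interval
    for the difference accuracy_test - accuracy_ref. Noninferiority is
    concluded when the lower limit lies above -margin.
    """
    p_test = correct_test / n_test
    p_ref = correct_ref / n_ref
    diff = p_test - p_ref
    se = math.sqrt(p_test * (1 - p_test) / n_test
                   + p_ref * (1 - p_ref) / n_ref)
    lower = diff - z * se
    upper = diff + z * se
    return {"diff": diff, "lower": lower, "upper": upper,
            "noninferior": lower > -margin}


# Hypothetical counts: 170 of 200 lesions correct vs. 175 of 200 correct.
for margin in (0.10, 0.04):
    result = noninferiority_wald(170, 200, 175, 200, margin=margin)
    print(margin, round(result["lower"], 3), result["noninferior"])
```

With these made-up counts the same observed difference satisfies the 10% margin but not the 4% margin, which illustrates how strongly the conclusion depends on the prospectively chosen Δ.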

Furthermore, sensitivity and specificity, as well as increases of sensitivity and specificity from precontrast MRI to the combined precontrast and postcontrast MRI, were determined. No threshold for statistical significance was prospectively defined for the sensitivity and specificity datasets.
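For reference, the two measures and their increase from precontrast to combined reading can be written out as below. The sketch uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the class name and all counts are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class ReadingCounts:
    """Per-lesion outcome counts for one reading session (hypothetical)."""
    tp: int
    fn: int
    tn: int
    fp: int

    def sensitivity(self):
        # Share of SOT-malignant lesions rated malignant on MRI.
        return self.tp / (self.tp + self.fn)

    def specificity(self):
        # Share of SOT-benign lesions rated benign on MRI.
        return self.tn / (self.tn + self.fp)


def increase(precontrast, combined):
    """Increase in sensitivity and specificity from precontrast to combined."""
    return {
        "sensitivity": combined.sensitivity() - precontrast.sensitivity(),
        "specificity": combined.specificity() - precontrast.specificity(),
    }


# Made-up counts for the same lesions read without and with contrast.
pre = ReadingCounts(tp=60, fn=40, tn=80, fp=20)
post = ReadingCounts(tp=85, fn=15, tn=80, fp=20)
print({key: round(value, 2) for key, value in increase(pre, post).items()})
```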

Descriptive statistics were performed for all safety variables, including AE data and vital signs.

Results

A total of 471 patients were enrolled in the study; 466 received a study drug and were thus included in the safety analysis (FAS) (n = 233 in the 1.0 M gadobutrol group, n = 233 in the 0.5 M gadopentetate group). Major protocol deviations were recorded for 60 patients. The most common major deviations were related to MRI or CT examinations, e.g., MR images were recorded without mandatory precontrast sequences. Therefore, 406 patients (n = 200 1.0 M gadobutrol; n = 206 0.5 M gadopentetate; demography and baseline characteristics in Table 1) concluded the study without major protocol deviations and were included in the efficacy evaluation (PPS). There were no relevant differences between the two groups regarding the listed characteristics. Table 2 shows the distribution of SOT procedures in both groups, whereas Table 3 shows the distribution of diseases. On the patient level the disease distribution (benign/malignant) was fairly balanced between the treatment groups; a slight imbalance can be seen in the disease distribution on the lesion level. In about 75% of the benign cases in both treatment groups, the lesions consisted of single cysts.

Table 1 Demography and baseline characteristics (PPS)
Table 2 Number of patients by SOT procedures (PPS)
Table 3 Distribution of disease and lesions (PPS)

Accuracy

For the average reader, the ‘mean’ accuracy rates for the combined assessment of precontrast and postcontrast images were 83.7% for 1.0 M gadobutrol and 87.3% for 0.5 M gadopentetate dimeglumine. The differences between the gadobutrol and the gadopentetate group ranged between less than 2% and 8% for the three readers. Statistical significance of noninferiority according to the predefined difference of <10% was achieved for the average reader and reader 1. The full data set is displayed in Fig. 1.

Fig. 1 Accuracy results for combined precontrast and postcontrast assessment as well as increase of accuracy from precontrast to postcontrast assessment

Almost identical increases in diagnostic accuracy from precontrast to combined assessment of precontrast and contrast-enhanced MRI sequences in comparison to the SOT were obtained for 1.0 M gadobutrol and 0.5 M gadopentetate, with differences between the two contrast agents of at most 3% (in favor of gadobutrol in the case of reader 1) (Fig. 1). Statistical significance according to the predefined difference of <4% was achieved for the average reader (Fig. 1).

Sensitivity

Diagnostic sensitivity rates for lesion classification were comparable between 1.0 M gadobutrol and 0.5 M gadopentetate (85.2% for 1.0 M gadobutrol and 88.7% for 0.5 M gadopentetate, Fig. 2). The difference in diagnostic sensitivity between 1.0 M gadobutrol and 0.5 M gadopentetate ranged from 1 to 7% (Fig. 2). Both groups showed a considerable and comparable increase of diagnostic sensitivity from precontrast to combined precontrast and postcontrast findings (Fig. 2).

Fig. 2 Sensitivity results for combined precontrast and postcontrast assessment as well as increase of sensitivity from precontrast to postcontrast assessment

Specificity

Diagnostic specificity rates for lesion classification were comparable between 1.0 M gadobutrol and 0.5 M gadopentetate (82.1% vs. 86.1% for the average reader, Fig. 3). The changes in diagnostic specificity from precontrast to the combined assessment of precontrast and contrast-enhanced MRI findings in comparison with the standard of truth were also comparable, although no relevant increases from the precontrast to the combined precontrast and postcontrast images were observed in either group (Fig. 3).

Fig. 3 Specificity results for combined precontrast and postcontrast assessment as well as increase of specificity from precontrast to postcontrast assessment

Discussion

The purpose of this multicenter, randomized, interindividually controlled, single-blind study was to demonstrate the noninferiority of 1.0 M gadobutrol in comparison with 0.5 M gadopentetate in the diagnostic assessment of renal lesions. Therefore, the accuracy of the combined precontrast and postcontrast images was chosen as the primary end point, representing the relevant diagnostic information for the physician. However, it has to be considered that this end point does not exclusively represent the contribution of a contrast agent.

To exclude bias in the evaluation of diagnostic efficacy in the two treatment groups, the evaluation of the MRI images was carried out by three experienced blinded readers not otherwise involved in the study. To assess the overall significance of the results across the readers, different approaches are possible. One approach is the ‘majority read’, i.e., in the case of three readers, two of the three need to reach significance in order for the overall result to be classified as significant. However, the primary objective of imaging studies is to evaluate the diagnostic efficacy of a specific imaging modality and/or to compare the efficacy of two modalities. The concept of the ‘average reader’ was prospectively chosen in the study protocol to represent the overall results and their significance.
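The difference between the two approaches can be made concrete with a small sketch. The text states only that the average reader was "calculated from the results of the three blinded readers", so taking the arithmetic mean of the per-reader rates is an assumption, and the function names and per-reader accuracy values are hypothetical. The reader-level significance flags mirror the Results, in which noninferiority was reached for reader 1 but not for the other two readers.

```python
def majority_read(reader_significant):
    """'Majority read': more than half of the readers must reach significance."""
    return sum(reader_significant) > len(reader_significant) / 2


def average_reader(reader_values):
    """Assumed 'average reader': arithmetic mean of the per-reader rates."""
    return sum(reader_values) / len(reader_values)


# Significance per reader as reported in the Results (reader 1 only).
flags = [True, False, False]
print(majority_read(flags))

# Hypothetical per-reader accuracy rates, not the study's values.
print(round(average_reader([0.80, 0.85, 0.90]), 2))
```

Under a majority-read rule, one significant reader out of three would not suffice, whereas the average-reader approach tests a single pooled estimate against the margin.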

The predefined noninferiority criterion (lower limit of the CI within the 10% margin) was met for the average reader in the PPS, confirming the noninferiority of 1.0 M gadobutrol compared with 0.5 M gadopentetate. In addition, statistical proof of noninferiority on the basis of the increase in diagnostic accuracy from precontrast to combined precontrast and contrast-enhanced MRI with a predefined noninferiority margin of 0.04 (4%) was obtained. Whether noninferiority can be concluded clearly depends on the choice of the maximum acceptable difference between the two agents. The chosen equivalence limit in this study was based on the response rate of the study drug. The diagnostic efficacy rate of 0.5 M gadopentetate was set to 85–90%, based on the available literature [28, 29]. Overall, acceptable accuracy rates of about 85% were obtained in both treatment groups in the presented study.

Hugh and Dubey proposed equivalence limits for binary data in dependence on the response of the study drug [30]. Based on previously published literature data the upper limit of the efficacy range for the reference drug (i.e., in our case the accuracy of gadopentate dimeglumine MRI in the investigated indication) was set to the demanding target of 10% for the noninferiority margin assuming a reference accuracy of 90–95%. Gadobutrol met the predefined end point, but it should be noted that the less demanding 15% difference (based on a reference accuracy of 80–90%) would have been met by all readers and the clinical investigators as well as by the average reader. Concerning the predefined equivalence limit for the statistical proof of the noninferiority regarding the increase in diagnostic accuracy, a note of caution is necessary, as the predefined limit of 4% is obviously very strict, but not as well referenced as the limits of the overall rate. Further statistical research is probably necessary to arrive at reliable guiding limits for this type of question.

However, an interesting methodological aspect surfaced in this study. Even though the noninferiority of gadobutrol compared with gadopentetate dimeglumine was demonstrated based on the results of the blinded reader evaluation, slightly higher values for diagnostic efficacy (accuracy, sensitivity) for combined precontrast and postcontrast images were obtained in the gadopentetate group in comparison with the gadobutrol group. This effect was not seen for the increase of accuracy or sensitivity from precontrast to combined precontrast and postcontrast between both groups. On the contrary, in the case of the sensitivity data the gadobutrol group actually performed consistently better, an effect which can also be seen in the accuracy data, although to a lesser degree. This increase, however, is the direct measure of the actual contribution of the contrast agent. In fact, the imbalance between the overall accuracy in both groups using the predefined end point (combination of precontrast and postcontrast images) can already be seen in the precontrast evaluation alone. It persists throughout the assessment of contrast-enhanced MRI alone, as well as in combined assessment of precontrast and contrast-enhanced MRI. Therefore, a contrast-agent-related factor is considered highly unlikely to be the cause of the observed imbalance. Potential explanations rather include differences in the characteristics of the two treatment groups.

Regarding sensitivity, considerable improvements (>15% for the average reader) from precontrast to postcontrast images were shown for both agents. The low variability of these results clearly shows the added value of the contrast agents. Slight differences in sensitivity seen between the two treatment groups are attributed to the imbalance in lesion distribution between them. For specificity, on the other hand, contrast enhancement provided no added benefit with either agent. This result is not surprising, as the additional information provided by any contrast agent in comparison with unenhanced MRI alone is very limited in benign lesions [31]. In more than 75% of the cases, benign lesions consisted of single unambiguous cysts, which can be easily diagnosed in the precontrast image. The slight difference in specificity seen between the two agents is again attributed to the imbalance in lesion distribution between them (more benign lesions in the 0.5 M gadopentetate group).

This study was conducted before the first report on a potential association of gadolinium-based contrast agents and NSF in April 2006. It has to be recognized that the patient population investigated in this study is distinctly different from the patient population for which NSF has been reported (patients with severe renal impairment, most of them on dialysis). The presence of focal kidney lesions in the investigated study population does not directly relate to the level of kidney function. However, exclusion of severe renal impairment (as opposed to acute renal failure) was not a criterion for the investigated study population. The lack of follow-up of the patients regarding possible delayed reactions can therefore be seen as a limitation of this study.

In conclusion, this study provides evidence for the noninferiority of a single i.v. bolus injection of 1.0 M gadobutrol (0.1 mmol kg−1 bw) compared with 0.5 M gadopentetate dimeglumine (0.1 mmol kg−1 bw) in the diagnostic assessment of renal lesions with CE-MRI.