Introduction

Kidney transplantation is an effective treatment for patients with end-stage kidney disease and has been shown to confer superior quality of life and clinical outcomes compared with dialysis [1]. Proper postoperative management and accurate prognostication are essential for maintaining the longevity of kidney allografts. From this perspective, biomarkers that inform the risk of kidney function decline would support clinical judgment and enable more precise prognostication. Currently, serum creatinine and proteinuria are the main clinical measures used to assess renal function.

Diffusion-weighted imaging (DWI) is a robust imaging technique that probes the displacement of water molecules, which reflects underlying microstructural changes. Accumulating evidence suggests that DWI of native and transplanted kidneys is technically feasible and reproducible [2]. Recent advances in intravoxel incoherent motion DWI (IVIM-DWI) have enabled the separation of pseudo-diffusion from true diffusion by fitting a biexponential model to the signal. Prior cross-sectional studies have consistently shown that DWI parameters may help characterize pathologic changes, especially interstitial fibrosis, in both native and allograft kidneys [3, 4].

To the best of our knowledge, most available DWI studies on kidney allografts have been cross-sectional in design, and it remains unknown whether DWI parameters predict kidney allograft survival and, if so, what prognostic value they add to clinical biomarkers. To bridge this gap, this study aimed to evaluate the predictive value of DWI parameters for kidney allograft function decline and to clarify their additive long-term prognostic value.

Materials and methods

Study population

This single-center retrospective study was performed after obtaining approval from the local ethics committee. Informed consent was waived owing to the retrospective and non-interventional nature of the analysis. Data were retrieved and reviewed for 115 adult patients evaluated for clinically driven indications, including post-transplant rising serum creatinine, proteinuria, or the appearance of donor-specific antibodies suspicious for allograft rejection. These patients underwent post-transplant multi-b DWI from March 2014 to October 2015 as part of a comprehensive evaluation that also comprised laboratory measurements. A total of 18 patients were excluded because of suboptimal image quality (n = 8), incomplete clinical or laboratory data (n = 7), or lack of follow-up (n = 3), leaving 97 patients for the final analysis.

DWI acquisition and image analysis

MRI was performed with a 3.0 Tesla clinical imager (MR750, General Electric, Milwaukee, WI, USA) equipped with a 32-channel body coil after the patient had fasted for at least 4–6 h. Coronal T1-weighted and axial T2-weighted images were routinely acquired for anatomic depiction. Axial DWI images were acquired with a respiration-triggered single-shot echo-planar imaging sequence. A total of 10 nonzero b-values (10, 30, 50, 70, 100, 150, 200, 400, 800, and 1000 s/mm²) were applied with the following parameters: repetition time 2857 ms, echo time 87.2 ms, field of view 38 × 30.4 cm, matrix 256 × 128, slice thickness 6 mm, number of slices 15, and number of excitations 2. Diffusion-sensitizing gradients were applied along 3 orthogonal directions to minimize the effects of diffusion anisotropy. The total acquisition time for DWI was approximately 4 to 6 minutes, depending on the patient’s breathing frequency.

Image analysis

The MADC program within the vendor-supplied FuncTool software on a GE Healthcare Advantage Workstation 4.6 was used to analyze the transferred DWI images. Two of the authors, blinded to the clinical and laboratory information, analyzed the images by manually drawing regions of interest (ROIs).

As previously reported [5], a large ROI covering the entire allograft cortex was delineated on each of the six central slices close to the hilum on the b = 0 images, resulting in six cortical ROIs for each allograft. A typical example of ROI delineation is presented in Fig. 1. A qualified radiologist (Y.M.Y., with 5 years of experience in abdominal MRI) manually drew the ROIs, which were then confirmed by an experienced radiologist (L.J.Z., with over 20 years of experience in abdominal MRI). The six cortical ROI readings were then averaged to obtain the corresponding DWI parameters for the allograft cortex.

Fig. 1 A typical example of region-of-interest delineation at a slice near the hilum in a kidney transplant on a b = 0 image

The imaging signals were analyzed using both the monoexponential and biexponential models. Specifically, the signals at b = 0 and the 10 nonzero b-values were fitted to the monoexponential model to obtain the total apparent diffusion coefficient (ADCT): Sb/S0 = exp(−b × ADCT), where Sb represents the signal intensity at a given b-value and S0 the signal intensity at b = 0 s/mm². The IVIM-derived parameters, including true diffusion (D), pseudo-diffusion (D*), and perfusion fraction (fp), were calculated using the following biexponential model: Sb/S0 = (1 − fp) × exp(−b × D) + fp × exp(−b × [D + D*]). A segmented fitting algorithm with constraints was applied: D was first estimated solely from b-values > 200 s/mm², where the contribution of pseudo-diffusion is negligible. Subsequently, D was held fixed while the remaining parameters, D* and fp, were fitted [6].
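The segmented fitting strategy can be illustrated with a minimal Python sketch. It is not the vendor MADC implementation: the function names, the log-linear estimate of D, the grid search over D* (assumed range 10⁻³ to 10⁻¹ mm²/s), and the closed-form least-squares estimate of fp are all assumptions chosen for clarity, and the constraints of the actual algorithm are not reproduced.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line y = a + m*x; returns (a, m)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    m = sxy / sxx
    return my - m * mx, m


def ivim_signal(b, d, d_star, fp):
    """Biexponential IVIM model, expressed as Sb/S0."""
    return (1 - fp) * math.exp(-b * d) + fp * math.exp(-b * (d + d_star))


def segmented_ivim_fit(bvals, signals, b_threshold=200.0):
    """Two-step fit on normalized signals Sb/S0; returns (D, D*, fp)."""
    # Step 1: at high b-values the perfusion term has decayed, so
    # ln(Sb/S0) is approximately linear in b with slope -D.
    high = [(b, s) for b, s in zip(bvals, signals) if b > b_threshold]
    _, slope = fit_line([b for b, _ in high], [math.log(s) for _, s in high])
    d = -slope

    # Step 2: hold D fixed. For each candidate D* (hypothetical log grid),
    # fp has a closed-form least-squares solution because the model is
    # linear in fp: S - exp(-bD) = fp * (exp(-b(D + D*)) - exp(-bD)).
    resid = [s - math.exp(-b * d) for b, s in zip(bvals, signals)]
    best = None
    for k in range(401):
        d_star = 1e-3 * 100 ** (k / 400)
        basis = [math.exp(-b * (d + d_star)) - math.exp(-b * d) for b in bvals]
        fp = sum(r * g for r, g in zip(resid, basis)) / sum(g * g for g in basis)
        fp = min(max(fp, 0.0), 1.0)  # keep the fraction physically plausible
        sse = sum((r - fp * g) ** 2 for r, g in zip(resid, basis))
        if best is None or sse < best[0]:
            best = (sse, d_star, fp)
    return d, best[1], best[2]


# Noise-free synthetic signal at the b-values used in this study.
bvals = [0, 10, 30, 50, 70, 100, 150, 200, 400, 800, 1000]
signals = [ivim_signal(b, 0.0015, 0.02, 0.25) for b in bvals]
d, d_star, fp = segmented_ivim_fit(bvals, signals)
```

Fixing D before estimating D* and fp reduces the number of free parameters in the second step, which is the usual motivation for a segmented rather than a simultaneous fit.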

Biochemical measurements, follow-up, and outcomes

Patient demographics and clinical and laboratory information at the time of DWI were collected from the electronic health records. The following data were recorded: age, sex, cause of end-stage kidney disease, serum creatinine, 24-h proteinuria, immunosuppression regimen, hemoglobin, albumin, and hematocrit. Allograft function was assessed with the estimated glomerular filtration rate (eGFR) using the creatinine-based Chronic Kidney Disease Epidemiology Collaboration equation [7]. After discharge, patients were followed up every 3 to 6 months at the outpatient clinic. The primary outcome was a composite of eGFR decline > 30%, initiation of renal replacement therapy, or re-transplantation, as previously reported [8]. For patients who died, the last available eGFR was used to assess the primary outcome.
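The composite primary outcome can be expressed as a simple rule. The sketch below is illustrative only: the function and argument names are hypothetical, and it assumes that the decline is measured as the relative drop from the baseline eGFR to the follow-up (or last available) eGFR.

```python
def reached_primary_outcome(baseline_egfr, followup_egfr,
                            started_rrt=False, retransplanted=False):
    """Return True if any component of the composite outcome is met.

    eGFR values are in mL/min/1.73 m². The 30% threshold is taken from the
    outcome definition; measuring decline relative to baseline is an assumption.
    """
    relative_decline = (baseline_egfr - followup_egfr) / baseline_egfr
    return relative_decline > 0.30 or started_rrt or retransplanted


# A fall from 56 to 38 mL/min/1.73 m² is a decline of about 32%.
print(reached_primary_outcome(56, 38))
```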

Statistical analysis

Continuous variables are presented as mean ± standard deviation or median with interquartile range, as appropriate. Categorical variables are expressed as numbers and percentages. The optimal cut-off points for eGFR, proteinuria, and DWI parameters were determined based on the “maximally selected rank statistics” using the package prodlim for R (http://www.r-project.org/) as proposed previously [9]. This technique separates patients into low- and high-risk groups by providing a single cut-off point for the predictor while controlling for the multiple testing inherent in evaluating many candidate cut-offs. Cumulative kidney allograft survival was estimated using the Kaplan-Meier method and compared using the log-rank test. Variables with a p-value < 0.05 in the univariate analysis were entered into the multivariable Cox proportional-hazards regression model. To evaluate whether risk prediction could be improved by incorporating DWI parameters, we compared the area under the receiver-operating characteristic curve (AUROC) of the clinical model, the DWI model, and the composite model using DeLong’s test. A two-sided p-value < 0.05 was considered statistically significant.
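The idea behind cut-off selection can be sketched in Python as a scan over candidate cut-offs that keeps the one with the largest two-group log-rank statistic. This is a simplified illustration rather than the R procedure used here: the function names and the minimum group size are assumptions, and the sketch omits the p-value adjustment for multiple candidate cut-offs that the maximally selected rank statistics provide.

```python
def logrank_chi2(times, events, in_group):
    """Two-sample log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d1 = sum(1 for i in at_risk
                 if times[i] == t and events[i] and in_group[i])
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_cutpoint(values, times, events, min_group=2):
    """Return (cut-off, chi-square) maximizing the log-rank statistic.

    Patients with value <= cut-off form one group, the rest the other.
    """
    best = (None, 0.0)
    for c in sorted(set(values))[:-1]:
        low = [v <= c for v in values]
        n_low = sum(low)
        if n_low < min_group or len(values) - n_low < min_group:
            continue  # skip splits that leave a very small group
        stat = logrank_chi2(times, events, low)
        if stat > best[1]:
            best = (c, stat)
    return best


# Toy data: predictor values 1-4 all fail early, values 5-8 are censored late.
values = [1, 2, 3, 4, 5, 6, 7, 8]
times = [8, 7, 6, 5, 50, 55, 60, 65]
events = [1, 1, 1, 1, 0, 0, 0, 0]
cut, chi2 = best_cutpoint(values, times, events)
```

Because the statistic is maximized over many splits, its nominal p-value is optimistic, which is why an adjusted procedure is preferred in practice.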

Results

Characteristics of the study population

Patient demographics, clinical parameters, and laboratory findings are summarized in Table 1. A total of 97 patients were included in the final analysis, comprising 69 males and 28 females with a mean age of 38 years. The baseline median serum creatinine, eGFR, and proteinuria were 1.53 mg/dL, 56 mL/min/1.73 m², and 0.39 g/24 h, respectively. The great majority (80.41%) of patients were on a triple immunosuppressive regimen consisting of prednisone, tacrolimus, and mycophenolic acid. The cause of end-stage kidney disease was unknown in 82.47% of patients. During a median follow-up time of 98 months (interquartile range, 44–103 months), a total of 45 patients reached the primary outcome, including eGFR decline > 30% in 9 patients, return to dialysis in 34 patients, and re-transplantation in 2 patients.

Table 1 Demographics, clinical parameters, and laboratory findings of the cohort

Determination of optimal cut-off points for binary classifiers

As shown in Fig. 2, the optimal cut-off points to assess the primary outcome as determined by the “maximally selected rank statistics” for ADCT, D, D*, and fp in the cortex were 1.94 × 10⁻³ mm²/s, 1.47 × 10⁻³ mm²/s, 4.65 × 10⁻³ mm²/s, and 0.281, respectively. Similarly, the optimal cut-off points for patient age, baseline eGFR, and proteinuria were 47 years, 37 mL/min/1.73 m², and 0.75 g/24 h, respectively.

Fig. 2 Determination of the optimal cut-off points for age (a), proteinuria (b), eGFR (c), cortical ADCT (d), cortical D (e), cortical D* (f), and cortical fp (g) based on the “maximally selected rank statistics”

Kaplan-Meier survival curves

Patients were dichotomized based on the cut-off points determined above. Kaplan-Meier curves (Fig. 3) showed that older age (hazard ratio [HR] = 2.48, p = 0.009), lower baseline eGFR (HR = 4.94, p < 0.001), higher proteinuria (HR = 4.25, p < 0.001), lower cortical ADCT (HR = 8.30, p < 0.001), lower cortical D (HR = 6.17, p < 0.001), lower cortical D* (HR = 4.29, p < 0.001), and lower cortical fp (HR = 5.24, p < 0.001) were all associated with the primary outcome. In contrast, kidney allograft survival did not differ significantly by sex (HR = 1.32, p = 0.43).
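For readers less familiar with the Kaplan-Meier method, a minimal Python sketch of the product-limit estimator applied to two dichotomized groups is given below. The function names and the toy data are hypothetical; only the eGFR cut-off of 37 mL/min/1.73 m² is taken from the text above.

```python
def kaplan_meier(times, events):
    """Product-limit estimate; returns [(time, survival)] at each event time."""
    curve = []
    survival = 1.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        n_at_risk = sum(1 for ti in times if ti >= t)
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        survival *= 1 - n_events / n_at_risk
        curve.append((t, survival))
    return curve


def curves_by_cutoff(values, times, events, cutoff):
    """Dichotomize at the cut-off and estimate one curve per group."""
    low = [i for i, v in enumerate(values) if v <= cutoff]
    high = [i for i, v in enumerate(values) if v > cutoff]
    return (kaplan_meier([times[i] for i in low], [events[i] for i in low]),
            kaplan_meier([times[i] for i in high], [events[i] for i in high]))


# Toy example: follow-up in months, 1 = primary outcome, 0 = censored.
egfr = [25, 30, 35, 45, 60, 70]
months = [12, 30, 48, 60, 90, 100]
outcome = [1, 1, 0, 1, 0, 0]
low_curve, high_curve = curves_by_cutoff(egfr, months, outcome, cutoff=37)
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is what distinguishes this estimate from a simple proportion.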

Fig. 3 Kaplan-Meier curves of kidney allograft function decline stratified by patient age (a, HR = 2.48, p = 0.009), sex (b, HR = 1.32, p = 0.43), baseline estimated glomerular filtration rate (eGFR, c, HR = 4.94, p < 0.001), proteinuria (d, HR = 4.25, p < 0.001), cortical ADCT (e, HR = 8.30, p < 0.001), cortical D (f, HR = 6.17, p < 0.001), cortical D* (g, HR = 4.29, p < 0.001), and cortical perfusion fraction (h, HR = 5.24, p < 0.001)

Construction and comparisons of predictive models

We constructed three predictive models using multivariable analysis. Specifically, Model 1 (clinical model) comprised patient age, baseline eGFR, and proteinuria; Model 2 (DWI model) included cortical ADCT, cortical D, cortical D*, and cortical fp; and Model 3 (composite model) encompassed all the parameters in Model 1 and Model 2. The results of each model for predicting the primary outcome are shown in Table 2. In multivariable analysis, Model 3 retained cortical D (HR = 3.93, p = 0.001) and cortical fp (HR = 2.85, p = 0.006), in addition to baseline eGFR (HR = 3.52, p = 0.002) and proteinuria (HR = 2.94, p = 0.003).

Table 2 Different models for predicting kidney allograft function decline by multivariable Cox regression analysis

The changes in AUROC over follow-up time for each model are presented in Fig. 4. The AUROC for Model 1 gradually decreased beyond 40 months of follow-up, whereas Model 2 and Model 3 maintained a relatively stable AUROC. We then compared the predictive ability of the models by calculating the AUROC at 12, 36, 60, and 84 months of follow-up. As presented in Fig. 5, the AUROCs of Model 1 and Model 2 did not differ significantly at 12-month (0.91 vs 0.87, p = 0.83), 36-month (0.95 vs 0.88, p = 0.53), 60-month (0.86 vs 0.88, p = 0.83), or 84-month follow-up (0.83 vs 0.86, p = 0.34). In comparison, the AUROCs of Model 3 were significantly higher than those of Model 1 at 60 months (0.91 vs 0.86, p = 0.02) and 84 months (0.90 vs 0.83, p = 0.007). The AUROCs of Model 3 were comparable with those of Model 1 at 12-month (0.92 vs 0.91, p = 0.07) and 36-month (0.95 vs 0.95, p = 0.13) follow-up.
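The AUROC at a fixed follow-up horizon can be illustrated with the following Python sketch. It is a deliberate simplification and not the time-dependent AUROC or DeLong's test used in this analysis: the function names are hypothetical, patients censored before the horizon are simply excluded, and the AUROC is computed as the proportion of event/non-event pairs that the risk score ranks correctly.

```python
def status_at_horizon(times, events, horizon):
    """1 = event by the horizon, 0 = event-free at the horizon,
    None = censored before the horizon (status unknown)."""
    status = []
    for t, e in zip(times, events):
        if e and t <= horizon:
            status.append(1)
        elif t >= horizon:
            status.append(0)
        else:
            status.append(None)
    return status


def auroc(scores, labels):
    """Probability that a random event case scores higher than a random
    non-event case (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_at_horizon(scores, times, events, horizon):
    """AUROC among patients whose status at the horizon is known."""
    status = status_at_horizon(times, events, horizon)
    known = [(s, y) for s, y in zip(scores, status) if y is not None]
    return auroc([s for s, _ in known], [y for _, y in known])


# Toy example: model risk scores, follow-up in months, event indicators.
risk = [0.9, 0.8, 0.3, 0.2, 0.6]
months = [10, 30, 90, 100, 20]
outcome = [1, 1, 0, 1, 0]
auc_60 = auroc_at_horizon(risk, months, outcome, horizon=60)
```

Excluding patients censored before the horizon can bias the estimate when censoring is informative, which is one reason dedicated time-dependent methods are preferred for survival data.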

Fig. 4 Time-dependent area under the receiver-operating characteristic curve (AUROC) for the clinical model (Model 1, a), DWI model (Model 2, b), and composite model (Model 3, c) during the follow-up time. The solid line represents the AUROC value, and the two dotted lines denote the corresponding 95% confidence intervals

Fig. 5 Receiver-operating characteristic curves for Model 1, Model 2, and Model 3 at 12-month (a), 36-month (b), 60-month (c), and 84-month (d) follow-up time points

Discussion

We evaluated the added value of cortical DWI parameters for predicting allograft function decline in a cohort of 97 patients with a median follow-up of 98 months. The results showed that cortical D and fp were predictors of allograft function decline independent of baseline eGFR and proteinuria. Furthermore, the addition of these two IVIM-DWI parameters to a clinical model consisting of baseline eGFR and proteinuria may provide incremental prognostic value for allograft function decline with long-term (≥ 60 months) follow-up.

The finding that baseline eGFR and proteinuria are predictors of allograft function decline is not surprising, given that abundant prior investigations have consistently demonstrated that higher proteinuria and lower eGFR are strong predictors of unfavorable outcomes for both native kidneys and kidney transplants [10, 11]. Measuring serum creatinine and proteinuria therefore remains a cost-effective and ubiquitous clinical approach that assists in the clinical management and prognostication of patients with kidney transplants.

A novel finding of the present study is that cortical D and fp, both of which are obtained through IVIM analysis of the DWI signals, are independent risk factors for the decline of allograft function. Several earlier studies have explored the prognostic significance of ADCT for kidney outcome, with conflicting results. For instance, both the study by Sugiyama et al. [12] in a cohort of 91 patients with chronic kidney disease and the study by Berchtold et al. [13] in a mixed cohort of 197 patients with either chronic kidney disease or kidney allograft dysfunction suggested that cortical ADCT did not predict the decline of kidney function. Nonetheless, Srivastava’s group [14] demonstrated that baseline cortical ADCT was associated with change in eGFR over time. Interestingly, in the present study, the association between ADCT and allograft function decline identified in the Kaplan-Meier analysis disappeared in multivariable analysis. We attribute this to the fact that ADCT reflects both true diffusion (D) and the pseudo-diffusion (D*) arising from microcapillary perfusion. Cheng et al. [15] found that D outperformed ADCT in assessing underlying kidney pathologic changes, suggesting that ADCT might be less sensitive than D for evaluating kidney microstructural alterations.

To the best of our knowledge, the significance of IVIM-DWI parameters for predicting kidney outcome has not been previously explored. An increasing number of studies have suggested that both D and fp correlate with kidney pathologic changes, particularly kidney interstitial fibrosis [16]. Cortical D measures water molecule movement that predominantly reflects the extent of interstitial fibrosis. The fp has been shown to be a robust and reproducible measure for assessing tissue perfusion status [17]. Earlier studies have indicated that cortical fp correlates significantly with allograft perfusion and reflects kidney microvessel density [18]. In congruence with our findings, prior investigations demonstrated that cortical D and fp were closely correlated with kidney interstitial fibrosis [19], which is considered a reliable histologic indicator of kidney function deterioration. D* has been noted to suffer from low reproducibility, as indicated by suboptimal inter-reader agreement, and some studies therefore did not report D* results [20].

We observed that the DWI parameters had prognostic capability comparable to that of Model 1 and did not appear to add prognostic value at short-term follow-up. The usefulness of DWI parameters for predicting kidney allograft function deterioration is mostly confined to long-term follow-up. We interpret this finding as suggesting that the clinical utility of multi-b DWI for short-term prognostication is limited, whereas it offers additional benefit for the long-term management of patients with kidney transplants. In line with this, a previous study [21] reported that DWI can help reduce unnecessary allograft biopsies by improving the detection rate of allografts with underlying pathologic alterations. Thus, multi-b DWI could potentially be incorporated into the armamentarium of transplant surgeons and radiologists caring for patients with a kidney transplant during long-term follow-up.

Although we followed a relatively large cohort for an extended period, the limitations of this study must be acknowledged. First, the retrospective nature of this study may introduce selection bias that could confound the results. Consequently, extrapolation of the results to kidney transplant populations with clinical characteristics different from those of our cohort may be difficult. For example, the cause of end-stage kidney disease was unknown in the majority of patients in this cohort, and it remains unclear whether this would affect the prognostic value of DWI. Second, the added value of DWI for kidney transplants with long-term follow-up should be further investigated through cost-effectiveness analysis and comparison with ultrasound, which is currently the most widely used imaging modality for the evaluation of renal allografts.

Conclusion

In conclusion, we have demonstrated that both cortical D and cortical fp were predictive of allograft outcome, independent of baseline eGFR and proteinuria. Additionally, incorporating cortical D and fp into a clinical model with baseline eGFR and proteinuria may add prognostic value for long-term allograft function decline. The cost-effectiveness of integrating multi-b DWI to predict kidney function decline should be validated in the future prior to its clinical application.