Background

The efficacy of immunosuppressive therapy in kidney transplantation has steadily increased over the last decade. As a consequence, the incidence of acute rejection (AR) episodes has decreased and short-term graft survival rates have improved [1, 2]. However, long-term transplant outcomes are still poor and episodes of AR are known to significantly exacerbate long-term outcomes [2, 3]. AR is associated with long-term complications, such as graft dysfunction and reduced graft survival and AR prevention continues to be a main focus in the design of new therapeutic strategies for renal transplantation [4,5,6,7]. The most common form of AR is acute cellular rejection (ACR) [8]. ACR is a T cell cytotoxic immune response against the graft, leading to inflammatory cell infiltration with tubulitis and, eventually, damage of the donor tissue [9, 10]. The positive outcome of ACR if treated early, as well as its potentially irreversible damage, render it particularly relevant for prevention research [10, 11]. Regarding non-invasive diagnostics, a number of studies have obtained good results using tissue, blood or urine markers [11,12,13,14,15,16,17,18]. For early risk assessment, the large majority of models are donor-dependent, as they either employ measurements from the early post-transplantation period or utilize donor-derived data (e.g. from crossmatch tests) [19,20,21,22,23,24,25,26,27,28,29,30]. The most common approach for pre-transplant risk assessment relies on the characterization of HLA antibodies in recipient serum samples by solid phase single HLA antigen bead (SAB) assay [24,25,26,27,28,29, 31]. The assay facilitates detection and identification of anti-HLA antibody specificities and provides a method for monitoring the development of donor-specific antibodies (DSA). The detection of DSA through SAB assays is a well-established method for antibody-mediated rejection (ABMR) pre-transplantation risk assessment, but not for ACR [24,25,26,27,28,29,30, 32].

Approaches for risk assessment of ACR do not employ DSA for the prediction – as both patients with or without DSA experience episodes of ACR – but other risk markers, such as soluble CD30 levels or panel of reactive T cells [23, 33,34,35]. However, the inspection of SAB serum antibody reactivity profiles (irrespective of DSA status) may provide a means to an ACR risk assessment tool for two reasons: (1) serum antibody binding profiles against antigen/protein libraries are generally powerful in discriminating between different health or disease conditions [36,37,38,39], and (2) antibody-mediated mechanisms have been shown to be involved in the T cell-mediated initiation, perpetuation, and progression of graft injury [40, 41].

In this work, as part of an exploratory study, we present a classifier achieving high-accuracy pre-transplant risk assessment of ACR. Remarkably, this classifier is based on continuous non-thresholded HLA 1 SAB data and does not rely on donor-specific HLA typing.

Results

Characteristics of the graft recipients included in the study

Pre-transplant HLA assay data were retrospectively analyzed as part of a systems medicine approach towards early risk assessment of ACR [42, 43]. The investigated study group comprised all kidney transplant recipients enrolled in the Harmony trial (N = 615) who experienced at least one ACR or borderline ACR event in the first year (N = 77) and all transplant recipients who experienced no serious adverse events (N = 80). Median time to the first ACR event was 20.5 days (range = 4–373 days) (Additional file 1: Figure S1). Demographics and clinical characteristics of the study groups are summarized in Additional file 4: Table S1.

Pre-transplantation HLA-1 and HLA-2 MAB data were available for N = 63 recipients of the ACR group and N = 54 recipients of the control group (for demographic and clinical characteristics, see Additional file 5: Table S2). Additionally, HLA-1 SAB data was measured for all those patients who tested positive for HLA-1 MAB screening (21 ACR + 13 control) and a random subset of patients who tested negative (13 ACR + 5 control). In total, pre-transplantation HLA-1 SAB data were available for N = 34 recipients of the ACR group and N = 18 of the control group. Due to the higher sensitivity of SAB assay compared to MAB, the former assay was considered a better candidate for ACR risk assessment.

Demographic and clinical characteristics of the SAB ACR (N = 34) and the SAB control group (N = 18) were compared and are summarized in Table 1. The majority of patients was male, received their first kidney transplant and had a deceased donor. There were no significant differences between the study groups for the above mentioned characteristics as well as immunosuppressive therapy. However, the mean age of patients in the ACR group (54.9 ± 11.0) was significantly higher than for the control group (51.6 ± 11.6; p = 0.04; Mann-Whitney U test) as was the body mass index (27.7 ± 5.4 vs. 24.2 ± 4.2; p = 0.01; Mann-Whitney U test). With respect to HLA mismatches, a significant difference was found for HLA-DR (p = 0.03; Pearson’s chi-squared test), with an elevated frequency of patients with two mismatches in the ACR group (32.4% vs. 11.1%). No significant differences were found for HLA-A or HLA-B. There were no significant differences regarding PRA between the groups. There was a near-significant difference in cold ischemia time with longer times being observed for the ACR group (739 ± 295 vs. 637 ± 302; P = 0.06; Mann-Whitney U test).

Table 1 Characteristics and medication details for the subset of patients included in the HLA class 1 SAB data seta

Conventional HLA SAB data analysis does not permit pre-transplant risk assessment of ACR

To assess whether HLA-1 SAB reactivity profiles provide a means for ACR risk assessment, we initially applied the conventional data analysis approach used in HLA-diagnostics to our data set. Central to this approach is the conversion of the quantitative SAB assay read-out data into qualitative binary data (1 = presence of antibody-antigen reactivity, 0 = absence of reactivity) based on a mean fluorescence intensity (MFI) threshold. We performed all analyses for a fixed threshold of 1000 MFI and an individually adjusted threshold in the range 253–1068 MFI (Additional file 2: Figure S2). In both cases, there were no statistically significant differences between the SAB ACR and the SAB control group in any of the individual reactivities (Additional file 6: Table S3 and Additional file 7: Table S4). That is, there are no individual HLA-1 antibody reactivities that allow for risk assessment of ACR.

To assess whether there is a combination of reactivities that allows for risk assessment of ACR, we extended the conventional approach by applying a support vector machine-based multiparameter classification method to the binarized data (for details, see Material and Methods). The resulting multiparameter classifiers did not achieve significant classification performance (p > 0.1, Table 2). Taken together, our results indicate that the conventional HLA SAB data analysis approach does not permit pre-transplant risk assessment of ACR.

Table 2 Multiparameter pre-transplant prediction of ACR

A novel approach built on multiparameter classification and quantitative data input allows for high accuracy pre-transplant prediction of ACR

In spite of the widespread use of HLA SAB assays, the interpretation of results obtained following the conventional data analysis approach remains controversial [44]. A strict MFI threshold consistently identifying clinically relevant antibody-antigen reactivities is challenging to define [45]. Since it is likely that the choice of MFI threshold compromises classification efforts, we applied a novel approach to the HLA-1 SAB data set that does not rely on MFI thresholding. Key to this novel approach is the rank-normalization of the continuous SAB assay read-out data. Remarkably, a support-vector machine-based multiparameter classificator built on these data achieved highly significant prediction performance with a balanced accuracy of 82.7% (Sens. = 76.5%, Spec. = 88.9%, p = 0.002, Fig. 1a and Table 2). Receiver operating characteristic (ROC) analysis further emphasizes that the prediction performance was better than a random guess (area under the curve [AUC] = 0.86) and illustrates the trade-off between the probability of correctly predicting ACR (true positive rate, sensitivity) and the probability of incorrectly predicting ACR (false positive rate, specificity) (Fig. 1b). Importantly, we found that prediction performance was independent of a patient’s MAB screening test result, as patients who tested positive or negative for HLA-1 antibodies are predicted equally well (Fig. 1a). Moreover, the performance of the continuous data classifier was not due to age, BMI or HLA-DR mismatch frequency as confounding factors; significant classification was not achieved when HLA class 1 SAB continuous data were grouped according to either of those factors (≤50 y vs. > 50 y, ≤25 BMI vs. > 25 BMI, or no-mismatch vs. 1–2 mismatches). In addition, median-centered bead MFIs did not show any association with age or BMI (mean Spearman correlation coefficient r = 0.019 ± 0.129 and r = 0.009 ± 0.133, respectively). Taken together, our results show that continuous, rank-normalized HLA-1 SAB reactivity profiles provide a means of high-accuracy risk assessment of pre-transplant ACR.

Fig. 1
figure 1

Predictive performance of the multiparameter ACR risk assessment classifier based on rank-normalized continuous pre-transplant HLA-1 antibody reactivity profiles. a Output of the classifiers decision function for each patient. The decision threshold is indicated by a dashed horizontal line. Patients with a decision value > 0 are classified as ACR, patients with a decision value < 0 are classified as control. Colors indicate whether patients tested positive (black) or negative (grey) for the presence of serum HLA-1 antibodies during MAB screening. b ROC curve of the multiparameter classifier

Diagnostics based on HLA antibody detection assays may generally benefit from the novel approach

The fact that continuous HLA-1 SAB reactivity data outperformed MFI-thresholded binary data in terms of pre-transplant prediction of ACR (Table 2) led us to the conjecture that the conventional approach entails a loss of information that may compromise HLA-diagnostics classification efforts in general. To substantiate this claim, we performed additional analyses on the MAB screening data (63 ACR + 54 controls). Conventional MFI-threshold based data analysis revealed no statistical differences between the two study group as to the prevalence of HLA class 1 and/or HLA class 2 antibodies (Fig. 2). A multiparameter classifier based on the continuous rank-normalized data, however, achieved statistically significant prediction of the patients who experience ACR (p = 0.04, Table 2 and Additional file 3: Figure S3). Even though the accuracy of the classifier was low and not sufficient for routine risk assessment (BACC = 63.9%, Sens. = 55.6%, Spec. = 72.2%), the fact that it was significant emphasizes that the use of continuous non-thresholded antigen bead assay data favorably affects classification performance.

Fig. 2
figure 2

Conventional MFI-thresholded binary MAB screening data do not allow for pre-transplant risk assessment of ACR. Illustrated are the results of the MAB screening data of the cohort (117 graft recipients, 63 ACR + 54 controls; for demographics and clinical characteristics, see Additional file 5: Table S2).). Analyses were carried out on MFI-thresholded binary HLA MAB screening data (conventional approach); according to Fisher’s exact test, differences with respect to the prevalence of HLA antibodies are not significant (p > 0.05)

Discussion

The current study shows that pre-transplant HLA class 1 SAB signatures predict the risk of acute cellular rejection (ACR) with high accuracy. Importantly, it demonstrates that HLA antibody signatures contain information on cell-mediated events to come.

In contrast to the vast majority of existing pre-transplant risk assessment models [24,25,26,27,28,29, 31], our model does not rely on DSA reactivity data. The key advantages of this approach are that i) it facilitates risk assessment for non-sensitized patients lacking DSA and ii) it can be carried out independently of donor assignment.

HLA SAB data usually feed into prediction models in the form of binary data derived from MFI thresholding – the focus lying on strong binding interactions. Strikingly, our study emphasizes that such an approach entails a loss of information and ultimately results in loss of or suboptimal prediction performance. The fact that continuous HLA reactivity data outperform thresholded binary data (Table 2) indicates that weak binding interactions hold high-value information for risk assessment of ACR. This is further emphasized by our finding that our risk assessment tool performs equally well for patients who tested positive or negative for the presence of HLA-1 antibodies during MAB screening. There is indeed sufficient evidence in the literature to show that weak binding events are of great importance to biological systems, e.g. protein-peptide interactions [46], virus-cell interactions [47], cell adhesion, and cell-cell interactions [48,49,50,51]. Our data suggest that HLA SAB based diagnostics will profit from inclusion of weak interactions by feeding prediction models with non-thresholded continuous data. A further advantage of prediction models based on non-thresholded MFI data is that they are not affected by the prevailing uncertainties regarding the right choice of the threshold MFI level and by the yet missing internationally agreed standards [44, 52].

But why is the pre-transplant signature of serum antibodies against HLA-1 SAB predictive for the risk of T cell mediated rejection? There is evidence in the literature for an association between anti-HLA serum antibodies and ACR. Crosslinking of HLA-1 antigens expressed on the surface of donor cells by HLA class 1 antibodies has been shown to trigger the classical complement pathway through binding of C1q. The subsequent release of the complement peptides C3a and C5a then leads to enhanced allo-T cell responses and leucocyte recruitment [53]. That is, pre-transplant HLA class 1 antibodies may be involved in the initiation and perpetuation of ACR by boosting adaptive T cell activities after graft transfer. HLA class 1 antibodies have also been shown to be directly involved in mechanisms that cause severe graft injury such as endothelial cell activation or NK cell related FcγR-dependent processes [53]. However, these mechanisms usually result in histological manifestations strongly associated with ABMR [53]. Since our study cohort did not show such manifestations, these processes are unlikely to be relevant to the predictive value of our pre-transplant HLA 1 antibody signatures.

The HLA MAB and SAB data used in this study are part of a large multi-parameter database set up by the e:KID consortium that seeks to establish a systems medicine approach to personalized immunosuppressive treatment at an early stage after kidney transplantation (http://www.sys-med.de/en/consortia/ekid/). e:KID recorded a total of 478 parameters including, among others, gene expression, cytokine profile, epigenetics, metabolomics and viral load data as well as common clinical variables such as renal function or acute phase proteins. Evaluation of clinical parameters failed to identify any markers or combinations thereof which are predictive of ACR [4]. Additionally, no other single parameter or multiparameter set, other than HLA class 1 SAB signatures, achieved high accuracy pre-transplant prediction performance. This emphasizes the vast potential of serum antibodies in diagnostics in general, and, in particular, for diseases where the antigens are unknown.

The comparison of predictive performances between our classifier and classifiers in the literature underlines its relevance to pre-kidney transplant risk assessment (Additional file 8: Table S5). Its accuracy of 82.7% is i) one of the highest among all donor-independent risk assessment models [19, 22, 27, 34, 54], ii) comparable to any AR models [19,20,21,22,23,24, 26,27,28,29,30, 34, 35, 54, 55] and iii) comparable to any SAB data based models for ABMR [24, 26,27,28,29,30]. Furthermore, our classifier is based on SAB, an established diagnostics laboratory tool, thereby facilitating its further use for ACR risk assessment.

Conclusions

Our study establishes a novel tool for pre-transplant risk assessment of acute cellular rejection. Once externally validated, patients classified as high risk by our model will benefit from its implementation through modified immunosuppression as well as closer monitoring leading to earlier detection of rejection onset and initiation of treatment. Consequently, the prognosis and survival rate of the graft will improve.

Methods

Study aim

The aim of this study is to determine whether SAB serum antibody reactivity profiles of renal transplantation recipients can be used for the prediction of ACR during the first post-transplantation year. For this goal, pre-transplant HLA assay data were retrospectively analyzed as part of a systems medicine approach.

Patient population and monitoring

Six hundred fifteen adult kidney transplant recipients were enrolled in the randomized, multicenter diagnostic trial Harmony (EudraCT-Nr. 2007–006516-31) [4]. Patients were treated with a quadruple (arm A) or triple (arms B and C) immunosuppressive therapy as described before [4]. The immunosuppressive therapy included induction with either monoclonal IL-2R antibody basiliximab (arms A and B) (Simulect®, Novartis) or rabbit ATG (arm C) (Thymoglobulin®, Sanofi). Maintenance immunosuppression consisted of tacrolimus (Advagraf®, Astellas) and mycophenolate mofetil (MMF) with (arm A) or without steroids (arms B and C) [4]. All transplantations were of low immunological risk, with recipient PRA scores ≤30% and no detectable DSA prior to transplantation (complement-dependent cytotoxicity crossmatch) [4]. Further inclusion and exclusion criteria can be found in Thomusch et al. [4]. Suspected episodes of acute rejection were confirmed through biopsy according to the Banff criteria of 2005 [56]. For the e:KID project, which aims at early risk assessment of ACR by following a systems medicine approach [42, 43], 157 recipients were retrospectively monitored for the presence of HLA antibodies in blood serum on day 0 (pre-transplantation). All patients who experienced ACR (borderline or Banff class 1 or higher) in the first year were assigned to the ACR group (N = 77). The control group included all patients who neither experienced a rejection episode nor other serious adverse events (N = 80).

HLA antibody detection by Luminex multiplex bead assay

Screening for HLA class 1 and class 2 antibodies was performed using a MAB assay (LABScreen® Mixed Kit, One Lambda, Canoga Park, CA, USA). All sera that tested positive and a random subset of negative sera were subject to SAB assays to identify antibody specificities (LABScreen Single Antigen HLA Class I kit and/or LABScreen Single Antigen HLA Class II kit, One Lambda). Both MAB and SAB were performed according to the manufacturer’s instructions. Briefly, following heat-inactivation at 56 °C for 30 min and clearance from debris (0.22 μm filter), 20 μl of undiluted serum was added to 3 μl of the LABScreen bead mix and incubated for 30 min in the dark at room temperature. After a washing step in 1x LABScreen wash buffer, the bead mix was incubated with 100 μl of a 1:100 dilution of the PE-conjugated goat anti-human IgG detection antibody for 30 min in the dark at room temperature. After a final washing step in 1x LABScreen wash buffer, data acquisition was performed using a FLEXMAP3D Analyser in combination with xPONENT software version 4.1 (Luminex Corporation, Texas, USA).

Conventional HLA data processing and analysis

Key to the conventional method for HLA data processing is the binarization of the continuous xPONENT median fluorescence intensity (MFI) raw data by means of a MFI threshold (1 = presence of reactivity, 0 = absence of reactivity). For generation of binary data, raw MFI data were normalized to an in-house negative control serum (MAB) or the One Lambda negative serum OLI.NS (SAB). In case of MAB data, a bead was considered positive if its normalized background ratio exceeded 3. For binarization of SAB data, we used both a fixed threshold of 1000 MFI and an individually adjusted MFI threshold. In case of the latter, a bead was considered positive when its baseline normalized MFI exceeded 30% of the MFI of the bead showing the highest strength in reaction. Single parameter prediction performance of binarized HLA class I SAB data was assessed using Fisher’s exact test. In the case of binarized HLA class reactivity screening data (MAB), study groups were compared using Fisher’s exact test.

Experimental design and statistical rationale: novel strategy for HLA data processing and analysis

In this work we applied a novel approach to the HLA data sets that does not rely on MFI- thresholding. Key to it is the use of non-thresholded unprocessed continuous data, that is unprocessed raw MFI data given out by the FLEXMAP3D Analyser in combination with the xPONENT software. In contrast to the conventional approach, reactivity may take any value and is not limited to a binary set of values (reactivity/no reactivity). For downstream analyses these raw data are rank-transformed; they are not processed in any other way or form. To assess the predictive potential of the rank-normalized data, we performed multiparameter classification using an non-public R implementation of the Potential Support Vector Machine (P-SVM) algorithm [57]. An equivalent public release that can be run via command line or MatLab interface is provided under: https://ml.jku.at/software/psvm/. To assess the predictive performance of the classifier, we followed a leave-one-out cross-validation approach. The latter provides a well-established solution to assess a classifier’s predictive performance for high-dimensional, low sample size data sets as ours [58]. To rate predictive performance, we used the statistical measures sensitivity (Sens.), specificity (Spec.) and balanced accuracy (BACC). To assess the statistical significance of the predictive performance, we used random class label-permutation testing. The significance level was set at p < 0.05. Training of the P-SVM classifier was performed using the function psvm(), providing the matrix of training data, the class labels of the training data, the cost parameter C, the regularization parameter ε and the parameter epsitol as arguments. In case of classification of SAB data, we performed grid search for the hyperparameter space ε = {0.25, 0.5, 0.75, 1} and C = {1, 6}; the hyperparameter space for MAB data-based classification was ε = {8, 9, 10, 11} and C = {1, 6}. The parameter epsitol was set to 0.05 for all analyses. To predict the class labels of test data, we called the function predict() with the arguments object and x set to the trained classifier and the test data that is to be predicted, respectively. True classification and p-value estimation were always carried out for the same grid of hyperparameters. To further specify the performance of classificators, receiver operating characteristic (ROC) curve analysis was performed. The numerical scores (decision values) that form the basis of P-SVM class identity label assignment were extracted by setting the argument decision.values of the predict() function to TRUE. After sorting the decision values in increasing order they were used as decision boundaries. For each boundary, both sensitivity and specificity were estimated. AUC was calculated based on Mann-Whitney U statistics [59].

Statistical analyses

To assess whether the two study groups (control/ACR) differed in any of the baseline population characteristics, Mann-Whitney U test (metric variables), Pearson’s chi-squared or Fisher’s exact test (categorical variables) were applied. A p-value < 0.05 was considered statistically significant. Variables are described with mean ± standard deviation or median [interquartile range (IQR)].