Introduction

In this article, I focus on the handling and analysis of human leukocyte antigen (HLA) data collected using the Transplant Registry Unified Management Program (TRUMP) and explain several considerations in the statistical analyses specific to hematopoietic stem cell transplantation.

There are several important issues in the analysis of TRUMP data. HLA information for each locus was collected as free-text entries by physicians or data managers, resulting in errors in HLA matching counts that need correction. Mismatch direction should be considered when the biological significance of an HLA mismatch is analyzed. The counting method and the impact of HLA mismatches differ according to the stem cell source. These differences should be considered in the analysis of HLA data obtained using TRUMP.

HLA information in TRUMP data

Information on the HLA loci of patients and donors was collected as free text using the pre-TRUMP and TRUMP version 1 questionnaire forms. The number of HLA matches is calculated automatically by checking whether the characters entered for each HLA locus of the patient are identical to those entered for the donor. Therefore, a locus is counted as a mismatch whenever the HLA information for either the patient or the donor is not entered in the proper form. For example, donor “2402” and recipient “A2402” at the HLA-A locus are considered a mismatch. Further, “2402” typed in a Japanese font is considered different from “2402” typed in an English font. To correct these minor but significant errors and provide accurate HLA matching data, we developed an HLA script for the analysis of the TRUMP dataset. The script is available on the webpage of the Japan Society for Hematopoietic Cell Transplantation (JSHCT), with access limited to JSHCT members (http://www.jshct.com/memdir/download/wg.shtml). Further, for the HLA-A, -B, -C, and -DRB1 loci in unrelated bone marrow transplantation, the HLA script replaces the original data with the retyped HLA data provided by the research group of Dr. Morishima [1]. HLA information provided by JCBBN is also incorporated in the HLA script. This makes the HLA data more accurate, particularly for data collected in earlier years. If both 2-digit entries (i.e., antigen data) for a locus are blank but 4-digit information (i.e., allele data) is available, the 2-digit data are filled in from the 4-digit information. When one of the two 2-digit or 4-digit entries at a locus is missing, there are two possibilities: the locus is homozygous, or the data are missing. In the HLA script, we treat these entries as missing data and exclude them from the analysis because we are unable to determine which is the case. Using the HLA script substantially improves the accuracy of the number of HLA mismatches.
For example, the number of patients with 0, 1, and 2 HLA-A 2-digit mismatches in the graft-versus-host (GVH) direction is 38,203, 5516, and 491 before the use of the HLA script and 39,660, 4919, and 71 after its use, respectively. It should be noted that in several cases the 2-digit and 4-digit data contradict each other. These should be managed in each study if necessary. In TRUMP version 2, which was started in 2015, HLA information must be selected from a pull-down menu, which minimizes the risk of improper input.
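The input errors described above can be illustrated with a minimal Python sketch. This is not the JSHCT HLA script; the function name `normalize_hla` and its rules are assumptions made for illustration only. It assumes that "Japanese font" entries are full-width Unicode characters, which NFKC normalization folds into their ASCII equivalents, and that an optional locus prefix such as "A" or "DRB1*" can be dropped before the digits are compared.

```python
import re
import unicodedata


def normalize_hla(raw, locus):
    """Return the digit string of a free-text HLA entry, or None if blank.

    Hypothetical helper for illustration; not part of the JSHCT HLA script.
    """
    if raw is None:
        return None
    # NFKC folds full-width characters (assumed "Japanese font") into ASCII.
    text = unicodedata.normalize("NFKC", raw).upper()
    text = re.sub(r"\s+", "", text)
    # Drop an optional "HLA-" prefix and the locus name itself (e.g. "A", "DRB1*").
    text = re.sub(r"^(HLA-)?" + re.escape(locus.upper()) + r"\*?", "", text)
    # Keep digits only, discarding separators such as ":".
    digits = re.sub(r"\D", "", text)
    return digits or None


# A naive string comparison calls this pair a mismatch; after normalization
# the donor and recipient entries are identical.
donor, recipient = "2402", "A2402"
naive_match = donor == recipient
normalized_match = normalize_hla(donor, "A") == normalize_hla(recipient, "A")
```

With this sketch, `naive_match` is false while `normalized_match` is true, which is the kind of correction that changes the mismatch counts reported above. Blank entries are returned as `None` so that they can be handled as missing data rather than as mismatches.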

HLA counting method

The number of HLA mismatches between patients and donors is typically counted as a total without considering the HLA mismatch direction. However, the effect of the immune reaction caused by an HLA mismatch differs according to whether the mismatch is in the GVH or host-versus-graft (HVG) direction. A mismatched locus in the GVH direction may be a major target for donor T cells and can cause graft-versus-host disease (GVHD), whereas a mismatched locus in the HVG direction may be a major target for the remaining recipient T cells and can lead to graft rejection. Therefore, from a biological perspective, the impact of HLA mismatch should be discussed separately according to the mismatch direction. The risk of GVHD should be evaluated with HLA mismatches in the GVH direction, whereas engraftment should be evaluated with those in the HVG direction. The risk of overall mortality may be evaluated with those in the GVH and/or HVG direction, depending on the study objective.

Examples of HLA 2-digit (antigen) or 4-digit (allele) mismatches at the HLA-A locus are shown in Table 1. An HLA mismatch is considered to be in the GVH direction when the recipient’s antigens or alleles (Cases 2 and 4, A*24:20) are not shared with the donor (Case 2, A*24:02; Case 4, A*24:02, A*24:07). An HLA mismatch is considered to be in the HVG direction when the donor’s antigens or alleles (Case 3, A*24:20; Case 4, A*24:07) are not shared with the recipient (Case 3, A*24:02; Case 4, A*24:02, A*24:20). The total number of HLA mismatches can be counted in two ways. If we focus on the number of mismatches in the GVH or HVG direction, the total number of mismatches in the GVH direction and that in the HVG direction should each be counted, and the larger of the two should be selected (Table 2). However, if we focus on the number of mismatched loci, the number of mismatched loci should be counted regardless of the mismatch direction. In most cases, there is no discrepancy between these two counting methods (cases 9 and 10 in Table 2). However, in case 11 in Table 2, one locus is mismatched only in the GVH direction and another locus is mismatched only in the HVG direction, and thus the two counts differ. The method should be chosen according to the study objective and design. In the HLA script, we offer variables for each method, such as .HLA.Geno.mis8/genomis8abcdr versus .HLA.Geno.mis8.2/genomis8abcdr2 (Table 3). In case 11, the value of .HLA.Geno.mis8/genomis8abcdr is 1 and that of .HLA.Geno.mis8.2/genomis8abcdr2 is 2.
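The two counting methods can be sketched in Python as follows. This is an illustration under stated assumptions, not the logic of the HLA script: the function names are hypothetical, distinct alleles are compared as sets (so a homozygous locus contributes one allele), and the locus-based count is taken as the sum over loci of the larger directional count at each locus. The HLA-B, -C, and -DRB1 alleles in the example are invented to mimic the pattern described for case 11, with one locus mismatched only in the GVH direction and another only in the HVG direction.

```python
def directional_mismatches(recipient, donor):
    """Return (gvh, hvg) mismatch counts at one locus.

    GVH: recipient alleles not shared with the donor.
    HVG: donor alleles not shared with the recipient.
    Assumption: alleles are compared as sets of distinct values.
    """
    r, d = set(recipient), set(donor)
    return len(r - d), len(d - r)


def count_mismatches(typing):
    """typing maps locus -> (recipient_pair, donor_pair)."""
    per_locus = [directional_mismatches(r, d) for r, d in typing.values()]
    gvh = sum(g for g, _ in per_locus)
    hvg = sum(h for _, h in per_locus)
    return {
        "gvh": gvh,
        "hvg": hvg,
        # Method 1: larger of the GVH total and the HVG total.
        "by_direction": max(gvh, hvg),
        # Method 2 (assumed reading): count per locus, regardless of direction.
        "by_locus": sum(max(g, h) for g, h in per_locus),
    }


# Illustrative pair resembling case 11 (alleles other than HLA-A are invented).
case_11_like = {
    "A": (("24:02", "24:20"), ("24:02", "24:02")),     # GVH direction only
    "B": (("52:01", "52:01"), ("52:01", "40:01")),     # HVG direction only
    "C": (("12:02", "12:02"), ("12:02", "12:02")),     # matched
    "DRB1": (("15:02", "09:01"), ("15:02", "09:01")),  # matched
}
result = count_mismatches(case_11_like)
```

For this pair the direction-based count is 1 and the locus-based count is 2, matching the values given in the text for case 11.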

Table 1 Examples of the number of mismatches in the GVH and HVG directions at the HLA-A locus
Table 2 Examples of the number of mismatches at the HLA-A, -B, -C, and -DRB1 loci
Table 3 Variables related to HLA data

Impact of HLA mismatch according to stem cell sources

In related transplantation, the presence of an HLA antigen mismatch in the GVH direction was associated with a higher incidence of GVHD compared to an HLA mismatch in the HVG direction [2, 3]. In contrast, the presence of an HLA antigen mismatch in the HVG direction was associated with a higher incidence of graft failure than the HLA match [3, 4]. In recent analyses of unrelated transplantation, a one-allele mismatch (HLA-A, -B, -C, -DRB1) only in the GVH direction, but not a one-allele mismatch only in the HVG direction, was associated with a higher incidence of grades III–IV acute GVHD compared with the HLA match. In contrast, an allele mismatch in either the GVH or HVG direction was not associated with neutrophil engraftment [5, 6]. This difference between related and unrelated transplantation may be partly explained by the more frequent 2-digit (antigen) mismatches in related transplantation and by improvements in the conditioning regimen and GVHD prophylaxis. Recent studies revealed that a high titer of donor-specific HLA antibody was associated with graft failure, suggesting that multiple 2-digit mismatches in HLA-mismatched related transplantation may increase the risk of graft failure unless donor-specific HLA antibodies are examined [7–10]. Avoiding a donor against whom the patient has a donor-specific HLA antibody would improve the rate of engraftment. These findings in both related and unrelated transplantation indicate the importance of HLA mismatch direction for interpreting clinical outcomes.

Difference in HLA matching between Western countries and Japan

In Japan, HLA matching is counted at the 2-digit level in unrelated cord blood transplantation (UCBT), and up to 2 mismatches by this counting method are allowed for UCB unit selection [11]. In Europe and the U.S., HLA matching is generally counted at the 2-digit level for the HLA-A and HLA-B loci and at the 4-digit level for the HLA-DRB1 locus. However, there is no robust evidence to support counting the HLA-DRB1 locus at the allele level. We previously analyzed the difference between the impacts of 2-digit and 4-digit level mismatches at the HLA-DRB1 locus [11, 12] and found no significant difference in impact between these mismatches. More importantly, the impact of HLA mismatch was very small or negligible in adult patients who received UCBT in Japan [12]. Although there was no significant difference between the impacts of HLA-DRB1 antigen and allele mismatch, it is important for researchers to determine which HLA matching method they will use before they begin to analyze transplant outcomes in UCBT. To directly compare outcomes of UCBT between studies conducted in Europe, the U.S., and Japan, researchers may follow the counting method employed in Europe and the U.S. However, for clinical practice in Japan, the results are more easily interpreted if the Japanese counting method is used.

In most CIBMTR studies analyzing unrelated bone marrow or peripheral blood stem cell transplantation, the impact of HLA mismatches at the HLA-A, -B, -C, and -DRB1 loci was evaluated regardless of mismatch levels, including the 2-digit or 4-digit levels [13, 14]. However, the impact of HLA allele mismatches was evaluated among 2-digit level matched pairs for the HLA-A, -B, and -DR loci in Japan, following the standard donor selection process of the Japan Marrow Donor Program, as such a donor can be found for more than 90 % of the patients in Japan [11, 15]. Therefore, for the HLA-C mismatch, 80–90 % of HLA-C allele mismatches were at the antigen level in the study. To directly compare the impact of HLA allele mismatch between Japanese studies, it may be better to include only 2-digit level matched pairs for the HLA-A, -B, and -DR loci.

Statistical analyses specific for hematopoietic stem cell transplantation

Survival analysis is the most frequently used analysis method in the field of hematopoietic stem cell transplantation as well as in other hematological and solid organ malignancies. However, since the incidence of transplant-related mortality is not negligible, specific consideration is needed when calculating the cumulative incidence of post-transplant events, such as relapse. Further, analysis of the effect of post-transplant events, such as GVHD, on subsequent transplant outcomes requires specialized statistical techniques and consideration. These statistical analyses have also been reviewed in other articles [16–18].

Time-to-event analysis

Time-to-event analysis, or survival analysis, analyzes the time from a defined starting point until a target event occurs. In time-to-event analysis in hematopoietic stem cell transplantation, the Kaplan–Meier method is used to estimate overall survival, disease-free survival, or progression-free survival rates. In this analysis, an event is defined as death for overall survival, death or relapse for disease-free survival, and death, relapse, or progression for progression-free survival (Tables 4, 5). Since the follow-up time for patients without an event is variable, these patients are treated as censored at the last follow-up. The log-rank test is used to evaluate the overall differences among groups, and the Cox proportional hazards model is used for univariate and multivariate analyses.
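How censoring enters the Kaplan–Meier estimate can be shown with a short, dependency-free Python sketch. The function name and the toy data are assumptions for illustration; in practice an established statistical package should be used. At each distinct event time the survival probability is multiplied by one minus the ratio of events to patients still at risk, and censored patients simply leave the risk set without changing the estimate.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        n_at_risk -= n_leaving
    return curve


# Toy data: five patients, two of whom are censored (events == 0).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 0, 1, 1, 0])
```

For these five hypothetical patients the estimated survival is approximately 0.8 after time 2, 0.6 after time 3, and 0.3 after time 5; the patient censored at time 8 does not lower the curve.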

Table 4 Event and competing event used in the analysis of hematopoietic stem cell transplantation
Table 5 Variables frequently used in the analysis of hematopoietic stem cell transplantation

Competing event

A competing event is defined as an event that cannot occur together with the target event (i.e., a mutually exclusive event). If relapse is defined as the event, death without relapse is defined as the competing event. In this situation, there are three possible states for a patient: relapse, death without relapse, and alive without relapse. The sum of the incidence of relapse, the incidence of non-relapse mortality (death without relapse), and the probability of disease-free survival (alive without relapse) should be 100 %. If the Kaplan–Meier method is used to calculate the incidence of relapse, that is, as 1 minus the Kaplan–Meier probability, the incidence of relapse is overestimated, because patients who die early after transplantation before relapse are censored and excluded from the patients at risk. This means that the sum of the incidence of relapse, the incidence of non-relapse mortality, and the probability of disease-free survival will be greater than 100 %, which is incorrect. Therefore, cumulative incidences should be calculated using the cumulative incidence function to account for competing risks [19]. As shown in Tables 4 and 5, we defined variables for an event and a corresponding competing risk event for neutrophil and platelet engraftment, acute GVHD, chronic GVHD, relapse, and non-relapse mortality. The definition of competing risk and the eligibility criteria for the analysis should be determined according to the study design. Gray’s test is used to evaluate overall differences among cumulative incidence functions [20]. The Fine and Gray proportional hazards model is used for univariate and multivariate analyses [21]. The log-rank test and the Cox proportional hazards model are also acceptable [22].
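The overestimation described above can be checked numerically with a small Python sketch. The function names, the status coding (0 = censored, 1 = relapse, 2 = death without relapse), and the six-patient data set are all assumptions for illustration. The cumulative incidence is computed with the standard nonparametric estimator, in which the probability of being event-free just before each time is multiplied by the cause-specific event fraction at that time; the naive estimate instead treats the competing event as censoring.

```python
def _risk_sets(times, status):
    """Yield (n_at_risk, statuses) for each distinct time in ascending order."""
    data = sorted(zip(times, status))
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j][0] == data[i][0]:
            j += 1
        yield len(data) - i, [s for _, s in data[i:j]]
        i = j


def one_minus_km(times, status, cause):
    """Naive estimate: every status other than `cause` is treated as censored."""
    survival = 1.0
    for n_at_risk, tied in _risk_sets(times, status):
        survival *= 1 - sum(s == cause for s in tied) / n_at_risk
    return 1 - survival


def cumulative_incidence(times, status, cause):
    """Cumulative incidence of `cause` at the end of follow-up."""
    event_free = 1.0  # probability of no event of any type so far
    cif = 0.0
    for n_at_risk, tied in _risk_sets(times, status):
        cif += event_free * sum(s == cause for s in tied) / n_at_risk
        event_free *= 1 - sum(s != 0 for s in tied) / n_at_risk
    return cif


# Toy data: 0 = censored, 1 = relapse, 2 = death without relapse.
times = [1, 2, 3, 4, 5, 6]
status = [2, 1, 2, 1, 1, 0]

cif_relapse = cumulative_incidence(times, status, 1)
cif_nrm = cumulative_incidence(times, status, 2)
# Disease-free survival: any event counts as failure.
dfs = 1 - one_minus_km(times, [1 if s else 0 for s in status], 1)
naive_relapse = one_minus_km(times, status, 1)
naive_nrm = one_minus_km(times, status, 2)
```

In this toy data set the cumulative incidences of relapse and non-relapse mortality are 0.50 and about 0.33, and together with a disease-free survival of about 0.17 they sum to 1. The naive estimates are about 0.73 and 0.375, so the corresponding sum is about 1.28.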

Landmark analysis and time-dependent covariate

When we analyze the effect of post-transplant events, such as GVHD, on transplant outcomes, we cannot know at the start of observation, such as at the time of transplantation, whether GVHD will occur. If the occurrence of GVHD is treated as a time-fixed variable, patients with GVHD must, by definition, have survived at least until the day GVHD occurred, which biases the analysis towards showing a survival advantage for patients with GVHD. In this situation, the starting time of observation should be moved to a specific post-transplant time point (i.e., landmark analysis) or the occurrence of GVHD should be treated as a time-dependent covariate. In landmark analysis of acute GVHD, the occurrence of acute GVHD by a specific post-transplant time point is regarded as a time-fixed variable. In 90 % of patients, acute GVHD occurs by day 60 after transplantation. If the landmark day is set at day 60 after transplantation, patients who have or have not experienced acute GVHD by day 60 are categorized into the GVHD group or the no GVHD group, respectively. Patients who develop acute GVHD more than 60 days after transplantation remain in the no GVHD group and are not included in the GVHD group. Patients who have had a target event by day 60 should be excluded from the analysis. The results may change according to the landmark day. If the landmark day is set to an earlier day, the number of patients with GVHD will decrease, while if it is set to a later day, the total number of patients analyzed will decrease. Landmark analysis may be repeated with various landmark days in order to test the robustness of the results.
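The grouping rules for landmark analysis can be written as a short Python sketch. The record layout (`followup_day`, `event`, `gvhd_day`) and the function name are hypothetical. The sketch excludes every patient whose follow-up ends on or before the landmark day; the text asks for exclusion of patients with a target event by that day, and excluding patients censored before the landmark as well is an additional assumption made here, since they are no longer at risk at the landmark.

```python
def landmark_dataset(patients, landmark_day=60):
    """Build the analysis set for a landmark analysis of acute GVHD.

    Each patient is a dict with (hypothetical) keys:
      id, followup_day (day of target event or last contact),
      event (1 = target event, 0 = censored),
      gvhd_day (day of acute GVHD onset, or None).
    """
    rows = []
    for p in patients:
        # Not at risk at the landmark: event or censoring on or before it.
        if p["followup_day"] <= landmark_day:
            continue
        gvhd_day = p["gvhd_day"]
        # GVHD after the landmark day stays in the "no GVHD" group.
        in_gvhd_group = gvhd_day is not None and gvhd_day <= landmark_day
        rows.append({
            "id": p["id"],
            "group": "GVHD" if in_gvhd_group else "no GVHD",
            # The clock restarts at the landmark day.
            "time_from_landmark": p["followup_day"] - landmark_day,
            "event": p["event"],
        })
    return rows


patients = [
    {"id": 1, "followup_day": 30, "event": 1, "gvhd_day": 20},     # excluded
    {"id": 2, "followup_day": 200, "event": 1, "gvhd_day": 25},    # GVHD
    {"id": 3, "followup_day": 300, "event": 0, "gvhd_day": 90},    # no GVHD
    {"id": 4, "followup_day": 150, "event": 0, "gvhd_day": None},  # no GVHD
]
day60 = landmark_dataset(patients, landmark_day=60)
day100 = landmark_dataset(patients, landmark_day=100)
```

Calling the function with different values of `landmark_day`, as in the last two lines, is one way to examine how sensitive the grouping is to the chosen landmark: patient 3 is in the no GVHD group at day 60 but in the GVHD group at day 100.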

In regression analysis, a variable that changes over time can be incorporated in the model by treating it as a time-dependent covariate. In the case of GVHD, the variable is 0 from the time of transplantation until GVHD occurs, and becomes 1 after GVHD occurrence. An example of a Stata script analyzing a time-dependent covariate is shown in Fig. 1. How to analyze a time-dependent covariate using EZR/R is shown in another paper [23].
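The data layout behind a time-dependent covariate can be sketched in Python. This is not the Stata script in Fig. 1; the function name and the tuple layout are assumptions for illustration. Each patient's follow-up is split at the day of GVHD onset into intervals in which the covariate is constant, and the outcome is attached only to the last interval.

```python
def split_at_gvhd(followup_day, died, gvhd_day=None):
    """Return rows of (start, stop, gvhd, event) for one patient.

    followup_day: day of death or last contact after transplantation
    died:         1 if the patient died at followup_day, 0 if censored
    gvhd_day:     day of GVHD onset, or None if GVHD never occurred
    """
    # No GVHD during follow-up: a single interval with the covariate at 0.
    if gvhd_day is None or gvhd_day >= followup_day:
        return [(0, followup_day, 0, died)]
    # GVHD present from the start: a single interval with the covariate at 1.
    if gvhd_day <= 0:
        return [(0, followup_day, 1, died)]
    # Otherwise split: covariate 0 before onset, 1 afterwards.
    return [
        (0, gvhd_day, 0, 0),
        (gvhd_day, followup_day, 1, died),
    ]


rows_with_gvhd = split_at_gvhd(followup_day=200, died=1, gvhd_day=30)
rows_without_gvhd = split_at_gvhd(followup_day=100, died=0)
```

A patient who develops GVHD on day 30 and dies on day 200 thus contributes 30 days to the no GVHD state and 170 days to the GVHD state, rather than 200 days to the GVHD state. Rows in this form can be passed to a Cox regression routine that accepts start and stop times.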

Fig. 1 Example of Stata script analyzing the impact of grades 2–4 acute GVHD on overall survival

Conclusions

Since the processes used to collect transplant information are complicated and have changed over time, the background of the data collection process should be clearly understood when these data are used. In particular, use of the script offered by JSHCT is strongly recommended for analyzing HLA data in the JSHCT dataset. Researchers should also understand the statistical analyses specific to hematopoietic stem cell transplantation in order to analyze transplant data correctly. Researchers can contact the data center of the JSHCT if statistical help is required.