1 Introduction/Background

Both the mechanisms driving a long-term cure and the life-threatening side effects after hematopoietic cell transplantation (HCT) are primarily mediated by reconstitution of the immune repertoire. The composition and dynamics of reconstitution are influenced by the conditioning regimen, cell dose, graft composition, age, and type of immune suppression. However, our understanding of these mechanisms is limited because clinical programs vary widely, including in the specific type of transplantation procedure, and because standardized immune monitoring after HCT is lacking. While donor selection has advanced significantly on the basis of new biological insights, little attention has been given to optimizing cell product design in terms of numbers and composition to minimize inter-patient variability. In addition, the high inter-patient disparities in the clearance of agents used during conditioning are rarely investigated. The lack of prospective clinical studies addressing these concepts, coupled with limited pharmaceutical company interest, calls for a consensus discussion. Our goal is to harmonize HCT interventions by exploring how individual patient differences and overall transplantation strategies impact the final effector mechanisms of HCT, specifically aiming for timely and well-balanced immune reconstitution.

1.1 Impact of Conditioning Regimens on Immune Reconstitution and Outcomes: Pharmacokinetics–Pharmacodynamics (PK–PD) and Individualized Dosing

Over the last decade, it has become evident that various agents, such as busulfan, fludarabine, anti-thymocyte globulin (ATG), and anti-T-lymphocyte globulin (ATLG), administered as part of the conditioning regimen and post-HCT, have a substantial impact on both relapse and non-relapse mortality due to graft-versus-host disease (GvHD) and viral reactivation. Consequently, these agents significantly influence survival chances (Soiffer et al. 2017; Lakkaraja et al. 2022; van Roessel et al. 2020; Admiraal et al. 2017). Comprehensive pharmacokinetic (PK) and pharmacodynamic (PD) modeling has provided evidence that exposure to most of these agents can affect both short- and long-term immune reconstitution.

An important example is the development and validation of a population pharmacokinetics model for ATG (Thymoglobuline). Clearance of ATG was found to depend mainly on weight (when patients weigh <40 kg) and on the receptor load (represented by the absolute lymphocyte count; ALC) before the first dose (Haanen et al. 2020). Using population PK modeling, a new dosing nomogram was developed, which was recently validated in a prospective trial (Admiraal et al. 2022). Patients who received individualized dosing were more likely to attain CD4+ immune reconstitution, defined as CD4+ >50/μl at two consecutive time points before day 100. Importantly, this definition of CD4+ immune reconstitution was confirmed to be a reliable predictor of outcomes in multiple transplantation settings (adults, pediatrics, T-replete, T-deplete, cord blood (CB), bone marrow (BM), and peripheral blood stem cells (PBSCs)) (Admiraal et al. 2022) and is easy to use at all transplant centers. Table 10.1 presents the ATG exposures after transplant that are associated with optimal outcomes (Soiffer et al. 2017; Lakkaraja et al. 2022; Admiraal et al. 2017; Haanen et al. 2020; Admiraal et al. 2022). Although no validated population PK model for ATLG has yet been published, data from a post hoc analysis of a randomized controlled trial allowing three different types of regimens showed that ATLG had opposite effects on the outcome parameters of chronic GvHD and leukemia-free survival, resulting in overlapping curves for these primary end points (Soiffer et al. 2017). This study showed that the agents used for conditioning had a significant impact on the ALC prior to dosing of ATLG and thus influenced immune reconstitution and clinical outcomes, i.e., an impact similar to that shown for ATG.
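The CD4+ immune reconstitution definition given above (CD4+ >50/μl at two consecutive time points before day 100) can be expressed as a small check. The following is a minimal sketch only, not a validated clinical tool: the function name, the input layout, and the reading of "consecutive" as "adjacent in the patient's own sampling schedule" are assumptions made for illustration.

```python
from typing import Sequence, Tuple

CD4_THRESHOLD_PER_UL = 50  # from the text: CD4+ >50/μl
DAY_LIMIT = 100            # from the text: before day 100


def cd4_reconstituted(measurements: Sequence[Tuple[int, float]]) -> bool:
    """Return True if two adjacent measurements before day 100 both exceed 50/μl.

    `measurements` holds (day_after_hct, cd4_count_per_ul) pairs (hypothetical layout).
    Assumption: "consecutive" means adjacent in the patient's sampling schedule.
    """
    # Keep only samples taken before day 100 and order them by day.
    early = sorted(m for m in measurements if m[0] < DAY_LIMIT)
    # Look for any pair of neighbouring samples that are both above the threshold.
    return any(
        first[1] > CD4_THRESHOLD_PER_UL and second[1] > CD4_THRESHOLD_PER_UL
        for first, second in zip(early, early[1:])
    )
```

For example, a patient sampled at days 28, 56, and 84 with counts of 30, 80, and 120 cells/μl would meet the definition, whereas a patient whose second qualifying count falls on day 112 would not.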

Table 10.1 Suggested novel ATG (Thymoglobuline) dosing nomograms based on PK–PD modeling for (non-)myeloablative settings in pediatrics and adults

More recently, fludarabine exposure was found to be highly variable when body surface area (BSA)-based dosing is used (range 10–66 mg*h/L, median exposure 26 mg*h/L) (Langenhorst et al. 2019). Immune reconstitution was delayed in patients with an exposure >25 mg*h/L, which was associated with more viral reactivations and a higher probability of non-relapse mortality (NRM). Using a validated population PK model, both glomerular filtration rate (GFR) and weight were identified as predictors of fludarabine clearance. An association between fludarabine exposure and outcomes was also shown in CD19 chimeric antigen receptor T-cell (CAR T) recipients (Fabrizio et al. 2022; Dekker et al. 2022), suggesting that individualized fludarabine dosing to improve outcomes is a viable option beyond the HCT setting. Prospective validation trials in bone marrow transplantation and in immune effector cell transplant strategies (e.g., CAR T) are underway.
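To make the link between dose, clearance, and exposure concrete, the sketch below uses the standard linear-PK relation AUC = dose / clearance and compares the result with the 25 mg*h/L threshold mentioned above. It is an illustration under stated assumptions, not the published population PK model: the clearance value is taken as an input (in practice it would come from a validated model using covariates such as GFR and weight), the dose is assumed to be expressed as the measured moiety, and all function names are hypothetical.

```python
HIGH_EXPOSURE_MG_H_PER_L = 25.0  # threshold reported in the text


def cumulative_auc(total_dose_mg: float, clearance_l_per_h: float) -> float:
    """Cumulative exposure (mg*h/L) under linear PK: AUC = dose / clearance.

    Assumption: clearance is constant over the dosing period and is supplied
    by the caller; it is NOT estimated here.
    """
    if total_dose_mg < 0:
        raise ValueError("dose must be non-negative")
    if clearance_l_per_h <= 0:
        raise ValueError("clearance must be positive")
    return total_dose_mg / clearance_l_per_h


def exceeds_exposure_threshold(auc_mg_h_per_l: float) -> bool:
    """True if exposure is above the 25 mg*h/L level associated with delayed IR."""
    return auc_mg_h_per_l > HIGH_EXPOSURE_MG_H_PER_L
```

Because exposure scales inversely with clearance, two patients receiving the same BSA-based dose can end up on opposite sides of the threshold, which is the rationale for individualized dosing discussed here.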

Posttransplant cyclophosphamide (PT-Cy) has emerged as an elegant and effective pharmacological strategy to overcome human leukocyte antigen (HLA) barriers in the setting of allogeneic HCT from haploidentical donors and, more recently, in matched donor transplants (Battipaglia et al. 2021). Several biological mechanisms are responsible for the effectiveness of PT-Cy in reducing GvHD (Radojcic and Luznik 2019), and new insights are currently emerging (i.e., reduction in the proliferation of alloreactive CD4+ effector T cells and the preferential recovery of CD4+ regulatory T cells (Tregs); functional impairment of surviving alloreactive CD4+ and CD8+ effector T cells) (Nunes and Kanakry 2019). Moreover, PT-Cy has an indirect effect on Tregs (Fletcher et al. 2023) through the expansion of functional myeloid-derived suppressor cells. A retrospective study recently compared immune reconstitution across ATG and PT-Cy strategies (Massoud et al. 2022). ATG resulted in faster reconstitution of CD8+ T, natural killer (NK), natural killer T (NKT), and γδT cells, whereas CD4+ T cells and B cells reconstituted faster after PT-Cy. Similar reconstitution was observed for Tregs and B cells. Even though differences in immune reconstitution (IR) were associated with a decreased incidence of infections and moderate/severe chronic GvHD in the ATG group, they had no impact on any of the other long-term outcomes.

Collectively, these studies present compelling evidence that achieving “predictable” immune reconstitution is paramount when investigating the efficacy of maintenance therapies involving novel drugs, donor lymphocyte infusions, and advanced cell therapy interventions. Such predictability serves as a standardized predictor, enabling meaningful comparisons across studies and accounting for the numerous variables inherent to the HCT setting.

2 Graft Composition as an Additional Predictor of Immune Reconstitution and Clinical Outcomes

Although transplant physicians carefully monitor the levels of many drugs, such as cyclosporine or antibiotics, an additional opportunity to further harmonize the transplantation procedure arises from the surprising clinical observation that substantial cell dose variations are currently accepted across patients. The hesitation to monitor cell numbers in the graft or after HCT, and to act on them, is, of course, partially driven by the confusing multitude of immunological subsets, the narrow nature of many immunological programs with a lack of consensus on immune monitoring, and also the rather limited immunological education of the majority of transplant physicians. However, currently available retrospective and prospective studies can provide guidance. A retrospective EBMT study indicated that graft T-cell numbers in matched unrelated donors frequently vary between 50 and 885 × 10^6/kg and that the highest quartile of CD34+ cells as well as of T cells is associated with an inferior clinical outcome (Czerw et al. 2016). As we cannot expect future randomized trials to address the impact of different graft compositions in T cell-replete transplantations on clinical outcomes, avoiding CD34+ and T-cell numbers within the highest quartile might be reasonable, as high T-cell numbers have been associated with the risk of developing chronic GvHD (Czerw et al. 2016). For haploidentical donors, even lower T-cell numbers might be advised (Mussetti et al. 2018), as, in this context also, higher numbers of T cells are associated with increased incidences of chronic GvHD. However, analyses of additional cohorts are desirable to confirm these intriguing studies. Higher numbers of NKT cells (Malard et al. 2016) and γδT cells (Perko et al. 2015) in the graft have been reported to associate with favorable immune reconstitution and a positive clinical outcome, most likely due to their impact in controlling GvHD (Du et al. 2017) and acting on cytomegalovirus (CMV) as well as on leukemia (Scheper et al. 2013; de Witte et al. 2018). However, these variables are more difficult to control in daily clinical practice. Direct ex vivo graft engineering provides an elegant solution to further control immune subsets in the graft and the subsequent immune reconstitution. It also allows for the standardization of cell numbers, as well as subsets per patient; e.g., selecting CD34-positive cells alone has been reported to associate with less chronic GvHD, whereas the graft-versus-leukemia effect is maintained (Pasquini et al. 2012). Increased activity in the next generation of graft engineering through depletion of αβT cells has been reported over the last decade (de Witte et al. 2023), emphasizing a growing awareness of the opportunity to define graft compositions more precisely before transplantation. Depletion of αβT cells is associated not only with lower frequencies of infection and extremely low GvHD rates but also with a different immune repertoire (de Witte et al. 2021a), and it showed a good efficacy/safety profile when used during the pandemic (Nijssen et al. 2023). Thus, each transplantation platform needs to be carefully evaluated for immune reconstitution, as it might differ substantially and, consequently, affect later interventions differently (de Witte et al. 2021b; Schmid et al. 2021).
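The suggestion to avoid graft cell doses in the highest quartile presupposes that a center knows where its own upper-quartile cutoff lies. The sketch below shows one way this could be computed from historical graft T-cell doses; the function names and the choice of the "inclusive" quantile method are assumptions for illustration, and any cutoff derived this way would be specific to the cohort it came from rather than a recommended limit.

```python
from statistics import quantiles
from typing import List, Sequence


def upper_quartile_cutoff(doses: Sequence[float]) -> float:
    """Return the third quartile (Q3) of graft cell doses, e.g. in 10^6 T cells/kg.

    quantiles(..., n=4) returns the three quartile cut points; the last is Q3.
    Assumption: the "inclusive" method is used, treating the data as the full cohort.
    """
    return quantiles(doses, n=4, method="inclusive")[2]


def flag_highest_quartile(doses: Sequence[float]) -> List[bool]:
    """Mark each graft whose dose lies above the cohort's Q3 cutoff."""
    cutoff = upper_quartile_cutoff(doses)
    return [dose > cutoff for dose in doses]
```

Applied to a hypothetical cohort with doses of 50, 100, 150, 200, 250, 300, 350, 400, and 885 × 10^6/kg, the two largest grafts would be flagged.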

2.1 Monitoring: Immune Cell Phenotyping

Variables that may impact immune reconstitution are (A) the immune status before the immune intervention, (B) the immune composition of the graft, (C) the dynamics of the reconstituting immune subsets and their function, and (D) the exposure to drugs administered in the conditioning regimen prior to intervention (as discussed above; Table 10.1). The most important questions that arise when monitoring immune cells after transplant using clinical flow cytometry are which markers should be followed and how these markers can be used in a meaningful way. These questions are particularly important in an era when post-HCT pharmaceutical maintenance interventions and donor lymphocyte infusion (DLI) or the administration of other advanced therapy medicinal products (ATMPs) have become more common over the last decade (Soiffer and Chen 2017).

Flow cytometry is broadly available to monitor immune cell reconstitution in accredited laboratories within transplant centers. Markers identifying the most common leukocyte subsets are broadly used and can therefore be considered a “standard” panel: CD45 (lymphocytes), CD3 (T cells), CD19 (B cells), αβ T-cell receptor (TCR), γδTCR, and CD16/CD56 (NK cells). For γδT cells, it is important to note that δ2-positive and δ2-negative γδT cells always need to be distinguished as they are biologically two completely different populations (Sebestyen et al. 2020). In some laboratories, this panel is extended to identify the differentiation and activation state of subsets of T cells (T-helper, cytotoxic, regulatory T cells, naive, effector/memory, or recent thymic emigrants), B cells (switched and non-switched), NK(T) cells, and cells from the myeloid lineage (monocytes, dendritic cell subsets). This knowledge is important because the success of cell-based immunotherapies, as well as of agents modulating the immune system after transplantation, will depend significantly on the presence or absence of different immune subsets. As described above, of all markers, CD4+ T cells >50/μl (at two consecutive time points within 100 days) have been shown to be the best early immune cell marker to predict outcomes in many different transplant settings. More recently, in a large (>500 patients) pediatric and young adult cohort, the combination of CD4+ T cells (>50 cells/μl) and B cells (>25 cells/μl) within 100 days was found, in particular, to be a predictor of outcomes (e.g., NRM, GvHD, and overall survival (OS)) (van Roessel et al. 2020). This new combination of B and CD4+ T cells as potential biomarkers of outcomes needs confirmation in separate cohorts. Interestingly, the relationship observed between CD4+ T-cell immune reconstitution and exposure to ATG and fludarabine was not found between these conditioning drugs and B-cell immune reconstitution.
In the near future, mastering this diversity might allow for the definition of patient subpopulations who would benefit from certain adjuvant therapies as maintenance after HCT (e.g., checkpoint inhibitor treatment and tyrosine kinase inhibitors (TKIs)) (Davids et al. 2016; Mathew et al. 2018). Moreover, certain myeloid subsets are suggested to have an impact on outcomes (Mussetti et al. 2018), but more studies are needed to confirm this. Therefore, on top of clinical flow panels, discovery panels (in a research setting) can potentially provide more insight into the optimal immune milieu for disease and toxicity control. To be able to compare results from different trials and individual centers, it is important to develop standardized protocols for sample handling and staining for both fresh and biobanked samples.

2.2 Immune Monitoring: Secretome Analyses

Measuring the production of cytokines, chemokines, and growth factors in the serum or plasma represents an integral part of immunomonitoring during immunotherapeutic treatments. Proteomic biomarkers may distinguish diverse diseases/response patterns, identify surrogate markers of efficacy, and provide additional insights into the therapeutic mode of action. Over the last decade, advances in highly multiplexed technologies have allowed for the discovery and validation of several blood biomarkers of acute and chronic GvHD and graft-versus-tumor reactivity.

For example, proteins such as interleukin (IL)-6, granulocyte-macrophage colony-stimulating factor (GM-CSF), hepatocyte growth factor (HGF), ST2 (suppressor of tumorigenicity 2), and soluble IL-2 receptor alpha (sIL-2Rα) have been shown to be biomarkers of GvHD, whereas increased levels of tumor necrosis factor-alpha (TNF-α) and IL-6 are associated with robust immune responses to viral reactivations (de Koning et al. 2016). It is noteworthy that these biomarkers show diagnostic and prognostic potential (Milosevic et al. 2022), can be informative in predicting more severe GvHD and NRM (McDonald et al. 2015; Srinagesh et al. 2019), and may help to categorize patients based on their likelihood to respond to therapy (Hess et al. 2021). The main challenge, however, remains to identify predictors very early after or even before cell infusion.

Peripheral blood is often the only source for protein analysis, which may lack the sensitivity to reflect local responses in affected tissues. The most common methods to identify these markers include antibody-based enzyme-linked immunosorbent assays (ELISAs) or multiplex platforms, such as protein microarrays, liquid chromatography–mass spectrometry (LC-MS), electrochemiluminescence, and bead-based or proximity extension multiplex immunoassays (MIAs). Again, different technologies and reagents (e.g., antibodies and recombinants for standard curves) may lead to different concentrations, and dramatic variability in results, depending on how the pre-analytical samples are handled (e.g., differences in processing and storage, including duration of storage). Cytokine levels may differ considerably between serum and plasma samples obtained from the same donor, due to release of platelet-associated molecules into the serum. Moreover, the type of anticoagulant used in plasma isolation and time- and/or temperature-sensitive changes need to be considered (Keustermans et al. 2013). These phenomena underscore the need for extensive documentation with respect to all biomarker analyses before any conclusions can be made when comparing patient cohorts treated at multiple sites.
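The pre-analytical variables listed above (sample matrix, anticoagulant, processing delay, storage conditions, assay platform) lend themselves to a structured record that travels with every measurement. The following is a minimal sketch of such a record and of a check that reports why two samples may not be directly comparable; the field names and the set of fields compared are assumptions for illustration, not an established reporting standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical pre-analytical metadata for one serum/plasma sample."""
    sample_id: str
    matrix: str                # "serum" or "plasma"
    anticoagulant: str         # e.g. "EDTA" or "heparin"; "" for serum
    assay_platform: str        # e.g. "ELISA" or "bead-based MIA"
    hours_to_processing: float
    storage_temp_c: float
    storage_days: int


def comparability_issues(a: SampleRecord, b: SampleRecord) -> List[str]:
    """List categorical differences that could confound a direct comparison."""
    issues = []
    if a.matrix != b.matrix:
        issues.append("matrix differs")
    if a.anticoagulant != b.anticoagulant:
        issues.append("anticoagulant differs")
    if a.assay_platform != b.assay_platform:
        issues.append("assay platform differs")
    return issues
```

Continuous variables such as processing delay and storage duration are recorded but not compared here, because acceptable tolerances would have to be defined per analyte.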

While the detection of specific cell subsets and proteins offers valuable insights, functional assays can provide additional information to enhance our understanding of the biological mechanisms and to assess the effectiveness of a patient’s immune system. For example, the functionality of natural killer (NK) cells can be evaluated through assays that measure their ability to induce target cell killing by degranulation, and that of regulatory T cells (Tregs) through assays that measure their capacity for proliferation and suppression. Moreover, recent advancements in single-cell proteomic technologies have enabled the combination of both approaches, wherein the analysis of secreted proteins at the single-cell level generates an immune fitness score. This score has demonstrated its potential in predicting responsiveness to checkpoint inhibitor therapy (Haanen et al. 2020), but its value in assessing immune fitness post-HCT remains to be determined.

2.3 Immune Monitoring of Virus-Specific T-Cell Responses

Virus-specific immune responses are mainly assessed for cytomegalovirus (CMV) (Tassi et al. 2023; Krawczyk et al. 2018; Wagner-Drouet et al. 2021), human herpesvirus 6 (HHV6) (Noviello et al. 2023), adenovirus (AdV) (Cesaro and Porta 2022), Epstein–Barr virus (EBV), BK virus (Annaloro et al. 2020), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Anon 2022). Different assays have been adopted to assess virus-specific T cells (Table 10.2): flow cytometry-based tests (e.g., intracellular cytokines, MHC multimer binding), interferon-γ (IFN-γ) enzyme-linked immunospot (ELISpot), QuantiFERON-CMV, and other in-house tests (e.g., proliferation assays or different CMV-specific T-cell subsets). In a recent EBMT survey, only 13.8% of centers reported performing at least one type of virus-specific immune monitoring, whereas an additional 31% of centers plan to start doing so in the future (Greco et al. 2023; Cordonnier et al. 2021). The quantitative and functional assessment of virus-specific T-cell responses may be more relevant to patient risk stratification and clinical decision-making, thereby encouraging immune monitoring of patients. While still experimental and often limited to research studies, adoptive immunotherapy with virus-specific lymphocytes could benefit from more data on virus-specific IR to extend its applicability on a broader scale.

Table 10.2 Example of platforms for virus-specific immunological monitoring (i.e., CMV)

3 From Transplantation to Immune Monitoring of CAR T Cells, Harmonization Is Needed

Adoptive transfer of CAR T cells has revolutionized the treatment of several hematological malignancies by overcoming chemotherapy-refractory and/or relapsed disease. CAR T therapy shares many similarities with hematopoietic cell transplantation. The first detailed immunological analyses of long-term responders are becoming available (Cappell and Kochenderfer 2023; Melenhorst et al. 2022). In addition to clinical trial data, a large body of real-world evidence (RWE) has been compiled in different registries, with the EBMT CAR T registry being the largest European registry to be successfully used for post-authorization safety studies (PASSs) for most approved CAR T products (McGrath et al. 2020). Notably, only a minimal core set of accepted clinical end points is identical across trials and registries, leaving important additional clinical parameters not comparable between trials and RWE. The ongoing GoCART Coalition initiative (https://thegocartcoalition.com) aims to harmonize not only clinical data collection as needed for PASSs but also exploratory clinical data for earlier clinical trials and biomarker analyses. Optimal time points and, e.g., flow cytometry panels for associated exploratory biomarker programs are not harmonized across centers, trials, or products. This lack of harmonization in clinical and biomarker programs hampers scientific advances, quality control efforts, and/or benchmarking and urgently calls for a coordinated effort to harmonize parameter sets, data structures, and time points for the assessment of clinical and biomarker data, enabling health-care professionals, health-care providers and payers, and, of course, patients to optimize their decision-making. Therefore, under the umbrella of the GoCART Coalition, EBMT, the European Hematology Association (EHA), and T2Evolve started a new initiative in 2023 (CART-CD) to generate a harmonized European parameter set via the Delphi process (Webbe et al. 2023), a structured process that involves the broader stakeholder community. Over the coming years, this will allow harmonization of the data structure with common time points for clinical end points and a set of biological parameters. This harmonization, including harmonized collection of samples for immune monitoring, will improve and facilitate cross-study comparability and generate real-world data for CAR T cell therapies and beyond.

4 In Summary

The failure or success of HCT is significantly impacted by the patient’s immune status. However, only a minority of HCT programs systematically consider individualized drug monitoring during conditioning, graft design, and immune monitoring as key for patient surveillance, in order to maximally control and capture essential details of the HCT intervention. Therefore, guidelines are needed to further harmonize the HCT procedure and to standardize immune monitoring, so that the key features of success and failure can be distilled. As a first step, careful recommendations for individualized drug dosing and graft compositions can be made based on available data sets. However, within the new cellular therapy registry of EBMT, it will be key to register additional details of drug dosages, graft compositions, and immune reconstitution, to capture clinical variations in programs, as well as defined immune reconstitutions. This will enable retrospective insight into daily clinical practice and its impact on immune reconstitution as well as clinical outcome. Moreover, clinical trials should adopt such consensus measurements. Nevertheless, the markers and phenotypes studied in one setting may not be considered relevant in another, supporting the definition of a set of generally recommended protocols and a set of add-on trial-specific parameters (Table 10.3). A new survey is currently being prepared by the Cellular Therapy and Immunobiology Working Party (CTIWP) of EBMT and the GoCART Coalition for both hematopoietic cell transplantation and CAR T therapies. A harmonization procedure to achieve a more balanced immune reconstitution might have a more profound impact on patient survival than any other novel maintenance therapy (Admiraal et al. 2017; Boelens et al. 2018) and allow for a better success rate for novel drugs tested as maintenance therapy.

Table 10.3 Panels under consideration in the panel discussion of the CTIWP (Greco et al. 2018). General parameters that could be included in harmonized immune monitoring protocols across most studies/centers and advanced parameters that may be of great value in specific studies and that can only be performed in specialized immunology laboratories or analyzed in a central laboratory

Key Points

  • The failure or success of HCT is significantly impacted by the patient’s immune status.

  • Harmonizing individualized drug monitoring during conditioning, graft design, and immune monitoring is key for patient surveillance.

  • A harmonization procedure to achieve a more balanced immune reconstitution might have a more profound impact on patient survival (and quality of life) than any other novel maintenance therapy and allow for a better success rate for novel drugs tested as maintenance therapy.