Key Points

Chimeric antigen receptor (CAR) T-cell therapies have been approved for the treatment of adult patients with relapsed/refractory large B-cell lymphoma (LBCL), and additional CAR T-cell therapies are currently in development. The rates and severity of two important toxicities, cytokine release syndrome (CRS) and neurological events (NEs), vary across these treatments.

Cost-utility analyses (CUAs) can be used to examine the value of CAR T-cell therapies and inform decision making on healthcare resource allocation. CUAs require health state utilities to calculate quality-adjusted life-years. The differences in toxicity rates among CAR T-cell therapies could have an impact on patients’ quality of life and the results of a CUA. However, no published utilities are available that represent CAR T-cell toxicities in economic modeling.

This study was conducted to estimate health state utilities associated with various severity levels of CRS and NEs resulting from CAR T-cell treatment of LBCL. The resulting utilities may be useful in models examining and comparing the cost-effectiveness of CAR T-cell therapies for LBCL. By incorporating the disutility of these adverse events (AEs), CUAs can more accurately and comprehensively model the differences among available treatments for LBCL.

1 Introduction

Chimeric antigen receptor (CAR) T-cell immunotherapy is a tumor treatment approach in which a patient’s immune cells are genetically modified ex vivo so that the T cells can target and eliminate tumor cells upon infusion into the patient’s body [1,2,3]. For lymphomas and leukemias arising from B lymphocytes, CD19 has emerged as an effective target for the genetically modified T cells. This protein exists on the surface of both normal and malignant B cells, but not on other human cells. The CD19-targeted CAR T-cell therapies, axicabtagene ciloleucel, tisagenlecleucel, and lisocabtagene maraleucel (liso-cel), have been found to produce high levels of durable response in clinical trials of patients with relapsed/refractory (R/R) large B-cell lymphoma (LBCL) [4,5,6,7,8,9].

CAR T-cell therapy is associated with two important toxicities: cytokine release syndrome (CRS) and neurological events (NEs) [10,11,12]. CRS occurs when excessive cytokines are produced/released, and can range from mild, requiring only over-the-counter analgesics, to life threatening, requiring intensive treatments, such as blood pressure and ventilation support and/or intensive care unit support. The most severe symptoms of CRS are fever, hypoxia, hypotension, and organ failure, while other symptoms include rash, nausea/vomiting, and rigors [10, 12, 13]. NEs can occur with or without CRS, and symptoms can include confusion, aphasia, headaches, seizures, tremor, and hallucinations [10, 12, 14]. Like CRS, NEs can become severe and require intensive treatments and/or intensive care unit support. In patients with LBCL, rates of severe CRS and NEs reported in clinical trials with axicabtagene ciloleucel, tisagenlecleucel, and liso-cel were 13% and 28%, 22% and 12%, and 2% and 10%, respectively [7,8,9].

As more CAR T-cell treatments are introduced, cost-utility analyses (CUAs) can be used to examine their value and inform decision making on healthcare resource allocation [15]. To calculate quality-adjusted life-years (QALYs), CUAs require health state utilities, which are values anchored to 0 (dead) and 1 (full health) that represent strength of preference for health states [16]. The CAR T-cell therapies differ in their rates of CRS and NEs, and these differences could affect patient quality of life and the results of a CUA. A CUA comparing these treatments should therefore incorporate utility differences associated with these adverse events (AEs). However, there are no published utilities available to represent CAR T-cell toxicities.

The purpose of this study was to estimate health state utilities associated with toxicities of CAR T-cell treatments. The study was designed to provide disutilities (i.e., decreases in utility) associated with various severity levels of CRS and NEs resulting from CAR T-cell treatment of LBCL. Some health technology assessment guidelines state a preference for utility values derived from generic preference-based measures completed by patients [17]. However, when generic instruments such as the EQ-5D are not appropriate, alternative utility assessment methods may be used. Generic instruments are unlikely to be feasible for assessing disutility of CAR T-cell therapy toxicity because they assess health status at a specific point in time, requiring multiple administrations at various time points to quantify the utility impact of toxicities across the entire course of CAR T-cell treatment and recovery. These toxicities are unpredictable and difficult to identify, and patients experiencing severe toxicities may be unable to answer a questionnaire when hospitalized in these acute states. Furthermore, generic instruments such as the EQ-5D may not be appropriate for assessing the disutility of these toxicities because these instruments do not directly assess the relevant symptoms and impact on quality of life (e.g., fatigue, difficulty breathing, intubation, cognitive impairment, and extended hospitalization). Therefore, utilities were estimated using a vignette-based approach, which is well suited for isolating the impact of specific temporary acute events on utility, without placing added burden on patients.

2 Methods

2.1 Overview of Study Design

Health state vignettes were developed to represent patients’ typical experience of AEs associated with CAR T-cell treatment (Online Resource; Appendix A). The health states were valued in a time trade-off (TTO) utility elicitation study with a sample of general population participants in two locations in the UK (Edinburgh and London). The difference in utility between health state A without an AE and each of the other health states with an AE represents the disutility of each AE. Participants provided written informed consent before completing study procedures, and all procedures and materials were approved by an independent institutional review board (Ethical and Independent Review Services, Study 19018).

2.2 Health State Development

Health state vignettes representing successful CAR T-cell treatment without AEs and with CRS and NEs were drafted based on the following three sources of information: published literature; AE reports from TRANSCEND NHL 001 (NCT02631044), a clinical trial of liso-cel in LBCL [9]; and interviews with clinicians involved in clinical trials of CAR T-cell therapy for LBCL. The targeted literature search was performed to support the health state content and help develop questions for the clinician interviews. This literature search focused on CAR T-cell therapies [1, 12, 18], LBCL [19,20,21,22], and CRS and NEs [13, 14, 23,24,25]. Health states were developed to represent multiple grades of these AEs so the resulting utilities could represent AEs of varying severity levels in economic modeling.

Health state content was also derived from MedWatch AE reports for 13 patients who experienced CRS or an NE after CAR T-cell therapy during a clinical trial [9]. The reports included detailed information on the timing of symptom emergence relative to the initial date of CAR T-cell infusion, description of the AE, treatment required for each AE, and duration of the AE and subsequent treatment. The reports referenced three patients with grade 1 events, six with grade 2 events, two with grade 3 events, and two with grade 4 events. Reports included detailed descriptions of AEs attributed to CAR T-cell therapy, including fatigue, fever, malaise, hypotension, altered mental state, confusion, respiratory failure, renal failure, and aphasia.

Multiple rounds of telephone interviews were conducted with four clinicians (two oncologists and two hematology nurses with experience administering CAR T-cell treatments for LBCL). The clinicians averaged over 20 years of experience each in hematology/oncology at a variety of centers, including academic medical centers, cancer hospitals, and a major cancer research center. Clinicians also reported an average of over 4 years of experience treating patients with CAR T-cell therapy. All clinicians were located in the US (California, Washington, and Nebraska). Health states were developed through an iterative process with the clinicians. The initial interviews with each clinician included open-ended questions focused on describing LBCL and CAR T-cell treatment, including the relevant AEs. The experience of any disease, treatment, or AE varies widely across patients, and vignettes cannot describe this full range. Therefore, clinicians were asked to focus on describing the typical patient experience of the relevant treatment and AEs, rather than unusual patient profiles that they may have observed. Follow-up discussions focused on reviewing and editing health state drafts. Interviews continued until all clinicians agreed that the health states were clear and accurate descriptions of the disease, treatment, and AEs.

Six health states were developed to describe 1 year in the life of a patient with LBCL. The first health state (health state A) described a year of a patient’s life, beginning with CAR T-cell therapy followed by recovery. Health state A did not include any AEs. Five additional health states (B through F) began with the same description of LBCL and CAR T-cell therapy but added descriptions of various AEs prior to recovery. Health states B, C, and D described patients who experienced grade 1, 2, or 3/4 CRS based on the Lee et al. criteria for grading [13]. Health states E and F described patients who experienced a grade 1/2 or 3/4 NE. The four clinician advisors agreed that the patients’ experience of grade 3 and 4 events was similar for both CRS and NEs, and therefore the two levels could be combined into a single health state representing both severity levels for each type of AE. For grade 1 and 2 events, clinician advisors agreed that the patients’ experience of grade 1 and 2 NEs was similar enough to be combined into a single health state but recommended that grade 1 and 2 CRS should be represented in separate health states.

The health states were presented to respondents on individual cards, each with a series of bullet point descriptions. The bullet points were divided into sections with titles intended to help the respondents understand the health state content. Section titles for health state A were Disease, Impact, Treatment, and Follow-up. Health states B–F had an additional section called Adverse Event, which described the AE experience. A timeline of the first 4 months of the year was included with each health state to illustrate the duration of AEs.

The health states developed for this study can be considered ‘path states’, which describe a sequence of health-related events [26,27,28]. The path state approach is useful for estimating utilities associated with temporary events that change over time. Respondents are asked to value the entire path, rather than each individual part of the path. Therefore, the resulting utilities represent the full year described in the health state, including treatment, AEs, and recovery.

Health states differed only in their descriptions of AEs (including type of AE, severity of AE, hospitalization, and recovery time), while other parts of the path were identical. Therefore, any difference in preference and utility between the health states can be attributed entirely to the AEs.

2.3 Participants

Participants were required to be at least 18 years of age, residents of the UK, and able to give informed consent and complete protocol requirements. Inclusion criteria did not specify clinical characteristics because this study was designed to estimate utilities for use in CUAs that may be submitted to health technology assessment agencies, which often prefer utilities derived from general population samples [17, 29, 30]. Participants were recruited via newspaper and online advertisements. Following telephone screening, participants who were eligible, interested, and available were scheduled for an in-person interview. Recruitment targets were set to ensure that the sample was similar to the UK general population with regard to key demographic variables (age, sex, racial/ethnic background, employment status).

2.4 Pilot Study

A pilot study (N = 22; 50% female; mean [range] age of 44 [18–74] years) was conducted to test the TTO methods and ensure that the health state vignettes were clear and comprehensible. Participants completed the interview procedures, including health state ranking and TTO valuation, and then provided feedback. Most participants reported no difficulty understanding the health states or the TTO task and some provided suggestions for minor changes. In response to these comments, edits were made to the health states, including revisions to formatting and wording to improve clarity. The revised health states were used in the subsequent utility elicitation study.

2.5 Utility Interview Procedures and Scoring

The TTO valuation study was completed via face-to-face interviews in November 2019. When presenting health state cards to the participants, the interviewer reviewed each section of the health states and then gave the participants time to read the cards independently. Health state A (without an AE) was always presented first. To facilitate comprehension, the other health states were presented in AE groups. Half the sample was presented with the three CRS health states first, while the other half viewed the two NE health states first. After reviewing all health states and confirming that they understood the content and differences between the health states, participants ranked health states from most to least preferable.

Utility values were then elicited in a TTO task with a 1-year time horizon, following commonly used methodology [16]. For each health state, participants were presented with a series of choices between spending a 1-year period in the health state being valued (followed by death) versus spending varying amounts of time in full health (followed by death). The amount of time in full health varied in 1-month increments, presented in an order that alternated between longer and shorter periods of time (i.e., 1 year, 0 months [dead], 11 months, 1 month, 10 months, 2 months, 9 months, 3 months). Utility values were assigned at the point where the participant was indifferent between the two choices (utility = x/y, where x is the number of months in full health and y is the number of months in the health state being valued [i.e., 12 months]).
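The alternating presentation order can be sketched in a few lines of Python. This is an illustration only, not the interview script used in the study: the function name `alternating_offers` is hypothetical, and the continuation of the sequence beyond the eight values listed above (converging on 6 months) is an assumption based on the stated pattern.

```python
def alternating_offers(horizon_months=12):
    """Return full-health durations (in months) in alternating long/short order.

    Assumption: the sequence continues past the values listed in the text
    until the long and short offers meet in the middle of the horizon.
    """
    high, low = horizon_months, 0
    order = []
    while high > low:
        order.append(high)  # longer period in full health
        order.append(low)   # shorter period in full health (0 = dead)
        high -= 1
        low += 1
    if high == low:
        order.append(high)  # the single midpoint offer, if one remains
    return order


offers = alternating_offers()
print(offers)
```

Alternating between long and short offers approaches the indifference point from both sides, rather than stepping toward it from one direction only.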

The time horizon of TTO valuations (i.e., the specified duration of time spent in each health state) varies across studies [31, 32]. While longer time horizons (e.g., 10 years) are usually used when valuing health states representing chronic medical conditions, the shorter 1-year time horizon tends to be more effective for assessing the impact of a relatively brief event [33]. When attempting to differentiate among brief events, a longer time horizon can suffer from a ceiling effect where all health states receive similarly high utility scores because a brief event is unlikely to have an effect on preference for a 10-year time period. Thus, TTOs with longer time horizons are not sensitive to short-term but potentially important differences in health, such as the serious AEs examined in the current study. In contrast, the shorter 1-year time horizon is useful for capturing differences in preference among brief AEs. Therefore, the 1-year time horizon was used in the current study. This approach also has the advantage of simplifying the interpretation and use of the results. Because the time horizon was exactly 1 year, the utility decrease associated with each health state represents the QALY decrement associated with the AE described in that health state.
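The reasoning behind this equivalence can be written out as follows. The symbols are introduced here for illustration and do not appear in the study materials: $u_A$ is the utility of health state A (no AE), $u_s$ is the utility of a health state $s$ that includes an AE, and $T$ is the duration covered by each health state.

```latex
\begin{align*}
\text{QALY}_s &= u_s \times T, \qquad T = 1 \text{ year} \\
\Delta\text{QALY}_s &= \text{QALY}_s - \text{QALY}_A = (u_s - u_A) \times 1 = u_s - u_A
\end{align*}
```

With any other time horizon, the utility difference would need to be rescaled by the duration before it could be read as a QALY decrement.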

If a respondent perceived a health state to be worse than death, the task and scoring procedures were adjusted as described in previous literature [34]. Participants were asked to state their preference between death (choice 1) and a 1-year life span (choice 2), beginning with time in the health state being rated, followed by full health for the remainder of the year. The amount of time in the health state being rated was varied in 1-month increments. Negative utility scores were calculated at the point of indifference between the two choices using the following formula: utility = − x/t, where x is the amount of time in full health and t is the total life span of choice 2.
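Both scoring rules in this section reduce to the same ratio with opposite signs. The sketch below is a minimal illustration, assuming a 12-month horizon and an indifference point expressed in whole months. The function name `tto_utility` and its parameters are hypothetical and are not taken from the study materials.

```python
def tto_utility(months_full_health, horizon_months=12, worse_than_death=False):
    """Score a TTO response at the point of indifference.

    Better than dead: utility = x / y, where x is the time in full health
    and y is the time in the health state being valued.
    Worse than dead: utility = -x / t, where x is the time in full health
    and t is the total life span of choice 2.
    Assumption: y and t both equal the 12-month horizon.
    """
    if not 0 <= months_full_health <= horizon_months:
        raise ValueError("months_full_health must lie within the time horizon")
    value = months_full_health / horizon_months
    # Negate only non-zero values so that a zero score is never shown as -0.0.
    return -value if (worse_than_death and value > 0) else value


better_than_dead = tto_utility(9)                      # indifferent at 9 months
worse_than_dead = tto_utility(3, worse_than_death=True)  # 3 of 12 months in full health
print(better_than_dead, worse_than_dead)
```

Under these assumptions, utilities are bounded between −1 and 1 and move in steps of 1/12, reflecting the 1-month increments of the task.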

2.6 EQ-5D-5L

The EQ-5D-5L was administered to characterize the overall health status of the sample. This self-administered, generic, preference-weighted measure includes two sections [35,36,37]. The first section consists of five dimensions assessing mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Participants report their functioning in each dimension by selecting one of the following five response options: 1 (no problems), 2 (slight problems), 3 (moderate problems), 4 (severe problems), or 5 (extreme problems/unable to). The second section is a 20-cm vertical visual analog scale (VAS), with anchors of 0 (‘worst imaginable health state’) and 100 (‘best imaginable health state’). Respondents rate their own health state today on the VAS.

2.7 Statistical Analysis Procedures

Statistical analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC, USA). Continuous variables were summarized as means and standard deviations (SD), and categorical variables were summarized with frequencies and percentages. Utilities of subgroups (e.g., male vs. female, England vs. Scotland) were compared using independent t-tests, and utilities for pairs of health states were compared using paired t-tests.

3 Results

3.1 Sample Characteristics

A total of 366 potential participants were reached for screening, interviews were scheduled for 252 eligible participants, and 227 attended their interview. Nine participants were unable to provide valid utility data (four had difficulty understanding the TTO task, four had difficulty comprehending health state content, and one repeatedly changed their mind and could not provide consistent results). Therefore, data from 218 participants were included in the analysis (n = 113 London; n = 105 Edinburgh) (Table 1). There were no significant differences between the London and Edinburgh subgroups in age, sex, marital status, or education level. Compared with the London subgroup, the Edinburgh subgroup had a higher rate of White participants and a lower rate of participants who were employed full-time.

Table 1 Sample characteristics

The most commonly reported medical and mental health conditions were depression (22.0%), anxiety (20.2%), diabetes (13.8%), arthritis (13.3%), and hypertension (8.3%), while 43.6% of the sample reported having no health conditions. No participant reported receiving a diagnosis of LBCL, but nine (4.1%) reported knowing somebody diagnosed with LBCL. Five participants (2.3%) reported having been diagnosed with another type of lymphoma, and 57 (26.1%) reported knowing somebody who had been diagnosed with another type of lymphoma. Six participants (2.8%) reported having had chemotherapy, and 157 (72.0%) reported knowing somebody who had experienced chemotherapy.

EQ-5D-5L results suggest that this sample had relatively few problems with mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. In each of these five dimensions, a majority reported ‘no problems’ (ranging from 57.8% reporting no problems with pain/discomfort, to 93.1% reporting no problems with self-care). The mean (SD) index score and VAS score were 0.86 (0.17) and 82.15 (13.54), respectively. Index scores ranged from 0.06 to 1.00, and VAS scores ranged from 35 to 100. These EQ-5D-5L scores indicate relatively good overall health, with substantial variation.

3.2 Health State Preferences and Utilities

Participants ranked all health states in order of preference, ranging from 1 (most preferable) to 6 (least preferable). The base LBCL health state with no AEs (health state A) was ranked as most preferable by all participants. All participants ranked either health state D (grade 3/4 CRS; 88.5%) or F (grade 3/4 NE; 11.5%) as least preferable. Mean rankings in order of average preference were 1.0 (health state A), 2.15 (B), 3.33 (E), 3.57 (C), 5.09 (F), and 5.87 (D).

Mean utility scores followed the same pattern as the health state rankings (Fig. 1). Health state A had the highest mean utility (0.73), followed by B (0.71), E (0.69), C (0.68), F (0.55), and D (0.50). Disutilities of each AE were calculated as the difference between the utility of the health state without an AE (A) and the utility of each health state with an AE (B–F) (Fig. 2). Grade 3/4 AEs (described in health states D and F) had substantially larger disutilities than the less severe AEs (described in health states B, C, and E). t-Tests revealed that the comparisons between the utility of health state A and the utility of each other health state were statistically significant (p < 0.0001). In addition, all three pairwise comparisons among the utilities of CRS health states (B, C, D) and the comparison between utilities of the two NE health states (E, F) were statistically significant (p < 0.0001).
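The disutility calculation can be illustrated with the mean utilities reported above. This is a rough sketch rather than the study's analysis: it subtracts rounded means, so the results may differ slightly from the values in Fig. 2, which may have been derived from unrounded participant-level data. The variable names are hypothetical.

```python
# Mean utilities as reported in the text (rounded to two decimals).
mean_utility = {
    "A": 0.73,  # no AE
    "B": 0.71,  # grade 1 CRS
    "C": 0.68,  # grade 2 CRS
    "D": 0.50,  # grade 3/4 CRS
    "E": 0.69,  # grade 1/2 NE
    "F": 0.55,  # grade 3/4 NE
}

# Disutility = utility of the AE health state minus utility of health state A,
# so every value is negative (a decrease in utility).
disutility = {
    state: round(value - mean_utility["A"], 2)
    for state, value in mean_utility.items()
    if state != "A"
}

for state in sorted(disutility, key=disutility.get, reverse=True):
    print(state, disutility[state])
```

Ordering the states from smallest to largest decrease reproduces the ranking pattern described above (B, E, C, F, D).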

Fig. 1 Health state utilities (N = 218). Utilities were calculated for each health state. Health state A represented LBCL with CAR T-cell therapy, without an AE. Health states B–F were identical to health state A, except for the addition of an AE (i.e., either CRS or NEs). AE adverse event, CAR chimeric antigen receptor, CI confidence interval, CRS cytokine release syndrome, LBCL large B-cell lymphoma, NEs neurological events, SD standard deviation

Fig. 2 Disutilities (i.e., decreases in utility) associated with CRS and NEs (N = 218). Disutilities of each AE were calculated by subtracting the utility of health state A (representing LBCL with CAR T-cell therapy, without an AE) from the utility of every other health state (each representing LBCL with CAR T-cell therapy, with one AE). Health states B–F were identical to health state A, except for the addition of an AE (i.e., either CRS or NEs). Therefore, any difference in utility between health state A and the other health states can be attributed to the addition of the AE. AE adverse event, CAR chimeric antigen receptor, CI confidence interval, CRS cytokine release syndrome, LBCL large B-cell lymphoma, NEs neurological events, SD standard deviation

The numbers of participants willing to trade time (i.e., to accept a shorter life in full health) to avoid each health state were as follows: 157 (72.0%) for A, 164 (75.2%) for B, 172 (78.9%) for both C and E, 189 (86.7%) for F, and 194 (89.0%) for D.

Most participants perceived all health states as better than dead (i.e., utility score >0), resulting in few negative utility scores. Health states A, B, C, and E were each rated as worse than death by only 5 (2.3%) participants. Health states D and F each received negative utility scores from 11 (5.0%) participants.

3.3 Subgroup Comparisons

There were no significant differences in utility or disutility scores by sex or age (median split). When comparing subgroups by geographic location (i.e., London vs. Edinburgh), there were significant differences in utility for all health states except health state A. Utility scores in Edinburgh were consistently lower than scores in London by a difference of 0.07–0.11 for health states B through F (p < 0.05). However, there were no significant differences in any disutilities of AEs (i.e., the comparison between health state A and the other health states).

4 Discussion

Results followed expected patterns, with the extent of disutility corresponding to AE severity. Grade 1 and 2 CRS and NEs had minimal impact on preference and utility, which was expected since these events tend to resolve relatively quickly without intensive treatment and only marginally increased time spent in the hospital. Grade 3 and 4 CRS and NEs were associated with greater decreases in utility. These events are life-threatening, require longer hospitalization, and may require intensive treatments, including mechanical ventilation. Although there are no previously published disutilities of these AEs that can be used for comparison, the base LBCL utility value estimated in this study (0.73) is similar to LBCL utilities from other studies [15, 38].

Disutility values in Fig. 2 may be useful in CUAs of CAR T-cell treatments. Because the disutilities derived in this study represent the impact of a temporary event (the AE) on a specific time period (1 year), the 1-year time horizon of the health states and TTO task must be considered when using the values in a model. The utility decrease associated with each AE (i.e., the disutilities of health states B through F in Fig. 2) represents the impact of adding the AE to a 1-year period. Therefore, these disutilities can be applied in CUAs as a one-time QALY decrement representing the utility decrease associated with each AE across a year in which CAR T-cell therapy occurred. For example, if modeling a CAR T-cell therapy with a 5% rate of grade 3 CRS, a QALY decrement of −0.23 (i.e., the difference between utilities of health states A and D) would be applied to 5% of hypothetical patients.
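The worked example in this paragraph can be expressed as a short calculation. The sketch below assumes that AE decrements are applied once per patient and combined additively across AE types. The function name, the dictionary keys, and the cohort size of 1000 are hypothetical choices made for illustration; only the 5% rate and the −0.23 decrement come from the text.

```python
def expected_ae_qaly_decrement(ae_rates, ae_disutilities):
    """Average one-time QALY decrement per treated patient.

    Each AE's disutility is weighted by the proportion of patients who
    experience it (assumption: decrements are additive across AE types).
    """
    return sum(rate * ae_disutilities[ae] for ae, rate in ae_rates.items())


# Values from the example in the text: 5% of patients receive a
# decrement of -0.23 (difference between health states A and D).
rates = {"severe CRS (health state D)": 0.05}
disutilities = {"severe CRS (health state D)": -0.23}

per_patient = expected_ae_qaly_decrement(rates, disutilities)
cohort_total = per_patient * 1000  # hypothetical cohort of 1000 patients
print(round(per_patient, 4), round(cohort_total, 2))
```

Averaged over the whole cohort, the decrement is roughly −0.0115 QALYs per patient, which shows why small differences in severe AE rates translate into modest but non-zero differences in a CUA.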

When interpreting and using the results of this study in a model, researchers should consider strengths and limitations of vignette-based methods. A limitation is that utility scores represent general population preferences for the health state descriptions rather than the experience of an actual patient sample. To maximize representativeness of the vignettes, the descriptions of AEs were based on clinicians’ reports of the typical patient experience of each AE, along with published literature and AE reports from a clinical trial. Still, the extent to which the resulting utilities might differ from values derived from patients is unknown. In addition, the vignettes did not account for the potential impact of treatment efficacy on disutility (i.e., impact of cure vs. disease progression on AE impact).

Despite these limitations, the vignette-based approach was useful for estimating disutilities of AEs associated with CAR T-cell therapy. Generic preference-based measures, such as the EQ-5D, were not designed to capture the specific impact of these AEs, which may include neurological symptoms, extended hospitalization, and physical debilitation. Most importantly, it would not be feasible to derive these utilities from patients because of the challenges accessing patients during intensive care unit hospitalization, ethical considerations of administering questionnaires to hospitalized patients with severe symptoms, and the impossibility of assessing sedated patients. Furthermore, the vignette ‘path state’ approach used in the study allows for estimation of a single disutility value representing a temporary health-related event that changes over time, such as CRS and NEs. For these reasons, vignette methodology was determined to be the most feasible approach for estimating disutilities of these AEs.

Concerns about using vignette-based utilities along with utilities from generic preference-based measures in the same model could be addressed in two steps. First, a base-case model could be run without the vignette utilities, allowing cost-effectiveness of CAR T-cell therapies to be considered irrespective of AE rates. Second, a sensitivity analysis with the vignette utilities could show how cost-effectiveness might be affected when considering the impact of AEs. Reviewers could interpret the findings of the sensitivity analysis with appropriate caution.

The 1-year time horizon used in this study also has strengths and limitations. Longer time horizons, such as 10 or 20 years, are not effective for differentiating among relatively brief temporary events, such as CRS and NEs, which can last less than 1 month and would be unlikely to affect the valuation of a 10-year time period. Therefore, a TTO elicitation task with a 10-year time horizon would not be sensitive to the disutility of these AEs, although they clearly have an impact on preference and quality of life. In contrast, the TTO task with the 1-year time horizon can detect a utility impact of each temporary AE. Another advantage of the 1-year time horizon is that the results are easy to interpret and use in a CUA because they can be applied as QALY decrements. The limitation of shorter time horizons is that comparability to utilities derived from tasks with longer time horizons is unclear. Previous research has shown that the TTO time horizon can impact results, including the magnitude of utility and differentiation between health states [32]. Researchers should consider this limitation when using these utilities in a model along with utilities derived from TTO elicitations with longer time horizons.

The complexity of the vignettes in the current study may have presented challenges for some participants. In a study such as this, it is essential that the vignettes are clear and comprehensible to the respondents. The current vignettes were longer and more complex than those used in many other utility valuations. Efforts were made to ensure participants understood these health states (e.g., pilot study assessing comprehension; face-to-face interviews in which respondents’ questions could be answered and seemingly illogical responses could be queried; inclusion of timelines at the bottom of each health state to illustrate the sequence of events), and the strategies appeared to be effective because the utilities followed logical patterns. Still, it is possible that some participants may not have fully comprehended every detail of the health states.

Another limitation may be generalizability of the data. Efforts were made to ensure that no demographic group was overrepresented relative to the UK general population with regard to age, ethnic/racial background, sex, and employment. However, the sample was centered in London and Edinburgh and generalizability to the broader UK population is uncertain.

5 Conclusion

In summary, the health state utilities estimated in this study may be useful in models examining and comparing the cost-effectiveness of CAR T-cell therapies for LBCL. The rates and severity of the AEs represented in the health states can differ across the CAR T-cell treatments. By incorporating disutility of these AEs, CUAs can more accurately and comprehensively model the differences among available treatments for LBCL.