Key Summary Points

Why carry out this study?

Tafasitamab (TAFA) in combination with lenalidomide (LEN) is approved for treatment of adults with relapsed or refractory diffuse large B-cell lymphoma (R/R DLBCL).

Given the lack of head-to-head studies comparing TAFA + LEN against comparator treatments, we conducted matching-adjusted indirect comparison analyses to generate relative efficacy estimates using data from the phase 2 L-MIND study and comparator studies, including those that assessed polatuzumab vedotin + bendamustine + rituximab (POLA + BR) or BR.

What was learned from the study?

TAFA + LEN was associated with significantly longer duration of response (p = 0.045) and long-term improved overall survival after 4 months of follow-up (p = 0.026) when compared to POLA + BR, and with significantly improved overall survival (p = 0.014), progression-free survival (p < 0.001), duration of response (p < 0.001), and complete response rate (p = 0.004) when compared with BR.

Treatment with TAFA + LEN was associated with improved clinical outcomes compared with standard rituximab-based treatments in R/R DLBCL.

The findings should inform discussions of therapeutic strategies for patients with R/R DLBCL, in which alternative treatment options are currently lacking.

Introduction

Diffuse large B-cell lymphoma (DLBCL) is the most common subtype of non-Hodgkin lymphoma (NHL), accounting for approximately 31% of NHL cases [1, 2]. The standard of care first-line treatment for DLBCL is the anti-CD20 monoclonal antibody rituximab combined with cyclophosphamide, doxorubicin, vincristine, and prednisone [3,4,5]. Second-line therapies for relapsed or refractory (R/R) DLBCL include platinum-based regimens, such as dexamethasone, cisplatin, and cytarabine ± rituximab, and ifosfamide, carboplatin, etoposide ± rituximab for patients intending to proceed to stem cell transplantation (SCT) [4, 5]. For patients not proceeding to SCT, second- and third-line treatment options include polatuzumab vedotin ± bendamustine ± rituximab (POLA + BR), bendamustine ± rituximab (BR), gemcitabine + oxaliplatin ± rituximab (R-GEMOX), rituximab ± lenalidomide, pixantrone monotherapy, and anti-CD19 therapies, such as the CAR T-cell therapy axicabtagene ciloleucel (Axi-cel) and the antibody-drug conjugate loncastuximab tesirine [4, 5].

Survival rates for patients with DLBCL, although relatively low, have improved in recent years. The 5-year relative survival rate from primary diagnosis was 55.4% during the period 2006–2008, according to the population-based European Cancer Registry (EUROCARE-5), whereas the United States Surveillance, Epidemiology and End Results population-based study reported a 5-year survival rate for DLBCL of 63.9% from 2011 to 2017 [6, 7]. Among patients receiving first-line treatment for DLBCL, 30–40% either relapse or are unable to achieve remission, outcomes that are associated with poor prognosis [8]. Median survival for patients experiencing primary or secondary refractory DLBCL ranges from 5 to 7 months [8, 9]. It should also be noted that, among patients receiving second- or third-line treatment for DLBCL, a significant proportion are ineligible for SCT due to their advanced age or being unfit for the procedure [10]. Taken together, these data point to a significant unmet need for safe and effective treatment options for SCT-ineligible R/R DLBCL [10].

Tafasitamab is an Fc-enhanced, humanized, anti-CD19 monoclonal antibody [11]. The CD19 molecule, expressed on B-lymphocytes and follicular dendritic cells, is present on tumor cells from most patients with NHL [12]. Tafasitamab was engineered to increase engagement with the Fcγ receptor on immune effector cells, and enhance antibody-dependent cellular cytotoxicity and phagocytosis [12]. The phase 2 L-MIND study (NCT02399085) assessed the efficacy and safety of tafasitamab plus lenalidomide (TAFA + LEN) in 80 adult patients with R/R DLBCL ineligible for SCT who had received ≥ 1, but no more than 3, prior lines of therapy, including ≥ 1 anti-CD20 therapy (e.g., rituximab). Objective response rate (ORR) was 57.5% with a duration of response (DOR) of 3.7 years. Median progression-free survival (PFS) was 11.6 months [95% confidence interval (CI) 6.3, 45.7] after a median follow-up of 33.9 months, whereas median overall survival (OS) was 33.5 months (95% CI 18.3, not reached) after a median follow-up of 42.7 months [13]. Based on the results of the L-MIND study, TAFA + LEN received accelerated approval in 2020 by the United States Food and Drug Administration (FDA) for the treatment of adults with R/R DLBCL not otherwise specified (including DLBCL arising from low-grade lymphoma) and ineligible for autologous SCT (ASCT) [11]. TAFA + LEN also received conditional approval from the European Medicines Agency for treatment of patients with R/R DLBCL who are ineligible for ASCT, and has received “orphan” designation in the European Union [14, 15].

Currently (December 2021), there are no head-to-head studies comparing TAFA + LEN against comparator treatments. Therefore, we conducted an indirect treatment comparison to generate relative efficacy estimates. Given the single-arm nature of L-MIND, network meta-analysis methodology was not feasible, so a matching-adjusted indirect comparison (MAIC) was conducted. MAIC is a methodology that balances study populations by matching the distributions of the baseline characteristics of a study for which patient-level data are available with the reported baseline characteristics of a comparator study by using statistical weights on the individual patient data (IPD) from the study of interest.
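For orientation, the weighting step described above can be written compactly. This is a sketch of the standard method-of-moments formulation commonly used for MAIC, not a statement of the exact model fitted in this analysis. The notation is introduced here for illustration only: x_i is the covariate vector of patient i in the IPD study, \bar{x}_{agg} is the vector of reported means in the comparator study, and \beta is the vector of weighting coefficients.

```latex
% Weight for patient i (covariates centered on the comparator's reported means)
w_i = \exp\!\left( (x_i - \bar{x}_{\mathrm{agg}})^{\top} \beta \right)

% \beta is chosen so that the weighted IPD means equal the comparator means:
\sum_{i=1}^{n} (x_i - \bar{x}_{\mathrm{agg}}) \,
  \exp\!\left( (x_i - \bar{x}_{\mathrm{agg}})^{\top} \beta \right) = 0
\quad\Longleftrightarrow\quad
\frac{\sum_{i} w_i \, x_i}{\sum_{i} w_i} = \bar{x}_{\mathrm{agg}}
```

The left-hand condition is the gradient of the convex function \sum_i \exp((x_i - \bar{x}_{agg})^{\top}\beta), so a solution, when one exists, can be found by standard convex minimization.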

The present study applies MAIC methodology to compare TAFA + LEN therapy to 3 rituximab-based regimens: POLA + BR, BR, and R-GEMOX, with efficacy outcomes—OS, PFS, DOR, ORR, and complete response rate (CRR)—assessed in the treatment of transplant-ineligible R/R DLBCL.

Methods

L-MIND Study Population

Anonymized IPD for patients with R/R DLBCL treated with TAFA + LEN were available from the L-MIND trial from an updated analysis after a median follow-up of ≥ 35 months (data cut-off October 2020) [13]. L-MIND was a multicenter, open-label, single-arm, phase 2 trial that evaluated the safety and efficacy of TAFA + LEN followed by TAFA monotherapy in adults with histologically confirmed DLBCL ineligible for SCT who had received 1–3 prior lines of therapy, including an anti-CD20 agent. Patients were co-administered intravenous TAFA (12 mg/kg) and oral LEN (25 mg/day) for up to 12 cycles (28 days each); patients with complete or partial response (CR or PR) or stable disease continued to receive TAFA monotherapy until disease progression [16]. The efficacy analysis set from the L-MIND study was used for comparison and included 80 patients, of whom 50% received TAFA + LEN as second-line therapy, while 50% received the combination as third- (n = 34), fourth- (n = 5), or fifth-line (n = 1) therapy. Of these 80 patients, 19% (n = 15) were primary refractory (i.e., no response to, or progression during or within 6 months of, frontline therapy) and 44% (n = 35) were refractory to their last prior therapy line.

Selection of Comparator Treatments for MAIC

The comparators for TAFA + LEN in transplant-ineligible patients with R/R DLBCL were selected based on treatments described in the National Comprehensive Cancer Network and European Society for Medical Oncology guidelines as regimens administered in routine clinical care [5, 17]. Regimens identified as potential comparators for TAFA + LEN were POLA + BR, BR, R-GEMOX, Axi-cel, tisagenlecleucel, rituximab + lenalidomide (R + LEN), LEN monotherapy, and pixantrone monotherapy.

A systematic literature review (SLR) was conducted using broad criteria to identify clinical trials that evaluated the efficacy and safety of interventions for patients with R/R DLBCL. The SLR included studies published from 2011 and abstracts published from 2016, up to the date of the search (February 3–7, 2021). Data sources used in the SLR included PubMed, EMBASE, the Cochrane Library, Health Technology Assessment websites, and conference proceedings. The SLR was conducted and reported in accordance with the requirements of the National Institute for Health and Care Excellence (NICE), the Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen, and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines [18,19,20]. Further details on the SLR search strategy are provided in the Supplementary Methods and Table S1.

Each of the studies identified by the SLR was assessed based on prespecified criteria for inclusion in the MAIC. First, studies were reviewed to assess whether the eligibility criteria were comparable with those of the L-MIND study. Opinion from two clinical experts was used to guide this process; of note, retrospective studies were not considered as suitable candidates for a comparison against TAFA + LEN as investigated in the L-MIND study. This initial step also assessed whether outcomes reported in the L-MIND and comparator studies were similarly defined. For studies with multiple populations, this review step also allowed the most appropriate comparator population to be determined for comparison against the L-MIND study. Finally, it also identified whether any filtering of the L-MIND population would be required to improve the overlap of the L-MIND study population and the comparator study. Although eight regimens had been initially identified as relevant, this manuscript focuses only on rituximab-containing comparators for which ≥ 1 study eligible for inclusion in the MAIC was identified: POLA + BR, BR, and R-GEMOX.

MAIC Methodology

In order to minimize the risk of bias due to differences in baseline demographic and disease characteristics between the populations enrolled in L-MIND and comparator studies, a MAIC was conducted according to the methods described by Signorovitch et al. [21] and following the guidelines of NICE [22].

Approach to Matching Trial Populations

IPD from the L-MIND study were adjusted to match the average baseline characteristics for the relevant treatment arm of the comparator trials or studies. Individual patients in L-MIND were assigned weights, which were estimated through propensity-score-like regressions using selected covariates and used to calculate the final effective sample size of the matched index population. Successful population adjustment with an appropriate set of covariates ensures that the patient populations across studies are comparable for the set of factors included in the adjustment. Anchored comparisons were not feasible because L-MIND is a single-arm study; therefore, per NICE guidelines for unanchored MAICs, all known and available prognostic factors and effect modifiers should be included in the population adjustment. However, matching on a large list of factors can result in a low effective sample size, making comparisons technically infeasible or inferences unreliable. Therefore, the matching strategy aimed to preserve an effective sample size of at least 20% (i.e., ≥ 16 patients) of the original L-MIND treated population size (N = 80), while adjusting for as many prognostic factors and effect modifiers as possible.
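The weighting and effective-sample-size calculation can be illustrated with a minimal Python sketch. It assumes a single matching covariate and uses made-up ages, not L-MIND data; the function names `maic_weights` and `effective_sample_size` are hypothetical. The analysis reported here used R and several covariates, so this shows only the general idea.

```python
import math

def maic_weights(x, target_mean, tol=1e-10, max_iter=100):
    """Method-of-moments MAIC weights for ONE covariate (illustration only)."""
    # Center the IPD covariate on the mean reported by the comparator study.
    xc = [xi - target_mean for xi in x]
    a = 0.0  # a = 0 gives every patient weight 1 (no adjustment)
    for _ in range(max_iter):
        e = [math.exp(a * c) for c in xc]
        grad = sum(c * ei for c, ei in zip(xc, e))      # d/da of sum(exp(a * xc))
        hess = sum(c * c * ei for c, ei in zip(xc, e))  # positive: convex objective
        step = grad / hess                              # Newton step
        a -= step
        if abs(step) < tol:
            break
    return [math.exp(a * c) for c in xc]

def effective_sample_size(w):
    """ESS = (sum of weights)^2 / (sum of squared weights)."""
    return sum(w) ** 2 / sum(wi * wi for wi in w)

# Hypothetical ages for 10 index-trial patients (NOT L-MIND data).
ages = [52, 58, 61, 64, 67, 70, 72, 75, 78, 81]
target = 71.0  # hypothetical mean age reported by a comparator study

w = maic_weights(ages, target)
weighted_mean = sum(wi * xi for wi, xi in zip(w, ages)) / sum(w)
ess = effective_sample_size(w)
keeps_20_percent = ess >= 0.2 * len(ages)  # analogue of the >= 20% rule
print(round(weighted_mean, 3), round(ess, 2), keeps_20_percent)
```

After weighting, the mean age of the index population equals the comparator mean, and the effective sample size falls below the original sample size. The more the weights vary across patients, the larger this loss.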

A list of all possible prognostic factors and effect modifiers was generated, based on findings reported in published studies and on clinical expert opinion. Three scenarios were considered when performing matching: Scenario 1: adjusting for all mutually available and similarly defined baseline characteristics among the L-MIND and comparator target studies; Scenario 2: adjusting for all clinically relevant mutually available prognostic factors and effect modifiers, as identified from the SLR and by clinical experts (Table 1); and Scenario 3: adjusting for the following clinical expert-selected list of prognostic factors and effect modifiers, prioritized as being most relevant for DLBCL: age, Eastern Cooperative Oncology Group performance status score, International Prognostic Index score, refractoriness of patients (primary refractoriness or refractoriness to prior lines of therapy), number of prior treatment lines, prior SCT, and cell type of origin of the disease. Several matching models were undertaken for each MAIC for each of the above scenarios, and a base case model was selected based on model convergence and the effective sample size (≥ 16) retained for the comparison; ≥ 1 matching model was also selected for sensitivity analyses.

Table 1 List of prognostic factors and effect modifiers in DLBCL identified by clinical experts (Scenario 2)

Measuring Relative Treatment Effects

For time-to-event outcomes (OS, PFS, DOR), the relative efficacy estimates were quantified as a hazard ratio (HR) with a 95% CI. HRs were obtained using a Cox regression analysis fitted on the weighted L-MIND data and the reconstructed IPD of the comparator study used in the matching. Reconstructed IPD were generated from the Kaplan–Meier plots in the comparator publications using the algorithm published by Guyot et al. [23]. The assumption of proportional hazards (PH) was tested using visual assessment tools, including log-cumulative hazard plots and Schoenfeld residuals plots, and using the Grambsch and Therneau test with a p value threshold of 0.05. When visual assessment or analytic tests (see Supplementary Materials) provided evidence of a deviation from the PH assumption, time-dependent hazard ratios were calculated. Due to sample size limitations, whenever possible, time-dependency was implemented by splitting the follow-up time into two intervals and calculating a single constant HR within each interval. For binary outcomes (ORR, CRR), the relative efficacy estimates were quantified as an odds ratio (OR) with a 95% CI. The OR was obtained using logistic regression analysis. MAIC analysis results were deemed statistically significant at a threshold of p < 0.05.
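The two-interval approach to time-dependent HRs relies on splitting each patient's follow-up at a chosen cut point. The Python sketch below shows one common way to do this. It is an assumption for illustration, not a description of the code used in this analysis; the function name `split_follow_up` and the records are hypothetical.

```python
def split_follow_up(records, cut):
    """Split (time, event) records at `cut` into two sets of episodes.

    Returns (early, late):
      early: (duration, event) on [0, cut]; patients followed beyond `cut`
             are censored at `cut`
      late:  (start, stop, event) for patients still at risk after `cut`
    """
    early, late = [], []
    for time, event in records:
        if time <= cut:
            # Event or censoring happened in the first interval.
            early.append((time, event))
        else:
            # Patient survives the first interval event-free ...
            early.append((cut, 0))
            # ... and contributes delayed-entry follow-up to the second one.
            late.append((cut, time, event))
    return early, late

# Hypothetical (months, event) records; event = 1 means the event occurred.
records = [(2.0, 1), (3.5, 0), (4.0, 1), (6.0, 1), (12.0, 0), (30.0, 1)]
early, late = split_follow_up(records, cut=4.0)
print(early)
print(late)
```

A separate Cox model fitted to each set of episodes then yields one constant HR per interval. Every event is counted exactly once, in the interval where it occurred.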

The robust sandwich estimator was used for the calculation of the standard errors. The regression models were fitted using the weighted L-MIND population and the unweighted L-MIND data versus comparator data to estimate the reduction in the bias induced by the population adjustment. Data analyses were conducted using the R 3.6.1 packages survival (v.3.2-3) [24, 25], metafor (v.2.4-0) [26], and sandwich (v.2.5-1) [27, 28], and the R code provided by the guidelines of NICE [22].

Pooled Analyses

Where multiple clinical trials were identified for one comparator, pooled estimates of relative efficacy (HR and OR) were obtained using frequentist meta-analyses. These direct meta-analyses were conducted by pooling the results from multiple MAICs (i.e., estimates of mean treatment effects on the ln scale between TAFA + LEN and BR on PFS, ORR, DOR, and CRR) to obtain the estimate for the comparison and its standard error. Random-effects [29] and fixed-effects models were implemented, and best fitting models were used in the primary analyses.
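The pooling step can be sketched in Python as inverse-variance weighting on the ln scale. The example recovers each standard error from a reported 95% CI and takes as inputs the three study-level CRR odds ratios versus BR given in the Results. The DerSimonian-Laird between-study variance is an assumption made for illustration, since the specific random-effects estimator is not restated here, and the names `log_est_and_se` and `pool` are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile

def log_est_and_se(est, lo, hi):
    """ln(estimate) and its SE, recovered from a ratio and its 95% CI."""
    return math.log(est), (math.log(hi) - math.log(lo)) / (2 * Z95)

def pool(estimates):
    """Inverse-variance pooling on the ln scale.

    Uses a DerSimonian-Laird tau^2 (an assumption); when tau^2 = 0 the
    result is identical to the fixed-effects estimate.
    """
    ys, ses = zip(*(log_est_and_se(*e) for e in estimates))
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    # Cochran's Q and the DerSimonian-Laird between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(ys) - 1)) / c)
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    mu = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    p = math.erfc(abs(mu / se) / math.sqrt(2))  # two-sided normal p value
    return {"est": math.exp(mu), "lo": math.exp(mu - Z95 * se),
            "hi": math.exp(mu + Z95 * se), "p": p, "tau2": tau2}

# (OR, lower, upper) for CRR versus BR: GO29365, Vacirca et al., Ohmachi et al.
crr = [(2.36, 0.68, 8.21), (3.36, 1.40, 8.07), (1.51, 0.51, 4.46)]
result = pool(crr)
print({k: round(v, 3) for k, v in result.items()})
```

With these inputs the estimated between-study variance is zero, so the random-effects and fixed-effects results coincide. The pooled odds ratio comes out close to the pooled CRR estimate reported in the Results, with small differences expected from rounding of the published CIs.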

Compliance with Ethics Guidelines

This is a post hoc analysis and modeling of data already collected (L-MIND; NCT02399085) and/or data reported in peer-reviewed publications. The L-MIND study was approved by the institutional review boards at each study site and was carried out in accordance with the International Conference on Harmonisation good clinical practice guidelines and the Declaration of Helsinki. All patients provided written informed consent.

Results

After removal of duplicates, the SLR identified 7616 titles and abstracts that evaluated pharmacologic interventions for R/R DLBCL. After screening and assessment for eligibility, there were 76 citations covering 35 unique studies reporting data on the comparators of interest (Fig. S1; Table S2). Of the 35 studies for comparators identified during the SLR, a total of 4 comparator studies for 3 rituximab-based regimens (POLA + BR, BR, and R-GEMOX) met the criteria for inclusion. An overview of the study design, patient population and study endpoints from L-MIND and comparator studies is shown in Table 2.

Table 2 Characteristics of included studies

MAIC of TAFA + LEN Versus POLA + BR

The comparator in this MAIC was the POLA + BR arm of the phase 2 GO29365 trial. Patients in GO29365 received study treatments (POLA + BR or BR) for up to 6 × 21-day cycles and the primary endpoint was CRR, whereas patients in L-MIND received up to 12 × 28-day cycles and the study had a primary endpoint of ORR [16, 30].

PFS outcomes [per independent review committee (IRC)] for the GO29365 trial were obtained from the FDA submission dossier for POLA [31], rather than the primary publication [30]. The FDA source was used as it censored PFS records of patients who received a subsequent anticancer treatment without a recorded progression event at the time of the last progression assessment; this censoring rule was similar to that used in the L-MIND study. For the base case analysis, baseline characteristics before and after population adjustment compared with the POLA + BR arm of the GO29365 trial are summarized in Table 3. Two alternative MAIC models were implemented by changing the list of factors included in the population adjustment and used to assess the sensitivity of results to the matching model. The baseline characteristics in the sensitivity analysis were similar to the base case (Table S3).

Table 3 Baseline characteristics of the L-MIND study, the weighted L-MIND population, and the GO29365 trial for POLA + BR

Comparative Efficacy Analysis for TAFA + LEN Versus POLA + BR

After adjustment, TAFA + LEN was associated with a significantly longer DOR compared with POLA + BR (HR 0.34, 95% CI 0.12, 0.98; p = 0.045) (Table 4; Fig. 1). The assumption of proportional hazards in the analyses of OS and PFS was not satisfied (Fig. S2), and a piecewise constant analysis of the HRs was therefore performed. A 4-month split was chosen based on visual assessment of the cumulative hazard plot. Statistically significant differences were observed in population characteristics between L-MIND patients dying within 4 months and patients still alive after > 4 months of follow-up (Table S4).

Table 4 Relative efficacy estimates for observed and weighted TAFA + LEN versus POLA + BR

Fig. 1 Observed and weighted efficacy results for TAFA + LEN versus POLA + BR. a KM estimates of OS, b KM estimates of PFS by IRC, c KM estimates of DOR by IRC, and d depth of responses by IRC of patients enrolled in the L-MIND study before and after the population matching and reported for POLA + BR. BR bendamustine + rituximab, DOR duration of response, IRC independent review committee, KM Kaplan–Meier, LEN lenalidomide, MAIC matching-adjusted indirect comparison, OS overall survival, PFS progression-free survival, POLA polatuzumab vedotin, TAFA tafasitamab

A significant difference in OS favoring TAFA + LEN was observed after 4 months of follow-up (HR 0.41, 95% CI 0.19, 0.90; p = 0.026), while the OS HR from start of therapy to month 4 was 1.82 (95% CI 0.58, 5.65; p = 0.302). For PFS, there was no significant difference after 4 months of follow-up (HR 0.39, 95% CI 0.14, 1.06; p = 0.065) or from start of therapy to month 4 (HR 1.42, 95% CI 0.65, 3.09; p = 0.376), although there was a numerical advantage favoring TAFA + LEN over POLA + BR after 4 months. To explore sensitivity to the choice of split point, the OS and PFS analyses versus POLA + BR were repeated with split points at 3, 9, and 11 months (Table S5); the results were aligned with those obtained using the 4-month split point.

A nonsignificant numerical advantage favored POLA + BR over TAFA + LEN for ORR (OR 0.68, 95% CI 0.25, 1.86; p = 0.450) and CRR (OR 0.74, 95% CI 0.27, 2.07; p = 0.571).

In the sensitivity analyses based on two alternative matching models, all outcome differences were qualitatively similar to the base case but were not statistically significant (Table S6). For PFS, a sensitivity analysis based on the PFS data reported by Sehn et al. was performed and findings were consistent with the base case (Table S7).

MAIC of TAFA + LEN Versus BR

Three comparator studies were included to inform the MAIC analysis of TAFA + LEN versus BR: (1) the BR arm of the phase 2 GO29365 trial (with PFS data from the FDA submission dossier for POLA [31] rather than the primary publication [30], as described above); (2) a phase 2 study by Vacirca et al. [32]; and (3) a phase 2 study by Ohmachi et al. [33]. In Vacirca et al., OS was not reached, whereas Ohmachi et al. assessed neither OS nor DOR. Thus, these studies were included only for the outcomes they reported.

As noted above, the GO29365 trial differed from L-MIND regarding the duration of treatment and primary endpoint. Like L-MIND, the Vacirca and Ohmachi studies were both single-arm studies with ORR as the primary endpoint, but whereas L-MIND employed a maximum of 12 × 28-day treatment cycles, the Vacirca study had a maximum of 6 × 28-day cycles, and the Ohmachi study had a maximum of 6 × 21-day cycles.

Demographic and clinical factors included in the MAIC analyses of TAFA + LEN versus BR in the GO29365 trial, and the Vacirca and Ohmachi studies, along with changes before and after the population-adjustment are presented in Table 5.

Table 5 Baseline characteristics of the L-MIND study, the weighted L-MIND population, and the GO29365 trial, Vacirca et al. 2014, and Ohmachi et al. 2013 for BR

Sensitivity analyses using alternative model choices were not performed for the comparisons using the GO29365 trial as a source of evidence, owing to the small effective sample size achieved and to concerns raised when assessing alternative distributions of the MAIC weights in the L-MIND population, in which a few patients were given extreme weights. Sensitivity models were applied to the Vacirca and Ohmachi studies by changing the list of factors included in the population adjustment; baseline characteristics were similar to the base case (Table S8).

Comparative Efficacy Analysis for TAFA + LEN Versus BR

Compared with BR from the GO29365 trial, TAFA + LEN was associated with significantly improved OS (HR 0.39, 95% CI 0.18, 0.82; p = 0.014), PFS (HR 0.35, 95% CI 0.18, 0.71; p = 0.003), DOR (HR 0.15, 95% CI 0.05, 0.51; p = 0.002), and ORR (OR 3.40, 95% CI 1.05, 11.02; p = 0.041). A numerical advantage favoring TAFA + LEN was observed for CRR (OR 2.36, 95% CI 0.68, 8.21; p = 0.177) (Table S9; Fig. 2). For PFS, a sensitivity analysis based on the PFS data reported by Sehn et al. was performed and produced results in line with the base case (Table S7).

Fig. 2 Observed and weighted efficacy results for TAFA + LEN versus BR from the GO29365 trial. a KM estimates of OS, b KM estimates of PFS by IRC, c KM estimates of DOR by IRC, and d depth of responses by IRC of patients enrolled in the L-MIND study before and after the population matching and reported for BR. BR bendamustine + rituximab, CR complete response, DOR duration of response, IRC independent review committee, KM Kaplan–Meier, LEN lenalidomide, MAIC matching-adjusted indirect comparison, ORR overall response rate, OS overall survival, PFS progression-free survival, PR partial response, TAFA tafasitamab

In the comparison to BR using data from the Vacirca study, TAFA + LEN was associated with significantly improved PFS (HR 0.35, 95% CI 0.24, 0.52; p < 0.001) and CRR (OR 3.36, 95% CI 1.40, 8.07; p = 0.007), as well as numerically improved ORR (OR 1.48, 95% CI 0.72, 3.03; p = 0.281) (Table S9; Fig. 3). When compared with BR using the Ohmachi study, TAFA + LEN was associated with numerically improved PFS (HR 0.59, 95% CI 0.31, 1.15; p = 0.122) and CRR (OR 1.51, 95% CI 0.51, 4.46; p = 0.459), while ORR outcomes were similar to those with BR (OR 1.00, 95% CI 0.35, 2.85; p = 0.995) (Table S9; Fig. 4).

Fig. 3 Observed and weighted efficacy results for TAFA + LEN versus BR from the Vacirca et al. (2014) study. a KM estimates of PFS by IRC, b KM estimates of DOR by IRC, and c depth of responses by IRC of patients enrolled in the L-MIND study before and after the population matching and reported for BR. BR bendamustine + rituximab, CR complete response, DOR duration of response, IRC independent review committee, KM Kaplan–Meier, LEN lenalidomide, MAIC matching-adjusted indirect comparison, ORR overall response rate, PFS progression-free survival, PR partial response, TAFA tafasitamab

Fig. 4 Observed and weighted efficacy results for TAFA + LEN versus BR from the Ohmachi et al. (2013) study. a KM estimates of PFS by IRC, and b depth of responses by IRC of patients enrolled in the L-MIND study before and after the population matching and reported for BR. BR bendamustine + rituximab, CR complete response, DOR duration of response, IRC independent review committee, KM Kaplan–Meier, LEN lenalidomide, ORR overall response rate, PFS progression-free survival, PR partial response, TAFA tafasitamab

The findings of the sensitivity analysis for the Vacirca study were similar overall to the base case model. In the sensitivity analyses for the Ohmachi study, PFS and CRR were numerically but not significantly better with TAFA + LEN, whereas ORR was numerically but not significantly better with BR (Table S10).

Comparative Efficacy Analysis for TAFA + LEN Versus BR: Pooled Efficacy Data

In the pooled analysis with BR, combining GO29365, Vacirca et al. (2014) and Ohmachi et al. (2013) studies, TAFA + LEN was associated with significantly improved PFS (HR 0.39, 95% CI 0.29, 0.53; p < 0.001), DOR (HR 0.35, 95% CI 0.25, 0.50; p < 0.001) and CRR (OR 2.43, 95% CI 1.33, 4.41; p = 0.004) (Table 6). A numerical advantage in favor of TAFA + LEN was noted with ORR (OR 1.59, 95% CI 0.94, 2.69; p = 0.086).

Table 6 Pooled relative efficacy estimates for observed and weighted TAFA + LEN versus BR

MAIC of TAFA + LEN Versus R-GEMOX

The comparator study for R-GEMOX (Mounier et al.) included rituximab-naïve patients, which severely limited its comparability with the TAFA + LEN study population in L-MIND that only included patients previously exposed to an anti-CD20 agent. However, a MAIC versus R-GEMOX was attempted, and the full results are reported in the supplementary material (Supplementary Results; Figs. S3 and S4; Tables S11 and S12).

Comparative Efficacy Analysis for TAFA + LEN Versus R-GEMOX

Briefly, following adjustment, all outcomes showed a numerical advantage in favor of TAFA + LEN, although none reached statistical significance: OS, HR 0.55, 95% CI 0.28, 1.06; p = 0.073; PFS, HR 0.59, 95% CI 0.30, 1.17; p = 0.133; ORR, OR 1.42, 95% CI 0.46, 4.38; p = 0.543; CRR, OR 1.09, 95% CI 0.34, 3.54; p = 0.882.

Discussion

The present study reports analyses of the comparative effectiveness of TAFA + LEN versus current rituximab-based treatments for patients with R/R DLBCL using indirect comparisons based on MAIC methodology. MAICs are particularly useful in the context of single-arm trials, such as the L-MIND study used in this analysis, where it is not possible to “anchor” the comparison with a common comparator arm and, therefore, adjustments based on prognostic and effect-modifying variables are required to increase comparability with published studies of alternative therapies [34].

The findings of this analysis indicate that TAFA + LEN is likely to offer clinically meaningful improvements in the evaluated treatment outcomes compared with current rituximab-based treatments for R/R DLBCL. The added benefit of TAFA + LEN estimated by the MAIC is overall consistent with, although slightly lower than, the added benefit estimated by recent retrospective cohort studies [35, 36].

This MAIC study followed as closely as possible the guidelines of the NICE Decision Support Unit, which advises matching using all known and available prognostic factors and effect modifiers when unanchored MAICs are performed [22]. However, because of the differences observed between the L-MIND and comparator studies, matching models could not adjust for all prognostic factors and effect modifiers. Consequently, there is a potential for bias from unobserved confounders among the variables that were not reported in comparator studies, as these may have affected the relative efficacy estimates. Whenever possible, sensitivity analyses were employed to assess the effect of using alternative selections of prognostic factors and effect modifiers and to test the robustness of base case results.

This study is subject to limitations that should be considered when interpreting the results. A MAIC can only adjust for observed differences in baseline characteristics between the included populations. Hence, bias in the analysis results may be introduced by differences in the design of the trials (e.g., use of co-therapies, patient monitoring, and assessment schedule).

In addition, some differences were observed in the measurement of outcomes and characteristics between the L-MIND and comparator studies. Different versions of the International Working Group (IWG) response criteria were employed by investigators to assess disease response in some of the studies. Therefore, it is possible that the reported surrogate outcomes may not be entirely comparable in the analyses of L-MIND (IWG 2007) versus the GO29365 trial (Modified Lugano 2014) and the Mounier et al. (IWG 1999) study. Moreover, definitions of PFS, and of the censoring rules used in the analyses of PFS, were not explicitly stated in most comparator publications, and, as a result, the comparison of PFS may be subject to limitations.

Furthermore, since DOR was calculated after balancing the baseline characteristics of the efficacy populations of the L-MIND study and of the comparator publications (GO29365; Vacirca et al.), the interpretation of DOR should consider that the baseline characteristics of the responder populations may differ between the L-MIND and comparator studies, as they could indeed differ between the two arms of a randomized controlled trial. In fact, since response depends on treatment efficacy, and the selection of patients who achieve a response is a function of the treatment’s mechanism of action and efficacy, a population adjustment made by balancing the characteristics of the responder populations would not be appropriate, as it would disregard the fact that, given a similar population at baseline, two treatments with different mechanisms of action will likely lead to a responder population with different characteristics.

In the absence of detailed definitions for certain key baseline characteristics, it was assumed that the definitions used in the L-MIND and comparator studies were similar. In the case of the comparison with the Vacirca study (i.e., TAFA + LEN vs. BR), patients with non-DLBCL NHL from the L-MIND study were included in the analysis, as the Vacirca study did not specifically exclude these patients. In the Ohmachi and Mounier studies, the definition of patients’ refractoriness was not available, and it was assumed to be comparable with the one used in the L-MIND study.

In the comparisons of TAFA + LEN versus BR using the GO29365 trial and the Ohmachi study, and in the comparison versus R-GEMOX, the effective sample size achieved after the population adjustment was substantially smaller (reduced by 74–75%) compared with the L-MIND original sample size. Such low effective sample size numbers can reduce the power with which inference can be made. Moreover, in certain comparisons, a few patients were assigned large weights to balance the population, thus increasing the sensitivity of the results to the records of a few patients.

The results of MAIC comparisons between TAFA + LEN and BR showed that added benefit was smaller when L-MIND was compared to Ohmachi et al. than with the GO29365 trial or the Vacirca study. Such heterogeneity in the estimates of relative effectiveness could have been caused by potential differences in treatment-free interval of patients at baseline, but these differences could not be adjusted for in the population adjustment. Alternatively, the exclusion of patients with less than 3 months’ life expectancy in the Ohmachi study, in contrast to L-MIND and the other BR studies, might have played a role in the observed differences, i.e., frailer patients were not candidates for inclusion in the Ohmachi study, and this may explain the superior outcomes observed here.

Conclusions

The MAICs performed in the present study showed that treatment with TAFA + LEN in patients with R/R DLBCL was associated with statistically significant and clinically meaningful improvements in survival outcomes compared to treatment with common alternatives, such as BR or POLA + BR. The improved clinical benefit of TAFA + LEN over existing rituximab-based therapies contributes to the body of evidence needed to inform discussions with regulatory and national health technology assessment authorities, and to identify the most appropriate place of TAFA + LEN within therapeutic strategies for patients with R/R DLBCL, in which alternative treatment options are currently limited. It should also be noted that the results of MAICs must be interpreted with caution, owing to known methodological limitations of unanchored comparisons, and should be confirmed by large-sample randomized controlled trials.