Introduction

Rituximab in combination with fludarabine and cyclophosphamide (RFC) is currently the standard of care for medically fit patients with previously untreated chronic lymphocytic leukemia (CLL) [1–3]. However, many patients with CLL are in their 70s or older before they need to start treatment, and are likely to have a greater co-morbidity burden [4]. For this often medically unfit population, RFC is unsuitable, with data from several clinical studies suggesting that the regimen is associated with excessive toxicity (e.g., cytopenias and increased infection rates) relative to the remission rates achieved [5, 6]. Other therapeutic options are available for unfit patients with CLL and include chlorambucil (Clb) in combination with an anti-CD20 antibody such as rituximab (R-Clb), obinutuzumab (G-Clb) or ofatumumab (O-Clb), rituximab in combination with bendamustine (R-Benda), dose-reduced fludarabine with cyclophosphamide (FC), and a dose-modified RFC regimen (RFC-Lite) [2–4, 7–10]. Available data from randomized controlled trials (RCTs) suggest an improvement in efficacy with certain regimens in this setting. For example, significant improvements in progression-free survival (PFS) have been reported in unfit patients with CLL with R-Benda compared with R-Clb in the MaBLe study (median PFS 39.6 versus 29.9 months; hazard ratio [HR] 0.523; 95% confidence interval [CI] 0.339–0.806; P = 0.003) [10], and with G-Clb compared with the equivalent rituximab regimen in the CLL11 study (29.2 versus 15.4 months; HR 0.40; 95% CI 0.33–0.50; P < 0.001) [11]; however, with a limited number of head-to-head treatment comparisons available, the optimal treatment for unfit patients with CLL remains unclear.

Network meta-analysis (NMA) allows information from direct head-to-head studies to be combined with information from indirect treatment comparisons to enable estimation of the comparative efficacy of therapies and build a hierarchy of available treatments [12, 13]. Furthermore, the outputs of NMA-based comparative effectiveness research can be used in a full economic appraisal of competing interventions to assess their cost-effectiveness [14]. The usefulness of NMA has been demonstrated across a range of therapeutic interventions and disease areas including CLL [15–20]. Naïve comparison of drug treatments based on data from different studies carries a risk of drawing incorrect conclusions; by assuming the constancy of relative treatment effects (odds ratios or HRs) to link studies, NMA minimizes this risk. We, therefore, conducted a systematic review and Bayesian NMA of data from all RCTs comparing at least two interventional treatments in patients receiving first-line treatment for CLL and/or who were not eligible to receive full-dose fludarabine (F) to determine the relative effects of treatments on PFS and overall survival (OS).

Methods

Data Source and Searches

We conducted an initial literature search plus two updates of five databases (Embase®, Medline®, Medline® In-Process, the Cochrane Database of Systematic Reviews [CDSR], and the Database of Abstracts of Reviews of Effects [DARE]) to identify RCTs investigating first-line treatment in CLL published between January 1992 and August 2015. Search terms included the names of drugs used as first-line treatment for CLL combined with the Medical Subject Heading (MeSH) term ‘Leukemia, lymphocytic, chronic, B-cell’ (see supplementary material, section A for the search strategies used). In addition, abstracts from the proceedings of selected conferences (American Society of Clinical Oncology, American Society of Hematology, European Hematology Association) held between March 2010 and August 2015 were hand-searched. ClinicalTrials.gov (US National Institutes of Health), the World Health Organization’s (WHO) meta-registry, the International Clinical Trials Registry Platform Search Portal, and the Cochrane Central Register of Controlled Trials were searched for trials in progress. Bibliographic searching of included trials and systematic reviews was also performed. Language of publication was restricted to English.

Study Selection

The systematic literature search and screening process for trials are described in section B of the supplementary material. Two independent reviewers screened the titles and abstracts of all identified studies to produce a list of potentially relevant studies. They then performed a detailed screening of the full-text versions of these studies to identify the final list of studies for consideration in the analysis. When reviewing the full text, the objective was to identify whether the study inclusion criteria allowed for the enrollment of patients with co-morbidities, such as renal impairment. In addition, during the same full-text review, the baseline characteristics of patients included in the studies were reviewed to identify (if any) the level of co-morbidities. Any discrepancies in the decisions of the two reviewers were resolved by a third independent reviewer.

Studies selected for inclusion in the NMA were RCTs comparing first-line treatments in patients with CLL. Included studies also had to be linked via common treatment comparators, had to report PFS and/or OS data, and had to meet at least one of five ‘co-morbidity’ inclusion criteria, listed in descending order of priority: a median Cumulative Illness Rating Scale (CIRS) score of >6, median creatinine clearance ≤70 mL/min, existing co-morbidities (particularly relating to renal impairment), median age ≥70 years, and/or no full-dose F in the comparator arm. These five criteria served as the basis for the identification of publications in the literature that may have evaluated first-line treatments for co-morbid/unfit patients with CLL. A final manual review of all the studies that met at least one of these five criteria was performed and then validated by external experts in the field of CLL, to confirm whether the identified studies did, in fact, evaluate first-line therapeutic options for co-morbid/unfit and/or full-dose F-ineligible patients with CLL.

Data Extraction and Statistical Analysis

For each selected trial, unadjusted HRs for PFS and OS were extracted into a pre-defined extraction grid to ensure that the data were extracted uniformly and were comparable across studies. Data were independently extracted by two analysts, with their results checked and reconciled by a third, independent analyst. If HR data were not reported, HRs were estimated using a method appropriate for the available published PFS/OS statistic. For example, if the median PFS/OS was reported, an exponential distribution was assumed and the HR was estimated as the ratio of the median PFS/OS times for the two treatment arms [21]. In papers where landmark PFS/OS data were reported (e.g., 3-year PFS), proportionality of hazard functions was assumed and the HR was calculated as $\mathrm{HR} = \ln S_1(t) / \ln S_0(t)$, where $S_0(t)$ and $S_1(t)$ denote the survival estimates at time $t$ for the control and treatment arm, respectively. HRs were not reported, and therefore had to be estimated, in only two of the trials.
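The two fallback estimators described above can be illustrated with a minimal Python sketch. The function names and the numerical inputs are hypothetical and are not taken from any of the included trials. The direction of the median ratio (control median divided by treatment median) is an assumption that follows from the exponential model, in which the hazard equals ln 2 divided by the median.

```python
import math


def hr_from_medians(median_treatment, median_control):
    """HR (treatment vs control) under exponential survival.

    With an exponential distribution the hazard is ln(2) / median,
    so the hazard ratio reduces to median_control / median_treatment.
    """
    return median_control / median_treatment


def hr_from_landmark(surv_treatment, surv_control):
    """HR (treatment vs control) from survival proportions at one time point.

    Under proportional hazards S1(t) = S0(t) ** HR, hence
    HR = ln S1(t) / ln S0(t).
    """
    return math.log(surv_treatment) / math.log(surv_control)


# Hypothetical inputs, for illustration only.
hr_median = hr_from_medians(median_treatment=30.0, median_control=15.0)
hr_landmark = hr_from_landmark(surv_treatment=0.60, surv_control=0.40)

print(f"HR from medians:  {hr_median:.3f}")
print(f"HR from landmark: {hr_landmark:.3f}")
```

Both estimators depend on strong distributional assumptions, which is why estimates obtained this way may differ from an HR derived from the full Kaplan–Meier curves.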

The NMA presented in this manuscript was based on the natural logarithms of the HRs (lnHR) and their standard deviations (SDs). Published CIs or log-rank P values (in the two cases where HR values were not reported) were used to estimate SDs for lnHRs [22].
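As an illustration of how an SD for lnHR can be recovered from a published 95% CI, the sketch below assumes that the CI is symmetric on the log scale and was built with a normal quantile of 1.96. The function name is hypothetical. The inputs are the MaBLe PFS values quoted in the Introduction, and the result is not claimed to match the value used in Table 2.

```python
import math


def ln_hr_and_sd_from_ci(hr, lower, upper, z=1.96):
    """Return (lnHR, SD of lnHR) from a hazard ratio and its 95% CI.

    Assumes the interval is lnHR +/- z * SD, so its width on the
    log scale equals 2 * z * SD.
    """
    ln_hr = math.log(hr)
    sd = (math.log(upper) - math.log(lower)) / (2 * z)
    return ln_hr, sd


# MaBLe, R-Benda versus R-Clb, PFS (values quoted in the Introduction).
ln_hr, sd = ln_hr_and_sd_from_ci(hr=0.523, lower=0.339, upper=0.806)
print(f"lnHR = {ln_hr:.3f}, SD = {sd:.3f}")
```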

Network Meta-Analysis

The NMA was conducted using a hierarchical, contrast-based model where $\ln\mathrm{HR}_i$ of trial $i$ follows a normal distribution centered at the (unknown) treatment effect with an SD equal to $\mathrm{SD}_i$. To deal with three-arm trials, we followed the approach described by Dias et al. [23]. Fixed effects (FE) and random effects (RE) models were explored. The latter accounted for between-study variation using a heterogeneity parameter τ.
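For a two-arm trial, a contrast-based model of this kind is commonly written as below. This is a sketch under assumed notation, not the authors' exact specification (their BUGS code is in the supplementary material): $\delta_i$ is the trial-specific effect, $d_k$ is the effect of treatment $k$ relative to the reference treatment, and $t_{i1}$, $t_{i2}$ index the two arms of trial $i$. None of these symbols appear in the original text.

```latex
\begin{align*}
\ln\mathrm{HR}_i &\sim \mathcal{N}\!\left(\delta_i,\ \mathrm{SD}_i^{2}\right), \\
\text{FE:}\quad \delta_i &= d_{t_{i2}} - d_{t_{i1}}, \\
\text{RE:}\quad \delta_i &\sim \mathcal{N}\!\left(d_{t_{i2}} - d_{t_{i1}},\ \tau^{2}\right),
\qquad d_{\mathrm{ref}} = 0 .
\end{align*}
```

Setting $\tau = 0$ in the RE model recovers the FE model, consistent with the treatment of τ described for the FE analysis.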

The NMA was performed for PFS and OS on two different evidence networks: the main analysis was based on a network of five studies for PFS and four studies for OS, which were selected according to the five pre-defined criteria and expert opinion (base-case analysis); an additional analysis was also conducted based on a secondary evidence network that included an additional three studies (i.e., a total of eight studies for PFS and seven studies for OS that met at least one of the five pre-defined criteria [no expert involvement]).

Model inference was conducted within a Bayesian framework (see supplementary material, section C for the full BUGS code) [24]. The posterior distributions of the model parameters were obtained using Markov Chain Monte Carlo (MCMC) methods implemented in the software JAGS [25]. All analyses were performed using the computing environment R. The R package “R2jags” was used for MCMC simulations. Using three chains, the first 10,000 simulations, with a thinning rate of 500, were discarded as burn-in. Parameters were then monitored for a further 1 million simulations, with the same value of thinning, resulting in a total of 2000 MCMC samples per chain. Convergence of the chains was confirmed using trace plots, density plots of treatment effects, and the Gelman and Rubin’s diagnostic statistic [24].
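The sample counts quoted above follow directly from the thinning rate, and the Gelman and Rubin statistic can be computed from the retained draws. The sketch below is illustrative only. It uses simulated stand-in chains rather than JAGS output, and the function names and the simulated mean and SD are hypothetical.

```python
import random
import statistics


def retained_per_chain(n_iterations, thin):
    """Number of draws kept per chain when every `thin`-th draw is stored."""
    return n_iterations // thin


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


n_chains = 3
per_chain = retained_per_chain(1_000_000, 500)  # 2000 draws per chain
total = n_chains * per_chain                    # 6000 draws overall

# Stand-in chains: independent normal draws (NOT real posterior samples).
rng = random.Random(1)
chains = [[rng.gauss(-0.9, 0.1) for _ in range(per_chain)]
          for _ in range(n_chains)]
r_hat = gelman_rubin(chains)

print(per_chain, total, round(r_hat, 3))
```

Values of the statistic close to 1 are conventionally read as showing no evidence of non-convergence; trace and density plots remain a useful complement.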

For the FE model, the heterogeneity parameter τ was assumed to be zero, and all other relevant parameters were equipped with flat priors. For the RE model, a half-normal prior of the form $\tau \sim \mathrm{HN}[(\tau_u/1.96)^2]$ was considered [26]. This prior has its mode at 0 and is steadily declining in τ, with an upper 95% point at $\tau_u$. We set $\tau_u$ at 0.25, which yields an informative prior (a heterogeneity parameter of $\tau_u = 0.25$ translates to HRs ranging from 0.61 to 1.63 [26]). We note that uninformative priors on τ are not appropriate in this setting as most comparisons are informed by a single study (see “Results”).
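The figures quoted for the half-normal prior can be checked with a few lines of Python. This is a minimal sketch in which the variable names are illustrative. It confirms that a half-normal with scale τ_u/1.96 places 95% of its mass below τ_u, and that exp(±1.96 × 0.25) gives the stated HR range.

```python
import math

tau_u = 0.25
scale = tau_u / 1.96  # scale (SD) of the underlying normal distribution

# Half-normal CDF at x is erf(x / (scale * sqrt(2))).
mass_below_tau_u = math.erf(tau_u / (scale * math.sqrt(2)))

# Range of study-specific HRs around a pooled HR of 1 when tau = tau_u.
hr_low = math.exp(-1.96 * tau_u)
hr_high = math.exp(1.96 * tau_u)

print(f"P(tau < tau_u) = {mass_below_tau_u:.3f}")
print(f"HR range: {hr_low:.2f} to {hr_high:.2f}")
```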

All results were reported as median posterior HRs with corresponding 95% credible intervals (CrIs). The treatments were ranked in each MCMC simulation, and medians and 95% CrIs of the posterior ranks were reported. Further, posterior probabilities of being the best treatment were obtained as the proportion of simulations in which each treatment had the smallest HR.
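The ranking step can be sketched as follows. The treatment labels A, B, and C, their simulated lnHR distributions, and the function name are hypothetical. The draws stand in for MCMC output, with the reference treatment fixed at lnHR = 0.

```python
import random
import statistics


def rank_treatments(draws):
    """Summarize ranks from per-iteration lnHRs (lower lnHR = better).

    `draws` is a list of dicts mapping treatment name to its lnHR
    versus the reference in one MCMC iteration.
    """
    names = list(draws[0])
    ranks = {t: [] for t in names}
    best = {t: 0 for t in names}
    for draw in draws:
        ordered = sorted(names, key=lambda t: draw[t])
        for position, t in enumerate(ordered, start=1):
            ranks[t].append(position)
        best[ordered[0]] += 1
    n = len(draws)
    summary = {}
    for t in names:
        r = sorted(ranks[t])
        summary[t] = {
            "median_rank": statistics.median(r),
            "rank_2.5%": r[int(0.025 * (n - 1))],
            "rank_97.5%": r[int(0.975 * (n - 1))],
            "p_best": best[t] / n,
        }
    return summary


# Simulated stand-in draws; A is the reference (lnHR fixed at 0).
rng = random.Random(7)
draws = [{"A": 0.0, "B": rng.gauss(0.2, 0.25), "C": rng.gauss(0.8, 0.2)}
         for _ in range(6000)]
summary = rank_treatments(draws)
for name, stats in summary.items():
    print(name, stats)
```

Because the ranks are computed within each iteration, the probability of being best and the rank intervals carry the full posterior uncertainty, including the correlation between treatment effects.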

Compliance with Ethics Guidelines

This article is based on previously conducted studies, and does not involve any new studies of human or animal subjects performed by any of the authors.

Results

Systematic Review and Included Studies

The initial literature search and two updates yielded 244 citations published between January 1992 and August 2015 (supplementary material, section B). Following screening and examination of the papers, we selected a total of eight RCTs that met at least one of the five pre-defined criteria: CLL11, CLL5, COMPLEMENT 1, Nikitin, MaBLe, Knauf, CAM307, and CALGB9011 [8–10, 27–31]. According to expert feedback, three of the studies, Knauf, CAM307, and CALGB9011, did not match the typical unfit patient scenario and were considered to have included patients who were more fit compared with the other RCTs.

Table 1 summarizes the main characteristics of the included studies. The treatments evaluated in the eight studies included four single agents: F (in two treatment arms), Clb (six treatment arms), alemtuzumab (Alm; one treatment arm) and bendamustine (Benda; one treatment arm), and five combination regimens: G-Clb (one treatment arm), R-Clb (three treatment arms), R-Benda (one treatment arm), RFC-Lite (one treatment arm) and O-Clb (one treatment arm). Eight RCTs reported PFS, and six RCTs reported OS (Cam307 and Nikitin did not report OS). Treatment effects in terms of PFS and OS for the eight studies are summarized in Table 2.

Table 1 Summary of the eight randomized controlled studies evaluating first-line therapy in chronic lymphocytic leukemia selected for inclusion in the network meta-analysis (main and additional analysis)
Table 2 Summary of lnHRs and SDs for PFS and OS derived from the eight randomized controlled trials evaluating first-line therapy in chronic lymphocytic leukemia included in the network meta-analysis

Network Meta-Analysis

Figure 1 summarizes the network geometries for PFS and OS, showing the included studies and direct treatment comparisons. The studies excluded in accordance with expert opinion are highlighted in red. We consider a full network of eight RCTs (eight for PFS, seven for OS) and a reduced network of five RCTs (trials excluded in accordance with expert opinion; five for PFS, four for OS). The analysis performed on the reduced network represents our main analysis; the full network was used for completeness in an additional analysis.

Fig. 1

Network of trials and treatments selected using the five ‘fludarabine-ineligibility’ criteria a PFS and b OS. The main analysis excluded the three studies highlighted in red (expert recommendation). The additional analysis is based on the whole network. Alm alemtuzumab, Benda bendamustine, Clb chlorambucil, F fludarabine, G-Clb obinutuzumab + chlorambucil, O-Clb ofatumumab + chlorambucil, OS overall survival, PFS progression-free survival, R-Benda rituximab + bendamustine, R-Clb rituximab + chlorambucil, RFC rituximab + fludarabine + cyclophosphamide

Main Analysis

Forest plots displaying median HRs and CrIs for PFS and OS for the different treatments compared with G-Clb are shown in Fig. 2 for the main analysis, FE model, of five RCTs. For the clinical endpoint of PFS, the FE NMA model showed a preference for G-Clb compared with R-Clb, O-Clb, F, and Clb; the CrIs for these data were relatively narrow and all median HR values were substantially <1 (0.43, 0.33, 0.20, and 0.19, respectively). Also, there was a trend for better efficacy for G-Clb over R-Benda and RFC-Lite in terms of PFS, with HRs of 0.81 and 0.88, respectively. Comparable results were also observed when between-study heterogeneity was accounted for using an RE model with an informative prior (uninformative priors are not appropriate here as all comparisons are informed by a single study). Using this approach, median HR values for PFS were within the same range for G-Clb compared with R-Clb, O-Clb, F, and Clb (0.43, 0.33, 0.20, and 0.19, respectively) and 0.81 and 0.88, respectively, for G-Clb versus R-Benda and RFC-Lite. As expected, the RE model led to wider CrIs. Data for the main analysis using the RE model are summarized in the supplementary material, sections D and E.

Fig. 2

Main analysis (Knauf, Cam307, and Calgb_9011 excluded): effect of interventional treatments on a PFS and b OS using a FE model. Forest plots show relative effect of each treatment on PFS and OS as compared with the reference combination treatment G-Clb. Median HRs and CrIs are shown. Clb chlorambucil, CrI credible interval, F fludarabine, FE fixed effects, G-Clb obinutuzumab + chlorambucil, HR hazard ratio, O-Clb ofatumumab + chlorambucil, OS overall survival, PFS progression-free survival, R-Benda rituximab + bendamustine, R-Clb rituximab + chlorambucil, RFC rituximab + fludarabine + cyclophosphamide

OS results were generally consistent with the PFS results, suggesting greater efficacy for G-Clb over other treatments. Using the FE model, there was a preference for G-Clb compared with Clb, O-Clb, and R-Clb (median HR 0.48, 0.53, and 0.81, respectively, and relatively narrow CrIs) and a trend for better efficacy for G-Clb versus F and R-Benda (median HR 0.35 and 0.81, respectively; Fig. 2). These findings also held true for the RE model (supplementary material, sections D and E). However, due to fewer OS events, the CrIs for these data, using both the FE and RE models, were considerably wider than for the PFS data, suggesting that there was greater uncertainty associated with the OS data. For example, the large CrI for G-Clb versus F (FE model: HR 0.35, CrI 0.07, 1.86; RE model: HR 0.34, CrI 0.06, 1.93) was driven by an SD of 0.812 from the CLL5 trial (Table 2).

Treatment ranking reflected the improvement in PFS and OS associated with G-Clb over other treatment strategies (median rank 1 for both), and also indicated that G-Clb had a higher probability of being the favored therapy using both the FE model (PFS 56% [median rank 1, CrI 1, 3]; OS 57% [median rank 1, CrI 1, 4]; Table 3) and the RE model (PFS 53% [median rank 1, CrI 1, 3]; OS 55% [median rank 1, CrI 1, 4]; supplementary material, sections D and E).

Table 3 Main analysis: treatment ranking for PFS and OS

Additional Analysis

The analysis was repeated on all eight trials identified using the five F-ineligibility criteria (no expert assessment). Inclusion of an additional three studies made minimal difference to the estimated treatment effects and the rank ordering of treatments (Fig. 3; Table 4). These findings for the additional analysis were consistent for both the FE and RE models. Data for the additional analysis using the RE model are summarized in the supplementary material, sections D and E.

Fig. 3

Additional analysis (Knauf, Cam307, and Calgb_9011 included): effect of interventional treatments on a PFS and b OS using a FE model. Forest plots show relative effect of each treatment on PFS and OS as compared with the reference combination treatment G-Clb. Median HRs and CrIs are shown. Alm alemtuzumab, Benda bendamustine, Clb chlorambucil, CrI credible interval, F fludarabine, FE fixed effects, G-Clb obinutuzumab + chlorambucil, HR hazard ratio, O-Clb ofatumumab + chlorambucil, OS overall survival, PFS progression-free survival, R-Benda rituximab + bendamustine, R-Clb rituximab + chlorambucil, RFC rituximab + fludarabine + cyclophosphamide

Table 4 Additional analysis: treatment ranking for PFS and OS

For OS, of note was the narrower CrI for the comparison of G-Clb versus F (FE model: HR 0.57, CrI 0.34, 0.95) compared with the wide CrI presented earlier in the main analysis. The higher precision in the additional analysis is a consequence of the combined evidence from the CLL5 and CALGB9011 trials.

Differences in terms of PFS efficacy were observed among chemotherapies, with Benda showing greater efficacy compared with F, Clb, and Alm; HRs were 0.35 (Benda versus Clb), and 0.61 (Benda versus Alm) using both the FE and RE models, and were 0.46 and 0.45 for Benda versus F using the FE and RE models, respectively (supplementary material, sections D and E).

Discussion

This NMA evaluating the relative efficacy of therapeutic interventions for the first-line treatment of unfit patients with CLL produced a number of key findings. Consistently, for both efficacy endpoints (PFS and OS) and all scenarios, our results suggest that therapy comprising a combination of the anti-CD20 monoclonal antibody obinutuzumab and chlorambucil (G-Clb) is likely to be superior to many other treatment options including Benda, R-Clb, O-Clb, Alm, F, and Clb in unfit patients with CLL. In addition, G-Clb showed a trend for greater efficacy over the regimens RFC-Lite and R-Benda in this setting. Our results for OS were generally consistent with the data on PFS, suggesting beneficial outcomes with G-Clb over other regimens, and were driven by the consistent trend in OS favoring G-Clb in the CLL-11 trial [11, 27, 32]. However, the OS data in our study were associated with greater uncertainty, and this was not unexpected given that the follow-up times for OS were relatively short. We also note that to the best of our knowledge, PFS has not yet been validated as a surrogate endpoint for OS in the first-line treatment of unfit patients with CLL within a meta-analytic framework [33].

Our NMA of PFS (additional analysis) also supported the findings of the Knauf study, which suggested that Benda is a more potent chemotherapy, leading to ‘deeper’ remission than the traditional agent Clb in unfit patients [29]. Given this finding, it was of interest to note a trend favoring the combination of obinutuzumab with Clb over rituximab combined with Benda, as shown by the HR estimate for PFS (HR 0.81; CrI 0.49, 1.33).

Through the use of pre-defined ‘co-morbidity’ inclusion criteria combined with expert review, we were able to restrict our analysis to unfit patients with CLL who were likely to meet the unfit definition as described in the CLL11 trial [27]. However, information on CLL patient ‘fitness’/CIRS is not always reported in the literature, making it difficult to evaluate a study population’s level of fitness/co-morbidity; moreover, there is no well-defined surrogate marker for fitness status/co-morbidity. To address this, we used five pre-defined co-morbidity criteria (median CIRS >6, median CrCL ≤70 mL/min, existing co-morbidities, median age ≥70 years, and no full-dose F in the comparator arm) to approximate the level of patient fitness/co-morbidity indirectly in each study. Because of the potential heterogeneity of the identified papers, a manual review of the full text of the identified papers was conducted, and the final selection of papers was approved by experts. This final selection led to the exclusion of the following studies from the main analysis: Cam307 (study inclusion criteria included adequate renal and liver function and median age was 59–60 years) [30], Calgb_9011 (F dose appeared to be the full dose, 20 or 25 mg/m² intravenously days 1–5 every 28 days) [31], and Knauf (a high percentage of patients in both the Benda [70%] and Clb [65%] arms had a WHO performance score of 0, and the median age was approximately 65 years; it was therefore unlikely that the patients included were not eligible for full-dose F) [29]. It is also important to note that the three excluded studies were conducted at a time when Clb (rather than RFC) was the standard of care even in fit patients, since at that time, no treatment had shown an OS benefit over Clb. Also, these studies were not designed to explicitly enroll unfit patients, their median age was substantially lower than in the remaining five studies, and chemotherapy alone is not currently considered a relevant treatment option (even in unfit CLL).

An important advantage of NMA over naïve inter-trial comparisons is that the calculations are based on relative treatment effects (in terms of HRs) rather than absolute effects. Thereby, NMA circumvents the potential incomparability of two studies due to differences in the distributions of measured and/or unmeasured prognostic factors. For example, a naïve comparison of the R-Benda arm in MaBLe (median PFS 39.6 months) [10] with the G-Clb arm in CLL11 (median PFS 29.2 months) [11] would have led to the opposite conclusion of better performance of R-Benda versus G-Clb. This naïve comparison compares efficacy on an ‘absolute’ scale and ignores the fact that the common comparator R-Clb in these two studies showed substantial discrepancy in terms of median PFS (29.9 months in MaBLe and 15.4 months in the CLL11 study, respectively). The prognostic differences that lead to this study bias between MaBLe and CLL11 may be manifold with differences in methodology of data generation and data read-out as the main drivers. We note that our NMA does not account for potential effect modifiers, which are defined as patient or study characteristics that influence the two treatment arms to a different extent and, therefore, alter the relative treatment effect. A speculative effect modifier could be the difference in the cumulative dose of Clb in MaBLe and CLL11. However, we do not believe that this solely explains the discrepancies seen in median PFS between the two R-Clb arms from MaBLe and CLL11, since there is evidence which points against ‘Clb dosing’ as an effect modifier of PFS [27, 34, 35]. Admittedly, there is still some debate around the role of Clb dosing on PFS.
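The contrast between a naïve comparison and an anchored indirect comparison through the common comparator R-Clb can be sketched numerically. The example below uses only the PFS figures quoted in the Introduction for MaBLe and CLL11, and a simple adjusted indirect comparison on the log scale rather than the full Bayesian NMA. It is therefore not expected to reproduce the NMA estimate exactly, since the NMA inputs in Table 2 may differ. The variable and function names are illustrative, and the naïve HR additionally assumes exponential survival.

```python
import math


def sd_from_ci(lower, upper, z=1.96):
    """SD of lnHR recovered from a 95% CI (log-scale symmetry assumed)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Direct evidence, each versus the common comparator R-Clb.
hr_g_vs_rclb, ci_g = 0.40, (0.33, 0.50)       # CLL11: G-Clb vs R-Clb
hr_rb_vs_rclb, ci_rb = 0.523, (0.339, 0.806)  # MaBLe: R-Benda vs R-Clb

# Anchored indirect comparison of G-Clb versus R-Benda.
ln_indirect = math.log(hr_g_vs_rclb) - math.log(hr_rb_vs_rclb)
sd_indirect = math.sqrt(sd_from_ci(*ci_g) ** 2 + sd_from_ci(*ci_rb) ** 2)
hr_indirect = math.exp(ln_indirect)
ci_indirect = (math.exp(ln_indirect - 1.96 * sd_indirect),
               math.exp(ln_indirect + 1.96 * sd_indirect))

# Naive comparison of arm-level medians across trials (months).
median_g_clb, median_r_benda = 29.2, 39.6
hr_naive = median_r_benda / median_g_clb  # exponential assumption

print(f"Indirect HR (G-Clb vs R-Benda): {hr_indirect:.2f} "
      f"({ci_indirect[0]:.2f}, {ci_indirect[1]:.2f})")
print(f"Naive HR from medians:          {hr_naive:.2f}")
```

The indirect estimate falls below 1 while the naïve estimate exceeds 1, which reproduces the reversal of direction described above. The interval around the indirect estimate spans 1, in line with the description of this comparison as a trend.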

The results of our NMA should ideally be validated in independent studies, and our findings should be interpreted taking into account a number of limitations. First, a limited number of studies were eligible for inclusion in the analysis. Many of the comparisons were informed by a single trial (e.g., CLL11 was the only trial to compare G-Clb and R-Clb). This limited the statistical assessment of heterogeneity and inconsistency and also precluded the performance of more sophisticated analyses, for example, meta-regression to adjust for potential effect modifiers. Our selection strategy (five criteria and expert opinion) also aimed at increasing the homogeneity of the trials in the network. Nonetheless, the levels of unfitness still varied among the selected trials, which could ultimately have influenced our NMA. Second, the follow-up times for OS (and to a lesser extent PFS) were relatively short; longer follow-up would likely have impacted on the effect size, especially the CrIs. Third, in two trials, assumptions were required to calculate lnHR and SD, and these estimates may have differed from the estimate of HR using whole Kaplan–Meier curves.

Our results are comparable with the findings reported by Ladyzynski et al. [16] in their recent NMA of first-line treatment for CLL. Using a Bayesian NMA, they compared the effectiveness of available therapies in terms of PFS and OS in patients with treatment-naïve symptomatic CLL and in subgroups of younger/fit and older/unfit patients (median age ≥69 years). Of the five treatments evaluated in older/unfit patients (Clb, F, O-Clb, G-Clb, and R-Clb), G-Clb was associated with longer projected mean PFS versus all other comparators (60 months versus 16–30 months) and had the highest potential of increasing OS (90 months versus 44–59 months). Also, corresponding median hazard rates for G-Clb were the lowest of all the analyzed treatment options (PFS <0.05 over 120 months; OS <0.01 during the initial 65 months of survival). Of note, this analysis differed from our own NMA in a number of respects: it stipulated stricter study eligibility criteria (all included RCTs were required to provide a survival curve); it comprised an overall analysis (no restriction on fitness) and separate analyses for fit and unfit patients, whereas our focus was on the unfit setting only; it defined older/unfit as median age ≥69 years or median CIRS ≥8, whereas we used five criteria; and it did not include the MaBLe study.

The results of a second NMA evaluating US Food and Drug Administration/European Medicines Agency-approved treatment options for previously untreated patients with CLL ranked RFC highest in terms of efficacy, while all single agents (with the exception of Alm) occupied the worst ranks [19]. Unlike our NMA, the authors imposed no restrictions in terms of physical fitness or age of the study participants, and full-dose F studies were also included. Furthermore, the analysis did not include important recent trials, for example, CLL11 (G-Clb), MaBLe (R-Benda), and Complement 1 (O-Clb). In an earlier study, Cheng et al. [36] also used similar NMA methodology to analyze treatments that had not been directly compared in terms of PFS in previously untreated patients with CLL. The findings suggested that RFC achieved relatively longer PFS compared with FC, F, Alm, and Clb (76 months versus 23–60 months); however, the data were limited to younger patients (59–65 years) with good performance status and early-stage disease [36]. In a multiple-treatment meta-analysis using direct and indirect data based on all available head-to-head RCTs, Terasawa et al. [37] concluded that there was insufficient evidence on OS to recommend a specific first-line treatment for CLL and that any observed PFS differences may have been attributable to the relatively young uncomplicated patient populations.

Our NMA suggests that G-Clb is an effective treatment for unfit patients with CLL, with a suggestion of superiority over R-Benda and RFC-Lite, accepting the limitations discussed elsewhere in this manuscript. Previously, the MaBLe study showed that using Benda instead of Clb as the backbone of immunochemotherapy for unfit patients with CLL improves efficacy [10]. In addition, the ongoing GREEN study (NCT01905943) has reported preliminary safety and efficacy data for obinutuzumab in combination with bendamustine (G-Benda), also in the patient subset of unfit CLL [38]. A potential future therapeutic intervention for unfit patients with CLL that was not considered in our NMA is ibrutinib, a first-in-class oral inhibitor of Bruton’s tyrosine kinase, which is a key mediator of B-cell signaling [39, 40]. Ibrutinib was recently licensed for the treatment of adult patients with previously untreated CLL, including those aged ≥65 years. In a final analysis of the phase III RESONATE-2 trial, ibrutinib was reported to significantly improve PFS and OS compared with Clb in previously untreated patients with CLL (n = 269; PFS median not reached versus 18.9 months, HR 0.16, 95% CI 0.09–0.28, P < 0.0001; OS median not reached for either arm, HR 0.16, 95% CI 0.05–0.56, P = 0.0010). Patients were aged ≥65 years, approximately 30% had a CIRS score >6, and the median Eastern Cooperative Oncology Group (ECOG) performance status was 0; thus, the majority of patients were closer to the populations in the CLL8 and CLL10 trials than to the population in CLL11 [40]. In addition, ibrutinib is under evaluation in the unfit setting in combination with G-Benda (CLL2-BIG trial, ClinicalTrials.gov identifier NCT02345863) [41]. Pyruvate dehydrogenase kinase (PDK) inhibitors [42] and other inhibitors of the B-cell receptor (BCR) pathway [43] are also in development for the treatment of CLL. However, PDK inhibitor development is still at a very early stage, and following initial promising efficacy results for idelalisib, studies of first-line use have been halted due to severe immune-mediated toxicities, and further safety data are now required [44]. We are not aware of any information regarding the availability and affordability of these compounds in the patient population considered for this NMA.

The goal of the current study was to inform on the efficacy of commercially available drugs for the treatment of unfit patients with CLL. Although economic factors play a key role in shaping healthcare decision-making, the consideration of drug costs and treatment outcomes was beyond the scope of the current study. For this reason, information on the cost-effectiveness of the different treatment options for unfit patients with CLL requires careful evaluation in subsequent publications and/or health technology assessment submissions. Furthermore, while considerations of effectiveness may be applicable across different healthcare systems, considerations of costs and values are more likely to be healthcare system-specific. Therefore, a cost-effectiveness guideline may be less transferable across countries than one based on clinical effectiveness [45].

Conclusions

Results from our NMA demonstrate a clear preference in terms of PFS for G-Clb versus R-Clb, O-Clb, fludarabine and chlorambucil, and a trend for better efficacy versus R-Benda and RFC-Lite. A higher degree of uncertainty was associated with the OS results, but the findings were generally consistent with PFS data. Together, these data support the conclusion that G-Clb is an effective first-line treatment for unfit patients with CLL and is likely to show superior efficacy compared with other treatment options.