Evaluating the clinical effectiveness and safety of various HER2-targeted regimens after prior taxane/trastuzumab in patients with previously treated, unresectable, or metastatic HER2-positive breast cancer: a systematic review and network meta-analysis

Purpose In the absence of head-to-head trial data, network meta-analysis (NMA) was used to compare trastuzumab emtansine (T-DM1) with other approved treatments for previously treated patients with unresectable or metastatic HER2-positive breast cancer (BC). Methods Systematic reviews were conducted of published controlled trials of treatments for unresectable or metastatic HER2-positive BC with early relapse (≤ 6 months) following adjuvant therapy or progression after trastuzumab (Tras) + taxane published from January 1998 to January 2018. Random-effects NMA was conducted for overall survival (OS), progression-free survival (PFS), overall response rate (ORR), and safety endpoints. Results The NMA included regimens from seven randomized controlled trials: T-DM1 and combinations of Tras, capecitabine (Cap), lapatinib (Lap), neratinib, or pertuzumab (Per; unapproved). OS results favored T-DM1 over approved comparators: hazard ratio (HR) (95% credible interval [95% CrI]) vs Cap 0.68 (0.39, 1.10), LapCap 0.76 (0.51, 1.07), TrasCap 0.78 (0.44, 1.19). PFS trends favored T-DM1 over all other treatments: HR (95% CrI) vs Cap 0.38 (0.19, 0.74), LapCap 0.65 (0.40, 1.10), TrasCap 0.62 (0.34, 1.18); ORR with T-DM1 was more favorable than with all approved treatments. In surface under cumulative ranking curve (SUCRA) analysis T-DM1 ranked highest for all efficacy outcomes. Discontinuation due to adverse events was less likely with T-DM1 than with all comparators except neratinib. In general, gastrointestinal side effects were less likely and elevated liver transaminases and thrombocytopenia more likely with T-DM1 than with comparators. Conclusions The efficacy and tolerability profiles of T-DM1 are generally favorable compared with other treatments for unresectable or metastatic HER2-positive BC. Electronic supplementary material The online version of this article (10.1007/s10549-020-05577-7) contains supplementary material, which is available to authorized users.


Introduction
Human epidermal growth factor receptor 2 (HER2) is a receptor tyrosine-protein kinase, which is widely expressed and can promote tumorigenesis when expression is increased [1]; approximately 15% of breast cancer (BC) cases are HER2-positive, classified by HER2 protein overexpression or HER2 gene amplification [2]. HER2-positive BC has an aggressive clinical phenotype with, historically, a poor prognosis [3]. However, since its approval in 1998, the HER2-targeted humanized monoclonal antibody trastuzumab (Herceptin ® ; Roche) [4,5] has been associated with significant and clinically relevant improvements in diseasefree and overall survival (OS) [6].
Trastuzumab emtansine (T-DM1; Kadcyla ® ; Roche) is a first-in-class antibody-drug conjugate approved for the treatment of HER2-positive unresectable locally advanced BC (LABC) or metastatic BC (mBC) in patients previously treated with trastuzumab and a taxane, separately or in combination [7]. T-DM1 is a conjugate of DM1, a cytotoxic derivative of maytansine, and has the HER2-targeting and cytotoxicity-mediating properties of trastuzumab [8]. Conjugated DM1 is released to exert cytotoxic effects when the antibody-drug conjugate is internalized by the cell to which it binds [9].
Regulatory approval of T-DM1 for use in the mBC setting was based on improvements in progression-free survival (PFS) and in OS in the phase 3 EMILIA study; EMILIA was a multicenter, open-label randomized controlled trial (RCT) that evaluated the efficacy and safety of T-DM1 compared with capecitabine plus lapatinib in patients with HER2-positive LABC or mBC, who were previously treated with trastuzumab and a taxane [10]. T-DM1 received US Food and Drug Administration (FDA) approval in February 2013 [11] and European Medicines Agency (EMA) marketing authorization in November 2013 [12]. A recent descriptive analysis of EMILIA, which followed up patients who crossed over from the control group to T-DM1, corroborated the original findings of improved OS with T-DM1 relative to control [13]. Additionally, based on recent data from the KATHER-INE trial [14], in May 2019, T-DM1 was approved by the FDA for use in the early BC setting for patients with residual invasive disease after neoadjuvant taxane and trastuzumabbased treatment [15].
Several other therapies are available, or have been studied in patients with previously treated, unresectable, HER2positive LABC or mBC [16], including trastuzumab in combination with capecitabine [17] or vinorelbine [18], and the tyrosine kinase inhibitors lapatinib (Tyverb ® ; Novartis-in combination with capecitabine; LapCap) [19,20] and neratinib (Nerlynx ® ; Puma Biotechnology, Inc.) [21,22]. In the absence of direct head-to-head evidence from clinical trials, the relative efficacy of treatments can be assessed using network meta-analysis (NMA) to combine direct and indirect evidence from multiple independent trials [23]. Therefore, we conducted a systematic review (SR) and NMA to evaluate the clinical effectiveness and safety of T-DM1 versus other treatments for HER2-positive mBC. The monoclonal antibody pertuzumab (Perjeta ® ; Roche-treatment in combination with trastuzumab and chemotherapy) [24,25] was included in the NMA for completeness despite not being approved in this patient group. The primary endpoint analysis of the failed PHEREXA trial showed pertuzumab to increase PFS, but was not statistically significant, when added to trastuzumab plus capecitabine [26].

Methods
All SR and NMA methodology and reporting complied with Preferred Reporting Items for Systematic Reviews and Metaanalyses (PRISMA) guidelines.

Systematic review
The SR included published data between 1 January 1998 and 3 January 2018 based on searches of MEDLINE ® , EMBASE™, MEDLINE ® In-Process, the Cochrane Central Register of Controlled Trials (CENTRAL), Cochrane Methods studies, the Cochrane Database of Systematic Reviews (CDSR), and the Database of Abstracts of Reviews of Effects (DARE). Database searches were complemented by manual searches of conference abstracts from the American Society of Clinical Oncology (ASCO), the European Society for Medical Oncology (ESMO), the San Antonio Breast Cancer Symposium (SABCS), and the International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Studies in the SR were identified based on search strings provided in Online Resource 1. The SR searched for studies from three separate periods (1 January 1998-2 July 2013, 1 October 2012-30 June 2016, and 1 January 2016-3 January 2018), with sufficient overlap in time between the search periods to allow for indexing delays of published studies. Citations were reviewed by two independent reviewers, and discrepancies adjudicated by a third independent reviewer according to eligibility criteria presented in Online Resource 2. Eligible studies were controlled trials of pharmacological treatments for HER2-positive LABC with early relapse (within 6 months) following adjuvant therapy, or for HER2positive unresectable mBC in which patients had progressed after treatment with trastuzumab plus taxane. No pre-specified interventions or comparators were targeted, interventions were not required to be approved for use in this indication, and trials were included independent of randomization, phase, or blinding status. Critical appraisal of included trials was based on recommendations from the UK National Institute for Health and Care Excellence (NICE), Germany's Institute for Quality and Efficiency in Health Care (IQWIG), the Canadian Agency for Drugs and Technologies in Health (CADTH), the French National Authority for Health (HAS; randomized trials), and the Downs and Black checklist (nonrandomized trials) (Online Resource 3).

Treatment networks
Not all studies reported all selected endpoints. Separate network plots were developed for OS, PFS, overall response 1 3 rate (ORR), and the following safety-related endpoints (all based on treatment-related adverse events [AEs] Common Terminology Criteria for Adverse Events [CTCAE] grade 3 [severe] and above [grade 4, life-threatening or disabling; grade 5, death-related]): number of patients with AEs; serious AEs; treatment discontinuation due to AEs; and selected individual AEs, namely anemia, diarrhea, fatigue, increased alanine aminotransferase (ALT), increased aspartate aminotransferase (AST), mucosal inflammation, nausea, neutropenia, palmar-plantar erythrodysesthesia (PPE), thrombocytopenia, and vomiting.

Statistical methodology
A Bayesian NMA of PFS, OS, ORR, and safety endpoints was conducted on a log-hazard (PFS and OS) or log odds scale (ORR, safety endpoints). A Bayesian inferential framework was used, because it captures and propagates uncertainty while allowing external information to be included [27]. The NMA was conducted using the GeMTC R package in the Roche Biometrics Experimental Environment (BEE; R version 3.4.4) [28,29]; both fixed-effects and randomeffects approaches were used [23]. Markov Chain Monte Carlo (MCMC) convergence was assessed by inspecting trace plots and Brooks-Gelman-Rubin statistics [30]. The detailed methodology is included in Online Resource 4 [31]. Data inputs used to calculate hazard ratios (HRs) for PFS and OS, odds ratios (ORs) for ORR and AEs, and their respective 95% credible intervals (CrIs) are included in Online Resource 5. For all endpoints, the primary analyses were based on data as reported in the relevant RCTs; for Roche-sponsored trials (EMILIA [10], GBG 26 [32], and PHEREXA [26]), unpublished data were included where relevant.
A sensitivity analysis used the treatment crossoveradjusted HR for OS, calculated using either the rank-preserving structural-failure time model (RPSFTM) [33][34][35] or a Cox regression model with treatment crossover as a time-dependent covariate [36]. Although there is no one ideal method for adjusting for crossover [35], RPSFTM is one of the preferred methods that preserves randomization and is least prone to selection bias [33][34][35]; whereas use of a Cox regression model with treatment crossover as a timedependent covariate was found to be the most robust among several approaches reported in an analysis of a phase 3 trial of LapCap versus capecitabine, and was considered to provide a more balanced result [36]. Overall rankings for each treatment were determined by estimation of Surface Under the Cumulative Ranking Curve (SUCRA; range 0-100%) [37].

Study selection and heterogeneity assessment
The results of the three rounds of SR are summarized in Fig. 1. The initial SR covered searches from 1 January 1998 to 2 July 2013. After deduplication, 3822 records were identified for screening; from which five RCTs were identified for analysis: EMILIA [10]; GBG 26 [32]; EGF100151 [38]; Martin et al. [39]; and CEREBEL [17]. The first update of the SR encompassed searches from 1 October 2012 to 30 June 2016. After deduplication, 3401 records were identified for screening, from which updates to the original five RCTs and one new RCT were identified: PHEREXA [26]. The second update to the SR covered searches from 1 January 2016 to 3 January 2018. After deduplication, 2923 records were identified for screening, from which updates and one further RCT, ELTOP [40], were identified.
Overall, seven RCTs met the criteria for inclusion in the NMA: EMILIA [10]; GBG 26 [32]; EGF100151 [38]; a phase 2 trial of neratinib versus lapatinib plus capecitabine (Martin et al. 2013) [39]; PHEREXA [26]; ELTOP [40]; and a prior trastuzumab treatment subgroup in CEREBEL [17]although CEREBEL did not meet the inclusion criteria, it included a subgroup defined as "patients who received prior trastuzumab either in adjuvant or metastatic setting" and was thus included in the NMA. The phase 3 TH3RESA study [41] that evaluated T-DM1 was excluded from the NMA because of limitations that its inclusion would impose i.e., only patients on certain treatment regimens in the comparator arm "Physician's choice" were relevant to the NMA, and disaggregating data on particular treatments from this arm would have broken randomization. Thus, selecting a particular treatment from "Physician's choice" would introduce bias and break the fundamental principle of NMA which relies on randomized evidence.
Eligibility criteria for inclusion in the NMA are summarized in Online Resource 2, and trial information and patient characteristics at baseline in the trials, which were included in the analysis, are summarized in Table 1. Although there were differences in population size and treatment line among studies, heterogeneity assessment indicated that all trials were comparable in terms of randomization, allocation concealment, demographic and baseline characteristics, outcome selection and reporting, patient withdrawal from the studies, and statistical analyses undertaken (Online Resource 3). In total, there were five phase 3 studies and two phase 2 studies. All were open-label, but EMILIA and EGF100151 used independent review committees to assess outcomes and, therefore, the outcome assessors were blinded to study treatment. Patients from all studies had been treated previously with trastuzumab; however, only results from a "prior 1 3 trastuzumab treatment" subgroup were included from the CEREBEL study. Results of critical appraisal of trials are presented in Online Resource 3.

Treatment networks
Seven studies reported data for OS, for OS adjusted for treatment crossover, and for PFS (Fig. 2a); six studies for ORR as data were not reported in CEREBEL (Fig. 2b). Various treatment network plots were generated for the safety endpoints (Online Resource 6). Six studies were linked to network plots of treatment discontinuation (due to AEs), diarrhea, neutropenia, and ALT. Five studies were linked to plots for fatigue, nausea, and vomiting, four studies were included for AEs (grade 3 and above) and AST, and two studies were linked for serious AEs (Online Resource 6).

Model selection
The Bayesian random-effects model was the base-case analysis, and was preferred over the fixed-effects model for all endpoints to account for heterogeneity among the included studies. Between-study variance cannot be estimated owing to the small number of available studies and assuming homogeneity was considered to be implausible. Hence, informative priors based on the best empirical evidence were used instead [42]. Convergence statistics for the random-effects model are shown in Online Resource 8, and convergence plots are presented in Online Resource 10. For completeness, results obtained with the fixed-effects model are shown in Online Resource 9. 1L first line, 1L-R first-line relapse, 2L second line, 3L third line, C comparator, Cap capecitabine, CNS central nervous system, ECOG Eastern Cooperative Oncology Group, ER estrogen receptor, GBG German Breast Group, I intervention, Lap lapatinib, NR not reported, PR progesterone receptor, T-DM1 trastuzumab emtansine a The information shown is for the total intent-to-treat population for CEREBEL; however, analysis of progression-free and overall survival is based on only the subgroup of patients who received prior trastuzumab in the adjuvant or metastatic setting (N = 167 for cap + lap and N = 159 for cap + trastuzumab) b Patients with two or three or more prior anti-cancer regimens c 100% of patients received trastuzumab in the 1L metastatic breast cancer setting; information on prior trastuzumab setting missing in three patients d ECOG performance status not available for eight and two patients in cap + lap and T-DM1 groups, respectively e ECOG performance status of 0 or 1 f Hormone receptor status unknown for nine and 11 patients in cap + lap and T-DM1 groups, respectively g Hormone receptor status was not reported for five and seven patients in the cap + trastuzumab and cap groups, respectively h Value based on N = 207; 198 patients were randomized to cap + lap and nine additional patients were assigned the treatment later on during the trial i Percentage based on a total N of 223 and 226 in the cap + trastuzumab and pertuzumab + trastuzumab + cap groups, respectively j Percentage based on a total N of 201 and a total of 36 patients that crossed over to the combination arm

Overall survival
The HR data for cross-comparison of treatments are summarized for OS (the primary analysis) and for OS adjusted for treatment crossover (sensitivity analysis) in Table 2 and Fig. 3.

Sensitivity analysis
In the sensitivity analysis, adjusted estimates of OS were available for both EMILIA and EGF100151 (Online Resource 5) [10,36]. For EMILIA, the treatment crossoveradjusted HR for OS was 0.69 (95% confidence interval [CI] 0.59, 0.82) using RPSFTM. In EGF100151, the adjusted HR for OS in which treatment crossover was used as a time-dependent covariate was 0.80 (95% CI 0.64, 0.99). Intention-to-treat estimates of OS were used for the other five studies, as in the primary analysis [17,32,[38][39][40]. Sensitivity analysis results were generally similar to the basecase analysis, with a numerically greater OS benefit for T-DM1 than for the other treatments (Table 2 and Fig. 3).

Progression-free survival
Cross-comparison, between-treatment HRs for PFS are also summarized in Table 2 and Fig. 3. The analysis indicated that the likelihood of PFS benefit was greater with T-DM1 than with any of the other comparator treatments. The SUCRA ranking was also greater for T-DM1 (first) than for the other approved treatments: (1) T-DM1, (2) pertuzumab plus trastuzumab plus capecitabine, (3) lapatinib plus capecitabine, (4) trastuzumab plus capecitabine, (5) neratinib, and (6) capecitabine.

Overall response rates
Comparisons of ORR with T-DM1 and with other treatments showed that T-DM1 was associated with a more favorable ORR than all comparator treatments, and was more efficacious than capecitabine, lapatinib plus capecitabine, and neratinib (Fig. 4). Consistent with this finding, the SUCRA ranking was greatest for T-DM1 compared with the other

Adverse events (grade 3 and above)
The ORs for the likelihood of various AEs occurring with T-DM1 compared with the different comparator treatments are summarized in Fig. 5. Treatment discontinuation due to an AE of grade 3 and above was less likely with T-DM1 than with other treatments that could be compared (there was no link between neratinib and T-DM1 in the network, and these therapies could not be compared), and discontinuation due to any AE was less likely with T-DM1 than with all other treatments except for neratinib. The SUCRA rankings for discontinuation due to an AE of grade 3 and above for approved treatments were: (1) T-DM1, (2) pertuzumab plus trastuzumab plus capecitabine, (3) trastuzumab plus capecitabine, (4) lapatinib plus capecitabine, and (5) capecitabine. The likelihood of serious AEs was lower with T-DM1 than with neratinib, or lapatinib plus capecitabine; no comparison with other treatments was possible (Fig. 5a).
ORs indicated that increased AST was more likely with T-DM1 than with other treatments, and was least likely with capecitabine. The SUCRA rankings were: (1) capecitabine, (2) lapatinib plus capecitabine, (3) neratinib, (4) trastuzumab plus capecitabine, and (5) T-DM1. The risk of increased ALT with T-DM1 was higher than with trastuzumab plus capecitabine, lapatinib plus capecitabine, or capecitabine.  Effect size could not be quantified for mucosal inflammation. T-DM1 was consistently better than the comparators, but no mucosal inflammation events were reported in one of the cohorts in study EGF100151; removal of this study disconnected the network, preventing further analysis. Similarly, the absence of events in one arm of study GBG 26 prevented further analysis of thrombocytopenia; T-DM1 was consistently worse than the comparators but this could not be quantified, even when GBG 26 was removed from the network. T-DM1 was also consistently worse than comparators in terms of anemia events. The absence of such AEs in one arm of GBG 26 afforded the possibility to re-analyze with that study excluded; OR estimates for T-DM1 are shown in Online Resource 7. Finally, no effect size could be estimated for PPE, owing to PPE being a rare event in EMILIA, thereby preventing estimation of an OR.

Discussion
This SR and NMA of RCTs undertaken in previously treated patients with unresectable, HER2-positive LABC or mBC represents the first synthesis of clinical effectiveness and safety of studies in mBC. The results demonstrated that safety and efficacy outcomes with T-DM1 were, in general, more favorable than those with other treatments that are currently available/approved in this indication (pertuzumab plus trastuzumab plus capecitabine is not approved). This is particularly encouraging because T-DM1 is the first-in-class antibody-drug conjugate therapy approved in this indication.
In terms of efficacy, T-DM1 was associated with greater OS benefit than all approved comparators, both based on data at the end of the randomized phase of the EMILIA trial and after adjustment for pre-specified treatment crossover. T-DM1 was also associated with better PFS and ORR than other treatments.
The random-effects model results had wide CrIs, reflecting uncertainty around the comparisons. Fixed-effects results (Online Resource 9) which assume no between-study heterogeneity had similar hazard ratios but narrower CrIs.
In terms of safety outcomes, there is high uncertainty due to the small number of studies included, and the fact that not all endpoints were available for all comparator treatments. Discontinuation due to an AE or to an AE of at least grade 3 severity was less likely with T-DM1 than with other treatments, with the exception of neratinib, and T-DM1 generally had a more tolerable safety profile in terms of gastroenterological side effects than the other treatments, being substantially less likely to be associated with diarrhea. Similar point estimates for nausea and vomiting were seen with T-DM1 and trastuzumab plus capecitabine. Fatigue was generally less likely with T-DM1 than with comparators. Hematological effects were more difficult to quantify in this analysis. The likelihood of neutropenia with T-DM1 was slightly higher than with neratinib, and lower than with other approved treatments. When comparisons could be made, T-DM1 was more likely to be associated with anemia and thrombocytopenia than were other treatments. Finally, T-DM1 was among the treatments most likely to be associated with hepatic side effects.
Cardiac safety is an important consideration for patients treated with HER2 inhibitors, including T-DM1, who are at increased risk of developing left ventricular dysfunction. A decrease of left ventricular ejection fraction has been observed in patients treated with T-DM1. For example, in EMILIA, left ventricular dysfunction occurred in 1.8% of patients in the T-DM1-treated group and 3.3% of patients in the lapitinib plus capecitabine group.
Generally, the risk of bias in different aspects of the RCTs included in the NMA was low. Response to treatment in GBG 26 (trastuzumab plus capecitabine versus capecitabine) [32] was at risk of being assessed by unblinded investigators due to a lack of central assessment; and CEREBEL (lapitinib plus capecitabine versus trastuzumab plus capecitabine) [17] was underpowered because it stopped before recruitment was complete, following interim analysis and a recommendation from its data monitoring committee. In several studies, risk of bias was unclear due to a lack of published information. This was typically related to details of randomization in open-label studies and to the methods of allocation concealment used, but little information about patients' baseline similarity was available in one study [39]. Differences in the follow-up times of the studies included in the NMA did not affect PFS and OS, which were analyzed using HRs that inherently assume proportional hazards. However, results for ORR and safety endpoints may have been affected. EMILIA was the largest study and had the longest follow-up, a fact that was reflected in the increased numbers of events observed in this study. Nonetheless, the efficacy and safety of T-DM1 were generally more favorable compared with the other treatments analyzed. It was not possible to analyze the rates of safety events to account for differences in the follow-up times, because not all studies reported the appropriate data to conduct such an analysis.
A key benefit of the current analysis is its reproducibility, which enables updates to be performed when additional evidence becomes available. A significant limitation of the NMA presented here is that a formal analysis of the likelihood of thrombocytopenia with T-DM1 versus other treatments could not be performed (i.e., effect size was not quantifiable due to thrombocytopenia being a rare event in most comparators). Thrombocytopenia was the most frequently reported grade 3 or above AE in patients treated with T-DM1 in the EMILIA trial, affecting 14% of patients, compared with < 1% of patients in the lapatinib plus capecitabine group [13].

Conclusion
T-DM1 was associated with a greater OS, PFS, and ORR benefit than lapatinib plus capecitabine, trastuzumab plus capecitabine, capecitabine monotherapy, and neratinib monotherapy in patients with previously treated HER2-positive LABC or mBC. The improvements in OS with T-DM1 were seen in the analyses of both ITT (unadjusted) and after adjustment for treatment crossover. In the safety analyses, T-DM1 was associated with a greater benefit than the other treatments for the majority of endpoints with evaluable results with the exception of thrombocytopenia and hepatic AEs.

Informed consent Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.