Key Points for Decision Makers
There is heterogeneity across Health Technology Assessment (HTA) agencies in terms of their assessment of the value of new treatments for non-small cell lung cancer.
This heterogeneity, alongside the disease-specific factors that also influence decision outcomes, can present a challenge regarding the prioritisation and generation of the evidence needed to optimally support the HTA decision-making process within and across countries.
The most influential factors included in the multivariate model were drug therapeutic group, HTA agency, marketing authorisation year and treatment line.
Analysis was limited by missing data; greater transparency would aid understanding of factors, beyond price, that influence HTA decision making.

Introduction

Lung cancer is the most common cancer worldwide, contributing to 1.37 million deaths globally and 353,000 deaths in Europe alone each year [1,2,3,4]. Non-small cell lung cancer (NSCLC) is the most common form of lung cancer, constituting approximately 85% of all diagnosed cases. Management of NSCLC has historically been based on chemotherapy regimens, which are associated with high toxicity and marginal increases in overall survival (OS) [5, 6]. Treatment of NSCLC has gradually evolved to become focused on treating histology-specific subtypes (squamous cell vs. non-squamous cell) and therapies targeted to driver mutations (e.g. epidermal growth factor receptor [EGFR] mutations). More recently, innovative immuno-oncology (I-O) therapies have also been developed. The benefit of I-O therapies in patients with high expression of programmed death-ligand 1 (PD-L1) is well established, and recent data also demonstrate their efficacy, when used alongside standard chemotherapy, for treating all patients regardless of PD-L1 expression [7]. Both targeted and I-O therapies have the potential to substantially improve NSCLC survival and provide a more tolerable alternative to conventional chemotherapy regimens [8,9,10,11].

In many countries, access and funding for novel NSCLC treatments will be impacted by Health Technology Assessment (HTA) decisions [12]. HTA processes, submission requirements and timelines can vary from country to country [12]. Across most HTA bodies, there are two distinct phases: an evidence assessment (usually conducted by an independent body) and an evidence appraisal (conducted by an HTA internal committee). The evidence appraisal is where value judgements in the decision-making process are most likely. The same drug may be assessed by multiple HTA agencies, but differences in assessment frameworks, scope, timing and affordability may result in different HTA decisions and, consequently, variability in patient access to medicines. Furthermore, some HTA agencies may consider factors in addition to clinical and economic metrics, such as innovation, unmet need or societal benefit, and give different weightings to these, driven by economic, cultural and societal values [13].

There is a paucity of empirical evidence to understand how different HTA agencies have judged the relative importance, magnitude and direction of these factors as determinants of value when making their decisions. Too restrictive a view may misrepresent the true value to patients and society. Insight into inter-agency differences in value frameworks may also contribute to the discussion on pan-regional evaluations, such as those currently under consideration in Europe [14]. A number of societies have developed alternative value assessment mechanisms to evaluate new oncology therapies, including the American Society of Clinical Oncology conceptual framework [15] and European Society for Medical Oncology magnitude of clinical benefit scale [16]. In addition, an ISPOR Special Task Force has provided recommendations on the novel approaches to value assessment [17]. These emerging value frameworks vary in target audience, methodology and concept of value.

The primary aim of this study was to conduct a review of international HTA evaluations of NSCLC therapies to identify factors associated with the increased probability of achieving a positive HTA outcome. The study also sought to gain insights into whether differences exist in the determinants of value for new NSCLC treatments between countries that employ similar HTA methodologies, namely where cost-effectiveness analysis is incorporated in the reimbursement process.

Methods

Review of HTAs

An empirical review of published HTAs for NSCLC was undertaken across six HTA agencies: the Canadian Agency for Drugs and Technologies in Health (CADTH), the Haute Autorité de Santé (HAS) in France, the National Centre for Pharmacoeconomics (NCPE) in Ireland, the National Institute for Health and Care Excellence (NICE) in England and Wales, the Pharmaceutical Benefits Advisory Committee (PBAC) in Australia, and the Scottish Medicines Consortium (SMC). The agencies were chosen to achieve a balance between covering a wide geography and including agencies that incorporate cost-effectiveness analysis in their decision-making process, whilst taking a pragmatic approach to the availability of published documentation and the manageability of extracting data from a number of HTA agencies. Relevant appraisal documentation was identified using focused search terms, and was further assessed against specific Population-Intervention-Comparators-Outcomes-Study (PICOS) eligibility criteria (Table S1 in the Electronic Supplementary Material). HTAs were included if there was detailed information publicly available for that assessment, regardless of whether the HTA outcome remained current. The cut-off date for inclusion in the review was 14 October 2019, with HTAs eligible from agency inception (earliest therapy licensed 2003). There were no restrictions based on disease stage, treatment history or therapy mechanism of action (MOA) (i.e. HTAs for chemotherapy, targeted therapies [e.g. EGFR inhibitors] or immunotherapies [e.g. programmed cell death protein 1 {PD-1}/PD-L1 inhibitors] were all included). Therapies were categorised according to the therapeutic group of the drug manufactured by the sponsoring company, meaning that interventions comprising two drugs from different therapeutic groups were categorised once.

A comparison with standard willingness-to-pay thresholds was undertaken in the original currency. For comparison between countries, costs were converted to euros based on the conversion rate [18] for the cost year described in the documentation; if this was unavailable, costs were converted assuming the submission year for the cost year. Costs were not inflated.
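The conversion rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `to_euros`, the `RATES_PER_EUR` table and every rate in it are hypothetical placeholders, and "unavailable" is read here as the cost year not being reported.

```python
from typing import Optional

# Hypothetical annual average rates (units of local currency per 1 euro).
# Placeholder values for illustration only, not the rates used in the study.
RATES_PER_EUR = {
    ("GBP", 2015): 0.73,
    ("GBP", 2016): 0.82,
    ("AUD", 2016): 1.49,
}


def to_euros(amount: float, currency: str,
             cost_year: Optional[int], submission_year: int) -> float:
    """Convert a cost to euros at the cost-year rate, else the submission-year rate."""
    if currency == "EUR":
        return amount
    # Fall back to the submission year when no cost year is documented.
    year = cost_year if cost_year is not None else submission_year
    rate = RATES_PER_EUR[(currency, year)]
    # No inflation adjustment is applied, in line with the text.
    return amount / rate
```

Keying the lookup on the currency and year pair keeps the fallback to the submission year explicit rather than hidden in the data.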

Several agencies appraised the same drug for different indications in separate appraisals; each separate HTA was included. Resubmissions were also included, firstly, to avoid skewing the data towards positive recommendations, which would be the result of excluding initial submissions, and secondly, in order to provide evidence regarding the nature of the data that allowed a change from ‘not recommended’ to ‘recommended’ status. HTAs were excluded from the review if they considered more than one intervention at the same time, such as in the NICE Multiple Technology Appraisal process (e.g. TA374 [19]), or if the therapy did not have market authorisation by 31 January 2020. Data including intervention drug, HTA characteristics, scope of the HTA, clinical outcomes and cost-effectiveness parameters were extracted as detailed in Appendix 1 in the Electronic Supplementary Material.

Data Analysis

Statistical analyses were carried out in R version 3.6.1 using the glm function from the stats package.

Descriptive Analysis

Each HTA agency included in the analysis has a variety of decision options available. In the descriptive analysis, HTA outcomes were categorised as ‘fully recommended’ (for those drugs recommended without any restrictions or conditions), ‘recommended with conditions’ (including any restrictions, optimised access for subgroups of patients who will benefit most, or managed access, for example, entry into the NICE Cancer Drugs Fund [CDF] for further data collection to resolve significant clinical uncertainty), and ‘not recommended’ (which includes deferred decisions in those agencies where deferral is a decision option). These three categories were used in an initial descriptive summary of the data to reflect the decision options available to the different HTA agencies whilst retaining comparability. The outcome of each HTA decision was set as the dependent variable in the bivariate and multivariable analyses. Outcomes were categorised as binary (recommended/not recommended), as initial investigations indicated that sample sizes were too small to support multinomial logistic regression. Full recommendations and recommendations with conditions were combined to form the ‘recommended’ category and were considered to be positive outcomes in the analyses.
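The collapse from three descriptive categories to the binary dependent variable can be written as a small mapping. The sketch below is illustrative only; the function name `binary_outcome` and the lower-case label strings are assumptions, not taken from the study's extraction grid.

```python
# Three-level descriptive categories collapsed to the binary outcome
# used in the regressions. Label strings are assumed for illustration.
POSITIVE = {"fully recommended", "recommended with conditions"}
NEGATIVE = {"not recommended"}  # deferred decisions are already counted here


def binary_outcome(category: str) -> int:
    """Return 1 for a positive ('recommended') outcome and 0 otherwise."""
    label = category.strip().lower()
    if label in POSITIVE:
        return 1
    if label in NEGATIVE:
        return 0
    # Fail loudly on unexpected labels rather than silently miscoding them.
    raise ValueError(f"unknown HTA outcome category: {category!r}")
```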

Bivariate Analysis

Details of the variables included in the analysis are given in Table S2 in the Electronic Supplementary Material, outlining the coding used, the variable description, justification for its use and the degree of missingness in the dataset. Variables included in the bivariate analysis were initially identified according to low levels of missingness. Given the limited sample size per agency (Table 1), it was considered important to retain as many observations as possible in the primary analysis, so any variable with greater than 10% missingness for at least one agency was initially excluded in order to maximise the sample size and support comparability across HTA agencies. Variables with greater than 10% missingness that were identified in the literature as potentially important were also included in the bivariate analysis, with ‘missing’ as a category. This approach was chosen to investigate the relationship of missingness to outcome since imputation methods were not considered appropriate.
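The missingness screen described above might look like the following sketch. The record layout (a list of dictionaries with an `agency` key and `None` for missing values) and both function names are assumptions made for illustration.

```python
from collections import defaultdict


def max_missingness_by_agency(records, variable):
    """Return the highest share of missing values for `variable` across agencies."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        total[rec["agency"]] += 1
        if rec.get(variable) is None:  # None marks a missing value (assumed coding)
            missing[rec["agency"]] += 1
    return max(missing[agency] / total[agency] for agency in total)


def classify_variable(records, variable, flagged_in_literature=False, threshold=0.10):
    """Apply the screening rule: include, include with a 'missing' category, or exclude."""
    if max_missingness_by_agency(records, variable) <= threshold:
        return "include"
    if flagged_in_literature:
        return "include_with_missing_category"
    return "exclude"
```

Taking the maximum over agencies implements the "for at least one agency" condition: a single agency above the threshold is enough to trigger exclusion.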

Table 1. Descriptive analysis of NSCLC HTA outcomes by selected variables

Multivariable Analysis

Multivariable analysis was undertaken using binary logit regression. Dummy variables for year of HTA decision were included to account for variation in decision-making priorities over time. Missing values were largely considered to be ‘missing not at random’ due to reporting practices at the different HTA agencies and were dealt with by removing those observations. A reduced dataset was created to inform the multivariable analysis. Variables identified as potentially influential (defined as having a p value equal to or below 0.25 in a non-missing category) in the bivariate analysis were chosen for the initial multivariable model. This approach has been used previously in similar studies [20]. Observations with missing data for those variables were then excluded from the dataset (n = 24). A logit model was then fitted to the dataset without missing values, and variable inclusion was refined by backwards selection using a 5% significance level to produce the final model.
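For reference, a binary logit model of this kind can be written in its standard form as below. The notation is an assumption introduced here, not taken from the paper: $Y_i = 1$ denotes a positive outcome for HTA $i$, and $x_{ik}$ are the dummy-coded levels of the categorical predictors.

```latex
\[
\operatorname{logit}\,\Pr(Y_i = 1)
  = \ln\frac{\Pr(Y_i = 1)}{1 - \Pr(Y_i = 1)}
  = \beta_0 + \sum_{k} \beta_k x_{ik},
\qquad
\mathrm{OR}_k = e^{\beta_k}
\]
```

Under this form, each odds ratio compares one level of a predictor with its reference level, holding the other predictors fixed.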

Results

Summary of HTAs and Outcomes

The analysis included 163 HTAs that assessed pharmaceutical products for the treatment of NSCLC, of which the majority of outcomes (67.5%) were positive. Full recommendations were the most common outcome (59.1% of positive decisions, 39.9% of total decisions) (Table 1). The most assessed drugs were targeted therapies (57.1%), followed by I-O therapies (30.1%) and chemotherapy (12.9%). Positive HTA decisions (full recommendations and recommendations with conditions combined) were more likely in assessments considering I-O therapies than targeted or chemotherapies. Over 46% of HTAs came from two agencies: PBAC (25.8%) and SMC (20.3%). The smallest contribution of HTAs was from NCPE (7.4%). The agency with the greatest proportion of positive HTA outcomes was NICE (86.2%), followed by HAS (84.0%). PBAC was the only agency whose decisions were predominantly negative (67.7%); this may be partially explained by PBAC’s practice of deferring decisions when it requires further evidence, which were counted as ‘not recommended’ outcomes in this study. The frequency of HTAs for NSCLC increased substantially over time; two-thirds (66.9%) of the HTA decisions included in the study were made between 2015 and 2019. The probability of receiving a positive decision was higher in 2015–2019 than in any other period covered in the analysis.

All HTAs included in the review considered drugs aimed at patients with locally advanced, advanced or metastatic NSCLC (stage III or stage IV). The majority of HTAs specified the line of therapy, with 48.5% of assessments considering second- or later-line treatments and 34.4% considering first-line treatments. The proportion of positive outcomes was highest for HTAs assessing first-line therapies (73.2%) and lowest for HTAs that did not specify a line of therapy (60.7%). HTA agencies reported unmet need in the relevant population in 39.9% of HTAs, of which 83.1% (n = 54) resulted in a positive decision outcome.

Reference to innovation or the novelty of the mechanism of action of the intervention drug was not documented in the majority (78.5%) of HTAs. Of the 32 HTAs (19.6%) where it was documented that the drug under assessment was considered innovative, 87.5% were fully recommended or recommended with conditions. All drugs not considered to be innovative received either a full recommendation or recommendation with conditions; however, the number of cases was limited in this category (n = 3).

At least one manufacturer incremental cost-effectiveness ratio (ICER) (the largest of the ICERs reported for a given HTA in the initial manufacturer submission dossier) was reported in 110 (67.1%) of the HTAs in the study, and at least one final ICER (the ICER reflecting net price or the ICER reflecting the committee’s preferred assumptions) was reported in 64 (39%) of HTAs. The number of HTAs that reported both manufacturer and final ICERs was more limited, with only 55 (34%) of HTAs reporting both manufacturer and final ICERs for the same comparator. Consequently, there was limited opportunity to investigate the evolution of the ICER from the initial ICER on submission to the final ICER used in decision making.

For HTAs with a ‘not recommended’ outcome, all documented final ICERs exceeded the willingness-to-pay threshold (Table 2); however, a proportion of HTAs reporting final ICERs above the willingness-to-pay threshold had a positive outcome.

Table 2. Median final ICER by HTA decision stratified by HTA agency, decision year and therapeutic group

Reported ICERs increased over the period of the study. Median manufacturer ICERs increased from €35,694 (mean €35,694) for HTAs published in the period 2003–2006, when only one HTA reported a manufacturer ICER, to €67,835 (mean €100,233) in the period 2015–2019. Overall, median final ICERs in HTAs increased from €45,937 (mean €45,937) to €68,243 (mean €94,414). Median final ICERs in those HTAs with a positive decision outcome also increased over time, from €37,952 (mean €38,443) to €67,250 (mean €89,897) (Table 2; Fig. 1).

Fig. 1. Median final ICER and interquartile range as a factor of HTA outcome. HTA health technology assessment, ICER incremental cost-effectiveness ratio, QALY quality-adjusted life year

Factors Influencing HTA Outcomes

Twenty-one variables were identified for the bivariate analysis once missingness criteria were applied (Table 3). The results of the bivariate analysis suggested that seven of these variables may be important factors in determining HTA decisions across the six agencies when applying a 25% significance level: therapeutic group, HTA agency, market authorisation year, treatment line, biomarker specified, OS median reached and existence of design flaws in the clinical evidence.
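The 25% screening step can be illustrated with a short filter. The sketch uses only p values quoted in this section; the variable names, the function `screen_variables` and the final entry are hypothetical and included purely to show a variable being dropped.

```python
def screen_variables(p_values, threshold=0.25):
    """Keep variables whose bivariate p value is at or below the threshold."""
    return sorted(name for name, p in p_values.items() if p <= threshold)


bivariate_p = {
    "hta_agency": 0.002,            # PBAC versus CADTH, as quoted in the text
    "treatment_line": 0.25,         # as quoted in the text
    "biomarker_specified": 0.19,    # as quoted in the text
    "os_median_reached": 0.09,      # as quoted in the text
    "hypothetical_variable": 0.60,  # invented for illustration only
}

retained = screen_variables(bivariate_p)
```

Because the rule is "equal to or below 0.25", a variable sitting exactly on the threshold, such as treatment line, is retained.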

Table 3. Bivariate factor analysis of HTA decision

The effect of the HTA agency responsible for the decision was found to be significant, with HTAs assessed by PBAC substantially less likely to result in a positive outcome than those assessed by other agencies (p value versus CADTH = 0.002). The year of market authorisation for the intervention was shown to be associated with the probability of a positive HTA outcome, with drugs whose market authorisation was granted in 2011 or later more likely to be recommended than drugs with an earlier authorisation. Assessments for therapies designated for first-line treatment were almost twice as likely to receive positive recommendations as treatments eligible for use at any point in the treatment pathway (odds ratio [OR] = 0.57, 95% CI 0.22–1.49, p = 0.25). Therapies for which biomarkers were specified were associated with a greater probability of a positive recommendation than those without a specific biomarker (p = 0.19). In those HTAs where median OS was reported, more mature clinical evidence for the intervention drug was found to be associated with a negative HTA decision. HTAs including evidence from pivotal trials that had reached median OS for the intervention arm were less likely to achieve a positive HTA outcome than those trials where it was reported that median OS had not been reached for the intervention (OR = 0.43, 95% CI 0.15–1.07, p = 0.09).
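The "almost twice as likely" wording and the reported OR of 0.57 can be reconciled by inverting the odds ratio. The working below assumes, which the text does not state explicitly, that 0.57 compares any-line therapies with first-line therapies as the reference category.

```latex
\[
\mathrm{OR}_{\text{first-line vs any line}}
  = \frac{1}{0.57} \approx 1.75,
\qquad
95\%\ \mathrm{CI} \approx \left(\frac{1}{1.49},\ \frac{1}{0.22}\right)
  \approx (0.67,\ 4.55)
\]
```

The inverted interval still spans 1, which is consistent with the reported p value of 0.25. Note also that this is a ratio of odds, so it only approximates a ratio of probabilities.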

Multivariable Analysis

Seven variables were included in the full model for the multivariable analysis based on reaching a 25% significance level in the bivariate analysis: therapeutic group, HTA agency, market authorisation year, treatment line, specified biomarker, median OS reached and clinical design flaws. Interaction terms including HTA agency were also included in the initial model. The final, reduced model, reached by backwards selection from the full model using a 5% significance level, included four variables and showed significant associations between HTA outcome and both HTA agency and treatment line (Table 4). Multicollinearity was tested using variance inflation factors (VIFs) as an indicative measure, since all independent variables tested were categorical. Each variable in both the full and final model returned a VIF below 2, indicating little collinearity. However, many of the marginal effects in the final model showed a different direction of association than in the bivariate analysis. This implies a level of confounding potentially associated with factors not included in the study, leading to different conclusions depending on whether the factors are considered in isolation or as a whole.
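As a reminder of what the VIF threshold implies, the standard definition is given below, where $R_j^2$ is the coefficient of determination from regressing predictor $j$ on the remaining predictors. The paper does not say which variant was computed; for multi-level categorical predictors a generalised VIF is often used instead, so the expression should be read as indicative.

```latex
\[
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{VIF}_j < 2 \iff R_j^2 < 0.5
\]
```

In other words, a VIF below 2 means that less than half of the variation in any one predictor is explained by the others.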

Table 4. Logistic regression analysis

With CADTH as the reference agency, the likelihood of a positive HTA decision was greater for HTAs assessed by HAS, NICE and SMC, but lower for NCPE and PBAC. Therapies that received market authorisation in later periods were associated with increased odds of positive HTA decision; likelihood of recommendation was most strongly associated with the period 2011–2014. The odds of a positive outcome were increased for targeted therapies versus I-O therapies, although this factor does not take into account the comparison under assessment within the HTAs, which may impact the probability of one type of therapy receiving a positive outcome versus another type of therapy when all other factors are held equal. HTAs considering therapies for first-line treatment were more likely to result in a decision to recommend than those HTAs considering therapies for a previously treated population or which did not specify treatment line.

Following regression of the reduced set of 139 HTA outcomes over the four variables in the final model, the resulting pseudo-R2 was 0.25. This suggests that around 25% of variability in the data was explained by the final model and that there exist some unknown factors that are influential on HTA outcomes. These unknowns may be related to data availability.
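The paper does not state which pseudo-R2 was used. Assuming McFadden's definition, one common choice for logit models, the statistic is one minus the ratio of the fitted model's log-likelihood to that of an intercept-only model. The sketch below uses hypothetical log-likelihood values chosen only to reproduce a value of 0.25.

```python
def mcfadden_pseudo_r2(loglik_model: float, loglik_null: float) -> float:
    """McFadden's pseudo-R2: 1 - lnL(fitted model) / lnL(intercept-only model)."""
    if loglik_null == 0:
        raise ValueError("null log-likelihood must be non-zero")
    return 1.0 - loglik_model / loglik_null


# Hypothetical log-likelihoods, not values from the study.
example = mcfadden_pseudo_r2(loglik_model=-60.0, loglik_null=-80.0)
```

Other pseudo-R2 measures, such as Cox-Snell or Nagelkerke, would give different values for the same fitted model, so comparisons across studies depend on the definition used.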

Discussion

This study reviewed HTA evaluations of NSCLC therapies across six HTA jurisdictions, including Australia, Canada, England and Wales, France, Ireland and Scotland, to identify which factors were observably influential on the HTA decision and how these factors varied across agencies. The most influential factors identified in the bivariate analysis were those related to HTA framework (HTA agency), timing of HTA (market authorisation year), therapeutic group, uncertainty of clinical evidence (median OS not reached and existence of design flaws in the clinical evidence) and treatment pathway (treatment line and specification of biomarkers). In multivariate analysis this reduced to four factors: HTA agency, market authorisation year, therapeutic group and treatment line. Although non-significant in bivariate analysis, these factors combined to account for approximately 25% of the variability in the data.

Previous studies have used a variety of methodological frameworks to examine variation in HTA processes and HTA decisions. These frameworks have included product case studies [13], HTA expert interviews, quantitative analysis or a mixture of these approaches [2, 21,22,23,24]. These studies have focused either on a single agency or on varying combinations of countries. Previous cross-country assessments of HTA outcomes have found significant inter-country variability, with poor to moderate agreement between agencies [1,2,3, 25,26,27]. Reasons for cross-country differences included heterogeneity in the evidence appraised, in the interpretation of the same evidence, and in the different ways of dealing with the same uncertainty [2]. For example, NICE and PBAC accepted an indirect comparison, while this was not accepted in the Canadian Common Drug Review process [27]. In line with the current study, previous studies identified significant associations across agencies for cost-effective therapies [25, 28,29,30,31] (i.e. ICERs below the willingness-to-pay threshold [32]), year of decision [25, 31] and products meeting end-of-life criteria [25]. An available NICE recommendation was also found to be highly influential [28], although this was not assessed in the current analysis. Within agencies, previous studies have demonstrated that cost-effectiveness was highly influential on outcomes [20,21,22, 29,30,31]; however, other studies have suggested that the willingness-to-pay threshold applied in practice may be above the stated level [23].

Within the current bivariate analysis, a robust evidence base was identified as an influential factor for HTA decision making across HTA agencies, with significant factors including lack of design flaws in the clinical evidence and mature trial data. However, new innovative therapies for NSCLC face specific challenges in the available clinical evidence base. In particular, there are challenges related to interpretation and modelling of survival data. Immature survival evidence (assessed as median OS not reached) was associated with improved likelihood of positive outcomes. Inconsistent reporting of clinical trial data restricted the exploration of this effect, as HTAs that did not report the availability of median OS were also less likely to achieve a positive outcome than those HTAs reporting that median OS had not been reached for the intervention. However, this finding is in line with two previous studies. One study concluded that OS was not associated with HTA outcome for cancer drugs assessed by NICE or CADTH [32], while the other found that, for drugs with ICERs considered borderline or not cost-effective, submissions with mature data were more likely to be not recommended, while submissions with immature data were likely to be recommended through the CDF [33]. Although seemingly counterintuitive, this association with immature survival evidence may reflect risk aversion in decision making, whereby delaying decisions until more evidence is available, while providing interim access through the CDF, may allow patients to benefit in cases where there is a high degree of uncertainty around long-term efficacy.

A separate analysis showed that demonstration of statistical superiority of the primary endpoint was associated with outcome [20]. In line with this evidence, the current analysis found that increased follow-up was also correlated with positive decisions, although this was not statistically significant. Similarly, therapies meeting end-of-life criteria where applicable were more likely to achieve a positive outcome, but this was also not statistically significant.

The multivariate regression analysis presented here resulted in a final model with a pseudo-R2 of 0.25, suggesting that around 25% of variability in the data was explained by the final model and that several influential factors remain unknown. However, this may be related to missing data caused by the broad-spectrum extraction grid required to capture factors affecting six HTA bodies. A previous analysis of NICE decision making resulted in a pseudo-R2 of 0.26, which is broadly in line with the variability captured within the current analysis [20].

Both manufacturers and HTA agencies must balance the challenges of data maturity and uncertainty in evidence against the needs of patients in terms of timely market access and disease burden. However, in trying to achieve this balance, HTA agencies have placed different emphases on the factors identified in our analyses. Unsurprisingly, economic evidence, specifically the incremental costs and quality-adjusted life years, was strongly influential for the HTA decision across many of the HTA agencies. The approach to the classification of innovation across HTA agencies also varied. The definition of innovation and novelty was not clear even in those HTAs where the agency explicitly considered these factors, which may explain some of the variability between agency decisions. Research is underway on how best to define and reward medical innovation [34, 35], which may clarify and qualify the value of innovation in future reimbursement decisions.

The study has some limitations. Because HTA agencies do not classify their decisions in a harmonised way, a common categorisation of HTA decisions had to be imposed for this study. For example, in England and Wales, a NICE technology appraisal committee can make one of five recommendations: recommended, optimised, recommended for use in the CDF (for indications in cancer only), only in research, or not recommended. By contrast, in Australia, PBAC can make one of three recommendations: recommended, defer pending specific additional information, or not recommended. Additionally, the coverage and level of detail for the publicly available information to facilitate comparisons between agencies varied. There are significant limitations in terms of what HTA agencies choose to report and whether those agencies report all factors that impact decision making, either implicitly or explicitly. Further, there may be variation in local treatment practices or health care costings that may be driving differences in cost-effectiveness assessment that would not be reported and hence not captured within this analysis. It should also be noted that this analysis assumed that the largest base-case ICER had the greatest impact on decision making, when it may have had less relevance due to subgroup analyses or comparator discussions. Further, there is significant potential that resubmissions and price negotiations may have impacted the results of this analysis, as was seen in a previous analysis of conditionally approved drugs in the EU [13]. Finally, sample size limited the extent to which initial findings could be examined in further depth and may affect the robustness of the conclusions.

The summary ICERs described in this study may not be representative of the actual ICERs used in decision making. They were calculated using values published in HTA documentation and may not include any confidential price discounts subsequently negotiated by the manufacturer; thus, the average ICERs used in decision making for those HTAs that reported ICERs were likely lower than those calculated in this study. Also, as ICERs can only be summarised according to those that have been reported, there may be biases in the summaries that are linked to the values of the ICER in an unknown way; for instance, higher ICERs may be more likely to have been redacted. Alternatively, since some agencies are more likely to publish ICERs than others, should those agencies also be more likely to recommend a therapy, this would show up in the summarised ICERs and may result in counterintuitive conclusions purely due to the effect of reporting practices.

In light of these limitations, the analytical methods and conclusions are subject to several caveats; however, alternative approaches do not address these limitations and would not provide additional insight. The regression analysis approach used here is consistent with similar published analyses, providing comparable outcomes [20,21,22, 29,30,31]. Alternative approaches are also common, including case studies [1, 27, 33] and use of descriptive statistics [2, 3, 28, 32, 33], but these methodologies would not address the objectives of this study.

Conclusions

Differences in assessment frameworks, scope and timing across HTA agencies may lead to differences in patient access to new treatments for NSCLC. This study identified a degree of heterogeneity across HTA agencies in terms of their decisions and the factors informing them. Despite this heterogeneity, several common non-price-related factors associated with decision making were identified. The most influential factors included in the multivariate model were drug therapeutic group, HTA agency, marketing authorisation year and treatment line; although non-significant during bivariate analysis, these factors combined to explain approximately 25% of variability in the data. Robust evidence was lacking to describe the influence that factors such as unmet need, innovation and data maturity have on HTA recommendations for new NSCLC treatments. Additionally, analysis was hampered by missing data; greater transparency of reporting would aid understanding of factors, beyond price, that influence HTA decision making.