Key Points for Decision Makers

The most frequently reported issues found in the pan-Canadian Oncology Drug Review (pCODR) reports were related to costing, time horizon, and model structure.

Although no association reached statistical significance, issues with time horizon and model structure appeared to be reported more often in reviews that received negative funding recommendations.

The results of this research suggest areas in which to focus efforts to improve economic submissions.

1 Introduction

In Canada, it is estimated that 6.7% of healthcare budgets are spent on cancer care [1]. Expenditures on oncology drugs account for a large proportion of healthcare spending, and this trend is expected to increase, due in part to the rapid introduction of costly new treatments and the increase in cancer incidence as the population ages [2]. Approximately 149 distinct cancer drugs are expected to come to market in the next 5 years [3]. As healthcare budgets are limited, public drug plans are faced with increasingly difficult choices around which drugs to fund and under what circumstances to fund them. An economic evaluation that compares the costs and effects of a new drug with those of the current standard of care can be one tool to inform decision makers facing difficult funding decisions.

There are major complexities in assessing cancer drugs. The clinical evidence is often based on surrogate outcomes. Drugs are commonly life-extending rather than curative, and are usually very costly [4]. These factors often result in oncology drugs exceeding common thresholds for cost effectiveness [2, 4]. Canada has a funding review process specifically for oncology drugs [5,6,7]. The intent of a formal funding review process is to apply high quality and consistent evaluation methods to generate recommendations [8].

The pan-Canadian Oncology Drug Review (pCODR) was established in 2011 to assess new oncology drugs and/or new clinical indications and make evidence-based recommendations to the Canadian provincial and territorial drug plans (excluding Quebec) [5]. Pharmaceutical manufacturers (i.e., submitters) provide submissions to pCODR on the clinical evidence, cost effectiveness, and budget impact of the new drug. Submitters are to follow the Canadian Agency for Drugs and Technologies in Health guidelines for the economic evaluation of health technologies that are specific to oncology products [9]. The pCODR recommendations are based on a deliberative framework that considers clinical benefit, economic evaluation, adoption feasibility, and patient values [5]. The pCODR Expert Review Committee (pERC) is a multidisciplinary group that considers all the evidence and makes a funding recommendation. Economic reviewers are engaged in the review process to provide an assessment of a submitted economic evaluation model and detail any modifications and critiques of the model in an economic guidance report, a summary of which is made publicly available on the agency’s website (http://www.cadth.ca/pcodr). Economic reviewers are selected by pCODR and assigned submissions. Economic reviewers engage with the submitters at the midpoint of a review to clarify any details and request additional analysis, as needed, as well as receive the submitter’s feedback on the initial recommendation before the final guidance and recommendation are posted [10]. Any drug recommended by pCODR is then considered for reimbursement and pricing negotiation by the individual provinces and territories, or through the pan-Canadian Pharmaceutical Alliance, which combines the purchasing power of provincial and territorial drug plans.

Few studies have been published identifying issues in economic evaluations submitted to health technology assessment (HTA) bodies worldwide [11,12,13,14,15,16,17]. One study conducted in Canada prior to the establishment of pCODR found that the most common limitations of submitted economic models related to the interpretation of clinical and quality of life benefits and the lack of sensitivity analyses to show the impact of model assumptions [14]. With the advent of pCODR, which has embraced transparent public reporting, other researchers in Canada have begun studying pCODR reviews and have identified common issues with the time horizon chosen and post-progression survival estimates in models submitted by manufacturers [11, 16, 17].

An Australian study [12] examined the common issues identified in the submitted economic evidence for the Australian Pharmaceutical Benefits Scheme. The commonly reported issues were related to the estimates of comparative clinical efficacy, model structure, appropriateness of the chosen comparator, and calculation errors. A Dutch study reviewed 21 pharmacoeconomic evaluations and found that the most common problems were related to alignment of the study population with the registered drug indication, type of economic analysis, and chosen time horizon [13]. A study from France evaluated how uncertainty was accounted for in cost-effectiveness analyses submitted by manufacturers to the French National Authority for Health and found that there was frequently a lack of justification for plausible ranges in the sensitivity analysis. As well, there were frequent omissions in reasons for extrapolation of effects of the health technology beyond the time horizon [18]. Finally, a study from the UK reported on interviews with the National Institute for Health and Care Excellence (NICE) Appraisal Committee members who commonly expressed concerns with the economic model structure and quality of clinical data [15]. Thus, while there have been some efforts to understand the methodological issues faced by HTA agencies over the last several years, the findings have typically been general and based on a small number of reviews.

Some studies have made recommendations on improving and standardizing models in oncology [19,20,21]. One study conducted a critical review of economic evaluations pertaining to aromatase inhibitors in breast cancer and found issues pertaining to time horizon and indirect comparisons [19]. The authors suggested that, in this area of research, a lifetime horizon should be adopted with a sensitivity analysis on this variable, and that indirect comparisons based on actual data against a common comparator should be distinguished from those that require additional modeling. Another study examining published economic models for adjuvant endocrine breast cancer treatments found that there was variation in model structure and parameterization and recommended improved guidance on handling structural uncertainty [20].

To date, it also does not appear that potential associations between commonly cited issues and funding recommendations have been explored in Canada or in other jurisdictions worldwide. A few studies have examined the relationship between economic evidence generally and recommendations but have not examined specific methodological issues [22, 23]. Knowing the types of issues that economic reviewers encounter when evaluating economic models can help improve models and avoid commonly cited challenges, as well as inform priority areas for future research to advance the field. As HTA continues to play an essential role in decisions about drug funding, and health system budgets face growing challenges in adding new technologies, it is increasingly important to engage in continuous learning and enhancement of the methodological rigor of the analyses informing the health technology review.

To provide further insight and specific guidance to those who generate and use health economic evidence, the objectives of this research were to (1) identify and examine the main methodological issues reported by economic reviewers, and (2) explore associations between reported methodological issues and pCODR funding recommendations.

2 Methods

Publicly available economic guidance report summaries published on pCODR’s website between July 2011 (inception) and June 2014 with a final funding recommendation were independently examined by two study authors (LM and JB). Drugs that were reviewed for multiple indications at one time were included as distinct reviews in this analysis. Both study authors abstracted issues raised within the economic guidance reports and compared them to reach consensus that all were captured. A list of categories for the main issues that emerged was developed using common definitions for each category. Additionally, both authors independently categorized each issue based on their impressions of the economic reviewer’s actions to rectify the issue or of the issue’s implications for the model results. The three categories of reviewer action were (1) ‘Addressed’ to improve the estimations (partially or completely), (2) ‘Explored’ to understand uncertainty, or (3) left ‘Unresolved’. An Addressed issue was one in which the reviewers modified the model to create what they felt was a better estimate of the incremental cost-effectiveness ratio (ICER). An Explored issue was one in which reviewers conducted a sensitivity analysis around an estimate because they were unsure of the best estimate. An Unresolved issue was one in which reviewers could not or did not address the issue. The two authors (LM and JB) compared their categorizations for each drug, and any disagreements were resolved through discussion.

In addition to identifying the main issues, the final funding recommendation for each indication was collected. A review could have a negative, positive, or conditional recommendation. A conditional recommendation often meant that the drug was recommended for funding based on clinical benefit, so long as the price of the drug could be improved through procurement negotiations to achieve cost effectiveness. To assess relationships between each main issue category and the type of funding recommendation, Fisher’s exact tests were conducted, with the existence of each issue (yes or no) coded as a binary variable for each drug/indication. Statistical analyses were conducted in Stata 13 [24].
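As an illustration of the kind of test described above, the following Python sketch computes a two-sided exact p-value for a 2 × k table (issue present or absent, by recommendation type) by enumerating every table that shares the observed margins. It is not the Stata code used in this study; the function name `fisher_exact_2xk` and the counts in the usage lines are hypothetical and are not study data.

```python
from itertools import product
from math import comb


def fisher_exact_2xk(with_issue, without_issue):
    """Two-sided exact p-value for a 2 x k table with fixed margins.

    Sums the probabilities of all tables that are no more likely than
    the observed one (the usual definition for r x c exact tests).
    """
    col_totals = [a + b for a, b in zip(with_issue, without_issue)]
    row_total = sum(with_issue)
    denom = comb(sum(col_totals), row_total)

    def prob(first_row):
        # Multivariate hypergeometric probability of one table
        ways = 1
        for a, c in zip(first_row, col_totals):
            ways *= comb(c, a)
        return ways / denom

    p_obs = prob(with_issue)
    p_value = 0.0
    for first_row in product(*(range(c + 1) for c in col_totals)):
        if sum(first_row) != row_total:
            continue  # margins must match the observed table
        p = prob(first_row)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
            p_value += p
    return min(p_value, 1.0)


# Hypothetical counts ordered as (negative, conditional, positive)
p = fisher_exact_2xk(with_issue=[6, 9, 2], without_issue=[2, 9, 8])
print(f"two-sided exact p = {p:.3f}")
```

Full enumeration is practical here because the tables are small; with larger samples, a dedicated statistical package would be the more sensible choice.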

3 Results

A total of 34 economic guidance report summaries were examined corresponding to 39 indications. The reviews spanned a variety of disease sites and indications/settings. The most common disease site was breast (n = 7) followed by lung (n = 6). Other disease sites for which drugs were being reviewed were prostate, leukemia, melanoma, gastrointestinal, lymphoma, myelofibrosis, myeloma, pancreatic, renal, and soft tissue sarcoma.

Among the issues identified in the included reviews, nine categories of recurring problems were identified (Fig. 1). The categories consisted of time horizon, model structure, extrapolation, duration of benefit of treatment, quality of clinical data, uncertainty with indirect comparison, analytic errors, issues with utility estimates, and costing assumptions.

Fig. 1 Issues frequently described in the pan-Canadian Oncology Drug Review (pCODR) economic guidance reports

Online resource 1 (see electronic supplementary material) provides an overview of the pattern of issues found and the economic reviewers’ actions for each individual drug/indication reviewed. The economic reviewers most frequently reported problems related to drug wastage and other costing (59%) and time horizon (56%) issues, the latter category indicative of concerns with overestimated survival. The costing category included issues such as consideration of drug wastage, pricing compared with existing therapies (e.g., where the comparator drug price is confidential), healthcare resource use assumptions, and impact of dose adjustments. Dose adjustments could have large cost implications when tablets or capsules were priced equally per unit regardless of strength; adjustments requiring multiple tablets to make up the new dose could double the cost per dose, which if not considered would lead to underestimation of the expected incremental cost of the new drug. The economic reviewer addressed the issues pertaining to drug wastage and other costing in 70% (16/23) of the instances, and the model permitted exploration of the uncertainty for the remainder.
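The dose-adjustment point can be made concrete with a small Python sketch. It assumes flat pricing (the same price per tablet regardless of strength), whole tablets only, and no wastage; the strengths, doses, price, and function names are hypothetical and were chosen only to reproduce a doubling of the cost per dose.

```python
def min_tablets(dose_mg, strengths_mg):
    """Fewest whole tablets that make up dose_mg exactly (no splitting)."""
    unreachable = float("inf")
    best = [0] + [unreachable] * dose_mg
    for d in range(1, dose_mg + 1):
        for s in strengths_mg:
            if s <= d:
                best[d] = min(best[d], best[d - s] + 1)
    if best[dose_mg] == unreachable:
        raise ValueError("dose cannot be made from the available strengths")
    return best[dose_mg]


def cost_per_dose(dose_mg, strengths_mg, price_per_tablet):
    """Cost of one dose when every tablet strength has the same price."""
    return min_tablets(dose_mg, strengths_mg) * price_per_tablet


strengths = [50, 150]   # hypothetical tablet strengths (mg)
price = 100.0           # hypothetical flat price per tablet

full = cost_per_dose(150, strengths, price)      # one 150 mg tablet
reduced = cost_per_dose(100, strengths, price)   # two 50 mg tablets
print(full, reduced, reduced / full)
```

In this hypothetical case the reduced dose costs twice as much as the full dose, so a model that scales cost in proportion to milligrams would understate the drug cost after a dose reduction.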

Time horizon, or overestimated survival, was identified as a problem if the length of time chosen by the manufacturer was not deemed to be a realistic estimate of maximum survival duration for the condition or supported by the clinical data. For example, one report had a large proportion of life expectancy gain in the extrapolated portion of the model, and the economic reviewer suggested a shorter time horizon, based on clinical input suggesting that the majority of patients who initiated treatment would likely die before the time assumed by the manufacturer. In most instances, the reviewer shortened the time horizon to align with more clinically plausible estimates of the maximum expected survival for the patient population under study. However, in other instances, the time horizon was shortened specifically to address model limitations. In these instances, the survival and the incremental benefit predicted by the model as a result of extrapolation were considered implausible, and in the absence of more direct ways of addressing the cause of the bias, the time horizon was shortened to mitigate accrual of long-term survival gains that were unsubstantiated. The reviewer modified the time horizon, addressing the issue in some way 95% (21/22) of the time the issue was raised.
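To show why the choice of time horizon matters when survival is extrapolated, the sketch below computes undiscounted life-years under exponential survival for a range of horizons. The hazards, the constant hazard ratio, and the function names are hypothetical assumptions for illustration; none is taken from a reviewed submission.

```python
from math import exp


def life_years(hazard, horizon):
    """Area under an exponential survival curve from 0 to horizon (years)."""
    return (1.0 - exp(-hazard * horizon)) / hazard


def incremental_life_years(comparator_hazard, hazard_ratio, horizon):
    """Life-years gained when the hazard ratio is assumed to hold throughout."""
    drug_hazard = comparator_hazard * hazard_ratio
    return life_years(drug_hazard, horizon) - life_years(comparator_hazard, horizon)


comparator_hazard = 0.50  # hypothetical annual hazard of death
hazard_ratio = 0.70       # hypothetical treatment effect, assumed constant

for horizon in (2, 5, 10, 20):
    gain = incremental_life_years(comparator_hazard, hazard_ratio, horizon)
    print(f"horizon {horizon:>2} y: incremental life-years = {gain:.2f}")
```

Under these assumptions the incremental benefit keeps growing as the horizon is lengthened, which is why shortening the horizon limits the accrual of extrapolated survival gains.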

Issues with the submitted utility values were also identified (38%). Economic reviewers often identified alternative sources, values or assumptions for utility estimates. In two instances, a reviewer expressed that utility estimates that were more representative of the study population would have been preferable. Where justification for differences between treatment groups was weak, the reviewer may have considered assuming equal utilities across treatment groups, an approach that could still be considered conservative without evidence about any disutility patients may experience due to treatment. There were also concerns with the methods of obtaining utility values and the possible introduction of bias. Concerns about utility values were addressed in 53% (8/15) of the instances in which they were raised, and 27% (4/15) of the time they were explored in a sensitivity analysis.

Concerns about duration of benefit, raised in about one out of three reviews (33%), were also addressed in some way the majority of the time (79%; 11/14). Duration of benefit refers to the length of time in which treatment effects are applied to the risks or hazard of events (i.e., in reducing risks of progression or death). Problems with duration of benefit were raised when the length of time the drug was assumed to provide benefit was deemed implausible, likely overestimating benefit. In addition to explicit statements referring to duration of benefit of the drug, this category also included references to post-progression survival benefits that were not clinically supported. For several submissions, reviewers described the submitted model as assuming no distinction between a patient’s risk of dying before tumor progression and a patient’s risk of dying after tumor progression, implying that patients continued to benefit from the drug even after tumor progression occurred and the drug had been stopped (i.e., a beneficial carry-over effect). In one instance, the reviewer adjusted the model so that the drug did not have a beneficial carry-over effect. In another submission, the incremental survival benefits accrued post-progression with the drug were excluded completely to explore the impact of the implausible assumption.
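The effect of the assumed duration of benefit can be sketched in Python with a piecewise-constant hazard, where the hazard ratio applies only up to a chosen time and the hazard then reverts to that of the comparator. All parameter values and function names are hypothetical; this is a simplified illustration, not a reconstruction of any reviewer's adjustment.

```python
from math import exp


def survival(t, base_hazard, hazard_ratio, benefit_years):
    """Survival at time t when the hazard ratio applies only up to benefit_years."""
    on_effect = min(t, benefit_years)         # time spent with reduced hazard
    off_effect = max(t - benefit_years, 0.0)  # time spent at comparator hazard
    return exp(-base_hazard * (hazard_ratio * on_effect + off_effect))


def life_years(base_hazard, hazard_ratio, benefit_years, horizon, step=0.01):
    """Trapezoidal area under the survival curve from 0 to horizon."""
    n = int(round(horizon / step))
    total = 0.0
    for i in range(n):
        s0 = survival(i * step, base_hazard, hazard_ratio, benefit_years)
        s1 = survival((i + 1) * step, base_hazard, hazard_ratio, benefit_years)
        total += 0.5 * step * (s0 + s1)
    return total


base_hazard, hazard_ratio, horizon = 0.50, 0.70, 10.0  # hypothetical values
comparator = life_years(base_hazard, 1.0, horizon, horizon)

for benefit_years in (1.0, 3.0, horizon):
    gain = life_years(base_hazard, hazard_ratio, benefit_years, horizon) - comparator
    print(f"benefit lasts {benefit_years:>4.1f} y: incremental life-years = {gain:.2f}")
```

With these hypothetical inputs, allowing the treatment effect to persist for the whole horizon (a carry-over effect) yields a larger survival gain than stopping the effect earlier, which is the direction of bias that the reviewers were concerned about.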

Model structure, quality of clinical data, and statistical problems with extrapolation were also identified as problems within the reports. Model structure issues were identified if the economic reviewer indicated the structure was inadequate for the purpose of the review. As a result of concerns with the limitations of the model structure in one review, the economic reviewer refrained from providing an upper estimate of the ICER or sensitivity analyses. Model structure issues were frequently raised as a result of use of partitioned survival models, particularly as a result of issues with extrapolating overall survival data. This often occurred alongside concerns related to duration of benefit (post-progression survival benefit) and time horizon. Fifty percent (7/14) of the time, when a model structure issue was raised, the reviewer could not resolve or explore the implications of the issue.
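For readers less familiar with partitioned survival models, the following Python sketch shows how state occupancy is read directly from two independently specified curves, progression-free survival (PFS) and overall survival (OS). The exponential curves, the hazards, and the function name are hypothetical assumptions; submitted models typically use fitted parametric curves.

```python
from math import exp


def state_occupancy(t, pfs_hazard, os_hazard):
    """Proportion of the cohort in each state at time t (years)."""
    pfs = exp(-pfs_hazard * t)
    os_ = exp(-os_hazard * t)
    pfs = min(pfs, os_)  # PFS cannot exceed OS
    return {
        "progression_free": pfs,
        "progressed": os_ - pfs,  # derived as the gap between the curves
        "dead": 1.0 - os_,
    }


# Hypothetical annual hazards for a single treatment arm
pfs_hazard, os_hazard = 0.90, 0.40

for t in (1, 3, 5):
    occ = state_occupancy(t, pfs_hazard, os_hazard)
    print(t, {state: round(share, 3) for state, share in occ.items()})
```

Because the OS curve is extrapolated without reference to progression status, any difference in OS between arms continues to accrue after progression unless the analyst intervenes, which is one reason this structure was often raised alongside duration of benefit and time horizon concerns.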

Quality of clinical data issues, which were observed in 26% of the reviews, consisted of concerns around estimates or assumptions that were not based on substantive evidence, or that came from studies with limited sample sizes or non-comparative designs. This category also encompassed issues related to interpretation or application of data. For example, in one review, the survival analysis excluded patients who experienced toxicity in the clinical study, which posed a risk of bias in favor of the drug being reviewed. Issues in this category could not be resolved half of the time. Similarly, statistical problems with the extrapolation method could not be resolved 50% of the time they were raised. Issues were categorized as extrapolation problems if the technical aspects of survival curve fitting were not dealt with in the model or dealt with incorrectly; for example, if the distribution used did not appear to fit the clinical data sufficiently or where pharmaceutical manufacturers did not conduct or describe appropriate statistical tests for curve fitting or lacked patient-level data. Concerns related to the plausibility of extrapolation outcomes were encompassed by the time horizon (overestimated survival) category, as well as potentially through duration of benefit and model structure categories.

Issues that were raised infrequently were analytic error (10%) and uncertainty in indirect comparisons (21%). An analytic error referred to an issue with the technical function of the model, or a major calculation or logic error that called into question the face validity of the model. In one submission, the progression-free survival function was not estimated properly and led to illogical results (e.g., a survival of 103%). The economic reviewer was able to address the analytic error in one out of the four submissions where it was raised. Reviewers highlighted flawed indirect comparisons or uncertainty around the assumptions in the conduct of an indirect comparison, which could only be addressed in four out of the eight submissions where it was raised. In one submission, the trials included in the indirect comparison did not fulfill the assumptions about homogeneity, similarity, and consistency. However, based on clinical input, the economic reviewer assumed equal efficacy between drugs to counter this issue.

3.1 Issues and Funding Recommendations

Among the 39 indications, 54% had a conditional funding recommendation, 26% had a positive recommendation, and 20% had a negative recommendation. Fisher’s exact tests did not show a statistically significant association between any main issue and the type of funding recommendation. For time horizon and model structure, a trend was visually observed (Figs. 2, 3), which suggests that there may be an association between these issues and the funding recommendation. A time horizon issue was reported in 87% of reviews with a negative recommendation, 57% of reviews with a conditional recommendation, and only 30% of reviews with a positive recommendation. A similar trend can be seen for model structure. The remaining issues did not demonstrate this trend.

Fig. 2 Time horizon (overestimated survival) and funding recommendation. Reviews with a negative recommendation had a time horizon issue 87% of the time and those with a conditional recommendation had a time horizon issue 57% of the time. The vertical bars on each point estimate indicate the confidence interval

Fig. 3 Model structure and funding recommendation. Reviews with a negative recommendation had a model structure issue 62% of the time and those with a conditional recommendation had a model structure issue 38% of the time. The vertical bars on each point estimate indicate the confidence interval

4 Discussion

While problems in economic submissions have been explored previously [11,12,13,14,15,16,17], this is one of the first publications to look at the common issues identified by pCODR’s economic reviewers.

Submitters are advised to consult the Canadian Agency for Drugs and Technologies in Health guidelines for economic evaluation. These guidelines provide an overview of what should be considered in an economic evaluation. However, there are limits to the amount of detail that can be provided to address some of the methodological issues that may be encountered. Importantly, there can be challenges in interpretation of the appropriateness of assumptions made when populating a model. Some of the issues represent scenario analyses where the reviewers feel that a particular scenario should be revised or explored to be in line with current guidelines and provide a more appropriate estimate. Other issues pertain to disagreements about what the most appropriate estimate may be to meet the guidelines (e.g., time horizon). As well, some of the issues that have arisen have not yet been fully addressed in the guidelines, such as partitioned survival models or appropriate methods for extrapolation.

We found that the most commonly identified problems involved main model inputs: the time horizon length, costing, and utility estimates. These issues, however, could be managed by the economic reviewers the majority of the time, by making modifications to the model to address alternative assumptions. Submitters should ensure these parameters align with the clinical evidence, are clinically plausible, and avoid introducing inappropriate bias towards a particular treatment group. Validation of assumptions by clinical experts may assist with this issue. Issues with costing have been reported previously [12, 14], with researchers stating that manufacturers often made costing assumptions that favored the manufacturer’s products or were very optimistic. We found the same costing problems; pharmaceutical manufacturers often did not consider all healthcare resource use or include the impact of drug wastage or dose adjustments that might increase costs. Though such issues are more easily handled through modification of parameter values in models, the frequency of occurrence likely impacts perceptions of objectivity [25].

Problems around utility estimates have also been previously identified as common [12, 14]. In one study [14], the reviewers disagreed with the way the quality-of-life benefits were incorporated in the model as several of the submissions assumed the adverse events arising from the drug were lower than what the clinical evidence indicated. In another study [12], it was expressed that utility estimates were obtained from statistically non-significant or uncertain clinical data. In our study, we found concerns with elicitation methods as well as face validity when compared with existing literature. Pharmaceutical manufacturers may benefit from justifying their assumptions in light of existing clinical evidence and also providing extensive sensitivity analyses around utility estimates, including scenarios that make alternate assumptions (e.g., where differences in utility by treatment are assumed, also consider a scenario assuming equal utilities for each treatment).

Previous research has called for the inclusion of sufficient sensitivity analysis to show the impact of model assumptions, particularly for costing and utility assumptions [14]. Although these issues still exist to some extent, as mentioned above, economic reviewers could more easily address these issues by conducting their own sensitivity analyses. Previous calls to ensure access to fully transparent and executable models [14] have been met by pCODR through their submission guidelines, which has facilitated the review process by providing reviewers with the ability to more rigorously interrogate a model. As a result, the focus appears to have shifted towards more substantive methodological issues such as concerns with model structure or extrapolation.

Issues that were reported less frequently but posed more substantive challenges for reviewers involved model structure, extrapolation, and quality of clinical and comparative data informing the analysis. A model should be validated for both internal and face validity before submission. This is in line with current economic guidelines [9, 26, 27], but is particularly relevant for partitioned survival models, an increasingly common model structure being applied to cancer care interventions, as this model structure can easily produce biased estimates where extrapolation is required to populate a substantial portion of the model. While further research and additional methodological guidance are required to inform best practice around use of partitioned survival models with extrapolation, reasonable approaches could include substantive discussion surrounding the plausibility of long-term outcomes arising from a partitioned survival model and submitting alternative model structures using a Markovian approach. Issues with overestimated clinical benefits (time horizon), extrapolation, and quality of clinical data echoed some of the findings from a Canadian study that found there was a lack of validation of the clinical evidence [14].

Our study found that economic reviewers reported problems with extrapolation methods that often could not be resolved, yielding overestimated survival. With oncology drugs, it is common practice to adopt early into clinical practice based on interim data [28], and in fact, there is tremendous pressure from patients, patient groups, and society to adopt new cancer therapies even earlier in their life cycle. In potentially becoming sympathetic to unmet clinical needs, regulatory data requirements have become less demanding than funding review requirements and as a result substantial extrapolation beyond trial data is often necessary to form estimates of lifetime survival [28]. Researchers examining methods for extrapolation have found that when extrapolating treatment benefits early in the life cycle of the drug, the results may be inaccurate depending on the assumptions used [29]. Some recommendations have been to assess the sensitivity of the cost-effectiveness analysis to different parametric forms of the survival model as well as to take conservative approaches to extrapolation [29, 30]. One study demonstrated the importance of assessing the suitability of standard parametric models and suggested that if the standard distributions are not appropriate to represent the hazards, flexible parametric survival functions should be used [31]. Another study examining HTAs undertaken for NICE reported that submitters often do not assess the appropriateness of the extrapolated portion of the survival curve [32]. In light of these findings, at minimum, thorough statistical evaluation should be undertaken for model fitting, and the internal and external validity should be explicitly assessed. An algorithm has been proposed to guide analysts in selecting the appropriate model [32]. While there may not be strict formal methods to assess external validity, careful consideration should be given to the implications of different distributions on the assumed direction of the hazards experienced by the cohort, and alternatives should be explored when internally valid statistical approaches suggest distributions that produce very long residual survival without external clinical justification.
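The sensitivity of extrapolated survival to the parametric form can be illustrated with a short Python sketch. Two curves are calibrated to the same survival at the end of a hypothetical 2-year follow-up: an exponential curve (constant hazard) and a Weibull curve with shape below 1 (decreasing hazard). Matching a single point is a deliberate simplification and is not a substitute for fitting to patient-level data; all values and function names are assumptions.

```python
from math import exp, log


def exponential_survival(t, rate):
    return exp(-rate * t)


def weibull_survival(t, scale, shape):
    return exp(-((t / scale) ** shape))


def area(surv, start, end, step=0.01):
    """Trapezoidal area under surv(t) between start and end (life-years)."""
    n = int(round((end - start) / step))
    total = 0.0
    for i in range(n):
        t0 = start + i * step
        total += 0.5 * step * (surv(t0) + surv(t0 + step))
    return total


follow_up = 2.0  # hypothetical end of trial follow-up (years)
s_end = 0.50     # hypothetical survival observed at end of follow-up
shape = 0.7      # hypothetical Weibull shape (< 1 means a falling hazard)

# Calibrate both curves so that S(follow_up) equals s_end
rate = -log(s_end) / follow_up
scale = follow_up / (-log(s_end)) ** (1.0 / shape)

horizon = 20.0
tail_exp = area(lambda t: exponential_survival(t, rate), follow_up, horizon)
tail_wei = area(lambda t: weibull_survival(t, scale, shape), follow_up, horizon)
print(f"extrapolated life-years, exponential: {tail_exp:.2f}")
print(f"extrapolated life-years, Weibull:     {tail_wei:.2f}")
```

Although both curves agree at the end of follow-up, the decreasing-hazard curve accumulates more life-years in the extrapolated period, which is the kind of long residual survival that would call for external clinical justification.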

We found no statistically significant associations between each main issue and funding recommendations, though we observed patterns with time horizon and model structure. The lack of statistical significance for these two issues could be the result of a small sample size (n = 39 indications). As well, there are likely other factors even within the economic domain that may play larger roles in the formation of the final funding recommendation, including clinical outcomes, uncertainty, and drug price [33]. While some studies have examined the predictors of type of funding recommendation, which included both economic and clinical variables [22, 33,34,35], to the best of our knowledge, no other studies have attempted to examine the association between the issues the economic reviewers have identified and the final funding recommendation. This is an area for further research, including exploring causal models to take the results beyond associations.

There are limitations to this study. First, the largest limitation of a text-based analysis is that findings are limited to what was specifically mentioned in the reports. Importantly, we were unable to assess whether the absence of a description of an issue in the reports meant the absence of an issue. This may have been rectified by consulting the more detailed, unpublished technical reports or discussing the results with the authors of the reports. However, we only examined publicly available documents and did not obtain additional information from the reviewers, the committee, or the manufacturers to understand the importance of each issue or determine whether the issues reported in the summary represented the full scope of issues involved in the review. This limitation was evident in the absence of descriptions related to early treatment switching in clinical trials, which has been identified as an ongoing challenge for oncology [36], but concerns related to the implications of the methods used to adjust for cross-over were not raised in any of the summary economic guidance reports included in this study. It is also relevant that the partitioned survival model structure may have been commonly used but only raised as an issue in some reviews and not in others, depending on the perceived appropriateness of the approach by the reviewer to address the specific circumstances of the review. Reviewers bring their own philosophical, methodological background and expertise. We are unable to address what beliefs reviewers bring to a review and cannot easily address or account for their preferences or predispositions on how the review is conducted or perceived. It is important to consider our findings in light of this limitation. This factor may be part of the review process in a way that impacts both the perception of the submitted model and the revised estimates. Second, this study relied on a small sample size of 34 economic guidance reports corresponding to 39 indications. However, we believe that much insight has been gained into the common issues that economic reviewers encounter when reviewing oncology drug submissions. Lastly, other major factors considered in forming recommendations, such as clinical evidence, alignment with patient values, and adoption feasibility play important roles in the funding recommendation. These factors were not accounted for when conducting assessments of relationships between economic issues and funding recommendations, but are important considerations that would be relevant to future research in this area.

5 Conclusion

The types of issues that economic reviewers identify when reviewing submissions are important for quality improvement. In order to increase the quality of the submissions and reviews, submitters, reviewers, and reimbursement review agencies in Canada and elsewhere can benefit from a current inventory of common issues from existing reviews to inform and enhance guidance for conduct, reporting, and submission of economic evidence, interpretations of such guidance, and review practices. We hope that the findings from this study will inform improved economic submissions, support consistency in economic reviews, and lead to advances in methodological research, and that together these will subsequently lead to better decision making.