This study adopted a mixed methods research design based on an existing methodological framework to investigate HTA decision processes for ten drug and indication pairs across four countries, and showed important variations and contradictory trends across countries. Differences at each stage of the HTA process were identified, partly explaining the reasons for differing HTA recommendations across countries, while illustrating the complexity of these processes. First, heterogeneity was seen in the evidence accounted for, in the interpretation of the same evidence, and in the different ways of dealing with the same uncertainty (Table 2). These were influenced by the evidentiary, risk and value preferences identified across the ten study drugs. The differences in interpreting the same evidence were partially explained by varying levels and types of stakeholder input, the consideration (or not) of the drug’s orphan status or investigational nature, the consideration of additional qualitative criteria (e.g. innovation, unmet need), the presence of another study, or the decision-maker’s judgment during deliberation. There were also a number of decision modulators that contributed to a greater acceptance of uncertainty or of higher and uncertain ICERs. These included agency-specific modulators, pertaining to agency-specific elicited or non-elicited societal preferences, such as the SMC modifiers, NICE’s end-of-life supplementary advice and disease severity for TLV. There were also process-specific modulators, which included the ability to implement patient access schemes or lower discount rates, or to impose restrictions or future re-assessments. The HTA approach used (clinical or cost-effectiveness) also had consequences for the final decision.
Results from this in-depth analysis of ten orphan drugs suggest that HTA is not a simple analysis of clinical and/or cost-effectiveness, but remains a flexible process subject to the decision maker’s interpretation of uncertainty and social values as part of the deliberative process of HTA. This study sheds light on some of the factors being accounted for, which may not necessarily be explicitly defined as part of the decision process. Policymakers should be aware of the more comprehensive set of factors accounted for in these decisions, and the different ways of applying HTA, including how countries dealt with the issues specific to, but not limited to, orphan drugs. The implications of these findings are discussed here, together with the study limitations.
Contrasting applications of HTA
A first contrast was seen between the HTA recommendations driven by cost-effectiveness and those driven by clinical benefit. Some drugs with a recognised positive clinical benefit in France were rejected in some, but not all, of the other countries partly due to their high ICER (e.g. everolimus, eltrombopag). This finding is in line with one study that compared NICE coverage and HAS ASMR decisions for a sample of anticancer drugs, showing a significant association between the QALY gain and ASMR ratings, but none when accounting for costs (ICER) [8]. This also has implications for price, which is driven by the ASMR assessment. Economic evaluation has recently been implemented by HAS to support price negotiations for those drugs with an ASMR I-III rating (significant to major improvement in clinical benefit). In such cases, the economic evaluation acts as an additional criterion to be accounted for by the French Economic Committee for Healthcare Products (CEPS) when negotiating prices, giving more weight to the concept of value and value for money. This two-step approach may, however, have negative implications for the price of those orphan drugs considered to have a minor or no improvement in clinical benefit (ASMR IV-V). As illustrated in the case studies analysed, those drugs with very uncertain evidence (due to the lack of comparative data) received low ASMR ratings, meaning that their price will be set lower than comparator prices. In the other study countries, assessments based on economic evaluation approaches allow for various techniques to deal with uncertainty (e.g. sensitivity analysis), which subsequently may also influence the ICER estimate and drug pricing.
Further contrasts were also seen within those countries assessing cost-effectiveness. The acceptability of the ICER, based on similar economic models and comparators, differed due to the agency-specific or process-specific modulators identified: (a) disease severity for TLV, (b) SMC modifiers, (c) patient access schemes, (d) NICE end-of-life criteria, (e) imposing restrictions, or (f) continuous data generation and future re-assessment. The first four reflect adjusted willingness-to-pay thresholds and special considerations for orphan drugs, while the latter two relate to the ability to modulate the ICER by identifying circumstances or subgroups for which the treatment is cost-effective, or accepting greater uncertainty for a limited period of time until more evidence is generated. Findings for Sweden are in line with a recent study that demonstrated the positive impact of disease severity on reimbursement decisions, despite severity not being explicitly defined [44]. The ability to implement patient access schemes is another way of improving cost-effectiveness and/or reducing uncertainty [45], and of providing earlier access to these treatments [46]. Their effects on innovation and expected returns are still unclear [47], and a number of issues around their implementation have already been noted [48]. Additionally, in those countries that have the ability to implement process-specific modulators (e.g. patient access schemes), this study showed that their application was neither the same nor consistent across countries or drugs.
Dealing with rare conditions
Results illustrate the types of issues encountered when dealing with orphan drugs in terms of the nature of the evidence presented (e.g. sample size, phase II primary trials, subgroup data, surrogate endpoints, lack of comparative data) and the types of issues highlighted by the HTA bodies (e.g. small sample size, insufficient statistical power, surrogate endpoints, subgroup data), corresponding to what characterises orphan drugs [49, 50]. Different ways of dealing with this imperfect evidence were seen. In some cases, these issues, which relate to but are not specific to orphan drugs, were considered acceptable through various means as highlighted in this study. This included the specific consideration of the condition’s rarity or the recognised difficulties in recruiting sufficient patient numbers in trials, as highlighted by TLV for eltrombopag or NICE for mifamurtide and romiplostim. In other circumstances (e.g. dealing with subgroup populations), some issues remained inconclusive for all because of their lack of statistical power or retrospective nature (e.g. azacitidine or mannitol). When comparing the prevalence rates used by SMC in their budget impact analysis and the HTA recommendations issued, two observations arise. The three drugs treating fewer than 20 patients per year (ofatumumab, mifamurtide, trabectedin) had generally poorer outcomes: they all received the poorest ASMR (V) rating, and were more likely to be rejected by the other agencies (ofatumumab by all, trabectedin by SMC). This was a consequence of the lower quality of the evidence from small sample sizes or the lack of comparative data. In the “more prevalent” rare conditions analysed (between 200 and 300 patients per year in Scotland), similar issues were encountered, but these were linked to the small sample size to a lesser extent (eltrombopag, mannitol dry).
These experiences could be a good starting point for generating the circumstances under which small sample sizes or other issues specific to rare diseases may be acceptable due to the rarity of the condition, also ensuring these are accounted for consistently across cases.
Results also suggest possible misalignments between the incentives implemented for marketing authorisation and their effect at HTA level. For three drugs, the evidence presented was very uncertain due to its low quality and lack of comparative data (mannitol dry, ofatumumab, trabectedin). This was a consequence of the early marketing authorisation granted or early scientific advice received, which negatively influenced the HTA decisions made: low ASMR ratings (V) in France and rejections in the other countries. Two exceptions, however, were identified (NICE’s recommendations for mannitol dry and trabectedin), where uncertainty was deemed acceptable thanks to the different mechanisms modulating the ICER or to the consideration of other forms of evidence (e.g. historical controls, other considerations). These examples may constitute ways forward in dealing with such scenarios in the future. Additionally, in France, all study drugs were made available as part of the temporary authorisation scheme (ATU), with the exception of mannitol dry and mifamurtide. The former received an ASMR V rating and the latter was rejected, which occurs very rarely in France. This may imply that continuous data collection is an additional factor that contributes to accepting greater uncertainty in France.
HTA methodological challenges
RCT weaknesses are well known and include limitations around safety and generalizability to heterogeneous populations or clinical practice, as well as the cost of conducting them [14]. Similar issues were identified in this study (e.g. generalizability to local population, non-inclusion of certain patient subgroups or subgroup heterogeneity, trial population non-representative of the indication under review, or imbalances in the characteristics or responses across the different subgroups). Given the preference for RCTs observed and the inclusion of these trial results as main parameters of interest in the economic models, the concerns identified above and the diverging ways of dealing with them emphasise the need to recognise complementary forms of robust and valid evidence [14]. Apart from a few cases (e.g. expert opinion to confirm generalizability), this was not seen in practice given the limited role of non-phase III evidence in the assessment of clinical benefit and cost-effectiveness observed in this study. The uptake of such forms of evidence is still modest, likely due to the lack of expertise in dealing with the variety of types of observational evidence, including those based on real world data such as electronic patient records [51] or patient-reported outcomes [52]. Their role, however, is to be stressed given their potential use for policy making in, for example, the value-based system or process for highly specialised medicines at NICE, the patient and clinician engagement (PACE) programme at SMC, the use of managed entry agreements [47] and, more recently, the introduction of a pilot study on adaptive licensing at the EMA [53, 54].
With these new developments, the environment is increasingly relying on expert opinion, observational studies and real world data [55], which could provide insights about treatment effectiveness, the burden of illness, the nature of a condition, or the indirect health care costs and benefits from taking the treatment, and feed into a more adaptive model of HTA [56]. This is already in place in some countries such as Sweden or France (under the ATU scheme), where it has contributed to dealing with uncertainty in some of the cases evaluated without imposing additional conditions or restrictions.
This study identified differences across countries in the type of evidence that is considered appropriate and in interpreting the same evidence, contributing to explaining different HTA recommendations. A more formalised and consistent recognition of the acceptability criteria for evidence and uncertainty is needed, which could be achieved by generating criteria based on past decisions such as the specific circumstances (e.g. early marketing authorisation) or quality standards (e.g. reliability, validity) required. The agency-specific risk and value preferences identified in this study could also be a good starting point for shedding light on the more common circumstances already arising in the different countries.
Practical implications
This research is in line with the recognised need to better understand pricing and reimbursement systems through cross-country learning and sharing of experiences [57]. It may be useful for European-level initiatives, such as the pilot for a common European HTA (EUnetHTA), as it sheds light on the different applications of HTA and the reasons for differences in the HTA recommendations made, which can feed into discussions when seeking greater consensus across Member States. It may also feed into the new programmes that have since been implemented for orphan drugs (PACE programme at SMC), and for ultra-orphan drugs (NICE’s Highly Specialised Technology (HST) programme, SMC’s ultra-orphan drug decision framework), as well as HAS’s recent requirement for an economic evaluation. These recent developments all have in common (with the exception of the HST programme) that they are add-ons to conventional programmes. Therefore, better understanding of how value is being assessed within these conventional programmes and the reasons for cross-country differences is relevant to identifying issues and potential ways forward for their continuous improvement, while acting as a reference when evaluating these new programmes. This is all the more significant given their recentness, where little is known about their impact.
Results and the systematic approach used may also feed into other forms of research around priority setting. The retrospective identification of the criteria driving previous decisions, applied in this study, is also recognised as one approach to criteria elicitation for multiple criteria decision analysis (MCDA) when used for priority setting [58]. When comparing the criteria identified in this study to those elicited by the EVIDEM project for the purpose of MCDA, similarities were seen. For example, unmet need was categorised as unmet need in efficacy, in safety, in patient-reported outcomes and patient demand [59]. This study identified the different expressions of unmet need, such as: the importance of new treatment options, the lack of (satisfactory) treatment alternatives, alternatives not routinely available, the need to improve therapeutic management, and so forth. Identifying the different expressions of such criteria in practice may feed into defining their attribute levels during the criteria elicitation processes (e.g. MCDA, discrete choice experiments).
A more recent study developed a value proposition based on 19 social value arguments about orphan drug reimbursement decisions, summarised into four value-bearing factors (disease-related, treatment-related, population-related and socio-economic factors) [60]. Most of these factors were identified in this study (Fig. 2), with the exception of the identifiability of treatment beneficiaries, the impact on the distribution of health, or any of the socio-economic factors. These corroborate the content validity of the findings, and showcase the ability to identify how these factors are expressed in practice. Another example is the second component, “decision-making process”, of the evidence-informed framework developed by Dr Stafinski and colleagues, comprising a list of 7 questions important for resource allocation decisions, and which corresponds to the decision-making processes analysed in this study [61]. This research and the approach used allow one to identify how some of the key questions are expressed in practice during these decision processes, namely those about “information inputs” and “information sources”, “social value judgements” and “deliberations”, which correspond to the “evidence” and “interpretation of the evidence” components, respectively, from the methodological framework applied in this study [4].
Limitations and need for further research
This research is not without its limitations. First, the data were mainly collected from secondary sources. It would have been preferable to have full information about the submissions (e.g. manufacturer submission), but this was not possible in the current scheme. The information obtained by applying the methodological framework was unavoidably limited by the level of detail provided in the HTA reports and by whether the framework captures all aspects of the decision-making process [4]. The information published was assumed to be transparent and to reflect the main determinants driving the decisions (transparency directive). The analysis of these published documents was considered to provide sufficient detail and explain how decisions were reached. Additionally, triangulation with other data sources ensured that sufficient detail was captured for each case study [e.g. HTA reports, additional material, and input from HTA experts (Advance-HTA consortium, conferences)]. Results were also presented to and discussed with the HTA bodies, ensuring that the interpretation of the decisions in this research was accurate. Second, there were sampling issues arising from differences among the four agencies in the way they select topics for their assessments. Despite these differences, a suitable sample was identified. Third, this research focused specifically on orphan drugs, which undergo the same HTA process as drugs for more common conditions. Some of the findings may also be applicable to these more common conditions. One component of the analysis did focus on identifying those challenges that are specific to, but not necessarily unique to, dealing with these rarer conditions, and on drawing key lessons from these. A final limitation is the relatively small sample size, which does not allow for multivariate regression analysis.
However, this research resulted in meaningful outputs derived from a more in-depth and qualitative component showing that differences across countries do matter. A more structured understanding of the possible explanations for differences was derived from the findings, allowing subsequent, more quantitative analyses to focus on certain aspects of the decision-making process across a greater sample. Further research could look at the drivers of these differences across a larger sample of drugs and therapy areas using multivariate regression analysis for a greater generalisation of the results, extending it to other types of drugs to assess how different agencies assess different drug and disease characteristics. In order to maintain the depth and breadth of the analysis building on the methodological framework used in this study, it is highly recommended to begin by prioritising the qualitative strand to ensure that the depth of the processes is captured and comparable across settings.