Key Points for Decision Makers

Of the identified health technology assessment (HTA) evaluation reports, 22% presented an indirect treatment comparison (ITC); the acceptance rate of ITC methods by HTA agencies in oncology was generally low (30%).

The results of this analysis suggest that, whilst ITCs may provide relevant evidence in the absence of a direct comparison, the attitudes of HTA agencies towards accepting ITCs in HTA evaluations vary.

There is a need for further clarity on the properties of ITC techniques and the assessment of their results.

1 Introduction

Randomized controlled trials (RCTs) are the gold standard for comparing clinical safety and efficacy between two or more therapies for a particular population, allowing direct head-to-head comparisons to be made. To inform decision making for drug reimbursement, Health Technology Assessment (HTA) agencies state a clear preference for RCT comparisons [1, 2].

However, head-to-head RCTs may not always have been conducted between a new intervention and the relevant comparator(s) at the time of HTA evaluation. In some therapeutic areas, RCTs may assess competing treatments against placebo and, as such, direct comparisons versus relevant comparator therapies are not available. New comparators may also emerge rapidly, particularly through parallel drug development, making comparison against the most relevant comparator at the time of evaluation difficult. Moreover, comparators developed by separate companies pose challenges with regard to conducting RCTs for competing treatments. In other cases, ethical considerations may prevent direct comparisons from being conducted, particularly in life-threatening disorders or those causing disability. Moreover, a direct comparison may not always be feasible; for example, in rare diseases where patient numbers are often extremely low [3,4,5]. Finally, in cases where multiple comparators are of interest, RCTs directly comparing multiple interventions simultaneously are often unavailable, and alternative methods for analysis are necessary.

In these situations, an indirect treatment comparison (ITC) may aid decision making for HTA agencies and healthcare budget holders. However, the degree to which HTA agencies accept the use of ITC methods varies across different countries. In England, guidance from the National Institute for Health and Care Excellence (NICE) indicates that where a direct comparison is not possible, ITC methods may be utilized to compare the clinical benefit and cost-effectiveness of different therapeutic interventions [6]. In Germany, the use of ITC methods within the framework of benefit assessments is not advised by the Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen (IQWIG) and the Gemeinsamer Bundesausschuss (G-BA). However, ITC methods may be considered in certain situations, such as for the evaluation of drugs with novel active ingredients [7], and for health economic evaluations [1]. In France, the Doctrine of the Transparency Committee (TC) of the Haute Autorité de Santé (HAS) states that in the absence of a direct comparison, an ITC carried out in accordance with defined and validated methodologies may be taken into account [8]. Similarly, the standard operating procedure of the Red de Evaluación de Medicamentos del Sistema Nacional de Salud (REvalMed–SNS) and the Agencia Española de Medicamentos y Productos Sanitarios (AEMPS) in Spain states that an ITC may be considered in cases where a direct comparison is not appropriate [9]. Finally, clear guidance on the use of ITCs is not currently available from the Agenzia Italiana del Farmaco (AIFA) in Italy [10].

In this paper, “ITC” is used to describe any technique for comparing treatments that does not involve a direct head-to-head RCT. In particular, it covers both indirect comparisons using a connected network, such as a network meta-analysis (NMA), and disconnected indirect comparisons, such as unanchored matching-adjusted indirect comparisons (MAICs) and simulated treatment comparisons (STCs).

To assess the relative efficacy of a new intervention versus one or more comparators using an ITC, selection of the most appropriate ITC technique is crucial. However, the choice of ITC technique can vary substantially. Methods of adjusted indirect comparison are based on a connected network to estimate relative treatment effects via a shared comparator. Alternatively, adjusted indirect comparison can make use of population-adjusted methods, ensuring comparability between populations through adjustment based on treatment effect modifiers (TEMs). Unadjusted or naïve ITCs negate the randomized nature of each individual RCT by comparing absolute outcomes across trials. As such, adjusted ITCs should always be prioritized [11, 12].
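As an illustration of how an adjusted indirect comparison works through a shared comparator, the sketch below applies the Bucher method (one standard technique of this kind) to two hypothetical trials, A versus C and B versus C, on the log hazard ratio scale. The function name and all numerical inputs are illustrative assumptions and are not taken from any HTA evaluation report in this analysis.

```python
import math

def bucher_itc(log_hr_ac, se_ac, log_hr_bc, se_bc, z=1.96):
    """Indirect comparison of A vs B through the common comparator C.

    Returns the hazard ratio of A vs B, its confidence limits,
    and the standard error on the log scale.
    """
    # Relative effects are differenced on the log scale, so the effect
    # of the shared comparator C cancels out.
    log_hr_ab = log_hr_ac - log_hr_bc
    # The two trials are independent, so their variances add.
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    lower = math.exp(log_hr_ab - z * se_ab)
    upper = math.exp(log_hr_ab + z * se_ab)
    return math.exp(log_hr_ab), lower, upper, se_ab

# Hypothetical inputs: HR(A vs C) = 0.60, HR(B vs C) = 0.80.
hr, lower, upper, se = bucher_itc(math.log(0.60), 0.15, math.log(0.80), 0.12)
print(f"HR A vs B: {hr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the variances of the two trial-level estimates add, the indirect estimate is always less precise than either of the direct estimates that feed into it.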

Within adjusted indirect comparisons, the standard ITC techniques such as the Bucher method and NMA assume that relative treatment effects are constant across included trials, thereby potentially biasing the resulting estimates in cases where differences exist [13, 14]. To provide a robust relative treatment effect estimate, studies included in the ITC should be sufficiently similar in terms of study design, patient characteristics, treatments, and outcomes measured [15]. Where differences in effect modifiers exist between trials, population-adjusted indirect comparison methods [MAIC, STC, network meta-regression (NMR), and propensity score-based techniques] can be utilized to estimate relative treatment effects [16,17,18,19,20]. Relative treatment effects depend on the specific distribution of these effect modifiers, and therefore it is necessary to project treatment effects onto the target population. MAIC and STC are limited in this respect, as these methods are only able to estimate treatment effects in the aggregated trial population, which may differ too extensively from the target population. Alternatively, assumptions made by NMR and multilevel NMR (ML-NMR) can be utilized to obtain reliable treatment effect estimates in a target population.
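To make the reweighting step behind MAIC concrete, the following minimal sketch estimates method-of-moments weights for a single effect modifier, so that the weighted mean of the individual patient data matches the mean reported for the aggregate-data trial. It is a simplified illustration only: the function names, the single covariate, and the example values are assumptions, and a real analysis would match several covariates using an established statistical package.

```python
import math

def maic_weights(ipd_values, target_mean, iterations=50):
    """Method-of-moments MAIC weights for one covariate (illustrative only)."""
    # Centre the individual-level covariate on the aggregate-data mean.
    centred = [x - target_mean for x in ipd_values]
    alpha = 0.0
    # Newton iterations minimising sum(exp(alpha * c)), a convex objective
    # whose minimum balances the weighted covariate mean on the target.
    for _ in range(iterations):
        weights = [math.exp(alpha * c) for c in centred]
        gradient = sum(c * w for c, w in zip(centred, weights))
        hessian = sum(c * c * w for c, w in zip(centred, weights))
        if hessian == 0.0:
            break
        alpha -= gradient / hessian
    return [math.exp(alpha * c) for c in centred]

def effective_sample_size(weights):
    """Approximate number of independent patients left after weighting."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Hypothetical example: patient ages in the IPD trial; the comparator
# trial reports a mean age of 63.
ages = [50.0, 55.0, 60.0, 65.0, 70.0]
weights = maic_weights(ages, target_mean=63.0)
weighted_mean = sum(w * x for w, x in zip(weights, ages)) / sum(weights)
ess = effective_sample_size(weights)
print(round(weighted_mean, 3), round(ess, 2))
```

The effective sample size falls as the weights become more uneven, which is one reason why large differences between the trial population and the target population limit what MAIC can achieve.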

MAIC and STC require individual patient data (IPD) from at least one trial to match the IPD to the aggregated data (AgD) of the comparator arm from another trial where IPD is unavailable [11]. NMR and ML-NMR are regression-adjustment methods that incorporate trial-level covariates to accommodate variability between studies and adjust for heterogeneity between trials [11]. Propensity score (PS) matching and PS weighting require access to IPD for each arm of the analysis [21]. Whereas RCTs randomize patients in advance, PS methods act as a post hoc analogue of randomization, balancing covariates at the “randomization” point. PS weighting approaches include inverse-probability treatment weighting (IPTW) and standardized mortality/morbidity ratio weighting (SMRW) [22].
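The two PS weighting schemes named above differ only in the weight given to each patient. The sketch below shows the standard weight formulas, assuming the propensity score (the estimated probability of receiving the treatment of interest) has already been obtained; the function names and the example scores are hypothetical.

```python
def iptw_weight(treated, ps):
    """IPTW: both groups are weighted to resemble the whole study population."""
    return 1.0 / ps if treated else 1.0 / (1.0 - ps)

def smrw_weight(treated, ps):
    """SMRW: treated patients keep weight 1; comparator patients are
    reweighted to resemble the treated group."""
    return 1.0 if treated else ps / (1.0 - ps)

# Hypothetical patients: (received treatment of interest, propensity score).
patients = [(True, 0.8), (True, 0.5), (False, 0.5), (False, 0.2)]
for treated, ps in patients:
    print(treated, ps,
          round(iptw_weight(treated, ps), 2),
          round(smrw_weight(treated, ps), 2))
```

IPTW targets the average effect in the combined population, whereas SMRW targets the effect in patients who received the treatment of interest.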

To date, formal methodological guidance is available on the appropriate use of ITCs for HTA evaluations at both a national and a European level. National guidance includes a series of NICE Technical Support Documents (TSD) produced by the NICE Decision Support Unit (DSU) [23], as well as published guidance on ITC methods from HAS [12] and IQWIG [1]. Similar guidance has also been produced by the European Network for Health Technology Assessment (EUnetHTA) at a European level [11]. Furthermore, an International Society for Pharmacoeconomics and Outcomes Research (ISPOR) task force has produced reports on good practice for conducting and interpreting ITCs, as well as a questionnaire to assess ITCs for decision making for policy makers and other healthcare practitioners [24,25,26]. These guidelines focus on standard techniques for ITCs to support HTA agencies in their decision making processes. However, due to the emergence of new and increasingly complex ITC techniques in recent years, selecting an appropriate methodological approach can be difficult, and requires careful consideration.

This study aimed to analyze ITC methods presented in HTA evaluation reports between 2018 and 2021, and comprehensively evaluate the acceptance of these methods by national HTA agencies in England, France, Germany, Italy, and Spain. The analysis focused on oncology treatments for solid tumors, given the high number of interventional trials and the potential limitations associated with conducting direct comparisons in this therapeutic area [27]. The focus on oncology allows for a comparison across a homogeneous sample of HTA evaluations, and the standardized endpoints used in oncology clinical trials also facilitate comparative assessments [27].

2 Methods

2.1 Search Strategy

Electronic searches were conducted on 17 May 2021 via the PrismAccess database, a private comprehensive platform that comprises HTA evaluation reports from national HTA agencies across the world [28]. The eligibility criteria of the study specified HTA evaluation reports for oncology treatments in solid tumors, assessed by national HTA agencies in England, France, Germany, Italy, and Spain. These countries were selected as the principal pharmaceutical markets in Europe and because of the availability of HTA evaluation reports published by their HTA agencies. A three-year study period, from April 2018 to April 2021, was chosen to provide sufficient records for a robust analysis while keeping the period short enough to reflect the current acceptance of ITCs.

The resulting hits were documented in a Microsoft Excel file, and electronic versions of the available HTA evaluation reports were downloaded. The reports were sourced from the websites of NICE for England, the TC of HAS for France, IQWIG and G-BA for Germany, AIFA for Italy, and AEMPS for Spain.

To complement information available from the published HTA evaluation reports, a pragmatic search of additional materials in the public domain was conducted to identify any key missing information. Additional information was found in transcripts from the TC in France and in NICE Committee papers in England.

2.2 Review and Selection Process

From the identified HTA evaluation reports, those that presented ITCs were selected for analysis using preferred search terms related to ITCs in the local language (Supplementary Data 1 in Online Resource 1) and screened by two independent reviewers. Only those HTA evaluation reports that presented an ITC were utilized in the main analysis.

2.3 Data Extraction

Information on the ITC methods presented in each HTA evaluation report was extracted into a prespecified data extraction table in Microsoft Excel. Extractions were performed by a first reviewer and all information extracted was validated by a second reviewer. For each ITC, the following information was extracted: the ITC technique and how it was applied, the sources of clinical evidence and type of data used, and the acceptance and criticisms of the ITC methods by the HTA agency.

The acceptance of the ITC methods was reported as “Yes” when it was clearly stated in the report that the ITC technique used was appropriate, “No” when it was clearly stated in the report that the ITC technique used was unsuitable, or “Unclear” where no clear conclusion was reported.

Before formally undertaking the study, a targeted exploratory review of HTA reports and the criticisms reported of the ITC methods was undertaken. This allowed the criticisms of the ITC methods to be classified into three main predefined categories: (1) limitations related to data, (2) limitations related to methodology, and (3) limitations related to uncertainty. These categories were further divided into 11 subcategories of specific criticisms to capture greater granularity (Table 1). Additional details of the criticism subcategories are presented in Supplementary Data 2 in Online Resource 1.

Table 1 Categories of criticisms

2.4 Analyses

The following analyses were performed for each country included in the analysis: the proportion of HTA evaluation reports that presented at least one ITC, the overall acceptance rate of the ITC methods, the ITC technique(s) utilized, the acceptance rate of the ITC methods per technique, the proportion of each criticism (sub)category per ITC technique, and the source of clinical evidence used for each ITC presented.

3 Results

The electronic searches on PrismAccess yielded a total of 654 hits for HTA evaluation reports for oncology treatments in solid tumors published between April 2018 and April 2021 in the five countries of interest. A total of 111 HTA evaluation reports were removed before screening, including 106 duplicates and five evaluations where no publicly available reports were identified. The large number of duplicates identified was primarily due to the PrismAccess database containing reports that were repeated across IQWIG and G-BA publications.

The initial screening excluded 363 HTA evaluation reports where no search terms relating to ITCs were mentioned in the report. The secondary screening excluded a further 60 reports due to the absence of ITC methods being specifically reported, despite a search term being mentioned. The remaining 120 HTA evaluation reports were distributed across the five HTA countries as follows: NICE (N = 38), IQWIG and/or G-BA (N = 21), AIFA (N = 29), AEMPS (N = 21), and HAS TC (N = 11) (Fig. 1).

Fig. 1 PRISMA flow diagram for identification of HTA evaluation reports presenting at least one ITC across England, France, Germany, Italy, and Spain

3.1 Number of HTA Evaluation Reports that Included an ITC

Once duplicates were removed, 543 HTA evaluation reports for oncology treatments in solid tumors published between April 2018 and April 2021 in the five countries of interest were retrieved. Overall, 22% of these reports (120/543) presented at least one ITC and were therefore included in the analysis.
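The report counts in the selection flow can be reproduced with simple arithmetic. The short sketch below recomputes them from the figures given in Section 3; the variable names are illustrative only.

```python
# Counts reported in the selection flow (Section 3 and Fig. 1).
hits = 654
duplicates = 106
no_public_report = 5
no_itc_search_term = 363      # excluded at initial screening
no_itc_method_reported = 60   # excluded at secondary screening

screened = hits - duplicates - no_public_report
included = screened - no_itc_search_term - no_itc_method_reported

# Included reports per agency, as listed for the five countries.
by_agency = {"NICE": 38, "IQWIG and/or G-BA": 21, "AIFA": 29,
             "AEMPS": 21, "HAS TC": 11}
assert sum(by_agency.values()) == included

share = 100 * included / screened
print(f"{included}/{screened} = {share:.0f}%")  # prints "120/543 = 22%"
```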

England had the highest proportion of HTA evaluation reports that included an ITC (51%), followed by Spain (44%), Italy (24%), Germany (18%), and the lowest being France (6%) (Table 2).

Table 2 Number of HTA evaluation reports including an ITC by country

3.2 Techniques Used for ITC

Several ITC techniques were identified in the HTA evaluation reports and were categorized as follows: NMA, NMR, Bucher ITC, MAIC, STC, IPTW, adjusted ITC (method not recorded), unadjusted/naïve ITC (method not recorded), and not determined. The adjusted and unadjusted/naïve categories were utilized for cases where the HTA evaluation report described the ITC using these terms and no specific technique was explicitly stated. ITC techniques were classified as “not determined” where the HTA evaluation report did not provide information on the ITC technique used. ITC techniques reported as unadjusted/naïve were combined under a single “unadjusted” category.

The analysis of ITC techniques showed that among all HTA evaluation reports presenting an ITC (N = 120), the most common category was “not determined,” representing 30% of all identified HTA evaluation reports, followed by NMA (23%), Bucher ITC (19%), unadjusted ITC (14%), and MAIC (12.5%). The high proportion of “not determined” techniques was mainly driven by the HTA evaluation reports from Italy, which accounted for 27 of the 36 reports presenting ITCs for which the technique used was not determined (Fig. 2). Excluding Italy, the proportion of “not determined” ITC techniques fell to 10% (9/91). Since the terms “unanchored” and “anchored” were not explicitly stated for the MAICs presented in most of the HTA evaluation reports, it was not possible to make this distinction, except when efficacy was based on single-arm studies; these were defined as “unanchored” comparisons.

Fig. 2 Distribution of ITC techniques presented in HTA evaluation reports across England, France, Germany, Italy, and Spain

The analysis per country indicated that NMA was the most commonly used ITC technique in England, representing 45% of the HTA evaluation reports to NICE, contrary to Germany, where zero NMAs were identified. NMAs were also reported in 45% of reports identified for France. The Bucher ITC technique was the most commonly used technique in Germany (57%), followed by France (27%), England (18%), and Spain (5%). A MAIC accounted for 27% of the ITC techniques presented in the French HTA evaluation reports, followed by England (24%) and Germany (14%). An STC was only used in HTA evaluation reports in England (8%). The IPTW technique was utilized in 27% of French HTA evaluation reports and 5% of English HTA evaluation reports. Adjusted ITC techniques, where the specific technique used was unknown, were presented in 19% of Spanish HTA evaluation reports, 14% of German HTA evaluation reports, and 3% of English HTA evaluation reports. The use of unadjusted ITC techniques was only found in HTA evaluation reports from Spain and England (43% and 21% of reports in these countries, respectively) (Fig. 3).

Fig. 3 ITC techniques presented in HTA evaluation reports by country*

3.3 Source of the Clinical Evidence Used in the ITCs

Clinical evidence used in the ITCs was largely derived from RCTs (52% of the HTA evaluation reports), followed by single-arm studies (17%). The source of the clinical evidence was not cited in 32% of the HTA evaluation reports, but this was mainly driven by the HTA evaluation reports published in Italy, accounting for 76% of reports that did not describe the source of clinical data (Fig. 4).

Fig. 4 Sources of clinical evidence used in ITCs presented in HTA evaluation reports across England, France, Germany, Italy, and Spain

Where ITCs were based on single-arm studies, the data used to estimate the relative treatment effect were more often derived from a clinical trial (60%) than from a real-world evidence (RWE) study (30%; from a retrospective observational study). The ITC techniques based on single-arm studies were mostly unanchored MAICs or STCs (55%), unadjusted comparisons (35%), and IPTW (5%). Details of the source of clinical evidence per country can be found in Supplementary Data 4 in Online Resource 1.

3.4 Acceptance Rate of ITC Methods by HTA Agencies

The acceptance rate of the ITC methods was classified into three categories: “Yes,” when it was clearly stated in the report that the ITC technique used was appropriate; “No,” when it was clearly stated in the report that the ITC technique used was unsuitable; or “Unclear,” where no clear conclusion was reported.

The overall acceptance rate of ITC methods was 30%, with the highest acceptance rate observed in England (47%), followed by Germany (38%), Italy (31%), and Spain (5%), with the lowest acceptance rate in France, where no included ITC methods were accepted (0%) (Table 3).

Table 3 Overall acceptance rate of ITC methods presented in HTA evaluation reports by country

The overall rejection rate of ITC methods was 21%, with the highest rejection rate observed in France (91%), followed by Germany (33%), England (18%), and Spain (5%), with the lowest being Italy (0%) (Table 3).

Finally, the acceptance of the included ITC methods was considered unclear in 49% of the HTA evaluation reports reviewed.
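As a rough consistency check on the figures above, the country-level rates can be weighted by the number of included reports per agency (from Section 3) to recover the overall rates. This sketch assumes that each report contributes one acceptance judgement and that the published percentages are rounded, so small discrepancies are expected; the variable names are illustrative.

```python
# Included reports per country and the reported country-level rates (%).
reports = {"England": 38, "Germany": 21, "Italy": 29, "Spain": 21, "France": 11}
accepted_pct = {"England": 47, "Germany": 38, "Italy": 31, "Spain": 5, "France": 0}
rejected_pct = {"England": 18, "Germany": 33, "Italy": 0, "Spain": 5, "France": 91}

def pooled_rate(pct_by_country):
    """Report-weighted average of the country-level percentages."""
    total = sum(reports.values())
    return sum(reports[c] * pct_by_country[c] for c in reports) / total

overall_accepted = pooled_rate(accepted_pct)
overall_rejected = pooled_rate(rejected_pct)
overall_unclear = 100 - overall_accepted - overall_rejected

print(round(overall_accepted), round(overall_rejected), round(overall_unclear))
# prints "30 21 49"
```

The pooled values agree with the reported overall acceptance, rejection, and unclear rates of 30%, 21%, and 49%.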

The acceptance rate of each ITC technique by country is presented in Table 4. A full breakdown of ITC methods acceptance rates by country can be found in Supplementary Data 3 in Online Resource 1.

Table 4 Acceptance rate of ITC techniques presented in HTA evaluation reports by country

The most frequently accepted ITC techniques were NMA, Bucher ITC, and MAIC. In total, 39% of the HTA evaluation reports presenting an NMA, 43% of the HTA evaluation reports presenting a Bucher ITC and 33% of the HTA evaluation reports presenting a MAIC, reported acceptance of the ITC technique used.

3.5 Criticism of ITC Methods

For each ITC technique, criticisms from the HTA agency were analyzed by a number of predefined categories and subcategories presented in Table 5. Additional details of the criticism subcategories are presented in Supplementary Data 2 in Online Resource 1.

Table 5 Proportion of each subcategory of ITC criticism in HTA evaluation reports by ITC technique

The most frequently cited criticism among all HTA evaluation reports, regardless of the ITC technique, was “heterogeneity and risk of bias,” which was reported in 48% of all HTA evaluation reports (Fig. 5). “Lack of/unclear data” and “statistical methods” were also frequently mentioned across the reports analyzed (43% and 41%, respectively). The category “not determined” (32%) was largely due to HTA evaluation reports from Italy, 93% of which provided limited information with regard to the criticisms of the included ITC.

Fig. 5 Distribution of criticisms of ITC methods in HTA evaluation reports by category and subcategory

The same distribution of criticisms was observed for each type of ITC technique across all HTA evaluation reports reviewed (Table 5).

4 Discussion

To the authors’ knowledge, this is the first study to comprehensively assess the acceptance of ITC methods in oncology by HTA agencies across England, France, Germany, Italy, and Spain. These countries were selected as the principal pharmaceutical markets in Europe and because of the availability of HTA evaluation reports published by their HTA agencies. A systematic approach was utilized for the search, review and selection process, extraction of data, and analyses. The results were reported quantitatively and qualitatively as robustly as possible based on the available data.

The analysis showed that an ITC was presented in 22% of the identified HTA evaluation reports assessing treatments for solid tumors in oncology. This suggests that direct comparisons were not sufficient and/or available in at least one in five HTA evaluations, and that an ITC was therefore performed to compare the efficacy of the intervention against the appropriate comparator(s). As such, although direct comparison remains the gold standard approach, the use of ITC methods is not uncommon, consistent with HTA methodological guidance for ITCs published over the last ten years [1, 2, 12, 21].

An exploratory analysis was conducted to investigate how manufacturers may submit different ITCs to different HTA agencies. For this analysis, the case of pembrolizumab in combination with pemetrexed and platinum chemotherapy in first-line treatment of metastatic non-small cell lung cancer was selected; this was the only case where an ITC was submitted to at least four countries (England, Germany, Spain, and France; no data publicly available for Italy), as reported by HTA evaluation reports. The fact that only one case was identified as presenting an ITC in HTA evaluation reports for as many as four countries may indicate that manufacturers are influenced by the differing receptivity to ITCs anticipated across HTA agencies.

For England, an NMA was initially presented, comparing pembrolizumab in combination with pemetrexed and platinum chemotherapy versus a number of different chemotherapies, with 17 studies included within the network. A second ITC was presented using the Bucher method, comparing pembrolizumab in combination with pemetrexed and platinum chemotherapy versus pembrolizumab monotherapy. A Bucher ITC for the same comparison of pembrolizumab in combination with pemetrexed and platinum chemotherapy versus pembrolizumab monotherapy was also presented in France, Spain, and Germany. For England, the ITC included the KEYNOTE-189 and KEYNOTE-024 trials [29, 30]. In contrast, the evaluation report for Germany also included KEYNOTE-021G and KEYNOTE-042 [31, 32]. NICE criticized the absence of KEYNOTE-021G, stating that these data should have been included within the ITC analysis. No such details were presented in the French and Spanish HTA evaluation reports. This highlights that the decision made by manufacturers as to whether to present an ITC may depend on the treatment landscape and relevant comparators within the relevant country, the acceptability of ITCs by HTA agencies, data availability, and the acceptability of the inclusion and exclusion criteria for different sources of data (e.g., clinical trials). Further analysis of additional cases may expand upon the reasons provided here; however, in-depth analysis remains limited due to the lack of data provided in many HTA evaluation reports.

Differences arose between countries in the proportion of HTA evaluation reports that included an ITC, with the lowest proportion of HTA evaluation reports with an ITC being in France (6%). The results from this analysis are influenced by whether or not manufacturers decide to submit an ITC to an HTA agency, in anticipation of the receptivity to such analysis. The results by country may vary because of differences in the submission specifications (such as the most appropriate comparator in clinical practice in each country), but also because of differences in the evaluation conducted by the HTA agency. Fewer ITCs were submitted in French reimbursement dossiers, despite the Doctrine of the TC and HAS methodological guidance for ITCs stating that “the use of indirect comparisons may be considered” [8, 12]. In Germany, 18% of HTA evaluation reports included an ITC, which is unsurprising given that the IQWIG general methods guidelines state that “there are still several unsolved methodological problems, so that currently the routine application of [indirect comparison methods] within the framework of benefit assessments is not advisable” [1]. Conversely, the highest proportion of reports with an ITC was identified in England (51%). This reflects NICE recommendations, which explicitly state that “data from trials that compare the technology with non-relevant comparators may be needed to enable the technology and the comparators to be linked in an indirect or mixed treatment comparison.” NICE also recommends that full details of the ITC methodology should be presented in the manufacturer HTA dossier [2].

Among the five countries of interest, the overall acceptance rate of the included ITCs was estimated to be 30%. This indicates that although head-to-head comparisons are preferred, the use of an ITC can be considered to inform HTA recommendations. Nevertheless, while ITC methods are evaluated by HTA agencies, their acceptance rate remains suboptimal. The highest acceptance rate was observed in England (47%) and the lowest in France (0%). Although HAS has published methodological guidance for ITC methods, the acceptance of ITC methods in France remains poor. Whilst a direct link cannot be made from the evidence collected in this analysis, this low acceptance of ITC methods by HAS raises further questions about the overall reimbursement of new oncology interventions in France, which may justify further investigation.

The analysis of ITC techniques utilized showed that the most commonly used ITC techniques across the five countries were NMA (23%), Bucher ITC (19%), and unadjusted ITC (14%). In particular, Bucher ITCs were the most commonly used technique in Germany (57% of ITCs), which is in line with IQWIG guidelines for ITC methods. NMAs were reported in 45% of reports identified for France; however, these results should be interpreted with caution due to the comparatively low number of HTA evaluation reports analyzed (N = 11). Research on the methodology of ITC techniques indicates that these are evolving quickly and that improved techniques may be available in the future, particularly in the area of multi-level network meta-analysis [33,34,35]. Moreover, further work is also needed to provide a detailed comparative analysis of the practical performances of different ITC techniques and to reduce the substantial risk of bias associated with ITCs conducted using nonrandomized data [36, 37].

The analysis of HTA agency criticisms of the ITC methods used showed that the most common criticisms across the five countries related to heterogeneity/risk of bias, lack of/unclear data, and the statistical methods used. This finding was consistent with Werner et al., which also found that study similarity and statistical methods were amongst the most common reasons for rejection of ITCs by IQWIG [38]. This may be due to the inappropriate selection of statistical methods and the failure to address heterogeneity via the chosen methodology. Alternatively, statistical methods may be subject to criticism in cases where a more appropriate method is unfeasible, for example due to a paucity of data. Although various statistical methods may be associated with minor sources of bias, they are generally suitable when used within the appropriate sphere of application. As such, there appears to be a need for further clarity on the appropriate use of statistical methods when conducting ITCs and for guidelines on the assessment of the results of such methods.

Across this analysis, the overall acceptance rate of ITCs was generally low, and a range of criticisms from HTA agencies were identified. This suggests that, whilst in the absence of a direct comparison ITCs may bring additional relevant evidence, there remains a need to select appropriate ITC techniques, applied to good quality data, with adequate considerations regarding heterogeneity and risk of bias. Since most HTA evaluations are based on ITCs submitted by the manufacturer, which are not necessarily published in peer-reviewed journals, this may increase the risk of the inappropriate use of ITC methods. There is also a need for an international consensus on the appropriate use of ITC methods, which could improve the quality of ITCs submitted to HTA agencies and the rate of ITC methods acceptance. Research to compare the performance of the different statistical methods is needed to establish the preferred method(s). This analysis found higher acceptance rates of adjusted ITCs (including Bucher ITCs), compared with unadjusted ITCs, in England, France, and Germany, which is in line with currently available methodological guidance. This was consistent with Lebioda et al., which also found adjusted ITCs to be more generally accepted than unadjusted ITCs [39]. Although justification for the choice of ITC methodology is rarely provided in HTA evaluations, it is likely that this depends on the availability of data and limitations in the network.

The main source of clinical evidence used in the included ITCs was RCTs (52%). Surprisingly, a large proportion (32%) of HTA evaluation reports did not describe the source of clinical evidence (e.g., whether the clinical data were derived from an RCT or single-arm trial). Seventeen percent of the HTA evaluation reports used data from single-arm studies to estimate the relative treatment effect in the ITCs conducted. This is consistent with the growing number of single-arm studies conducted in oncology and the growing number of oncology drug candidates that are obtaining regulatory approval based on noncomparative trials. In terms of the ITC techniques found to be accepted in cases of single-arm studies, the most commonly reported technique was a MAIC (included in 13% of the HTA evaluation reports). STCs, another indirect comparison approach typically utilized with single-arm studies, remained largely underrepresented (included in only 3% of the HTA evaluation reports). For the ITCs that were based on single-arm studies, data for the external comparator arm were more often derived from a clinical trial than from a RWE study, which reinforces findings from Patel et al., which found that 24% of submissions in oncology (including hematological cancers) used a clinical trial as the source of external comparator data, versus 20% that utilized RWE data [40].

It should be noted that new guidelines for the use of ITCs have recently been developed by EUnetHTA 21, to be applied from January 2025 with the implementation of the EU HTA regulations [11, 41]. It would be pertinent to reassess the findings of this study after the new regulations have been implemented for new oncology products and advanced therapy medicinal products.

4.1 Limitations of this Analysis

There are several limitations to this analysis. Firstly, only a small number (<10) of HTA evaluation reports were identified for several ITC techniques, including NMR, STC, IPTW, and adjusted ITCs (method not reported). Moreover, the ITC technique was categorized as “not determined” in 30% of the HTA evaluation reports because no further details were available, most notably within Italian HTA evaluations. This category was kept in the analysis to avoid any bias in data interpretation. This lack of information illustrates the need for more transparency in published HTA evaluation reports, although it may also reflect the wide use of unpublished evidence from industry in HTA decision making. Additionally, the proportions of reported ITC techniques may reflect the evidence structures available for the HTA evaluations, rather than the particular strengths or weaknesses of the techniques themselves. Secondly, many HTA evaluation reports only stated that an adjusted ITC was used, without additional information on the method. This limits the assessment of the value of the approach used and, again, calls for more transparency from HTA agencies, both in the information included in their publicly available HTA evaluation reports and in their guidance for the reporting of ITC methods. Thirdly, there was a paucity of publicly available data in the Italian HTA evaluation reports, with no information to determine the ITC techniques used in 93% (27/29) of the ITCs presented. This suggests a specific lack of transparency in the Italian HTA process, compared with the other countries in this analysis. Nevertheless, the HTA evaluation reports from Italy were kept within the analysis, because little published literature has assessed the use of ITCs in Italian HTA evaluation reports.

Additionally, the categories of criticisms were limited, because unclear reporting across HTA agencies resulted in overlap between the categories provided. As such, categorization could be improved with additional transparency and evaluation of HTA reports over time. Nevertheless, the analysis found that at least 40% of ITCs were criticized on grounds of heterogeneity/risk of bias, lack of/unclear data, and the statistical methods used.

Finally, the analysis was limited to publicly available HTA evaluation reports, which may not be representative of the full set of files submitted by manufacturers. The analysis was also restricted to oncology. Similar trends may be relevant to other therapeutic areas; however, this requires further exploration. Overall, this analysis was therefore limited by discrepancies in the ITC information available in the published HTA evaluation reports among the five countries of interest. In addition, the acceptance of the ITC methods was not clearly stated in most of the HTA evaluation reports, and it was not always reported whether the conclusions of the benefit assessment were partly inferred from the ITC. This illustrates a lack of transparency by HTA agencies regarding the weight given to the ITC, among the full body of evidence, in the reimbursement decision-making process. Greater transparency is needed to strengthen the evaluation of ITC methods and their acceptance by HTA agencies. There is scope for further evaluation of the correlation between the type of comparative evidence submitted to HTA agencies (direct versus indirect) and the achievement of a positive recommendation for reimbursement.

At a regulatory level, while RCTs remain the gold standard, the European Medicines Agency (EMA) has not yet adopted a formal position on the acceptability of ITCs [42]. However, for ITCs based on single-arm studies with RWE as an external comparator, RWE can have a substantial impact on regulatory decision making, by providing evidence on the natural history of disease and standards of care, and by contextualizing results of uncontrolled trials [42,43,44,45]. A study supported by the EMA examining marketing authorization applications (MAAs) submitted in 2018 and 2019 showed that 40% of initial MAAs, and 18% of applications for an indication extension for products currently on the market, contained RWE, mostly sourced from registry-based cohort studies. RWE was used to support the efficacy of the relevant intervention in 50% of cases, for example, through provision of external comparators for single-arm studies [46]. Antineoplastic and immunomodulating agents accounted for over a third of initial MAAs and a half of indication extension submissions and, as such, for many of the applications in which RWE was used [46]. Many of these products are indicated for rare and often fatal diseases, with applications being based on uncontrolled clinical trials instead of traditional RCTs. Where RCTs are not available, RWE has the potential to help address uncertainty, for example, by providing supplementary clinical evidence for external comparators to single-arm studies. Depending on the role of the external comparator, the contribution could be designated as a “supportive” study or as part of the “main” study. Work is ongoing to detail the type, purpose, and influence of RWE during regulatory assessment for marketing authorization, but with the anticipated increase in RWE studies, it will be interesting to see how ITC methods may need to evolve to incorporate more RWE.

5 Conclusions

Of 543 HTA evaluation reports for oncology treatments for solid tumors published between April 2018 and April 2021 in England, France, Germany, Italy, and Spain, 120 (22%) included an ITC. The most commonly utilized ITC techniques were NMA (23%), Bucher ITC (19%), and unadjusted ITC (14%). The overall acceptance rate of the ITCs utilized was low (30%), with large disparities observed across the countries analyzed, from 0% acceptance in France to 47% in England. The main criticisms of the ITC methods presented related to heterogeneity/risk of bias, lack of/unclear data, and the statistical methods used. Where direct head-to-head comparisons are not possible, there is a need for further guidance on the properties of ITC methods and the assessment of results from such methods, particularly in the context of the growing body of single-arm studies and RWE for novel therapies in oncology.
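
As a simple arithmetic check, two of the proportions reported in this section can be recomputed from the counts given in the text (120 of 543 reports including an ITC; 27 of 29 Italian ITCs with no determinable technique). The sketch below is illustrative only: the helper `pct` and the variable names are assumptions introduced here, not part of the original analysis, and rounding to the nearest whole percent is assumed.

```python
# Counts are taken from the text; all names are hypothetical and illustrative.
def pct(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a percentage, rounded to the nearest integer."""
    return round(100 * numerator / denominator)

reports_total = 543             # HTA evaluation reports identified
reports_with_itc = 120          # reports that included an ITC
italian_itcs_total = 29         # ITCs presented in Italian reports
italian_itcs_undetermined = 27  # Italian ITCs with no determinable technique

share_with_itc = pct(reports_with_itc, reports_total)
share_italian_undetermined = pct(italian_itcs_undetermined, italian_itcs_total)

print(f"Reports including an ITC: {share_with_itc}%")
print(f"Italian ITCs with undetermined technique: {share_italian_undetermined}%")
```

Both values agree with the percentages stated in the text (22% and 93%).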