1 Introduction

Well-designed randomised controlled trials (RCTs) remain the gold standard for estimating treatment effects between counterfactual groups (e.g., Treatment A vs Treatment B). This is due to the role trial design and randomisation play in controlling for various forms of bias, particularly selection bias and confounding [1,2,3], which strengthens our ability to make causal claims about estimated differences between counterfactual groups. RCTs are not perfect, though, and remain prone to other forms of bias, such as information bias (e.g., missing data due to loss to follow-up) [2]. A trip to any health technology assessment (HTA)-related conference suggests a preference for decision-analytic modelling over trial-based economic evaluations. This may have been influenced by a perception among researchers that HTA agencies prefer decision-analytic models. Other factors could include a growing workforce skilled in decision-analytic modelling, insufficient data collected in trials to enable trial-based economic evaluations, and publications recommending decision-analytic modelling over trial-based analyses.

Our view is that it is time to revisit the underlying philosophies and requirements that have resulted in decision-analytic models being the preferred mode of economic evaluation, particularly to inform HTA processes. We do this to rebalance the preference for decision-analytic model-based economic evaluations that has developed over the last 20+ years. As such, this article provides an overview of why decision-analytic models may have become preferred to trial-based economic evaluations, and also sets out circumstances where trial-based analyses are sufficient and perhaps preferable to modelling.

2 Do We Need to Debate Models Versus Trial-Based Evaluations?

At the turn of the millennium, shortly after the creation of the National Institute for Clinical Excellence (NICE), as it was then known, Brennan and Akehurst [4] asked, and provided their answers to, the question: “Modelling in Health Economic Evaluation: What is its Place? What is its Value?”. This article was needed, given that at the time the debate had often taken an adversarial, trials versus decision-analytic modelling perspective; RCTs, as the gold standard for producing estimates of comparative treatment effects, were already considered suitable vehicles for economic evaluation, although with some noted limitations. This also meant there were concerns around the bias of modelling-based analyses and their subsequent validity [5]. Brennan and Akehurst [4] stated, rightly, that RCTs have known limitations that carry over into the economic evaluation evidence, including:

  • choice of comparison therapy;

  • protocol-driven costs and outcomes;

  • artificial (e.g., highly controlled) environments;

  • intermediate/surrogate versus final outcomes;

  • inadequate patient follow-up; and

  • selected patient and provider populations (e.g., high internal, but low external, validity) [4].

Six years later, Sculpher et al. [6] took a stronger stance in the debate, suggesting that:

“…the use of a single trial as a vehicle for economic analysis will, in most situations, lead to a partial and limited analysis with which to inform decision making. The more appropriate framework for economic analysis is evidence synthesis and decision modelling where all available data are brought to bear on fully specified decision problems [6].”

Subsequently, many health economists have taken this to mean we should only conduct modelling-based economic evaluations to inform decision making. The preferred evidence base for HTAs, though, has recently undergone a change, with an increase in pragmatic clinical trials that are intended to better represent standard clinical practice. The use of real-world evidence has also become more common, partially in response to the COVID-19 pandemic, as a way to conduct faster and more efficient trials and real-world causal analyses [7]. Causal analyses using real-world data include methods such as target trial emulation, which supports causal analyses by applying the principles of RCTs to observational data [8]. As a result, the limitations originally set out by Brennan and Akehurst [4] may no longer apply to trial (or real-world) data in the same way that they did in the year 2000.

3 Concerns with Representing Total Causal Effects and Bias

Sculpher et al. [6] suggest that “Arguably the most damning criticism of trial-based economic evaluation is the fact that [a] single trial is very unlikely to include all evidence relevant to a given evaluation”. We agree with this point, but when moving away from estimates obtained from RCTs, there is an increasing potential for more biased evidence. Sculpher et al. [6] recognise that trials have the advantage of potentially reflecting an unbiased estimate of the comparative treatment effect; however, their discussion of bias when considering the incorporation of non-experimental (e.g., observational rather than RCT) evidence centres on one specific type of bias: selection bias [6]. To rationalise the use of non-experimental estimates in models beyond the main treatment effect, Sculpher et al. [6] suggest:

“…the only reason why resource use or utilities would be expected to differ between interventions would be as a result of clinical or health events, which should be reflected in the treatment effects. So there should be no risk of selection bias resulting from the use of non-trial sources for cost and utility parameters as long as all relevant clinical and health events are included in the model and cost and utility data are conditioned on those events [6].”

Selection bias, as a simple definition, refers to the biases that arise from the procedure by which individuals are selected into the analysis [9]. As such, we agree there would be no selection bias in this circumstance for modelling-based analyses. Essentially, selection bias is avoided by modelling the same hypothetical population in both treatment arms of the decision-analytic model by conditioning the costs and utilities on health events, not the underlying patient sample. However, the assumption that the treatment effect will reflect differences in clinical and health events, and hence differences in resource use and health utility, is unlikely to hold in all circumstances.

First, it is important to remember that observed significant differences between randomised groups in an RCT are mainly attributed to random treatment allocation. The total causal effect of randomisation on cost- and utility-related outcomes is best estimated based on data collected and analysed as part of the RCT; inferring the causal effect of randomisation on costs and utilities from the RCT's treatment effect (i.e., the primary outcome) may not capture the full causal effect. Using the treatment effect, and not randomisation, to estimate differences in other outcomes (e.g., costs or utilities) between trial arms is equivalent to using an intermediate or surrogate outcome to estimate differences, and hence requires careful consideration.

Secondly, resource use and utilities may differ between trial arms due to non-health events. For example, an intervention may improve patient engagement with health care services or decrease the intensity or frequency of future engagement beyond the core treatment being evaluated. In this instance, the same health state will have different resource use between trial arms. If the resource use is fully conditioned on the health state, this would not be reflected in a decision-analytic model. Resource use and utilities may also differ because of factors that are not easy to incorporate into health states. For example, in-home compared with in-hospital dialysis for chronic kidney disease (CKD) should be equally biologically effective at preventing CKD. Although a model may capture the intervention resource use (e.g., treatment cost differences due to being in hospital or at home, better engagement with treatment), the causal effect on downstream resource use and utilities is unlikely to be captured within a model if not fully informed by the RCT; health state utilities estimated in the model based purely on health states are unlikely to capture the additional utility patients may experience from in-home dialysis due to the setting. More appropriate health state utilities cannot be inferred from observational data without introducing selection bias, because in-home and in-hospital dialysis patients potentially represent different groups in a standard clinical setting. Overall, decision-analytic models can struggle to incorporate complex and additional impacts beyond the primary treatment effect.
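The second point can be made concrete with a small numerical sketch. Every figure, state name, arm label, and function name below is hypothetical and chosen purely for illustration; none is drawn from a trial or from the cited literature. The sketch assumes two arms with identical health-state occupancy but different resource use within one state, and compares a trial-style estimate (arm-specific costs) with a model-style estimate (costs conditioned on health state only).

```python
# Hypothetical illustration only: all numbers and names are assumptions.

# Share of follow-up time in each health state (assumed identical in both arms).
state_occupancy = {"stable": 0.7, "complication": 0.3}

# Arm-specific annual cost per patient within each state, as a trial might observe.
observed_cost = {
    "in_hospital": {"stable": 10_000.0, "complication": 25_000.0},
    "in_home": {"stable": 8_000.0, "complication": 25_000.0},
}


def expected_cost(cost_by_state, occupancy):
    """Occupancy-weighted mean cost across health states."""
    return sum(occupancy[state] * cost_by_state[state] for state in occupancy)


def pooled_state_costs(costs_by_arm):
    """Costs conditioned on health state only: one value per state, averaged over arms."""
    arms = list(costs_by_arm)
    states = costs_by_arm[arms[0]]
    return {
        state: sum(costs_by_arm[arm][state] for arm in arms) / len(arms)
        for state in states
    }


# Trial-style estimate: each arm keeps its own observed within-state costs.
trial_increment = expected_cost(
    observed_cost["in_home"], state_occupancy
) - expected_cost(observed_cost["in_hospital"], state_occupancy)

# Model-style estimate: both arms share the same state-conditioned costs,
# so identical state occupancy implies identical expected cost.
pooled = pooled_state_costs(observed_cost)
model_increment = expected_cost(pooled, state_occupancy) - expected_cost(
    pooled, state_occupancy
)

print(round(trial_increment, 2), round(model_increment, 2))
```

Under these assumptions the trial-style comparison recovers a cost difference between arms, whereas the state-conditioned comparison returns no difference by construction, because the only channel through which the arms can differ (health-state occupancy) is the same in both.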

In essence, though, randomisation is a powerful tool to support the causal claims of RCT-based economic evaluation evidence if we deal with post-randomisation events appropriately, such that we can be confident that the cost and utility differences between trial arms are due to random treatment allocation and not other factors. This may also facilitate directly extrapolating trial-based estimates, although with some caveats, as discussed in Sect. 4. In comparison, given that decision-analytic models impose an assumed causal pathway and use an array of estimates, it is possible for biased and non-causal estimates to have causal effects within a model. Although uncertainty can be quantified within models, the full impact of biased estimates is not as easily quantified, even if sensitivity analyses may help to quantify the potential implications to a certain degree [10].

4 Moving Beyond the RCT Time Horizon: Is It Always Needed?

The edict that economic evaluations should capture all costs and consequences relevant to the decision problem [11] is perhaps the tenet that is most problematic for trial-based analyses. Some health economists have erroneously taken this to mean that all economic evaluations should have a lifetime horizon, and thus that decision-analytic models should always be used.

To avoid bias and strengthen model validity, though, a lifetime horizon requires that health care interventions have evidence of a causal treatment effect from the time of treatment until death. For many treatments where mortality is not included in the trial time horizon, evidence for longer-term effectiveness and treatment adherence is usually non-existent or of poor quality. Short trial durations (e.g., 6 months to 1 year) limit our ability to obtain such longer-term causal estimates from RCTs. Thus, for decision-analytic modelling-based analyses, longer-term evidence on disease and care pathway effects comes from observational data, extrapolation, and/or expert opinion/assumptions to allow the model to run for the longer time horizon. However, decision models need to be transparent about issues associated with any form of extrapolation beyond the trial duration, as we are potentially trading a decrease in uncertainty for an increase in bias and a decrease in model validity (see Sect. 3).

As such, there is a case to be made against extrapolating beyond the RCT time horizon. As noted by Sculpher et al. [6], RCTs are often thoughtfully designed to capture the key clinical endpoint over a relevant time horizon. This should also be sufficient to identify if adverse events differ between the trial arms. This raises an interesting consideration though: if a trial-based economic evaluation estimates an intervention to be cost effective over the RCT time horizon, what is the reason for developing a decision model to extrapolate cost effectiveness beyond that time horizon?

If the shorter-term trial-based economic evaluation suggests the intervention is cost effective, there is limited rationale to model the longer term, as the key clinical impact that would be used to drive the model will have already happened and should be reflected in the trial-based cost-effectiveness evidence. As such, there is limited requirement to look over the longer term, particularly if a model’s extrapolation assumes either ongoing intervention effectiveness or effectiveness reduction to at least the same level as the counterfactual (e.g., treatment as usual). In these instances, modelling the longer term is unlikely to result in an incremental cost-effectiveness ratio that will change the interpretation of cost effectiveness, particularly if there are no ongoing treatment costs or longer-term effects which would sway the results to the advantage of the control. Given that looking beyond the RCT time horizon increases the potential for bias, extrapolation seems unnecessary in these instances, as we risk reducing the validity of the results unless appropriate analyses have been conducted for the purpose of extrapolation [12].
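The arithmetic behind this argument can be sketched as follows. All values (incremental costs and quality-adjusted life-years (QALYs), the threshold, waning rates, and time horizons) and all function names are hypothetical assumptions for illustration, and discounting is deliberately omitted for simplicity. The sketch assumes the post-trial QALY gain shrinks geometrically towards the comparator level (i.e., towards zero incremental benefit), and contrasts a scenario with no ongoing treatment costs against one with ongoing costs.

```python
# Hypothetical illustration only: all numbers and names are assumptions.
# Discounting is omitted to keep the arithmetic transparent.


def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio (cost per QALY gained)."""
    return delta_cost / delta_qaly


def extrapolate(delta_cost, delta_qaly, annual_qaly_gain, extra_years,
                annual_waning, annual_extra_cost=0.0):
    """Add post-trial increments, with the yearly QALY gain waning
    geometrically towards zero (the comparator level)."""
    gain = annual_qaly_gain
    for _ in range(extra_years):
        gain *= 1.0 - annual_waning  # effect shrinks each year
        delta_qaly += gain
        delta_cost += annual_extra_cost
    return delta_cost, delta_qaly


threshold = 30_000.0       # assumed willingness to pay per QALY
trial_delta_cost = 2_000.0  # assumed incremental cost over the trial horizon
trial_delta_qaly = 0.10     # assumed incremental QALYs over the trial horizon

trial_icer = icer(trial_delta_cost, trial_delta_qaly)

# Scenario A: effect wanes by 50% per year for 10 years, no ongoing costs.
waning_icer = icer(*extrapolate(trial_delta_cost, trial_delta_qaly,
                                annual_qaly_gain=0.10, extra_years=10,
                                annual_waning=0.5))

# Scenario B: same waning, but with an assumed ongoing treatment cost.
ongoing_cost_icer = icer(*extrapolate(trial_delta_cost, trial_delta_qaly,
                                      annual_qaly_gain=0.10, extra_years=10,
                                      annual_waning=0.5,
                                      annual_extra_cost=1_500.0))

for label, value in [("trial horizon", trial_icer),
                     ("waning, no ongoing cost", waning_icer),
                     ("waning, ongoing cost", ongoing_cost_icer)]:
    print(f"{label}: ICER = {value:,.0f}; cost effective: {value <= threshold}")
```

In Scenario A, extrapolation can only add non-negative incremental QALYs while incremental costs stay fixed, so the ratio cannot rise above the trial-based estimate and the interpretation is unchanged. Scenario B shows the caveat noted above: with ongoing treatment costs, extrapolation can change the interpretation, which is where longer-term modelling may be warranted.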

5 Conclusion

We agree with past articles that acknowledge the limitations of RCT-based analyses. We also agree decision-analytic modelling is necessary in situations with incorrect comparators, when a key outcome of interest is not captured, and if evidence synthesis (e.g., meta-analyses) is possible/required. However, the emerging paradigm that decision-analytic models are always needed and are superior to RCT-based analyses seems inaccurate; in many cases, we are potentially trading more information for more bias. In our suggested circumstances, RCT-based economic evaluation should be considered not only sufficient, but also preferable to decision-analytic modelling-based analyses when causal inference and reducing bias are key considerations.