Decision-analytic models are often required to evaluate the value of new health technologies such as medical devices and pharmaceuticals. Within health technology assessment (HTA) submissions, the role of modelling is principally to assess whether a proposed new technology is cost effective by predicting its long-term costs and benefits compared with relevant comparators such as usual care. In countries such as Australia and the UK, such evidence is commonly used as one of the key inputs into the decision-making process to inform resource allocation decisions. Given the impact of model-based evaluations on funding decisions, it is critical to develop and consistently use credible models. However, important concerns about decision-analytic modelling still remain [1].

In this editorial, we focus on one important concern within the decision-making process: the impact of alternative model structural choices on the comparability and accuracy of economic evaluations, and how (disease-specific) reference models can potentially address this.

1 Model Structuring in Health Technology Assessment Submissions

Although considerable attention has been paid to developing guidelines for good modelling practice to improve the quality of models, important concerns about the credibility of model-based estimates in the decision-making process remain [1, 2]. Of note is the key challenge of model structuring (a critical stage within the model development process), which can affect the accuracy of model predictions and the comparability of evaluations.

The model structuring process refers to the use of the best available evidence and methods to inform structural choices (see Haji Ali Afzali et al. [3] for full details). This includes, for example, choices about the clinically/economically important health states/events affected by the health technologies under evaluation, and about the relevant patient attributes and event histories that influence disease progression. These choices should appropriately reflect the natural history and biology of the condition under study. Examples of other key structural choices include assumptions about the duration of treatment effects beyond the follow-up period and the relationship between time and transition probabilities [3, 4].

In recent years, emerging empirical evidence has demonstrated the impact of structural choices on model predictions [5,6,7,8]. For example, in models evaluating a combination therapy of lapatinib and capecitabine versus capecitabine alone for the treatment of advanced breast cancer, the choice of alternative health states and transitions between them has resulted in a wide range of cost-effectiveness results, potentially leading to different funding decisions [6]. This example, among others, demonstrates the significance of the model-structuring process within a specific condition such as breast cancer.

An example of this issue in submissions to national funding bodies is the assessment of two model-based applications (from two different sponsors) that have recently been submitted to the Australian Medical Services Advisory Committee (MSAC), whose role is to provide advice to the Federal Government on whether a new medical service should be publicly funded. The submissions requested the public funding of two interventions for the treatment of adults with antidepressant medication-resistant depression: vagus nerve stimulation (VNS) therapy and repetitive transcranial magnetic stimulation (rTMS) [9, 10]. The VNS and rTMS submissions nominated standard care (including alternative pharmaceuticals) and third-line antidepressant therapy as the main comparators, respectively. The VNS and rTMS submissions reported a baseline incremental cost-effectiveness ratio (ICER) of 26,600 Australian dollars ($A)/quality-adjusted life-year (QALY) and $A6400/QALY, respectively. In its meeting in July 2018, MSAC did not support public funding of VNS and rTMS, mainly due to concerns regarding uncertainty in economic evaluation [9, 10].

The VNS submission used a cohort-based state-transition (Markov) model and, in addition to death, included four health states representing depression severity defined by Montgomery-Åsberg Depression Rating Scale (MADRS) scores (ranging from 0 to 60): minimal depression (full remission), mild depression (partial remission), moderate depression and severe depression. Using a 10-year time horizon, it appears that, in the base-case analysis, the submission assumed a continuous treatment effect beyond the 5-year observed clinical benefits, a structural assumption in favour of VNS that may not be justified.
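To make the cohort-based approach concrete, the following minimal Python sketch traces a cohort through four depression-severity states plus death and accumulates discounted QALYs. It is an illustration only: the transition probabilities, utility weights, annual cycle length and 5% discount rate are hypothetical placeholders and are not taken from the VNS submission, and the function name `run_cohort` is our own.

```python
# Hypothetical cohort state-transition (Markov) sketch.
# All numbers are illustrative placeholders, NOT values from any submission.
STATES = ["minimal", "mild", "moderate", "severe", "dead"]

# Row = current state, column = next state; each row sums to 1.
TRANSITIONS = [
    [0.70, 0.15, 0.10, 0.04, 0.01],
    [0.20, 0.50, 0.20, 0.09, 0.01],
    [0.10, 0.20, 0.50, 0.18, 0.02],
    [0.05, 0.10, 0.25, 0.57, 0.03],
    [0.00, 0.00, 0.00, 0.00, 1.00],  # death is absorbing
]

# Assumed utility weight per state for one annual cycle.
UTILITIES = [0.85, 0.70, 0.55, 0.35, 0.0]


def run_cohort(start, transitions, utilities, cycles, discount_rate):
    """Return the state-occupancy trace and total discounted QALYs."""
    dist = list(start)
    trace = [dist]
    qalys = 0.0
    n = len(dist)
    for cycle in range(1, cycles + 1):
        # Redistribute the cohort according to the transition matrix.
        dist = [sum(dist[i] * transitions[i][j] for i in range(n))
                for j in range(n)]
        trace.append(dist)
        # QALYs accrued this cycle, discounted back to time zero.
        cycle_qalys = sum(p * u for p, u in zip(dist, utilities))
        qalys += cycle_qalys / (1 + discount_rate) ** cycle
    return trace, qalys


# Whole cohort starts in severe depression; 10 annual cycles.
trace, qalys = run_cohort([0, 0, 0, 1, 0], TRANSITIONS, UTILITIES, 10, 0.05)
print(f"Discounted QALYs per patient: {qalys:.2f}")
```

In a sketch like this, an assumption about treatment effect beyond the observed follow-up would correspond to which transition matrix is applied in later cycles, which is why that structural choice can drive the results.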

The rTMS model was an individual-based state-transition model and included depressive episodes (no response/relapse), full remission, partial remission, hospital admission (with the possibility of remaining in hospital), electroconvulsive therapy (ECT) or lithium augmentation (if rTMS failed), and death (with increased risk of mortality for patients in the acute phase). In this model, the severity of depression (evaluating response to treatment) was defined by Hamilton Depression Rating Scale (HAM-D) scores. In the base-case analysis, the rTMS submission used a 3-year time horizon, and assumed a continuous benefit beyond the follow-up period (6 months).

Compared with the VNS model, the rTMS model included additional health states and transitions to capture clinical pathways for non-responders more accurately; for example, non-responders can experience an acute episode that requires hospital admission. Non-responders can also move to an alternative treatment state (e.g. ECT). The rTMS model appropriately captured an increased risk of death due to suicide whilst in a depressive episode. While the rTMS model contained three depression states as indicators of treatment success, the VNS model included four depression states (disaggregating depressive episodes into moderate and severe depression). Using an individual-based modelling technique, the rTMS submission used tracker variables to record the number of previous treatments before moving to an alternative treatment (e.g. ECT).
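The role of a tracker variable in an individual-based model can be sketched as follows. This is a deliberately simplified illustration, not the rTMS submission's model: the two-state structure, the per-cycle probabilities, the switching rule and all names (`simulate_patient`, `switch_after`, and so on) are hypothetical assumptions.

```python
import random

# Hypothetical individual-based sketch with a tracker variable.
# Probabilities and the switching rule are illustrative assumptions only.


def simulate_patient(rng, max_cycles=36, p_remission=0.3,
                     p_relapse=0.1, switch_after=2):
    """Simulate one patient and return a per-cycle history."""
    state = "depressive_episode"
    treatment = "initial"
    failed_courses = 0  # tracker: cycles without remission while in episode
    history = []
    for _ in range(max_cycles):
        if state == "depressive_episode":
            if rng.random() < p_remission:
                state = "remission"
            else:
                failed_courses += 1
                # Event history drives the pathway: switch treatment
                # once enough courses have failed.
                if treatment == "initial" and failed_courses >= switch_after:
                    treatment = "alternative"
        elif rng.random() < p_relapse:
            state = "depressive_episode"
        history.append((state, treatment, failed_courses))
    return history


rng = random.Random(2019)  # fixed seed for reproducibility
history = simulate_patient(rng)
print(history[-1])
```

A cohort model cannot carry `failed_courses` for each patient without multiplying its health states, which is one reason individual-based techniques suit conditions where event histories matter.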

In the VNS and rTMS submissions, the choice of alternative structural aspects (including different modelling techniques) can reduce the comparability of evaluations of VNS and rTMS, especially when no systematic approach to characterise structural uncertainty has been reported. This can potentially lead to inconsistent MSAC funding decisions when evaluating alternative technologies proposed for the same target population because changes in model structure and analysis can produce different results. For example, the structural choices relating to the inclusion of hospital admissions in the rTMS model and the possibility of remaining in hospital can lead to results in favour of the proposed technology due to hospital cost offsets given larger remission rates in the rTMS arm (and hence higher non-responder rates in the comparator arm with the possibility of transitioning and remaining in the ‘hospital’ state). This, among other things, may have contributed to the lower ICER reported by the rTMS submission. Also, with no valid benchmark to compare, it is not clear which set of structural choices presented in the submissions more appropriately captures the natural history of depression and disease stages.

2 Key Limitations Underlying the Need for (Disease-Specific) Reference Models

One of the key challenges associated with the use of different structural aspects is inconsistency between the models presented to evaluate similar funding decision problems. This is particularly important in HTA decision-making given the current timeline of an appraisal, which makes detailed evaluation of model structures (including uncertainty around structural choices) very difficult. Multiple factors are likely to influence modellers’ decisions to choose different structural aspects (e.g. generic guidelines developed by national funding bodies, lack of time and experience to apply a systematic model structuring process, industry-induced bias in favour of the proposed intervention) [3, 18]. In the personal experience of the authors, the VNS and rTMS submissions are only one of many examples in which submissions to national funding bodies have used different model structures for the evaluation of the same target population.

One way to address these concerns, within the HTA decision-making process, is the use of (disease-specific) reference models. In 2011, we proposed the development and use of reference models to inform reimbursement decisions [11]. This has been reiterated by Frederix et al. in 2014 [5] and 2015 [12], and, more recently (2019), by Mauskopf [13] and Sampson et al. [14]. Using the best available evidence and data sources, reference models represent a specific disease (e.g. depression), simulating the disease severity and progression. Such models can be used to evaluate a wide range of healthcare interventions (e.g. medical devices, pharmaceuticals) that can be used for the management of the same condition (e.g. depression). A methodological framework to develop reference models has been previously proposed [15]. Briefly, this includes the development of a conceptual model (captured by relevant and significant health states/events and patient attributes/event histories), choice of key structural assumptions, choice of an appropriate modelling technique, model construction and performance evaluation (e.g. validation). Model structural aspects can be updated when new evidence (e.g. about the natural history of the condition) becomes available. Using this framework, we developed a conceptual model and proposed an appropriate modelling technique to inform a reference model for the evaluation of alternative health technologies for depression [16]. Our proposed conceptual model is based on existing evidence of the progression of depression (clinical literature and guidelines supplemented by clinical inputs) and includes seven health states/events (depressive episode, response, remission, recovery, chronic depression, hospital admission and death). During the conceptualisation process we also identified a number of patient attributes and event histories that influence the experience of subsequent events (e.g. the number, severity and duration of previous depressive episodes). Given a number of recurring events and the need to reflect event histories, we proposed an individual-based model as an appropriate modelling technique [16].

Compared with our proposed model, the rTMS and VNS submissions do not include a number of relevant health states (e.g. chronic depression) and do not capture the impact of event histories on subsequent depressive episodes. As proposed by clinical practice guidelines, we used a common set of terminology to define depression states (i.e. response, remission, recovery and depressive episode/relapse). This, combined with the inclusion of chronic depression and hospital admissions as clinically/economically important states, reflects the different stages of disease progression more appropriately (i.e. acute, continuation and maintenance phases). These stages have not been fully captured by the VNS and rTMS models. For example, these models do not capture the impact of the proposed interventions on recovery, defined as “an extended asymptomatic phase, which lasts more than 6 months” [17]. The likely consequence is results biased in favour of interventions with higher rates of remission, which may or may not be the more effective treatments overall. Because it is individual-based, our model structure also includes a number of relevant and important patient attributes and event histories to capture their impact on the risk of subsequent events included in the model. These have not been included in the VNS and rTMS models.

3 Potential Value of (Disease-Specific) Reference Models

As a result of dedicated scientific efforts, and with more time and resources available for evidence-based model structuring and validation, reference models are expected to capture more accurately the progression of the condition and how costs and outcomes accumulate over time. This can minimise modeller bias and lead to more accurate estimates of costs and health outcomes, potentially providing earlier access to new health technologies by, for example, reducing the number of resubmissions to national funding bodies. In the presence of uncertain or conflicting evidence on structural choices, reference models can represent that uncertainty, which can be characterised using appropriate structural sensitivity analyses (e.g. model averaging or an expanded probabilistic sensitivity analysis) (see Haji Ali Afzali and Karnon [18] for full details of methods to characterise structural uncertainty). The use of such models also promotes consistency when evaluating interventions targeting the same disease, something that is often lacking with piecewise evaluations (the dominant approach within the current decision-making process). As noted by Sampson et al. [14], the use of reference models within HTA evaluation processes can improve transparency as “they provide an opportunity for transparent reporting of model structure and validation steps, independent of the nuances of individual policy questions that may create barriers to transparency.” These models can also reduce duplication of effort by HTA stakeholders (industry and reimbursement bodies), avoiding ‘reinventing the wheel’.
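As a rough illustration of model averaging, the sketch below combines the outputs of alternative model structures using weights and then forms a single ICER. It is a minimal sketch under stated assumptions: the structure names, incremental costs, incremental QALYs and weights are hypothetical, the function `model_average` is our own, and the choice to average incremental costs and effects before taking the ratio (rather than averaging ICERs directly) is an assumption of this example; see [18] for the methods themselves.

```python
# Hypothetical model-averaging sketch for structural uncertainty.
# All inputs are illustrative placeholders, NOT results from any submission.


def model_average(results, weights):
    """Weight incremental costs and QALYs across structures, then form an ICER.

    results: {structure: (incremental_cost, incremental_qalys)}
    weights: {structure: non-negative weight}; normalised internally.
    """
    total = sum(weights[name] for name in results)
    d_cost = sum(weights[name] * results[name][0] for name in results) / total
    d_qaly = sum(weights[name] * results[name][1] for name in results) / total
    return d_cost, d_qaly, d_cost / d_qaly


# Two assumed structures, e.g. with and without a 'hospital admission' state.
results = {
    "with_hospital_state": (10000.0, 0.5),
    "without_hospital_state": (20000.0, 0.5),
}
# Assumed weights, e.g. from expert elicitation or goodness of fit.
weights = {"with_hospital_state": 0.5, "without_hospital_state": 0.5}

d_cost, d_qaly, icer = model_average(results, weights)
print(f"Averaged ICER: {icer:.0f} per QALY")
```

How the weights are derived is itself a substantive choice; the example simply takes them as given.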

It should be noted that additional non-trivial resources (time and human resources) are required to develop and update reference models, and hence we are not proposing to develop these models for all disease areas. The additional value of reference models for reimbursement purposes is particularly realised for common complex (chronic) conditions (e.g. depression, osteoporosis, frailty). Chronic conditions include several health states/events relating to the natural history of the condition (with long-term and recurring events), where future prognosis is dependent on a number of patient attributes and event histories. Given that the choice of structural aspects is likely to have a significant impact on model outputs, chronic conditions represent good value for additional resources required to develop disease-specific models. The choice of chronic conditions can be informed through consultation with key stakeholders such as national funding bodies and manufacturers.

Given resource constraints for the public funding of emerging, costly health technologies, it is critical to improve the efficiency of funding decision processes [19]. By reflecting disease progression more accurately and minimising variations in structural choices, reference models can improve accuracy, comparability and transparency of HTA evaluations, ultimately enhancing the efficiency of the decision-making process.