Key Summary Points

In health economics, Markov models are widely used to represent relapsing–remitting multiple sclerosis (RRMS), but usually evaluate only a single line of treatment.

Here, we report on the implementation and validation of an innovative computational model designed to address challenges regarding treatment sequences in patients with RRMS. We also apply this modelling framework as a case study to determine the impact of different decision-making approaches on the optimal treatment sequence and associated outcomes.

Internal and external validation of our model showed that outcomes were consistent with those of existing Markov models and the published literature.

Each decision-making criterion generated a different optimal treatment sequence; it was possible to improve patient outcomes compared with current treatment guidelines.

The model presented here has the potential to simulate individual patient trajectories and may be useful in supporting treatment switching decisions as well as informing future clinical guidelines.

Introduction

Multiple sclerosis (MS) is a chronic autoimmune-mediated inflammatory disease of the central nervous system that affects an estimated 2.8 million people worldwide [1, 2]. Of these, approximately 85% are diagnosed with relapsing–remitting multiple sclerosis (RRMS), which is characterized by periodic acute exacerbations of disease activity (relapses) that can lead to neurological disability over the patient’s lifetime [2]. Several disease-modifying therapies (DMTs), targeted at delaying the progression of disability, have received regulatory approval over the past 20 years, significantly increasing the number of treatment options and improving quality of life for patients [3]. The number of DMTs on the market, coupled with the reality that RRMS requires long-term treatment wherein most patients will switch DMTs at least once over the course of their disease, poses challenges for physicians regarding the choice of, and appropriate times to switch, a patient’s DMT [4, 5]. Additionally, health economists and policymakers are challenged with how to evaluate DMTs appropriately in order to capture the heterogeneity of disease trajectories and treatment patterns in patients with RRMS [6, 7].

Currently, the majority of health economic models for RRMS are Markov models, which typically evaluate a single line of treatment in a cohort of patients with RRMS [6, 7]. As such, these models are not able to address questions regarding treatment sequencing. Here we describe an approach, based on an innovative model framework, designed to address the challenges regarding treatment sequences in RRMS. This approach replicates the process of clinical decision-making, through simulating a ‘virtual’ physician who makes treatment decisions according to his or her evolving understanding of a patient’s disease as it manifests over time. As a computational model of physician behaviour, this approach has the potential to simulate individual patient trajectories in the current treatment landscape in order to support treatment switching and treatment positioning decisions in RRMS.

In the current paper, the implementation of this innovative approach to modelling treatment sequences in patients with RRMS is described in detail, along with results of several validation exercises. In addition, the modelling framework was applied as a case study to explore the impact of different decision-making criteria on the optimal sequence of DMTs, as well as to determine the costs, quality of life, and hospital resource usage associated with each sequence.

Methods

Model Concept

Disability worsening in RRMS is commonly assessed using the Expanded Disability Status Scale (EDSS). An EDSS score of 6.0 (on a scale from 0 to 10) is used to define a disabling level of disease [8]; time from disease onset to EDSS 6.0 (ttEDSS6), without treatment, can be regarded as a proxy for severity of disease [9, 10]. Patients with RRMS have heterogeneous disease trajectories, i.e. differing ttEDSS6 or severities. The core assumption underlying this model is that physicians use DMTs to slow disability progression and prolong ttEDSS6. Each DMT involves potential benefits and potential risks and the choice regarding treatment consists of a trade-off for each individual patient; the probability of serious adverse events (SAEs) may be identical but the potential benefits associated with the additional treatment efficacy may be greater for more severe patients who are deteriorating more rapidly [11]. As such, the physician’s expectation of severity drives treatment switching decisions.

The model centres on the patient–physician interaction during iterative outpatient visits (Fig. 1). Initially (1), a patient is simulated with characteristics including age at disease onset and sex. Severity (ttEDSS6 without treatment), time to non-MS related death, and relapse rate are then assigned to the patient on the basis of these simulated characteristics. During each visit, the virtual physician observes relapses, disability worsening, and adverse events (AEs) (2a) and forms/revises their expectation about the patient’s disease severity and probability of response to treatment (i.e. prior/posterior) on the basis of these observations (2b). The observed clinical outcomes (2a) also drive the decision whether to switch treatment when either tolerability or response is insufficient, and when this occurs the current expectation (prior) of severity is used to determine the optimal DMT for the patient (2c/d). Note, as in clinical reality, the actual severity level and all future events simulated by the model are unknown to the physician.

Fig. 1

Visualization of the iterative treatment decision process. DMT disease-modifying therapy, ttEDSS6 time from disease onset to Expanded Disability Status Scale state 6, MS multiple sclerosis

The optimal DMT at treatment initiation and subsequent switch is identified using a Markov model which represents RRMS (2c). This Markov model yields the expected outcomes for each DMT, from which the physician selects the best option (2d) based on a specific decision rule. The clinical outcomes and patient trajectory are then simulated (3) on the basis of the treatment selected as the optimal option (at 2d).

The process continues until treatment with DMTs is terminated (e.g. due to death, attaining EDSS7 or EDSS8 as determined by the model, or reaching the maximum number of lines of treatment). This approach ultimately yields an optimal DMT sequence for each individual patient inclusive of costs and quality of life outcomes associated with the patient’s disease trajectory.

Model Implementation

The virtual physician model was built as a patient-level discrete event simulation (DES) in order to (1) capture the heterogeneity in disease trajectories and outcomes, (2) resemble the patient–physician interaction during visits, and (3) record individual patient histories. A schematic overview of the model is shown in Fig. 2. The model outcomes included costs and quality of life (represented as quality-adjusted life years, QALYs) related to disease management linked to EDSS scores, relapses, and SAEs. The model was programmed in R [12]. Please refer to the Supplemental Materials for a more detailed description of the methods, including calculations and model inputs.

Fig. 2

Schematic overview of the virtual physician model implementation. The current events determine what occurs at each visit (based on the earliest of the current events). Update simulation clock and age: the time and age of the patient are updated to reflect the passage of time to the current event. DMT disease-modifying therapy, EDSS Expanded Disability Status Scale, SAE serious adverse event, PML progressive multifocal leukoencephalopathy

Disease Trajectory

For each simulated patient, the process commences at the diagnosis of RRMS. At this point, sex and age at onset, randomly drawn from independent statistical distributions in the patient population of interest, are assigned to the patient. Next, the patient’s disease trajectory, a list of events that may happen to the patient over the natural history of the disease, is simulated (on the basis of the British Columbia Multiple Sclerosis [BCMS] registry data [13, 14]). For each specific treatment, the disease trajectory for relapses and EDSS progression is determined through application of a treatment effect to the natural history. The treatment effect applied is based on a published network meta-analysis (NMA) of cladribine tablets versus comparators [15].

The modelled events include the time of (1) the next EDSS step, (2) a relapse, (3) an SAE, (4) the next routine visit, and (5) death due to causes other than MS. Secondary progressive MS (SPMS) is not explicitly modelled here as it is considered a later stage of RRMS [16] and conversion to SPMS has little or no implications for costs and QALYs [17, 18].

The time of the next EDSS step is based on the patient’s severity (ttEDSS6), which is randomly drawn from a distribution of patient severities stratified by sex and age at onset [13]. The time of a relapse is based on a patient’s individual annualised relapse rate (ARR). In the first 5 years following diagnosis, the ARR is determined by the patient’s simulated severity [14]. Beyond 5 years, disability worsening and relapses are modelled independently as no data were identified regarding the relationship between EDSS and relapses in natural history [14, 19]. In addition, the ARR is assumed to decrease over the course of the disease on the basis of the age at onset. A relapse can either be severe (requiring hospitalisation) or non-severe. The time of an SAE is based on incidence rates which were sourced from pivotal trials [20–29]. Furthermore, the probability of progressive multifocal leukoencephalopathy (PML), a potentially fatal complication first linked with natalizumab treatment and now thought to occur with a variety of DMTs, increases over time and is modelled on the basis of the patient’s anti-John Cunningham virus (JCV) antibody serological status and the duration of exposure to natalizumab [15, 30]. Routine visits take place at regular time intervals, by default set at 1 year. The simulation ceases for each patient only at the point when the patient dies, either from MS (i.e. when EDSS10 is reached) or from other causes. The time of death due to causes other than MS is calculated given the patient’s age and sex, using the life expectancy of the general population [16].

Following an event, the disease trajectory is modified in one of two ways: (1) if the event does not trigger a switch of treatment, an updated time is simulated for the event which occurred with times for all other events remaining unchanged or (2) if the event leads to a treatment switch, the disease trajectory is re-simulated to reflect the treatment effects and SAEs associated with the new treatment. When a patient reaches the point of DMT termination, the patient is ‘switched’ to natural history. From this point, the trajectory only contains relapses and EDSS steps to allow the calculation of lifetime costs and effects; SAEs associated with DMTs are no longer modelled. In natural history, it is assumed that no further treatment switches are possible.
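The event handling described above can be illustrated with a minimal next-event loop. The Python sketch below is not the published implementation (which was programmed in R): the event names, the exponential waiting times, and the rates in RATES are assumptions made purely for illustration, and the treatment-switch branch (re-simulation of the whole trajectory) is omitted.

```python
import random

# Hypothetical annual event rates; illustrative values, not model inputs.
RATES = {"edss_step": 0.15, "relapse": 0.25, "sae": 0.02}
VISIT_INTERVAL = 1.0  # routine visits at regular intervals (1 year by default)


def draw_time(event, now, rng):
    """Draw the absolute time of the next occurrence of `event` after `now`."""
    if event == "visit":
        return now + VISIT_INTERVAL
    # Assumption: exponential waiting times, chosen for simplicity.
    return now + rng.expovariate(RATES[event])


def simulate(time_of_death, seed=0):
    """Process events in time order until death; return the event history."""
    rng = random.Random(seed)
    now = 0.0
    schedule = {e: draw_time(e, now, rng)
                for e in ("edss_step", "relapse", "sae", "visit")}
    schedule["death"] = time_of_death
    history = []
    while True:
        event = min(schedule, key=schedule.get)  # earliest of the current events
        now = schedule[event]                    # update the simulation clock
        history.append((round(now, 3), event))
        if event == "death":
            return history
        # No treatment switch: re-draw only the event that occurred;
        # the times of all other events remain unchanged.
        schedule[event] = draw_time(event, now, rng)
```

Calling simulate(10.0) returns a time-ordered list of (time, event) pairs that ends with the death event at year 10.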

Switching Indicators

Re-evaluation of the treatment decision following an event is triggered by inadequate effectiveness, safety concerns such as the occurrence of an SAE, or an unacceptable PML risk. For effectiveness, the model switches a proportion of patients on the basis of assumptions regarding relapses and/or EDSS worsening (see Supplemental Table 10).

Updating the Prior of Severity

The physician’s expectation (prior) of a patient’s severity is updated at each visit on the basis of observations about EDSS progression. The prior of severity is implemented as 11 severity groups, each representing a different ttEDSS6 in 5-year bands: (1) > 0–5 years, (2) > 5–10 years, …, (11) > 50–55 years. The prior of severity gives a probability distribution for the patient residing in each of these severity groups. The groups are mutually exclusive and collectively exhaustive, i.e. the probabilities must sum to one. The virtual physician’s prior of severity at baseline is equal to the distribution of severity groups from which the patient’s individual severity is drawn (e.g. male, onset age 35 years). The approach taken to update the prior of severity as the physician interacts with the patient over time is described in the Supplemental Materials.
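As an illustration of how a distribution over the 11 severity groups could be revised after an observation, the Python sketch below applies Bayes' rule to a discrete prior. The likelihood used here (exponential waiting times, with six equal steps to EDSS 6) is a hypothetical stand-in chosen for brevity; it is not the likelihood function of the model, which is specified in the Supplemental Materials. The uniform baseline prior is likewise a placeholder.

```python
import math

# Midpoints (years) of the 11 severity groups: >0-5, >5-10, ..., >50-55.
MIDPOINTS = [2.5 + 5 * g for g in range(11)]


def step_likelihood(years_to_step, tt_edss6):
    """Hypothetical likelihood of waiting `years_to_step` for one EDSS step.

    Assumption for illustration only: exponential waiting times with
    six equal steps to EDSS 6. The model's actual likelihood differs.
    """
    rate = 6.0 / tt_edss6
    return rate * math.exp(-rate * years_to_step)


def update_prior(prior, years_to_step):
    """Bayes' rule: the posterior is proportional to prior times likelihood."""
    weighted = [p * step_likelihood(years_to_step, m)
                for p, m in zip(prior, MIDPOINTS)]
    total = sum(weighted)
    return [w / total for w in weighted]


uniform = [1 / 11] * 11  # placeholder baseline prior
posterior = update_prior(uniform, years_to_step=1.0)
```

Under these assumptions, observing an EDSS step after only 1 year shifts probability mass towards the groups with short ttEDSS6 and away from the mildest groups, while the posterior still sums to one.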

Determination of the Optimal DMT

At treatment initiation or switch, the alternative treatment options are evaluated, under the assumption that a patient can only receive any DMT once. A Markov model component [16], which represents the decision process conducted by the physician, is used to identify the optimal DMT given the patient’s current age, sex, and expected ARR given disease duration and the prior of severity (the Markov model makes use of the same clinical parameters as the DES but uses average/population values rather than individual estimates). The Markov model consists of 11 health states ranging from EDSS 0 to 9 plus death and is evaluated for each of the severity groups and for all available DMT options. The transition probabilities are dependent on the severity and are adjusted by a treatment effect (see Supplemental Materials for more detail).
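A cohort trace over the 11 health states can be sketched as follows. This Python example is a simplified illustration, not the model's transition structure: it assumes annual cycles, at most one EDSS step per cycle, no backward transitions, a constant mortality probability, and a treatment effect applied as a multiplier on the step probability. All numerical values are hypothetical.

```python
N_EDSS = 10        # EDSS 0 to 9
DEATH = N_EDSS     # index of the absorbing death state


def transition_matrix(p_step, p_death):
    """Annual transition matrix: stay, worsen by one EDSS step, or die.

    Hypothetical structure and probabilities, for illustration only.
    No backward transitions (EDSS improvements) are allowed.
    """
    size = N_EDSS + 1
    matrix = [[0.0] * size for _ in range(size)]
    for s in range(N_EDSS):
        step = p_step if s < N_EDSS - 1 else 0.0  # EDSS 9 cannot worsen further
        matrix[s][DEATH] = p_death
        if step:
            matrix[s][s + 1] = step
        matrix[s][s] = 1.0 - p_death - step
    matrix[DEATH][DEATH] = 1.0
    return matrix


def years_in_state(matrix, start_state, horizon_years):
    """Cohort trace: expected years spent in each state over the horizon."""
    n = len(matrix)
    occupancy = [0.0] * n
    occupancy[start_state] = 1.0
    totals = [0.0] * n
    for _ in range(horizon_years):
        for s, share in enumerate(occupancy):
            totals[s] += share
        occupancy = [sum(occupancy[i] * matrix[i][j] for i in range(n))
                     for j in range(n)]
    return totals


# Assumed treatment effect: a multiplier of 0.6 on the step probability.
untreated = years_in_state(transition_matrix(0.20, 0.01), start_state=2, horizon_years=5)
treated = years_in_state(transition_matrix(0.20 * 0.6, 0.01), start_state=2, horizon_years=5)
```

In this sketch the expected years across all states (including death) add up to the 5-year horizon, and the treated cohort spends longer in its starting state than the untreated cohort.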

For each available treatment and for each of the severity groups, the Markov model predicts the time the patient would spend in each EDSS state until the end of the time horizon (set at 5 years on the basis of a Delphi study [18]) or death. The Markov model generates costs and QALYs related to treatment, disease management, relapses, and AEs. The outcomes associated with each possible treatment are determined as a weighted average of the outcomes associated with each severity group using the distribution of patient severities (prior of severity) as weights. This yields the expected outcomes for the patient for each treatment. The virtual physician then determines the optimal treatment based on a specific decision rule. Decision rules that may be considered include minimising number of relapses, maximising the time to disability progression, and cost-effectiveness (best value for money).
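The weighting step can be made concrete with a short example. The Python sketch below uses three severity groups instead of 11, two placeholder treatments ("DMT A", "DMT B"), and invented outcome values; only the weighted-average logic and the "minimise relapses" rule are taken from the description above.

```python
def expected_outcome(prior, outcome_by_group):
    """Weight each severity group's outcome by its prior probability."""
    if abs(sum(prior) - 1.0) > 1e-9:
        raise ValueError("prior probabilities must sum to one")
    return sum(p * x for p, x in zip(prior, outcome_by_group))


# Hypothetical example with three severity groups instead of 11.
prior = [0.2, 0.5, 0.3]
relapses_by_group = {            # expected relapses over the time horizon
    "DMT A": [3.0, 2.0, 1.0],
    "DMT B": [2.0, 2.0, 2.0],
}
expected = {dmt: expected_outcome(prior, x)
            for dmt, x in relapses_by_group.items()}
best = min(expected, key=expected.get)  # decision rule: minimise relapses
```

Here the expected number of relapses is 1.9 for DMT A and 2.0 for DMT B, so DMT A would be selected; a different prior of severity could reverse the choice.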

Model Validation

Both the DES and the Markov model were subjected to internal and external validation. Internal validation assessed whether model outcomes for natural history were in line with the input parameters used to inform the model. The DES model for natural history was based on the BCMS registry data [13, 14, 31]. The relapse rate and ttEDSS6 were validated against their original sources, including modelling the median simulated ttEDSS6, the impact of age at onset and sex on simulated ttEDSS6, and relapses conditional on disease duration, age at onset, and severity. External validation compared costs and quality of life outcomes from the DES model against outcomes produced using the model described in a recently published cost-effectiveness analysis [32]. This reference is a cohort-based Markov model that compared alemtuzumab, cladribine tablets, natalizumab, and natural history (best supportive care, BSC). In the DES, these four treatment strategies were evaluated, one at a time; after discontinuation of active treatment patients were moved to BSC.

The Markov component of the treatment-sequencing model is crucial as it determines the patient’s next treatment. As such, a separate validation was undertaken and the Markov component was validated against the reference model developed for cost-effectiveness purposes [32]. This validation comprised two elements: first, the progression rates in natural history were compared to published progression rates [16], then the cost and QALY outcomes were compared to the reference model [32].

Case Study: Modelling the Optimal Treatment Sequence and Assessing the Impact of Different Decision-Making Criteria

A case study was developed to model how different decision-making criteria identified the optimal sequence of DMTs, along with the associated impact on costs and quality of life outcomes, as well as hospital resource usage. A cohort of 1500 patients was simulated in the DES over their future lifetime; treatment decisions were re-evaluated annually over a 5-year time horizon, which is a typical time horizon for decision-making according to the Delphi study with neurologists [18]. The analyses included a set of nine DMTs (alemtuzumab, cladribine tablets, dimethyl fumarate, fingolimod, glatiramer acetate, interferon beta-1a, natalizumab, ocrelizumab, and teriflunomide) as well as the option of no treatment.

Current Treatment Guidelines

Initially, a sequence based on current treatment guidelines was estimated using the National Health Service (NHS) England’s ‘Treatment Algorithm for Multiple Sclerosis Disease-Modifying Therapies’ [33]. To find a suitable scenario for the population simulated in the model, it was assumed that at the start of the treatment sequence, the patient is diagnosed with RRMS on the basis of one relapse in the last 2 years and radiological activity. Additionally, it was assumed that the physician would not choose interventions that are indicated in the algorithm as being high risk for that line of therapy. As such, the chosen guideline treatment sequence was (1) interferon beta-1a; (2) cladribine tablets; (3) ocrelizumab. The NHS report gives no recommendations of optimal DMTs after third-line treatment; therefore, the analyses presented here focused on the first three lines of treatment only.

Decision Rules

Three different decision rules were selected for this case study; the optimal treatment sequence for each was calculated and compared to current treatment guidelines. The ‘number of relapses’ criterion selects the optimal DMT based on the lowest relative risk of relapses. The ‘number of EDSS steps’ criterion estimates the optimal DMT based on the lowest average EDSS value at the end of the 5-year time horizon. The ‘cost-effectiveness’ criterion uses the Markov model component to calculate the optimal DMT over the 5-year period based on the highest number of QALYs with an associated incremental cost-effectiveness ratio (ICER) below the cost-effectiveness threshold. This is implemented as the treatment that yields the highest expected net monetary benefit (NMB = QALYs × willingness-to-pay threshold − costs), calculated using a willingness-to-pay threshold of £30,000 per QALY gained.
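The net monetary benefit rule can be sketched in a few lines of Python. The threshold of £30,000 per QALY and the NMB formula are taken from the text; the treatment labels and the QALY and cost values are hypothetical and are not model results.

```python
WTP = 30_000  # willingness-to-pay threshold, GBP per QALY gained


def net_monetary_benefit(qalys, costs, wtp=WTP):
    """NMB = QALYs x willingness-to-pay threshold - costs."""
    return qalys * wtp - costs


# Hypothetical expected 5-year outcomes per option (not model results).
options = {
    "no treatment": {"qalys": 3.50, "costs": 20_000},
    "DMT A": {"qalys": 3.70, "costs": 25_000},
    "DMT B": {"qalys": 3.80, "costs": 40_000},
}
nmb = {name: net_monetary_benefit(o["qalys"], o["costs"])
       for name, o in options.items()}
optimal = max(nmb, key=nmb.get)
```

With these illustrative inputs, DMT A has the highest NMB (about £86,000) even though DMT B yields more QALYs, because the additional QALYs from DMT B cost more than £30,000 each.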

Estimating Resource Usage

Three DMTs (alemtuzumab, natalizumab, and ocrelizumab) included in the model require infusion visits for patients. These visits are associated with additional resource use and costs, which are included in the model, but also require the availability of physical capacity in the hospital (e.g. chair time) for the duration of the infusion. The capacity required over time, in terms of proportion of patients on treatment requiring infusion visits, was calculated for the current treatment guidelines and optimal sequence of DMTs based on each of the three decision rules presented above. These calculations assumed a fixed cohort of patients requiring capacity and no newly diagnosed patients entering the model.
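One way to derive the proportion of a fixed cohort on an infusion DMT over time from simulated treatment histories is sketched below in Python. The data layout (treatment spells as start year, end year, and DMT) and the four example histories are assumptions for illustration; the sketch also simplifies by counting a patient as requiring capacity in every year of a spell.

```python
INFUSION_DMTS = {"alemtuzumab", "natalizumab", "ocrelizumab"}


def infusion_share_by_year(histories, years):
    """Proportion of the fixed cohort on an infusion DMT in each year.

    `histories` maps a patient id to (start_year, end_year, dmt) spells;
    a spell covers years start_year <= y < end_year.
    """
    shares = []
    for year in range(years):
        on_infusion = sum(
            any(start <= year < end and dmt in INFUSION_DMTS
                for start, end, dmt in spells)
            for spells in histories.values()
        )
        shares.append(on_infusion / len(histories))  # fixed denominator
    return shares


# Hypothetical treatment histories for a cohort of four patients.
histories = {
    1: [(0, 2, "interferon beta-1a"), (2, 5, "natalizumab")],
    2: [(0, 3, "cladribine tablets"), (3, 4, "ocrelizumab")],
    3: [(0, 5, "glatiramer acetate")],
    4: [(0, 1, "alemtuzumab")],
}
shares = infusion_share_by_year(histories, years=5)
```

For this toy cohort the yearly proportions are 0.25, 0.0, 0.25, 0.5, and 0.25. The denominator stays fixed at the initial cohort size, in line with the assumption that no newly diagnosed patients enter the model.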

Compliance with Ethics Guidelines

This article does not contain any new studies with human participants or animals performed by any of the authors.

Results

Model Validation

DES Internal Validation

Tremlett et al. reported a median ttEDSS6 of 32.6 years (95% confidence interval [CI] 29.2, 36.0) for patients with RRMS [13]. Before adjustment for background mortality (competing risks), the DES provided a good prediction of the median ttEDSS6 (32.5 years; 95% credible interval [CrI] 32.0, 33.0) based on 25,000 simulated patients. A Cox model, fitted to the simulated ttEDSS6, predicted the hazard ratio (HR) of ttEDSS6 due to age at disease onset and sex well (Table 1).

Table 1 DES internal validation results on time to EDSS6

After correction of the ttEDSS6 for background mortality, the severity at baseline was re-simulated for 25,000 patients. The median ttEDSS6 after correction was 26.8 years (mean 27.7; 95% CrI 27.5, 27.9). The DES was then run for 2000 patients given natural history. A Cox model was fitted on the ttEDSS6 censoring patients who died before reaching EDSS6. The median ttEDSS6 reached by patients in the DES (27.1 years, 95% CI 26.0, 28.2) was similar to the simulated values at baseline, suggesting that disability worsening until EDSS6 was implemented correctly in the DES.

In terms of relapse data, the mean ARR from the DES was 0.24 (median 0.22; interquartile range [IQR] 0.14, 0.31), which was similar to the ARR of 0.23, calculated as 11,722 events/(mean follow-up [20.6 years] × patients [2477]) observed by Tremlett et al. [14]. This shows that the model predicted relapses appropriately.
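The reference ARR quoted above can be reproduced directly from the three reported figures; the variable names in this short Python check are illustrative.

```python
events = 11_722        # relapses observed in the registry cohort
mean_follow_up = 20.6  # years
patients = 2_477

# ARR = events / total patient-years of follow-up
arr = events / (mean_follow_up * patients)
print(round(arr, 2))   # prints 0.23
```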

DES External Validation

The results of the comparisons of alemtuzumab, cladribine tablets, natalizumab, and BSC are shown in Table 2. Overall, the results of the DES model were in a similar range to the results of the reference model. The most important differences were those associated with drug costs, relapse-related costs and QALYs, and AE-related costs and QALYs. Furthermore, the disease management costs associated with alemtuzumab and cladribine tablets were higher in the DES than in the reference model. The differences in drug and disease management costs are mainly explained by higher discontinuation rates in the DES than in the reference model. Differences in AEs are explained by including only severe AEs in the DES. The average ttEDSS6 was 26.4 years in the DES compared to 20.9 years in the reference model; this would be expected from the relatively more severe population used in the latter.

Table 2 DES external validation results

Markov Model

The proportions of patients who had not reached EDSS6 over time are displayed in Appendix Fig. 1. The weighted average across all severity groups produced by the Markov model closely matched the data from Palace et al. until year 5 [16]. Beyond this point, the proportion of patients who had not reached EDSS6 declined more slowly in our model.

Appendix Fig. 2 displays the average time spent in each health state for a time horizon of 25 years. These outcomes are well aligned between the models, although patients in the reference model spent relatively more time in EDSS6 and EDSS8 compared to the Markov component of our model, while they spent relatively less time in EDSS4 and EDSS5. When the modelled cohort starts at EDSS3, patients in the reference model again spent more time in EDSS6 and EDSS8 than patients in our model. The most striking difference was observed for EDSS states below 3: no patients resided in EDSS0 to EDSS2 in our Markov model since no backward transitions were allowed, whereas in the reference model patients spent approximately 5 years in EDSS0 to EDSS2 within a 25-year time horizon.

The outcomes of the Markov component of our model were also compared to the outcomes from the reference Markov model [32]. The results of the validation for a time horizon of 25 years are shown in Table 3. The outcomes of our Markov model are reasonably aligned with the outcomes of the reference model. The most prominent differences are (1) life years and associated QALYs are lower in our Markov model because MS-related mortality was modelled differently, (2) the number of relapses in natural history and the associated costs and QALYs are higher in our Markov model because it used a more detailed method to calculate relapses in natural history; specifically, the number of relapses with treatment is lower because no discontinuation is assumed, and (3) the treatment costs of natalizumab, the only continuously administered DMT included in this model, are considerably higher in our Markov model because no discontinuation was assumed.

Table 3 Markov model validation results based on a 25-year time horizon

Case Study: Modelling the Optimal Treatment Sequence and Assessing the Impact of Different Decision-Making Criteria

Figure 3 presents the optimal treatment sequences based on the current treatment guidelines as well as for each of the three decision criteria. As illustrated, each generated a different optimal treatment sequence. Cladribine tablets is the only DMT that was common to all sequences; however, its positioning within treatment lines changed with the choice of scenario. The sequences based on minimising the number of relapses or EDSS steps were similar, with the second- and third-line treatments (natalizumab and ocrelizumab) switching. The sequence optimising for cost-effectiveness had a common DMT for all patients (glatiramer acetate) for first-line therapy. For second and subsequent treatment lines, different patients receive different treatments.

Fig. 3

Optimal treatment sequences using different treatment decision criteria. Note that proportions do not sum to 100% after the first treatment because patients may drop out of treatment or the model (because of death). EDSS Expanded Disability Status Scale

Table 4 presents the impact of the treatment sequences in terms of the proportion of patients reaching EDSS6, time spent in the model, costs, and QALYs. The results indicate that the lowest proportion of patients reaching EDSS6 (1.07%) occurred when the treatment decision was made based on the sequence most likely to minimise the number of EDSS steps. Choosing treatments to minimise the number of relapses in turn maximised the time spent in the model, as well as QALYs, but this sequence had the largest associated costs. Choosing the optimal treatment based on cost-effectiveness was associated with the lowest costs and the lowest QALYs. The main differences between the treatment sequences in terms of costs come from drug acquisition and administration. As higher costs were associated with better outcomes in these scenarios, this suggests that the treatments that are more efficacious in reducing the number of relapses and the number of EDSS steps are also more expensive. Additionally, the second- and third-line treatments in these sequences (natalizumab and ocrelizumab) require infusions and are among the most expensive treatments in terms of both annualised drug and administration costs.

Table 4 Results based on treatment decision-making criteria

Figure 4 presents the results of the scenarios on a cost-effectiveness plane relative to the current treatment guidelines. This indicates that at the illustrated threshold of £30,000 per QALY gained, the optimal sequences for minimising EDSS steps or relapses were not cost-effective compared to the current treatment guidelines sequence. The sequence obtained from optimising cost-effectiveness decision criteria is cost-effective at this threshold; however, it does reduce the number of QALYs (at reduced cost) for patients compared to the sequence based on current guidelines.

Fig. 4

Results based on different treatment decision criteria relative to the current treatment guidelines sequence. EDSS Expanded Disability Status Scale

Figure 5 presents the number of patients on a treatment that requires an infusion visit over time. When choosing the optimal treatment sequence based on cost-effectiveness, no treatments that require an infusion visit are included; this is likely due to the increased cost in terms of drug acquisition and administration associated with these treatments. The graphs based on the decision criteria of minimising EDSS steps or relapses are similar. This can be explained by the similarity in the treatment sequences (see Fig. 3), with the second and third treatment lines alternating between the two sequences. In all cases, the number of incident patients requiring infusion visits peaks at 10–20 years after treatment initiation.

Fig. 5

Number of patients on treatments requiring infusion visits by treatment decision criteria. NMB net monetary benefit, EDSS Expanded Disability Status Scale

Discussion

New modelling approaches are needed to address questions regarding treatment sequences in RRMS, including which decision rules are currently used when considering the benefit–risk profiles of different DMTs in clinical settings and what thresholds should be used to determine that a DMT is not performing consistently with expectations. The approach described in the current paper is proposed to help decision-makers address these complex questions and provides an innovative framework for the explicit modelling of treatment sequences in RRMS. The model focuses not only on what treatment a patient will, or should, switch to but also on when a patient should switch treatment. The conceptualisation of the (virtual) physician as someone who updates his/her view of a patient’s expected severity each time he/she observes the patient was introduced to reflect the fact that physicians develop an understanding of the patient over time and, in theory, can make more informed treatment decisions as time progresses.

The internal validation of the DES showed that disease severity, survival, and number of relapses are appropriately simulated for natural history. It is worth noting that the narrower CrIs calculated by the DES compared to the reference model can be partially attributed to the large number of patients simulated. The external validation showed that the outcomes of the DES and Markov model matched the reference models reasonably well, except for some differences which relate to modelling choices. The most prominent difference was the discontinuation rate, which was modelled using a user-adjustable decision rule in the DES and may have resulted in different discontinuation rates than those used in the Markov model. The relapse outcomes also differed between the models; however, given the ARR in the DES has been validated against the average relapse rate from the BCMS registry, we are confident that the DES simulates relapses appropriately. Differences were also observed in natural history progression of the Markov model compared to the reference cost-effectiveness model [16, 32]. These are as expected as Palace et al. [16] used a selection of patients from the BCMS registry with relatively severe disease (at least two relapses in the last 2 years), whereas the severity groups in our model captured patients from the BCMS registry without any requirements for disease activity. In addition, our Markov model provided no allowance for backward transitions (i.e. EDSS improvements) because these are considered temporary improvements, whereas backward transitions are included in the reference model [16]. Any other differences in Markov model outcomes are as expected and are attributable to different inputs and modelling choices. Overall, the results of the Markov model are considered robust and reliable.

One of the challenges with patient-level DES is the data requirements. The current implementation of the model used published, aggregate data and the key assumptions were validated in a Delphi study [18]. Although natural history was consistently based on the BCMS registry [13, 14, 31] and the validation process showed that the model can reproduce the published population level estimates, the use of aggregate data to populate the DES means that the covariance between simulated patient characteristics could not be included, nor could any relationship between disability worsening and relapses or any correlation between treatment effects within a patient. As such, the individual outcomes might be incorrect. A possible next step in the development of this model would be to fill these gaps using data from real-world studies. For example, magnetic resonance imaging test results were omitted from the current model because, as a result of the high correlation with relapses, it was deemed inappropriate to include them in the patient-level simulation, based on published aggregate data only [18]. However, the inclusion of radiology in the DES next to relapses and disability worsening, as well as in the decision rule for treatment switching, may potentially improve the model. Similarly, the probability of PML was only modelled for natalizumab treatment; real-world data on the incidence of PML in other DMTs for RRMS would be useful.

The virtual physician uses a likelihood function to update their belief about the patient’s disease severity. This likelihood function describes what a physician learns about the patient’s severity when the physician makes observations regarding disability worsening. Currently, a Gompertz distribution with a shape parameter of 0.1 is used to model this relationship between EDSS steps and a patient’s severity. This assumption was based on the progressive nature of RRMS and the assumed uncertainty in this relationship. Additional research is required to determine the accuracy of the current likelihood function and/or propose suitable alternatives. One of the key challenges in modelling treatment sequences in RRMS, as well as in other autoimmune diseases including rheumatoid arthritis, is locating evidence for the effectiveness of treatments when given in later lines [17, 34, 35]. Although this model does not present a solution to this data gap, modelling treatment sequences in more detail makes knowledge gaps apparent and enables researchers to investigate their impact on treatment decisions. In addition, the physician’s choice of treatment will be affected by factors other than perception of disease severity, including patient choice, lifestyle, pregnancy, and co-morbidities; further research is required to develop the model to incorporate these factors.
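To make the belief update concrete, the following is a minimal sketch of a Bayesian update over discrete severity groups. It is not the implementation used in the model. Only the Gompertz family and the shape parameter of 0.1 come from the text; the three severity groups, the `rate` parameter, the mapping of a severity index onto the Gompertz CDF, and the function names are all illustrative assumptions.

```python
import math


def gompertz_cdf(x, shape, rate):
    """Gompertz CDF: 1 - exp(-(rate / shape) * (exp(shape * x) - 1))."""
    return 1.0 - math.exp(-(rate / shape) * math.expm1(shape * x))


def worsening_likelihood(severity_index, shape=0.1, rate=0.05):
    """Assumed probability of observing an EDSS step in one period.

    The severity index (1 = mildest) is mapped through the Gompertz CDF.
    The shape of 0.1 follows the text; the rate of 0.05 is hypothetical.
    """
    return gompertz_cdf(severity_index, shape, rate)


def update_belief(prior, observed_worsening, shape=0.1, rate=0.05):
    """Return the posterior belief over severity groups after one observation.

    prior: probabilities for severity groups 1..n, summing to 1.
    observed_worsening: True if an EDSS step was observed in the period.
    """
    weighted = []
    for index, prior_prob in enumerate(prior, start=1):
        likelihood = worsening_likelihood(index, shape, rate)
        if not observed_worsening:
            likelihood = 1.0 - likelihood
        weighted.append(prior_prob * likelihood)
    total = sum(weighted)
    return [w / total for w in weighted]


if __name__ == "__main__":
    flat_prior = [1 / 3, 1 / 3, 1 / 3]  # hypothetical mild / moderate / severe
    print(update_belief(flat_prior, observed_worsening=True))
```

Under these assumptions, an observed EDSS step shifts belief towards the more severe groups, and a period without worsening shifts it towards the milder groups. A different likelihood function could be substituted in `worsening_likelihood` without changing the update step.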

This innovative framework enables the user to explore different decision criteria for choosing the optimal treatment strategy. In the case study, it was possible to improve on the current treatment guidelines strategy in terms of reducing the proportion of patients who reach EDSS6, the time spent on the first three lines of treatment, and associated quality of life. This was accomplished by choosing the optimal treatment sequence based on minimising the expected number of EDSS steps or relapses. However, this improvement comes at a financial cost: the treatments that minimise EDSS steps and relapse rates are more costly in terms of drug acquisition and administration. Optimising treatment by maximising cost-effectiveness yields the least expensive treatment sequence, but also the lowest QALYs. Furthermore, in addition to direct patient benefits, an important consideration in identifying optimal treatment sequences is the impact a treatment sequence may have on hospital capacity. Our analyses indicated that the capacity required over time is dependent on the choice of sequence, with resource use peaking 10–15 years after treatment initiation. The required capacity is presented in terms of the number of patients on each of the treatments requiring infusion visits. However, this may not be a clear indicator of the capacity required in the hospital, as it is unlikely that all these patients would require their infusion visit at the same time. It is therefore important to consider treatment schedules along with the operating hours of the units.
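As an illustration of this last point, the sketch below converts patient counts per infusion treatment into expected weekly chair-hours and compares them with the hours a unit offers. It is a back-of-the-envelope calculation, not part of the model. The treatment labels, visit frequencies, visit durations, chair count, and opening hours are all hypothetical placeholders, and visits are assumed to be spread evenly over the year.

```python
def weekly_chair_hours(patients_on_treatment, visits_per_year, hours_per_visit):
    """Expected infusion chair-hours needed per week.

    All three arguments are dicts keyed by a treatment label.
    Assumes visits are spread evenly across 52 weeks.
    """
    total = 0.0
    for treatment, n_patients in patients_on_treatment.items():
        yearly_hours = n_patients * visits_per_year[treatment] * hours_per_visit[treatment]
        total += yearly_hours / 52.0
    return total


def utilisation(demand_hours, chairs, opening_hours_per_week):
    """Fraction of available chair-hours consumed by the expected demand."""
    return demand_hours / (chairs * opening_hours_per_week)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    patients = {"infusion_A": 52, "infusion_B": 26}
    visits = {"infusion_A": 13, "infusion_B": 2}      # visits per patient per year
    duration = {"infusion_A": 2.0, "infusion_B": 4.0}  # hours per visit

    demand = weekly_chair_hours(patients, visits, duration)
    print(f"Weekly chair-hours needed: {demand:.1f}")
    print(f"Utilisation: {utilisation(demand, chairs=2, opening_hours_per_week=40):.1%}")
```

With inputs of this kind, the same number of patients on treatment can imply very different loads on an infusion unit depending on dosing frequency and visit length, which is why patient counts alone may overstate or understate the capacity required.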

It should be noted that the choice of the base-case treatment guidelines using the NHS treatment algorithm [33] may not be representative of the treatment sequence used in practice. A change in the treatment guidelines sequence will have a subsequent impact on the relative costs and benefits of the other sequences compared with it. Furthermore, these analyses focused on the relative benefits of treatment sequences for the first three treatment lines, in order to compare the results with the NHS treatment algorithm, which gave no clear information on treatments that should be used in the fourth and subsequent treatment lines [33]. In addition, there was limited information on the effectiveness of treatments when given in later treatment lines [17, 34, 36]. This lack of information supports our decision to concentrate on the first three treatment lines. Conversely, focusing on the first three treatment lines may underestimate the treatment benefits to the extent that improvement could appear in later lines. An example of this is when patients are given less efficacious and less costly treatments in earlier treatment lines. Those patients with more severe RRMS would likely switch away from these treatments quickly to more effective and costly treatments, whereas patients with milder RRMS may stay on these earlier treatment lines for longer. Furthermore, as the benefits of the treatment are deferred until later lines for the more severe patients, the total number of QALYs gained from the treatment sequence would be increased, affecting the value of the cost-effectiveness scenario.

Conclusions

From a validation perspective, the model proved to be robust in generating outcomes consistent with existing RRMS models and published studies on natural history. This approach can be used to identify optimal treatment sequences for patients with RRMS using different decision criteria. Improvements on the current treatment guidelines sequence were possible in terms of the proportion of patients reaching EDSS6 after three lines of treatment and quality of life outcomes. However, these improvements come at both a financial and a capacity cost. Moving forward, this innovative framework has the potential to reliably simulate individual patient trajectories in the current complicated treatment landscape and may therefore prove useful in supporting treatment switching, treatment positioning, and treatment guideline decisions in RRMS.