The individual-level data used to construct the simulation model came from a specialist UK population-based registry, the Haematological Malignancy Research Network (HMRN; www.hmrn.org), the methods of which have been described previously [1, 18]. Briefly, since September 2004, all patients newly diagnosed with a haematological malignancy (leukaemias, lymphomas, and myelomas) in a catchment population of more than 3.6 million have been routinely ascertained and followed up. HMRN has Section 251 support under the NHS Act 2006, which allows all patients, regardless of consent, to have full treatment, response and outcome data collected to clinical trial standards, to be ‘flagged’ for death and subsequent cancer registrations at the national Medical Research Information Service (MRIS), and to be linked to nationwide Hospital Episode Statistics (HES).
The current study includes all adult patients (≥18 years) newly diagnosed with de novo DLBCL (International Classification of Disease for Oncology, 3rd edition: 9680/3, 9735/3, 9712/3, and 9679/3) within HMRN in 2007 (N = 271). All patients were followed for 5 years from the date of diagnosis, and treatment pathways were individually mapped out according to the chemotherapy regimens received. A more detailed summary of patient characteristics is presented in Supplementary Table 1.
To reflect current treatment strategies while remaining responsive to future changes, a discrete-event micro-simulation model was constructed using Simul8 software (Simul8 2013 Professional version, Simul8 Corporation, Boston, MA, USA). The model first assigned attributes (such as age at diagnosis, sex and prognostic factors) to a group of simulated patients, and then moved each patient forward from one event to the next, based both on their characteristics and on the timing of events rather than on fixed time cycles.
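The event-driven mechanism described above can be illustrated with a minimal sketch in Python. It is not the Simul8 model: the two-step pathway, the age cut-off, the exponential delays and the mean durations are all hypothetical placeholders chosen only to show how a patient is advanced from event to event rather than through fixed cycles.

```python
import heapq
import random

def simulate_patient(age, rng):
    """Advance one simulated patient event by event and return the event log."""
    # Hypothetical mean delays in days; placeholders, not HMRN estimates.
    mean_wait = 20.0 if age < 70 else 30.0
    mean_treatment = 120.0

    queue = [(0.0, "diagnosis")]  # priority queue ordered by event time
    log = []
    while queue:
        time, event = heapq.heappop(queue)  # jump straight to the next event
        log.append((time, event))
        if event == "diagnosis":
            delay = rng.expovariate(1.0 / mean_wait)
            heapq.heappush(queue, (time + delay, "treatment_start"))
        elif event == "treatment_start":
            delay = rng.expovariate(1.0 / mean_treatment)
            heapq.heappush(queue, (time + delay, "treatment_end"))
    return log

log = simulate_patient(age=65, rng=random.Random(1))
for time, event in log:
    print(f"day {time:7.1f}: {event}")
```

The simulation clock moves directly to the time of the next scheduled event, so the interval between events can differ for every patient.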
The model structure was based on patient treatment pathways determined from empirical HMRN data, expert opinion and clinical guidelines. The structure of the model is shown in Fig. 1, and a simplified version of the model is available at https://www.hmrn.org/economics/models. The date of diagnosis defines the start of the model, and costs for diagnostic tests such as biopsies, scans, electrocardiography (ECG) and echocardiography (ECHO) are included at this point. After diagnosis, the model splits into two branches according to whether the initial decision was to administer first-line chemotherapy with curative intent or to manage the patient supportively using a palliative approach. This is a unique and important feature of the model, ensuring the results reflect ‘real world’ practice and capture the fact that some patients are managed palliatively from the date of diagnosis until death. For those who entered the first-line curative treatment branch, different chemotherapy regimens, with or without supportive care, were assigned to each patient. The probability of receiving each treatment varied according to the patient’s individual attributes [such as age, disease stage and central nervous system (CNS) involvement]. This design allowed the model to capture the differences in ‘cost’ and ‘time in treatment’ between alternative regimens. However, it was beyond the scope of this study to compare the economic impact of different first-line chemotherapies.
Once first-line treatment had been received, one of three outcomes was assigned to each individual: death during treatment, response to treatment, or no response to treatment. The probabilities of these outcomes depended on the first-line chemotherapy regimen and age at diagnosis.
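As a rough illustration of how such an outcome could be assigned, the sketch below draws one of the three outcomes from probabilities that depend on regimen and age. The regimen label, the age grouping at 70 years and every probability are hypothetical and are not taken from the HMRN data.

```python
import random

OUTCOMES = ("died_during_treatment", "responded", "no_response")

# Hypothetical probabilities keyed by (regimen, age group); not HMRN estimates.
OUTCOME_PROBS = {
    ("regimen_A", "under_70"): (0.05, 0.80, 0.15),
    ("regimen_A", "70_plus"): (0.15, 0.60, 0.25),
}

def assign_outcome(regimen, age, rng, probs=OUTCOME_PROBS):
    """Draw one first-line outcome given the regimen and age at diagnosis."""
    age_group = "under_70" if age < 70 else "70_plus"
    weights = probs[(regimen, age_group)]
    return rng.choices(OUTCOMES, weights=weights, k=1)[0]

outcome = assign_outcome("regimen_A", 64, random.Random(42))
print(outcome)
```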
For individuals who responded to first-line treatment, one of three events could occur: relapse, remaining in remission until cured (defined as staying in remission ≥5 years), or death in remission. For those deemed to be cured, it was assumed that mortality returned to that of the general population and that no subsequent DLBCL-related medical costs were incurred.
For individuals who relapsed or had disease that was refractory to treatment, two further options were possible: further potentially curative treatment or the adoption of an end-of-life (palliative) approach. The probability of each decision depended on age at diagnosis, previous chemotherapy regimen and response. For those who were not treated curatively, end-of-life care included all care given from the last chemotherapy until death. For those who received second-line treatment, different types of chemotherapy regimens, with or without autologous stem-cell transplant (ASCT), were included. Following this, each patient could remain in remission, receive third-line treatment, or receive end-of-life care, with the decision process being identical to that for first-line treatment. Few patients received treatment beyond third line, so for the purposes of the model it was assumed that those who did had treatment patterns and response rates similar to those observed at third line.
The key input parameters used in the model are listed in Table 1. For more details, please refer to Supplementary Tables 2–4.
Model inputs: medical costs
The model was built from an NHS perspective, and only medical costs directly related to DLBCL management were considered. These comprised costs for diagnosis, treatment, supportive care, follow-up and end-of-life care. Details of the cost items and different chemotherapy regimens included in each costing phase can be found in Supplementary Table 5.
All cost parameters were calculated using the National Tariff 2013/14, representing NHS reimbursement/expenditure for treating the DLBCL population. For costs that were locally negotiated, such as the costs of chemotherapy regimens, information was derived from the Leeds Teaching Hospitals NHS Trust. The inflated NHS reference cost 2012/13 was used only when data were otherwise not available. All costs were expressed in 2013 British pounds sterling, and the detailed cost information (unit costs) used in the model is summarised in Table 2.
Model inputs: time to event
Time-to-event (TTE) is a key element of discrete event simulation. Several time-to-event analyses were carried out using empirical data derived from HMRN to estimate the distribution of the time between two events. These covered the time from diagnosis to treatment, time in treatment, time from response to relapse, time from response to death and time in end-of-life care. All time-to-event analyses (survival analyses) were based on the best-fit distributions as a function of the patient’s age, treatment intent and treatment details. Five parametric survival models (exponential, Weibull, log-normal, log-logistic and Gompertz distributions) were tested, and the best-fit model was determined using the Akaike information criterion (AIC) score. It was assumed that cured patients’ mortality would return to normal, and the distribution of time to death was generated using the United Kingdom National Life Table, 2011–2013. The key parameters used in the model are illustrated in Fig. 2a–d. For more details on the time-to-event analyses, please refer to Supplementary Table 6.
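The AIC-based choice between candidate distributions can be sketched as follows. This is a simplified illustration only: it compares just two of the five candidate distributions, uses closed-form maximum-likelihood fits, ignores censoring and covariates, and runs on made-up durations. The AIC is computed as 2k − 2 ln L, where k is the number of fitted parameters and L the maximised likelihood; the lowest score indicates the preferred model.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def aic_exponential(times):
    """Fit an exponential distribution by maximum likelihood (no censoring)."""
    n = len(times)
    rate = n / sum(times)
    loglik = n * math.log(rate) - rate * sum(times)
    return aic(loglik, n_params=1)

def aic_lognormal(times):
    """Fit a log-normal distribution by maximum likelihood (no censoring)."""
    n = len(times)
    logs = [math.log(t) for t in times]
    mu = sum(logs) / n
    var = sum((x - mu) ** 2 for x in logs) / n
    loglik = -sum(logs) - 0.5 * n * math.log(2 * math.pi * var) - 0.5 * n
    return aic(loglik, n_params=2)

# Made-up durations in days, for illustration only.
times = [90.0, 100.0, 110.0, 95.0, 105.0]
scores = {
    "exponential": aic_exponential(times),
    "log-normal": aic_lognormal(times),
}
best = min(scores, key=scores.get)
print(best, scores)
```

Because these illustrative durations cluster tightly around 100 days, the log-normal fit scores a lower AIC than the exponential despite its extra parameter.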
Health outcome was measured by life-years gained (LYG), while economic outcomes were captured by medical costs. Both economic and health outcomes were discounted using a 3.5% annual discount rate, based on UK guidance recommended by the National Institute for Health and Clinical Excellence (NICE).
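For illustration, discounting at 3.5% per year can be sketched as below. The convention used here (the first year after diagnosis is undiscounted and discounting is applied in whole-year steps) is an assumption made for the example; the convention applied within the simulation model is not specified in the text.

```python
def discount(value, years_from_diagnosis, rate=0.035):
    """Present value of a cost or outcome occurring after the given number of years."""
    return value / (1.0 + rate) ** years_from_diagnosis

def discounted_life_years(survival_years, rate=0.035):
    """Sum discounted life-years, one year (or final part-year) at a time."""
    total = 0.0
    year = 0
    remaining = survival_years
    while remaining > 0:
        lived = min(1.0, remaining)
        total += discount(lived, year, rate)  # year 0 is undiscounted (assumption)
        remaining -= lived
        year += 1
    return total

print(round(discount(1000.0, 5), 2))          # a cost of 1000 incurred after 5 years
print(round(discounted_life_years(10.0), 3))  # 10 undiscounted life-years
```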
Probabilistic sensitivity analysis was performed on all parameters in order to explore the cumulative uncertainty of the model. Each parameter was assigned a distribution to reflect sample variability, whilst coefficients of survival models were assigned multivariate normal distributions. Monte Carlo simulations were then carried out by sampling all parameters simultaneously from their corresponding distributions over 500 iterations, by which point the results were stable. All outputs from the iterations were summarised with 95% confidence intervals.
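The general shape of such a probabilistic sensitivity analysis is sketched below. Everything in it is hypothetical: `toy_model` merely stands in for a full run of the simulation model, the beta and gamma distributions and their parameters are illustrative choices, and the multivariate normal draws for survival-model coefficients are omitted. The interval is taken as the 2.5th and 97.5th percentiles of the 500 outputs, which is one common way of summarising them.

```python
import random
import statistics

def toy_model(p_response, cost_treatment, cost_palliative):
    """Hypothetical stand-in for one full run of the simulation model."""
    return p_response * cost_treatment + (1.0 - p_response) * cost_palliative

def run_psa(n_iterations=500, seed=0):
    """Sample all parameters simultaneously and rerun the model each iteration."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iterations):
        # Illustrative distributions and parameters; not those of the study.
        p_response = rng.betavariate(80, 20)         # probability, mean 0.8
        cost_treatment = rng.gammavariate(100, 150)  # cost, mean 15000
        cost_palliative = rng.gammavariate(100, 50)  # cost, mean 5000
        results.append(toy_model(p_response, cost_treatment, cost_palliative))
    return results

results = run_psa()
cuts = statistics.quantiles(results, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
lower, upper = cuts[0], cuts[-1]
print(f"mean {statistics.mean(results):.0f}, 95% interval {lower:.0f} to {upper:.0f}")
```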
To investigate the impact on the UK as a whole, the annual number of expected cases in the UK (N = 4880), derived from HMRN rates (www.hmrn.org/statistics/incidence), was used to run the simulation model. Incidence-based results were presented in aggregate, as well as for time horizons of 5 years, 15 years and lifetime (simulated until 100 years of age or death). Survival beyond 5 years was extrapolated based on the best-fit time-to-event distributions derived from the empirical data and the UK National Life Table, 2011–2013. See “Model inputs: time to event” in Methods for details.
To further investigate the effect of age, a sub-simulation was conducted to capture differences in cost and life-years gained for two age groups: under 70 and over 70 years of age. In addition, using the expected number of new cases of DLBCL diagnosed each year in the UK (N = 4880), the model further simulated national prevalence-based costs and life-years gained. Results were collected after a burn-in period of 10 years.
The model was validated by means of standard methods, including face, internal and external validation. Face validation was conducted while the model was under construction by consulting clinical experts on model structure, data sources and results. Internal validation was assessed by comparing predicted costs and life-years gained with empirical estimates, and external validation was carried out by comparing simulated results with relevant literature.