Estimating Risk Factor Time Paths Among People with Type 2 Diabetes and QALY Gains from Risk Factor Management

Objectives Most type 2 diabetes simulation models utilise equations mapping out lifetime trajectories of risk factors [e.g. glycated haemoglobin (HbA1c)]. Existing equations, using historic data or assuming constant risk factors, frequently underestimate or overestimate complication rates. Updated risk factor time path equations are needed for simulation models to more accurately predict complication rates.
Aims (1) Update United Kingdom Prospective Diabetes Study Outcomes Model (UKPDS-OM2) risk factor time path equations; (2) compare quality-adjusted life-years (QALYs) using original and updated equations; and (3) compare QALY gains for reference case simulations using different risk factor equations.
Methods Using pooled contemporary data from two randomised trials, EXSCEL and TECOS (n = 28,608), we estimated: dynamic panel models of seven continuous risk factors (high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, HbA1c, haemoglobin, heart rate, blood pressure and body mass index); two-step models of estimated glomerular filtration rate; and survival analyses of peripheral arterial disease, atrial fibrillation and albuminuria. UKPDS-OM2-derived lifetime QALYs were extrapolated over 70 years using historical and the new risk factor equations.
Results All new risk factor equation predictions were within 95% confidence intervals of observed values, displaying good agreement between observed and estimated values. Historical risk factor time path equations predicted trial participants would accrue 9.84 QALYs, increasing to 10.98 QALYs using contemporary equations.
Discussion Incorporating updated risk factor time path equations into diabetes simulation models could give more accurate predictions of long-term health, costs, QALYs and cost-effectiveness estimates, as well as a more precise understanding of the impact of diabetes on patients’ health, expenditure and quality of life.
Trial Registration ClinicalTrials.gov NCT01144338 and NCT00790205
Supplementary Information The online version contains supplementary material available at 10.1007/s40273-024-01398-4.


Supplementary material 1: Additional results
Table A2. Summary of risk factors averaged over all years. Patient characteristics for each trial at randomisation have been reported previously [4,5].

Supplementary material 2: Instructions on how to use the coefficients to predict risk factors
Tables 1 and 2 in the manuscript provide coefficients for the estimation of the continuous and binary risk factors of an individual patient for each year of simulation, based on their predicted risk factor value in the previous year and their risk factor value at the start of the simulation. The values at the start of the simulation may represent the last recorded value in a randomised trial.
This supplementary material describes how these coefficients should be applied, using a hypothetical individual who had the risk factor values shown in Table A7 at the start of the simulation. The methods for applying these coefficients are similar to those for Leal et al. 2021 [6]. In the equation for continuous risk factors, $y_{i,t}$ is the value of the risk factor for individual $i$ in year $t$; $y_{i,t-1}$ is the previous year's risk factor value; $y_{i,0}$ is the risk factor value at the start of the simulation [6]; and $\text{ethnicity}_i$ is a series of dummy variables for ethnicity, with '1' indicating White, Black, or Asian (oriental, Indian or other) for the respective dummy. The baseline ethnicity category is other (Hispanic, Aboriginal (Australia), Maori, Native Hawaiian, Pacific Islander, Indian (American) or Alaska Native). Finally, $\text{age}_i$ is age at the start of the simulation.
Inserting the coefficient values from Table 1 into the equation for continuous risk factors gives the predicted HbA1c for the hypothetical individual shown in Table A7 12 months after the start of the simulation (9 years of duration of diabetes). The same methods can be used to estimate predictions for other continuous risk factors using the coefficients in Table 1.
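As a minimal sketch of how the continuous equation might be applied year by year, the Python below assumes a linear form with an intercept, the previous year's value, the value at the start of the simulation, a female indicator, ethnicity dummies, age at the start of the simulation and the log of diabetes duration. The function names, the dictionary keys, the inclusion of a female term and every coefficient value are illustrative assumptions; the placeholders are not the Table 1 estimates and should be replaced before use.

```python
import math

# Placeholder coefficients for illustration only (NOT the Table 1 values).
HYPOTHETICAL_COEFS = {
    "const": 1.0, "lag": 0.7, "start": 0.1, "female": 0.0,
    "white": 0.0, "black": 0.0, "asian": 0.0,
    "age_start": 0.0, "ln_duration": 0.1,
}

def predict_next(coefs, prev, start, female, ethnicity, age_start, duration_years):
    """Predict next year's value of one continuous risk factor.

    ethnicity is one of "white", "black", "asian" or "other" (the baseline).
    duration_years is the duration of diabetes in the year being predicted.
    """
    dummies = {k: 1.0 if ethnicity == k else 0.0 for k in ("white", "black", "asian")}
    return (coefs["const"]
            + coefs["lag"] * prev
            + coefs["start"] * start
            + coefs["female"] * female
            + sum(coefs[k] * dummies[k] for k in dummies)
            + coefs["age_start"] * age_start
            + coefs["ln_duration"] * math.log(duration_years))

def project(coefs, start, years, duration_at_start, **person):
    """Apply predict_next recursively, feeding each prediction into the next year."""
    path, prev = [], start
    for year in range(1, years + 1):
        prev = predict_next(coefs, prev, start,
                            duration_years=duration_at_start + year, **person)
        path.append(prev)
    return path

# Example: HbA1c of 8.0 at the start, 8 years of diabetes at the start.
example_path = project(HYPOTHETICAL_COEFS, start=8.0, years=3, duration_at_start=8,
                       female=0, ethnicity="white", age_start=66)
```

The recursion mirrors the description in the text: each year's prediction becomes the "previous year" input for the following year, while the value at the start of the simulation stays fixed.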

Estimating predictions for binary risk factors
Atrial fibrillation (AF), albuminuria (ALB), PVD and eGFR <60 ml/min/1.73 m² were predicted based on Weibull proportional hazards models (Table 2). We assumed that once a patient has been diagnosed with one of these events, they will have it for the rest of their life. The unconditional probability of these events occurring in the interval t to t+1 is:
$$P(t) = 1 - \exp\{H(t) - H(t+1)\}$$
where $H(t)$ is the integrated cumulative hazard at time $t$ (years since diagnosis of diabetes), defined for the Weibull regression as
$$H(t) = \exp(x\beta)\, t^{\Gamma}$$
where $x$ is the vector of the covariates reported in Table A7, $\beta$ is the vector of their respective coefficients (Table 2) and $\Gamma$ is the Weibull shape parameter (Table 2).
Hence, the probability of atrial fibrillation (AF) 12 months after the start of the simulation for the individual outlined in Table A7 can be calculated in two steps. First, we estimate the cumulative hazards at time t and time t+1, which are H(t) = 0.07836 and H(t+1) = 0.09623 for this individual. Second, we estimate the probability of AF in the first year of the simulation to be 1.8% as P = 1 − exp(0.07836 − 0.09623) = 0.01771. This probability is then compared against a random draw from a uniform (0,1) distribution and, if it is higher, the individual develops AF.
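The AF calculation can be sketched in a few lines of Python. The cumulative hazards 0.07836 and 0.09623 are taken from the worked example in the text; the function names are illustrative, and the cumulative hazard function assumes the standard Weibull proportional hazards form exp(xβ)·t^shape, with the linear predictor supplied by the user from the Table 2 coefficients.

```python
import math
import random

def weibull_cum_hazard(linear_predictor, shape, t):
    """Cumulative hazard exp(x*beta) * t**shape (standard Weibull PH form, assumed)."""
    return math.exp(linear_predictor) * t ** shape

def annual_event_probability(h_t, h_t_plus_1):
    """Probability of the event in the interval t to t+1: 1 - exp(H(t) - H(t+1))."""
    return 1.0 - math.exp(h_t - h_t_plus_1)

def develops_event(probability, rng):
    """The event occurs if the probability is higher than a uniform(0,1) draw."""
    return rng.random() < probability

# Worked example from the text: cumulative hazards for AF at t and t+1.
p_af = annual_event_probability(0.07836, 0.09623)

rng = random.Random(12345)          # seed chosen arbitrarily for reproducibility
has_af = develops_event(p_af, rng)  # True in roughly 1.8% of loops
```

Passing an explicit `random.Random` instance keeps the draws reproducible across loops, which is convenient when comparing runs that use different risk factor equations.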
The same approach is used to predict the probability of albuminuria, PVD and eGFR <60 ml/min/1.73 m² conditional on the Weibull model covariates reported in Table 2. Smoking at randomisation was a significant predictor of PVD (Table 1); when predicting PVD, a dummy variable indicating whether the patient smoked at the start of the model simulation can be used in place of a dummy indicating whether the patient smoked at randomisation.

Predicting eGFR
A two-step approach was used to predict eGFR for each year of the UKPDS-OM2 simulation. The first step comprised a proportional hazard Weibull survival model predicting the probability that eGFR is <60 ml/min/1.73 m² for each year of simulation. As with the other survival equations, once an individual progresses to eGFR <60 ml/min/1.73 m² they remain in this health state for the rest of the simulation. The second step uses one of two Tobit models to predict the patient's eGFR value (as a continuous variable) conditional on whether they were predicted to have eGFR above or below 60 ml/min/1.73 m² in the first step.
The Weibull model and Tobit models described in the main manuscript were used to predict eGFR for each patient one year at a time. For each patient with eGFR >60 ml/min/1.73 m² in the previous year, in each loop of the UKPDS-OM simulation, the predicted probability of the Weibull model is compared with a random number to determine whether the patient progressed to eGFR <60 ml/min/1.73 m² that year. Then, conditional on progression or not to eGFR <60 ml/min/1.73 m², one of the Tobit models is used to predict that patient's eGFR value. For patients with eGFR <60 ml/min/1.73 m² in the previous year, the patient is assumed to remain in the eGFR <60 ml/min/1.73 m² health state and the Tobit model for eGFR <60 ml/min/1.73 m² is used to predict eGFR for that year.
For the first step, the instructions for binary risk factors in the previous section can be followed to estimate the predicted probability that eGFR is <60 ml/min/1.73 m² and determine whether the patient progresses to eGFR <60 ml/min/1.73 m² that year (by comparing the predicted probability against a random draw from a uniform (0,1) distribution).
If patient i is determined to have eGFR below 60 ml/min/1.73 m², the conditional expected eGFR value is predicted as:
$$E[\text{eGFR}_i \mid 0 < \text{eGFR}_i < 60] = x_i\beta + \sigma\,\frac{\phi_L - \phi_U}{\Phi_U - \Phi_L}$$
where $\Phi_L = \Phi\left(\frac{0 - x_i\beta}{\sigma}\right)$ is the cumulative distribution function for a standard normal distribution using the lower limit (0 ml/min/1.73 m²); $\Phi_U = \Phi\left(\frac{60 - x_i\beta}{\sigma}\right)$ is the cumulative distribution function for a standard normal distribution using the upper limit (60 ml/min/1.73 m²); $\phi_L$ and $\phi_U$ are the corresponding density functions for the standard normal distribution; and $\sigma$ is the sigma (standard error of the forecast) given in Table 2. This patient will also be assumed to continue to have eGFR <60 ml/min/1.73 m² for the rest of their life.
If patient i is determined to have eGFR above 60 ml/min/1.73 m² in loop L, the conditional eGFR value is predicted as:
$$E[\text{eGFR}_i \mid \text{eGFR}_i > 60] = x_i\beta + \sigma\,\frac{\phi_L}{1 - \Phi_L}$$
where $\Phi_L = \Phi\left(\frac{60 - x_i\beta}{\sigma}\right)$ is the cumulative distribution function for a standard normal distribution using the lower limit (60 ml/min/1.73 m²); $\phi_L$ is the corresponding density function for the standard normal distribution; and $\sigma$ is the sigma (standard error of the forecast) given in Table 2 (13.839).
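The two conditional expectations described above correspond to the mean of a normal distribution truncated to (0, 60) or to values above 60. The sketch below implements those standard truncated-normal means using only the Python standard library. The function names and the example linear predictors (70 and 50) and the sigma of 10 are illustrative assumptions; only the sigma of 13.839 comes from the text.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_egfr_below_60(xb, sigma, lower=0.0, upper=60.0):
    """Mean of N(xb, sigma^2) truncated to the interval (lower, upper)."""
    z_low, z_up = (lower - xb) / sigma, (upper - xb) / sigma
    return xb + sigma * (norm_pdf(z_low) - norm_pdf(z_up)) / (norm_cdf(z_up) - norm_cdf(z_low))

def expected_egfr_above_60(xb, sigma, lower=60.0):
    """Mean of N(xb, sigma^2) truncated to values above lower."""
    z_low = (lower - xb) / sigma
    return xb + sigma * norm_pdf(z_low) / (1.0 - norm_cdf(z_low))

# Illustrative linear predictors; sigma = 13.839 is the value quoted for eGFR above 60.
egfr_above = expected_egfr_above_60(xb=70.0, sigma=13.839)
egfr_below = expected_egfr_below_60(xb=50.0, sigma=10.0)  # sigma here is a placeholder
```

Both functions return values inside the truncation range by construction, so a patient assigned to the eGFR <60 state can never be given a predicted value of 60 or more.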
For the same hypothetical individual in Table A7, if the patient did not progress to eGFR <60 ml/min/1.73 m² in year 1, the eGFR predicted for that patient for year 1 in this loop was then used to predict the probability that this patient progressed to eGFR <60 ml/min/1.73 m² the following year for the same loop. The eGFR for year 1 was also used to predict the exact eGFR value in the relevant Tobit model conditional on the prediction of the Weibull survival model. This was repeated for subsequent years for this patient in this loop and then repeated for multiple loops for the same patient, and then for other patients.
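The year-by-year control flow of the two-step approach can be sketched as follows. This is an illustration of the loop structure only: the three callables stand in for the Weibull probability and the two Tobit predictions, and the lambdas in the example are arbitrary placeholders, not the fitted models.

```python
import random

def simulate_egfr_path(egfr_start, years, prob_below_60,
                       egfr_if_below, egfr_if_above, rng):
    """Return one (below_60, egfr) tuple per simulated year for one patient in one loop.

    prob_below_60(egfr_prev, year): annual probability of progressing to eGFR < 60.
    egfr_if_below / egfr_if_above(egfr_prev, year): Tobit-style predictions.
    """
    below_60 = egfr_start < 60.0
    egfr = egfr_start
    path = []
    for year in range(1, years + 1):
        # Step 1: only patients not yet below 60 can progress; the state is absorbing.
        if not below_60 and rng.random() < prob_below_60(egfr, year):
            below_60 = True
        # Step 2: predict the eGFR value with the model matching the state.
        egfr = egfr_if_below(egfr, year) if below_60 else egfr_if_above(egfr, year)
        path.append((below_60, egfr))
    return path

# Placeholder models, for illustration only.
example = simulate_egfr_path(
    70.0, 5,
    prob_below_60=lambda egfr, year: 0.05,
    egfr_if_below=lambda egfr, year: min(0.97 * egfr, 59.0),
    egfr_if_above=lambda egfr, year: max(0.99 * egfr, 60.0),
    rng=random.Random(2024),
)
```

Each year's predicted eGFR is passed back in as the previous value for the following year, matching the description of how year 1 predictions feed into year 2 within the same loop.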

Notes regarding all time paths
Coefficients to more decimal places and those with an alternative ethnicity coding (which uses white as the baseline category, rather than other) are available from the corresponding author on request. A set of 5000 bootstrap estimates (with replacement) of all regression coefficients using this alternative ethnicity coding is also available on request, which could be used to propagate uncertainty around the risk factor trajectory equations within a model, such as UKPDS-OM2.
Since neither EXSCEL nor TECOS collected data on white blood cell count or post-baseline smoking, the equations estimated by Leal et al. could be used to project white blood cell count and smoking [6].
EXSCEL and TECOS were designed as glycaemic equipoise studies, i.e. "usual care physicians [were] encouraged to follow guidelines for care based on local and institutional practice patterns and any relevant published practice guidelines" [2]. This differs considerably from previous diabetes trials where individuals were prescribed a placebo without usual care; see, for example, Bethel 2020 [8]. Both trials had a highly pragmatic design with very few restrictions on concomitant treatment and compared usual care plus study drug against usual care plus placebo. Hence, data from these trials are likely to be valid for contemporary populations that are similar to those in the trials.
In addition to the study medication evaluated in EXSCEL and TECOS, there have been many other new drugs and many changes to clinical management and secular trends since the UKPDS era (1977-2002) that will affect risk factor time paths. For example, HbA1c may be affected by increased use of metformin, lower blood glucose targets, patient education, introduction of long-acting insulins and glitazones, as well as the newer drugs (e.g. empagliflozin) that were introduced during EXSCEL and TECOS follow-up. LDL will be affected by the introduction of statins and change of guidance recommending lower LDL targets, higher intensity statin regimens and use in a wider range of patients. People diagnosed with diabetes in 1977-1997 are also a different generation from most of those participating in EXSCEL and TECOS and may have been less active in old age and be more likely to have smoked.
For each patient, it is likely that any change in medication (whether this is exenatide/sitagliptin, or concomitant drug) will produce a step change in risk factors in the first 6-12 months after changing medication. While patients are on stable treatment, the change in risk factors is likely to be relatively stable but there may be changes due to concomitant medication, lifestyle factors and increasing duration of diabetes. For example, an individual EXSCEL study participant may initially receive exenatide plus metformin and have a gradual increase in HbA1c over time. If their HbA1c exceeds a target level two years later, their clinician may decide to add insulin, resulting in a reduction in HbA1c. After that fall, HbA1c levels may then gradually rise and their clinician may increase the insulin dose accordingly, resulting in another rapid fall and then a gradual rise. Other patients may stop medications (e.g. due to patient preference, hypoglycaemia or adverse events). However, many patients may have no change in glucose-lowering medication during the study period and may have relatively stable HbA1c. Across our cohorts, there will be patients starting or stopping many different medications at different times. Our time path equations aim to predict the average risk factor trend over time given the current package of interventions (and controlling for demographics and past risk factor levels).
Users of diabetes simulation models, such as UKPDS-OM2, will typically use our risk factor time paths to model how risk factors change over time after the effect of study interventions has been applied. For example, researchers may do a network meta-analysis of 48-week trials to estimate the reduction in HbA1c during the initial year of treatment. They may reduce HbA1c in year 1 of simulation based on the meta-analysis treatment effects and use our risk factor time paths to model how HbA1c changes over time from year 2 onwards. Researchers evaluating treat-to-target regimens may also assume that patients intensify treatment when our time path equations predict that patients' HbA1c has risen above a target level and may then apply an initial treatment effect, followed by a further period extrapolated using our time path equations.
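One way the usage pattern described above might be coded is sketched below: a treatment effect is applied in year 1, the time path equation is used from year 2 onwards, and an optional treat-to-target rule applies a further effect whenever the predicted value exceeds a target. The function name, the `next_value` callable (standing in for the time path equation) and all numbers are hypothetical.

```python
def project_hba1c(start, year1_effect, next_value, years,
                  target=None, intensify_effect=0.0):
    """Project HbA1c over a number of years.

    year1_effect: change applied in year 1 (e.g. from a meta-analysis; negative = reduction).
    next_value(prev): stand-in for the time path equation, returning next year's value.
    target / intensify_effect: optional treat-to-target rule applied from year 2 onwards.
    """
    path = [start + year1_effect]
    for _ in range(2, years + 1):
        value = next_value(path[-1])
        if target is not None and value > target:
            # Treatment is intensified when the predicted value exceeds the target.
            value += intensify_effect
        path.append(value)
    return path

# Illustration with a made-up upward drift of 0.1 per year.
drift = lambda prev: prev + 0.1
no_target = project_hba1c(8.0, -1.0, drift, years=3)
with_target = project_hba1c(8.0, -1.0, drift, years=3, target=7.15, intensify_effect=-0.5)
```

In a full model, `next_value` would wrap the relevant continuous risk factor equation with the patient's covariates, so the drift after each treatment effect follows the estimated time path.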
Supplementary material 3: Methods for QALY gains using current and previous risk equations extrapolated using UKPDS-OM2
The impact of different risk factor trajectories on QALYs was assessed by extrapolating risk factors for 2579 participants randomised to placebo in either EXSCEL or TECOS who had complete data on all risk factors at baseline.
For each participant, we took the pre-randomisation values of each risk factor and extrapolated using either the risk factor trajectories estimated in this paper, or those estimated by Leal et al. [6].
Since neither trial measured white blood cell count, baseline white blood cell count was imputed using a published algorithm estimated using the UKPDS trial dataset [9]. Post-baseline values for white blood cell count and for smoking (where there were no data post-baseline) were estimated from the baseline value using the trajectories estimated by Leal et al. [6]. Continuous risk factors were extrapolated in Stata version 17.0 and entered into the model directly as fixed values.
A bespoke version of UKPDS-OM version 2.2 was used, which incorporated risk factor time paths using the equations of Leal et al. [6] or those of the current paper. As described in Supplementary material 2, for binary risk factors (PVD, albuminuria, atrial fibrillation and smoking) and eGFR, accurate estimation of lifetime QALYs requires Monte Carlo simulation. For PVD, albuminuria, atrial fibrillation and eGFR <60 ml/min/1.73 m², the survival models predict the rate at which patients will develop risk factors, whereas the UKPDS-OM requires binary inputs for whether the patient has or has not developed that risk factor at each time point. Similarly, the logistic regression for smoking reported by Leal et al. [6] predicts whether patients are current smokers or not. In each loop of the model, the hazard (or log-odds) for developing each risk factor in the next year is estimated, converted to a probability and compared against random numbers to decide whether or not the patient develops that risk factor that year. Outcomes for the following year are estimated conditional on whether the patient developed it in the previous year. This approach ensures that the patient will develop each risk factor in the correct proportion of loops and ensures that the mean QALYs for each patient reflect that patient's true risk. By contrast, other methods, such as the highest probability approach, would introduce bias [10].
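The claim that the random-number approach yields the event in the correct proportion of loops can be checked with a small simulation. The sketch below uses a made-up constant annual probability of 2% over 10 years and works directly with probabilities rather than hazards or log-odds; the function names and all numbers are illustrative assumptions.

```python
import random

def simulate_onset_year(annual_probs, rng):
    """Return the first year in which the event occurs, or None if it never does."""
    for year, p in enumerate(annual_probs, start=1):
        if rng.random() < p:
            return year
    return None

def proportion_with_event(annual_probs, loops, seed=0):
    """Share of Monte Carlo loops in which the event occurs at some point."""
    rng = random.Random(seed)
    hits = sum(simulate_onset_year(annual_probs, rng) is not None for _ in range(loops))
    return hits / loops

def analytic_proportion(annual_probs):
    """Exact probability of the event occurring in at least one year."""
    survival = 1.0
    for p in annual_probs:
        survival *= 1.0 - p
    return 1.0 - survival

probs = [0.02] * 10                                  # hypothetical annual probabilities
simulated = proportion_with_event(probs, loops=20000, seed=1)
exact = analytic_proportion(probs)                   # 1 - 0.98**10, about 0.183
```

If the highest probability approach is read as assigning each year's more likely outcome, it would never assign the event in this example (every annual probability is below 0.5), even though about 18% of patients would be expected to develop it over 10 years.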
Results were extrapolated for 70 years using 100,000 loops. No bootstraps were used since the aim was to compare point estimates. The default utility values, estimated from the UKPDS sample [11], were used to estimate QALYs. No discounting was applied.

Figure A4. Cross-validation analyses comparing the performance of models fitted on EXSCEL data (Table A5) and models fitted on TECOS data (Table A6) against observed data from either EXSCEL or TECOS.

Figure A5. Predicted risk factors for male (blue lines) and female (red lines) control patients defined by the Mount Hood Registry extrapolated over a lifetime using the time paths estimated in the current study. Values were estimated using a beta version of UKPDS-OM2; predicted cases of binary endpoints allow for death as a competing risk.

Table A1. Comparison of the trials used in the study. Method for estimating eGFR: Modification of Diet in Renal Disease (MDRD) method in both trials.
Abbreviations: eGFR, estimated glomerular filtration rate; EXSCEL, Exenatide Study of Cardiovascular Event Lowering; TECOS, Trial Evaluating Cardiovascular Outcomes With Sitagliptin.

Table A3. Mean, SD and number of individuals (N) for each continuous risk factor during six follow-up years

Table A4. Mean and SD for each fifth of the population (first observed value post randomisation) of the continuous risk factors

Figure A1. Continuous risk factor values for the EXSCEL and TECOS study populations separately, using the time path models that were estimated on the pooled dataset (coefficients in Table 1). Observed values (black line) are shown with 95% confidence intervals (grey area) and simulated (red line) time paths.

Table A5. Coefficients for the models estimating annual risk factor values of continuous variables using solely EXSCEL data
Robust standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1

Table A6. Coefficients for the models estimating annual risk factor values of continuous variables using solely TECOS data

Table A7. Risk factor values for hypothetical individual at the start of the simulation. These are based on the Mount Hood reference case simulation [7].
* Assumed for the purposes of this simulation: not specified in the Mount Hood reference.

Estimating predictions for continuous risk factors
As described in the Methods, continuous risk factors, such as HbA1c, are predicted as
$$y_{i,t} = \beta_0 + \beta_1 y_{i,t-1} + \beta_2 y_{i,0} + \beta_3\,\text{female}_i + \beta_4\,\text{ethnicity}_i + \beta_5\,\text{age}_i + \beta_6 \ln(\text{duration}_{i,t})$$
For the individual in Table A7, the cumulative hazards at time t and time t+1 are estimated first, using Γ = 1.210, i.e. exp(ln(Γ)) in Table 2 of the manuscript.
For the hypothetical individual in Table A7, who was aged 66 with eGFR of 70 ml/min/1.73 m² and SBP of 145 mmHg at the start of the simulation, the linear predictor $x_i\beta$ at the end of the first year of simulation (nine years after diagnosis of diabetes) is calculated using the Tobit model for eGFR ≥60 ml/min/1.73 m² if the first step predicts eGFR ≥60 ml/min/1.73 m² in loop L.

Table A9. Example of BMI inputs for reference simulation

Table A10. Results of reference simulation: QALYs for each hypothetical individual defined in the Mount Hood reference. The differences between the results for the model we used (version 2.2 Global) and the 2.0 release version are negligible and result purely from Monte Carlo error.
LOCF, last observation carried forward (assuming no change in risk factors over 40 years, other than the increment applied at year one).
* Trajectories for smoking and WBC taken from Leal et al. [6].

Table A11. Coefficients for the models estimating annual risk factor values of continuous variables separately for individual study arms. For brevity, the table shows only those combinations of study and risk factor for which treatment allocation had a statistically significant effect (p<0.05) when added to the regression.