A model to estimate the lifetime health outcomes of patients with Type 2 diabetes: the United Kingdom Prospective Diabetes Study (UKPDS) Outcomes Model (UKPDS no. 68)



The aim of this study was to develop a simulation model for Type 2 diabetes that can be used to estimate the likely occurrence of major diabetes-related complications over a lifetime, in order to calculate health economic outcomes such as quality-adjusted life expectancy.


Equations for forecasting the occurrence of seven diabetes-related complications and death were estimated using data on 3642 patients from the United Kingdom Prospective Diabetes Study (UKPDS). After examining the internal validity, the UKPDS Outcomes Model was used to simulate the mean difference in expected quality-adjusted life years between the UKPDS regimens of intensive and conventional blood glucose control.


The model’s forecasts fell within the 95% confidence interval for the occurrence of observed events during the UKPDS follow-up period. When the model was used to simulate event history over patients’ lifetimes, those treated with a regimen of conventional glucose control could expect 16.35 undiscounted quality-adjusted life years, and those receiving treatment with intensive glucose control could expect 16.62 quality-adjusted life years, a difference of 0.27 (95% CI: −0.48 to 1.03).


The UKPDS Outcomes Model is able to simulate event histories that closely match observed outcomes in the UKPDS and that can be extrapolated over patients’ lifetimes. Its validity in estimating outcomes in other groups of patients, however, remains to be evaluated. The model allows simulation of a range of long-term outcomes, which should assist in informing future economic evaluations of interventions in Type 2 diabetes.


Computer simulation models are being used increasingly both to model the progression of Type 2 diabetes and to estimate lifetime outcomes associated with different disease management strategies. These models estimate the future occurrence of diabetes-related complications and quantify outcomes in terms of mean life expectancy or mean quality-adjusted life expectancy. They can also be used to estimate future healthcare costs of patients with Type 2 diabetes, but their main purpose is to estimate the cost-effectiveness of different disease management strategies, especially when evidence of the impact of interventions on surrogate endpoints is limited, or where evidence from clinical trials has to be extrapolated over patients’ lifetimes. Currently, there are at least five simulation models being used in these ways [1, 2, 3, 4, 5].

Existing simulation models have synthesised data from a variety of sources into a series of modules that are used to estimate occurrence of different complications. For example, the first model of the progression of Type 2 diabetes had separate modules for cardiovascular disease, retinopathy, nephropathy and neuropathy [1], and used a probabilistic Monte-Carlo analysis to simulate event histories over the remaining lifetimes of newly diagnosed patients with Type 2 diabetes. While that model represented a landmark in the use of computer simulation to model the progression of the disease, it had several limitations. In particular, it placed considerable reliance on data from a Type 1 diabetes trial [6] and on cardiovascular risk estimates derived from the Framingham cohort study, despite there being only 337 people with Type 2 diabetes in the Framingham study and consequent doubts concerning its predictive accuracy for such patients [7]. The Framingham cardiovascular risk equations for the general population were modified by a dichotomous variable indicating the presence or absence of diabetes. That first model was also unable to simulate the timing or occurrence of multiple complications and hence could not fully capture the clustering of complications that tends to arise [2]. While subsequent simulation models have been able to address some of these limitations [2], there is a need for a comprehensive integrated system of equations from a large and well-validated diabetes-specific data source to forecast major diabetes-related complications.

Observational analyses using data from the UKPDS have examined the relationship between exposure to glycaemia [8] and levels of systolic BP [9] over time, and the development of macrovascular and microvascular complications. The results of these studies, expressed in terms of the change in relative risk of major complications associated with decrements in HbA1c or systolic BP, were consistent with those obtained with regimens of intensive control of blood glucose and BP in the clinical trial. The UKPDS Risk Engine, a risk calculator for estimating absolute risk of coronary heart disease [10] and stroke [11], has been developed based on factors such as age, sex, race, smoking status, and levels of HbA1c, systolic BP and lipid ratios in the first 2 years after diagnosis of Type 2 diabetes.

The model presented here, called the UKPDS Outcomes Model, develops further the modelling of lifetime outcomes, through a system of equations, to predict the occurrence and timing of seven diabetes-related complications and death, and thereby to calculate life expectancy and quality-adjusted life expectancy for patients with Type 2 diabetes. A key aspect of this model is that it is designed to capture the association between different types of complications at an individual patient level. Complications may be associated not only because they share common known risk factors, but also due to the event-related dependence that arises when one complication substantially increases the likelihood of another [2]. For example, the probability of a patient experiencing congestive heart failure (CHF) or myocardial infarction (MI) is positively associated with systolic BP, but the risk of an MI is higher for patients with a history of CHF due to event-related dependence. To account for these dependencies the model makes use of time-varying risk factors, which also facilitates its application to patient groups at different stages of the disease.

Subjects and methods

Study subjects

The UKPDS was a clinical trial evaluating regimens with more intensive control of blood glucose and BP than had been described previously [12]. Briefly, between 1977 and 1991, 5102 patients with newly diagnosed Type 2 diabetes were recruited to the study. The subjects were aged between 25 and 65 years, were subsequently shown to have a fasting plasma glucose of above 6 mmol/l on two occasions, had no recent history of MI, ischaemic heart disease (IHD) or CHF, and had never had more than one major vascular event or a severe concurrent illness that would limit life expectancy. Clinical and biochemical measurements, including HbA1c, systolic BP, and lipid and lipoprotein fractions, were taken at entry to the study, at randomisation after 3 months of dietary therapy, and in each subsequent year. The systolic BP result was the average of three measurements taken at the same visit. Lipids were included in the model as the ratio of total cholesterol : HDL cholesterol. The methodology for the biochemical measurement of HbA1c in the UKPDS, and its standardisation with measures reported in the DCCT, have been reported elsewhere [13]. Included in this analysis were results from 3642 patients (white, Asian–Indian and Afro-Caribbean) for whom annual data on potential risk factors were available. The characteristics of these patients and comparisons with the 3867 patients included in the UKPDS glucose control study have been reported elsewhere [8].

The study design and protocol amendments, which are in accordance with the guidelines of the Declaration of Helsinki as revised in 2000, were approved by the Central Oxford Research Ethics Committee and by the equivalent committees at each centre. Each patient also gave informed witnessed consent.

Model structure

The aim of the UKPDS Outcomes Model is to estimate the first occurrence of each of seven diabetes-related complications (fatal or non-fatal MI, other IHD, stroke, heart failure, amputation, renal failure and eye disease measured in terms of blindness in one eye) and death in order to estimate lifetime outcomes and quality-adjusted life expectancy. The model is based on an integrated system of parametric equations used to estimate absolute risk of any of the above complications based on the patient’s characteristics (e.g. age and sex), and time-varying risk factors such as HbA1c and the patient’s history of complications. Definitions of all risk factors initially included in the model are listed in Table 1. To improve model stability, values for time-varying risk factors (HbA1c, systolic BP, and total cholesterol : HDL cholesterol) were calculated as the means of the values for the previous 2 years (e.g. the value for these variables in the fourth year was the average of the values from the second and third years), while smoking status was based on 3-year periods from diagnosis of diabetes. To model the long-term dependence [14] between different complications, time-varying covariates indicating whether a patient had had a history of other complications since diagnosis of diabetes were also included in the model. These covariates were set to 0 until an event occurred, and then set to 1 from that point onwards.
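The construction of the time-varying covariates described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the dictionary layout and the HbA1c values are hypothetical, and treating the event year itself as the first year with a history value of 1 is an assumption about what "from that point onwards" means.

```python
from statistics import mean


def lagged_mean(values_by_year, year, window=2):
    """Mean of the annual values in the `window` years preceding `year`.

    `values_by_year` maps a year since diagnosis (1, 2, 3, ...) to the
    measured value; the structure is hypothetical.
    """
    prior = [values_by_year[y] for y in range(year - window, year)
             if y in values_by_year]
    if not prior:
        raise ValueError("no earlier values available for this year")
    return mean(prior)


def history_indicator(event_year, year):
    """0 until the event has occurred, 1 from the event year onwards.

    `event_year` is None for a patient who never has the event.
    Counting the event year itself as 1 is an assumption.
    """
    return int(event_year is not None and year >= event_year)


# Illustrative HbA1c values (%), not UKPDS data.
hba1c = {1: 7.2, 2: 7.0, 3: 7.4, 4: 7.9}
print(lagged_mean(hba1c, 4))      # mean of the values from years 2 and 3
print(history_indicator(5, 4))    # event in year 5: no history yet in year 4
```

Averaging over the two preceding years smooths out single-visit measurement noise, which is one plausible reading of the stated aim of improving model stability.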

Table 1 Abbreviations, definitions and values of variables included in the model

The procedures used to fit equations to the observed events in the UKPDS population, as well as the estimated parameters, are reported in greater detail in the statistical appendix. In brief, each type of diabetes-related event was modelled using one or more equations that included time-varying risk factors. In the case of diabetes-related complications, a Weibull proportional hazards regression was used to model the occurrence of a composite outcome covering both fatal and non-fatal events. The coefficients for risk factors were then estimated using maximum likelihood methods that account for censoring (e.g. due to loss to follow-up, or death). Risk factors with a p value of less than 0.05 were considered statistically significant. Separate equations were used to model diabetes- and non-diabetes-related mortality using a combination of Gompertz and logistic regression equations. Finally, the time paths of four risk factors were also estimated using these data. To aid exposition, we grouped the estimated equations into the following categories.
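To make the link between a Weibull proportional hazards equation and an annual event probability concrete, the sketch below uses one common parameterisation, with cumulative hazard H(t) = exp(xβ) · t^γ, where xβ is the linear predictor including the intercept and γ is the shape parameter. This parameterisation, the function names and the numerical values are assumptions made for illustration; the exact form used in the model is given in the statistical appendix.

```python
import math


def weibull_cum_hazard(t, linear_predictor, shape):
    """Cumulative hazard H(t) = exp(x*beta) * t**shape (assumed form)."""
    return math.exp(linear_predictor) * t ** shape


def annual_event_probability(t, linear_predictor, shape):
    """Probability of a first event in (t, t + 1], given none by time t."""
    h_start = weibull_cum_hazard(t, linear_predictor, shape)
    h_end = weibull_cum_hazard(t + 1, linear_predictor, shape)
    return 1.0 - math.exp(h_start - h_end)


# Hypothetical linear predictor and shape, not the fitted UKPDS values.
lp = math.log(0.02)
for year in (1, 5, 10):
    print(year, round(annual_event_probability(year, lp, 1.5), 4))
```

Under this form, adding a coefficient β to the linear predictor multiplies the cumulative hazard by exp(β) at every time point, which is the proportional hazards property.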

Cardiovascular disease

The increased risk of cardiovascular disease among people with Type 2 diabetes is well established, and cardiovascular disease is increasingly recognised as one of the major hazards of the condition. In developed countries, cardiovascular disease accounts for the majority of deaths amongst patients with Type 2 diabetes [15] and is a significant contributor to diabetes-related healthcare costs [16]. Within the cardiovascular disease category we have three Weibull equations to estimate the absolute risk of the following: (i) MI, defined as non-fatal myocardial infarction (International Classification of Diseases, Ninth Revision [ICD-9] code 410), or fatal vascular cardiac event (ICD-9 codes ≥410–≤414.9, or ≥428–≤428.9), or sudden death (ICD-9 codes ≥798–≤798.9); (ii) IHD, defined as an ICD-9 code of ≥411–≤414.9; and (iii) CHF, defined as an ICD-9 code of ≥428–≤428.9. We only recorded IHD and CHF events if they occurred prior to an MI event; a patient who had experienced an MI was not classified subsequently as developing IHD.

Cerebrovascular disease

People with Type 2 diabetes have a higher risk of stroke, and previous epidemiological studies have found that this risk increases with age, elevated BP, smoking and lipoprotein levels [17]. In our model a single Weibull equation was used to estimate the absolute risk of a first non-fatal stroke (ICD-9 codes ≥430–≤434.9, or 436) or fatal stroke (ICD-9 codes ≥430–≤438.9).


Amputation

Patients with diabetes have an amputation rate that is many times higher than that of patients without diabetes, and previous studies have found that the primary risk factors for amputation include age, (male) sex, smoking, the presence of peripheral vascular disease, and BP [18]. Our model contains a single Weibull equation to estimate the absolute risk of a first amputation of a digit or limb (ICD-9 codes ≥5.845–≤5.848, or 250.6) or a fatal peripheral vascular event (ICD-9 codes 997.2, 997.6, 250.6 or 440.2).


Blindness

The model contains a Weibull equation to estimate the risk of blindness in one eye, which is defined as a visual acuity Snellen 6/60 or ETDRS logMAR 1.0 or worse for any reason, persisting for 3 months (ICD-9 codes ≥369–≤369.9).


Renal failure

Increasing duration of diabetes, hypertension and poor glycaemic control have been shown to elevate the risk of renal disease [19] and the subsequent risk of mortality [20]. Our model contains a single Weibull equation to predict renal failure, defined as creatinine levels of above 250 µmol/l, not ascribable to any acute inter-current illness (ICD-9 codes 250.3, and ≥585–≤586), and death due to renal failure (ICD-9 codes ≥580–≤593.9).

Risk factor progression

Previous simulation models of the progression of diabetes have made a variety of assumptions regarding how risk factors such as HbA1c and systolic BP change over time [2, 4]. While carrying forward current values provides a simple means of extrapolation [21, 22], it does not account for the likely change in some of these risk factors over time (e.g. the upward trend in HbA1c levels that many patients experience). In the UKPDS Outcomes Model, we treat the four risk factors (HbA1c, systolic BP, total cholesterol : HDL cholesterol, and smoking status) as panel data (longitudinal data), and fit a random effects model to allow estimation of a time path [23]. While the exact specification differs between risk factors, these equations generally include the value or status of the risk factor at the time a decision was made regarding randomisation in the UKPDS (which took place after a 3-month dietary run-in) [24], the number of years since diagnosis, and the value of the risk factor in the preceding period. For HbA1c an indicator variable for the second year was also included to account for the initial decline in HbA1c observed in the study. Equations for HbA1c, systolic BP, and total cholesterol : HDL cholesterol were based on annual measures of each risk factor, while smoking status (due to its less frequent ascertainment) was based on 3-year periods from diagnosis of diabetes.
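The structure of such a risk factor equation can be illustrated with a short Python sketch that projects an HbA1c time path from the value at randomisation, the years since diagnosis, the value in the preceding period and a second-year indicator. The linear form, the way years since diagnosis enters, and every coefficient below are hypothetical placeholders rather than the fitted values in Table 4, and the patient-level random effect is set to zero so that only the expected path is shown.

```python
def next_hba1c(year, previous, baseline, coef):
    """One-step update of HbA1c (%) for a given year since diagnosis."""
    second_year = 1.0 if year == 2 else 0.0
    return (coef["intercept"]
            + coef["years"] * year
            + coef["lagged"] * previous
            + coef["baseline"] * baseline
            + coef["second_year"] * second_year)


def project_path(baseline, n_years, coef):
    """Expected HbA1c path; the year 1 value is taken to be the baseline."""
    path = [baseline]
    for year in range(2, n_years + 1):
        path.append(next_hba1c(year, path[-1], baseline, coef))
    return path


# Hypothetical coefficients chosen only to give a plausible-looking path.
coef = {"intercept": 0.9, "years": 0.03, "lagged": 0.8,
        "baseline": 0.1, "second_year": -0.3}
print([round(v, 2) for v in project_path(8.0, 6, coef)])
```

Because each value depends on the preceding one, the path is generated recursively, one annual cycle at a time, which matches how the simulation updates risk factors for surviving patients.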


Mortality

People diagnosed with Type 2 diabetes in middle age typically have a life expectancy of between 5 and 10 years shorter than people without diabetes, with heart disease being the major underlying cause of death [25]. Our model contains three equations for estimating absolute risk of mortality. Two of these equations estimate the likelihood of death after the occurrence of several diabetes-related complications that have been shown to elevate the risk of mortality. The first equation, based on logistic regression, estimates the probability of death in the first year in which either an MI, CHF, stroke, amputation or renal failure first occurs (as defined in the Weibull models outlined above). The second equation estimates the risk of diabetes-related mortality of patients with a history of any of these events in all subsequent years. The third mortality equation estimates the risk of death from causes unrelated to diabetes (primarily due to cancers and accidents).
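The two functional forms named for the mortality equations can be sketched as follows. The logistic function converts a linear predictor into a probability of death in the year of the event; for the remaining equations the sketch assumes a Gompertz cumulative hazard H(t) = exp(xβ) · (exp(φt) − 1) / φ evaluated on current age. The choice of age as the time scale, the parameter names and the numerical values are assumptions for illustration only.

```python
import math


def logistic_probability(linear_predictor):
    """Probability implied by a logistic regression linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def gompertz_cum_hazard(age, linear_predictor, phi):
    """Cumulative hazard H(age) = exp(x*beta) * (exp(phi*age) - 1) / phi."""
    return math.exp(linear_predictor) * (math.exp(phi * age) - 1.0) / phi


def gompertz_annual_probability(age, linear_predictor, phi):
    """Probability of death in (age, age + 1], given survival to `age`."""
    h_start = gompertz_cum_hazard(age, linear_predictor, phi)
    h_end = gompertz_cum_hazard(age + 1, linear_predictor, phi)
    return 1.0 - math.exp(h_start - h_end)


# Hypothetical parameters, not the fitted values in Table 3.
print(round(logistic_probability(-1.0), 3))
print(round(gompertz_annual_probability(60, -10.0, 0.1), 4))
print(round(gompertz_annual_probability(70, -10.0, 0.1), 4))
```

With a positive φ the annual probability of death rises with age, which is why the Gompertz form is a natural candidate for mortality.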

Using the model to simulate outcomes

The main purpose of the model is to estimate the likely occurrence of the major diabetes-related complications over a lifetime for patients with specified prognostic risk factors, in order to calculate health outcomes such as life expectancy or quality-adjusted life expectancy.

While increases in mean life expectancy are widely recognised as a useful measure of benefit [26], it is also important to encapsulate the potential impact of complications on quality of life. This model uses the quality-adjusted life-years (QALY) method to adjust length of life for quality of life by assigning a value or health utility, on a scale on which 0 represents death and 1 represents full health, for each year of life.

The impact on utility of different diabetes-related complications has been reported in a number of studies, including one study in which the EQ-5D health status questionnaire [27] was used to survey 3192 patients still participating in the UKPDS in 1997 [28]. Based on data from that survey, the mean utility for patients free of microvascular and macrovascular complications was estimated at 0.78. Patients with a history of complications have a lower utility and the following decrements were estimated: −0.055 for an MI; −0.090 for other IHD; −0.164 for stroke; −0.108 for heart failure; −0.280 for amputation; and −0.074 for blindness in one eye. For simplicity and in the absence of sufficient data to estimate empirically, it is assumed that multiple complications have an additive effect on utility, e.g. the utility of a patient who had other IHD and then had an MI would first be decremented by 0.090 and then by a further 0.055. However, the UKPDS Outcomes Model is sufficiently flexible to allow other assumptions to be adopted regarding the impact that different complications (and combinations of complications) have on quality of life.

When estimating QALYs it is important to model the sequence of endpoints for each patient. For example, consider the simulated profiles for two patients who are free of complications: the first patient is predicted to have an amputation in the second year post follow-up, IHD in the fourth year, and a fatal MI in the fifth year; the second patient is predicted to have IHD in the third year post follow-up, an amputation in the fourth year, and a fatal MI in the fifth year. Using the utility weights reported above, while both patients experience the same set of diabetes-related complications over their remaining lifetimes, the QALY profile is different: the profile of the first patient is equivalent to 2.19 QALYs, and the second is equivalent to 2.66 QALYs. By simulating individual patient histories, the model is able to take these differences into account.
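The arithmetic behind the two profiles can be checked with a short Python sketch. It reads the first patient's profile as an amputation in year 2, IHD in year 4 and a fatal MI in year 5, and makes two assumptions that the text does not state explicitly: a decrement applies from the year in which the event occurs, and the year of death contributes no QALYs. Under those assumptions the additive utility weights reported above reproduce the quoted totals. The function and variable names are hypothetical.

```python
BASE_UTILITY = 0.78  # mean utility for a patient free of complications

# Utility decrements reported from the EQ-5D survey of UKPDS patients.
DECREMENT = {
    "MI": 0.055,
    "IHD": 0.090,
    "stroke": 0.164,
    "CHF": 0.108,
    "amputation": 0.280,
    "blindness": 0.074,
}


def qalys(events_by_year, death_year):
    """Sum annual utilities up to, but not including, the year of death."""
    total = 0.0
    utility = BASE_UTILITY
    for year in range(1, death_year):
        for event in events_by_year.get(year, []):
            utility -= DECREMENT[event]  # additive effect of each new event
        total += utility
    return total


first = qalys({2: ["amputation"], 4: ["IHD"]}, death_year=5)
second = qalys({3: ["IHD"], 4: ["amputation"]}, death_year=5)
print(round(first, 2), round(second, 2))  # 2.19 2.66
```

The first patient carries the large amputation decrement for three of the four years lived, whereas the second carries it for only one, which accounts for the difference of about 0.47 QALYs.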

Simulations are based on a probabilistic discrete-time illness–death model [14] with annual cycles. In the model, patients start with a given health status (e.g. no complications) and can have one or more non-fatal complications and/or can die in any model cycle. Figure 1 provides an algorithm that illustrates the sequence of modelling events. To begin, the values for all the characteristics for each patient included in the simulation are set (the variables representing these characteristics are listed in Table 1). The first cycle of the model is then run, in which the probability of death or of experiencing the seven complications is calculated. Each probability is compared with a random number drawn from a uniform distribution ranging from 0 to 1 to determine whether an event occurs.

Fig. 1

Algorithm for model simulation

There is a need to account for event-related dependence between some complications, as noted above. The actual event-related dependencies were estimated empirically during the equation-fitting procedure, and included if they had a significance level (p value) of less than 0.05. We then checked all significant dependencies against the known epidemiology, as discussed in more detail below. In the model, if a non-fatal event is predicted to have occurred, the patient acquires a history of that event (the time-varying covariate for that endpoint is updated from 0 to 1), and thereafter carries that history in their set of characteristics. In addition, as noted above, the two diabetes-related mortality equations are applied to all patients who have a history of complications that were found to elevate the risk of death (MI, CHF, stroke, amputation or renal failure) from the point at which the complication is predicted to have occurred. These equations deal with the likelihood of immediate death following the first event, and the long-term elevation of risk of death consequent to acquiring a history of these complications. Finally a non-diabetes mortality equation is used to forecast the incidence of death from cancer, accidents or other causes not addressed elsewhere in the model.

If the model predicts that a death has occurred, the years lived and quality-adjusted years lived (life years decremented in each cycle by the reduced quality of life associated with particular complications) by that patient are calculated. If the patient is predicted to have survived that cycle, the risk factor equations are used to update their current risk factor values (e.g. systolic BP) and these are carried forward to the next cycle of the model along with the updated event history.

It is important to note that the order in which the event equations listed in Figure 1 (equations 1 to 10) are evaluated to determine the occurrence of an event is not predetermined. Further, some of these events are competing risks (e.g. if a patient dies within a cycle of the model, they can have no additional events). To take this into account, the equations are run in random order in each cycle.
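A single annual cycle of this procedure can be sketched in Python as below. The sketch is a simplification under stated assumptions: death is represented by one entry rather than by the three separate mortality equations, event history is updated as soon as an event is drawn, and the patient record layout and function names are hypothetical. Each equation is supplied as a function that returns an annual probability for the patient's current characteristics.

```python
import random


def run_cycle(patient, event_equations, rng):
    """Evaluate the event equations once, in random order.

    `patient` is a dict with keys "alive" and "history" (hypothetical layout).
    `event_equations` maps an event name to a function returning the annual
    probability of that event for the patient.
    """
    equations = list(event_equations.items())
    rng.shuffle(equations)  # order of evaluation is not predetermined
    for name, probability_fn in equations:
        if patient["history"].get(name):
            continue  # only the first occurrence of each event is modelled
        if rng.random() < probability_fn(patient):
            if name == "death":
                patient["alive"] = False
                break  # competing risk: no further events after death
            patient["history"][name] = 1  # history indicator switches to 1
    return patient


rng = random.Random(2004)
patient = {"alive": True, "history": {}}
equations = {
    "MI": lambda p: 0.03 * (4.74 if p["history"].get("CHF") else 1.0),
    "CHF": lambda p: 0.01,
    "death": lambda p: 0.02,
}  # illustrative probabilities only
years = 0
while patient["alive"] and years < 100:
    run_cycle(patient, equations, rng)
    years += 1
print(years, patient["history"])
```

Shuffling the equations within each cycle avoids systematically favouring whichever event happens to be listed first when events compete.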

Internal validation of the model using the UKPDS population

We tested the consistency of the forecast cumulative incidence of different complications and death with the cumulative incidence calculated using non-parametric (life-table) methods. This test of internal validity [29] was based on the forecast number of events from the second to the twelfth year after diagnosis, using the information from the first year after diagnosis in the 3642 patients used to estimate the model [11].

Application of the model

We demonstrate two potential applications of the model. Firstly, we used the simulation model to extrapolate outcomes over the remaining lifetime of the 3642 UKPDS patients used to estimate the model. Estimated outcomes are stratified by sex and 5-year age categories for patients over 45 years of age at diagnosis of diabetes. Secondly, we calculated the incremental benefit of a regimen of intensive blood glucose control compared with conventional therapy for patients’ remaining lifetimes in terms of life expectancy and quality-adjusted life expectancy. Previously, in the absence of a full outcomes model, we reported the cost-effectiveness of this intervention only in terms of time free of diabetes-related complications [30] using all patients randomised in that comparison rather than the 3642 used here. Given the illustrative nature of these applications we did not discount outcomes.

Handling uncertainty

Appropriate handling of uncertainty in complex patient simulation models such as the UKPDS Outcomes Model is not straightforward. In applications involving extrapolation of outcomes of patients who were alive at the end of the study, a combination of bootstrap and multiple imputation methods was used. The confidence intervals around the QALY estimates were adjusted to take into account the variance within and across imputed data sets [31]. Our method removes Monte-Carlo error, so that confidence intervals from the outcomes model reflect parameter uncertainty in the model.
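One standard way of combining variance within and across imputed data sets is Rubin's combining rules, in which the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance, for m imputed data sets. The sketch below assumes that this is the kind of adjustment intended, which the text does not spell out; the function name and the example figures are hypothetical.

```python
from statistics import mean, variance


def pool_imputations(estimates, variances):
    """Combine per-imputation estimates and variances (Rubin's rules).

    `estimates` holds one point estimate per imputed data set and
    `variances` the corresponding within-imputation variances.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, total


# Hypothetical QALY differences and variances from five imputed data sets.
estimate, total_variance = pool_imputations(
    [0.25, 0.30, 0.27, 0.24, 0.29],
    [0.14, 0.15, 0.15, 0.14, 0.16],
)
print(round(estimate, 3), round(total_variance, 4))
```

The between-imputation term is what widens the interval to reflect the fact that the unobserved part of each patient's history had to be imputed.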


Results

The median follow-up time of patients in the UKPDS was 10.3 years with 39 460 person-years of follow-up data available for model fitting. The coefficients representing the log of the hazard ratios (or the log of the odds ratio in the case of the logistic equations representing event fatality [equation 8] and smoking status [equation 14]), and the standard errors for each coefficient are listed in Table 2 (event equations), Table 3 (mortality equations) and Table 4 (risk factor equations). The numbers of subjects in which each equation was estimated are also shown. In three event equations (IHD, CHF and stroke), a small number of patients with missing information were omitted from that particular estimation, but the eliminated cases represent less than 1% of the total.

Table 2 Sample size, functional form, parameters and beta coefficients (SEs) for seven equations to estimate the probability of diabetes-related complications
Table 3 Sample size, functional form, parameters and beta coefficients (SEs) for three equations to estimate the probability of mortality
Table 4 Sample size, functional form, parameters and beta coefficients (SEs) for four equations used to estimate annual risk factor values from diagnosis of diabetes

The relationship between the equations is illustrated in Figure 2 along with the estimated hazard or odds ratio for each risk factor. When interpreting these ratios it is important to note that some of the risk factors have been subject to a log transformation. The arrows in the figure linking different equations indicate the existence and the direction of event-related dependence, e.g. the arrow from CHF to MI indicates that an individual who has experienced CHF subsequently has a 4.74-fold increased risk of MI.
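The relationship between a coefficient and the ratio it implies can be illustrated in Python. For an untransformed risk factor, the hazard (or odds) ratio for a one-unit increase is exp(β); for a risk factor entered as a natural logarithm, which is the assumed form of transformation here, a k-fold change in the untransformed value multiplies the hazard by k^β. The function names and the example coefficient of 0.5 are hypothetical; the only figure taken from the text is the 4.74-fold ratio for CHF to MI.

```python
import math


def hazard_ratio(beta, delta=1.0):
    """Ratio for a change of `delta` units in an untransformed covariate."""
    return math.exp(beta * delta)


def hazard_ratio_log_covariate(beta, fold_change):
    """Ratio for a `fold_change`-fold change in a covariate entered as ln(x)."""
    return fold_change ** beta


# Coefficient implied by the 4.74-fold increase in MI risk after CHF.
beta_chf = math.log(4.74)
print(round(beta_chf, 3))                          # about 1.556
print(round(hazard_ratio(beta_chf), 2))            # recovers 4.74

# Hypothetical coefficient of 0.5 on a log-transformed risk factor:
# doubling the risk factor multiplies the hazard by 2 ** 0.5.
print(round(hazard_ratio_log_covariate(0.5, 2.0), 3))
```

This is why a ratio attached to a log-transformed risk factor cannot be read as the effect of a one-unit increase in the original measurement.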

Fig. 2

Summary of model equations showing event-related dependencies and hazard/odds ratio for each risk factor. The hazard/odds ratio shown for each risk factor is the exponential of the corresponding beta coefficient shown in Tables 2 and 3

Respectively for the risk factors HbA1c, systolic BP, total cholesterol : HDL cholesterol and smoking status, Figure 3a–d shows: (i) the actual mean value of the risk factor (for the 3642 patients used to estimate the model); (ii) the predicted values, using the equations given in Table 4, for patients with differing values of the relevant risk factor at the time of the decision regarding randomisation in the UKPDS. For HbA1c, which is shown in Figure 3a, it is evident that patients with an initial value of between 6% and 8% subsequently follow a time path similar to the mean of the observed values for all patients, while the time path for patients with higher initial values differs from the mean of the observed values for all patients, possibly reflecting more intensive treatment aimed at improving the metabolic control of these high-risk patients.

Fig. 3

Observed and simulated time paths for (a) mean HbA1c, (b) systolic BP and (c) total cholesterol : HDL cholesterol, from diagnosis of diabetes, as well as for (d) smoking status

Figure 4 shows the simulated and actual cumulative incidence for each of the diabetes-related events, and all-cause mortality up to 12 years from diagnosis. The estimated number of patients experiencing each event is initially based on baseline levels of risk factors and thereafter on the time paths predicted by the time-varying risk factor equations. The predicted numbers in all cases are within the 95% confidence intervals surrounding the actual cumulative incidence curves for the study population.

Fig. 4

Observed and simulated cumulative incidence for each of seven diabetes-related complications, and all-cause mortality, for 12 years from diagnosis of diabetes. a. Equation 1, ischaemic heart disease. b. Equation 2, myocardial infarction. c. Equation 3, heart failure. d. Equation 4, stroke. e. Equation 5, amputation. f. Equation 6, blindness. g. Equation 7, renal failure. h. Equations 8, 9 and 10, mortality. Black solid lines, observed; red dotted lines, estimated; grey dotted lines, observed upper and lower 95% CIs

Two illustrative applications of the model

Estimates of mean life expectancy and mean quality-adjusted life expectancy by age group and sex are listed in Table 5. For example, the mean life expectancy for males between the age of 45 and 50 years at diagnosis of diabetes is 25.54 years (95% CI: 21.92–29.66). This compares with 30.05 years remaining life expectancy for males between the age of 45 and 50 years in the UK general population in 1999 to 2001 [32], suggesting a loss of approximately 5 years of life attributable to Type 2 diabetes, in line with previously reported estimates [25]. The corresponding mean quality-adjusted life expectancy is estimated to be 18.82 (95% CI: 16.48–21.15) QALYs. As in the general population, women have a higher life expectancy at all ages. It is important to note that life expectancies are for cohorts in the UKPDS, and given the criteria for inclusion in the study (e.g. no recent history of MI), these patients may have longer life expectancies than some diabetic cohorts in the general population. Further, reported confidence intervals are for the life expectancies of cohorts, and the estimated life expectancy for any individual patient would be subject to a much greater level of uncertainty.

Table 5 Estimated mean life expectancy and mean quality-adjusted life expectancy by age at diagnosis of diabetes in the UKPDS population

Table 6 reports the quality-adjusted life expectancy for UKPDS patients who were treated with a regimen of either intensive or conventional blood glucose control under the trial protocol. The quality-adjusted life expectancy for the conventional group was 16.35 QALYs, and that for the intensive group was 16.62 QALYs, a difference of 0.27 (95% CI: −0.48 to 1.03) QALYs. Such estimates will permit the calculation of cost-effectiveness.

Table 6 Estimated life expectancy and QALYs for UKPDS patients on conventional and intensive blood glucose control


Discussion

Simulation models such as the UKPDS Outcomes Model make it possible to quantify the lifetime impact of chronic diseases and the long-term benefits associated with alternative management strategies. They are therefore an important adjunct to randomised controlled trials and other types of epidemiological study, and may also be of value in disease management and healthcare planning. They are essential for proper evaluation of interventions affecting any of the risk factors incorporated in the model, because they permit users to estimate the likely occurrence, over a lifetime, of the main diabetes-related complications. Only when this has been done can quality-adjusted life expectancy and lifetime treatment costs be fully estimated, and the cost-effectiveness of the intervention calculated.

As noted in the Introduction, a number of other diabetes simulation models have been developed, and the present model has benefited greatly from the insights these have provided.

The UKPDS Outcomes Model has some important advantages over some previous simulation models: (i) we used data from 3642 patients who participated in the UKPDS; (ii) there was a median follow-up period of 10.3 years; (iii) we analysed full clinical histories including biochemical measures and ascertainment of major diabetes-related complications; and (iv) there was a very low loss of data during follow-up. The quantity and quality of these data, and the fact that they have come from a single study, have obviated the need to synthesise evidence from many disparate sources. Indeed, we have also been able to use this patient population to estimate the impact of the same diabetes-related complications on subsequent healthcare costs and on quality of life. These results have been reported elsewhere [28] but in time will be integrated with the outcome model described here to form part of a more general UKPDS diabetes health economics model.

The volume of data available to us also removed many constraints on the modelling approach adopted and allowed us to construct a model based on a set of equations estimated using a consistent methodology. In particular, our use of time-varying covariates allowed us to model the linkages between different diabetes-related complications. These have increasingly been recognised as crucial to such simulation models: as a recent study has noted “...it can be catastrophically inaccurate to simulate each kind of event as if it were statistically independent” [2]. We have attempted to address this issue by incorporating linkages where there is a high degree of statistical significance and supporting clinical and epidemiological evidence. For example, our results for stroke are consistent with those from the Framingham study showing that a history of heart failure and atrial fibrillation increase the risk of subsequent stroke [17]. The model may also be used to simulate outcomes in populations with differing baseline characteristics such as age, smoking status, systolic BP, blood lipids, HbA1c and history of diabetes-related complications.

A number of limitations of the UKPDS Outcomes Model should be acknowledged. Firstly, the model only predicts the first event in any single category of diabetes-related complications, and does not allow series of events such as sequential amputations to be modelled directly. However, this limitation should not be overstated, as: (i) such multiple events in the UKPDS data were relatively infrequent; (ii) subsequent fatal events in specific categories of diabetes-related complications are included in the diabetes-related mortality equation; and (iii) additional post-study monitoring data will in time allow this issue to be revisited.

Secondly, the model does not explicitly incorporate a number of morbidities related to diabetes. In particular, peripheral neuropathy and ulceration may give rise to significant reduction in quality of life, to hospitalisation and to high treatment costs [33]. These conditions are also likely to elevate the risk of subsequent events such as amputation. However, these were not included as major endpoints in the UKPDS and cannot easily be incorporated into the outcomes model. Hypoglycaemia and hyperglycaemia are also excluded. Episodes of these conditions are infrequent and rarely serious, but more serious morbidity could occur in small numbers of patients undertaking intensive glycaemic management.

Thirdly, some complications of diabetes are represented in the model using a single state. For example, the only state representing eye disease in the model is the endpoint of blindness in one eye. Again, this is consistent with the primary analyses of the UKPDS, but it almost certainly fails to fully describe the complex process of disease progression and associated treatments, as well as the potential impact on quality of life. There is scope to develop a more detailed model of the progression of eye disease in future work.

Fourthly, we have briefly described the methods used to estimate confidence intervals around our estimates, but there are many important methodological and computational issues in the generation of these confidence intervals and more generally in the ways uncertainty should be handled in a complex model; these need to be explored in greater detail in the future.

We expect the UKPDS Outcomes Model to develop in a number of ways as further follow-up data become available. The greater length of follow-up will allow the shape of the hazard in the Weibull and Gompertz equations to be estimated with greater precision. The risk factors and linkages contained in the model can be re-examined, especially the equations that are currently based on a relatively small number of events (e.g. only 24 patients had renal failure).

The UKPDS Outcomes Model has been described here in some detail to demonstrate the rationale for our approach, the problems encountered and the remaining limitations. The coefficients of all equations and a description of the simulation algorithm have been published to facilitate a high degree of transparency and to promote a deeper understanding of the model. In this study we have demonstrated that the UKPDS Outcomes Model is able to simulate event histories that closely match observed outcomes of patients in the UKPDS, and that it can also be used to extrapolate lifetime outcomes. The next challenge is to test external validity by applying the model to other populations of Type 2 diabetic patients. Until this has been undertaken, caution should be exercised when using the model to predict outcomes in these populations. We look forward to the assessment of the model’s external validity and to its future use in assessing outcomes of therapies for Type 2 diabetes.



Abbreviations

CHF: congestive heart failure
ETDRS: Early Treatment Diabetic Retinopathy Study
ICD-9: International Classification of Diseases, Ninth Revision
IHD: ischaemic heart disease
logMAR: logarithm of the minimum angle of resolution
MI: myocardial infarction
QALYs: quality-adjusted life years
UKPDS: United Kingdom Prospective Diabetes Study


  1. Eastman RC, Javitt JC, Herman WH et al. (1997) Model of complications of NIDDM. I. Model construction and assumptions. Diabetes Care 20:725–734
  2. Brown JB, Russell A, Chan W, Pedula K, Aickin M (2000) The global diabetes model: user friendly version 3.0. Diabetes Res Clin Pract 50 [Suppl 3]:S15–S46
  3. Palmer AJ, Brandt A, Valerio G, Weiss C, Stock H, Wenzel H (2000) Outline of a diabetes disease management model. Principles and applications. Diabetes Res Clin Pract 50 [Suppl 3]:S47–S56
  4. Bagust A, Hopkinson PK, Maier W, Currie CJ (2001) An economic model of the long-term health care burden of Type II diabetes. Diabetologia 44:2140–2155
  5. CDC Group (2002) Cost-effectiveness of intensive glycemic control, intensified hypertension control, and serum cholesterol level reduction for type 2 diabetes. JAMA 287:2542–2551
  6. DCCT (1996) Lifetime benefits and costs of intensive therapy as practiced in the diabetes control and complications trial. The Diabetes Control and Complications Trial Research Group. JAMA 276:1409–1415
  7. Yeo WW, Yeo KR (2001) Predicting CHD risk in patients with diabetes mellitus. Diabet Med 18:341–344
  8. UKPDS Group (2000) Association of glycaemia with macrovascular and microvascular complications of type 2 diabetes (UKPDS 35): prospective observational study. BMJ 321:405–412
  9. Adler AI, Stratton IM, Neil HA et al. (2000) Association of systolic blood pressure with macrovascular and microvascular complications of type 2 diabetes (UKPDS 36): prospective observational study. BMJ 321:412–419
  10. Stevens RJ, Kothari V, Adler AI, Stratton IM (2001) The UKPDS risk engine: a model for the risk of coronary heart disease in Type II diabetes (UKPDS 56). Clin Sci (Lond) 101:671–679
  11. Kothari V, Stevens RJ, Adler AI et al. (2002) UKPDS 60: risk of stroke in type 2 diabetes estimated by the UK Prospective Diabetes Study risk engine. Stroke 33:1776–1781
  12. UKPDS Group (1991) UK Prospective Diabetes Study VIII: study design, progress and performance. Diabetologia 34:877–890
  13. UKPDS Group (1994) Biochemical risk factors in type 2 diabetic patients at diagnosis compared with age-matched normal subjects. Diabet Med 11:534–544
  14. Hougaard P (2000) Analysis of multivariate survival data (Statistics for biology and health). Springer, Berlin
  15. American Diabetes Association (1998) Economic consequences of diabetes mellitus in the U.S. in 1997. Diabetes Care 21:296–309
  16. Brown JB, Pedula KL, Bakst AW (1999) The progressive cost of complications in type 2 diabetes mellitus. Arch Intern Med 159:1873–1880
  17. Wolf PA, D’Agostino RB, Belanger AJ, Kannel WB (1991) Probability of stroke: a risk profile from the Framingham Study. Stroke 22:312–318
  18. Boyko EJ, Ahroni JH, Stensel V, Forsberg RC, Davignon DR, Smith DG (1999) A prospective study of risk factors for diabetic foot ulcer. The Seattle Diabetic Foot Study. Diabetes Care 22:1036–1042
  19. Brancati FL, Whelton PK, Randall BL, Neaton JD, Stamler J, Klag MJ (1997) Risk of end-stage renal disease in diabetes mellitus: a prospective cohort study of men screened for MRFIT. Multiple Risk Factor Intervention Trial. JAMA 278:2069–2074
  20. UKPDS Group (2003) Development and progression of nephropathy in Type 2 diabetes: observation and modelling from the United Kingdom Prospective Diabetes Study (UKPDS 64). Kidney Int 63:225–232
  21. Cupples LA, D’Agostino RB, Anderson K, Kannel WB (1988) Comparison of baseline and repeated measure covariate techniques in the Framingham Heart Study. Stat Med 7:205–222
  22. Altman DG, De Stavola BL (1994) Practical problems in fitting a proportional hazards model to data with updated measurements of the covariates. Stat Med 13:301–341
  23. Greene WH (1997) Econometric analysis, 4th edn. Prentice-Hall, London
  24. UKPDS Group (1998) Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). Lancet 352:837–853
  25. Panzram G (1987) Mortality and survival in type 2 (non-insulin-dependent) diabetes mellitus. Diabetologia 30:123–131
  26. Wright JC, Weinstein MC (1998) Gains in life expectancy from medical interventions—standardizing data on outcomes. N Engl J Med 339:380–386
  27. EuroQol Group (1990) EuroQol—a new facility for the measurement of health-related quality of life. Health Policy 16:199–208
  28. Clarke P, Gray A, Holman R (2002) Estimating utility values for health states of type 2 diabetic patients using the EQ-5D (UKPDS 62). Med Decis Making 22:340–349
  29. Altman DG, Royston P (2000) What do we mean by validating a prognostic model? Stat Med 19:453–473
  30. Gray A, Raikou M, McGuire A et al. (2000) Cost effectiveness of an intensive blood glucose control policy in patients with type 2 diabetes: economic analysis alongside randomised controlled trial (UKPDS 41). BMJ 320:1373–1378
  31. Schafer JL (1999) Multiple imputation: a primer. Stat Methods Med Res 8:3–15
  32. Government Actuary’s Department (2003) Interim life tables 1999–2001. The Stationery Office, London
  33. Williams R, Van Gaal L, Lucioni C (2002) Assessing the impact of complications on the costs of Type II diabetes. Diabetologia 45:S13–S17
  34. Anderson KM, Odell PM, Wilson PW, Kannel WB (1991) Cardiovascular disease risk profiles. Am Heart J 121:293–298
  35. Menotti A, Lanti M, Puddu PE (2000) Long-term time-related predictivity of coronary events as a function of a single measurement of serum cholesterol and systolic blood pressure. Acta Cardiol 55:87–93
  36. Stevens R, Adler A, Gray A, Briggs A, Holman R (2000) Life-expectancy projection by modelling and computer simulation (UKPDS 46). Diabetes Res Clin Pract 50 [Suppl 3]:S5–S13


The UK Prospective Diabetes Study (UKPDS) Group is composed of the following centres:

Radcliffe Infirmary, Oxford; Royal Infirmary, Aberdeen; General Hospital, Birmingham; St George’s Hospital, London; Hammersmith Hospital, London; City Hospital, Belfast; North Staffordshire Royal Infirmary, Stoke-on-Trent; Royal Victoria Hospital, Belfast; St Helier Hospital, Carshalton; Whittington Hospital, London; Norfolk and Norwich Hospital; Lister Hospital, Stevenage; Ipswich Hospital; Ninewells Hospital, Dundee; Northampton Hospital; Torbay Hospital; Peterborough General Hospital; Scarborough Hospital; Derbyshire Royal Infirmary; Manchester Royal Infirmary; Hope Hospital, Salford; Leicester General Hospital; Royal Devon and Exeter Hospital.

The cooperation of the patients and staff at the UKPDS centres is much appreciated. The UKPDS received funding from the UK Medical Research Council, the British Diabetic Association, the UK Department of Health, the National Eye Institute and the National Institute of Diabetes and Digestive and Kidney Disease (the US National Institutes of Health), the British Heart Foundation, The Wellcome Trust, the Charles Wolfson Charitable Trust, the Clothworkers’ Foundation, the Health Promotion Research Trust, the Alan and Babette Sainsbury Trust and the Oxford University Medical Research Fund Committee. Funding was also provided by pharmaceutical companies including Novo-Nordisk, Bayer, Bristol-Myers Squibb, Hoechst, Lilly, Lipha and Farmitalia Carlo Erba. GlaxoWellcome, SmithKline Beecham, Pfizer, Zeneca, Pharmacia and Upjohn, and Roche provided grants for health economics and epidemiological studies.

Author information




Corresponding author

Correspondence to P. M. Clarke.

Additional information

The centres of the UKPDS are listed at the end of the paper

Conflict of interest. Several authors (as indicated above) are employed by the University of Oxford. This paper describes and places in the public domain a simulation model that we have called the UKPDS Outcomes Model. All of the information necessary to reproduce the UKPDS Outcomes Model is provided in this article, but it is conceivable that a future user with a commercial interest in the UKPDS Outcomes Model might prefer to use the software already created by University programmers. Depending on the nature of the proposed use of the UKPDS Outcomes Model, the University of Oxford might charge a fee in this case.

Appendix: Statistical analysis

Details of the statistical modelling

Parametric methods for risk estimation have previously been applied in modelling events such as MI and stroke [34, 35, 36]. In this analysis, a proportional hazards Weibull regression model was used to model diabetes-related complications with a baseline hazard of the form:

$$h_{0} (t) = \lambda \gamma t^{{\gamma - 1}} $$

where $\gamma$ is a shape parameter and the scale parameter $\lambda = \exp(\beta_0)$ is the exponentiated intercept coefficient $\beta_0$. Under the proportional hazards assumption, the hazard of an event at time t for the ith individual is:

$$h(t|x_{{tj}} ) = h_{0} (t)\exp (x_{{tj}} \beta _{j} )$$

where $x_{tj}$ is a vector of j covariates and $\beta_j$ their respective coefficients. Some of these covariates (such as AGE and SEX) remain constant as time elapses; others potentially vary over time (such as HbA1c and systolic BP). Thus, the unknown parameters requiring estimation are $\lambda$, $\gamma$, $\beta_0$ and $\beta_j$.

Two types of risk estimation were used to model the risk of diabetes-related death. Firstly, logistic regression was used to estimate the probability that the first MI, IHD, CHF, amputation or renal failure event would be fatal. Secondly, a Gompertz regression model, a functional form widely used to model mortality, was used to calculate the risk of diabetes-related mortality in subsequent years for patients with a history of these complications. Non-diabetes-related mortality was also modelled using Gompertz regression. For these models the baseline hazard is:

$$h_{0} (t) = \lambda \exp (\varphi t)$$

and the hazard in the proportional hazards model is therefore:

$$h(t|x_{{tj}} ) = h_{0} (t)\exp (x_{{tj}} \beta _{j} )$$

and the unknown parameters requiring estimation are $\lambda$, $\varphi$, $\beta_0$ and $\beta_j$. In the case of diabetes-related mortality the time at risk is from the occurrence of the first complication that elevates the risk of mortality.
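As a minimal sketch of how these two mortality components might be evaluated, the following Python fragment computes a logistic probability that a first event is fatal and a Gompertz hazard for later years. The function names and every numerical value are illustrative assumptions; they are not the coefficients estimated from the UKPDS data.

```python
import math


def fatal_event_probability(linear_predictor: float) -> float:
    """Logistic model: probability that a first event is fatal."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def gompertz_hazard(t: float, lam: float, phi: float, xb: float) -> float:
    """Gompertz proportional hazard: lam * exp(phi * t) * exp(xb)."""
    return lam * math.exp(phi * t) * math.exp(xb)


# Hypothetical values chosen only for illustration.
p_fatal = fatal_event_probability(-1.0)
hazard_year_1 = gompertz_hazard(t=1.0, lam=0.002, phi=0.1, xb=0.5)
hazard_year_10 = gompertz_hazard(t=10.0, lam=0.002, phi=0.1, xb=0.5)
print(f"P(first event is fatal) = {p_fatal:.3f}")
print(f"hazard at year 1 = {hazard_year_1:.5f}, at year 10 = {hazard_year_10:.5f}")
```

With a positive shape parameter the Gompertz hazard rises exponentially with time at risk, which is why this form is commonly used for mortality.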

The unconditional probability of an event occurring between t and t+1 can be calculated using the integrated hazard. For example, the integrated hazard for the Weibull model at time t is:

$$H(t|x_{{tj}} ) = \exp (\beta _{0} + x_{{tj}} \beta _{j} )t^{\gamma } $$

and the unconditional probability of an event occurring in the interval t to t+1 is:

$$1 - \exp (H(t|x_{{tj}} ) - H(t + 1|x_{{tj}} ))$$
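The calculation can be sketched in a few lines of Python for the Weibull case. This is an illustration under assumed parameter values, not the published equations; the function names are hypothetical. Note that the expression equals 1 − S(t+1)/S(t), where S(t) = exp(−H(t)) is the survival function.

```python
import math


def weibull_cum_hazard(t: float, beta0: float, gamma: float, xb: float) -> float:
    """Integrated Weibull hazard: H(t|x) = exp(beta0 + xb) * t**gamma."""
    return math.exp(beta0 + xb) * t ** gamma


def interval_event_probability(t: float, beta0: float, gamma: float, xb: float) -> float:
    """Probability of an event in the interval t to t+1: 1 - exp(H(t) - H(t+1))."""
    h_now = weibull_cum_hazard(t, beta0, gamma, xb)
    h_next = weibull_cum_hazard(t + 1.0, beta0, gamma, xb)
    return 1.0 - math.exp(h_now - h_next)


# Hypothetical parameter values, for illustration only.
for year in (0, 5, 10):
    p = interval_event_probability(year, beta0=-5.0, gamma=1.2, xb=0.8)
    print(f"year {year}: P(event in next year) = {p:.4f}")
```

When the shape parameter equals 1 the hazard is constant and the annual probability does not change with t; a shape parameter above 1 makes the annual probability rise over time.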

Finally, equations representing risk factor progression were estimated using random effects panel data regression. Equations 10 to 13 were estimated using the form:

$$RF_{{it}} = \alpha + \beta _{j} x_{{itj}} + \mu _{i} + \nu _{{it}} $$

where $RF_{it}$ is the risk factor for the ith patient (i=1,...,n) in year t of the study (t=1,...,T), $x_{itj}$ are explanatory variables (j=1,...,J), $\mu_i$ is a patient-specific random effect and $\nu_{it}$ is the residual error term. Predicted values of $RF_{it}$ were used in conjunction with the event equations in order to complete the simulations. The smoking status equation (equation 14) used a logistic regression panel data model to estimate the probability of smoking in 3-year periods from the diagnosis of diabetes.
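To illustrate the structure of the random effects equation, the sketch below generates one patient's risk factor trajectory. It assumes normally distributed patient effects and residuals, and all coefficients, covariates and standard deviations are hypothetical. The model itself used predicted values of the risk factors, so the random draws here serve only to show the role of each term.

```python
import random


def simulate_risk_factor(alpha, betas, covariates_by_year, sd_patient, sd_residual, rng):
    """Simulate RF_it = alpha + sum_j beta_j * x_itj + mu_i + nu_it for one patient."""
    mu_i = rng.gauss(0.0, sd_patient)  # patient-level random effect, drawn once
    path = []
    for x_it in covariates_by_year:
        fixed_part = alpha + sum(b * x for b, x in zip(betas, x_it))
        nu_it = rng.gauss(0.0, sd_residual)  # year-specific residual
        path.append(fixed_part + mu_i + nu_it)
    return path


# Hypothetical inputs: first covariate is the study year, second a 0/1 indicator.
rng = random.Random(42)
covariates = [(year, 1.0) for year in range(1, 6)]
trajectory = simulate_risk_factor(
    alpha=7.0, betas=(0.1, -0.3), covariates_by_year=covariates,
    sd_patient=0.5, sd_residual=0.2, rng=rng,
)
print([round(value, 2) for value in trajectory])
```

Because the patient effect is drawn once and reused in every year, simulated values for the same patient are correlated over time, which is the feature that distinguishes this form from an ordinary pooled regression.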

Methods for handling uncertainty

The results reported in Tables 5 and 6 were based on a simulation model that predicts outcomes at the individual level. Representing uncertainty in models of this type is complicated by the need to remove Monte-Carlo error (the differences between individual simulations that are due to the play of chance rather than uncertainty in the parameters).

Confidence intervals presented in this paper are based on a two-stage process of evaluation. Firstly, the original data used to fit the risk equations were bootstrapped, the risk equations were refitted and the coefficients were recorded. Repeating this process 20 times generated a vector of coefficients that represented the parameter uncertainty in those coefficients, but which also accounted for the covariance between risk equations. As we were applying the model to predict lifetime outcomes of UKPDS patients, in the second stage we treated the predictions as imputations of missing values (in that we were predicting values that were not observed) and the 20 sets of results as multiple imputations. Standard methods for combining the results of multiple imputations were then employed [31], including a bias correction to adjust for the fact that a (relatively) small number of multiple imputations was performed. Further information on the model including details of software can be obtained from: http://www.dtu.ox.ac.uk/outcomesmodel.
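The second stage can be sketched as follows, assuming that the combining rules are those usually attributed to Rubin, in which the between-replicate variance is inflated by a factor of (1 + 1/m) to allow for a finite number m of imputations. Whether this is exactly the correction applied in the paper is an assumption; the function name and the replicate values are hypothetical and are not UKPDS results.

```python
import statistics


def pool_multiple_imputations(estimates, variances):
    """Combine m estimates and their variances using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = statistics.fmean(estimates)
    within_var = statistics.fmean(variances)      # average within-replicate variance
    between_var = statistics.variance(estimates)  # sample variance across replicates
    # The (1 + 1/m) factor inflates the between-replicate variance for finite m.
    total_var = within_var + (1.0 + 1.0 / m) * between_var
    return pooled_estimate, total_var


# Hypothetical results from m = 5 bootstrap replicates (not UKPDS outputs).
replicate_estimates = [0.25, 0.31, 0.22, 0.29, 0.28]
replicate_variances = [0.14, 0.15, 0.13, 0.15, 0.14]
estimate, variance = pool_multiple_imputations(replicate_estimates, replicate_variances)
print(f"pooled estimate = {estimate:.3f}, standard error = {variance ** 0.5:.3f}")
```

The within-replicate variance reflects uncertainty in each simulation run, while the between-replicate variance reflects parameter uncertainty carried through from the bootstrapped coefficients.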

Clarke, P.M., Gray, A.M., Briggs, A. et al. A model to estimate the lifetime health outcomes of patients with Type 2 diabetes: the United Kingdom Prospective Diabetes Study (UKPDS) Outcomes Model (UKPDS no. 68). Diabetologia 47, 1747–1759 (2004). https://doi.org/10.1007/s00125-004-1527-z


  • Blood glucose
  • Diabetes
  • Diabetes-related complications
  • Economic evaluation
  • Life expectancy
  • Quality-adjusted life years
  • Risk equation simulation model