Selecting the optimal risk threshold of diabetes risk scores to identify high-risk individuals for diabetes prevention: a cost-effectiveness analysis
Although risk scores to predict type 2 diabetes exist, the cost-effectiveness of risk thresholds for targeting prevention interventions is unknown. We applied cost-effectiveness analysis to identify optimal thresholds of predicted risk to target a low-cost community-based intervention in the USA.
We used a validated Markov-based type 2 diabetes simulation model to evaluate the lifetime cost-effectiveness of alternative thresholds of diabetes risk. Population characteristics for the model were obtained from NHANES 2001–2004 and incidence rates and performance of two noninvasive diabetes risk scores (German diabetes risk score, GDRS, and ARIC 2009 score) were determined in the ARIC and Cardiovascular Health Study (CHS). Incremental cost-effectiveness ratios (ICERs) were calculated for increasing risk score thresholds. Two scenarios were assumed: 1-stage (risk score only) and 2-stage (risk score plus fasting plasma glucose (FPG) test (threshold 100 mg/dl) in the high-risk group).
In ARIC and CHS combined, the areas under the receiver operating characteristic curve for the GDRS and the ARIC 2009 score were 0.691 (0.677–0.704) and 0.720 (0.707–0.732), respectively. The optimal threshold of predicted diabetes risk (ICER < $50,000/QALY gained when intervening in those above the threshold) was 7% for the GDRS and 9% for the ARIC 2009 score. In the 2-stage scenario, ICERs for all cutoffs ≥ 5% were below $50,000/QALY gained.
Intervening in those with ≥ 7% diabetes risk based on the GDRS or ≥ 9% on the ARIC 2009 score would be cost-effective. A risk score threshold ≥ 5% together with elevated FPG would also allow targeting interventions cost-effectively.
Keywords: Diabetes mellitus, Type 2; Cost-effectiveness analysis; Lifestyle risk reduction; Clinical prediction rule
Diabetes risk scores allow calculation of predicted risk based on several individual characteristics. However, using risk scores as a screening and risk stratification tool requires decisions about the specific thresholds of predicted risk above which individuals should be referred for intervention. Selecting such thresholds is difficult given that risk scores have a continuous association with diabetes risk. Cost-effectiveness analysis provides a framework for identifying the economically optimal threshold from the perspective of efficiently using health care resources. Cost-effectiveness analysis has been used to identify the economically optimal threshold for diabetes prevention based on fasting glucose, HbA1c and a combination of glucose testing and risk scores [3, 4]; however, noninvasive risk scores for type 2 diabetes do not require blood sampling and can therefore be useful tools to guide providers on whether a diagnostic blood test for prediabetes should be performed. We are not aware of studies applying cost-effectiveness analysis to the application of diabetes risk scores alone or to a two-step screening procedure as described above.
The aim of this study was to apply the framework of cost-effectiveness analysis to identify optimal thresholds of predicted risk from noninvasive diabetes risk scores to target a low-cost community-based intervention. We considered two screening scenarios: a 1-stage scenario with risk score assessment only and a 2-stage scenario in which the risk score assessment is followed by fasting plasma glucose testing.
General methodological concept
Diabetes risk scores
We considered diabetes risk scores (a) which were based on noninvasively assessable risk factors, (b) for which performance has been validated in varying populations and (c) for which scoring algorithms were available. Based on a systematic review, we selected the ARIC 2009 score, which was developed in a US population. Although an additional US-based diabetes risk score (Framingham Offspring Study) met our criteria, this score had limited accuracy in validation studies [9, 11], specifically in other US cohorts. As a second risk score, we considered the German Diabetes Risk Score (GDRS) [13, 14, 15], which showed accuracy comparable to that of the ARIC 2009 score in validation studies. By comparing these scores, we are able to evaluate whether results are risk score specific or more general.
To determine the performance of the two diabetes risk scores in a US population, individual 5-year predicted diabetes risks were calculated in the ARIC and CHS studies based on published equations [10, 13]. Due to a large amount of missing information for the risk score components, multiple imputation was performed. Based on 10 imputed datasets, the average discrimination was evaluated by the area under the receiver operating characteristic curve (ROC-AUC) .
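The discrimination measure described above can be illustrated with a minimal sketch. It assumes that the ROC-AUC is computed in each imputed dataset as the probability that a randomly chosen incident case has a higher predicted risk than a randomly chosen non-case, and that the average is a simple mean across datasets. The function names `roc_auc` and `pooled_auc` and the toy data are hypothetical and are not taken from the study.

```python
from statistics import mean

def roc_auc(risks, outcomes):
    """AUC as the share of case/non-case pairs ranked correctly (ties count 0.5)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def pooled_auc(imputed_datasets):
    """Simple mean of the AUC over imputed datasets (an assumed pooling rule)."""
    return mean(roc_auc(risks, outcomes) for risks, outcomes in imputed_datasets)

# Toy data, invented for illustration only.
dataset_a = ([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
dataset_b = ([0.10, 0.20, 0.60, 0.90], [0, 0, 1, 1])
average_auc = pooled_auc([dataset_a, dataset_b])
```

The pairwise loop is quadratic in cohort size, which is acceptable for a sketch; a rank-based formulation would be preferable for cohorts as large as ARIC and CHS.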
Diabetes incidence according to thresholds of predicted diabetes risk
The diabetes incidence rates for the US population were derived from the ARIC and CHS studies. A detailed description of the study populations and how data were combined is reported in Online Resource 1. Based on 1582 incident diabetes cases within a follow-up time of ~ 2–10 years from 10 multiple imputation datasets, diabetes incidence rates were determined for a series of high-risk cohorts defined by a range of thresholds of predicted 5-year diabetes risk with the GDRS and the ARIC 2009 score. For the 2-stage screening scenario, incidence rates were determined among those subgroups identified as high-risk by the risk scores who additionally had fasting glucose ≥ 100 mg/dl.
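As a rough illustration of how incidence rates by risk threshold could be tabulated, the following sketch assumes that the rate is the number of incident cases divided by person-years of follow-up among those at or above the threshold. The `Participant` record, its field names and the toy cohort are hypothetical; the handling of multiple imputation is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Participant:
    predicted_risk: float      # predicted 5-year diabetes risk, as a proportion
    fpg_mg_dl: float           # fasting plasma glucose in mg/dl
    person_years: float        # follow-up time contributed
    incident_diabetes: bool    # developed diabetes during follow-up

def incidence_rate(cohort: List[Participant],
                   risk_threshold: float,
                   fpg_threshold: Optional[float] = None) -> Optional[float]:
    """Cases per person-year among those screening positive.

    With fpg_threshold=None this is the 1-stage scenario (risk score only);
    otherwise participants must also have FPG at or above the threshold (2-stage).
    """
    selected = [
        p for p in cohort
        if p.predicted_risk >= risk_threshold
        and (fpg_threshold is None or p.fpg_mg_dl >= fpg_threshold)
    ]
    person_years = sum(p.person_years for p in selected)
    if person_years == 0:
        return None  # nobody screened positive at this threshold
    cases = sum(1 for p in selected if p.incident_diabetes)
    return cases / person_years

# Toy cohort, invented for illustration only.
cohort = [
    Participant(0.03, 95.0, 5.0, False),
    Participant(0.08, 105.0, 4.0, True),
    Participant(0.10, 92.0, 6.0, False),
    Participant(0.12, 110.0, 5.0, True),
]
one_stage = incidence_rate(cohort, risk_threshold=0.07)
two_stage = incidence_rate(cohort, risk_threshold=0.07, fpg_threshold=100.0)
```

In this toy cohort the 2-stage rate is higher than the 1-stage rate because the glucose criterion removes a low-incidence participant from the denominator, which is the mechanism the 2-stage scenario relies on.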
Overview of the CDC-RTI type 2 diabetes model
The CDC Type 2 Diabetes Cost-effectiveness Model was designed to simulate the development and progression of type 2 diabetes to assess the cost-effectiveness of various prevention and treatment interventions. The basic model has been described and validated elsewhere [17, 18, 19, 20]. Briefly, it is a Markov simulation model of disease progression and cost-effectiveness that follows persons from the onset of disease until death or age 95. In the model, separate modules are used to simulate the development of type 2 diabetes, hypertension, hyperlipidemia, coronary heart diseases, and stroke among high-risk individuals. For individuals who developed type 2 diabetes, the model additionally simulates three diabetes-related microvascular complications (nephropathy, neuropathy, and retinopathy), which are primarily based upon observations of the UK Prospective Diabetes Study. Model outcomes include disease complications, death, costs, and QALYs. The model has been validated and used for assessing the lifetime cost-effectiveness of various interventions for preventing type 2 diabetes and its complications [19–24].
Perspective, cost and utilities
Table 1 Assumptions for the simulation model regarding cost and effectiveness parameters
1-stage and 2-stage
Deterministic sensitivity analyses
1-stage and 2-stage
Cost of screening instrument
$5.01 (Medicare fee schedule 2011)
Cost of additional time in physician office visits
$53.2 (Medicare fee schedule 2011)
Percentage participating in lifestyle intervention
Percentage completing lifestyle intervention
Group-based lifestyle intervention at community level (Y-DPP)
Diabetes risk reduction in first 3 years
12.5% (SA2)/50% (SA3)/stable over time
After 3 years
12.5% (assumed as half of original DPP)
6.75% (SA2)/25% (SA3)
Hypertension risk reduction
Hypercholesterolemia risk reduction
Lifestyle Intervention cost
Stable over time (SA4)
$375 Ackermann et al. 
3 and after
Impact of intervention on medical costs
Coffey’s model 
Coffey’s model 
CVD before diabetes
Microvascular complications before diabetes
Hypertension before diabetes
Hypercholesterolemia before diabetes
Parameters of the model
The simulation sample was derived from data on non-diabetic US adults aged 35–65 years in NHANES 2001–2004. According to both the 1-stage and 2-stage approach for the two risk scores, we created multiple cohorts stratified by individuals’ demographics and risk factors including age, sex, race/ethnicity, hypertension, smoking, and total cholesterol and accounted for the joint distribution of those variables.
We applied a low-cost community-based intervention similar to the Y-DPP (Diabetes Prevention Program). We assumed the program leads to a 25% risk reduction in the first 3 years, which diminishes to 12.5% after year 3 and is maintained thereafter. These estimates assume a participation rate of 50% and a compliance rate of 50% in the intervention group [26, 27].
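The assumed effect schedule can be written down as a small sketch. It assumes that the risk reduction acts multiplicatively on the annual diabetes incidence rate and that "first 3 years" means intervention years 1 to 3 inclusive; the function names and the baseline rate are hypothetical.

```python
def risk_reduction(year: int,
                   early_effect: float = 0.25,
                   late_effect: float = 0.125,
                   early_years: int = 3) -> float:
    """Relative diabetes risk reduction in a given year since intervention start."""
    if year < 1:
        raise ValueError("year counts from 1")
    return early_effect if year <= early_years else late_effect

def incidence_with_intervention(baseline_rate: float, year: int) -> float:
    """Annual incidence after applying the assumed relative risk reduction."""
    return baseline_rate * (1.0 - risk_reduction(year))

# Illustrative baseline rate of 2 cases per 100 person-years (invented value).
rates = [incidence_with_intervention(0.02, year) for year in range(1, 6)]
```

Passing different `early_effect` and `late_effect` values reproduces the halved and doubled effect scenarios considered in the sensitivity analyses.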
ICERs were calculated by dividing incremental costs, measured in 2012 US dollars, by incremental health benefits, measured in QALYs, and are expressed in 2012 US dollars per QALY gained. Both future health benefits and costs were discounted at an annual rate of 3%. To identify the economically optimal threshold of predicted diabetes risk, we calculated ICERs for different thresholds of predicted diabetes risk for the two risk scores for both the 1-stage approach and the 2-stage approach.
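The ICER definition and discounting can be sketched as follows. The sketch assumes yearly cost and QALY streams for an intervention strategy and a comparator, and it assumes that the first year (index 0) is left undiscounted; that timing convention and the function names are not stated in the source.

```python
from typing import Sequence

def present_value(yearly_values: Sequence[float], rate: float = 0.03) -> float:
    """Discounted sum of a yearly stream; index 0 is treated as undiscounted."""
    return sum(v / (1.0 + rate) ** t for t, v in enumerate(yearly_values))

def icer(cost_intervention: Sequence[float], qaly_intervention: Sequence[float],
         cost_comparator: Sequence[float], qaly_comparator: Sequence[float],
         rate: float = 0.03) -> float:
    """Incremental cost per QALY gained, with costs and QALYs discounted alike."""
    d_cost = present_value(cost_intervention, rate) - present_value(cost_comparator, rate)
    d_qaly = present_value(qaly_intervention, rate) - present_value(qaly_comparator, rate)
    if d_qaly == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return d_cost / d_qaly

# Invented two-year streams, shown without discounting to keep the arithmetic visible:
# incremental cost 1000, incremental QALYs 0.5, so the ICER is 2000 per QALY.
example = icer([1000.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 0.5], rate=0.0)
```

A negative ICER needs separate interpretation (the intervention may be dominant or dominated), so a fuller implementation would report incremental costs and QALYs alongside the ratio.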
We performed several one-way deterministic sensitivity analyses to examine how the cost-effectiveness results would change under different cost and effectiveness scenarios of the lifestyle intervention. To do so, we reran the analyses, changing the value of one parameter at a time while keeping all other parameters at their base-case values (Table 1). First, we doubled the cost of the lifestyle intervention to test whether a costlier intervention program might change the ICERs and thus the selection of the economically optimal cutoff points under each screening scenario. Second, we halved the diabetes risk reduction in the intervention to 12.5% in the first 3 years and to 6.75% in the years thereafter. Third, we doubled the risk reductions in the intervention from their base-case values. Fourth, we assumed costs and effectiveness of the lifestyle intervention to be stable over time. Finally, we added the potential additional benefit of the intervention on hypertension risk reduction.
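The one-parameter-at-a-time procedure can be sketched generically. The helper `one_way_sensitivity` and the stand-in `toy_icer` are hypothetical; in particular, `toy_icer` is a placeholder and not the CDC-RTI simulation model, and the input values are illustrative.

```python
from typing import Callable, Dict, List, Tuple

def one_way_sensitivity(model: Callable[..., float],
                        base_params: Dict[str, float],
                        scenarios: List[Tuple[str, str, float]]) -> Dict[str, float]:
    """Rerun the model once per scenario, changing a single parameter each time."""
    results = {"base case": model(**base_params)}
    for label, name, value in scenarios:
        if name not in base_params:
            raise KeyError(f"unknown parameter: {name}")
        params = dict(base_params)   # copy so other parameters stay at base case
        params[name] = value
        results[label] = model(**params)
    return results

def toy_icer(intervention_cost: float, risk_reduction: float) -> float:
    # Placeholder only: cost per unit of risk reduction, not a real ICER.
    return intervention_cost / risk_reduction

base = {"intervention_cost": 375.0, "risk_reduction": 0.25}
scenarios = [
    ("doubled cost", "intervention_cost", 750.0),
    ("halved effect", "risk_reduction", 0.125),
    ("doubled effect", "risk_reduction", 0.5),
]
results = one_way_sensitivity(toy_icer, base, scenarios)
```

Copying the base-case dictionary for each scenario ensures that a change made in one scenario cannot leak into the next, which is the defining property of a one-way analysis.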
We also performed probabilistic sensitivity analysis to generate the cost-effectiveness acceptability curve as recommended by good research practices for cost-effectiveness analysis. We selected the 18 most critical parameters (e.g., effect and cost of the diabetes prevention program) and varied them simultaneously in 500 iterations (Suppl. Table 1 in Online Resource 1). The incremental costs of each risk threshold were plotted against their incremental effects (QALYs) to form a cost-effectiveness plane with a diagonal willingness-to-pay line set at $50,000/QALY. In addition, the probabilities of being cost-effective given a range of willingness-to-pay levels were plotted to form a cost-effectiveness acceptability curve.
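The acceptability curve can be derived from the simulated iterations as in the following sketch. It assumes each iteration yields one pair of incremental cost and incremental QALYs, and that an iteration counts as cost-effective when its net monetary benefit (willingness-to-pay times incremental QALYs, minus incremental cost) is positive. The function names and the four toy draws are hypothetical; the study used 500 iterations.

```python
from typing import List, Sequence, Tuple

Draw = Tuple[float, float]  # (incremental cost in dollars, incremental QALYs)

def prob_cost_effective(draws: Sequence[Draw], wtp: float) -> float:
    """Share of iterations with positive net monetary benefit at a given WTP."""
    if not draws:
        raise ValueError("no iterations supplied")
    favorable = sum(1 for d_cost, d_qaly in draws if wtp * d_qaly - d_cost > 0)
    return favorable / len(draws)

def acceptability_curve(draws: Sequence[Draw],
                        wtp_levels: Sequence[float]) -> List[Tuple[float, float]]:
    """Points (WTP, probability cost-effective) of the acceptability curve."""
    return [(wtp, prob_cost_effective(draws, wtp)) for wtp in wtp_levels]

# Toy draws, invented for illustration only.
draws = [(1000.0, 0.03), (1000.0, 0.015), (1000.0, 0.005), (-200.0, 0.0)]
curve = acceptability_curve(draws, [50_000.0, 100_000.0])
```

Using net monetary benefit instead of comparing each iteration's ICER with the willingness-to-pay level avoids the ambiguity of negative ratios, such as the cost-saving draw in the toy data.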
Performance of risk scores
The ROC-AUC (95% CI) of the GDRS and the ARIC 2009 score for prediction of incident diabetes in ARIC and CHS was 0.691 (0.677–0.704) and 0.720 (0.707–0.732), respectively. Comparable results were observed in ARIC, but lower ROC-AUC values were observed in CHS (Suppl. Table 2 in Online Resource 1). Sensitivity and specificity for varying risk thresholds for both scores are tabulated in Suppl. Table 3 in Online Resource 1. Including fasting glucose in addition to the risk scores in ARIC and CHS increased the ROC-AUC to 0.787 (0.774–0.799) for the GDRS and 0.800 (0.788–0.812) for the ARIC 2009 score.
Base case analysis
Suppl. Table 4 in Online Resource 1 shows the annual incidence rates by threshold of predicted diabetes risk for both risk scores and for the two screening approaches. When comparing the rates directly for each risk threshold, we observed higher incidence rates for subgroups identified with the GDRS for thresholds up to 13% risk for the 1-stage and up to 12% for the 2-stage approach; for thresholds higher than these values, higher incidence rates were observed when the ARIC 2009 score was used to predict risk.
When evaluating NHANES 2001–2004 data, the thresholds of ≥ 7% (GDRS) and ≥ 9% (ARIC 2009) risk correspond to about 29% and 40% of the US adult population screening positive and thus being targeted for intervention (Suppl. Table 5 in Online Resource 1). In the 2-stage screening scenario, a diabetes risk threshold of ≥ 5% risk in combination with impaired fasting glucose resulted in 20% of US adults identified for intervention when using the GDRS and 27% when using the ARIC 2009 score.
Results from deterministic sensitivity analyses are displayed in Suppl. Fig. 2 in Online Resource 1. The strongest influence on the results was observed for parameters associated with the costs and effectiveness of the preventive intervention. Results from probabilistic sensitivity analyses are presented in Suppl. Figs. 3 and 4 in Online Resource 1. The probability for the proposed strategy being cost-effective varied by different willingness-to-pay thresholds. Based on a willingness-to-pay threshold of $50,000/QALY, $60,000/QALY, and $70,000/QALY, the probabilities for the proposed strategy being cost-effective were estimated to be 48.3%, 87.1%, and 98.9%.
Our results indicate that, from a health care system perspective, a low-cost community-based intervention is cost-effective in the USA when target groups are identified by noninvasive diabetes risk scores. Screening with the GDRS or the ARIC 2009 score allows cost-effective intervention at thresholds of 7% and 9% predicted 5-year diabetes risk, respectively. If additional glucose tests are feasible (2-step screening), lower thresholds of predicted diabetes risk can be applied to identify a high-risk group for cost-effective intervention. We have investigated two diabetes risk scores, and while the overall performance to predict incident diabetes was comparable, the observed differences in absolute risk thresholds identified for both scores suggest that individual scores need to be evaluated in detail before their application.
We used the common cutoff of $50,000/QALY gained to define cost-effectiveness [6, 7, 8]. However, this cutoff is not universally considered optimal and others have been discussed. Stakeholders and policy makers might be more comfortable with lower cost-effectiveness thresholds, as this usually results in smaller target groups and thus lower total costs of interventions. For example, using $40,000/QALY gained, the optimal risk thresholds would be considerably higher (16% and 20% risk for the ARIC 2009 score and the GDRS, respectively), resulting in a considerably smaller proportion of individuals qualifying for intervention.
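The dependence of the selected risk threshold on the willingness-to-pay cutoff can be illustrated with a short sketch. It reflects one reading of the selection rule, namely that the optimal threshold is the lowest risk threshold whose ICER falls below the cutoff. The function name is hypothetical, and the ICER values are invented for illustration and are not results from the study.

```python
from typing import Dict, Optional

def optimal_threshold(icer_by_threshold: Dict[int, float], wtp: float) -> Optional[int]:
    """Lowest risk threshold (in percent) whose ICER is below the WTP cutoff."""
    eligible = [t for t, value in sorted(icer_by_threshold.items()) if value < wtp]
    return eligible[0] if eligible else None

# Invented ICERs (dollars per QALY gained) by risk threshold in percent.
hypothetical_icers = {5: 80_000.0, 10: 45_000.0, 15: 35_000.0}

at_50k = optimal_threshold(hypothetical_icers, 50_000.0)
at_40k = optimal_threshold(hypothetical_icers, 40_000.0)
at_30k = optimal_threshold(hypothetical_icers, 30_000.0)
```

With these invented values, lowering the cutoff from $50,000 to $40,000 per QALY moves the selected threshold from 10% to 15%, and no threshold qualifies at $30,000, so the function returns `None`.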
Two previous studies evaluated the cost-effectiveness of diabetes prevention in the context of screening with diabetes risk scores. Chen et al. showed that costs were lowest for a 2-stage approach which involved the original AusDrisk risk score and a recalculation of risk with an extended risk score which additionally included fasting glucose. While several risk thresholds were evaluated, neither the economically optimal threshold nor the cost-effectiveness in terms of cost per QALY gained was systematically investigated. Sullivan et al. reported that a 2-stage strategy which additionally considered a diabetes risk score was more cost-effective than a screening strategy for identifying high-risk individuals based on impaired fasting glucose alone. However, the assumed intervention was not a lifestyle intervention and the risk score was not noninvasive but rather based on multiple biomarkers. Neither of the two studies investigated a comparable 1-stage screening scenario with a noninvasive risk score. Our analyses also extend previous publications which evaluated varying cutoffs of fasting glucose or HbA1c as a screening tool [1, 2]. Interestingly, the recommended cutoff for impaired fasting glucose (100 mg/dl) was not cost-effective in the context of targeting the DPP intervention. However, our results strongly support that initial risk-score-based screening to select individuals for further fasting glucose testing considerably increases cost-effectiveness. Using conventional risk factors and fasting glucose together for prediction has been shown to outperform prediction based on conventional risk factors or fasting glucose only, both for ARIC and the GDRS [31, 32], which is further supported by our results. Thus, if applicable, a two-stage screening approach is preferable.
Our study has several limitations. First, our analyses were based on various assumptions for the simulation model. The low-cost lifestyle intervention program considered represents a group-based intervention in the communities based on the DPP. Group-based interventions have previously been shown to achieve the same effectiveness as individual programs or to be cost-effective [18, 33, 34]. Also, the 4.4% weight loss observed in translational programs from the US National DPP in the first year can be translated to a 35.4% risk reduction. Based on this evidence, the assumed intervention effect (25% risk reduction) seems reasonable. Still, we cannot rule out that effectiveness might be heterogeneous in high-risk groups according to different patient characteristics such as age, sex, family history of diabetes, or other risk factors. However, given that lifestyle interventions among individuals with prediabetes appear to be more effective among those with higher diabetes risk based on a noninvasive risk score, we believe our assumed intervention effect is rather conservative. We furthermore addressed uncertainty about this assumption in several sensitivity analyses. Although the diabetes incidence rates according to thresholds of diabetes risk are based on the large, fairly representative cohort studies ARIC and CHS, incidence could vary in different populations. Furthermore, we did not evaluate thresholds of diabetes risk below 5% for either risk score. Our results indicate that cutoffs lower than 5% might still be cost-effective if screening by fasting glucose follows initial risk score screening. In addition, the cost model has specifically been developed for cost-effectiveness analyses in a US context; generalizability of our findings to other countries is therefore unclear. We also assumed a one-off screening strategy for our simulation, but repetitive screening might change overall cost-effectiveness and thereby the optimal risk thresholds.
Moreover, we assumed a screening scenario with 100% coverage in the population, and future studies are needed to evaluate different screening scenarios.
In conclusion, our findings suggest that noninvasive diabetes risk scores, such as the GDRS or the ARIC 2009 score, allow identification of high-risk target groups for cost-effective lifestyle interventions to prevent type 2 diabetes. The findings specifically support economically optimal thresholds of predicted risk derived from these risk scores for targeting community-based lifestyle interventions under a US healthcare system perspective. Such thresholds can be used to justify categories of risk when risk scores are used as tests in clinical practice. Our finding that risk-score-based screening followed by fasting glucose testing increases cost-effectiveness supports current recommendations to use a risk test to guide providers on whether to perform a diagnostic test for prediabetes.
This manuscript was prepared using ARIC and CHS Research Materials obtained from the NHLBI Biologic Specimen and Data Repository Information Coordinating Center and does not necessarily reflect the opinions or views of the ARIC, CHS, or the NHLBI. This work further used data from the National Health and Nutrition Examination Survey (NHANES). The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention. We would like to thank Charlabados-Markos Dintsios for commenting on the manuscript draft. This work was supported by a grant from the German Ministry of Education and Research (BMBF) and the State of Brandenburg (DZD Grant 82DZD00302).
Compliance with ethical standards
Conflict of interest
The authors declare no conflict of interest.
Ethical standard statement
The secondary analysis of ARIC and CHS data was approved by the Ethics committee of the University of Potsdam, Germany.
All participants provided written informed consent.
- 13. Mühlenbruch K, Joost H-G, Boeing H, Schulze MB (2014) Risk prediction for type 2 diabetes in the German population with the updated German Diabetes Risk Score (GDRS). Ernahrungs Umschau 61:90–93
- 20. Hoerger TJ, Segel JE, Zhang P, Sorensen SW (2009) Validation of the CDC-RTI diabetes cost-effectiveness model. RTI Press Method Reports, Research Triangle Institute International
- 28. Lipscomb J, Weinstein MC, Torrance GW (1996) Time preference. In: Gold MR, Siegel JE, Russell LB, Weinstein MC (eds) Cost-effectiveness in health and medicine. Oxford University Press, New York, pp 214–235
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.