
Introduction

Worldwide, an estimated 463 million people were living with diabetes in 2019. By 2045 this number is expected to increase to 700 million [1], with 90% of cases being type 2 diabetes. Type 2 diabetes is associated with an increased risk of both macrovascular and microvascular complications [2], including foot ulceration or lower limb amputation. Approximately 15–25% of people with diabetes experience a foot ulcer during their lifetime [3] and people with type 2 diabetes have an approximately ten times higher risk of amputation compared with those without [4]. Therefore, people with type 2 diabetes are monitored annually to assess their risk of foot ulcer and amputation [5].

To guide monitoring frequency or initiate appropriate treatment, the risk of foot ulcer or amputation can be estimated using prognostic models. Such models might be particularly useful in times when prioritisation of routine care is needed, as is the case during the current coronavirus disease 2019 (COVID-19) pandemic. Three main steps should be taken for a prognostic model to be applicable in clinical practice [6]. First, a prognostic model is developed in a prospective cohort or registry. Second, the prognostic model should be validated in an independent population. Third, the impact of the use of the prognostic model in clinical practice on decision making or health outcomes should be tested.

Several prognostic models have been developed to predict the risk of foot ulcer or amputation. A systematic review identified seven risk stratification systems for classifying abnormalities in the foot examination that were developed through a literature review or expert consensus [7]. Another systematic review and meta-analysis developed a prognostic model for foot ulcers among 16,385 people with diabetes [8]. This identified only three predictors: a history of foot ulceration; an inability to feel a 10 g monofilament; and the absence of any pedal pulse [8]. However, the performance of prognostic models for foot ulcer and amputation in an independent population has hardly been investigated. Two small studies of 293 and 446 individuals with diabetes externally validated the risk stratification systems for risk of foot ulcer or amputation identified in a systematic review [7], showing C statistics ranging from 0.56 to 0.86 [9, 10]. However, these systems mainly classified abnormalities in the foot examination and did not incorporate other prognostic factors. Moreover, the majority of the people with diabetes in these validation studies were from a hospital setting with only 223 people from a community-based setting and only included up to 12 months follow-up. These data suggested that the performance of these risk stratification systems was poor in a community-based setting [10]. Therefore, existing prognostic models for the risk of foot ulcer or amputation require external validation particularly in a community-based setting over a longer follow-up period.

The aim of this study was to systematically review all published prognostic models for the risk of foot ulcer or amputation in people with type 2 diabetes, to determine their quality and to quantify their predictive performance in a large community-based independent cohort over 5 years of follow-up.

Methods

We performed a systematic review and an external validation study. The protocol of the systematic review was registered with the International Prospective Register of Systematic Reviews (PROSPERO) on 21 October 2020 (registration no. CRD42019126838), and the review was performed according to Cochrane guidance for prognostic model reviews [11] and reported according to the PRISMA-P guideline [12]. The external validation study was reported according to the transparent reporting of a multivariable prognostic model for individual prognosis or diagnosis (TRIPOD) guidelines (electronic supplementary material [ESM] Table 1) [13, 14].

Systematic review

To identify prognostic models for foot ulcer or amputation, we performed a systematic search of PubMed and EMBASE until 21 October 2020. The search term contained several variations of the following keywords: ‘type 2 diabetes’; ‘diabetic foot’ or ‘neuropathy’; and ‘prediction model’ (ESM Tables 2 and 3). Studies were included if the following criteria were met: (1) the prognostic model was developed for people with type 2 diabetes or included type 2 diabetes as a predictor; (2) the minimal follow-up period was 1 year; and (3) the outcome was foot ulcer, amputation, neuropathy, or a combination of these. Articles were excluded for the following reasons: non-human studies; studies in languages other than English or Dutch; external validation studies; and if the article was a commentary, review or conference abstract. We included studies with predominantly people with type 2 diabetes or with unspecified diabetes that was suspected to be type 2 diabetes based on individual characteristics. Studies with populations restricted to other forms of diabetes or conducted in populations with predominantly other forms of diabetes were excluded. Titles, abstracts and full texts were screened by two reviewers and by a third reviewer in case of disagreement (JSY, AAH, JWB) using the online tool Covidence (www.covidence.org), which only records inclusion or exclusion of a record. Full text screening was therefore also documented in an Excel file that recorded whether a study complied with each inclusion or exclusion criterion, so that the main reason for exclusion could be identified.

Data extraction

Data extraction was conducted by three reviewers (JSY, AAH, JWB) using an Excel file based on the checklist for critical appraisal and data extraction for systematic reviews of prediction modelling studies (CHARMS) [15]. This checklist was developed based on existing reporting guidelines for other types of clinical research and key methodological literature discussing recommended approaches for the design, conduct, analysis and reporting of prediction models. The following domains of CHARMS were used: source of data; participants; outcome(s) to be predicted; candidate predictors; sample size; missing data; model development; model performance; model evaluation; and results.

Risk of bias and applicability assessment

Each article was critically appraised for risk of bias and the models’ applicability to the intended population and setting by two reviewers and by a third reviewer in case of disagreement (JSY, AAH, JWB), using an Excel file based on the Prediction study Risk of Bias Assessment Tool (PROBAST) [16, 17]. The risk of bias was assessed for the following domains: the source of data; participants; outcome(s) to be predicted; candidate predictors; missing data; model development; and model performance (ESM Table 4).

External validation in the Diabetes Care System cohort

To assess the predictive performance of the selected models, we applied the models to the Diabetes Care System (DCS) cohort. The DCS cohort is a large prospective study of people with type 2 diabetes treated in routine primary care from the West-Friesland region of the Netherlands [18]. The DCS cohort is dynamic, with data on risk factors and complications collected annually from 1998 onwards. We used the most recent DCS data from 2014 until 2019 from all participants with sufficient predictor and outcome information available. In 2014, 8348 participants visited our centre and 724 were excluded because of missing data on incidence of foot ulcer or amputation during follow-up, leaving 7624 participants for analysis. We validated all models for the same time period (5 years) to obtain a fair comparison between the models.

The DCS cohort includes several demographic variables (e.g. age, sex), biomedical variables (e.g. systolic BP, HbA1c) and variables from foot screening (e.g. monofilament tests) [18]. All predictors were included as baseline values, with 2014 considered as baseline. Data on the occurrence and location of ulcers and amputations were retrieved from the medical records from the DCS and the local hospital.

Model selection for validation

Retrieved models developed for people with critical limb ischaemia or people with an infected diabetic foot ulcer were excluded from the validation study since such models are not applicable to people with type 2 diabetes treated in primary care. Models were also excluded if important predictors (or proxy variables) were not available in the DCS cohort or if the parameter estimates were not provided in the model development paper and could not be retrieved from the authors.

Statistical analyses

To evaluate the predictive performance of the selected models, we used the parameter estimates as stated in each development paper (intercept or baseline hazard and coefficients). If insufficient information was available, the authors were contacted for the original model specification. We validated each model for the outcome for which it was developed and for a combined outcome of 5 year incidence of foot ulcer and amputation.
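As an illustration of what applying published parameter estimates involves, the following minimal Python sketch computes a predicted risk from an intercept (logistic model) or a baseline survival (survival model) plus coefficients. The predictor names and all numeric values are hypothetical placeholders, not estimates from any of the reviewed models, and the sketch assumes uncentred predictors.

```python
import math

# Hypothetical coefficients for illustration only; NOT taken from any published model.
INTERCEPT = -4.0
COEFFICIENTS = {
    "previous_ulcer": 1.2,
    "monofilament_insensitive": 0.9,
    "absent_pedal_pulse": 0.7,
}


def linear_predictor(person, coefficients):
    """Sum of coefficient * predictor value over the model's predictors."""
    return sum(beta * person[name] for name, beta in coefficients.items())


def logistic_risk(person, intercept, coefficients):
    """Predicted probability from a logistic regression model."""
    lp = intercept + linear_predictor(person, coefficients)
    return 1.0 / (1.0 + math.exp(-lp))


def survival_risk(person, baseline_survival, coefficients):
    """Predicted risk at the horizon from a Cox-type model: 1 - S0(t) ** exp(lp).

    Assumes the predictors are on the same scale (e.g. uncentred) as in the
    development paper; many published models centre their predictors.
    """
    lp = linear_predictor(person, coefficients)
    return 1.0 - baseline_survival ** math.exp(lp)


# Hypothetical individual: previous ulcer, normal monofilament test, absent pulse.
person = {"previous_ulcer": 1, "monofilament_insensitive": 0, "absent_pedal_pulse": 1}
print(round(logistic_risk(person, INTERCEPT, COEFFICIENTS), 3))
print(round(survival_risk(person, 0.95, COEFFICIENTS), 3))
```

When a development paper reports hazard ratios or odds ratios rather than coefficients, the coefficients are their natural logarithms.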

Differences in the incidence of foot ulcer or amputation in our cohort, and in the development populations, may lead to significant deviation between observed risk in our cohort and predicted risk estimated by the prognostic model. To reduce this source of miscalibration, we ‘recalibrated’ each prognostic model by adjusting the intercept (for logistic regression models) or the baseline survival function (for survival regression models). Each model was validated with and without recalibration of the intercept. For validation without recalibration, we simply applied the model to our data. If the intercept or baseline hazard could not be obtained from the original study, the model was only validated using the incidence derived from the cohort.

Model performance was assessed based on discrimination and calibration. Discrimination describes the ability of the model to distinguish those at high risk of developing foot ulcer or amputation from those at low risk. Discrimination was evaluated using Harrell’s C statistic. Calibration indicates the ability of the model to correctly estimate the absolute risks and was evaluated using calibration plots and the observed/expected ratio. Missing data were handled with multiple imputation. We used five imputed datasets and pooled the model performance measures (C statistic and observed/expected ratio) according to Rubin’s rules. All statistical analyses were conducted using R (version 3.6.1) (R Core Team, Vienna, Austria, www.r-project.org) [19] in combination with the following R packages: mice (version 3.7.0); rms (version 5.1-4); survival (version 3.1.8); and survAUC (version 1.0.5).
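To make these performance measures concrete, the sketch below gives simplified Python versions of Harrell's C statistic, the observed/expected ratio and pooling by Rubin's rules. It is illustrative only and does not reproduce the R packages listed above: tied event times are ignored, the observed/expected ratio treats the outcome as binary, and pooling on the original scale is an assumption. All function names are hypothetical.

```python
def harrell_c(times, events, risks):
    """Simplified Harrell's C: share of usable pairs ranked correctly by predicted risk.

    A pair (i, j) is usable when i had an observed event strictly before j's
    follow-up time. Pairs with tied times are ignored in this sketch.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable


def observed_expected_ratio(outcomes, risks):
    """Observed number of events divided by the sum of predicted risks."""
    return sum(outcomes) / sum(risks)


def pool_rubin(estimates, variances):
    """Pool one estimate per imputed dataset with Rubin's rules.

    Returns the pooled estimate and its total variance:
    within-imputation variance + (1 + 1/m) * between-imputation variance.
    """
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    return pooled, within + (1.0 + 1.0 / m) * between


# Hypothetical toy data: follow-up time, event indicator and predicted risk.
c_stat = harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [0.9, 0.6, 0.4, 0.1])
oe = observed_expected_ratio([1, 0, 0, 1], [0.5, 0.25, 0.25, 1.0])
pooled_c, total_var = pool_rubin([0.70, 0.72, 0.74], [0.0004, 0.0004, 0.0004])
print(c_stat, oe, round(pooled_c, 2))
```

A C statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking; an observed/expected ratio above 1 indicates that the model underestimates risk on average, and a ratio below 1 indicates overestimation.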

Results

Identification of prognostic models for foot ulcer or amputation

The systematic review identified 6933 articles. Of these, the full texts of 203 articles were screened and 21 articles met our inclusion criteria (ESM Fig. 1). The main reasons for exclusion were that articles did not present a prognostic model or used a follow-up shorter than one year.

Characteristics of the models

We identified 21 studies that presented 34 prognostic models to predict the risk of foot ulcer or amputation (Table 1). Most of the studies originated from Europe (n = 11) [20,21,22,23,24,25,26,27,28,29,30], followed in frequency by the USA (n = 6) [31,32,33,34,35,36]. One study originated from Japan [37], one study from India [38], one from Taiwan [39] and one study included data from multiple other studies conducted worldwide [40]. Most studies used a study population with diabetes (n = 17) and the remaining four studies included diabetes or treatment of diabetes as predictor. Most of the models were developed to predict the risk of amputation (n = 16), seven predicted foot ulcer and six predicted some form of diabetic polyneuropathy. The most commonly used prediction horizons were 1 year and 10 years. The number of events ranged from 23 to 3281. The number of predictors included in the 34 models ranged from 2 to 13. An overview of the included predictors is provided in ESM Fig. 2. The most commonly used predictors in the externally validated models were age (n = 8), HbA1c (n = 6), history of foot ulcer (n = 6) and peripheral artery disease (n = 6) (Fig. 1).

Table 1 Description of the 21 studies identified in the systematic review
Fig. 1 Predictors included in the externally validated prediction models for foot ulcer or amputation. SBP, systolic BP

Risk of bias and applicability

In most studies, the data source was considered to carry a low to moderate risk of bias (ESM Figs 3, 4). The domains missing data, model development and model performance were most often rated as high risk of bias. For eight of the studies, missing data was rated as unclear risk of bias. Reasons for the high risk of bias in these domains were not reporting or inappropriately handling missing data, selecting predictors for the model based on univariable selection, and not reporting measures of discrimination and/or calibration for model performance.

Apparent model performance

Discrimination was reported in terms of a C statistic for 18 models, of which ten models also presented the 95% CIs. These C statistics (95% CI) ranged from 0.65 (0.62, 0.67) to 0.84 (0.74, 0.94) for models predicting diabetic foot ulcer and from 0.52 (0.51, 0.53) to 0.83 (0.78, 0.89) for models predicting amputation (Fig. 2 and ESM Table 5). For different neuropathy outcomes, C statistics ranged from 0.57 (95% CI 0.55, 0.58) to 0.80 (95% CI not reported). Seven studies reported calibration, which was generally shown to be good. Exceptions to the good calibration were an observed/expected ratio ranging from 0.7 to 1.6 in the study by Goodney et al (2010) [33] and borderline significant calibration tests for models by Venermo et al (p ≥ 0.07) and Basu et al (p ≥ 0.05) [27, 36].

Fig. 2 Apparent discrimination of prognostic models for amputation (a) and foot ulcer (b). amp, amputation; excl, excluded; NA, not applicable because the original study did not report these data

External validation

Selection of the models

Of the 21 studies reporting on 34 models, 12 were excluded for external validation. The most important reason was that the model was developed for a different target population (n = 5) such as people with critical limb ischaemia. Two studies were excluded for external validation because the parameter estimates were not reported and five studies could not be validated because the required predictors or outcome (i.e. incidence of neuropathy) were not available in the cohort (ESM Fig. 1). For some models, predictors were approximated to enable validation. Townsend deprivation score, included in the model of Hippisley-Cox and Coupland [22], was not available in the DCS cohort and the reported sex-specific mean value of the deprivation score was used (men 0.5, women 0.8). Also, the number of cigarettes per day was not recorded in the DCS, and all smokers were assumed to be moderate smokers when applying the model of Hippisley-Cox and Coupland [22]. The model of Martins-Mendes et al [25] used physical impairment as a predictor; this was not registered in the DCS and we assumed that none of the participants were physically impaired. The model of Tseng et al included several specific skin infections [35], and these were assumed absent in all the DCS participants due to unreliable recording of these variables in the DCS.

Characteristics of the external validation cohort

Of the 7624 people with type 2 diabetes at baseline, 485 (6.4%) developed a foot ulcer and 70 (0.9%) underwent amputation during the 5 years of follow-up. The mean age of the study population was 67.3 years, 53.1% were male and the median duration of diabetes was 7.2 years (Table 2).

Table 2 Characteristics of 7624 people with type 2 diabetes from the Hoorn DCS cohort according to the history of ulcer or amputation in 2014

Discrimination

In the external validation, discriminatory ability of six prognostic models for the development of foot ulcer over 5 years showed C statistics (95% CI) ranging from 0.54 (0.54, 0.54) for the prediction of diabetic foot ulcerations (PODUS) model [40] to 0.81 (0.75, 0.86) for the model by Boyko et al [31] (Fig. 3). For risk of amputation, discriminatory ability of the seven models showed C statistics (95% CI) ranging from 0.63 (0.55, 0.71) for the model by Resnick et al [34] to 0.86 (0.78, 0.94) for the final model by Tseng et al [35] (Fig. 3). For the prediction of the combined outcome of foot ulcer and amputation, C statistics (95% CI) ranged from 0.53 (0.51, 0.55) to 0.84 (0.82, 0.86) (ESM Table 6).

Fig. 3 Discriminatory ability of seven prognostic models for amputation (a) and six prognostic models for a foot ulcer (b) during 5 years in an external validation among 7624 people with type 2 diabetes from the DCS cohort

Calibration

The calibration plots display the relationship between the predicted risk and observed outcome of the prognostic models after recalibration based on the incidence of the outcome in the DCS cohort (ESM Figs 5–8). In most models, the first quintiles showed agreement between the predicted risk and observed incidence of the particular outcome. Most models generally showed an overestimation of foot ulcer or amputation risk in people in the highest or second-highest risk quintiles. PODUS (2015) [40] and the models by Resnick [34] and Hippisley-Cox and Coupland [22] showed good calibration over all the quintiles.
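The quintile-based comparison underlying such calibration plots can be sketched in Python as below. This is a simplified illustration with hypothetical data and function names: individuals are sorted by predicted risk, split into five equally sized groups, and the mean predicted risk in each group is compared with the observed proportion of events (a binary outcome is assumed, without accounting for censoring).

```python
def calibration_by_quintile(risks, outcomes, groups=5):
    """Return (group number, mean predicted risk, observed proportion) per risk group."""
    pairs = sorted(zip(risks, outcomes))  # sort individuals by predicted risk
    n = len(pairs)
    table = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer individuals than groups
        mean_predicted = sum(r for r, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((g + 1, mean_predicted, observed))
    return table


# Hypothetical predicted risks and observed outcomes for ten people.
risks = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
table = calibration_by_quintile(risks, outcomes)
for group, predicted, observed in table:
    print(group, round(predicted, 3), round(observed, 3))
```

In a well-calibrated model the two values are close in every group; a mean predicted risk above the observed proportion in the top group corresponds to overestimation of risk among those at highest predicted risk.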

Discussion

This systematic review and external validation study provides a comprehensive overview of prognostic models for foot ulcer or amputation and estimates their performance in an independent population. We identified 21 studies that described 34 prognostic models, of which most predicted risk of amputation. Thirteen models could be validated in a large independent community-based population of people with type 2 diabetes. For foot ulcer, most models showed C statistics above 0.75. Performance of models predicting the risk of amputation showed C statistics ranging from 0.63 to 0.86. Predictors included in the models were mostly available in clinical practice and consisted of demographic factors, diabetes-related risk factors, comorbidities and results from the foot examination. Most studies showed a moderate to high risk of bias, mainly due to insufficient reporting on model development.

Most of the identified models predicted future risk of amputation. Of the seven models applicable to the general type 2 diabetes population, three models showed a C statistic around 0.65. However, the other four models showed good discriminatory performance with C statistics over 0.74, which is similar to the performance found in the development populations [25, 35]. Calibration of these models generally showed overprediction of the risk of an ulcer or amputation in the highest predicted risk group. The models by Martins-Mendes et al [25] only contained three predictors (complication count, pulses and previous foot ulcer). In comparison, the model by Tseng et al [35] consisted of 12 predictors including interaction terms using specific information on infections. Therefore, the model by Martins-Mendes et al [25] seems a very applicable and well-performing model for predicting future risk of amputation. The earlier review by Monteiro-Soares also reported relatively good performance of classification systems for amputation with a C statistic over 0.72, although exact C statistics were not provided [9]. However, that study validated these systems over a prediction horizon of 1 year, which generally results in better performance. Our study shows that prognostic models based on foot examination and other characteristics can accurately predict future risk of amputation over a 5 year horizon.

For the prediction of foot ulcers, we were able to validate six prognostic models. These models performed equally well with C statistics around 0.75 for four models, with the exception of the model by Crawford et al (2011) [21]. This model included three predictors, of which two were interaction terms between two variables. Since the model was developed with only 23 events, these predictors may have resulted from overfitting. The other models generally showed performance comparable with the development study, with the exception of PODUS (2015) where discrimination was not reported [40]. The PODUS model (2015) [40] showed good calibration while the other models showed an overprediction of ulcers in the highest risk groups. The five models were based on two to seven predictors and are all relatively easy to apply in clinical practice. Of note, the simplified model by Martins-Mendes was only based on complication count and history of foot ulcer and showed a performance comparable with that of other models [25]. In the study by Monteiro-Soares, risk classification systems based on the foot examination were validated for their prognostic performance to predict future foot ulcers [10]. In a hospital setting, these systems performed well to predict foot ulcers but in a community-based setting discriminatory performance was generally below 0.7. Our study shows that prognostic models including other predictors next to the foot examination perform well to predict future risk of foot ulcers in a community-based setting over a 5 year horizon.

Adequate performance of a model based on calibration and discrimination is a prerequisite for its application in clinical practice. Once this is evaluated, clinical applicability is the leading consideration for choosing a suitable model. With this external validation study, we identified models from four studies that performed sufficiently well to allow such clinical application. For ease of use in clinical practice, the most suitable prognostic model is based on only a few predictors that are readily available in clinical practice. Since we evaluated predictive performance for two closely linked outcomes, foot ulcer and amputation, the prognostic model of choice should preferably perform well to predict both outcomes simultaneously. Using a combined endpoint, only the models of Boyko et al [31], PODUS (2015) [40] and Martins-Mendes et al [25] showed good performance with C statistics of 0.75 or over. These studies also showed low or moderate risk of bias for most domains, except one for PODUS (2015) [40] and two for Martins-Mendes et al [25]. Although the PODUS (2015) model and the model by Martins-Mendes et al contain predictors that are mostly available in clinical practice, the model by Boyko et al also contains predictors requiring further diagnostic testing such as visual acuity or infections of the nails. Such predictors make a model less feasible to implement in clinical practice. Furthermore, a calculation tool such as an Excel spreadsheet, provided by Boyko et al, may enable implementation in routine practice. Of note, all models contain history of foot ulcer or amputation as predictors. Future studies should thus update the models to apply them for primary prevention. Nevertheless, the models identified in this study can be used to guide monitoring strategies in people with type 2 diabetes based on their predicted risk of foot ulcer and amputation. In people at low risk of foot complications, the current routine screening interval can be extended. 
Such a personalised screening frequency can be established using an optimal balance between the risk of delayed detection of foot complications and the costs of foot screening. This approach was previously used to personalise monitoring frequency for retinopathy based on prediction models [41, 42]. A recent observational study of 10,421 people with diabetes showed that only 5.1% of those classified as low risk had progressed to moderate risk in 2 years [43]. If people with diabetes change risk status infrequently, then regular foot screening is less likely to be of clinical value and personalised screening intervals could be valuable. Personalised monitoring based on prognostic models for foot ulcer and amputation could substantially reduce patient and clinician burden in the management of people with type 2 diabetes. The clinical utility of a prediction model is also dependent on the interventions available for those at high risk. There are a number of effective interventions for preventing foot ulceration and as a result amputations, including the reduction of peak foot pressure, removing excessive callus, accommodating foot deformities [44] and adequate information and education of the individual with type 2 diabetes for foot care [45]. Further research could focus on the development of such a stratification and monitoring system, where the prevalence of false-negative individuals should be foremost in determining monitoring intervals.

A limitation of our study is the low incidence of amputation in the DCS cohort. Validation studies to test model performance require at least 100 events [46, 47]. This criterion was fulfilled for incidence of foot ulcer but the analysis may have been slightly underpowered for incidence of amputation. However, we observed stable estimates for risk of amputation with relatively narrow CIs, suggesting that the limited number of events did not affect our results to a large extent. Another limitation of our study is the inability to differentiate between major and minor amputations. These data were only available for part of the events and further restricting an already infrequent outcome in our population would result in too few events to draw conclusions. Furthermore, the study population was primarily of European descent and from a centrally organised care centre with standardised protocols. This may limit the extrapolation to populations with a different migration background and people with diabetes in a less-standardised care setting. Although certain models were directly applicable to our data, a final limitation is that not all predictors used by the models were available in the DCS cohort. However, data from the DCS cohort arise from routine clinical practice and the first prerequisite of a model to be used in clinical practice is the availability of the predictors in routinely collected data. This study also has several strengths. First, the prognostic models were identified through a systematic review. Second, the use of a large, unselected cohort of people with type 2 diabetes, including almost all people with type 2 diabetes in the catchment area of the DCS, enhances this external validation study. Third, all predictor and outcome measurements in the cohort were performed according to centrally standardised protocols, which enhanced the reliability of the data and resulted in a low number of missing variables.
Finally, the results of the validation study apply to routine clinical practice settings for people with type 2 diabetes.

In conclusion, this systematic review and external validation study identified 34 prognostic models for future risk of foot ulcer or amputation. The external validation of 13 models showed that most models performed well to predict foot ulcer or amputation over a 5 year horizon. The models by Boyko et al [31], PODUS (2015) [40] and Martins-Mendes et al [25] performed best in predicting the combined endpoint of ulcer and amputation, and contain easy-to-measure predictors, making them suitable for clinical practice. These prognostic models could be used to tailor the screening frequency of the foot examination based on individual risk predictions, and may substantially increase the efficiency of foot care.