Abstract
Aims/hypothesis
Metabolic risk factors and plasma biomarkers for diabetes have previously been shown to change prior to a clinical diabetes diagnosis. However, these markers only cover a small subset of molecular biomarkers linked to the disease. In this study, we aimed to profile a more comprehensive set of molecular biomarkers and explore their temporal association with incident diabetes.
Methods
We performed a targeted analysis of 54 proteins and 171 metabolites and lipoprotein particles measured in three sequential samples spanning up to 11 years of follow-up in 324 individuals with incident diabetes and 359 individuals without diabetes in the Danish Blood Donor Study (DBDS) matched for sex and birth year distribution. We used linear mixed-effects models to identify temporal changes before a diabetes diagnosis, either for any incident diabetes diagnosis or for type 1 and type 2 diabetes mellitus diagnoses specifically. We further performed linear and non-linear feature selection, adding 28 polygenic risk scores to the biomarker pool. We tested the time-to-event prediction gain of the biomarkers with the highest variable importance, compared with selected clinical covariates and plasma glucose.
Results
We identified two proteins and 16 metabolites and lipoprotein particles whose levels changed temporally before diabetes diagnosis and for which the estimated marginal means were significant after FDR adjustment. Sixteen of these have not previously been described. Additionally, 75 biomarkers were consistently higher or lower in the years before a diabetes diagnosis. We identified a single temporal biomarker for type 1 diabetes, IL-17A/F, a cytokine that is associated with multiple other autoimmune diseases. Inclusion of 12 biomarkers improved the 10-year prediction of a diabetes diagnosis (i.e. the area under the receiver operating characteristic curve increased from 0.79 to 0.84), compared with clinical information and plasma glucose alone.
Conclusions/interpretation
Systemic molecular changes manifest in plasma several years before a diabetes diagnosis. A particular subset of biomarkers shows distinct, time-dependent patterns, offering potential as predictive markers for diabetes onset. Notably, these biomarkers show shared and distinct patterns between type 1 diabetes and type 2 diabetes. After independent replication, our findings may be used to develop new clinical prediction models.
Graphical Abstract
![](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00125-024-06231-3/MediaObjects/125_2024_6231_Figa_HTML.png)
![figure b](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00125-024-06231-3/MediaObjects/125_2024_6231_Figb_HTML.png)
Introduction
The ongoing increase in the prevalence of type 2 diabetes mellitus poses a need for better diagnostics and treatment of the disease [1]. Timely detection of the onset of type 2 diabetes is crucial to reduce the harmful effects of long-term hyperglycaemia and hyperinsulinaemia on tissues such as blood vessels and nerves [2, 3]. The inclusion of molecular biomarkers in predictive models for type 2 diabetes diagnosis has been shown to improve accuracy compared with models based exclusively on risk factors such as family history, lifestyle behaviour and anthropometric measures [4,5,6]. Moreover, using combinations of biomarkers in panels may be more effective than using single biomarkers [5]. The discovery of novel biomarkers may also provide new insights into disease mechanisms, yield more accurate risk estimates for type 2 diabetes to initiate screening and prevention measures, and be used in identification of targets for new treatments [4, 7, 8].
Previous studies have described the temporal changes in biomarkers such as BMI, fasting glucose, LDL, HDL, triacylglycerol and C-reactive protein (CRP) preceding a type 2 diabetes diagnosis [9,10,11,12,13,14]. These studies show a clear trend for an increasing burden of risk factors towards the time of diagnosis. However, these temporal biomarkers only cover a small subset of the biomarkers that have been reported to be associated with diabetes mellitus [7]. Moreover, inflammation, a central driver of diabetes [15, 16], is poorly covered by current biomarkers, except for CRP [9, 13] and IL-1 receptor antagonist [11]. Increasing the molecular depth of temporal modelling of progression to diabetes before diagnosis may be key to understanding the aetiology of diabetes. While type 1 diabetes and type 2 diabetes are often viewed as two distinct diseases, some individuals exhibit phenotypic features of both diseases, including autoimmunity and insulin resistance [17, 18]. Moreover, the immune system has been shown to drive insulin resistance in type 2 diabetes [15, 16]. It has therefore been suggested that diabetes should be viewed as a continuum [19]. By including both diseases in cohort studies of diabetes, we can improve our understanding of the shared and distinct molecular features of type 1 and type 2 diabetes.
In this study, we analysed plasma samples from 683 blood donors in the nationwide Danish Blood Donor Study (DBDS) biobank to characterise temporal changes in the years leading to a diabetes diagnosis, and to determine the predictive value of such biomarkers together with commonly used risk factors for diabetes.
Methods
Further details of the methods are given in electronic supplementary material (ESM) Methods.
Ethics approvals
The study was approved by the National Committee on Health Research Ethics (NVK 1700407) and the Danish Data Protection Agency (P-2019-99).
Study design and cohort characteristics
In this retrospective case–control study nested within the DBDS, blood samples collected as part of standard blood donations were selected based on the presence or absence of an incident diabetes mellitus diagnosis in the period 2 January 2006 to 31 December 2016. To identify individuals with incident diabetes (non-childhood-onset type 1 diabetes or type 2 diabetes), we created a diabetes register for the DBDS using a modified version of the algorithm described by Carstensen et al [20] covering the period 1977–2016. An initial cohort was constructed based on donation records for all Danish blood donors in the period 2006–2016 (ESM Fig. 1). Individuals were required to fulfil the following criteria: (1) provide valid DBDS consent; (2) have imputed genotype data available; and (3) have had a sample collected as part of inclusion in the DBDS, and at least two other samples at least 9 months apart within the study period. Restriction to individuals for whom imputed genotype data were available resulted in selection of individuals with European ancestry. Based on the above criteria, 344 individuals were identified as having incident diabetes. We randomly sampled 372 individuals without diabetes from 71,095 eligible individuals with a comparable birth year and sex distribution as the individuals with incident diabetes (corresponding to a 1:1.1 sampling ratio), with subsequent removal of six individuals with diabetes. After checking for sample availability in the biobank, 324 individuals with incident diabetes and 359 individuals without diabetes were included in the study. Three consecutive samples obtained at least 9 months apart were analysed for 659 individuals, and two samples were analysed for 24 individuals. We defined the end of follow-up as the date of diabetes diagnosis according to the diabetes register for individuals with incident diabetes. For individuals without diabetes, the end of the follow-up was defined as 31 December 2016, i.e. the end of the available version of the diabetes register.
Because of the selection criteria, the cohort is not comparable to the overall DBDS cohort with regard to age and sex distributions, although the regional distribution was similar in the two cohorts (ESM Table 1). As the majority of those in the DBDS cohort are genetically of European ancestry, our cohort is largely representative of the ethnic makeup of the DBDS cohort [21]. Data regarding socioeconomic factors were not available for this study; socioeconomic factors in the DBDS cohort have been described previously [22].
Biomarker measurements
We measured the concentration of 54 proteins using the V-PLEX metabolic panel 1 human kit and the V-PLEX human biomarker 54-plex kit (Meso Scale Diagnostics, USA) for 2025 samples (ESM Table 2). Additionally, 249 metabolites and lipoprotein particles were analysed for 1866 samples by Nightingale Health (Finland) using targeted metabolomics, of which 171 metabolites in 1863 samples were successfully analysed (ESM Table 2). Polygenic risk scores (PRSs) for 28 phenotypes were calculated using the LDpred2 algorithm [23] (ESM Table 3).
Phenotype and technical covariates
Sex, date of birth, parental disease history and medication use were extracted from the Danish national registers. Self-reported height, weight and smoking status were extracted from questionnaires completed as part of the DBDS. Data for region, seasonality, time of day when sample was obtained and storage duration were available through the blood bank database.
Statistical analysis and imputation of missing values
All statistical analyses were performed using R version 4.0.0 (https://www.R-project.org/). Missingness across variables varied between 0.1% and 15% (Table 1 and ESM Table 2). To avoid bias and loss of power, missing values for height, weight, smoking status, time of day when sample was obtained, and protein and metabolite data were imputed using multiple imputations by chained equations as implemented in the mice package (version 3.14.0, https://cran.r-project.org/package=mice) [24].
The association between the exposures (diabetes diagnosis and diabetes type) and protein and metabolite concentrations was estimated using linear mixed-effects models. This cause–effect relationship between diabetes and systemic biomarker concentrations relies on the assumption that initiation of the pathological cascade and subsequent onset of diabetes occur years before diagnosis [14, 25]. Each biomarker was fitted using diabetes diagnosis (true/false) or diabetes type (type 2 diabetes/type 1 diabetes/no diabetes) as the exposure with person-specific random intercept and slope using the lmer function from the lme4 package (version 1.1-30, https://cran.r-project.org/package=lme4). For each biomarker, two models were fitted. Model 1 was an additive model used to assess the main effect of the diabetes exposure on the biomarker. In model 2, we modelled the time-dependent interaction of diabetes diagnosis for individuals with incident diabetes using restricted cubic splines on time to end of follow-up. For individuals without incident diabetes, the end of follow-up is simply a time point in a normal life course, and the time-dependent interaction was therefore modelled using a linear term on time to end of follow-up. We estimated the per-year effects using estimated marginal means as implemented in the emmeans package (version 1.7.5, https://cran.r-project.org/package=emmeans). Depending on the model fit, models were fitted using either log-transformed or untransformed biomarker concentrations. Estimates and confidence intervals are reported as the relative fold change between individuals with and without incident diabetes. We removed biomarkers for which the model check was not satisfactory from the main results; the effect estimates for these are shown in ESM Table 4.
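As an illustration of how an estimate from a model fitted on log-transformed concentrations translates into the relative fold change reported here, a minimal Python sketch follows. It is not the study's code (the analyses were performed in R); it assumes a natural-log transformation and a normal-approximation 95% CI, and the function names and the example coefficient are hypothetical.

```python
import math

def log_estimate_to_fold_change(beta, se, z=1.96):
    """Back-transform a log-scale coefficient to a fold change with a 95% CI.

    Assumes the biomarker was natural-log transformed before modelling, so
    exp(beta) is the ratio of geometric means between the two groups.
    """
    fold_change = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return fold_change, lower, upper

def as_percent_difference(fold_change):
    """Express a fold change as a percentage difference, e.g. 1.37 -> 37.0."""
    return (fold_change - 1.0) * 100.0

# Hypothetical coefficient and standard error, for illustration only.
fc, ci_low, ci_high = log_estimate_to_fold_change(beta=0.30, se=0.08)
print(
    f"{as_percent_difference(fc):.0f}% "
    f"(95% CI {as_percent_difference(ci_low):.0f}, "
    f"{as_percent_difference(ci_high):.0f}%)"
)
```

Because the back-transformation is non-linear, confidence intervals that are symmetric on the log scale become asymmetric around the point estimate on the fold-change scale.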
We used linear methods (Poisson regression with bootstrapping, ‘boot-Poisson’) and non-linear methods (survival random forest model, ‘surv-RF’) to rank the biomarkers according to their predictive value for diabetes diagnosis. Biomarker panels were created based on the rankings from the two methods. For each method, 41 panels were created. Panel 0 serves as a base model consisting only of the covariates age, sex, BMI, smoking status, parental history of diabetes, region, time of day when sample was obtained and seasonality. Panels 1–40 include the top 40 variables ranked based on feature importance, where panel 1 includes covariates and the top biomarker, panel 2 includes covariates and the top two biomarkers, etc., and panel 40 includes covariates and the top 40 biomarkers.
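The nested panel structure described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the study's code: the function name is hypothetical, and the short biomarker ranking is a placeholder rather than the ranking obtained in the analysis.

```python
def build_panels(covariates, ranked_biomarkers, max_size=40):
    """Return nested panels keyed by panel number.

    Panel 0 holds the covariates only; panel k holds the covariates plus
    the k top-ranked biomarkers, up to max_size biomarkers.
    """
    top = list(ranked_biomarkers[:max_size])
    return {k: list(covariates) + top[:k] for k in range(len(top) + 1)}

# Covariates of the base model as listed in the text.
covariates = [
    "age", "sex", "BMI", "smoking status", "parental history of diabetes",
    "region", "time of day when sample was obtained", "seasonality",
]

# Placeholder ranking for illustration only (not the study's ranking).
ranked = ["biomarker_A", "biomarker_B", "biomarker_C", "biomarker_D"]

panels = build_panels(covariates, ranked)
for number, variables in panels.items():
    print(number, len(variables))
```

With a ranking of 40 or more biomarkers, the function returns 41 panels (panel 0 to panel 40), matching the count given in the text.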
To assess the time-dependent predictive value of the biomarker panels identified above, we assessed the effect of biomarkers on the time-dependent risk of a diabetes diagnosis. We performed Poisson regression analysis using the glm function from the stats R package with log(time to end of follow-up) as an offset. Each model was run on 500 bootstrapped datasets containing approximately two-thirds of the individuals with available data across biomarker data types. All models were assessed for prediction accuracy and calibration using 3-, 5- and 10-year areas under the receiver operating characteristic curve (AUROC), Brier score and Matthews correlation coefficient (MCC, cumulative risk cut-off = 0.5) for the first (earliest) sample using the one-third holdout sample. The three metrics were summarised as median (2.5th percentile, 97.5th percentile). To account for the unequal distribution of individuals with incident diabetes and individuals without diabetes, each holdout sample was down-sampled to a 50:50 case–control ratio with an equal number of participants in each sample.
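The three evaluation metrics and the balanced down-sampling of the holdout sample can be illustrated with a minimal, standard-library Python sketch. It is not the study's implementation: the function names, the random seed and the example data are hypothetical, and treating a predicted risk equal to the cut-off as a positive prediction is an assumption.

```python
import math
import random

def auroc(y_true, scores):
    """AUROC as the probability that a case scores higher than a control (ties count 0.5)."""
    cases = [s for y, s in zip(y_true, scores) if y == 1]
    controls = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

def brier_score(y_true, risks):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((r - y) ** 2 for y, r in zip(y_true, risks)) / len(y_true)

def mcc(y_true, risks, cutoff=0.5):
    """Matthews correlation coefficient after dichotomising risk at the cut-off."""
    pred = [1 if r >= cutoff else 0 for r in risks]  # assumption: >= counts as positive
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def downsample_balanced(y_true, risks, seed=0):
    """Randomly down-sample the larger class to obtain a 50:50 case-control ratio."""
    rng = random.Random(seed)
    cases = [i for i, y in enumerate(y_true) if y == 1]
    controls = [i for i, y in enumerate(y_true) if y == 0]
    n = min(len(cases), len(controls))
    keep = sorted(rng.sample(cases, n) + rng.sample(controls, n))
    return [y_true[i] for i in keep], [risks[i] for i in keep]

# Hypothetical holdout outcomes (1 = incident diabetes) and predicted cumulative risks.
outcomes = [1, 1, 1, 0, 0, 0, 0]
risks = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_bal, r_bal = downsample_balanced(outcomes, risks)
print(auroc(y_bal, r_bal), brier_score(y_bal, r_bal), mcc(y_bal, r_bal))
```

Balancing the holdout sample before scoring matters mostly for the Brier score and the MCC, which depend on class prevalence; the AUROC is a rank-based measure and is less sensitive to it.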
All p values were adjusted using a false discovery rate (FDR) of 5% for diabetes diagnosis and diabetes type separately.
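The text does not name the FDR procedure; assuming the commonly used Benjamini-Hochberg method, the adjustment can be sketched as below. The function name and the example p values are hypothetical, and the sketch is in Python rather than the R used for the study's analyses.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest, so that each adjusted
    # value is the minimum of p * m / rank over its own and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p values for two test families, adjusted separately as in the text.
p_diagnosis = [0.01, 0.04, 0.03, 0.005]
p_type = [0.20, 0.001, 0.049]

adj_diagnosis = benjamini_hochberg(p_diagnosis)
adj_type = benjamini_hochberg(p_type)
significant = [p < 0.05 for p in adj_diagnosis]
print(adj_diagnosis, adj_type, significant)
```

Adjusting the two families separately means that the number of tests m is counted within each family, so a result in one family is not penalised by the number of tests in the other.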
Results
We identified 324 blood donors who developed diabetes within the study period from 2006 to 2016 (incident diabetes group) and 359 blood donors without diabetes (non-diabetes group). As expected, individuals with incident diabetes had higher median BMI at baseline and a parental history of diabetes was more common than among individuals without diabetes (Table 1). The majority of individuals with incident diabetes (n=301, 93%) were identified as having type 2 diabetes, while 23 individuals (7%) were identified as having non-childhood-onset type 1 diabetes.
We measured the concentration of 225 biomarkers (54 proteins, 73 metabolites and 98 lipoprotein particles) in three plasma samples from 659 individuals and two plasma samples from 24 individuals, obtained during a follow-up period of up to 11 years (Fig. 1, Table 2 and ESM Table 2). The majority of samples were obtained between 0 and 8 years before the end of follow-up (ESM Fig. 2). We identified two proteins, nine metabolites and seven lipoprotein particles that showed a significant temporal relationship with diabetes diagnosis, and 18 proteins, 29 metabolites and 27 lipoprotein particles with an absolute difference between the two groups (Fig. 2 and ESM Table 4). Moreover, we found 21 biomarkers that were significantly different between individuals who developed type 1 or type 2 diabetes (ESM Fig. 3).
Fig. 1 Graphical representation of study set-up. Individuals with incident diabetes were selected based on a diabetes register capturing patients with type 1 or type 2 diabetes (DM) in the period 1977–2016. Only individuals with at least one DBDS inclusion sample collected in the period 2010–2016 and two additional plasma samples (archival sample or DBDS inclusion sample) collected in the period 2006–2016 were included as participants. The three plasma samples had to be donated at least 9 months apart to ensure ample time for molecular changes reflecting the development of diabetes to take place
Fig. 2 Biomarker-specific effect estimates for incident diabetes. Effect estimates obtained using mixed-effects models (fold change, FC) are shown for proteins (a), metabolites (a, b) and lipoprotein particles (b). All estimates are shown as point estimates and 95% CIs. Biomarkers with a significant interaction term are indicated by triangles that show the direction of the trend. Significant associations (FDR-adjusted p<0.05) are indicated by filled circles; non-significant estimates are indicated by open circles. bFGF, basic fibroblast growth factor; FA, fatty acids; GM-CSF, granulocyte macrophage colony-stimulating factor; IDL, intermediate-density lipoprotein; IP-10, interferon-induced protein 10; MDC, macrophage-derived chemokine; MUFAs, mono-unsaturated fatty acids; PlGF, placental growth factor; PUFAs, poly-unsaturated fatty acids; SFAs, saturated fatty acids; sVCAM-1, soluble vascular cell adhesion molecule-1; ULDL, ultra low-density lipoprotein; VEGF, vascular endothelial growth factor. The prefixes XS, S, M, L, XL and XXL refer to lipoprotein sizes from extra small to extremely large
Markers of glucose metabolism
We found that plasma glucose increases progressively towards the time of diabetes diagnosis, starting 2 years before diagnosis (Fig. 3a). Insulin and glucagon were 48% and 29% higher, respectively, in the incident diabetes group compared with the non-diabetes group; however, we found no temporal trends for these biomarkers (Fig. 2a). It should be noted that the model check plots for glucose and glucagon did not show an entirely satisfactory model fit, and the estimates should be regarded with caution. In the sub-analysis for diabetes type, we found that glucose levels were significantly lower in the type 1 diabetes group compared with the type 2 diabetes and non-diabetes groups between 5 and 10 years before diagnosis. At 2 years before diagnosis, plasma glucose was significantly higher in the type 1 diabetes group compared with the non-diabetes group, with considerably higher effect estimates than the type 2 diabetes group, although the difference was not significant (ESM Fig. 4). The type 2 diabetes group displayed a similar pattern for plasma glucose as the full diabetes group, with increasing plasma glucose levels at 3 years before diagnosis. Insulin and glucagon levels were lower in the type 1 diabetes group compared with the type 2 diabetes group, and similar to those in the non-diabetes group (ESM Fig. 3).
Fig. 3 Per-year estimated marginal means for temporally changing biomarkers. Estimated marginal means are shown for proteins (a), metabolites (b) and lipoprotein particles (c) that showed a significant interaction between incident diabetes and time to end of follow-up (FDR-adjusted p<0.05) as assessed by ANOVA. Estimated marginal means are shown as point estimates and 95% CIs for the diabetes group (points with error bars) and the non-diabetes group (line with shaded area) for each year before the end of follow-up, i.e. time 0 (diabetes diagnosis for incident diabetes cases and the end of the study period for individuals without diabetes). Values have been z score-normalised to ease visualisation; hence one unit difference corresponds to one SD. The exact estimated marginal means are given in ESM Table 4. Significant estimates (FDR-adjusted p<0.05) are indicated by filled circles; non-significant estimates are indicated by open circles. Conc., concentration; IDL, intermediate-density lipoprotein; PG, phosphoglycerides; TG, triacylglycerol; ULDL, ultra low-density lipoprotein. The prefixes XS, S, M, L, XL and XXL refer to lipoprotein sizes from extra small to extremely large
Growth factors and markers of endothelial function
Of the ten growth factors and biomarkers of endothelial function that were included, five were higher in the incident diabetes group (Fig. 2a). The levels of a further three markers, soluble intercellular adhesion molecule-1 (sICAM-1), tyrosine protein kinase receptor 2 (Tie-2) and fms-related receptor tyrosine kinase 1 (Flt-1), were found to change towards the time of diabetes diagnosis, although the per-year effects for sICAM-1 and Tie-2 were not significant after FDR adjustment of p values (Fig. 3a). sICAM-1 increased 2 years before diagnosis from a relative fold change of 7% (95% CI 1, 14%; padj=0.101) to 12% at year zero (95% CI 2, 23%; padj=0.101) and Tie-2 increased 3 years before diagnosis from 4% (95% CI 0, 9%; padj=0.076) to 10% at year 0 (95% CI 2, 19%; padj=0.061) (Fig. 3a). Moreover, we found 20% and 8% lower concentrations of Flt-1, a soluble vascular endothelial growth factor receptor, at 10 and 5 years before diagnosis, respectively, after which it normalised to that of the non-diabetes group (Fig. 3a). When comparing the two diabetes types, Tie-2 showed higher point estimates for the type 1 diabetes group compared with the type 2 diabetes group, Flt-1 showed opposite trends between the two diabetes types, and sICAM-1 showed no time dependency and was only found to be increased in the type 2 diabetes group (ESM Figs 3 and 4).
Markers of inflammation
Six chemokines, four proinflammatory cytokines and one T cell-derived cytokine showed higher concentrations in the incident diabetes group (Fig. 2a). The largest increases were found for serum amyloid A (SAA) (37%, 95% CI 18, 60%; padj=2.21×10−4) and thymus and activation-regulated chemokine (TARC) (31%, 95% CI 15, 48%; padj=1.03×10−4). Moreover, we found that glycoprotein acetyls, another indicator of inflammation, were higher in the incident diabetes group (5%, 95% CI 2, 8%; padj=4.97×10−4) (Fig. 2a). We found that the chemokine C-C motif chemokine ligand 4 (CCL4) increased towards the time of diabetes diagnosis, starting from a lower level than the non-diabetes group of −44% (95% CI −54, −5%; padj=0.049) at 10 years before diagnosis, to −19% at 5 years before diagnosis (95% CI −30, −6%; padj=0.049), after which the diabetes group did not differ from the non-diabetes group (Fig. 3a). In the sub-analysis for diabetes type, we found that an inflammatory environment was generally present in both diabetes types, although the point estimates were generally lower for the type 1 diabetes group than the type 2 diabetes group, except for human monocyte chemoattractant protein 4 (MCP-4), which had a point estimate of 54% (95% CI 14, 110%) for type 1 diabetes vs 35% (95% CI 16, 57%) for type 2 diabetes (ESM Fig. 3). Lastly, we found that the IL-17A/F heterodimer was a type 1 diabetes-specific temporal marker, increasing from −61% (95% CI −77, −56%; padj=9.30×10−4) at 10 years before diabetes diagnosis to 103% (95% CI 26, 226%; padj=0.007) at year 0, relative to the type 2 diabetes group, with similar estimates relative to the non-diabetes group (ESM Fig. 4a).
Amino acids and lipoproteins
We found that the two branched-chain amino acids (BCAAs), isoleucine and valine, as well as total BCAAs, were higher in the incident diabetes group (Fig. 2a), while glycine decreased towards the time of diabetes diagnosis, starting 4 years before diagnosis, with a 21% lower concentration in the year of diagnosis (95% CI −30, −12%; padj=1.96×10−4) (Fig. 3b). We observed a general trend of higher concentrations of VLDL particles of all sizes and the components of VLDL particles (Fig. 2b), and a significant temporal relationship was observed for HDL-cholesterol and multiple HDL particle features from 10 to 5 years before diagnosis, thereafter decreasing towards the time of diagnosis (Fig. 3b). This trend in non-esterified cholesterol, particle concentration, phospholipids and total lipids appeared to be driven by small HDL (S-HDL) and medium HDL (M-HDL) particles (Fig. 3c). In the sub-analysis for diabetes type, we found no significant differences in isoleucine or glycine between any of the groups, while valine and total BCAAs were only significantly higher in the type 2 diabetes group (ESM Fig. 3). Differences in lipoproteins were only observed for the incident type 2 diabetes cases, while the type 1 diabetes group did not differ from the non-diabetes group (ESM Fig. 3).
Ranking of biomarkers for time-to-event prediction
To rank the measured biomarkers according to their predictive potential in time-to-event models, we used Poisson regression with bootstrapping (‘boot-Poisson’) and survival random forest models (‘surv-RF’). In addition to proteins, metabolites and lipoproteins, we included PRSs for type 1 diabetes, type 2 diabetes and 26 health-related phenotypes (ESM Table 3). We created panels based on the top 40 biomarkers from each biomarker data type (28 for the PRS data) and four combinations of biomarker data types (Fig. 4).
Fig. 4 Parameter importance for the top 40 markers from each molecular dataset. Parameter importance for the top 40 markers for each molecular dataset as assessed by variable importance values from the surv-RF model using 100 trees over 1000 bootstraps and the percentage of models where p<0.1 for the marker estimate calculated from the boot-Poisson model with 1000 bootstraps. The rank within each combination of biomarker data types is shown in the heatmap. Markers are arranged according to molecular type and groups. Groups are coloured to assist distinction between marker groups. Variable importance has been multiplied by 10 to give a range of 0–1. AMI, acute myocardial infarction; bFGF, basic fibroblast growth factor; CAD, coronary artery disease; CKD, chronic kidney disease; FA, fatty acids; GIP, gastric inhibitory polypeptide; GLP-1, glucagon-like peptide-1; IMID, immune-mediated inflammatory diseases; IP-10, interferon-induced protein 10; MDC, macrophage-derived chemokine; MUFAs, mono-unsaturated fatty acids; NAFLD, non-alcoholic fatty liver disease; PlGF, placental growth factor; PUFAs, poly-unsaturated fatty acids; SFAs, saturated fatty acids; T1DM, type 1 diabetes; T2DM, type 2 diabetes; TSLP, thymic stromal lymphopoietin; ULDL, ultra low-density lipoprotein; VEGF, vascular endothelial growth factor. The prefixes XS, S, M, L, XL and XXL refer to lipoprotein sizes from extra small to extremely large
When comparing the 40 top-ranked biomarkers from the 14 model combinations, a few observations stood out. First, plasma glucose had the highest rank in all models where it was included, with the percentage of models ranging from 97 to 100% in the boot-Poisson model and the variable importance ranging from 0.063 to 0.095 in the surv-RF model, which was approximately 4–6 times higher than the variable importance for the second highest marker in any panel (ESM Table 5). Second, a large proportion of the biomarkers identified above as affected by an incident diabetes diagnosis, either temporally or persistently, were included in the panels. For example, the surv-RF 40 top-ranked biomarkers included plasma glucose, Tie-2, CRP, sICAM-1 and glycine, which were all identified as temporally changing biomarkers (Fig. 3). Moreover, the model included IL-1 receptor antagonist, which also increases towards the time of diabetes diagnosis but was excluded from the results above due to poor model fit (ESM Table 4). All six biomarkers were found in the top 12 highest ranks. Of the remaining 34 markers, 22 were found to be significantly persistently higher in the diabetes group compared with the non-diabetes group. Only two non-significant markers had a rank of 28 or higher. Surprisingly, the boot-Poisson top 40 markers only included 16 markers that were shown to be significantly different between the diabetes and non-diabetes groups, four of which temporally changed towards the time of diabetes diagnosis, namely plasma glucose, glycine, Tie-2 and phosphatidylcholines. Moreover, in contrast to the surv-RF method, we did not find a clustering of non-significant markers within low ranks. Third, boot-Poisson panels generally included more PRSs and markers with lower concentrations in the diabetes group than the non-diabetes group, while surv-RF gave higher ranks to proteins and VLDL particles (Fig. 4).
Prediction of diabetes diagnosis
To test the predictive capability of each panel, we modelled the risk of developing diabetes within 3, 5 or 10 years (ESM Fig. 5). For both feature selection methods, the best prediction based on AUROC using all three data types over a 10-year prediction period included 12 biomarkers, both with an AUROC of 0.84 (2.5th, 97.5th percentiles 0.66, 0.96) (ESM Table 6). Addition of 11 biomarkers yielded an increase in AUROC of 0.05 compared with covariates plus plasma glucose (AUROC = 0.79; 0.55, 0.93), a comparable increase to that observed for addition of plasma glucose to the model that only included covariates (AUROC = 0.75; 0.54, 0.91) (ESM Table 7). Boot-Poisson yielded the highest AUROC for 3- and 5-year prediction periods using panel 21 (AUROC = 0.76; 0.56, 0.92) and panel 38 (AUROC = 0.82; 0.62, 0.95), respectively (ESM Table 6).
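The reporting format used above, a median with 2.5th and 97.5th percentiles across bootstrap replicates, can be illustrated with a short Python sketch. The function name and the AUROC values are hypothetical, and the percentile interpolation rule (the "inclusive" method of the standard library) is an assumption; other rules give slightly different bounds for small numbers of replicates.

```python
import statistics

def summarise_bootstrap(values):
    """Return (median, 2.5th percentile, 97.5th percentile) of bootstrap metric values."""
    # quantiles(n=40) returns 39 cut points in steps of 2.5%:
    # the first is the 2.5th percentile and the last is the 97.5th percentile.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return statistics.median(values), cuts[0], cuts[-1]

# Hypothetical AUROC values from a handful of bootstrap replicates.
bootstrap_aurocs = [0.71, 0.78, 0.80, 0.82, 0.84, 0.85, 0.86, 0.88, 0.90, 0.93]
median, lower, upper = summarise_bootstrap(bootstrap_aurocs)
print(f"AUROC = {median:.2f} ({lower:.2f}, {upper:.2f})")
```

The percentile bounds describe the spread of the metric across bootstrap replicates, which is why they can be wide when the holdout samples are small.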
Discussion
Diagnostic biomarkers, primarily HbA1c and fasting glucose, are widely used to identify prediabetic and diabetic individuals [26]. However, their benefit is limited for individuals with prediabetes or early stages of type 2 diabetes [26, 27]. Insulin resistance develops years before changes in fasting glucose levels occur [14], indicating that the current clinical biomarkers do not capture early changes in glucose metabolism. Consequently, there is a need to identify novel biomarkers that provide accurate risk estimates for type 2 diabetes at an early, sub-clinical stage to enable early screening and prevention [4, 5].
To our knowledge, this study is the first to describe the temporal trajectories of growth factors and markers associated with endothelial dysregulation prior to a diabetes diagnosis, although these biomarkers are well known to be associated with microvascular diabetes complications such as retinopathy [28,29,30] and nephropathy [31]. We found that all but two of the measured markers were higher in the diabetes group or increased towards the time of diabetes diagnosis. The majority of these were found to be significantly increased for both types of diabetes. We found that Flt-1, sICAM-1 and Tie-2 increased towards the time of diabetes diagnosis, although the per-year effects were not significant for sICAM-1 and Tie-2 after FDR adjustment. Flt-1 has been found in higher concentrations in individuals with type 2 diabetes, but not in individuals with impaired fasting glucose [32]. sICAM-1 has been found to be associated with the development of both retinopathy [33] and nephropathy [34], and Tie-2 has been shown to be associated with nephropathy [35], but may have a protective role in retinopathy [36]. Both sICAM-1 and Tie-2 were found to increase approximately 2–3 years before diabetes diagnosis. This is in line with previous studies showing that nephropathy and retinopathy are some of the earliest clinical indications of diabetes [37, 38].
We identified panels of 12 biomarkers that improved the prediction of diabetes up to 10 years before a diagnosis compared with a base model that comprised commonly used risk factors (age, BMI, parental history of diabetes and plasma glucose). The best-performing panels included all biomarker data types, i.e. PRSs, proteins and metabolites. The inclusion of biomarkers to assess diabetes risk has the potential to improve screening, which has been shown to have only moderate utility when based on HbA1c alone [39]. Our findings may be used to develop new risk assessment tools for early detection of the development of diabetes using a single non-fasting plasma sample.
Our analysis showed impaired glucose metabolism and dyslipidaemia in the incident diabetes group. The incident diabetes group had increasing plasma glucose, decreasing glycine and elevated insulin and glucagon levels. Glycine is a known marker of insulin sensitivity [40], and it is therefore noteworthy that the change in glycine levels precedes the increase in plasma glucose. Moreover, the incident diabetes group had a high concentration of VLDL particles, decreasing concentrations of HDL particles towards the time of diabetes diagnosis (driven by S-HDL and M-HDL particles), and high triacylglycerol concentrations in all lipoprotein particles. This pattern was exclusive to the type 2 diabetes group, in agreement with previous reports [41, 42]. Moreover, as previously reported for insulin resistance [43] and incident type 2 diabetes [44], HDL particle size decreased towards the time of diabetes diagnosis. The decrease in HDL particles and size occurred concurrently with the decrease in glycine, indicating that HDL-related dyslipidaemia and insulin resistance develop simultaneously in type 2 diabetes.
While type 1 and type 2 diabetes are characterised by distinct pathologies, phenotypic features such as autoimmunity and insulin resistance are seen in both patient populations [17, 18]. Moreover, complications such as retinopathy, nephropathy and neuropathy occur in both diseases, probably due to the damaging effects of prolonged hyperglycaemia [45, 46]. In this study, we estimated the associations between 225 biomarkers and type 1 and type 2 diabetes combined in the main analysis, and for each diabetes type separately in a sub-analysis. Most individuals with incident diabetes were classified as having type 2 diabetes (93%), and thus the results of the main analysis and the sub-analysis did not differ considerably for this group. However, for the individuals with non-childhood-onset type 1 diabetes (7%), we found a largely similar inflammatory profile, except for a type 1 diabetes-specific increase in IL-17A/F. IL-17A/F, together with IL-17A and IL-17F, has been linked to beta cell pathogenesis in type 1 diabetes and multiple autoimmune diseases through activation of the IL-17 receptor A (IL-17RA) [47, 48].
Study limitations
There are several limitations in working with historical biobank samples. First, BMI was only recorded by means of questionnaires, and, in most cases, a BMI value was only available for a single point in time. Adjustment for BMI may therefore not be accurate over large time intervals. Second, for some metabolomics measurements, there was considerable sample degradation affecting the measured levels. We therefore only report fold changes between groups of individuals. Moreover, we have chosen not to present the results for biomarkers with poor model fit in the main text, as estimates from these models may not accurately represent underlying biological differences. Third, as our cohort consists of blood donors with a minimum age of 18 years, all individuals with type 1 diabetes were diagnosed in adulthood. Therefore, our results reflect the molecular changes occurring in adult-onset type 1 diabetes and may not translate to childhood-onset type 1 diabetes. Moreover, as our cohort only includes 23 individuals with type 1 diabetes, we have limited power to detect type 1 diabetes-specific effects and model these temporally. Fourth, as only a few samples were available for the very early time points (more than 7–8 years before the end of follow-up), the estimates for these time points are uncertain and should be interpreted with caution (ESM Fig. 2). Moreover, as the diagnosis date for individuals with diabetes is based on registry data, it does not necessarily reflect the exact time of diabetes onset. Fifth, as the DBDS has strict criteria for inclusion, the cohort may be affected by selection biases such as the healthy donor effect [49]. Hence, some caution should be exercised when translating the presented results to a general, non-donor population.
Conclusion
Our study greatly expands the number of biomarkers that are known to temporally change in the progression of diabetes before diagnosis. The identified biomarkers align with the known pathologies of type 1 diabetes and type 2 diabetes, and show both shared and distinct molecular patterns between the two diabetes types. Our findings are useful for understanding the sequence of pathological changes that occur in the development of diabetes. Moreover, our findings may, upon independent replication, be used to identify clinical biomarkers for the early detection of individuals with an increased risk of developing diabetes.
Abbreviations
- AUROC: Area under the receiver operating curve
- BCAA: Branched-chain amino acid
- CCL4: C-C motif chemokine ligand 4
- CRP: C-reactive protein
- DBDS: Danish Blood Donor Study
- FDR: False discovery rate
- Flt-1: fms-related receptor tyrosine kinase 1
- MCP-4: Monocyte chemoattractant protein 4
- M-HDL: Medium HDL
- PRS: Polygenic risk score
- RF: Random forest
- SAA: Serum amyloid A
- S-HDL: Small HDL
- sICAM-1: Soluble intercellular adhesion molecule-1
- TARC: Thymus and activation-regulated chemokine
- Tie-2: Tyrosine protein kinase receptor 2
References
NCD Risk Factor Collaboration (2016) Worldwide trends in diabetes since 1980: a pooled analysis of 751 population-based studies with 4.4 million participants. Lancet 387(10027):1513–1530. https://doi.org/10.1016/S0140-6736(16)00618-8
United Kingdom Prospective Diabetes Study Group (1995) United Kingdom Prospective Diabetes Study (UKPDS) 13: relative efficacy of randomly allocated diet, sulphonylurea, insulin, or metformin in patients with newly diagnosed non-insulin dependent diabetes followed for three years. BMJ 310(6972):83–88. https://doi.org/10.1136/bmj.310.6972.83
Arcaro G, Cretti A, Balzano S et al (2002) Insulin causes endothelial dysfunction in humans. Circulation 105(5):576–582. https://doi.org/10.1161/hc0502.103333
Kolberg JA, Jørgensen T, Gerwien RW et al (2009) Development of a type 2 diabetes risk model from a panel of serum biomarkers from the Inter99 cohort. Diabetes Care 32(7):1207–1212. https://doi.org/10.2337/DC08-1935
Thorand B, Zierer A, Büyüközkan M et al (2021) A panel of 6 biomarkers significantly improves the prediction of type 2 diabetes in the MONICA/KORA study population. J Clin Endocrinol Metab 106(4):e1647–e1659. https://doi.org/10.1210/clinem/dgaa953
Herder C, Kowall B, Tabak AG, Rathmann W (2014) The potential of novel biomarkers to improve risk prediction of type 2 diabetes. Diabetologia 57(1):16–29. https://doi.org/10.1007/s00125-013-3061-3
Abbasi A, Sahlqvist A-S, Lotta L et al (2016) A systematic review of biomarkers and risk of incident type 2 diabetes: an overview of epidemiological, prediction and aetiological research literature. PLoS One 11(10):e0163721. https://doi.org/10.1371/journal.pone.0163721
Strawbridge RJ, van Zuydam NR (2018) Shared genetic contribution of type 2 diabetes and cardiovascular disease: implications for prognosis and treatment. Curr Diab Rep 18(8):59. https://doi.org/10.1007/s11892-018-1021-5
Hulsegge G, Spijkerman A, Schouw YTVD et al (2017) Trajectories of metabolic risk factors and biochemical markers prior to the onset of type 2 diabetes: the population-based longitudinal Doetinchem study. Nutr Diabetes 7:e270. https://doi.org/10.1038/nutd.2017.23
Færch K, Witte DR, Tabák AG et al (2013) Trajectories of cardiometabolic risk factors before diagnosis of three subtypes of type 2 diabetes: a post-hoc analysis of the longitudinal Whitehall II cohort study. Lancet Diabetes Endocrinol 1(1):43–51. https://doi.org/10.1016/S2213-8587(13)70008-1
Vistisen D, Witte DR, Tabák AG et al (2014) Patterns of obesity development before the diagnosis of type 2 diabetes: the Whitehall II cohort study. PLoS Med 11(2):e1001602. https://doi.org/10.1371/journal.pmed.1001602
Nano J, Dhana K, Asllanaj E et al (2020) Trajectories of BMI before diagnosis of type 2 diabetes: the Rotterdam study. Obesity 28(6):1149–1156. https://doi.org/10.1002/OBY.22802
Tu Z-Z, Yuan Y, Xia P-F et al (2022) Trajectories of metabolic risk factors during the development of type 2 diabetes in Chinese adults. Diabetes Metab 48:101348. https://doi.org/10.1016/j.diabet.2022.101348
Tabák AG, Jokela M, Akbaraly TN, Brunner EJ, Kivimäki M, Witte DR (2009) Trajectories of glycaemia, insulin sensitivity, and insulin secretion before diagnosis of type 2 diabetes: an analysis from the Whitehall II study. Lancet 373(9682):2215–2221. https://doi.org/10.1016/S0140-6736(09)60619-X
Shoelson SE, Lee J, Goldfine AB (2006) Inflammation and insulin resistance. J Clin Invest 116(7):1793. https://doi.org/10.1172/JCI29069
Marques-Vidal P, Schmid R, Bochud M et al (2012) Adipocytokines, hepatic and inflammatory biomarkers and incidence of type 2 diabetes. The CoLaus study. PLoS One 7(12):e51768. https://doi.org/10.1371/journal.pone.0051768
Irvine WJ, McCallum CJ, Gray RS, Duncan LJ (1977) Clinical and pathogenic significance of pancreatic-islet-cell antibodies in diabetics treated with oral hypoglycaemic agents. Lancet 309(8020):1025–1027. https://doi.org/10.1016/s0140-6736(77)91258-2
Subauste A, Gianani R, Chang AM et al (2014) Islet autoimmunity identifies a unique pattern of impaired pancreatic beta-cell function, markedly reduced pancreatic beta cell mass and insulin resistance in clinically diagnosed type 2 diabetes. PLoS One 9(9):e106537. https://doi.org/10.1371/journal.pone.0106537
de Candia P, Prattichizzo F, Garavelli S et al (2019) Type 2 diabetes: how much of an autoimmune disease? Front Endocrinol 10:451. https://doi.org/10.3389/fendo.2019.00451
Carstensen B, Rønn PF, Jørgensen ME (2020) Prevalence, incidence and mortality of type 1 and type 2 diabetes in Denmark 1996–2016. BMJ Open Diabetes Res Care 8(1):e001071. https://doi.org/10.1136/BMJDRC-2019-001071
Hansen TF, Banasik K, Erikstrup C et al (2019) DBDS Genomic Cohort, a prospective and comprehensive resource for integrative and temporal analysis of genetic, environmental and lifestyle factors affecting health of blood donors. BMJ Open 9(6):e028401. https://doi.org/10.1136/BMJOPEN-2018-028401
Burgdorf KS, Simonsen J, Sundby A et al (2017) Socio-demographic characteristics of Danish blood donors. PLoS One 12(2):e0169112. https://doi.org/10.1371/journal.pone.0169112
Privé F, Arbel J, Vilhjálmsson BJ (2020) LDpred2: better, faster, stronger. Bioinformatics 36(22–23):5424–5431. https://doi.org/10.1093/bioinformatics/btaa1029
van Buuren S (2007) Multiple imputation of discrete and continuous data by fully conditional specification. Stat Methods Med Res 16(3):219–242. https://doi.org/10.1177/0962280206074463
Porta M, Curletto G, Cipullo D et al (2014) Estimating the delay between onset and diagnosis of type 2 diabetes from the time course of retinopathy prevalence. Diabetes Care 37(6):1668–1674. https://doi.org/10.2337/dc13-2101
Gallagher EJ, Le Roith D, Bloomgarden Z (2009) Review of hemoglobin A1c in the management of diabetes. J Diabetes 1(1):9–17. https://doi.org/10.1111/j.1753-0407.2009.00009.x
Waugh NR, Shyangdan D, Taylor-Phillips S, Suri G, Hall B (2013) Screening for type 2 diabetes: a short report for the National Screening Committee. Health Technol Assess 17(35):1–90. https://doi.org/10.3310/hta17350
Aiello LP, Avery RL, Arrigg PG et al (1994) Vascular endothelial growth factor in ocular fluid of patients with diabetic retinopathy and other retinal disorders. N Engl J Med 331(22):1480–1487. https://doi.org/10.1056/NEJM199412013312203
Kowalczuk L, Touchard E, Omri S et al (2011) Placental growth factor contributes to micro-vascular abnormalization and blood–retinal barrier breakdown in diabetic retinopathy. PLoS One 6(3):e17462. https://doi.org/10.1371/journal.pone.0017462
Grant M, Russell B, Fitzgerald C, Merimee TJ (1986) Insulin-like growth factors in vitreous: studies in control and diabetic subjects with neovascularization. Diabetes 35(4):416–420. https://doi.org/10.2337/diab.35.4.416
Kumar PA, Brosius FC, Menon RK (2011) The glomerular podocyte as a target of growth hormone action: implications for the pathogenesis of diabetic nephropathy. Curr Diabetes Rev 7(1):50–55. https://doi.org/10.2174/157339911794273900
Nandy D, Mukhopadhyay D, Basu A (2010) Both vascular endothelial growth factor and soluble Flt-1 are increased in type 2 diabetes but not in impaired fasting glucose. J Investig Med 58(6):804–806. https://doi.org/10.231/JIM.0b013e3181e96203
Yao Y, Du J, Li R et al (2019) Association between ICAM-1 level and diabetic retinopathy: a review and meta-analysis. Postgrad Med J 95(1121):162–168. https://doi.org/10.1136/POSTGRADMEDJ-2018-136102
Chow FY, Nikolic-Paterson DJ, Ozols E, Atkins RC, Tesch GH (2005) Intercellular adhesion molecule-1 deficiency is protective against nephropathy in type 2 diabetic db/db mice. J Am Soc Nephrol 16(6):1711–1722. https://doi.org/10.1681/ASN.2004070612
Jiang L, Hu X, Feng Y et al (2024) Reduction of renal interstitial fibrosis by targeting Tie2 in vascular endothelial cells. Pediatr Res 95(4):959–965. https://doi.org/10.1038/s41390-023-02893-8
Campochiaro PA, Peters KG (2016) Targeting Tie2 for treatment of diabetic retinopathy and diabetic macular edema. Curr Diab Rep 16(12):126. https://doi.org/10.1007/s11892-016-0816-5
Tapp RJ, Tikellis G, Wong TY et al (2008) Longitudinal association of glucose metabolism with retinopathy: results from the Australian Diabetes Obesity and Lifestyle (AusDiab) study. Diabetes Care 31(7):1349–1354. https://doi.org/10.2337/dc07-1707
Plantinga LC, Crews DC, Coresh J et al (2010) Prevalence of chronic kidney disease in US adults with undiagnosed diabetes or prediabetes. Clin J Am Soc Nephrol 5(4):673–682. https://doi.org/10.2215/CJN.07891109
US Preventive Services Task Force (2021) Screening for prediabetes and type 2 diabetes: US Preventive Services Task Force recommendation statement. JAMA 326(8):736–743. https://doi.org/10.1001/jama.2021.12531
Takashina C, Tsujino I, Watanabe T et al (2016) Associations among the plasma amino acid profile, obesity, and glucose metabolism in Japanese adults with normal glucose tolerance. Nutr Metab 13(1):5. https://doi.org/10.1186/s12986-015-0059-5
Winocour PH, Ishola M, Durrington PN, Anderson DC (1986) Lipoprotein abnormalities in insulin-dependent diabetes mellitus. Lancet 327(8491):1176–1178. https://doi.org/10.1016/S0140-6736(86)91159-1
Ahola-Olli AV, Mustelin L, Kalimeri M et al (2019) Circulating metabolites and the risk of type 2 diabetes: a prospective study of 11,896 young adults from four Finnish cohorts. Diabetologia 62(12):2298–2309. https://doi.org/10.1007/s00125-019-05001-w
Festa A, Williams K, Hanley AJG et al (2005) Nuclear magnetic resonance lipoprotein abnormalities in prediabetic subjects in the Insulin Resistance Atherosclerosis Study. Circulation 111(25):3465–3472. https://doi.org/10.1161/CIRCULATIONAHA.104.512079
Sokooti S, Flores-Guerrero JL, Kieneker LM et al (2021) HDL particle subspecies and their association with incident type 2 diabetes: the PREVEND study. J Clin Endocrinol Metab 106(6):1761–1772. https://doi.org/10.1210/clinem/dgab075
Gorst C, Kwok CS, Aslam S et al (2015) Long-term glycemic variability and risk of adverse outcomes: a systematic review and meta-analysis. Diabetes Care 38(12):2354–2369. https://doi.org/10.2337/dc15-1188
Chen J, Yi Q, Wang Y et al (2022) Long-term glycemic variability and risk of adverse health outcomes in patients with diabetes: a systematic review and meta-analysis of cohort studies. Diabetes Res Clin Pract 192:110085. https://doi.org/10.1016/j.diabres.2022.110085
Qiu A-W, Cao X, Zhang W-W, Liu Q-H (2021) IL-17A is involved in diabetic inflammatory pathogenesis by its receptor IL-17RA. Exp Biol Med 246(1):57–65. https://doi.org/10.1177/1535370220956943
Crawford MP, Sinha S, Renavikar PS, Borcherding N, Karandikar NJ (2020) CD4 T cell-intrinsic role for the T helper 17 signature cytokine IL-17: effector resistance to immune suppression. Proc Natl Acad Sci USA 117(32):19408–19414. https://doi.org/10.1073/pnas.2005010117
Brodersen T, Rostgaard K, Lau CJ et al (2023) The healthy donor effect and survey participation, becoming a donor and donor career. Transfusion 63(1):143–155. https://doi.org/10.1111/trf.17190
Ethics declarations
Data availability
Data cannot be publicly shared but may be made available upon request. Requests should be directed to the corresponding author. Approval is contingent on adherence to Danish law and may be subject to restrictions.
Funding
Open access funding provided by Copenhagen University. ATL, DW, TR, KB and SB were supported by the Novo Nordisk Foundation (grants NNF14CC0001 and NNF17OC0027594). BDK was supported by the Novo Nordisk Foundation Challenge Programme (grant NNF17OC0027864). LHM acknowledges support from the Novo Nordisk Foundation (grants NNF17OC0027812 and NNF17OC0027594). DV has received research grants from Bayer A/S, Sanofi A/S, Novo Nordisk A/S, and Boehringer Ingelheim; all fees were given to the Steno Diabetes Center Copenhagen. PR reports receipt of honoraria to Steno Diabetes Center Copenhagen for education and consultancy from Abbott, AstraZeneca, Bayer A/S, Boehringer Ingelheim, Lexicon Inc, Novartis and Novo Nordisk A/S.
Authors’ relationships and activities
DV holds shares in Novo Nordisk A/S and is employed at Novo Nordisk A/S (current affiliation). SB holds shares in Intomics A/S, Hoba Therapeutics Aps, Novo Nordisk A/S and Lundbeck A/S, and managing board memberships in Proscion A/S and Intomics A/S. The remaining authors declare that there are no relationships or activities that might bias, or be perceived to bias, their work.
Contribution statement
ATL, DV, DW, HU, KB, LHM and SB conceived and designed the study. ATL, BDK, the DBDS Genomic Consortium, LH, LHM, MHL and TR carried out the analysis. ATL, DV, KB, TR and SB drafted the manuscript. SB is responsible for the integrity of the work as a whole. All authors acquired and interpreted the data, critically revised the paper, and had final responsibility for the decision to submit for publication.
Additional information
A list of the DBDS Genomic Consortium members is provided in the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lundgaard, A.T., Westergaard, D., Röder, T. et al. Longitudinal metabolite and protein trajectories prior to diabetes mellitus diagnosis in Danish blood donors: a nested case–control study. Diabetologia (2024). https://doi.org/10.1007/s00125-024-06231-3
DOI: https://doi.org/10.1007/s00125-024-06231-3