In this study of 2939 middle-aged, black and white men and women, we identified seven named compounds that were independently associated with the development of diabetes over 20 years of follow-up after accounting for sociodemographics, diabetes risk factors and fasting glucose levels. These seven metabolites (isoleucine, leucine, valine, asparagine, 3-(4-hydroxyphenyl)lactate, trehalose and erythritol) improved the prediction of diabetes beyond established diabetes risk factors and fasting glucose. These metabolites represented three distinct categories of metabolic pathways, i.e. amino acids, carbohydrates and a xenobiotic (food additive). In models that were not adjusted for fasting glucose, 47 serum metabolites were significantly associated with diabetes, representing a wide variety of metabolic pathways and suggesting that diabetes is a state of substantial metabolic disruption. The compounds that were detected by our metabolomic platform and found to be associated with incident diabetes consisted of established markers of diabetes, including glucose, and compounds consumed by individuals with diabetes, including erythritol, thereby providing proof of concept for this untargeted metabolomic approach. Novel markers of diabetes were also identified, including branched chain amino acids, asparagine, trehalose and 3-(4-hydroxyphenyl)lactate, which point to potential mechanisms of diabetes development.
Our study findings are consistent with current knowledge about diabetes [6]. In models that were not adjusted for fasting glucose, the metabolite with the greatest magnitude of association with incident diabetes was, as expected, glucose. The concentration of glucose in the blood is the most widely used biomarker to screen for and diagnose diabetes [15]. In models that adjusted for fasting glucose, trehalose was the only compound representative of carbohydrate metabolism that remained significantly associated with incident diabetes. Trehalose is a disaccharide of two glucose molecules, which is added to food and other manufactured products to prevent dehydration and protein denaturation [16, 17]. In a prior analysis among black participants in the ARIC study, this serum metabolite was reported to be significantly associated with the TREH genetic variant as well as incident diabetes [18]. Individuals who were at risk of developing diabetes had elevated serum levels of glucose and related compounds involved in carbohydrate metabolism, even after excluding participants with diabetes at baseline.
This untargeted metabolomic profile also included xenobiotics or exogenous substances, such as food components and drugs. Erythritol was significantly associated with incident diabetes in the fully adjusted model, which probably reflects a higher consumption of this compound among individuals with a higher risk of developing diabetes. Specifically, erythritol is a low-calorie sweetener that is added to food as a substitute for simple sugars since it has little to no impact on blood levels of insulin and glucose [19, 20]. Erythritol was previously detected by a metabolomic profile and found to be associated with diabetes in a case–control study of 100 participants nested within the KORA (Cooperative Health Research in the Region of Augsburg) study and with elevated glucose in the TwinsUK cohort consisting of 2204 women [21, 22].
The class of metabolites with the most significant hits for the association with diabetes was amino acids. It is noteworthy that higher serum levels of all of the branched chain amino acids (leucine, isoleucine and valine) were associated with an increased risk of diabetes. Even after adjustment for baseline glucose, the branched chain amino acids remained statistically significantly associated with incident diabetes. In a meta-analysis of eight prospective studies with metabolomic profiling, branched chain amino acids were consistently and significantly associated with diabetes and other measures of impaired glucose metabolism [6, 7, 23–29]. However, the aetiology of risk of diabetes mediated by branched chain amino acids has yet to be determined. One purported mechanism is that leucine activates mTORC-1 (mammalian target of rapamycin complex-1) and S6K1 (ribosomal protein S6 kinase), leading to serine phosphorylation of IRS-1 and IRS-2, which results in insulin resistance [30]. Another theory is that the metabolism of branched chain amino acids leads to an accumulation of toxic intermediates, beta cell mitochondrial dysfunction and insulin resistance [31, 32].
In addition to the three branched chain amino acids, we identified two other amino acid-related metabolites that were significantly associated with incident diabetes, i.e. 3-(4-hydroxyphenyl)lactate and asparagine. The metabolite 3-(4-hydroxyphenyl)lactate is a byproduct of the degradation of tyrosine, an aromatic amino acid [33]. Whereas the aromatic amino acids tyrosine and phenylalanine have been consistently associated with diabetes risk in a meta-analysis of prospective studies with metabolomic profiling, 3-(4-hydroxyphenyl)lactate has not previously been identified as a compound of interest [6]. Tyrosine is considered to be both glucogenic and ketogenic in that the catabolism of tyrosine yields fumarate, which is an intermediate of the tricarboxylic acid (TCA) cycle, and acetoacetate, which can be used to synthesise ketone bodies. The process of converting amino acid degradation products to glucose is stimulated by a high blood glucagon to insulin ratio, such as in the setting of untreated diabetes. The metabolite 3-(4-hydroxyphenyl)lactate acts as an antioxidant by decreasing the production of reactive oxygen species, which are present during states of oxidative stress, for example among individuals at risk of developing diabetes [34, 35].
Asparagine, an amino acid, was the sole metabolite in our study that had an inverse association with diabetes risk. Similar to tyrosine, asparagine is a glucogenic amino acid because oxaloacetate, a byproduct of asparagine catabolism, can be used in the TCA cycle to synthesise glucose. Asparagine is readily converted to aspartate and then undergoes transamination to form glutamate. Glutamate, along with glycine and cysteine, is a constituent of the tripeptide glutathione, which is a major antioxidant and thus protects against chronic diseases [36, 37]. Higher blood levels of glutamine and glutamate have consistently been shown to be associated with a lower risk of diabetes in a meta-analysis of prospective metabolomic research studies [6]. Asparagine was reported as being significantly associated with insulin and HOMA, but not glucose, in the Framingham Heart Study [28]. No known metabolomics studies have previously identified asparagine as an independent predictor of incident diabetes.
Some study limitations should be considered in the interpretation of our results. Using a discovery approach to comprehensively detect a broad spectrum of diabetes biomarkers, we obtained relative measures of serum metabolites. Subsequent research using targeted assays will be needed to quantify absolute levels of promising new markers of diabetes risk. Metabolomic profiling was conducted using specimens that had been in storage for over 20 years. Degradation of metabolites over time would be expected to be non-differential by incident diabetes case status. Furthermore, we found that the correlation between glucose measured with metabolomic profiling and glucose measured using the standard clinical chemistry method was high (>0.9). As with any observational study, the reported associations could, in part, be explained by residual confounding. However, we were able to account for multiple covariates that are established risk factors for diabetes in multivariable regression models. The increase in the C statistic, as a measure of diabetes risk prediction, was small but statistically significant when the seven metabolites were added to established risk factors. Nonetheless, these metabolites may represent metabolic pathways that would be worthwhile pursuing in future research.
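To make the C statistic comparison concrete, the following is a minimal sketch of how concordance can be computed for a binary outcome and compared between a baseline and an extended risk model. It is an illustration only: the function name `c_statistic` and the toy scores are hypothetical, the sketch ignores follow-up time and censoring, and it is not a description of the method used in this study.

```python
def c_statistic(risk_scores, outcomes):
    """Concordance for a binary outcome: the proportion of case/non-case
    pairs in which the case has the higher predicted risk (ties count 0.5).
    Simplified sketch: follow-up time and censoring are ignored."""
    concordant = 0.0
    pairs = 0
    for score_case, y_case in zip(risk_scores, outcomes):
        if y_case != 1:
            continue
        for score_ctrl, y_ctrl in zip(risk_scores, outcomes):
            if y_ctrl != 0:
                continue
            pairs += 1
            if score_case > score_ctrl:
                concordant += 1.0
            elif score_case == score_ctrl:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one case and one non-case")
    return concordant / pairs


# Hypothetical toy data (not study results): 1 = incident diabetes.
outcomes = [1, 1, 0, 0, 0]
baseline_scores = [0.6, 0.3, 0.4, 0.2, 0.1]  # established risk factors only
extended_scores = [0.7, 0.5, 0.4, 0.2, 0.1]  # risk factors plus metabolites

c_base = c_statistic(baseline_scores, outcomes)
c_ext = c_statistic(extended_scores, outcomes)
delta_c = c_ext - c_base  # improvement in discrimination
```

In a real analysis, the difference between the two C statistics would be reported with a confidence interval or a formal test, and a time-to-event version of the statistic would normally be preferred for long follow-up.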
There are several strengths of the present study that deserve mention. Compared with other metabolomics studies, the present study was conducted with a relatively large sample of 2939 study participants, with a substantial number of individuals with incident diabetes identified over an extended follow-up period of over 20 years. The prospective analysis allowed for the characterisation of metabolic disturbances apparent among those individuals at risk of subsequently developing diabetes. Our study included both black and white men and women from four communities in the USA, allowing for broad generalisability. Nonetheless, replication of these results will be necessary in similarly diverse study populations. In addition, we conducted a comprehensive and unbiased examination of the serum metabolomic profile using a leading metabolomics platform providing coverage of known pathways of carbohydrate metabolism and maximising the opportunity for the discovery of new diabetes biomarkers. Finally, we employed a conservative approach to account for multiple testing, i.e. Bonferroni correction, in order to reduce the likelihood of false-positive results. Given that some of the metabolites are correlated with each other, the use of the Bonferroni correction was probably an overly conservative approach and may have resulted in some false-negative results (true associations that we have not detected as statistically significant).
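The Bonferroni correction mentioned above can be sketched as follows: the overall significance level is divided by the number of tests, and each p value is compared against that stricter threshold. The function names, the significance level of 0.05 and the example p values are assumptions for illustration; they are not taken from this study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p value as significant if it is below alpha / number of tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p values, for illustration only (not study results).
example_p = [0.00001, 0.0004, 0.03, 0.2]
flags = bonferroni_significant(example_p)  # threshold here is 0.05 / 4 = 0.0125
```

Because the threshold shrinks in proportion to the number of tests regardless of how correlated those tests are, the procedure becomes conservative when many metabolites move together, which is the source of the possible false-negative results noted above.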
In conclusion, we identified seven serum metabolites that were independently associated with and improved the prediction of incident diabetes after accounting for sociodemographic factors, study design features, established risk factors for diabetes and fasting glucose: isoleucine, leucine, valine, asparagine, 3-(4-hydroxyphenyl)lactate, trehalose and erythritol. These metabolites may be useful as a panel of biomarkers to assess future risk of diabetes. This study provides clues to the early metabolic features associated with future development of diabetes in middle-aged adults, which may inform strategies for the prevention and individualised treatment of diabetes. Future research is warranted to precisely quantify these biomarkers and determine their role in diabetes pathophysiology.