Diabetologia, Volume 57, Issue 1, pp 16–29

The potential of novel biomarkers to improve risk prediction of type 2 diabetes

  • Christian Herder
  • Bernd Kowall
  • Adam G. Tabak
  • Wolfgang Rathmann
Review

DOI: 10.1007/s00125-013-3061-3

Cite this article as:
Herder, C., Kowall, B., Tabak, A.G. et al. Diabetologia (2014) 57: 16. doi:10.1007/s00125-013-3061-3

Abstract

The incidence of type 2 diabetes can be reduced substantially by implementing preventive measures in high-risk individuals, but this requires prior knowledge of disease risk in the individual. Various diabetes risk models have been designed, and these have all included a similar combination of factors, such as age, sex, obesity, hypertension, lifestyle factors, family history of diabetes and metabolic traits. The accuracy of prediction models is often assessed by the area under the receiver operating characteristic curve (AROC) as a measure of discrimination, but AROCs should be complemented by measures of calibration and reclassification to estimate the incremental value of novel biomarkers. This review discusses the potential of novel biomarkers to improve model accuracy. The range of molecules that serve as potential predictors of type 2 diabetes includes genetic variants, RNA transcripts, peptides and proteins, lipids and small metabolites. Some of these biomarkers lead to a statistically significant increase of model accuracy, but their incremental value currently seems too small for routine clinical use. However, only a fraction of potentially relevant biomarkers have been assessed with regard to their predictive value. Moreover, serial measurements of biomarkers may help determine individual risk. In conclusion, current risk models provide valuable tools of risk estimation, but perform suboptimally in the prediction of individual diabetes risk. Novel biomarkers still fail to have a clinically applicable impact. However, more efficient use of biomarker data and technological advances in their measurement in clinical settings may allow the development of more accurate predictive models in the future.

Keywords

Biomarkers, Calibration, Discrimination, Genetic variants, Metabolomics, Prediction models, Reclassification, Repeated measures, Review, Risk scores, Type 2 diabetes

Abbreviations

AROC: Area under the receiver operating characteristic curve
IDI: Integrated discrimination improvement
IFG: Impaired fasting glucose
IGT: Impaired glucose tolerance
KORA: Cooperative Health Research in the Region of Augsburg
miRNA: MicroRNA
NRI: Net reclassification improvement
SNP: Single nucleotide polymorphism

Introduction

Non-pharmacological and pharmacological interventions are able to decrease the incidence of type 2 diabetes in high-risk individuals. The ultimate aim of these interventions is the prevention or the delay of the onset of diabetes-related macro- and microvascular complications that often lead to considerable morbidity and premature death, but a considerable number of individuals who could benefit from such interventions are not aware of their disease risk. Numerous prognostic models and scores for type 2 diabetes have been developed [1–3] based on known risk factors, including age, sex, obesity, metabolic and lifestyle factors, family history of diabetes or ethnic background. Given that the performance of these risk scores is often far from perfect, it is desirable to identify novel prognostic factors, such as biomarkers from ‘-omics’ technologies, with the aim of achieving better model accuracy.

The main purpose of this review is to appraise the potential of novel biomarkers to improve risk prediction for type 2 diabetes. To achieve this, we will proceed in four steps. First, we will critically discuss statistical methods to compare risk scores without and with biomarkers and to quantify any potential improvement. Second, we will very briefly summarise contemporary methodology to assess the performance of diabetes risk scores based on established risk factors. Third, we will provide an overview of novel biomarkers that have either been investigated in the context of risk prediction or will soon be available for such analyses. Fourth, we will suggest approaches to make more efficient use of biomarkers, and discuss limitations which we should be aware of in our search for the optimal risk model for incident type 2 diabetes.

Methodological issues involved in the assessment and comparison of the performance of risk models using different biomarkers

As reviewed recently [4–6], it is important to consider methodological issues in the development of risk prediction models. Before the incremental value of novel biomarkers for diabetes prediction can be evaluated, several specific issues related to model performance deserve attention and will be briefly summarised.

Discrimination measures

First, suitable measures of discrimination must be selected. How well a risk model identifies those who will develop a disease over the follow-up time in a cohort study is defined as discrimination. Three common measures of discrimination are explained in the text box [7–10]. The most popular measure of discrimination is the area under the receiver operating characteristic curve (AROC) or c statistic. The interpretation of AROCs might not be straightforward. For example, an AROC of 0.8 does not mean that 80% of persons who will develop diabetes are actually identified but, rather, that the likelihood is 80% that a randomly selected case (i.e. a person who will develop diabetes) will be assigned a higher estimated diabetes risk than a randomly selected non-case (i.e. a person who will remain diabetes-free). AROCs only use rank information and are quite insensitive to the addition of even strong risk predictors to an established model with a reasonable predictive ability [7, 8].
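The pairwise interpretation of the AROC can be made concrete with a minimal Python sketch. The function name `aroc` and the predicted risks below are invented for illustration and do not come from any cohort; counting ties as half a concordant pair is an assumed (though common) convention.

```python
from itertools import product


def aroc(risks_cases, risks_noncases):
    """Share of case/non-case pairs in which the case has the higher predicted risk.

    Ties are counted as half a concordant pair (assumed convention).
    """
    concordant = 0.0
    for risk_case, risk_noncase in product(risks_cases, risks_noncases):
        if risk_case > risk_noncase:
            concordant += 1.0
        elif risk_case == risk_noncase:
            concordant += 0.5
    return concordant / (len(risks_cases) * len(risks_noncases))


# Hypothetical predicted risks for three cases and four non-cases.
cases = [0.30, 0.12, 0.08]
noncases = [0.05, 0.10, 0.02, 0.20]

auc = aroc(cases, noncases)

# A monotone transformation of the risks (here: squaring) keeps all ranks
# and therefore leaves the AROC unchanged.
auc_squared = aroc([r ** 2 for r in cases], [r ** 2 for r in noncases])
```

In this toy example, 9 of the 12 case/non-case pairs are concordant, so the AROC is 0.75. Squaring every predicted risk does not change the result, which illustrates that the AROC depends on ranks only and not on the absolute size of the estimated risks.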

Measures of discrimination for prediction models

Area under the receiver operating characteristic curve (AROC, c statistic) [7, 8]

  Explanation: Among all the pairs in the cohort consisting of one participant with and one without incident diabetes, the AROC is the proportion of pairs for which the probability of getting the disease as estimated by the prediction model is larger for the case than for the non-case. Equivalently, it is the area under a plot of sensitivity (true-positive rate) vs 1 − specificity (false-positive rate).

  Interpretation: An AROC of 0.5 means that the prediction model is no better than tossing a coin (in other words: there is one true positive case for each false positive case). An AROC of 0.9 is excellent. An AROC of 1 is the maximum.

  Advantages (+) and disadvantages (−):
  (+) AROCs are routinely provided by statistical packages for logistic regression models.
  (+) To determine whether increases in AROCs upon addition of new markers to a model are statistically significant, specific tests are available [9].
  (−) AROCs only make use of rank information.
  (−) They are quite insensitive to the addition of new markers to established models.
  (+/−) They cover the whole range of cut-off values.

Net reclassification improvement (NRI) [10]

  Explanation: NRIs require setting up categories of diabetes risk (e.g. 0–<5%, 5–<10%, 10–<20%, ≥20%). If a given model is compared with a new model that includes an additional new marker, NRI is calculated as follows:
      + probability of classification in a higher risk category for cases
      − probability of classification in a lower risk category for cases
      + probability of classification in a lower risk category for non-cases
      − probability of classification in a higher risk category for non-cases.

  Interpretation: An NRI of 0 means no predictive improvement by the new marker. A large NRI indicates that a high proportion of individuals moves into a more appropriate category of predicted risk.

  Advantages (+) and disadvantages (−):
  (+) NRIs may indicate changes in risk category when changes in AROC are minimal.
  (+) Tests for the null hypothesis, NRI = 0, are available.
  (−) NRIs only reflect changes in predictive ability. They cannot be used to characterise the predictive ability of a given model per se.
  (−) NRIs strongly depend on the number of risk categories and their cut-off points.
  (+) In the absence of established risk categories, category-free NRIs can be used for comparison purposes.

Integrated discrimination improvement (IDI) [10]

  Explanation: IDIs represent a continuous version of NRI.

  Interpretation: An IDI of 0 means no predictive improvement.

  Advantages (+) and disadvantages (−):
  (+) As for NRI, IDI can reveal changes in disease risk when changes in AROC are minimal.
  (+) Fixing risk categories is not necessary.
  (−) Like NRIs, IDIs only reflect changes in predictive ability.

When modifying risk scores, it might be most important to improve the risk stratification of individuals thought to be at intermediate risk (i.e. those with a diabetes risk in the range of 5–15%). However, the AROC and the integrated discrimination improvement (IDI) are both continuous measures of discrimination that do not require fixed cut-off points and, thus, an increase in AROC or IDI does not necessarily indicate a better prognosis in persons at intermediate risk, because such an increase could also be due to more accurate prediction in persons with an apparently poor or good prognosis [11].

The net reclassification improvement (NRI) might be more sensitive than AROCs to the incremental predictive power of new markers [7]. Moreover, in calculating NRIs, the improvement of diabetes prediction can be considered separately for cases and non-cases at low, intermediate and high risk of diabetes. However, the NRI also has some caveats [10]. In particular, it is strongly dependent on the number of risk categories and on the cut-off points selected for risk stratification [12]. Thus, for calculating NRIs, Leening and Cook strongly recommend a priori risk classifications with a clinical meaning [13], but with respect to diabetes, there are no standard risk categories yet.
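The categorical NRI can be sketched in a few lines of Python. The cut-off points 5%, 10% and 20% are used here purely as an example of a category scheme; as noted above, there are no standard risk categories for diabetes. All names and predicted risks are hypothetical.

```python
from bisect import bisect_right

# Example category bounds: 0-<5%, 5-<10%, 10-<20%, >=20% (illustrative only).
CUT_OFFS = [0.05, 0.10, 0.20]


def risk_category(risk):
    """Index of the risk category (0 = lowest) for a predicted risk."""
    return bisect_right(CUT_OFFS, risk)


def categorical_nri(old_risks, new_risks, is_case):
    """NRI = [P(up|case) - P(down|case)] + [P(down|non-case) - P(up|non-case)]."""
    up_case = down_case = up_non = down_non = 0
    n_cases = sum(is_case)
    n_noncases = len(is_case) - n_cases
    for old, new, case in zip(old_risks, new_risks, is_case):
        move = risk_category(new) - risk_category(old)
        if case:
            up_case += 1 if move > 0 else 0
            down_case += 1 if move < 0 else 0
        else:
            up_non += 1 if move > 0 else 0
            down_non += 1 if move < 0 else 0
    return (up_case - down_case) / n_cases + (down_non - up_non) / n_noncases


# Hypothetical risks from a basic model and from a model with one extra marker.
old = [0.08, 0.12, 0.04, 0.22, 0.15, 0.07, 0.03, 0.11]
new = [0.12, 0.25, 0.04, 0.15, 0.08, 0.06, 0.03, 0.12]
case = [1, 1, 1, 1, 0, 0, 0, 0]

nri = categorical_nri(old, new, case)
```

With these invented numbers, two of four cases move up and one moves down, while one of four non-cases moves down and none moves up, giving an NRI of 0.25 + 0.25 = 0.5. Changing `CUT_OFFS` can change the result considerably, which is the dependence on category definitions described above.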

In view of the strengths and drawbacks of AROC, NRI and IDI, Pencina et al suggested using all three measures of discrimination to assess the incremental value of a new marker [14]. Pencina and co-workers proposed a category-free NRI that does not depend on the selection and the number of categories but is instead based on any change in the estimated risks [15]. In the absence of established risk categories for type 2 diabetes, category-free NRIs might be useful for comparison purposes.
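The two category-free measures can be sketched as follows. The category-free NRI counts any upward or downward change in estimated risk, and the IDI compares the change in mean predicted risk between cases and non-cases. Function names and data are hypothetical and serve only as an illustration of the definitions.

```python
def category_free_nri(old_risks, new_risks, is_case):
    """Count any increase or decrease in estimated risk as a reclassification."""
    net_case = net_noncase = 0
    n_cases = sum(is_case)
    n_noncases = len(is_case) - n_cases
    for old, new, case in zip(old_risks, new_risks, is_case):
        direction = (new > old) - (new < old)  # +1 up, -1 down, 0 unchanged
        if case:
            net_case += direction       # upward moves are correct for cases
        else:
            net_noncase -= direction    # downward moves are correct for non-cases
    return net_case / n_cases + net_noncase / n_noncases


def idi(old_risks, new_risks, is_case):
    """Change in mean predicted risk among cases minus the change among non-cases."""
    def mean(values):
        return sum(values) / len(values)

    diff_cases = [n - o for o, n, c in zip(old_risks, new_risks, is_case) if c]
    diff_noncases = [n - o for o, n, c in zip(old_risks, new_risks, is_case) if not c]
    return mean(diff_cases) - mean(diff_noncases)


# Hypothetical risks from a basic model and from a model with one extra marker.
old = [0.08, 0.12, 0.04, 0.22, 0.15, 0.07, 0.03, 0.11]
new = [0.12, 0.25, 0.04, 0.15, 0.08, 0.06, 0.02, 0.12]
case = [1, 1, 1, 1, 0, 0, 0, 0]

cf_nri = category_free_nri(old, new, case)
idi_value = idi(old, new, case)
```

For these invented data the category-free NRI is 0.75 and the IDI is 0.045. Neither value says anything about how well either model predicts on its own; both describe only the change from the old to the new model.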

Calibration

Besides discrimination, calibration is of particular importance in the application of prediction models. Calibration refers to how well the predicted probabilities agree with the observed diabetes risk. Calibration might be poor if the prevalence of diabetes in the dataset used to develop the score differs widely from the population in which it is applied. A common test of calibration is the Hosmer–Lemeshow test, which is based on a χ² statistic. To cope with poor calibration, several methods for updating models have been suggested [16]. Updating methods range from simple recalibration methods to more sophisticated revision methods. In its simplest form, recalibration means adjusting the intercept of the prediction model leaving the regression coefficients unchanged. A further recalibration method includes adjustment of the intercept and multiplication of all regression coefficients by the same factor (calibration slope) [16].
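The simplest updating method, adjusting only the intercept, can be sketched as follows, assuming a logistic prediction model. The shift `delta` on the logit scale is chosen so that the mean predicted risk equals the incidence observed in the new population, which is the condition satisfied by a maximum likelihood fit of the intercept alone. The function names, the bisection search and the numbers are illustrative assumptions, not the procedure used in any of the cited studies.

```python
import math


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def recalibrate_intercept(predicted_risks, observed_incidence, tol=1e-10):
    """Shift all linear predictors by one constant so that the mean predicted
    risk matches the observed incidence; regression coefficients stay unchanged."""
    linear_predictors = [logit(p) for p in predicted_risks]
    low, high = -10.0, 10.0
    while high - low > tol:  # bisection: mean risk increases with the shift
        mid = (low + high) / 2.0
        mean_risk = sum(expit(lp + mid) for lp in linear_predictors) / len(linear_predictors)
        if mean_risk < observed_incidence:
            low = mid
        else:
            high = mid
    delta = (low + high) / 2.0
    return delta, [expit(lp + delta) for lp in linear_predictors]


# Hypothetical example: the original model predicts a mean risk of 18.75%,
# but only 10% of the new population develop diabetes.
original = [0.05, 0.10, 0.20, 0.40]
delta, updated = recalibrate_intercept(original, observed_incidence=0.10)
mean_updated = sum(updated) / len(updated)
```

Because every linear predictor is shifted by the same constant, the ranking of individuals, and hence the AROC, is unchanged; only calibration is affected. The second method mentioned above would additionally multiply each linear predictor by a calibration slope before applying `expit`.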

An example of a calibration assessment is provided in electronic supplementary material (ESM) Table 1. A non-invasive diabetes prediction model was applied to its developmental dataset from the Cooperative Health Research in the Region of Augsburg (KORA) S4/F4 cohort. For each participant, the probability of developing the disease was estimated according to the risk score and, based on these estimated probabilities, participants were ranked from the lowest to the highest estimated risk and grouped into ten groups of approximately equal size (deciles). Thus, for each decile the number of expected diabetes cases can be calculated (number of individuals in the decile multiplied by the mean estimated risk for that decile) and compared with the observed number of incident cases. If the actual prevalence of diabetes is considerably different from the estimated prevalence, the test would result in a low p value, indicating poor calibration. In the example, the estimated and the real number of cases are similar, which is reflected by a non-significant p value of 0.66.
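The decile-based comparison of expected and observed cases can be sketched in Python. The data below are an invented toy cohort, not the KORA S4/F4 data. The statistic is the usual Hosmer–Lemeshow sum over groups, and the p value uses the closed-form χ² tail that holds only for an even number of degrees of freedom; taking the degrees of freedom as the number of groups minus 2 is an assumed convention.

```python
import math


def hosmer_lemeshow(risks, outcomes, groups=10):
    """Return the Hosmer-Lemeshow statistic and a (size, observed, expected) table."""
    ranked = sorted(zip(risks, outcomes))  # lowest to highest estimated risk
    n = len(ranked)
    statistic = 0.0
    table = []
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(risk for risk, _ in chunk)  # size x mean estimated risk
        observed = sum(outcome for _, outcome in chunk)
        mean_risk = expected / size
        statistic += (observed - expected) ** 2 / (size * mean_risk * (1.0 - mean_risk))
        table.append((size, observed, expected))
    return statistic, table


def chi2_tail_even_df(x, df):
    """Upper tail probability of a chi-square distribution; valid for even df only."""
    term = total = 1.0
    for i in range(1, df // 2):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total


# Toy cohort: ten risk levels (5% to 50%) with ten participants each.
events_per_level = [0, 1, 2, 2, 2, 3, 4, 4, 4, 5]
risks, outcomes = [], []
for level, events in enumerate(events_per_level):
    for i in range(10):
        risks.append(0.05 * (level + 1))
        outcomes.append(1 if i < events else 0)

statistic, table = hosmer_lemeshow(risks, outcomes)
p_value = chi2_tail_even_df(statistic, df=10 - 2)  # assumed: groups - 2
```

In this toy cohort the observed and expected numbers of cases differ by at most half a case per decile, so the statistic is small and the p value is far from significance, which would be read as no evidence of poor calibration.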

Internal and external validation

Risk scores often show model overfit in the datasets used for model development. This is because regression coefficients are estimated with maximum likelihood methods, so that the prediction of the outcome in the original data is optimal. Thus, AROCs obtained from original data are often considerably larger than AROCs obtained from other independent data. Therefore, model overfit requires external validation of a prediction model before its widespread use.

External validation means applying the prediction model to a dataset with different individuals and re-assessing the measures of model performance. Internal validation (such as cross-validation or bootstrapping methods) is not a full equivalent to external validation, as internal validation still relies on the original data [5]. The use of very heterogeneous datasets for external validation is a widely neglected source of error [8]. AROCs are calculated as the proportion of pairs composed of one case and one non-case, where the estimated probability is larger for the case than for the non-case. As an example, in a dataset including large proportions of younger and older subjects, there are many pairs of one younger, healthy person who does not develop diabetes, and one older person who develops diabetes. Even poor prediction models assign larger diabetes probabilities to the older case than to the younger non-case, which leads to an increase in the AROC. This means that, for example, AROCs are larger when they are calculated for younger and middle-aged subjects than for middle-aged subjects alone. Examples of external validation of diabetes risk scores are given in Table 1 [17–30]. Quite often, not all the risk factors included in the original score are available in the dataset used for external validation. Thus, the original prediction models sometimes undergo some transformation before external validation.
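The effect of a heterogeneous validation dataset on the AROC can be illustrated with a deliberately poor model that assigns risk by age band only. All numbers below are invented: the model gives every younger person a risk of 2% and every middle-aged person a risk of 10%, and ties are counted as half a concordant pair (an assumed convention).

```python
def aroc(risks_cases, risks_noncases):
    """Share of concordant case/non-case pairs; ties count as one half."""
    concordant = 0.0
    for risk_case in risks_cases:
        for risk_noncase in risks_noncases:
            if risk_case > risk_noncase:
                concordant += 1.0
            elif risk_case == risk_noncase:
                concordant += 0.5
    return concordant / (len(risks_cases) * len(risks_noncases))


# Hypothetical cohort: 100 younger people (2 cases) and 100 middle-aged people (20 cases).
young_cases = [0.02] * 2
young_noncases = [0.02] * 98
middle_cases = [0.10] * 20
middle_noncases = [0.10] * 80

# Middle-aged subjects alone: the model cannot separate cases from non-cases.
aroc_middle = aroc(middle_cases, middle_noncases)

# Younger and middle-aged subjects combined: the same model looks much better.
aroc_combined = aroc(young_cases + middle_cases, young_noncases + middle_noncases)
```

Within the middle-aged group the model is no better than chance (AROC 0.5), yet in the combined dataset the AROC rises to about 0.73, solely because many pairs consist of a middle-aged case and a younger non-case.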
Table 1

Diabetes risk models with examples of external evaluation

Columns: study, country, reference; risk factors included in score (non-invasive variables, metabolic risk factors); original data set (AROC [95% CI], calibration); external validation (study, country, reference; AROC [95% CI]; calibration)

Cambridge Risk Score, UK [17]
  Non-invasive variables: Age, sex, BMI, smoking, current use of corticosteroids, antihypertensive drug use, diabetes family history
  Original data set
    AROC (95% CI): 0.74
    Calibration: N/A
  External validation: Whitehall II, UK [26]
    AROC (95% CI): 0.72 (0.69, 0.76)
    Calibration: Hosmer–Lemeshow p = 0.77

KORA, Germany [18]
  Non-invasive variables: Age, sex, BMI, smoking, hypertension, parental diabetes
  Original data set
    AROC (95% CI): 0.76
    Calibration: N/A
  External validation: PREVEND, the Netherlands [27]
    AROC (95% CI): 0.66 (0.63, 0.70)
    Calibration: Hosmer–Lemeshow p < 0.001

FINDRISC, Finland [19]
  Non-invasive variables: Age, BMI, waist circumference, antihypertensive drug use, history of high blood glucose, physical inactivity, diet (vegetables, fruits, berries)
  Original data set
    AROC (95% CI): 0.85
    Calibration: N/A
  External validation: DETECT-2: the Netherlands, Denmark, Sweden, UK, Australia [28]
    AROC (95% CI): recalibrated, 0.77 (0.75, 0.78)
    Calibration: Hosmer–Lemeshow p = 0.27

QDScore, UK [20]
  Non-invasive variables: Age, sex, ethnicity, BMI, smoking, diabetes family history, Townsend deprivation score, treated hypertension, cardiovascular disease, current use of corticosteroids
  Original data set
    AROC (95% CI): men, 0.83 (0.83, 0.84); women, 0.85 (0.85, 0.86)
    Calibration: Brier score: men, 0.078; women, 0.058
  External validation: Health Improvement Network database, UK [29]
    AROC (95% CI): men, 0.80; women, 0.81
    Calibration: Brier score: men, 0.053; women, 0.041

AUSDRISK, Australia [21]
  Non-invasive variables: Age, sex, ethnicity, parental diabetes, history of high blood glucose, antihypertensive drug use, smoking, physical inactivity, waist circumference
  Original data set
    AROC (95% CI): 0.78 (0.76, 0.81)
    Calibration: Hosmer–Lemeshow p = 0.85
  External validation: Blue Mountains Eye Study, Australia [21]
    AROC (95% CI): 0.66 (0.60, 0.71)
    Calibration: Hosmer–Lemeshow p = 0.32
  External validation: North West Adelaide Health Study, Australia [21]
    AROC (95% CI): 0.79 (0.72, 0.86)
    Calibration: Hosmer–Lemeshow p < 0.001

ARIC-1, USA [22]
  Non-invasive variables: Diabetic mother, diabetic father, hypertension, black race, age 55 to 64 years, ever smoking, waist circumference, height, resting pulse, weight
  Original data set
    AROC (95% CI): 0.71 (0.69, 0.73)
    Calibration: N/A
  External validation: KORA S4/F4, Germany [18]
    AROC (95% CI): 0.75 (0.70, 0.80)
    Calibration: N/A

ARIC-2, USA [22]
  Non-invasive variables: Diabetic mother, diabetic father, hypertension, black race, age 55 to 64 years, never or former drinking, waist circumference, height, resting pulse
  Metabolic risk factors: Glucose, triacylglycerol, HDL-C, uric acid
  Original data set
    AROC (95% CI): 0.79 (0.77, 0.81)
    Calibration: N/A
  External validation: KORA S4/F4, Germany [18]
    AROC (95% CI): 0.79 (0.74, 0.84)
    Calibration: N/A

San Antonio, USA [23]
  Non-invasive variables: Age, sex, BMI, ethnicity, systolic BP, family history of diabetes
  Metabolic risk factors: HDL-C, fasting plasma glucose
  Original data set
    AROC (95% CI): 0.84 (0.82, 0.87)
    Calibration: Hosmer–Lemeshow p > 0.2
  External validation: Multi-Ethnic Study of Atherosclerosis, USA [30]
    AROC (95% CI): 0.83 (0.81, 0.85)
    Calibration: Hosmer–Lemeshow p < 0.001, after recalibration p > 0.10

ARIC, USA [24]
  Non-invasive variables: Age, race, waist circumference, height, systolic BP, family history of diabetes
  Metabolic risk factors: Fasting plasma glucose, triacylglycerol, HDL-C
  Original data set
    AROC (95% CI): 0.80
    Calibration: N/A
  External validation: Multi-Ethnic Study of Atherosclerosis, USA [30]
    AROC (95% CI): 0.84 (0.82, 0.86)
    Calibration: Hosmer–Lemeshow p < 0.001, after recalibration p > 0.10

Framingham Offspring Study, USA [25]
  Non-invasive variables: BMI, HDL-C, parental history of diabetes, BP
  Metabolic risk factors: Fasting plasma glucose, triacylglycerol
  Original data set
    AROC (95% CI): 0.85
    Calibration: N/A
  External validation: Multi-Ethnic Study of Atherosclerosis, USA [30]
    AROC (95% CI): 0.78 (0.74, 0.82)
    Calibration: Hosmer–Lemeshow p < 0.001, after recalibration p > 0.10

KORA, Germany [18]
  Non-invasive variables: Age, sex, BMI, parental history of diabetes, smoking, hypertension
  Metabolic risk factors: Fasting plasma glucose, HbA1c, uric acid
  Original data set
    AROC (95% CI): 0.84 (0.80, 0.89)
    Calibration: Hosmer–Lemeshow p = 0.45, Brier score 0.072
  External validation: PREVEND, the Netherlands [27] (a)
    AROC (95% CI): 0.81 (0.78, 0.84)
    Calibration: Hosmer–Lemeshow p < 0.001, after recalibration p = 0.35

ARIC, Atherosclerosis Risk in Communities; AUSDRISK, Australian Type 2 Diabetes Risk Assessment Tool; FINDRISC, Finnish Diabetes Risk Score; HDL-C, HDL-cholesterol; KORA, Cooperative Health Research in the Region of Augsburg; N/A, not assessed; PREVEND, Prevention of Renal and Vascular Endstage Disease

(a) HbA1c not available in the validation dataset

External validation is a key component to assess the extent to which novel biomarkers can improve risk prediction. Genome-wide association studies often demonstrate a so-called ‘winner’s curse’, with more pronounced associations in discovery datasets than in replication datasets. These data from genomic studies clearly show the importance of external validation for all biomarkers before their inclusion into prediction models.

Prediction models with established, non-invasive and conventional clinical variables

Model accuracy is most commonly assessed by AROCs. In the examples given in Table 1 [17–30], AROCs of 0.71 to 0.78 have been achieved with non-invasive models, while models including measures of glycaemia or routine metabolic laboratory analyses have achieved AROCs of up to 0.85. Fasting and postload glucose levels are by themselves strong predictors of diabetes. Thus, the extent to which glycaemic measures contribute to diabetes risk scores should be discussed briefly. Individuals with elevated HbA1c (6.0–6.4% [42–46 mmol/mol]), impaired fasting glucose ([IFG] fasting glucose 6.1–6.9 mmol/l) or impaired glucose tolerance ([IGT] 2 h OGTT glucose 7.8–11.1 mmol/l) have a strongly increased risk for type 2 diabetes compared with normoglycaemic people [31]. Persons with IFG and IGT have an even higher risk of diabetes than those who have only one of the two disorders [31]. As an example, in the KORA cohort of older participants, almost half of those with IFG and IGT combined developed type 2 diabetes over 7 years [32].

The main metabolic risk factors for isolated IFG and isolated IGT are different. The pathophysiology of isolated IFG seems to include reduced hepatic insulin sensitivity, beta cell dysfunction and low beta cell mass [33]. In contrast, isolated IGT is characterised by reduced peripheral insulin sensitivity but near normal hepatic insulin sensitivity and progressive loss of beta cell function. Individuals with combined IFG and IGT exhibit severe defects in both peripheral and hepatic insulin sensitivity, as well as loss of beta cell function [33].

Although a clearly increased diabetes risk can be observed for individuals with isolated IFG and those with isolated IGT, the categorisation of individuals as either ‘normal’ or ‘pre-diabetic’ (IFG, IGT) neglects the fact that a significant increase in diabetes risk also exists for increasing fasting glucose levels within the normal range [34]. Glycaemic measures (fasting and 2 h glucose, HbA1c) are strong diabetes risk predictors, but may be more useful without classification, e.g. as a continuous risk factor. This has been indicated in the German KORA and the Danish Inter99 studies [18, 35].

It is of clinical importance whether a single glycaemic measure performs as well as a simple clinical score. In the multiethnic Atherosclerosis Risk in Communities (ARIC) study the simple risk score including waist circumference, height, blood pressure, family history of diabetes, ethnicity and age performed similarly to fasting glucose alone (AROC 0.71 vs 0.74, p = 0.2) [22]. Figure 1 and ESM Table 2 show that the separate addition of fasting glucose, HbA1c or 2 h glucose to basic models with non-invasive variables leads to a strong increase in model accuracy [18, 24, 36–38]. In several studies, HbA1c improved the predictive power to a similar extent to fasting glucose. In the KORA study, the strongest incremental value was seen on the addition of 2 h glucose [18]. In the Study of Health in Pomerania (SHIP) cohort, even random glucose improved the predictive ability of diabetes risk scores [36]. Thus, the predictive potential of glucose values can also be used in non-fasting participants.
Fig. 1

Increase in the AROC achieved by adding glycaemic measures to a basic prediction model: KORA S4/F4 Study. Data are from the KORA S4/F4 Study (n = 881; age range 55–74 years; 7-year follow-up) [18]. Please see ESM Table 2 for 95% CIs of AROCs. The basic model included age, sex, BMI, hypertension, parental diabetes and former or present smoking. Diabetes was ascertained by validated self-report or OGTT. *p < 0.05; **p < 0.01; ***p < 0.001 vs the basic model

Taken together, non-invasive risk factors including age, sex, BMI, waist circumference, family history, smoking or hypertension form the basis of all diabetes risk scores. Routine clinical biomarkers, such as glucose, HbA1c, lipids and uric acid, have the potential to improve the predictive ability of these basic risk factors, but AROCs rarely exceed 0.85. This argues in favour of a search for novel risk factors to further improve the accuracy of diabetes risk models.

Novel biomarkers from ‘-omics’ technologies as potential components of risk models

Despite moderate or even good model accuracy in some studies (Table 1, ESM Table 2), current prediction algorithms leave room for improvement and raise the question of whether novel biomarkers could be clinically useful, particularly if they could improve risk models that already contain measures of glycaemia. The range of molecules that could serve as potential biomarkers of diabetes risk includes genetic variants, RNA transcripts, peptides and proteins, lipids and small metabolites, cellular markers and metabolic waste products [39]. Owing to current advances in ‘-omics’ technologies, such as genomics, transcriptomics, proteomics and metabolomics, the number of candidate biomarkers keeps growing; however, only a small proportion of these has been investigated with reference to their potential to improve the prediction of type 2 diabetes.

Genetic variants

The heritability of glycaemic traits and type 2 diabetes is high [40], and the large genome-wide association studies published since the first in 2007, based on up to >10⁵ study participants, have helped us to better understand the genetic architecture of this disease. Single nucleotide polymorphisms (SNPs) in more than 60 regions throughout the genome (so-called susceptibility loci containing multiple genes) were found to be associated with the risk of type 2 diabetes [39, 41–44]. Most of these SNPs are common, with minor allele frequencies of 10–90%. Interestingly, loci associated with diabetes risk show only a partial overlap with loci that determine levels of fasting glucose, 2 h glucose and HbA1c. Thus, some loci influence both disease risk and glycaemic traits, whereas others seem to mainly regulate glucose levels within the physiological range without affecting the development of overt type 2 diabetes, and vice versa [45, 46].

Most susceptibility loci harbour genes that play a role in pancreatic development and in beta cell function in adults, whereas loci that could be linked to insulin resistance are less frequent [43, 46–48]. Other loci are enriched in genes involved in cell cycle regulation, adipocytokine signalling, CREB binding protein (CREBBP)-related transcription and regulation of circadian rhythm [43, 44]. It can be expected that the aforementioned search for the genetic location of causal variants within these loci will lead to a list of novel pathophysiological mechanisms that may serve as therapeutic targets.

The currently known risk variants have rather modest effect sizes; the presence of each risk variant or allele is only associated with increases in diabetes risk of between 5% and 40% (ORs 1.05–1.4). Therefore, these loci do not explain more than 10–15% of the estimated genetic heritability of type 2 diabetes [44, 49]. This estimate is in line with the observation that known risk variants explain only a small fraction of family history-associated diabetes risk [50]. Combinations of up to 40 SNPs resulted in AROCs of 0.55–0.63, which is substantially lower than those achieved by age, sex and BMI alone. In some studies, the addition of genotype information to models based on established anthropometric and clinical risk factors led to statistically significant increases in AROCs, but these improvements were usually not larger than 0.03 [51, 52]. In line with the findings for AROCs, only a few studies reported improvements of NRI and/or IDI by including SNP data, but these improvements were always too low to be of clinical relevance [53, 54].

It should be noted that the effect of genetic markers on risk prediction may be more pronounced in younger individuals, in leaner persons and in studies with long follow-up periods [53, 54], but few studies on young populations, in which the assessment of future genetic risk may be most relevant, are currently available [55]. The initial age of individuals is closely related to the time horizon for any model to predict type 2 diabetes. Several prospective studies have applied genetic risk scores for follow-up times of approximately 10 years. This time period corresponds to that in tools such as the Framingham Risk Score, which estimates an individual’s 10-year risk for incident cardiovascular disease. It has been proposed that genetic risk scores might be more helpful in longer term prediction because, in contrast to variables used in clinical risk scores, genetic variants do not change over time [52, 56]. Eventually, the time horizon for risk models needs to correspond to the period before the onset of type 2 diabetes in which preventive efforts are most effective.

Another caveat is that most genome-wide association and prediction studies have been conducted in populations of European descent [44, 51, 52], and case–control and prospective genetic studies in African-American [57, 58] or Asian [59–61] populations are still rare. It has been hypothesised that different risk alleles and allele frequencies in various ethnic groups could contribute to global differences in incidence rates of type 2 diabetes [62], but this needs to be corroborated in further studies.

Recent simulation studies indicate that adding hundreds or several thousand common SNPs that are currently below the threshold of genome-wide significance to prediction models may capture up to half of the risk of type 2 diabetes and thus most of the genetic component [43]. In addition to the investigation of common SNPs, ongoing projects using DNA sequencing are addressing the issue of ‘missing heritability’, leading to the identification of further risk variants, especially with lower risk allele frequencies. One recent study of the MTNR1B locus encoding melatonin receptor 1B indicated that this locus may not only contain common variants with low effect sizes (ORs <1.4), but may also contain rare variants with considerably stronger associations with the risk of type 2 diabetes (OR 5.7, 95% CI 2.2, 14.8, for rare loss-of-function variants of the receptor) [63]. Sequencing of all genes in the genome (exome sequencing), as recently reported for a Danish case–control study [64], and whole-genome resequencing, as performed in the 1000 Genomes Project [65], will improve our understanding of the potential relevance of low-frequency (0.5–5%) and rare (<0.5%) variants in the development of type 2 diabetes [66]. It remains to be seen to what extent ongoing studies and analyses of other kinds of genetic variations such as copy number and structural variations will contribute to more precise risk assessment.

Finally, it should also be noted that the problem of ‘missing heritability’ does not only refer to the proportion of phenotypic variance that can be explained by known risk variants (the numerator, which will undoubtedly increase with further studies). ‘Missing heritability’ is also affected by the total phenotypic variance of type 2 diabetes caused by genetic variants, which represents the denominator in our formula for estimating the proportion of explained heritability. It is difficult to accurately assess total phenotypic variance because it may be inflated by ill-defined shared environmental factors in families, by gene–gene interactions and by epigenetic phenomena. Therefore, a more precise quantification of total heritability is required to better define the contribution that genetic data can make to models of risk prediction.

Transcriptomics and type 2 diabetes: RNA species

mRNAs and microRNAs (miRNAs) from various tissues have been investigated as biomarkers of type 2 diabetes, mainly in small and cross-sectional studies [67]. Consequently, it is not clear whether the analysis of the human transcriptome can improve the accuracy of current risk scores. In the context of risk assessment, blood samples appear to be the most suitable biomaterial for transcriptome analyses because they are routinely obtained clinically. Methods for the analysis of transcriptomics datasets in relation to phenotypes and disease risk are currently being developed [68].

MiRNAs have been linked to insulin resistance, reduced beta cell function and type 2 diabetes [69, 70]. In the Bruneck study (South Tyrol, Italy), five miRNAs extracted from plasma were found to be associated with incident type 2 diabetes, but their performance in combination with established risk scores was not reported [71].

Gene expression is regulated at several levels, including epigenetic changes of the genome such as DNA methylation and histone modification. Commercially available bead array-based platforms can analyse DNA methylation intensities at almost 500,000 sites throughout the whole genome. The first results from studies linking epigenetic changes to glycaemic traits and type 2 diabetes risk will be available over the coming years [72].

Peptides and proteins

The complexity of the human serum or plasma proteome consisting of approximately 10⁶ different protein species means, on the one hand, that blood is a rich source of potential biomarkers of diabetes risk but, on the other, that the comprehensive quantification of even a substantial fraction of these peptides and proteins is extremely challenging from a technological perspective [73, 74].

A range of hypothesis-driven studies investigated the contribution of multiple protein biomarkers such as liver enzymes, lipoproteins, insulin or markers of subclinical inflammation, iron metabolism and endothelial dysfunction to established risk scores of type 2 diabetes. A substantial increment of c statistics is possible if these prediction models do not contain a measure of glycaemia [75]. However, protein-based biomarkers that lead not only to statistically significant but also to clinically relevant improvements of model accuracy remain to be identified for models that already consider glucose or HbA1c [18, 35, 76–80], as summarised in a recent review [39].

One hypothesis-free prospective study used linear matrix-assisted laser desorption/ionisation time-of-flight mass spectroscopy to characterise protein profiles in serum samples from 85 cases with incident type 2 diabetes and 195 normoglycaemic controls within the Whitehall II cohort. Six protein peaks were significantly associated with incident type 2 diabetes after adjustment for age, sex, obesity, lipids, C-reactive protein, fasting glucose and 2 h glucose, but no data on the potential improvement of prediction models by these proteins were provided [81]. However, this work can be seen as a proof-of-concept study suggesting that proteomic methods may be useful for the detection of blood proteins that play a role early in the development of type 2 diabetes.

Lipids and small metabolites

While triacylglycerols and cholesterol have been used in various risk scores resulting in only modest improvements of model accuracy [1], their subfractions and smaller lipids, as well as sugars, amino acids, organic acids, nucleotides and other small-molecule metabolites from serum or plasma samples, are less well investigated but have moved into the focus of metabolomics studies [82]. Cross-sectional approaches have identified ‘metabolic signatures’ associated with insulin resistance and type 2 diabetes [82, 83], and thus indicated their potential as prognostic biomarkers for type 2 diabetes risk.

Very recently, data on lipids and small metabolites have become available from prospective studies, and these are summarised in Table 2 [84–91]. These studies showed that elevated levels of branched-chain and aromatic amino acids and lower levels of glycine are associated with incident type 2 diabetes or deteriorating glucose homeostasis [84, 86–90]. In addition, various lipid species and lipid fractions [85, 87–90], as well as other small metabolites [87, 89, 91], showed significant associations with the risk of type 2 diabetes or incident impaired glucose metabolism after adjustment for multiple confounders. Some of the aforementioned studies compared the accuracy of prediction models without and with metabolites (Table 2) and found fairly modest improvements in AROCs for models that included metabolomics in addition to established risk factors for type 2 diabetes [84, 88, 89, 91].

Table 2

Prospective metabolomics studies in the field of type 2 diabetes

| Reference | Study population | Analytical method; type of blood sample | Main findings | Prediction models |
|---|---|---|---|---|
| Wang et al, 2011 [84] | Framingham Offspring Study: 189 incident T2D cases, 189 matched controls, 400 random controls; follow-up 12 years | LC-MS; plasma (fasting) | Isoleucine, leucine, valine, tyrosine and phenylalanine associated with incident T2D (adjusted for age, sex, BMI, fasting glucose) | AROC (age, BMI, glucose) = 0.52, AROC (age, BMI, glucose + 5 amino acids) = 0.65 for comparison between cases and matched controls; AROC (age, BMI, glucose) = 0.801, AROC (age, BMI, glucose + 5 amino acids) = 0.805 for comparison of cases and random controls |
| Rhee et al, 2011 [85] | Framingham Heart Study: 189 incident T2D cases, 189 matched controls; follow-up 12 years | LC-MS; plasma (fasting) | Multiple lipid species of lower carbon number and double-bond content directly associated with T2D, multiple lipid species of higher carbon number and double-bond content inversely associated with T2D (adjusted for age, sex, BMI, fasting glucose, fasting insulin, TG, HDL-C) | N/A |
| Stančáková et al, 2012 [86] | METSIM study (men only): 151 incident T2D cases, 375 controls; follow-up 4.7 years | NMR; serum (fasting) | Alanine, glutamine, isoleucine, leucine, phenylalanine and tyrosine associated with incident T2D (adjusted for age and BMI) | N/A |
| Würtz et al, 2012 [87] | Pieksämäki study: 618 individuals not treated for diabetes; follow-up 6.5 years | NMR; serum (fasting) | Alanine, glycine, lactate, pyruvate, tyrosine, α1-acid glycoprotein and various fatty acid groups associated with change in fasting glucose; citrate, α1-acid glycoprotein and various fatty acid groups associated with change in 2 h glucose (all adjusted for age, sex, BMI, systolic BP, glucose, insulin, HDL-C, TG [fatty acids: no adjustment for HDL-C, TG]) | N/A |
| Wang-Sattler et al, 2012 [88]^a | KORA S4/F4 study: 91 incident T2D cases, 785 controls; follow-up 7 years | FIA-MS/MS; serum (fasting) | Glycine and LysoPC C18:2 inversely associated with incident T2D (adjusted for age, sex, BMI, physical activity, alcohol intake, smoking, systolic BP, HDL-C) | AROC (model 1) = 0.742, AROC (model 1 + glycine, LysoPC 18:2, C2) = 0.754 (p = 0.01); AROC (model 2) = 0.818, AROC (model 2 + glycine, LysoPC 18:2, C2) = 0.828 (p = 0.06) |
| Wang-Sattler et al, 2012 [88]^a | KORA S4/F4 study: 118 incident IGT cases, 471 controls; follow-up 7 years | FIA-MS/MS; serum (fasting) | Glycine and LysoPC C18:2 inversely associated with incident IGT (adjusted for age, sex, BMI, physical activity, alcohol intake, smoking, systolic BP, HDL-C) | AROC (model 1) = 0.638, AROC (model 1 + glycine, LysoPC 18:2, C2) = 0.671 (p = 0.012); AROC (model 2) = 0.656, AROC (model 2 + glycine, LysoPC 18:2, C2) = 0.683 (p = 0.015) |
| Floegel et al, 2013 [89]^b | EPIC-Potsdam study: 800 incident T2D cases, 2,282 controls; follow-up 7 years | FIA-MS/MS; serum (fasting or non-fasting) | Hexose, phenylalanine, 4 diacyl-PCs and (inversely) LysoPC C18:2 associated with incident T2D (adjusted for age, sex, BMI, WC, hypertension, smoking, multiple dietary components, glucose, HbA1c, HDL-C, TG) | AROC (model 1) = 0.847, AROC (model 1 + 14 metabolites) = 0.890 (p < 0.0001); AROC (model 2) = 0.901, AROC (model 2 + 14 metabolites) = 0.912 (p < 0.0001) |
| Würtz et al, 2013 [90] | Cardiovascular Risk in Young Finns study: 1,680 individuals; follow-up 6 years | NMR; serum (fasting) | Isoleucine, leucine, valine, phenylalanine, tyrosine associated with follow-up HOMA-IR in men; leucine, valine and phenylalanine associated with follow-up HOMA-IR in women (all adjusted for age, BMI, systolic BP, HDL-C, TG, smoking, physical activity, baseline HOMA-IR) | N/A |
| Würtz et al, 2013 [90] | Cardiovascular Risk in Young Finns study: 1,680 individuals; follow-up 6 years | NMR; serum (fasting) | Glutamine inversely associated with follow-up fasting glucose in women (adjusted for age, BMI, systolic BP, HDL-C, TG, smoking, physical activity, baseline fasting glucose) | N/A |
| Würtz et al, 2013 [90] | Cardiovascular Risk in Young Finns study: 1,680 individuals; follow-up 6 years | NMR; serum (fasting) | Isoleucine, leucine, valine, phenylalanine and tyrosine associated with HOMA-IR ≥ 90th percentile at follow-up in men (adjusted for age, BMI, systolic BP, HDL-C, TG, smoking, physical activity, baseline HOMA-IR) | No significant improvements of baseline model (BMI, apolipoprotein B, physical activity index, HOMA-IR) by amino acid score as assessed by AROC, NRI and IDI in either sex |
| Ferrannini et al, 2013 [91]^c | RISC study: 779 stable NGT, 123 incident dysglycaemia; follow-up 3 years | UHPLC-MS/MS; plasma (fasting) | α-HB directly and linoleoylglycerophosphocholine (L-GPC) inversely associated with incident dysglycaemia (adjusted for age, sex, familial T2D, BMI, fasting glucose, α-HB for L-GPC and vice versa) | AROC (model 1) = 0.762, AROC (model 1 + α-HB, L-GPC) = 0.790; AROC (model 2) = 0.786, AROC (model 2 + α-HB, L-GPC) = 0.804 |
| Ferrannini et al, 2013 [91]^c | Botnia study: 151 incident T2D cases, 2,429 controls; follow-up 9.5 years | UHPLC-MS/MS; plasma (fasting) | α-HB directly and linoleoylglycerophosphocholine inversely associated with incident T2D (adjusted for age, sex, familial T2D, BMI, fasting glucose, α-HB for L-GPC and vice versa) | AROC (model 1) = 0.766, AROC (model 1 + α-HB, L-GPC) = 0.783; AROC (model 2) = 0.788, AROC (model 2 + α-HB, L-GPC) = 0.796 |

C2, acetylcarnitine C2; EPIC, European Prospective Investigation into Cancer and Nutrition; FIA, flow injection analysis; α-HB, α-hydroxybutyrate; HDL-C, HDL-cholesterol; HOMA-IR, HOMA of insulin resistance; LC, liquid chromatography; L-GPC, linoleoylglycerophosphocholine; LysoPC, lysophosphatidylcholine; METSIM, Metabolic Syndrome in Men; MS, mass spectrometry; N/A, not applicable (no data reported); NGT, normal glucose tolerance; NMR, nuclear magnetic resonance; RISC, Relationship between Insulin Sensitivity and Cardiovascular Disease; PC, phosphatidylcholine; T2D, type 2 diabetes; TG, triacylglycerols; UHPLC, ultra-high performance liquid chromatography; WC, waist circumference

^a Model 1: adjusted for age, sex, BMI, physical activity, alcohol intake, smoking, systolic BP, HDL-C; model 2: model 1 + fasting glucose, HbA1c, fasting insulin

^b Model 1: age, WC, hypertension, smoking status, multiple dietary components, physical activity; model 2: model 1 + glucose, HbA1c

^c Model 1: age, sex, BMI, family history of diabetes, fasting glucose; model 2: model 1 + 2 h glucose
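
To make the size of the reported improvements concrete, the following sketch computes the absolute AROC increment for the most fully adjusted model of each study in Table 2 that reported AROCs. The AROC values are copied from the table; the dictionary layout and the function name `aroc_gain` are illustrative choices and not part of the cited studies.

```python
# AROC values copied from Table 2 (most fully adjusted model per study),
# stored as (without metabolites, with metabolites).
reported = {
    "Wang 2011, cases vs random controls": (0.801, 0.805),
    "Wang-Sattler 2012, T2D, model 2": (0.818, 0.828),
    "Wang-Sattler 2012, IGT, model 2": (0.656, 0.683),
    "Floegel 2013, model 2": (0.901, 0.912),
    "Ferrannini 2013, RISC, model 2": (0.786, 0.804),
    "Ferrannini 2013, Botnia, model 2": (0.788, 0.796),
}


def aroc_gain(base, extended):
    """Absolute AROC increment, rounded to the three decimals reported."""
    return round(extended - base, 3)


gains = {study: aroc_gain(*pair) for study, pair in reported.items()}
for study, gain in gains.items():
    print(f"{study}: +{gain:.3f}")
```

For these models, all of which already contain a measure of glycaemia, every increment is below 0.03.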

Opportunities for and limitations to the use of biomarkers for the prediction of type 2 diabetes

Opportunities: repeated measurements of biomarkers

In the aforementioned studies, associations and risk score performances were mainly based on single biomarker measurements, which are all characterised by normal intraindividual variation over time (with the exception of genetic markers). Repeated measurements of biomarkers within days or weeks could be useful to improve measurement precision, but may be inconvenient for the patient.

There is growing evidence that biomarker trajectories preceding diabetes development diverge over time between cases and non-cases [92–96]. Such trajectories require blood samples to be taken over a wider timeframe (several years or decades). These curves enable a better understanding of the pathophysiological processes of diabetes development and have been described for fasting and postload glucose, HbA1c, interleukin-1 receptor antagonist, adiponectin, alanine aminotransferase and triacylglycerols [92–96]. Deeper insight into the development of type 2 diabetes can also be expected from the analysis of established metabolic risk factors such as BMI, waist circumference, other lipids or uric acid, for which multiple measurements in the same patients over time are usually available to the treating general practitioner. In a previous analysis from the Whitehall II study [94], a 0.5 mmol/l difference in fasting glucose 3 years before diabetes diagnosis and a 0.3 mmol/l steeper increase in fasting glucose were observed in later diabetes cases compared with non-cases (Fig. 2), suggesting that fasting glucose values measured 5–10 years apart could provide improved prediction of diabetes over a single glucose measurement.

Fig. 2

Fasting glucose trajectories before diagnosis of diabetes or the end of follow-up in the Whitehall II study. The analysis is based on 505 incident diabetes cases (triangles) and 6,033 individuals who remained diabetes-free (squares). Time 0 is diagnosis for incident diabetes cases or end of follow-up for non-diabetics. Graphs are based on multilevel longitudinal modelling. Modified from [94] with permission from Elsevier

While the use of repeated measurements for the prediction of diabetes seems to be a tempting approach, as repeated measurements of different diabetes risk factors are collected in general practice, only risk factors with highly different trajectories are expected to improve the predictive ability of a given risk score [97].
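
As an illustration of how repeated measurements could enter a risk model, the sketch below fits an ordinary least-squares line to one person's serial fasting glucose values and returns the slope together with the most recent value. The measurement values are invented, the function name `trajectory_features` is hypothetical, and ordinary least squares is a simplification chosen here; the Whitehall II analysis itself was based on multilevel longitudinal modelling.

```python
def trajectory_features(years, values):
    """Return (slope per year, last value) from an OLS fit of values on years.

    Both lists must be ordered by time and have the same length.
    """
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired measurements")
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    if sxx == 0:
        raise ValueError("measurement times must differ")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    return sxy / sxx, values[-1]


# Hypothetical fasting glucose (mmol/l) measured 10, 5 and 0 years before today
slope, last = trajectory_features([-10, -5, 0], [5.2, 5.4, 5.9])
print(f"slope = {slope:.3f} mmol/l per year, last value = {last} mmol/l")
```

A risk model could then take both the slope and the last value as predictors, so that two people with the same current glucose but different histories receive different risk estimates.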

It may be argued that improved prediction based on multiple compared with single measurements of glucose is obvious, but the fact that current prediction scores do not make use of multiple measurements in clinical practice to improve individual risk assessment seems noteworthy. One important concern regarding repeated measurements of risk factors is that this approach might have negative effects on disease prevention as it may delay the initiation of preventive efforts. However, if single measurements are used for risk models where repeated measurements are not available, this would not delay any preventive or therapeutic interventions.

Limitations in disease prediction

One might ask to what extent AROCs can be improved by the addition of novel biomarkers. Perfect prediction of diabetes might not be possible for at least five reasons. First, the diagnosis of diabetes is not as clear as the diagnosis of other chronic diseases (e.g. cancer). As an example, diagnosing a person with diabetes when the 2 h glucose level is 201 mg/dl (11.17 mmol/l), but not when the level is 199 mg/dl (11.06 mmol/l) is, to some extent, dependent on chance and measurement imprecision. Second, the measurement imprecision also applies, to lesser or greater degrees, to all predictors used in risk scores. Third, risk scores cannot capture changes of lifestyle or medication following the assessment of individual risk. Fourth, incident cases of diabetes in the cohort study used to develop a prediction model might have been missed because they occurred after the end of the follow-up period, which contributes to measurement error in the outcome of type 2 diabetes. Fifth, many novel biomarkers described as independent risk factors for type 2 diabetes are correlated with traditional risk factors or other biomarkers [98]. They therefore only provide limited incremental information and do not contribute to better discrimination.

A further limitation of risk models for type 2 diabetes is that they can predict the onset of the disease, but cannot predict the onset of micro- and macrovascular complications, the major determinants of quality of life, morbidity, mortality and diabetes-related costs. Recently, several different diabetes risk scores were applied to an external set of prospective data of older individuals, and the scores did not prove to be useful in the prediction of cardiovascular diseases [99].
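
The first reason given above, the role of chance near the diagnostic threshold, can be sketched numerically. The example assumes that a single 2 h glucose measurement is normally distributed around the person's true value and uses a hypothetical coefficient of variation of 5%; both the error model and the 5% figure are assumptions made for illustration, not values from the cited literature.

```python
import math


def prob_at_or_above(true_value, threshold, cv):
    """P(measured >= threshold) when the measurement is normally distributed
    around true_value with standard deviation cv * true_value."""
    sd = cv * true_value
    z = (threshold - true_value) / sd
    # Upper tail of the standard normal distribution
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical 5% coefficient of variation; 2 h glucose threshold of 200 mg/dl
for true_glucose in (199, 201):
    p = prob_at_or_above(true_glucose, 200, 0.05)
    print(f"true 2 h glucose {true_glucose} mg/dl: P(measured >= 200) = {p:.2f}")
```

Under these assumptions, both probabilities are close to one half, so the two individuals in the example are classified almost at random relative to each other.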

Summary and outlook

A lot of work has been performed to assess the incremental value of novel markers, beyond established risk factors, for the prediction of diabetes. Nevertheless, several questions remain to be answered.

First, the addition of biomarkers to conventional diabetes risk scores has so far improved the predictive ability of the models only slightly, if at all. This raises the question of under which conditions novel markers may have a larger incremental value. Often biomarkers are strongly correlated with conventional risk factors so that they do not provide additional predictive information [98, 100]. While in the near future many novel biomarkers are expected to be described as a result of technological progress, these will only improve diabetes prediction if they are at best weakly correlated with established risk factors. Moreover, it is conceivable that the slope of a biomarker trajectory (the change of the biomarker over time) captures incremental predictive information beyond the last measurement of the marker alone. However, the potential of trajectories has not yet been assessed for diabetes prediction.

Second, one might ask how good is good enough in diabetes prediction, and which criteria might be used to assess an individual’s diabetes risk with a sufficient level of precision. The question of sufficient precision can only be answered with regard to the purpose of the score. For a paper-and-pencil score used as the first step of a population-wide screening, the sufficient level of precision could be lower than that for a score used to guide lifestyle recommendations and treatment for individuals in clinical practice. Furthermore, the ultimate performance measure of a novel marker will be the improvement in health outcomes through therapeutic changes and its cost-effectiveness [100]. However, critical risk values have not yet been defined for type 2 diabetes and, thus, the question of when risk models are good enough cannot currently be answered. As already stated by Hlatky et al [100], there is no single metric that assesses all the characteristics of a novel marker. For example, AROCs include only rank information and do not indicate how accurate predictions are. Therefore, other criteria such as the IDI, a goodness-of-fit test, and positive and negative predictive values should be added.
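
The point that AROCs contain only rank information can be shown with a small sketch. It computes the AROC as the proportion of case/non-case pairs in which the case has the higher predicted risk, and the IDI as the change in the difference between the mean predicted risk of cases and non-cases. The predicted risks are invented, and the function names `aroc`, `mean_risk_gap` and `idi` are illustrative.

```python
def aroc(scores, labels):
    """Probability that a random case scores higher than a random non-case
    (ties count one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def mean_risk_gap(risks, labels):
    """Mean predicted risk in cases minus mean predicted risk in non-cases."""
    cases = [r for r, y in zip(risks, labels) if y == 1]
    controls = [r for r, y in zip(risks, labels) if y == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def idi(old_risks, new_risks, labels):
    """Integrated discrimination improvement of the new over the old model."""
    return mean_risk_gap(new_risks, labels) - mean_risk_gap(old_risks, labels)


# Invented predicted risks for three cases (1) and three non-cases (0)
labels = [1, 1, 1, 0, 0, 0]
old = [0.30, 0.20, 0.10, 0.15, 0.08, 0.05]
new = [0.60, 0.40, 0.10, 0.15, 0.04, 0.02]  # same pairwise ordering as old

print(f"AROC old = {aroc(old, labels):.3f}, AROC new = {aroc(new, labels):.3f}")
print(f"IDI = {idi(old, new, labels):.2f}")
```

In this constructed example the second model orders every case/non-case pair exactly as the first does, so the AROC is unchanged, whereas the IDI is positive because the predicted risks of cases and non-cases are further apart.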

Third, beyond optimising the predictive ability of diabetes risk scores, there is a wide range of issues which have not been considered in this review. From a public health perspective, it has to be asked whether diabetes risk scores are accepted by physicians, and which barriers might prevent physicians from using them; how scores are best implemented in clinical practice; to what extent intuitive risk assessments made by physicians are concordant with score-based assessments; and how effective and efficient diabetes prediction models are. These questions have hardly been addressed so far. Another issue to consider regarding non-economic costs relates to false positive test results (which could increase anxiety) and false negative risk estimates (which could lead to false reassurance). Finally, the successful implementation of any prognostic diabetes model will depend on a cost-effective intervention strategy for those persons who are found to be at high risk of developing type 2 diabetes. This list demonstrates that the performance of novel biomarkers in risk models needs to be investigated in a substantially larger context than it is currently before recommendations for their widespread use can be given with certainty.

Acknowledgements

The authors would like to thank Kirti Kaul (Institute for Clinical Diabetology, German Diabetes Center, Düsseldorf, Germany) for critically reading the manuscript and for helpful discussions.

Funding

The German Diabetes Center is funded by the German Federal Ministry of Health (Berlin, Germany) and the Ministry of Innovation, Science and Research of the State of North Rhine-Westphalia (Düsseldorf, Germany). This study was supported in part by a grant from the German Federal Ministry of Education and Research (BMBF) to the German Center for Diabetes Research (DZD e.V.). This work was also supported by the research project Greifswald Approach to Individualized Medicine (GANI_MED). The GANI_MED consortium is funded by the Federal Ministry of Education and Research and the Ministry of Cultural Affairs of the Federal State of Mecklenburg-West Pomerania (support code: 03IS2061D). AGT is supported by TÁMOP 4.2.4.A/1-11-1-2012-0001 National Excellence Program–research fellowship co-financed by the European Union and the European Social Fund.

Duality of interest

The authors declare that there is no duality of interest associated with this manuscript.

Contribution statement

CH and WR were responsible for the conception of this review. CH, BK, AGT and WR were responsible for the design and drafting of the manuscript and approved the final version of the manuscript.

Supplementary material

ESM Table 1 (PDF 43 kb)
ESM Table 2 (PDF 66.3 kb)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Christian Herder (1)
  • Bernd Kowall (2)
  • Adam G. Tabak (3, 4)
  • Wolfgang Rathmann (2)

  1. Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  2. Institute of Biometrics and Epidemiology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  3. Department of Epidemiology and Public Health, University College London, London, UK
  4. 1st Department of Medicine, Semmelweis University Faculty of Medicine, Budapest, Hungary