Volume 57, Issue 1, pp 16–29

The potential of novel biomarkers to improve risk prediction of type 2 diabetes

  • Christian Herder
  • Bernd Kowall
  • Adam G. Tabak
  • Wolfgang Rathmann


The incidence of type 2 diabetes can be reduced substantially by implementing preventive measures in high-risk individuals, but this requires prior knowledge of disease risk in the individual. Various diabetes risk models have been designed, and these have all included a similar combination of factors, such as age, sex, obesity, hypertension, lifestyle factors, family history of diabetes and metabolic traits. The accuracy of prediction models is often assessed by the area under the receiver operating characteristic curve (AROC) as a measure of discrimination, but AROCs should be complemented by measures of calibration and reclassification to estimate the incremental value of novel biomarkers. This review discusses the potential of novel biomarkers to improve model accuracy. The range of molecules that serve as potential predictors of type 2 diabetes includes genetic variants, RNA transcripts, peptides and proteins, lipids and small metabolites. Some of these biomarkers lead to a statistically significant increase of model accuracy, but their incremental value currently seems too small for routine clinical use. However, only a fraction of potentially relevant biomarkers have been assessed with regard to their predictive value. Moreover, serial measurements of biomarkers may help determine individual risk. In conclusion, current risk models provide valuable tools of risk estimation, but perform suboptimally in the prediction of individual diabetes risk. Novel biomarkers still fail to have a clinically applicable impact. However, more efficient use of biomarker data and technological advances in their measurement in clinical settings may allow the development of more accurate predictive models in the future.


Keywords: Biomarkers, Calibration, Discrimination, Genetic variants, Metabolomics, Prediction models, Reclassification, Repeated measures, Review, Risk scores, Type 2 diabetes



Abbreviations

AROC: Area under the receiver operating characteristic curve
IDI: Integrated discrimination improvement
IFG: Impaired fasting glucose
IGT: Impaired glucose tolerance
KORA: Cooperative Health Research in the Region of Augsburg
NRI: Net reclassification improvement
SNP: Single nucleotide polymorphism


Non-pharmacological and pharmacological interventions are able to decrease the incidence of type 2 diabetes in high-risk individuals. The ultimate aim of these interventions is the prevention or the delay of the onset of diabetes-related macro- and microvascular complications that often lead to considerable morbidity and premature death, but a considerable number of individuals who could benefit from such interventions are not aware of their disease risk. Numerous prognostic models and scores for type 2 diabetes have been developed [1, 2, 3] based on known risk factors, including age, sex, obesity, metabolic and lifestyle factors, family history of diabetes or ethnic background. Given that the performance of these risk scores is often far from perfect, it is desirable to identify novel prognostic factors, such as biomarkers from ‘-omics’ technologies, with the aim of achieving better model accuracy.

The main purpose of this review is to appraise the potential of novel biomarkers to improve risk prediction for type 2 diabetes. To achieve this, we will proceed in four steps. First, we will critically discuss statistical methods to compare risk scores without and with biomarkers and to quantify any potential improvement. Second, we will very briefly summarise contemporary methodology to assess the performance of diabetes risk scores based on established risk factors. Third, we will provide an overview of novel biomarkers that have either been investigated in the context of risk prediction or will soon be available for such analyses. Fourth, we will suggest approaches to make more efficient use of biomarkers, and discuss limitations which we should be aware of in our search for the optimal risk model for incident type 2 diabetes.

Methodological issues involved in the assessment and comparison of the performance of risk models using different biomarkers

As reviewed recently [4, 5, 6], it is important to consider methodological issues in the development of risk prediction models. Before the incremental value of novel biomarkers for diabetes prediction can be evaluated, several specific issues related to model performance deserve attention and will be briefly summarised.

Discrimination measures

First, suitable measures of discrimination must be selected. How well a risk model identifies those who will develop a disease over the follow-up time in a cohort study is defined as discrimination. Three common measures of discrimination are explained in the text box [7, 8, 9, 10]. The most popular measure of discrimination is the area under the receiver operating characteristic curve (AROC) or c statistic. The interpretation of AROCs might not be straightforward. For example, an AROC of 0.8 does not mean that 80% of persons who will develop diabetes are actually identified but, rather, that the likelihood is 80% that a randomly selected case (i.e. a person who will develop diabetes) will be assigned a higher estimated diabetes risk than a randomly selected non-case (i.e. a person who will remain diabetes-free). AROCs only use rank information and are quite insensitive to the addition of even strong risk predictors to an established model with a reasonable predictive ability [7, 8].
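The pairwise interpretation of the AROC can be made concrete with a minimal Python sketch. The risk values are hypothetical, and counting a tie as half a concordant pair is an assumed convention that the text does not specify.

```python
from itertools import product


def aroc(risks_cases, risks_noncases):
    """Proportion of case/non-case pairs in which the case has the higher
    estimated risk. Ties count as half a concordant pair (assumed convention)."""
    pairs = list(product(risks_cases, risks_noncases))
    concordant = sum(
        1.0 if case > noncase else 0.5 if case == noncase else 0.0
        for case, noncase in pairs
    )
    return concordant / len(pairs)


# Hypothetical estimated risks, for illustration only.
cases = [0.30, 0.12, 0.08]            # persons who later develop diabetes
noncases = [0.10, 0.05, 0.04, 0.02]   # persons who remain diabetes-free

print(aroc(cases, noncases))

# Squaring is a monotone transformation on [0, 1]: the ranks are unchanged,
# so the AROC is unchanged even though the absolute risks differ.
print(aroc([r ** 2 for r in cases], [r ** 2 for r in noncases]))
```

In this toy dataset the case is ranked above the non-case in 11 of the 12 pairs. The second call returns the same value, which illustrates that the AROC uses rank information only and says nothing about whether the absolute risks are correct.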

Text box: Measures of discrimination for prediction models

Area under the receiver operating characteristic curve (AROC, c statistic) [7, 8]

Definition
  • Among all the pairs in the cohort consisting of one participant with and one without incident diabetes, the AROC is the proportion of pairs for which the probability of getting the disease as estimated by the prediction model is larger for the case than for the non-case.
  • Area under a plot of sensitivity (true-positive rate) vs 1 − specificity (false-positive rate).

Interpretation
  • An AROC of 0.5 means that the prediction model is no better than tossing a coin (in other words: there is one true positive case for each false positive case).
  • An AROC of 0.9 is excellent.
  • An AROC of 1 is the maximum.

Advantages and disadvantages
  • AROCs are routinely provided by statistical packages for logistic regression models.
  • To determine whether increases in AROCs upon addition of new markers to a model are statistically significant, specific tests are available [9].
  • AROCs only make use of rank information.
  • They are quite insensitive to the addition of new markers to established models.
  • They cover the whole range of cut-off values.

Net reclassification improvement (NRI)

Definition
  • NRIs require risk categories to be defined (e.g. 0–<5%, 5–<10%, 10–<20%, ≥20%). If a given model is compared with a new model that includes an additional new marker, NRI is calculated as follows:
      + probability of classification in a higher risk category for cases
      − probability of classification in a lower risk category for cases
      + probability of classification in a lower risk category for non-cases
      − probability of classification in a higher risk category for non-cases.

Interpretation
  • An NRI of 0 means no predictive improvement by the new marker.
  • A large NRI indicates that a high proportion of individuals moves into a more appropriate category of predicted risk.

Advantages and disadvantages
  • NRIs may indicate changes in risk category when changes in AROC are minimal.
  • Tests for the null hypothesis, NRI = 0, are available.
  • NRIs only reflect changes in predictive ability. They cannot be used to characterise the predictive ability of a given model per se.
  • NRIs strongly depend on the number of risk categories and their cut-off points.
  • In the absence of established risk categories, category-free NRIs can be used for comparison purposes.

Integrated discrimination improvement (IDI) [10]

Definition
  • IDIs represent a continuous version of NRI.

Interpretation
  • An IDI of 0 means no predictive improvement.

Advantages and disadvantages
  • As for NRI, IDI can reveal changes in disease risk when changes in AROC are minimal.
  • Fixing risk categories is not necessary.
  • Like NRIs, IDIs only reflect changes in predictive ability.

When modifying risk scores, it might be most important to improve the risk stratification of individuals thought to be at intermediate risk (i.e. those with a diabetes risk in the range of 5–15%). However, the AROC and the integrated discrimination improvement (IDI) are both continuous measures of discrimination that do not require fixed cut-off points. An increase in AROC or IDI therefore does not necessarily indicate better prediction in persons at intermediate risk, because such an increase could also be due to more accurate prediction in persons at apparently high or low risk [11].

The net reclassification improvement (NRI) might be more sensitive than AROCs to the incremental predictive power of new markers [7]. Moreover, in calculating NRIs, the improvement of diabetes prediction can be considered separately for cases and non-cases at low, intermediate and high risk of diabetes. However, the NRI also has some caveats [10]. In particular, it is strongly dependent on the number of risk categories and on the cut-off points selected for risk stratification [12]. Thus, for calculating NRIs, Leening and Cook strongly recommend a priori risk classifications with a clinical meaning [13], but with respect to diabetes, there are no standard risk categories yet.
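The categorical NRI and its dependence on the chosen cut-off points can be sketched in a few lines of Python. The cut-offs below are one example set of risk categories, not established categories, and all risk values are hypothetical.

```python
from bisect import bisect_right

# Assumed example categories: 0-<5%, 5-<10%, 10-<20%, >=20%.
CUTOFFS = [0.05, 0.10, 0.20]


def category(risk, cutoffs):
    """Index of the risk category that contains `risk`."""
    return bisect_right(cutoffs, risk)


def categorical_nri(old, new, is_case, cutoffs=CUTOFFS):
    """Return (NRI, NRI among cases, NRI among non-cases)."""
    up = {True: 0, False: 0}
    down = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for r_old, r_new, case in zip(old, new, is_case):
        case = bool(case)
        total[case] += 1
        shift = category(r_new, cutoffs) - category(r_old, cutoffs)
        if shift > 0:
            up[case] += 1
        elif shift < 0:
            down[case] += 1
    nri_cases = (up[True] - down[True]) / total[True]
    nri_noncases = (down[False] - up[False]) / total[False]
    return nri_cases + nri_noncases, nri_cases, nri_noncases


# Hypothetical risks from a model without (old) and with (new) a marker.
old = [0.04, 0.12, 0.08, 0.06, 0.03, 0.15]
new = [0.07, 0.22, 0.08, 0.03, 0.03, 0.25]
is_case = [1, 1, 1, 0, 0, 0]

print(categorical_nri(old, new, is_case))          # four categories
print(categorical_nri(old, new, is_case, [0.10]))  # two categories
```

With the four example categories, two of the three cases move up and the movements among non-cases cancel out, giving an NRI of 2/3. With a single cut-off at 10%, nobody changes category and the NRI is 0 for the same pair of models, which illustrates why a priori categories with a clinical meaning are recommended.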

In view of the strengths and drawbacks of AROC, NRI and IDI, Pencina et al suggested using all three measures of discrimination to assess the incremental value of a new marker [14]. Pencina and co-workers proposed a category-free NRI that does not depend on the selection and the number of categories but is instead based on any change in the estimated risks [15]. In the absence of established risk categories for type 2 diabetes, category-free NRIs might be useful for comparison purposes.
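As a minimal sketch of the two category-independent measures, the following Python code computes a category-free NRI (any increase or decrease in estimated risk counts as a movement) and the IDI (the change in the difference between the mean estimated risk of cases and that of non-cases). The data are hypothetical and the function names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def category_free_nri(old, new, is_case):
    """NRI based on the direction of any change in estimated risk."""
    def net_upward(flag):
        changes = [n - o for o, n, c in zip(old, new, is_case) if bool(c) == flag]
        ups = sum(1 for d in changes if d > 0)
        downs = sum(1 for d in changes if d < 0)
        return (ups - downs) / len(changes)
    # Upward movement is favourable for cases, downward for non-cases.
    return net_upward(True) - net_upward(False)


def idi(old, new, is_case):
    """Gain in mean estimated risk among cases minus the gain among non-cases."""
    def gain(flag):
        return (mean([n for n, c in zip(new, is_case) if bool(c) == flag])
                - mean([o for o, c in zip(old, is_case) if bool(c) == flag]))
    return gain(True) - gain(False)


# Hypothetical risks from a model without (old) and with (new) a marker.
old = [0.04, 0.12, 0.08, 0.06, 0.03, 0.15]
new = [0.07, 0.22, 0.08, 0.03, 0.03, 0.25]
is_case = [1, 1, 1, 0, 0, 0]

print(category_free_nri(old, new, is_case))
print(idi(old, new, is_case))
```

Neither function needs cut-off points. Both describe only the change between two models, so they are best read alongside the AROC of each model.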


Besides discrimination, calibration is of particular importance in the application of prediction models. Calibration refers to how well the predicted probabilities agree with the observed diabetes risk. Calibration might be poor if the prevalence of diabetes in the dataset used to develop the score differs widely from that in the population in which it is applied. A common test of calibration is the Hosmer–Lemeshow test, which is based on a χ² statistic. To cope with poor calibration, several methods for updating models have been suggested [16]. Updating methods range from simple recalibration methods to more sophisticated revision methods. In its simplest form, recalibration means adjusting the intercept of the prediction model while leaving the regression coefficients unchanged. A further recalibration method includes adjustment of the intercept and multiplication of all regression coefficients by the same factor (calibration slope) [16].
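The simplest form of recalibration can be sketched as follows, assuming a logistic model so that each estimated risk corresponds to a linear predictor on the logit scale. The intercept shift is found by bisection so that the mean recalibrated risk equals the observed proportion of cases. The function names and data are hypothetical, and the optional slope parameter is applied to the whole original linear predictor, which is an assumption of this sketch.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate(risk, intercept_shift=0.0, slope=1.0):
    """Shift the intercept and scale the original linear predictor by `slope`."""
    return expit(intercept_shift + slope * logit(risk))


def fit_intercept_shift(risks, outcomes, lo=-10.0, hi=10.0, tol=1e-10):
    """Intercept shift that makes the mean recalibrated risk equal to the
    observed proportion of cases; regression coefficients stay unchanged."""
    observed = sum(outcomes) / len(outcomes)

    def gap(shift):
        predicted = sum(recalibrate(r, shift) for r in risks) / len(risks)
        return predicted - observed

    # gap() increases monotonically with the shift, so bisection converges.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


# Hypothetical validation data: the model underestimates risk on average.
risks = [0.05, 0.10, 0.20, 0.40]
outcomes = [0, 0, 0, 1]

shift = fit_intercept_shift(risks, outcomes)
updated = [recalibrate(r, shift) for r in risks]
print(shift, updated)
```

Because the same shift is added to every linear predictor, the ranking of individuals and therefore the AROC are unchanged. Only the agreement between mean predicted and observed risk improves.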

An example of a calibration assessment is provided in electronic supplementary material (ESM) Table 1. A non-invasive diabetes prediction model was applied to its development dataset from the Cooperative Health Research in the Region of Augsburg (KORA) S4/F4 cohort. For each participant, the probability of developing the disease was estimated according to the risk score and, based on these estimated probabilities, participants were ranked from the lowest to the highest estimated risk and grouped into ten groups of approximately equal size (deciles). Thus, for each decile the number of expected diabetes cases can be calculated (number of individuals in the decile multiplied by the mean estimated risk for that decile) and compared with the observed number of incident cases. If the observed number of cases differs considerably from the expected number in one or more deciles, the test results in a low p value, indicating poor calibration. In the example, the expected and the observed numbers of cases are similar, which is reflected by a non-significant p value of 0.66.
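The grouping procedure described above can be sketched in Python. This is an illustration under stated assumptions, not the analysis behind ESM Table 1: the data are hypothetical, the degrees of freedom are set to the number of groups minus 2 (a common convention for development data), and the p value uses a closed-form χ² tail that holds only for even degrees of freedom. Four groups are used instead of ten to keep the toy dataset small.

```python
import math


def hosmer_lemeshow(risks, outcomes, groups=10):
    """Return the Hosmer-Lemeshow chi-squared statistic and its assumed
    degrees of freedom (groups - 2)."""
    ranked = sorted(zip(risks, outcomes))  # lowest to highest estimated risk
    n = len(ranked)
    chi2 = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(r for r, _ in chunk)  # size x mean estimated risk
        observed = sum(y for _, y in chunk)
        # Assumes 0 < expected < size in every group.
        chi2 += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return chi2, groups - 2


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable; even df only."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Hypothetical data: 4 risk levels, 10 persons each, k cases at level k/10.
well_calibrated, outcomes = [], []
for k, risk in enumerate([0.1, 0.2, 0.3, 0.4], start=1):
    well_calibrated += [risk] * 10
    outcomes += [1] * k + [0] * (10 - k)
overestimated = [2 * r for r in well_calibrated]  # every risk doubled

for label, risks in (("well calibrated", well_calibrated), ("overestimated", overestimated)):
    chi2, df = hosmer_lemeshow(risks, outcomes, groups=4)
    print(label, round(chi2, 3), chi2_sf_even_df(chi2, df))
```

In the first dataset the observed and expected counts agree in every group, so the statistic is essentially 0 and the p value is close to 1. Doubling every estimated risk leaves the ranking (and hence the AROC) untouched but produces a large statistic and a small p value, which shows that discrimination and calibration measure different things.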

Internal and external validation

Risk scores often show model overfit in the datasets used for model development. This is because regression coefficients are estimated with maximum likelihood methods, so that the prediction of the outcome in the original data is optimal. Thus, AROCs obtained from original data are often considerably larger than AROCs obtained from other independent data. Therefore, model overfit requires external validation of a prediction model before its widespread use.

External validation means applying the prediction model to a dataset with different individuals and re-assessing the measures of model performance. Internal validation (such as cross-validation or bootstrapping methods) is not a full equivalent to external validation, as internal validation still relies on the original data [5]. The use of very heterogeneous datasets for external validation is a widely neglected source of error [8]. AROCs are calculated as the proportion of pairs composed of one case and one non-case, where the estimated probability is larger for the case than for the non-case. As an example, in a dataset including large proportions of younger and older subjects, there are many pairs of one younger, healthy person who does not develop diabetes, and one older person who develops diabetes. Even poor prediction models assign larger diabetes probabilities to the older case than to the younger non-case, which leads to an increase in the AROC. This means that, for example, AROCs are larger when they are calculated for younger and middle-aged subjects than for middle-aged subjects alone. Examples of external validation of diabetes risk scores are given in Table 1 [17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]. Quite often, not all the risk factors included in the original score are available in the dataset used for external validation. Thus, the original prediction models sometimes undergo some transformation before external validation.
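The effect of a heterogeneous validation dataset on the AROC can be illustrated with a deliberately poor hypothetical model that uses age as its only predictor. The ages and outcomes below are invented for illustration.

```python
from itertools import product


def aroc(risks_cases, risks_noncases):
    """Proportion of case/non-case pairs ranked correctly (ties count half)."""
    pairs = list(product(risks_cases, risks_noncases))
    score = sum(1.0 if c > n else 0.5 if c == n else 0.0 for c, n in pairs)
    return score / len(pairs)


def poor_model(age):
    """Hypothetical prediction model: estimated risk rises with age only."""
    return age / 100.0


def aroc_for(cohort):
    cases = [poor_model(age) for age, diabetes in cohort if diabetes]
    noncases = [poor_model(age) for age, diabetes in cohort if not diabetes]
    return aroc(cases, noncases)


# (age, incident diabetes) for hypothetical participants.
middle_aged = [(50, 0), (51, 0), (52, 1), (53, 0), (54, 0),
               (55, 1), (56, 0), (57, 0), (58, 1), (59, 0)]
younger = [(age, 0) for age in range(30, 40)]  # none develops diabetes

print(aroc_for(middle_aged))            # middle-aged participants only
print(aroc_for(younger + middle_aged))  # wider age range
```

Among the middle-aged participants the model ranks 12 of 21 case/non-case pairs correctly. Adding the younger participants contributes 30 further pairs that are all ranked correctly, so the AROC rises to 42/51 although the model itself has not changed.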
Table 1

Diabetes risk models with examples of external evaluation

| Study, country, reference | Risk factors included in score: non-invasive variables | Risk factors included in score: metabolic risk factors | Original data set: AROC (95% CI) | Original data set: calibration | External validation: study, country, reference | External validation: AROC (95% CI) | External validation: calibration |
|---|---|---|---|---|---|---|---|
| Cambridge Risk Score, UK [17] | Age, sex, BMI, smoking, current use of corticosteroids, antihypertensive drug use, diabetes family history | | | | Whitehall II, UK [26] | 0.72 (0.69, 0.76) | Hosmer–Lemeshow p = 0.77 |
| KORA, Germany [18] | Age, sex, BMI, smoking, hypertension, parental diabetes | | | | PREVEND, the Netherlands [27] | 0.66 (0.63, 0.70) | Hosmer–Lemeshow p < 0.001 |
| FINDRISC, Finland [19] | Age, BMI, waist circumference, antihypertensive drug use, history of high blood glucose, physical inactivity, diet (vegetables, fruits, berries) | | | | DETECT-2: the Netherlands, Denmark, Sweden, UK, Australia [28] | Recalibrated: 0.77 (0.75, 0.78) | Hosmer–Lemeshow p = 0.27 |
| QDScore, UK [20] | Age, sex, ethnicity, BMI, smoking, diabetes family history, Townsend deprivation score, treated hypertension, cardiovascular disease, current use of corticosteroids | | Men, 0.83 (0.83, 0.84); women, 0.85 (0.85, 0.86) | Brier score: men, 0.078; women, 0.058 | Health Improvement Network database, UK [29] | Men, 0.80; women, 0.81 | Brier score: men, 0.053; women, 0.041 |
| AUSDRISK, Australia [21] | Age, sex, ethnicity, parental diabetes, history of high blood glucose, antihypertensive drug use, smoking, physical inactivity, waist circumference | | 0.78 (0.76, 0.81) | Hosmer–Lemeshow p = 0.85 | Blue Mountains Eye Study, Australia [21] | 0.66 (0.60, 0.71) | Hosmer–Lemeshow p = 0.32 |
| | | | | | North West Adelaide Health Study, Australia [21] | 0.79 (0.72, 0.86) | Hosmer–Lemeshow p < 0.001 |
| ARIC-1, USA [22] | Diabetic mother, diabetic father, hypertension, black race, age 55 to 64 years, ever smoking, waist circumference, height, resting pulse, weight | | 0.71 (0.69, 0.73) | | KORA S4/F4, Germany [18] | 0.75 (0.70, 0.80) | |
| ARIC-2, USA [22] | Diabetic mother, diabetic father, hypertension, black race, age 55 to 64 years, never or former drinking, waist circumference, height, resting pulse | Glucose, triacylglycerol, HDL-C, uric acid | 0.79 (0.77, 0.81) | | KORA S4/F4, Germany [18] | 0.79 (0.74, 0.84) | |
| San Antonio, USA [23] | Age, sex, BMI, ethnicity, systolic BP, family history of diabetes | HDL-C, fasting plasma glucose | 0.84 (0.82, 0.87) | Hosmer–Lemeshow p > 0.2 | Multi-Ethnic Study of Atherosclerosis, USA [30] | 0.83 (0.81, 0.85) | Hosmer–Lemeshow p < 0.001, after recalibration p > 0.10 |
| ARIC, USA [24] | Age, race, waist circumference, height, systolic BP, family history of diabetes | Fasting plasma glucose, triacylglycerol, HDL-C | | | Multi-Ethnic Study of Atherosclerosis, USA [30] | 0.84 (0.82, 0.86) | Hosmer–Lemeshow p < 0.001, after recalibration p > 0.10 |
| Framingham Offspring Study, USA [25] | BMI, HDL-C, parental history of diabetes, BP | Fasting plasma glucose, triacylglycerol | | | Multi-Ethnic Study of Atherosclerosis, USA [30] | 0.78 (0.74, 0.82) | Hosmer–Lemeshow p < 0.001, after recalibration p > 0.10 |
| KORA, Germany [18] | Age, sex, BMI, parental history of diabetes, smoking, hypertension | Fasting plasma glucose, HbA1c, uric acid | 0.84 (0.80, 0.89) | Hosmer–Lemeshow p = 0.45, Brier score 0.072 | PREVEND, the Netherlands [27]^a | 0.81 (0.78, 0.84) | Hosmer–Lemeshow p < 0.001, after recalibration p = 0.35 |

ARIC, Atherosclerosis Risk in Communities; AUSDRISK, Australian Type 2 Diabetes Risk Assessment Tool; FINDRISC, Finnish Diabetes Risk Score; HDL-C, HDL-cholesterol; KORA, Cooperative Health Research in the Region of Augsburg; N/A, not assessed; PREVEND, Prevention of Renal and Vascular Endstage Disease

^a HbA1c not available in the validation dataset

External validation is a key component to assess the extent to which novel biomarkers can improve risk prediction. Genome-wide association studies often demonstrate a so-called ‘winner’s curse’, with more pronounced associations in discovery datasets than in replication datasets. These data from genomic studies clearly show the importance of external validation for all biomarkers before their inclusion into prediction models.

Prediction models with established, non-invasive and conventional clinical variables

Model accuracy is most commonly assessed by AROCs. In the examples given in Table 1 [17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], AROCs of 0.71 to 0.78 have been achieved with non-invasive models, while models including measures of glycaemia or routine metabolic laboratory analyses have achieved AROCs of up to 0.85. Fasting and postload glucose levels are by themselves strong predictors of diabetes. Thus, the extent to which glycaemic measures contribute to diabetes risk scores should be discussed briefly. Individuals with elevated HbA1c (6.0–6.4% [42–46 mmol/mol]), impaired fasting glucose ([IFG] fasting glucose 6.1–6.9 mmol/l) or impaired glucose tolerance ([IGT] 2 h OGTT glucose 7.8–11.1 mmol/l) have a strongly increased risk for type 2 diabetes compared with normoglycaemic people [31]. Persons with IFG and IGT have an even higher risk of diabetes than those who have only one of the two disorders [31]. As an example, in the KORA cohort of older participants, almost half of those with IFG and IGT combined developed type 2 diabetes over 7 years [32].

The main metabolic risk factors for isolated IFG and isolated IGT are different. The pathophysiology of isolated IFG seems to include reduced hepatic insulin sensitivity, beta cell dysfunction and low beta cell mass [33]. In contrast, isolated IGT is characterised by reduced peripheral insulin sensitivity but near normal hepatic insulin sensitivity and progressive loss of beta cell function. Individuals with combined IFG and IGT exhibit severe defects in both peripheral and hepatic insulin sensitivity, as well as loss of beta cell function [33].

Although a clearly increased diabetes risk can be observed for individuals with isolated IFG and those with isolated IGT, the categorisation of individuals as either ‘normal’ or ‘pre-diabetic’ (IFG, IGT) neglects the fact that a significant increase in diabetes risk also exists for increasing fasting glucose levels within the normal range [34]. Glycaemic measures (fasting and 2 h glucose, HbA1c) are strong diabetes risk predictors, but may be more useful without classification, e.g. as a continuous risk factor. This has been indicated in the German KORA and the Danish Inter99 studies [18, 35].

It is of clinical importance whether a single glycaemic measure performs as well as a simple clinical score. In the multiethnic Atherosclerosis Risk in Communities (ARIC) study the simple risk score including waist circumference, height, blood pressure, family history of diabetes, ethnicity and age performed similarly to fasting glucose alone (AROC 0.71 vs 0.74, p = 0.2) [22]. Figure 1 and ESM Table 2 show that the separate addition of fasting glucose, HbA1c or 2 h glucose to basic models with non-invasive variables leads to a strong increase in model accuracy [18, 24, 36, 37, 38]. In several studies, HbA1c improved the predictive power to a similar extent to fasting glucose. In the KORA study, the strongest incremental value was seen on the addition of 2 h glucose [18]. In the Study of Health in Pomerania (SHIP) cohort, even random glucose improved the predictive ability of diabetes risk scores [36]. Thus, the predictive potential of glucose values can also be used in non-fasting participants.
Fig. 1

Increase in the AROC achieved by adding glycaemic measures to a basic prediction model: KORA S4/F4 Study. Data are from the KORA S4/F4 Study (n = 881; age range 55–74 years; 7-year follow-up) [18]. Please see ESM Table 2 for 95% CIs of AROCs. The basic model included age, sex, BMI, hypertension, parental diabetes and former or present smoking. Diabetes was ascertained by validated self-report or OGTT. *p < 0.05; **p < 0.01; ***p < 0.001 vs the basic model

Taken together, non-invasive risk factors including age, sex, BMI, waist circumference, family history, smoking or hypertension form the basis of all diabetes risk scores. Routine clinical biomarkers, such as glucose, HbA1c, lipids and uric acid, have the potential to improve the predictive ability of these basic risk factors, but AROCs rarely exceed 0.85. This argues in favour of a search for novel risk factors to further improve the accuracy of diabetes risk models.

Novel biomarkers from ‘-omics’ technologies as potential components of risk models

Despite moderate or even good model accuracy in some studies (Table 1, ESM Table 2), current prediction algorithms leave room for improvement and raise the question of whether novel biomarkers could be clinically useful, particularly if they could improve risk models that already contain measures of glycaemia. The range of molecules that could serve as potential biomarkers of diabetes risk includes genetic variants, RNA transcripts, peptides and proteins, lipids and small metabolites, cellular markers and metabolic waste products [39]. Owing to current advances in ‘-omics’ technologies, such as genomics, transcriptomics, proteomics and metabolomics, the number of candidate biomarkers keeps growing; however, only a small proportion of these has been investigated with reference to their potential to improve the prediction of type 2 diabetes.

Genetic variants

The heritability of glycaemic traits and type 2 diabetes is high [40], and the large genome-wide association studies published since the first in 2007, based on up to >10⁵ study participants, have helped us to better understand the genetic architecture of this disease. Single nucleotide polymorphisms (SNPs) in more than 60 regions throughout the genome (so-called susceptibility loci containing multiple genes) were found to be associated with the risk of type 2 diabetes [39, 41, 42, 43, 44]. Most of these SNPs are common, with risk allele frequencies of 10–90%. Interestingly, loci associated with diabetes risk show only a partial overlap with loci that determine levels of fasting glucose, 2 h glucose and HbA1c. Thus, some loci influence both disease risk and glycaemic traits, whereas others seem to mainly regulate glucose levels within the physiological range without affecting the development of overt type 2 diabetes, and vice versa [45, 46].

Most susceptibility loci harbour genes that play a role in pancreatic development and in beta cell function in adults, whereas loci that could be linked to insulin resistance are less frequent [43, 46, 47, 48]. Other loci are enriched in genes involved in cell cycle regulation, adipocytokine signalling, CREB binding protein (CREBBP)-related transcription and regulation of circadian rhythm [43, 44]. It can be expected that the search for the causal variants within these loci will reveal novel pathophysiological mechanisms that may serve as therapeutic targets.

The currently known risk variants have rather modest effect sizes; the presence of each risk variant or allele is only associated with increases in diabetes risk of between 5% and 40% (ORs 1.05–1.4). Therefore, these loci do not explain more than 10–15% of the estimated genetic heritability of type 2 diabetes [44, 49]. This estimate is in line with the observation that known risk variants explain only a small fraction of family history-associated diabetes risk [50]. Combinations of up to 40 SNPs resulted in AROCs of 0.55–0.63, which is substantially lower than those achieved by age, sex and BMI alone. In some studies, the addition of genotype information to models based on established anthropometric and clinical risk factors led to statistically significant increases in AROCs, but these improvements were usually not larger than 0.03 [51, 52]. In line with the findings for AROCs, only a few studies reported improvements of NRI and/or IDI by including SNP data, but these improvements were always too low to be of clinical relevance [53, 54].
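To illustrate how several variants with modest effects can be combined, the following Python sketch computes a weighted genetic risk score in which each risk allele contributes the natural logarithm of its per-allele OR. This is one common way of combining SNPs and not necessarily the approach used in the cited studies. The SNP names and ORs are hypothetical (chosen within the range quoted above), and the sketch assumes independent, log-additive effects.

```python
import math

# Hypothetical per-allele odds ratios within the 1.05-1.4 range quoted above.
PER_ALLELE_OR = {"snp_a": 1.40, "snp_b": 1.15, "snp_c": 1.05}


def weighted_grs(genotype, per_allele_or=PER_ALLELE_OR):
    """Sum of risk-allele counts (0, 1 or 2) weighted by ln(per-allele OR)."""
    score = 0.0
    for snp, count in genotype.items():
        if count not in (0, 1, 2):
            raise ValueError(f"risk-allele count for {snp} must be 0, 1 or 2")
        score += count * math.log(per_allele_or[snp])
    return score


def combined_or(genotype, per_allele_or=PER_ALLELE_OR):
    """OR relative to a person carrying none of the risk alleles, under the
    assumed log-additive model."""
    return math.exp(weighted_grs(genotype, per_allele_or))


# Hypothetical person: two risk alleles at snp_a, one at snp_b, none at snp_c.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
print(weighted_grs(person), combined_or(person))
```

Under these assumptions the combined OR for this person is 1.4 × 1.4 × 1.15. An OR describes relative odds only, so a score of this kind does not by itself show how well cases and non-cases are separated, which is what the AROC measures.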

It should be noted that the effect of genetic markers on risk prediction may be more pronounced in younger individuals, in leaner persons and in studies with long follow-up periods [53, 54], but few studies on young populations, in which the assessment of future genetic risk may be most relevant, are currently available [55]. The initial age of individuals is closely related to the time horizon for any model to predict type 2 diabetes. Several prospective studies have applied genetic risk scores for follow-up times of approximately 10 years. This time period corresponds to that in tools such as the Framingham Risk Score, which estimates an individual’s 10-year risk for incident cardiovascular disease. It has been proposed that genetic risk scores might be more helpful in longer term prediction because, in contrast to variables used in clinical risk scores, genetic variants do not change over time [52, 56]. Eventually, the time horizon for risk models needs to correspond to the period before the onset of type 2 diabetes in which preventive efforts are most effective.

Another caveat is that most genome-wide association and prediction studies have been conducted in populations of European descent [44, 51, 52], and case–control and prospective genetic studies in African-American [57, 58] or Asian [59, 60, 61] populations are still rare. It has been hypothesised that different risk alleles and allele frequencies in various ethnic groups could contribute to global differences in incidence rates of type 2 diabetes [62], but this needs to be corroborated in further studies.

Recent simulation studies indicate that adding hundreds or even several thousand common SNPs that currently fall below the threshold of genome-wide significance to prediction models may capture up to half of the risk of type 2 diabetes and thus most of the genetic component [43]. In addition to the investigation of common SNPs, ongoing projects using DNA sequencing are addressing the issue of ‘missing heritability’, leading to the identification of further risk variants, especially those with lower risk allele frequencies. One recent study of the MTNR1B locus encoding melatonin receptor 1B indicated that this locus may contain not only common variants with low effect sizes (ORs <1.4), but also rare loss-of-function variants of the receptor with considerably stronger associations with the risk of type 2 diabetes (OR 5.7, 95% CI 2.2, 14.8) [63]. Sequencing of all genes in the genome (exome sequencing), as recently reported for a Danish case–control study [64], and whole-genome resequencing, as performed in the 1000 Genomes Project [65], will improve our understanding of the potential relevance of low-frequency (0.5–5%) and rare (<0.5%) variants in the development of type 2 diabetes [66]. It remains to be seen to what extent ongoing studies and analyses of other kinds of genetic variations such as copy number and structural variations will contribute to more precise risk assessment.

Finally, it should also be noted that the problem of ‘missing heritability’ does not only refer to the proportion of phenotypic variance that can be explained by known risk variants (the numerator, which will undoubtedly increase with further studies). ‘Missing heritability’ is also affected by the total phenotypic variance of type 2 diabetes caused by genetic variants, which represents the denominator in our formula for estimating the proportion of explained heritability. It is difficult to accurately assess total phenotypic variance because it may be inflated by ill-defined shared environmental factors in families, by gene–gene interactions and by epigenetic phenomena. Therefore, a more precise quantification of total heritability is required to better define the contribution that genetic data can make to models of risk prediction.

Transcriptomics and type 2 diabetes: RNA species

mRNAs and microRNAs (miRNAs) from various tissues have been investigated as biomarkers of type 2 diabetes, mainly in small and cross-sectional studies [67]. Consequently, it is not clear whether the analysis of the human transcriptome can improve the accuracy of current risk scores. In the context of risk assessment, blood samples appear to be the most suitable biomaterial for transcriptome analyses because they are routinely obtained clinically. Methods for the analysis of transcriptomics datasets in relation to phenotypes and disease risk are currently being developed [68].

MiRNAs have been linked to insulin resistance, reduced beta cell function and type 2 diabetes [69, 70]. In the Bruneck study (South Tyrol, Italy), five miRNAs extracted from plasma were found to be associated with incident type 2 diabetes, but their performance in combination with established risk scores was not reported [71].

Gene expression is regulated at several levels, including epigenetic changes of the genome such as DNA methylation and histone modification. Commercially available bead array-based platforms can analyse DNA methylation intensities at almost 500,000 sites throughout the whole genome. The first results from studies linking epigenetic changes to glycaemic traits and type 2 diabetes risk will be available over the coming years [72].

Peptides and proteins

The complexity of the human serum or plasma proteome consisting of approximately 10⁶ different protein species means, on the one hand, that blood is a rich source of potential biomarkers of diabetes risk but, on the other, that the comprehensive quantification of even a substantial fraction of these peptides and proteins is extremely challenging from a technological perspective [73, 74].

A range of hypothesis-driven studies investigated the contribution of multiple protein biomarkers such as liver enzymes, lipoproteins, insulin or markers of subclinical inflammation, iron metabolism and endothelial dysfunction to established risk scores of type 2 diabetes. A substantial increment of c statistics is possible if these prediction models do not contain a measure of glycaemia [75]. However, protein-based biomarkers that not only lead to statistically significant, but also to clinically relevant improvements of model accuracy remain to be identified for models that already consider glucose or HbA1c [18, 35, 76, 77, 78, 79, 80], as summarised in a recent review [39].

One hypothesis-free prospective study used linear matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry to characterise protein profiles in serum samples from 85 cases with incident type 2 diabetes and 195 normoglycaemic controls within the Whitehall II cohort. Six protein peaks were significantly associated with incident type 2 diabetes after adjustment for age, sex, obesity, lipids, C-reactive protein, fasting glucose and 2 h glucose, but no data on the potential improvement of prediction models by these proteins were provided [81]. However, this work can be seen as a proof-of-concept study suggesting that proteomic methods may be useful for the detection of blood proteins that play a role early in the development of type 2 diabetes.

Lipids and small metabolites

While triacylglycerols and cholesterol have been used in various risk scores resulting in only modest improvements of model accuracy [1], their subfractions and smaller lipids, as well as sugars, amino acids, organic acids, nucleotides and other small-molecule metabolites from serum or plasma samples, are less well investigated but have moved into the focus of metabolomics studies [82]. Cross-sectional approaches have identified ‘metabolic signatures’ associated with insulin resistance and type 2 diabetes [82, 83], and thus indicated their potential as prognostic biomarkers for type 2 diabetes risk.

Very recently, data on lipids and small metabolites have become available from prospective studies, and these are summarised in Table 2 [84, 85, 86, 87, 88, 89, 90, 91]. These studies showed that elevated levels of branched-chain and aromatic amino acids and lower levels of glycine are associated with incident type 2 diabetes or deteriorating glucose homeostasis [84, 86, 87, 88, 89, 90]. In addition, various lipid species and lipid fractions [85, 87, 88, 89, 90], as well as other small metabolites [87, 89, 91], showed significant associations with the risk of type 2 diabetes or incident impaired glucose metabolism after adjustment for multiple confounders. Some of the aforementioned studies compared the accuracy of prediction models with and without metabolites (Table 2) and found fairly modest improvements in AROCs for models that included metabolites in addition to established risk factors for type 2 diabetes [84, 88, 89, 91].

Table 2 Prospective metabolomics studies in the field of type 2 diabetes

Wang et al, 2011 [84]
Study population: Framingham Offspring Study; 189 incident T2D cases; 189 matched controls; 400 random controls; follow-up 12 years
Analytical method; type of blood sample: plasma (fasting)
Main findings: Isoleucine, leucine, valine, tyrosine and phenylalanine associated with incident T2D (adjusted for age, sex, BMI, fasting glucose)
Prediction models: AROC (age, BMI, glucose) = 0.52, AROC (age, BMI, glucose + 5 amino acids) = 0.65 for comparison between cases and matched controls; AROC (age, BMI, glucose) = 0.801, AROC (age, BMI, glucose + 5 amino acids) = 0.805 for comparison of cases and random controls

Rhee et al, 2011 [85]
Study population: Framingham Heart Study; 189 incident T2D cases; 189 matched controls; follow-up 12 years
Analytical method; type of blood sample: plasma (fasting)
Main findings: Multiple lipid species of lower carbon number and double-bond content directly associated with T2D, multiple lipid species of higher carbon number and double-bond content inversely associated with T2D (adjusted for age, sex, BMI, fasting glucose, fasting insulin, TG, HDL-C)

Stančáková et al, 2012 [86]
Study population: METSIM study (men only); 151 incident T2D cases; 375 controls; follow-up 4.7 years
Analytical method; type of blood sample: serum (fasting)
Main findings: Alanine, glutamine, isoleucine, leucine, phenylalanine and tyrosine associated with incident T2D (adjusted for age and BMI)

Würtz et al, 2012 [87]
Study population: Pieksämäki study; 618 individuals not treated for diabetes; follow-up 6.5 years
Analytical method; type of blood sample: serum (fasting)
Main findings: Alanine, glycine, lactate, pyruvate, tyrosine, α1-acid glycoprotein and various fatty acid groups associated with change in fasting glucose; citrate, α1-acid glycoprotein and various fatty acid groups associated with change in 2 h glucose (all adjusted for age, sex, BMI, systolic BP, glucose, insulin, HDL-C, TG [fatty acids: no adjustment for HDL-C, TG])

Wang-Sattler et al, 2012 [88]a
Study population: KORA S4/F4 study; 91 incident T2D cases; 785 controls; follow-up 7 years
Analytical method; type of blood sample: serum (fasting)
Main findings: Glycine and LysoPC C18:2 inversely associated with incident T2D (adjusted for age, sex, BMI, physical activity, alcohol intake, smoking, systolic BP, HDL-C)
Prediction models: AROC (model 1) = 0.742, AROC (model 1 + glycine, LysoPC 18:2, C2) = 0.754 (p = 0.01); AROC (model 2) = 0.818, AROC (model 2 + glycine, LysoPC 18:2, C2) = 0.828 (p = 0.06)

Wang-Sattler et al, 2012 [88]a
Study population: KORA S4/F4 study; 118 incident IGT cases; 471 controls; follow-up 7 years
Analytical method; type of blood sample: serum (fasting)
Main findings: Glycine and LysoPC C18:2 inversely associated with incident IGT (adjusted for age, sex, BMI, physical activity, alcohol intake, smoking, systolic BP, HDL-C)
Prediction models: AROC (model 1) = 0.638, AROC (model 1 + glycine, LysoPC 18:2, C2) = 0.671 (p = 0.012); AROC (model 2) = 0.656, AROC (model 2 + glycine, LysoPC 18:2, C2) = 0.683 (p = 0.015)

Floegel et al, 2013 [89]b
Study population: EPIC-Potsdam study; 800 incident T2D cases; 2,282 controls; follow-up 7 years
Analytical method; type of blood sample: serum (fasting or non-fasting)
Main findings: Hexose, phenylalanine, 4 diacyl-PCs and (inversely) LysoPC C18:2 associated with incident T2D (adjusted for age, sex, BMI, WC, hypertension, smoking, multiple dietary components, glucose, HbA1c, HDL-C, TG)
Prediction models: AROC (model 1) = 0.847, AROC (model 1 + 14 metabolites) = 0.890 (p < 0.0001); AROC (model 2) = 0.901, AROC (model 2 + 14 metabolites) = 0.912 (p < 0.0001)

Würtz et al, 2013 [90]
Study population: Cardiovascular Risk in Young Finns study; 1,680 individuals; follow-up 6 years
Analytical method; type of blood sample: serum (fasting)
Main findings: Isoleucine, leucine, valine, phenylalanine and tyrosine associated with follow-up HOMA-IR in men; leucine, valine and phenylalanine associated with follow-up HOMA-IR in women (all adjusted for age, BMI, systolic BP, HDL-C, TG, smoking, physical activity, baseline HOMA-IR)

Würtz et al, 2013 [90]
Study population: Cardiovascular Risk in Young Finns study; 1,680 individuals; follow-up 6 years
Analytical method; type of blood sample: serum (fasting)
Main findings: Glutamine inversely associated with follow-up fasting glucose in women (adjusted for age, BMI, systolic BP, HDL-C, TG, smoking, physical activity, baseline fasting glucose)

Würtz et al, 2013 [90]
Study population: Cardiovascular Risk in Young Finns study; 1,680 individuals; follow-up 6 years
Analytical method; type of blood sample: serum (fasting)
Main findings: Isoleucine, leucine, valine, phenylalanine and tyrosine associated with HOMA-IR ≥ 90th percentile at follow-up in men (adjusted for age, BMI, systolic BP, HDL-C, TG, smoking, physical activity, baseline HOMA-IR)
Prediction models: No significant improvements of baseline model (BMI, apolipoprotein B, physical activity index, HOMA-IR) by amino acid score as assessed by AROC, NRI and IDI in either sex

Ferrannini et al, 2013 [91]c
Study population: RISC study; 779 stable NGT; 123 incident dysglycaemia; follow-up 3 years
Analytical method; type of blood sample: plasma (fasting)
Main findings: α-HB directly and linoleoylglycerophosphocholine (L-GPC) inversely associated with incident dysglycaemia (adjusted for age, sex, familial T2D, BMI, fasting glucose, α-HB for L-GPC and vice versa)
Prediction models: AROC (model 1) = 0.762, AROC (model 1 + α-HB, L-GPC) = 0.790; AROC (model 2) = 0.786, AROC (model 2 + α-HB, L-GPC) = 0.804

Ferrannini et al, 2013 [91]c
Study population: Botnia study; 151 incident T2D cases; 2,429 controls; follow-up 9.5 years
Analytical method; type of blood sample: plasma (fasting)
Main findings: α-HB directly and linoleoylglycerophosphocholine (L-GPC) inversely associated with incident T2D (adjusted for age, sex, familial T2D, BMI, fasting glucose, α-HB for L-GPC and vice versa)
Prediction models: AROC (model 1) = 0.766, AROC (model 1 + α-HB, L-GPC) = 0.783; AROC (model 2) = 0.788, AROC (model 2 + α-HB, L-GPC) = 0.796

C2, acetylcarnitine C2; EPIC, European Prospective Investigation into Cancer and Nutrition; FIA, flow injection analysis; α-HB, α-hydroxybutyrate; HDL-C, HDL-cholesterol; HOMA-IR, HOMA of insulin resistance; LC, liquid chromatography; L-GPC, linoleoylglycerophosphocholine; LysoPC, lysophosphatidylcholine; METSIM, Metabolic Syndrome in Men; MS, mass spectrometry; N/A, not applicable (no data reported); NGT, normal glucose tolerance; NMR, nuclear magnetic resonance; RISC, Relationship between Insulin Sensitivity and Cardiovascular Disease; PC, phosphatidylcholine; T2D, type 2 diabetes; TG, triacylglycerols; UHPLC, ultra-high performance liquid chromatography; WC, waist circumference

a Model 1: adjusted for age, sex, BMI, physical activity, alcohol intake, smoking, systolic BP, HDL-C; model 2: model 1 + fasting glucose, HbA1c, fasting insulin

b Model 1: age, WC, hypertension, smoking status, multiple dietary components, physical activity; model 2: model 1 + glucose, HbA1c

c Model 1: age, sex, BMI, family history of diabetes, fasting glucose; model 2: model 1 + 2 h glucose
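
To make the increments reported in Table 2 easier to compare, the following minimal Python sketch recomputes the absolute AROC gain for each pair of models. The numbers are copied from Table 2; the dictionary layout, the labels and the function name `aroc_gain` are illustrative choices and do not come from the original studies.

```python
# AROC values (baseline model, baseline model + metabolites) copied from Table 2.
# Labels and data layout are illustrative; only the numbers come from the table.
AROC_PAIRS = {
    "Wang 2011 [84], matched controls": (0.52, 0.65),
    "Wang 2011 [84], random controls": (0.801, 0.805),
    "Wang-Sattler 2012 [88], T2D, model 1": (0.742, 0.754),
    "Wang-Sattler 2012 [88], T2D, model 2": (0.818, 0.828),
    "Wang-Sattler 2012 [88], IGT, model 1": (0.638, 0.671),
    "Wang-Sattler 2012 [88], IGT, model 2": (0.656, 0.683),
    "Floegel 2013 [89], model 1": (0.847, 0.890),
    "Floegel 2013 [89], model 2": (0.901, 0.912),
    "Ferrannini 2013 [91], RISC, model 1": (0.762, 0.790),
    "Ferrannini 2013 [91], RISC, model 2": (0.786, 0.804),
    "Ferrannini 2013 [91], Botnia, model 1": (0.766, 0.783),
    "Ferrannini 2013 [91], Botnia, model 2": (0.788, 0.796),
}


def aroc_gain(baseline, extended):
    """Absolute AROC increment, rounded to the precision reported in the table."""
    return round(extended - baseline, 3)


gains = {label: aroc_gain(*pair) for label, pair in AROC_PAIRS.items()}

for label, gain in gains.items():
    print(f"{label}: +{gain:.3f}")
```

Apart from the comparison with matched controls in [84], where the baseline AROC is close to 0.5, every gain in this list is below 0.05, which is in line with the fairly modest improvements described in the text.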

Opportunities for and limitations to the use of biomarkers for the prediction of type 2 diabetes

Opportunities: repeated measurements of biomarkers

In the aforementioned studies, associations and risk score performances were mainly based on single biomarker measurements, which are all characterised by normal intraindividual variation over time (with the exception of genetic markers). Repeated measurements of biomarkers within days or weeks could be useful to improve measurement precision, but may be inconvenient for the patient.

There is growing evidence that the biomarker trajectories of cases and non-cases diverge over time before the onset of diabetes [92, 93, 94, 95, 96]. Such trajectories require blood samples to be taken over a wider timeframe (several years or decades). These curves enable a better understanding of the pathophysiological processes of diabetes development and have been described for fasting and postload glucose, HbA1c, interleukin-1 receptor antagonist, adiponectin, alanine aminotransferase and triacylglycerols [92, 93, 94, 95, 96]. Deeper insight into the development of type 2 diabetes can also be expected from the analysis of established metabolic risk factors such as BMI, waist circumference, other lipids or uric acid, for which multiple measurements in the same patients over time are usually available to the treating general practitioner. In a previous analysis from the Whitehall II study [94], a 0.5 mmol/l difference in fasting glucose 3 years before diabetes diagnosis and a 0.3 mmol/l steeper increase in fasting glucose were observed in later diabetes cases compared with non-cases (Fig. 2), suggesting that fasting glucose values measured 5–10 years apart could provide improved prediction of diabetes over a single glucose measurement.

Fig. 2 Fasting glucose trajectories before diagnosis of diabetes or the end of follow-up in the Whitehall II study. The analysis is based on 505 incident diabetes cases (triangles) and 6,033 individuals who remained diabetes-free (squares). Time 0 is diagnosis for incident diabetes cases or end of follow-up for non-diabetics. Graphs are based on multilevel longitudinal modelling. Modified from [94] with permission from Elsevier

The use of repeated measurements for the prediction of diabetes seems to be a tempting approach because repeated measurements of different diabetes risk factors are collected in general practice. However, only risk factors whose trajectories differ markedly between cases and non-cases are expected to improve the predictive ability of a given risk score [97].

It may be argued that improved prediction based on multiple compared with single measurements of glucose is obvious, but the fact that current prediction scores do not make use of multiple measurements in clinical practice to improve individual risk assessment seems noteworthy. One important concern regarding repeated measurements of risk factors is that this approach might have negative effects on disease prevention as it may delay the initiation of preventive efforts. However, if single measurements are used for risk models where repeated measurements are not available, this would not delay any preventive or therapeutic interventions.
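
How repeated measurements could be turned into predictors can be sketched in a few lines of Python. The example below is only an illustration and is not the multilevel longitudinal model used in [94]: it fits an ordinary least-squares line to the fasting glucose values of one person and returns the slope together with the most recent value, so that both could be offered to a risk model. The function name `trajectory_features`, the measurement times and the glucose values are invented for illustration.

```python
def trajectory_features(years, values):
    """Return (slope per year, most recent value) for one person's repeated measurements."""
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(years)
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    # Ordinary least-squares slope: covariance(time, value) / variance(time)
    sxx = sum((t - mean_t) ** 2 for t in years)
    if sxx == 0:
        raise ValueError("measurement times must differ")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    slope = sxy / sxx
    last_value = values[years.index(max(years))]
    return slope, last_value


# Hypothetical fasting glucose values (mmol/l); times are years relative to the latest visit.
stable = trajectory_features([-10, -5, 0], [5.5, 5.6, 5.6])
rising = trajectory_features([-10, -5, 0], [5.0, 5.2, 5.6])

print(f"stable: slope {stable[0]:.2f} mmol/l per year, last value {stable[1]}")
print(f"rising: slope {rising[0]:.2f} mmol/l per year, last value {rising[1]}")
```

In this invented example both individuals have the same most recent value (5.6 mmol/l), but the slopes differ (0.01 versus 0.06 mmol/l per year), which is the kind of information a single measurement cannot provide.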

Limitations in disease prediction

One might ask to what extent AROCs can be improved by the addition of novel biomarkers. Perfect prediction of diabetes might not be possible for at least five reasons. First, the diagnosis of diabetes is not as clear as the diagnosis of other chronic diseases (e.g. cancer). As an example, diagnosing a person with diabetes when the 2 h glucose level is 201 mg/dl (11.17 mmol/l), but not when the level is 199 mg/dl (11.06 mmol/l) is, to some extent, dependent on chance and measurement imprecision. Second, measurement imprecision also applies, to a lesser or greater degree, to all predictors used in risk scores. Third, risk scores cannot capture changes of lifestyle or medication following the assessment of individual risk. Fourth, incident cases of diabetes in the cohort study used to develop a prediction model might have been missed because they occurred after the end of the follow-up period, which contributes to measurement error in the outcome of type 2 diabetes. Fifth, many novel biomarkers described as independent risk factors for type 2 diabetes are correlated with traditional risk factors or other biomarkers [98]. They therefore provide only limited incremental information and do not contribute to better discrimination. A further limitation of risk models for type 2 diabetes is that they can predict the onset of the disease, but cannot predict the onset of micro- and macrovascular complications, the major determinants of quality of life, morbidity, mortality and diabetes-related costs. Recently, several different diabetes risk scores were applied to an external set of prospective data of older individuals, and the scores did not prove to be useful in the prediction of cardiovascular diseases [99].
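
The first of these reasons can be illustrated numerically. The following Python sketch assumes that a single 2 h glucose measurement scatters around the person's true value with a normally distributed error and computes how often the measurement would fall at or above the diagnostic cut-off. The cut-off of 200 mg/dl is the one implied by the 199 versus 201 mg/dl example; the standard deviation of 10 mg/dl is a purely hypothetical value chosen for illustration, as are the function names.

```python
import math

THRESHOLD_MG_DL = 200.0    # 2 h glucose cut-off implied by the 199 vs 201 mg/dl example
MG_DL_PER_MMOL_L = 18.0    # conversion factor consistent with the values quoted in the text
ASSUMED_SD_MG_DL = 10.0    # hypothetical measurement imprecision, not taken from the article


def mg_dl_to_mmol_l(mg_dl):
    """Convert a glucose concentration from mg/dl to mmol/l."""
    return mg_dl / MG_DL_PER_MMOL_L


def prob_at_or_above_threshold(true_value, sd, threshold=THRESHOLD_MG_DL):
    """P(measured >= threshold) if the measurement is Normal(true_value, sd)."""
    z = (threshold - true_value) / sd
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


p_199 = prob_at_or_above_threshold(199.0, ASSUMED_SD_MG_DL)
p_201 = prob_at_or_above_threshold(201.0, ASSUMED_SD_MG_DL)

print(f"199 mg/dl = {mg_dl_to_mmol_l(199.0):.2f} mmol/l, P(diagnosed) = {p_199:.2f}")
print(f"201 mg/dl = {mg_dl_to_mmol_l(201.0):.2f} mmol/l, P(diagnosed) = {p_201:.2f}")
```

Under this assumed imprecision, a person with a true value of 199 mg/dl would be classified as having diabetes in about 46% of single measurements and a person with a true value of 201 mg/dl in about 54%, so the two are almost indistinguishable in practice.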

Summary and outlook

Considerable work has been done to assess the incremental value of novel markers, beyond established risk factors, for the prediction of diabetes. Nevertheless, several questions remain to be answered.

First, the addition of biomarkers to conventional diabetes risk scores has so far not or, at best, only slightly improved the predictive ability of the models. This raises the question of the conditions under which novel markers may have a larger incremental value. Biomarkers are often strongly correlated with conventional risk factors so that they do not provide additional predictive information [98, 100]. While many novel biomarkers are expected to be described in the near future as a result of technological progress, these will only improve diabetes prediction if they are at best weakly correlated with established risk factors. Moreover, it is conceivable that the slope of a biomarker trajectory (the change of the biomarker over time) captures incremental predictive information above the last measurement of the marker alone. However, the potential of trajectories has not yet been assessed for diabetes prediction.

Second, one might ask how good is good enough in diabetes prediction, and which criteria might be used to assess an individual’s diabetes risk with a sufficient level of precision. The question of sufficient precision can only be answered with regard to the purpose of the score. For a paper-and-pencil score used as the first step of a population-wide screening, the sufficient level of precision could be lower than that for a score which is used to guide lifestyle recommendations and treatment for individuals in clinical practice. Furthermore, the ultimate performance measure of a novel marker will be the improvement in health outcomes through therapeutic changes and its cost-effectiveness [100]. However, critical risk values have not yet been defined for type 2 diabetes, and, thus, the question of when risk models are good enough cannot currently be answered. As already stated by Hlatky et al [100], there is no single metric which assesses all the characteristics of a novel marker. For example, AROCs include only rank information and do not indicate how accurate predictions are. Therefore, other criteria like the IDI, a goodness-of-fit test, and positive and negative predictive values should be added.
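
The statement that AROCs include only rank information can be checked with a small Python sketch. It computes the AROC as the proportion of case/non-case pairs in which the case has the higher predicted risk, and compares a set of predicted risks with a version in which every risk is halved. The data are invented, and the comparison of mean predicted risk with the observed event rate is used here only as a simple stand-in for a calibration check; it is not one of the specific tests named in the text.

```python
def aroc(risks, outcomes):
    """Share of case/non-case pairs in which the case has the higher risk (ties count 0.5)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


def mean_risk(risks):
    """Average predicted risk, to be compared with the observed event rate."""
    return sum(risks) / len(risks)


# Invented example: 10 individuals, 3 of whom develop diabetes (outcome 1).
outcomes = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
risks = [0.05, 0.10, 0.10, 0.20, 0.20, 0.30, 0.30, 0.40, 0.60, 0.75]
halved = [r / 2 for r in risks]  # same ranking, systematically lower predicted risks

observed_rate = sum(outcomes) / len(outcomes)

print(f"AROC original: {aroc(risks, outcomes):.3f}, halved: {aroc(halved, outcomes):.3f}")
print(f"mean risk original: {mean_risk(risks):.2f}, halved: {mean_risk(halved):.2f}, "
      f"observed rate: {observed_rate:.2f}")
```

Both sets of predictions have the same AROC (about 0.93) because the ranking of individuals is unchanged, but only the original risks agree on average with the observed event rate of 0.30; the halved risks average 0.15 and would systematically underestimate risk.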

Third, beyond optimising the predictive ability of diabetes risk scores, there is a wide range of issues which have not been considered in this review. From a public health perspective, it has to be asked whether diabetes risk scores are accepted by physicians, and which barriers might prevent physicians from using them; how scores are best implemented in clinical practice; to what extent intuitive risk assessments made by physicians are concordant with score-based assessments; and how effective and efficient diabetes prediction models are. These questions have hardly been addressed so far. Another issue to consider regarding non-economic costs relates to false positive test results (which could increase anxiety) and false negative risk estimates (which could lead to false reassurance). Finally, the successful implementation of any prognostic diabetes model will depend on a cost-effective intervention strategy for those persons who are found to be at high risk of developing type 2 diabetes. This list demonstrates that the performance of novel biomarkers in risk models needs to be investigated in a substantially larger context than it currently is before recommendations for their widespread use can be given with certainty.



Acknowledgements

The authors would like to thank Kirti Kaul (Institute for Clinical Diabetology, German Diabetes Center, Düsseldorf, Germany) for critically reading the manuscript and for helpful discussions.


Funding

The German Diabetes Center is funded by the German Federal Ministry of Health (Berlin, Germany) and the Ministry of Innovation, Science and Research of the State of North Rhine-Westphalia (Düsseldorf, Germany). This study was supported in part by a grant from the German Federal Ministry of Education and Research (BMBF) to the German Center for Diabetes Research (DZD e.V.). This work was also supported by the research project Greifswald Approach to Individualized Medicine (GANI_MED). The GANI_MED consortium is funded by the Federal Ministry of Education and Research and the Ministry of Cultural Affairs of the Federal State of Mecklenburg-West Pomerania (support code: 03IS2061D). AGT is supported by TÁMOP 4.2.4.A/1-11-1-2012-0001 National Excellence Program–research fellowship co-financed by the European Union and the European Social Fund.

Duality of interest

The authors declare that there is no duality of interest associated with this manuscript.

Contribution statement

CH and WR were responsible for the conception of this review. CH, BK, AGT and WR were responsible for the design and drafting of the manuscript and approved the final version of the manuscript.

Supplementary material

ESM Table 1 (PDF 43 kb)
ESM Table 2 (PDF 66.3 kb)


References

1. Buijsse B, Simmons RK, Griffin SJ, Schulze MB (2011) Risk assessment tools for identifying individuals at risk of developing type 2 diabetes. Epidemiol Rev 33:46–62
2. Noble D, Mathur R, Dent T, Meads C, Greenhalgh T (2011) Risk models and scores for type 2 diabetes: systematic review. BMJ 343:d7163
3. Abbasi A, Peelen LM, Corpeleijn E et al (2012) Prediction models for risk of developing type 2 diabetes: systematic literature search and independent external validation study. BMJ 345:e5900
4. Moons KGM, Kengne AP, Woodward M et al (2012) Risk prediction models: I. Development, internal validation, and assessing the incremental value of a new (bio)marker. Heart 98:683–690
5. Moons KGM, Kengne AP, Grobbee DE et al (2012) Risk prediction models: II. External validation, model updating, and impact assessment. Heart 98:691–698
6. Collins GS, Mallett S, Omar O, Yu LM (2011) Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting. BMC Med 9:103
7. Cook NR (2007) Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation 115:928–935
8. Kowall B, Rathmann W, Strassburger K (2013) Use of areas under the receiver operating curve (AROCs) and some caveats. Int J Public Health 58:485–488
9. DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver-operating characteristic curves: a nonparametric approach. Biometrics 44:837–845
10. Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS (2008) Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 27:157–172
11. Greenland S (2008) The need for reorientation toward cost-effective prediction: comments on ‘Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond’ by M. J. Pencina et al, Statistics in Medicine (DOI: 10.1002/sim.2929). Stat Med 27:199–206
12. Mühlenbruch K, Heraclides A, Steyerberg EW, Joost HG, Boeing H, Schulze MB (2013) Assessing improvement in disease prediction using net reclassification improvement: impact of risk cut-offs and number of risk categories. Eur J Epidemiol 28:25–33
13. Leening MJG, Cook NR (2013) Net reclassification improvement: a link between statistics and clinical practice. Eur J Epidemiol 28:21–23
14. Pencina MJ, D’Agostino RB, Pencina KM, Janssens AC, Greenland P (2012) Interpreting incremental value of markers added to risk prediction models. Am J Epidemiol 176:473–481
15. Pencina MJ, D’Agostino RB, Steyerberg EW (2011) Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat Med 30:11–21
16. Janssen KJM, Moons KGM, Kalkman CJ, Grobbee DE, Vergouwe Y (2008) Updating methods improved the performance of a clinical prediction model in new patients. J Clin Epidemiol 61:76–86
17. Rahman M, Simmons RK, Harding AH, Wareham NJ, Griffin SJ (2008) A simple risk score identifies individuals at high risk of developing type 2 diabetes: a prospective cohort study. Fam Pract 25:191–196
18. Rathmann W, Kowall B, Heier M et al (2010) Prediction models for incident type 2 diabetes mellitus in the older population: KORA S4/F4 cohort study. Diabet Med 27:1116–1123
19. Lindström J, Tuomilehto J (2003) The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care 26:725–731
20. Hippisley-Cox J, Coupland C, Robson J, Sheikh A, Brindle P (2009) Predicting risk of type 2 diabetes in England and Wales: prospective derivation and validation of QDScore. BMJ 338:b880
21. Chen L, Magliano DJ, Balkau B et al (2010) AUSDRISK: an Australian Type 2 Diabetes Risk Assessment Tool based on demographic, lifestyle and simple anthropometric measures. Med J Aust 192:197–202
22. Kahn HS, Cheng YJ, Thompson TJ, Imperatore G, Gregg EW (2009) Two risk-scoring systems for predicting incident diabetes mellitus in U.S. adults age 45 to 64 years. Ann Intern Med 150:741–751
23. Stern MP, Williams K, Haffner SM (2002) Identification of persons at high risk for type 2 diabetes mellitus: do we need the oral glucose tolerance test? Ann Intern Med 136:575–581
24. Schmidt MI, Duncan BB, Bang H et al (2005) Identifying individuals at high risk for diabetes: the Atherosclerosis Risk in Communities study. Diabetes Care 28:2013–2018
25. Wilson PW, Meigs JB, Sullivan L, Fox CS, Nathan DM, D’Agostino RB Sr (2007) Prediction of incident diabetes mellitus in middle-aged adults: the Framingham Offspring Study. Arch Intern Med 167:1068–1074
26. Talmud PJ, Hingorani AD, Cooper JA et al (2010) Utility of genetic and non-genetic risk factors in prediction of type 2 diabetes: Whitehall II prospective cohort study. BMJ 340:b4838
27. Abbasi A, Corpeleijn E, Peelen LM et al (2012) External validation of the KORA S4/F4 prediction models for the risk of developing type 2 diabetes in older adults: the PREVEND Study. Eur J Epidemiol 27:47–52
28. Alssema M, Vistisen D, Heymans MW et al (2011) The Evaluation of Screening and Early Detection Strategies for Type 2 Diabetes and Impaired Glucose Tolerance (DETECT-2) update of the Finnish diabetes risk score for prediction of incident type 2 diabetes. Diabetologia 54:1004–1012
29. Collins GS, Altman DG (2011) External validation of QDSCORE® for predicting the 10-year risk of developing type 2 diabetes. Diabet Med 28:599–607
30. Mann DM, Bertoni AG, Shimbo D et al (2010) Comparative validity of 3 diabetes mellitus risk prediction scoring models in a multiethnic US cohort: the Multi-Ethnic Study of Atherosclerosis. Am J Epidemiol 171:980–988
31. Morris DH, Khunti K, Achana F et al (2013) Progression rates from HbA1c 6.0–6.4% and other prediabetes definitions to type 2 diabetes: a meta-analysis. Diabetologia 56:1489–1493
32. Rathmann W, Strassburger K, Heier M et al (2009) Incidence of type 2 diabetes in the elderly German population and the effect of clinical and lifestyle risk factors: KORA S4/F4 cohort study. Diabet Med 26:1212–1219
33. Faerch K, Borch-Johnsen K, Holst JJ, Vaag A (2009) Pathophysiology and aetiology of impaired fasting glycaemia and impaired glucose tolerance: does it matter for prevention and treatment of type 2 diabetes? Diabetologia 52:1714–1723
34. Tirosh A, Shai I, Tekes-Manova D et al (2005) Normal fasting plasma glucose levels and type 2 diabetes in young men. N Engl J Med 353:1454–1462
35. Kolberg JA, Jorgensen T, Gerwien RW et al (2009) Development of a type 2 diabetes risk model from a panel of serum biomarkers from the Inter99 cohort. Diabetes Care 32:1207–1212
36. Kowall B, Rathmann W, Giani G et al (2013) Random glucose is useful for individual prediction of type 2 diabetes: results of the Study of Health in Pomerania (SHIP). Prim Care Diabetes 7:25–31
37. Schöttker B, Raum E, Rothenbacher D, Müller H, Brenner H (2011) Prognostic value of haemoglobin A1c and fasting plasma glucose for incident diabetes and implications for screening. Eur J Epidemiol 26:779–787
38. Heianza Y, Arase Y, Hsieh SD et al (2012) Development of a new scoring system for predicting the 5 year incidence of type 2 diabetes in Japan: the Toranomon Hospital Health Management Center Study 6 (TOPICS 6). Diabetologia 55:3213–3223
39. Herder C, Karakas M, Koenig W (2011) Biomarkers for the prediction of type 2 diabetes and cardiovascular disease. Clin Pharmacol Ther 90:52–66
40. Nolan CJ, Damm P, Prentki M (2011) Type 2 diabetes across generations: from pathophysiology to prevention and management. Lancet 378:169–181
41. Kooner JS, Saleheen D, Sim X et al (2011) Genome-wide association study in individuals of South Asian ancestry identifies six new type 2 diabetes susceptibility loci. Nat Genet 43:984–989
42. Cho YS, Chen CH, Hu C et al (2011) Meta-analysis of genome-wide association studies identifies eight new loci for type 2 diabetes in east Asians. Nat Genet 44:67–72
43. Morris AP, Voight BF, Teslovich TM et al (2012) Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet 44:981–990
44. Pal A, McCarthy MI (2013) The genetics of type 2 diabetes and its clinical relevance. Clin Genet 83:297–306
45. Dupuis J, Langenberg C, Prokopenko I et al (2010) New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat Genet 42:105–116
  46. 46.
    Scott RA, Lagou V, Welch RP et al (2012) Large-scale association analyses identify new loci influencing glycemic traits and provide insight into the underlying biological pathways. Nat Genet 44:991–1005PubMedCrossRefGoogle Scholar
  47. 47.
    Voight BF, Scott LJ, Steinthorsdottir V et al (2010) Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat Genet 42:579–589PubMedCrossRefGoogle Scholar
  48. 48.
    Manning AK, Hivert MF, Scott RA et al (2012) A genome-wide approach accounting for body mass index identifies genetic variants influencing fasting glycemic traits and insulin resistance. Nat Genet 44:659–669PubMedCrossRefGoogle Scholar
  49. 49.
    McCarthy MI (2010) Genomics, type 2 diabetes, and obesity. N Engl J Med 363:2339–2350PubMedCrossRefGoogle Scholar
  50. 50.
    InterAct Consortium (2013) The link between family history and risk of type 2 diabetes is not explained by anthropometric, lifestyle or genetic risk factors: the EPIC-InterAct study. Diabetologia 56:60–69CrossRefGoogle Scholar
  51. 51.
    Herder C, Roden M (2011) Genetics of type 2 diabetes. Pathophysiologic and clinical relevance. Eur J Clin Invest 41:679–692PubMedCrossRefGoogle Scholar
  52. 52.
    Willems SM, Mihaescu R, Sijbrands EJG, van Duijn CM, Janssens AC (2011) A methodological perspective on genetic risk prediction studies in type 2 diabetes: recommendations for future research. Curr Diab Rep 11:511–518PubMedCrossRefGoogle Scholar
  53. 53.
    de Miguel-Yanes JM, Shrader P, Pencina MJ et al (2011) Genetic risk reclassification for type 2 diabetes by age below or above 50 years using 40 type 2 diabetes risk single nucleotide polymorphisms. Diabetes 34:121–125Google Scholar
  54. 54.
    Lyssenko V, Jonsson A, Almgren P et al (2008) Clinical risk factors, DNA variants, and the development of type 2 diabetes. N Engl J Med 359:2220–2232PubMedCrossRefGoogle Scholar
  55. 55.
    Vassy JL, DasMahapatra P, Meigs JB et al (2012) Genotype prediction of adult type 2 diabetes from adolescence in a multiracial population. Pediatrics 130:e1235–e1242PubMedCrossRefGoogle Scholar
  56. 56.
    Vassy JL, Meigs JB (2012) Is genetic testing useful to predict type 2 diabetes? Best Pract Res Clin Endocrinol Metab 26:189–201PubMedCrossRefGoogle Scholar
  57. 57.
    Cooke JN, Ng MCY, Palmer ND et al (2012) Genetic risk assessment of type 2 diabetes-associated polymorphisms in African Americans. Diabetes Care 35:287–292PubMedCrossRefGoogle Scholar
  58. 58.
    Vassy JL, Durant NH, Kabagambe EK et al (2012) A genotype risk score predicts type 2 diabetes from young adulthood: the CARDIA study. Diabetologia 55:2604–2612PubMedCrossRefGoogle Scholar
  59. 59.
    Hu C, Zhang R, Wang C et al (2009) PPARG, KCNJ11, CDKN2A-CDKN2B, IDE-KIF11-HHEX, IGFBP2 and SLC30A8 are associated with type 2 diabetes in a Chinese population. PLoS One 4:e7643PubMedCrossRefGoogle Scholar
  60.
    Miyake K, Yang W, Hara K et al (2009) Construction of a prediction model for type 2 diabetes mellitus in the Japanese population based on 11 genes with strong evidence of association. J Hum Genet 54:236–241
  61.
    Qi Q, Li H, Wu Y et al (2010) Combined effects of 17 common genetic variants on type 2 diabetes risk in a Han Chinese population. Diabetologia 53:2163–2166
  62.
    Chen R, Corona E, Sikora M et al (2012) Type 2 diabetes risk alleles demonstrate extreme directional differentiation among human populations, compared to other diseases. PLoS Genet 8:e1002621
  63.
    Bonnefond A, Clément N, Fawcett K et al (2012) Rare MTNR1B variants impairing melatonin receptor 1B function contribute to type 2 diabetes. Nat Genet 44:297–301
  64.
    Albrechtsen A, Grarup N, Li Y et al (2013) Exome sequencing-driven discovery of coding polymorphisms associated with common metabolic phenotypes. Diabetologia 56:298–310
  65.
    The 1000 Genomes Project Consortium (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65
  66.
    Day-Williams AG, Zeggini E (2011) The effect of next-generation sequencing technology on complex trait research. Eur J Clin Invest 41:561–567
  67.
    Herder C, Roden M, Carstensen M, Illig T (2012) Transcriptomics und typ-2-diabetes. Diabetologe 8:35–41 [article in German]
  68.
    Schurmann C, Heim K, Schillert A et al (2012) Analyzing Illumina gene expression microarray data from different tissues: methodological aspects of data analysis in the MetaXpress Consortium. PLoS One 7:e50938
  69.
    Fernandez-Valverde SL, Taft RJ, Mattick JS (2011) MicroRNAs in beta-cell biology, insulin resistance, diabetes and its complications. Diabetes 60:1825–1831
  70.
    Williams MD, Mitchell GM (2012) MicroRNAs in insulin resistance and obesity. Exp Diab Res 2012:484696
  71.
    Zampetaki A, Kiechl S, Drozdov I et al (2010) Plasma microRNA profiling reveals loss of endothelial MiR-126 and other microRNAs in type 2 diabetes. Circ Res 107:810–817
  72.
    Drong AW, Lindgren CM, McCarthy MI (2012) The genetic and epigenetic basis of type 2 diabetes and obesity. Clin Pharmacol Ther 92:707–715
  73.
    Anderson NL, Anderson NG (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics 1:845–867
  74.
    Sundsten T, Ortsäter H (2009) Proteomics in diabetes research. Mol Cell Endocrinol 297:93–103
  75.
    Herder C, Baumert J, Zierer A et al (2011) Immunological and cardiometabolic risk factors in the prediction of type 2 diabetes and coronary events: MONICA/KORA Augsburg case-cohort study. PLoS One 6:e19852
  76.
    Ley SH, Harris SB, Connelly PW et al (2008) Adipokines and incident type 2 diabetes in an Aboriginal Canadian Population. The Sandy Lake Health and Diabetes Project. Diabetes Care 31:1410–1415
  77.
    Schulze MB, Weikert C, Pischon T et al (2009) Use of multiple metabolic and genetic markers to improve the prediction of type 2 diabetes: the EPIC-Potsdam Study. Diabetes Care 32:2116–2119
  78.
    Salomaa V, Havulinna A, Saarela O et al (2010) Thirty-one novel biomarkers as predictors for clinically incident diabetes. PLoS One 5:e10100
  79.
    Chao C, Song Y, Cook N et al (2010) The lack of utility of circulating biomarkers of inflammation and endothelial dysfunction for type 2 diabetes risk prediction among postmenopausal women. The Women’s Health Initiative Observational Study. Arch Intern Med 170:1557–1565
  80.
    Lyssenko V, Jorgensen T, Gerwien RW et al (2012) Validation of a multi-marker model for the prediction of incident type 2 diabetes mellitus: combined results of the Inter99 and Botnia studies. Diab Vasc Dis Res 9:59–67
  81.
    Jensen TM, Witte DR, Pieragostino D et al (2013) Association between protein signals and type 2 diabetes incidence. Acta Diabetol. doi: 10.1007/s00592-012-0376-3
  82.
    Bain JR, Stevens RD, Wenner BR, Ilkayeva O, Muoio DM, Newgard CB (2009) Metabolomics applied to diabetes research: moving from information to knowledge. Diabetes 58:2429–2443
  83.
    Suhre K, Meisinger C, Döring A et al (2011) Metabolic footprint of diabetes: a multiplatform metabolomics study in an epidemiological setting. PLoS One 5:e13953
  84.
    Wang TJ, Larson MG, Vasan RS et al (2011) Metabolite profiles and the risk of developing diabetes. Nat Med 17:448–453
  85.
    Rhee EP, Chang S, Larson MG et al (2011) Lipid profiling identifies a triacylglycerol signature of insulin resistance and improves diabetes prediction in humans. J Clin Invest 121:1402–1411
  86.
    Stančáková A, Civelek M, Saleem NK et al (2012) Hyperglycemia and a common variant of GCKR are associated with the levels of eight amino acids in 9,369 Finnish men. Diabetes 61:1895–1902
  87.
    Würtz P, Tiainen M, Mäkinen VP et al (2012) Circulating metabolite predictors of glycemia in middle-aged men and women. Diabetes Care 35:1749–1756
  88.
    Wang-Sattler R, Yu Z, Herder C et al (2012) Novel biomarkers for pre-diabetes identified by metabolomics. Mol Syst Biol 8:615
  89.
    Floegel A, Stefan N, Yu Z et al (2013) Identification of serum metabolites associated with risk of type 2 diabetes using a targeted metabolomic approach. Diabetes 62:639–648
  90.
    Würtz P, Soininen P, Kangas AJ et al (2013) Branched-chain and aromatic amino acids are predictors of insulin resistance in young adults. Diabetes Care 36:648–655
  91.
    Ferrannini E, Natali A, Camastra S et al (2013) Early metabolic markers of the development of dysglycemia and type 2 diabetes and their physiological significance. Diabetes 62:1730–1737
  92.
    Carstensen M, Herder C, Kivimäki M et al (2010) Accelerated increase in serum interleukin-1 receptor antagonist starts 6 years before diagnosis of type 2 diabetes: Whitehall II prospective cohort study. Diabetes 59:1222–1227
  93.
    Heianza Y, Arase Y, Fujihara K et al (2012) Longitudinal trajectories of HbA1c and fasting plasma glucose levels during the development of type 2 diabetes: the Toranomon Hospital Health Management Center Study 7 (TOPICS 7). Diabetes Care 35:1050–1052
  94.
    Tabák AG, Jokela M, Akbaraly TN, Brunner EJ, Kivimäki M, Witte DR (2009) Trajectories of glycaemia, insulin sensitivity, and insulin secretion before diagnosis of type 2 diabetes: an analysis from the Whitehall II study. Lancet 373:2215–2221
  95.
    Tabák AG, Carstensen M, Witte DR et al (2012) Adiponectin trajectories before type 2 diabetes diagnosis: Whitehall II study. Diabetes Care 35:2540–2547
  96.
    Sattar N, McConnachie A, Ford I et al (2007) Serial measurements and conversion to type 2 diabetes in the West of Scotland Coronary Prevention Study: specific elevations in alanine aminotransferase and triglycerides suggest hepatic fat accumulation as a potential contributing factor. Diabetes 56:984–991
  97.
    Wald NJ, Morris JK (2011) Assessing risk factors as potential screening tests: a simple assessment tool. Arch Intern Med 171:286–291
  98.
    Sattar N, Wannamethee SG, Forouhi NG (2008) Novel biochemical risk factors for type 2 diabetes: pathogenic insights or prediction possibilities? Diabetologia 51:926–940
  99.
    Kowall B, Rathmann W, Bongaerts B et al (2013) Are diabetes risk scores useful for the prediction of cardiovascular diseases? Assessment of seven diabetes risk scores in the KORA S4/F4 cohort study. J Diabetes Complicat 27:340–345
  100.
    Hlatky MA, Greenland P, Arnett DK et al (2009) Criteria for evaluation of novel markers of cardiovascular risk. A scientific statement from the American Heart Association. Circulation 119:2408–2416

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Christian Herder (1), Email author
  • Bernd Kowall (2)
  • Adam G. Tabak (3, 4)
  • Wolfgang Rathmann (2)
  1. Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  2. Institute of Biometrics and Epidemiology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  3. Department of Epidemiology and Public Health, University College London, London, UK
  4. 1st Department of Medicine, Semmelweis University Faculty of Medicine, Budapest, Hungary
