The gut incretin hormones glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic peptide (GIP) have a major role in acute food-stimulated secretion of insulin but also affect long-term beta cell mass and functioning [1]. Single nucleotide polymorphism (SNPs) in genes associated with type 2 diabetes in genome-wide association studies (GWAS) have been found to regulate the release and functioning of incretin hormones in small-scale experimental studies [2, 3]. Genetic variants in transcription factor 7-like 2 (TCF7L2) [48] and Wolframin ER transmembrane glycoprotein (WFS1) [9] have been linked to pancreatic GLP-1 sensitivity, gastric inhibitory polypeptide receptor (GIPR) variants have been linked to pancreatic GIP sensitivity [2], and genetic variants in potassium voltage-gated channel subfamily Q member 1 (KCNQ1) have been reported to affect GLP-1 and GIP secretion from the gut [1012].

Despite the direct involvement of food ingestion in the incretin response, the role of diet in the regulation of the incretin system is largely unknown. Some dietary factors have been found to stimulate the release and action of incretin hormones in animal studies and small-scale experimental studies in humans. Whey protein (present in all dairy products except cheese) has been found to increase postprandial GIP and GLP-1 concentrations compared with isoenergetic non-whey containing meals [1315] or dairy casein protein [16, 17]. Furthermore, whey-containing dairy products have been found to have a disproportionally high insulin index (typical blood insulin response to various foods) compared with their corresponding glycaemic index [18]. Similarly, intake of olive oil has been found to increase the production of both GIP and GLP-1 in healthy individuals [19] and GLP-1 in individuals with type 2 diabetes [20] compared with an isoenergetic intake of butter. A postprandial increase in the level of intact GLP-1 has also been found after consumption of purified oleic acid in animal studies [21]. Dietary fibre, especially of cereal origin, has also been found to stimulate GIP and GLP-1 release in healthy individuals [22, 23]. Finally, coffee, via mechanisms involving chlorogenic acid (the major polyphenol in coffee), has been found to increase production of GLP-1 compared with isoenergetic beverages [24].

All of the above dietary factors have also been linked to a reduction in the risk of type 2 diabetes in epidemiological studies [25]. In the European Prospective Investigation into Cancer and Nutrition (EPIC) - InterAct study, an inverse association between dairy products and type 2 diabetes was found [26], while dietary fibre and olive oil did not show an association [27, 28].

We hypothesised that the association between specific foods potentially influencing incretin release (whey-containing dairy, cereal fibre, olive oil and coffee) and type 2 diabetes risk is modified by incretin-related genetic variants. We investigated this hypothesis in the large population-based EPIC-InterAct study conducted in eight European countries.


Study population

The design and methods of the InterAct study, nested within the EPIC cohorts, are described in detail elsewhere [29]. All participants provided written informed consent, and the study was approved by the local ethics committee in the participating countries and the internal review board of the International Agency for Research on Cancer. Briefly, the study population included participants from 26 centres in the eight of the ten countries participating in EPIC who had available blood samples and information on diabetes (France, Italy, Spain, the UK, the Netherlands, Germany, Denmark and Sweden). All ascertained and verified incident type 2 diabetes cases between 1991 and 2007 (3.99 million person-years at risk, n = 12,403) were included in the case group. A centre-stratified, representative subcohort of 16,835 individuals was selected as the control group. Prevalent diabetes cases (n = 548) and individuals with uncertain diabetes status (n = 133) were excluded from the subcohort, leaving 16,154 individuals for analysis. Due to the random selection of the control group, 778 incident type 2 diabetes cases were part of the subcohort, leaving a total study sample of 27,779 individuals.

For the current analysis, we excluded participants with abnormal estimated energy intake (top 1% and bottom 1% of the distribution of the ratio of reported energy intake over estimated energy requirements, assessed by the basal metabolic rate) (n = 619) and those with missing information on dietary intake (n = 117). Participants with no data on any of the SNPs of interest (including all Danish participants) (n = 7617) were also excluded. For our interaction analysis, the models were adjusted for demographic and lifestyle factors, thus participants with no information on these key covariates were also excluded. Therefore, 18,638 individuals were included in the analysis, comprising of 8086 cases and 11,035 subcohort participants (including 483 cases in the subcohort). For some of these participants, information on specific SNPs was missing, thus the sample size for the analysis of each SNP differs slightly.

Case ascertainment

Ascertaining incident type 2 diabetes involved a review of the existing EPIC datasets at each centre using multiple sources of evidence including self-report, linkage to disease and drug registers, hospital admissions and mortality data. Information from any follow-up visit or external evidence with a date later than the baseline visit was used. To increase the specificity of the definition for these cases, we sought further evidence including individual medical records review in some centres. Follow-up was censored at the date of diagnosis, 31 December 2007 or the date of death, whichever occurred first. In total, 12,403 verified incident type 2 diabetes cases were identified.

Choice of dietary factors and dietary assessment

After a detailed literature review, we identified four dietary factors (whey-containing dairy, olive oil, coffee and cereal fibre) for which there was evidence for increasing postprandial incretin levels [1315]. Self- or interviewer-administered country-specific validated dietary questionnaires and/or diet records (Sweden only) were used to assess the usual food intake of participants. Nutrient intake was estimated using the standardised EPIC nutrient database (ENDB) [30]. Whey-containing dairy was calculated by subtracting the intake of cheese from the total intake of dairy products including milk, yoghurt, milk-based puddings and cream desserts. Olive oil consumption was reported on its own or as an ingredient in recipes, depending on the country. In the Swedish study centre Umeå, olive oil was not assessed in the questionnaire, and in the UK, it was only reported as an ingredient in recipes. Therefore, these centres had to be excluded from the olive oil interaction analysis. Coffee intake included both caffeinated and decaffeinated coffee. Cereal fibre was derived from the ENDB by adding the fibre content of cereal-based products, including bread, rice, wheat-based pasta, crisp bread, rusks and breakfast cereals [31].

Covariate assessment

Questionnaires were used to collect information on lifestyle factors and socioeconomic status at baseline [32]. For the current analysis we used a four category physical activity index reflecting occupational and recreational physical activity [33]. Educational attainment was categorised as: none; primary school; technical school; secondary school; and further education including university degree. Smoking status was categorised as: never; former; and current smoker. Alcohol consumption was categorised as: 0 g/day; >0–6 g/day; >6–12 g/day; >12–24 g/day; and >24 g/day. Total energy intake was assessed as kcal/day (converted to MJ/day), and intake of specific foods and nutrients of interest as g/day. Anthropometric measures including weight, height and waist circumference were collected at baseline by standardised procedures and adjusted for clothing [34]. Information on prevalent diseases was obtained at baseline including stroke, myocardial infarction, hypertension and hyperlipidaemia.

DNA extraction, genotyping and SNP selection

DNA was extracted from up to 1 ml of buffy coat for each individual from a citrated blood sample. A detailed account on the DNA extraction and genotyping procedures has been published previously [35]. Briefly, a total of 10,027 participants (4644 cases) were randomly selected across all centres (except Denmark) for genome-wide genotyping using the Illumina 660 W-Quad BeadChip (Illumina, San Diego, CA, USA). In addition, 9794 EPIC-InterAct participants with available DNA and not selected for genome-wide measurement were genotyped using the Illumina Cardio-Metabochip (Illumina, San Diego, CA, USA) [35].

Seven SNPs (GIPR: rs10423928; TCF7L2: rs7903146 and rs12255372; WFS1: rs10010131; KCNQ1: rs151290, rs2237892 and rs163184) associated with type 2 diabetes in GWAS [36] were selected based on evidence from small-scale experimental studies that had revealed major roles in the regulation of the release and functioning of incretin hormones [2, 3, 7, 8]. As TCF7L2 has several point mutations implicated in type 2 diabetes, we chose to use the ones for which there is evidence for involvement in the incretin system, based on small-scale experimental studies in humans. More specifically, two independent studies have collectively shown that TCF7L2 rs7903146, rs7901695 and rs12255372 influence the second phase of GLP-1-induced insulin secretion [7, 8]. Since TCF7L2 rs7903146 and TCF7L2 rs7901695 are in strong linkage disequilibrium (LD) (R2 = 0.98, D’ = 0.88) it is likely that the two SNPs capture the same genetic information, thus of the two, we only included rs7903146 in our analysis. The KCNQ1 SNPs included in our analysis were in low LD (R2 < 0.25).

Four of the aforementioned seven SNPs (GIPR rs10423928, TCF7L2 rs7903146, WFS1 rs10010131 and KCNQ1 rs163184) were available from direct genotyping within the entire EPIC-InterAct study on Sequenom or Taqman platforms. Information on two of the remaining SNPs (TCF7L2 rs12255372 and KCNQ1 rs2237892) was available from genotyping with the GWAS Illumina 660 W-Quad Chip and Illumina Cardio-Metabochip. The SNP KCNQ1 rs151290 was not available on any of the genotyping arrays and KCNQ1 rs163171 was chosen as a proxy SNP (r2 = 0.94, D’ = 1.0 in the CEU [Utah Residents with Northern and Western European Ancestry] of 1000 Genomes phase 3 dataset assessed via, accessed 21 January 2015). No significant deviation from Hardy–Weinberg equilibrium was observed (p > 0.01).

An unweighted genetic risk score was constructed based on all unlinked genetic variants (TCF7L2 rs7903146, KCNQ1 rs163184, KCNQ1 rs2237892, GIPR rs10423928 and WFS1 rs10010131) that were significantly related to type 2 diabetes within the study population. Minor allele frequencies for all SNPs can be found in the electronic supplementary material (ESM) Table 1.

Statistical analyses

Prentice-weighted Cox regression was used with age as the underlying time-scale, incident type 2 diabetes as the outcome and each SNP (GIPR: rs10423928; TCF7L2: rs7903146 and rs12255372; WFS1: rs10010131; KCNQ1: rs151290, rs2237892 and rs163184) as well as the genetic risk score, in turn, as exposure variables, stratified by centre and age at baseline (rounded to the nearest integer) and adjusted for sex. For each SNP, per genotype and additive genetic effects were modelled using the minor allele as the effect allele. Gene–diet interactions for each SNP (as well as genetic risk score) and intake of each dietary factor (whey-containing dairy, cereal fibre, coffee and olive oil) were modelled assuming an additive genetic effect by inclusion of a multiplicative interaction term in the model. This model was further adjusted for physical activity, education, BMI, smoking status, total energy intake, intake of fruit and vegetables, meat, soft drinks and alcohol, and mutual adjustment of the four dietary factors of interest. All dietary factors, except coffee, were energy-adjusted by the residual method [37]. Dietary factors were scaled to represent one serving. Calculation of one serving for each food item was based on previous publications from the EPIC-InterAct study and main EPIC studies: dairy 150 g/day; cereal fibre 10 g/day; olive oil 10 g/day; and coffee 125 g/day [26, 38, 39].

The country-specific Cox regression coefficients in the genetic main effects analysis and gene–diet interaction analyses were estimated and combined using random-effects meta-analysis. Between-study heterogeneity was estimated using I2.

In the interaction analysis, we established the number of independent genetic variables from a correlation matrix (Pearson’s r) of all genetic variables tested in the interaction analysis (seven SNPs), and a genetic risk score (range 0–10) by spectral decomposition [40]. Based on the number of independent genetic variables (n = 7) and dietary factors (n = 4) tested, p values below the multiple testing corrected significance threshold of 0.0018 (=0.05/28) were considered to be statistically significant.

All statistical analyses were performed using the SAS Enterprise Guide 6.1 (SAS Institute, Cary, NC, USA) and SAS 9.4 (SAS Institute). Meta-analyses were performed using R (version 3.1.2, and the R function ‘metagen’ available from the R package ‘meta’ (version 4.3-0) in PROC IML.


Participants in the subcohort (n = 11,035) were followed up for a median (interquartile range, IQR) of 12.5 (2.4) years, and 64% were women. The population was on average middle-aged (median [IQR] age 51.5 [14.0]) and had a median (IQR) BMI of 25.5 (5.4) kg/m2 (Table 1). The median (IQR) intake of the dietary factors of interest was 7.2 (5.2) g/day for cereal fibre, 248 (280) g/day for whey-containing dairy products, 1.7 (19.9) g/day for olive oil and 233 (417) g/day for coffee (Table 1). Study population characteristics stratified by country can be found in ESM Table 2.

Table 1 Baseline characteristics of the EPIC-InterAct subcohort participants with data available for at least one SNP of interest

All selected SNPs, with the exception of KCNQ1 rs163171 (the only proxy SNP used in our analysis), were associated with incident type 2 diabetes (ESM Table 3). We found no evidence of an interaction with any of the dietary factors for WFS1 rs10010131, or KCNQ1 rs163171, rs163184 and rs2237892. For GIPR rs10423928, we observed a marginally non-significant interaction with olive oil (p = 0.050) (Table 2).

Table 2 Multivariable-adjusted HR (95% CI) for risk of type 2 diabetes per one serving per day increment of each dietary factor, stratified by incretin-related SNPs, in the EPIC-InterAct study (n = 18,638)

We identified a nominally significant interaction between a TCF7L2 variant and coffee consumption (rs12255372: p Interaction = 0.048, Table 2). Overall, coffee intake was inversely related to the risk of type 2 diabetes (ESM Table 4). Stratification by rs12255372 genotype indicated a lack of an association between coffee intake and type 2 diabetes in non-carriers of the risk allele (rs12255372-GG: HR 0.99 [95% CI 0.97, 1.02] per cup of coffee; Table 2), while with each additional type 2 diabetes risk allele, the inverse association between coffee intake and risk of type 2 diabetes became stronger (rs12255372-GT: HR 0.96 [95% CI 0.93, 0.98] per cup; rs12255372-TT: HR 0.93 [95% CI 0.88, 0.98] per cup; Table 2). TCF7L2 rs12255372 is in moderate LD to TCF7L2 rs7903146 (see Methods section) for which we observed a similar trend for interaction with coffee (Table 2). There was no evidence for heterogeneity between countries (I2 = 8.8%).

We also observed some effect modification in the association between cereal fibre and type 2 diabetes by TCF7L2 variants, but the interaction terms did not reach statistical significance. As observed for coffee, the interaction of cereal fibre with TCF7L2 SNPs was slightly stronger for rs12255372 (p Interaction = 0.08, Table 2) than for rs7903146 (p Interaction = 0.30, Table 2).

A genetic risk score based on the diabetes risk alleles TCF7L2 rs7903146, KCNQ1 rs163184, KCNQ1 rs2237892, GIPR rs10423928 and WFS1 rs10010131 also showed a nominally significant interaction with coffee intake (p = 0.005, Table 3). The inverse association between coffee intake and risk of type 2 diabetes was stronger in carriers of six or more incretin-specific risk alleles (Table 3). After exclusion of TCF7L2 rs7903146 from the risk score the interaction effect remained significant, indicating that the effect was not only driven by the TCF7L2 SNP that showed a trend for interaction with coffee in the single SNP analysis. None of the abovementioned interactions can be considered significant after correction for multiple testing.

Table 3 Multivariable-adjusted HR (95% CI) stratified by genetic risk score and p value for the interaction of each dietary factor with incident type 2 diabetes in the EPIC-InterAct study (n = 18,638)

Sensitivity analyses of the statistically significant interactions were carried out excluding participants with prevalent stroke, myocardial infarction and cancer as well as cases occurring within the first two years of follow-up to examine the possibility of confounding and reverse causation by these factors. No notable changes in effect estimates and p values were observed. Since olive oil consumption varied considerably between countries (ESM Table 2), we performed further sensitivity analysis specifically for all interactions involving olive oil. First, we excluded centres with extremely low consumption of olive oil (all centres where the median intake was 0), which included all of the centres in France, the Netherlands and Sweden. In this analysis, estimates and p values did not change notably. Second, we assessed interactions between each SNP and a binary variable indicating consumption vs non-consumption of olive oil. No interaction effects between olive oil consumption and any of the SNPs of interest were identified (data not shown).


Summary of findings

Our primary aims were to investigate the presence of multiplicative interactions between the intake of foods as well as beverages and specific SNPs relevant to the incretin system, and to identify how these interactions affect incident type 2 diabetes in a European case-cohort study. We identified a possible interaction between TCF7L2 and coffee, with carriers of the type 2 diabetes risk-conferring T allele benefiting more from coffee consumption than non-carriers. As far as the other genes are concerned, no compelling evidence of interaction was observed. The genetic risk score also showed evidence for an interaction with coffee. None of the aforementioned interactions passed the p value threshold for statistical significance after correction for multiple testing, thus all nominally significant findings should be interpreted with caution, as we cannot exclude the possibility of a Type I error. On the other hand, since very large sample sizes are required to detect gene–environment interactions of small magnitude, Type II errors are also possible.

Gene–diet interactions involving TCF7L2

In line with the current findings, previous studies have also identified interactions between TCF7L2 variants and dietary components [4143], but none of these studies investigated an interaction with coffee. In our study, we found a nominally significant interaction between SNP rs12255372 in TCF7L2 and coffee consumption, where the inverse association of coffee intake and type 2 diabetes was only present among participants carrying the risk-conferring T allele and was stronger in homozygous compared with heterozygous carriers.

The evidence from previous studies regarding interactions between TCF7L2 variants and dietary components has focused mainly on wholegrain cereal and dietary fibre. Previous studies [4143] reported that TCF7L2 rs7903146 interacted with whole grains and cereal fibre intake, with the protective effect of dietary fibre appearing only in individuals homozygous for the C allele. In our study, we did find a similar trend of a protective association between cereal fibre and type 2 diabetes only among TCF7L2 rs7903146 CC carriers and rs12255372 GG carriers, but these interactions did not reach statistical significance (p = 0.30 and p = 0.08, respectively). Similarly, a previous study also failed to show an interaction between TCF7L2 rs12255372 and dietary fibre on type 2 diabetes risk [44]. Furthermore, another large study of 46,000 individuals failed to find an interaction between TCF7L2 rs4506565 (a SNP in high LD with rs7903146) and intake of wholegrain cereal in relation to blood glucose and insulin levels [45]. Although these studies show a trend towards an interaction, even larger sample sizes or improved assessment methods may be needed to achieve the appropriate statistical power for detecting this specific interaction. We have to acknowledge, however, that overall no association between cereal fibre and risk for type 2 diabetes was detected in the EPIC-InterAct study, while such an association was observed in many other prospective cohort studies resulting in a clearly significant protective effect in a meta-analysis [28].

Potential mechanisms

Our analyses show some evidence for a possible TCF7L2–coffee interaction, as well as an overall interaction of an incretin-specific genetic risk score with coffee on type 2 diabetes risk. The genetic risk score included SNPs preselected based on their involvement in the incretin system, therefore it may represent an overall genetic predisposition to defects in postprandial insulin secretion, which is somewhat attenuated by coffee consumption. TCF7L2 is a transcription factor for proteins involved in the proper functioning of the Wnt signalling pathway, essential for insulin secretion and beta cell proliferation. Being in positive feedback with GLP-1, TCF7L2 is believed to enhance pancreatic incretin sensitivity [13], as well as increase the production of GIP [14]. KCNQ1 plays an important role in the transport of incretins in enterocytes by exocytosis and thus incretin secretion from the gut [3, 10]. GIPR is the receptor for GIP and thus is directly involved in pancreatic incretin sensitivity [2, 3], while WFS1 codes for Wolframin, which is responsible for the proper folding of proinsulin, and is thus linked to pancreatic incretin sensitivity [3, 9].

A study by Faerch et al [46] suggests that a major effect of TCF7L2 variants on type 2 diabetes is mediated through lower secretion of GIP. In addition, the type 2 diabetes risk allele (T) in TCF7L2 rs7903146 and rs12255372 has been linked to impaired insulin secretion and incretin action [7, 8], and more specifically, a reduction in the second phase of GLP-1-induced insulin secretion [8]. Given this, it would be rational to speculate that foods and beverages which tend to stimulate the secretion of incretins, such as coffee, will have the tendency to compensate, partly, for any genetic defects in the incretin system and thus confer higher protection from type 2 diabetes in individuals carrying the predisposing polymorphisms. Consistent with our results, a randomised crossover trial among individuals with type 2 diabetes that investigated the postprandial effects of three different isoenergetic diets on glucose levels concluded that a dietary pattern characterised by high coffee consumption resulted in a more pronounced insulin response with similar postprandial glucose levels; an effect attributed to increased GIP hormone release [47]. This effect of coffee is possibly brought about via delayed intestinal glucose absorption [24], but pathways involving direct effects on beta cells cannot be excluded; thus, further experimental studies that take incretin levels into account are required to elucidate the exact mechanisms.

Study strengths and limitations

Study power is a major issue in gene–environment interaction studies. Many existing gene–environment interaction studies of type 2 diabetes are underpowered, particularly if the interactions are of small magnitude [48]. With a sample of approximately 20,000 our study is adequately powered for detecting gene–environment interactions of moderate magnitude. With 8291 cases of type 2 diabetes, our gene–environment interaction analysis is the largest ever reported from a single study. Still, we did not find an interaction that passed the multiple testing significance threshold, suggesting that either no interactions exist in the study population or that even larger sample sizes are required to detect them.

The fact that our study includes participants from eight different European countries significantly increases the external validity of our findings, at least in individuals of European origin.

Despite accurate assessment of both exposure (dietary factors) and outcome (diabetes) and the temporal relationship between the two, the inherent issue of dietary misreporting, which affects all observational studies, cannot be completely excluded.

The selection of dietary factors in our analysis was based on a review of the existing literature with the aim of identifying factors that have been shown to stimulate incretin release over and above macronutrient intake. However, we have to acknowledge that there are certainly more dietary factors and SNPs that influence the incretin system that are yet to be discovered. Hence, our analysis cannot be viewed as a comprehensive report on all possible interactions between SNPs and dietary factors in the incretin system.


Our population-based case–cohort study provides evidence for the possible interaction of a TCF7L2 variant and an incretin-specific genetic risk score with coffee consumption affecting the risk of type 2 diabetes. Further adequately powered studies or meta-analyses of smaller studies are required to replicate and confirm these interactions.