Dietary and lifestyle approaches have a high potential for the primary prevention of type 2 diabetes [1]. In nutritional epidemiology, dietary-pattern analysis has gained particular interest because it reflects the complexity of dietary intake. Two approaches are generally distinguished for defining dietary patterns [2]. The hypothesis-oriented approach defines diet-quality scores based on existing scientific evidence for chronic diseases. Examples include the Healthy Eating Index (HEI) [3], the Diet-Quality Index [4] and the alternative Healthy Eating Index (aHEI) [5]. In contrast, the exploratory approach uses the dietary data at hand, applying statistical methods such as factor analysis or cluster analysis to reveal major prevailing dietary patterns in a study population. Reduced rank regression (RRR) is a mixture of a hypothesis-oriented and an exploratory approach and is aimed at identifying food group combinations that explain a maximum of variation in (disease-related) response variables [6]. Therefore, in addition to the hypothesis-based definition of diet-quality scores, the RRR method may be especially useful in identifying diabetes-related dietary patterns.

One of the most extensively studied diet-quality scores is the Mediterranean dietary-pattern score. Overall, studies suggest that adherence to the Mediterranean dietary pattern is related to lower diabetes risk [7–10]. Besides the Mediterranean dietary pattern, only a limited number of studies have examined individual diet-quality scores and diabetes incidence. The few available data suggest that adherence to the aHEI [11], the Dietary Approaches to Stop Hypertension (DASH) diet [12, 13], the HEI [11] and the Overall Nutritional Quality Index (ONQI) [14] may lower diabetes risk. No clear associations were observed for the Recommended Food Score [12], the Diet-Quality Index [15] or diet-quality scores reflecting guidelines from Germany [16] and Australia [17]. Most of these studies were conducted in American populations [11–15] and it has been suggested that associations may differ between heterogeneous populations such as different ethnic groups [13, 15]. Several RRR-derived dietary patterns have been associated with diabetes risk [18–21], but it is unknown whether these dietary patterns are related to risk in different populations.

We reconstructed selected predefined diet-quality scores (aHEI and DASH), as well as RRR-derived dietary patterns that were originally derived in other populations, and evaluated their association with diabetes incidence in the multi-centre European Prospective Investigation into Cancer and Nutrition (EPIC)-InterAct study. We also assessed the degree of heterogeneity in the associations between countries involved in EPIC.


EPIC-InterAct study

The EPIC-InterAct study is a case-cohort study nested within the prospective EPIC study [22]. In brief, EPIC includes 521,448 adults aged 25–79 years who were recruited between 1991 and 2000 at 23 centres in ten European countries participating in EPIC [23–25]. In the majority of the EPIC study centres, participants were recruited from the general population, with some exceptions [24]: the French cohort was based on members of a health insurance scheme for teachers; the Italian and Spanish cohorts included blood donors; participants from Utrecht (the Netherlands) and Florence (Italy) were recruited via a breast cancer screening programme; in Oxford (UK) half of the cohort were vegans, lacto-ovo vegetarians or fish eaters, and in France, Norway, Utrecht (the Netherlands) and Naples (Italy) only women were recruited [24]. Each EPIC centre obtained individual written informed consent and local ethics approval.

Within the InterAct project, incident cases of type 2 diabetes occurring in the EPIC cohort were ascertained and verified. All EPIC countries except Norway and Greece contributed to EPIC-InterAct (n = 455,680). Individuals without stored blood (n = 109,625) or without information on reported diabetes status (n = 5,821) were excluded, leaving 340,234 participants eligible for inclusion in EPIC-InterAct (corresponding to 3.99 million person-years follow-up).

Case-cohort construction and case ascertainment

A centre-stratified, random subcohort of 16,835 individuals was selected. After exclusion of 548 individuals with prevalent diabetes and 133 with uncertain diabetes status, the subcohort included 16,154 individuals for analysis. Because of random selection, this subcohort also included a random set of 778 individuals who had developed incident type 2 diabetes during follow-up (Fig. 1).

Fig. 1 Construction of the EPIC-InterAct case-cohort study and the study population for the present analysis. T2D, type 2 diabetes

Ascertainment of incident type 2 diabetes involved a review of the existing EPIC datasets at each centre using multiple sources of evidence including self-report, linkage to primary-care registers, secondary-care registers, medication use (drug registers), hospital admissions and mortality data. Information from any follow-up visit or external evidence with a date later than the baseline visit was used. Rather than self-report, cases in Denmark and Sweden were identified via local and national diabetes and pharmaceutical registers [26] (accessed 11 October 2013) and hence all ascertained cases were considered to be verified. Some cases in centres other than Denmark and Sweden were based on only one source of information. To increase the specificity of the definition for these cases, we sought further evidence including review of individual medical records in some centres. Follow-up was censored at the date of diagnosis, 31 December 2007 or the date of death, whichever occurred first. Altogether, 12,403 verified incident cases were identified [22]. As stated earlier, 778 of these 12,403 incident cases were also subcohort members, due to the random selection of the subcohort. Thus, the EPIC-InterAct study involves 27,779 participants (16,154 subcohort members; 12,403 incident cases including 778 cases within the subcohort; Fig. 1).

Study population for the present analysis

Of these 27,779 participants, we excluded those from study centres in Italy and Umeå (Sweden) (n = 5,238) because these centres did not obtain specific intake data on diet soft drinks, breakfast cereals and dressing sauces (Italy) or diet soft drinks and cabbages (Umeå), which are important dietary components of the selected dietary-pattern scores. Specifically for analyses on DASH, the UK centres were excluded due to the unavailability of intake data on vegetable oils (n = 1,857). We further excluded participants with missing data on diet or covariates (n = 925), resulting in a final study population of 21,616 (9,682 cases; 12,595 subcohort members with an overlap of 661 subcohort members who had developed incident type 2 diabetes; Fig. 1). The excluded participants were more likely to be slightly older, women, slightly less overweight, less physically active, less educated and a current or former smoker and they were less likely to have a family history of diabetes. The proportion of participants with HbA1c ≥ 6.5% (47.5 mmol/mol) was slightly higher among excluded participants.
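The participant flow described above can be cross-checked with simple arithmetic. The following is a minimal sketch that only recombines counts quoted in the text; the variable names are illustrative and do not come from the study. The DASH-specific exclusion of the UK centres (n = 1,857) applies to one analysis only and is therefore not subtracted here.

```python
# Counts quoted in the text; variable names are illustrative only.
subcohort = 16_154          # subcohort members retained for analysis
incident_cases = 12_403     # verified incident type 2 diabetes cases
overlap = 778               # incident cases who are also subcohort members

# Case-cohort size: union of subcohort and cases, counting the overlap once.
case_cohort = subcohort + incident_cases - overlap

excluded_centres = 5_238    # Italy and Umeå (missing dietary components)
excluded_missing = 925      # missing data on diet or covariates
analysis_population = case_cohort - excluded_centres - excluded_missing

# The same union logic applied to the final analysis population.
final_cases, final_subcohort, final_overlap = 9_682, 12_595, 661
analysis_union = final_cases + final_subcohort - final_overlap

print(case_cohort, analysis_population, analysis_union)
# prints: 27779 21616 21616
```

Both routes to the final study population agree with the reported n = 21,616.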

Dietary assessment and selection of dietary-pattern scores

Usual food intake during the past 12 months was assessed at baseline with the use of quantitative or semi-quantitative dietary questionnaires, which were developed and validated locally [24, 27]. The reproducibility of these questionnaires was generally good in the EPIC centres, while the relative validity ranged from moderate to good as also observed in other validity studies conducted by independent research groups [28, 29]. Individual food items were classified into food groups based on nutrient composition. Definitions and contents of the food groups considered for the present analysis are shown in electronic supplementary material (ESM) Table 1. Intakes of specific nutrients and total energy were derived with the standardised EPIC Nutrient Database [30].

Dietary patterns considered in this study were selected from the literature. Criteria for selection were availability of the necessary intake data to construct the dietary patterns in the EPIC-InterAct study and presence of scientific evidence indicating that the dietary pattern had a potential relevance for diabetes risk. We have selected two widely used diet-quality scores, the aHEI [5] and the DASH diet [31, 32]. The relation of the Mediterranean dietary pattern to diabetes in EPIC-InterAct has been specifically addressed previously [9] and hence not investigated here. We could not evaluate the HEI and the ONQI as it was not possible to appropriately reflect these indices with the EPIC-InterAct dietary data. We selected three RRR-derived dietary patterns: RRR1 was derived in the American Nurses' Health Study (NHS) using six inflammatory markers as responses [20]; RRR2 was identified in the German EPIC-Potsdam study with HbA1c, HDL-cholesterol, C-reactive protein (CRP) and adiponectin as responses [18]; RRR3 was identified in the British Whitehall II study with the HOMA-IR index as response [19]. An RRR dietary pattern derived with BMI as response along with fasting glucose, triacylglycerols, HDL-cholesterol and hypertension [21] was not considered because we aimed to assess the association of dietary patterns with diabetes independent of body size. Tables 1 and 2 show the individual dietary components of the dietary-pattern scores used in this study and their weighting in the calculation of the scores, respectively. A detailed description of the construction of the dietary-pattern scores in EPIC-InterAct is given in ESM Methods.

Table 1 Individual dietary components of the aHEI and the DASH dietary patterns considered in the analysis, EPIC-InterAct study
Table 2 Individual dietary components of the RRR dietary patterns considered in the analysis, EPIC-InterAct study

Assessment of other covariates

Standardised questionnaires were used at baseline to collect information on sociodemographic characteristics and lifestyle including age, education level, smoking status, occupational and leisure-time physical activity and history of previous illness. Height, weight and waist circumference of participants were obtained by trained staff during the baseline examination using standardised protocols [33]. However, for participants from France and some participants from Oxford (UK), self-reported anthropometric data were collected (4% of EPIC-InterAct study).

Statistical analysis

All dietary-pattern scores were transformed to z scores, based on subcohort distributions. Median dietary-pattern scores by country were computed to quantify country-specific adherence to the dietary patterns. The UK cohorts from Norfolk (population-based) and Oxford (high proportion of vegans, vegetarians and health-conscious individuals) were considered separately. We performed Cox proportional hazards analysis, weighted according to the Prentice method [34], to study the association between the dietary-pattern scores and the hazard of type 2 diabetes. Age was used as underlying time scale. Four models were applied, all stratified by study centre and integers of age (years), but with different levels of adjustment. Model 1 was adjusted for sex. Model 2 included further adjustment for physical activity (classified into ‘inactive’, ‘moderately inactive’, ‘moderately active’ and ‘active’ according to the validated Cambridge Physical Activity Index [35]), smoking status (never, former, current), educational level (none, primary, technical/professional, secondary, university) and total energy intake (continuous). We also applied additional adjustment for BMI (model 3) and BMI and waist circumference (model 4, both continuous).
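Two of the steps above lend themselves to a short illustration: standardising scores against the subcohort distribution, and preparing case-cohort data for a Prentice-weighted Cox model with age as the time scale. The sketch below is not the SAS code used in the study, which the text does not describe. It shows one common way to express Prentice weighting as counting-process rows, in which subcohort members contribute their whole follow-up and cases outside the subcohort enter the risk set only just before their event. The function names, the dictionary keys and the small offset `eps` are assumptions made for illustration.

```python
from statistics import mean, stdev


def zscores_from_subcohort(scores, in_subcohort):
    """Standardise all scores using the mean and SD of subcohort members only."""
    reference = [s for s, flag in zip(scores, in_subcohort) if flag]
    mu, sd = mean(reference), stdev(reference)
    return [(s - mu) / sd for s in scores]


def prentice_rows(person, eps=1e-4):
    """Return (start_age, stop_age, event) rows for one participant.

    `person` is a hypothetical dict with the keys entry_age, exit_age,
    is_case and in_subcohort. All rows carry weight 1 under this scheme.
    """
    if person["in_subcohort"]:
        # Subcohort members are at risk from entry to exit, case or not.
        return [(person["entry_age"], person["exit_age"], int(person["is_case"]))]
    if person["is_case"]:
        # Non-subcohort cases join the risk set only at their own event time.
        return [(person["exit_age"] - eps, person["exit_age"], 1)]
    # Non-cases outside the subcohort are not part of a case-cohort sample.
    return []


if __name__ == "__main__":
    print(zscores_from_subcohort([1.0, 2.0, 3.0, 10.0], [True, True, True, False]))
    print(prentice_rows({"entry_age": 50.0, "exit_age": 55.0,
                         "is_case": True, "in_subcohort": False}))
```

Rows built this way could then be passed to any Cox routine that accepts start and stop times, with stratification by centre and integer age and a robust variance estimator added as appropriate.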

Heterogeneity among countries in the association of the dietary-pattern scores with diabetes risk was studied by computing country-specific risk estimates and pooling these with random-effects meta-analyses. The two UK cohorts from Norfolk and Oxford were considered separately in the meta-analyses. As our aim was to verify associations of dietary patterns with diabetes, which should be done in independent cohorts, we did not use the Potsdam cohort in the meta-analysis for RRR2 because this pattern was derived in this cohort [18]. To explore potential sources of heterogeneity, country-specific mean age and BMI were related to the log-transformed HRs in subsequent meta-regression analyses [36].
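As an illustration of how country-specific estimates can be pooled, the sketch below implements a DerSimonian-Laird random-effects meta-analysis on the log-HR scale and reports the I² statistic. The text does not state which between-country variance estimator was used in Stata, so the choice of DerSimonian-Laird is an assumption, as are the function names. Standard errors are assumed to be available directly or recoverable from a 95% CI.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate SE of a log HR from its 95% CI limits."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def random_effects_pool(log_hrs, ses):
    """DerSimonian-Laird pooling of log HRs; returns pooled HR, CI, tau2 and I2."""
    w = [1 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sum(w)
    # Cochran's Q measures the spread of estimates around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    df = len(log_hrs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-country variance to each SE^2.
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled), math.exp(pooled + 1.96 * se_pooled)),
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Made-up country estimates, for illustration only.
    result = random_effects_pool([math.log(0.8), math.log(1.0), math.log(0.9)],
                                 [0.03, 0.03, 0.03])
    print(result)
```

Leaving one country out and re-running the pooling is a simple way to see which country drives I², which is the kind of influence check reported in the results.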

Several sensitivity analyses were performed. To minimise reverse causality caused by a change in diet due to a prediabetic condition or chronic disease, we excluded participants with baseline HbA1c ≥ 6.5% (47.5 mmol/mol; 1.5% of the study population were missing values for HbA1c), incident cases diagnosed with diabetes within the first 2 years of follow-up and participants with baseline cardiovascular disease (myocardial infarction, stroke) or self-reported hypertension or hyperlipidaemia. To investigate potential effects of misreporting, we excluded participants in the top or bottom 1% of the energy intake/energy requirement ratio. Possible confounding by diabetes family history was addressed by further adjusting for history of diabetes in a first-degree relative (information not available in the Spanish centres, Oxford [UK] or Heidelberg [Germany]).

We investigated the importance of individual components of the dietary patterns for diabetes risk by sequentially subtracting components from the score. The change in estimate (CIE) was calculated as the difference between the HRs divided by the HR for the original score and multiplied by 100 (%).
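The change in estimate defined above can be written as a one-line function. In this minimal sketch the sign convention (reduced score minus original score) is an assumption, since the text specifies only "the difference between the HRs", and the example HRs are illustrative values rather than figures taken from Table 6.

```python
def change_in_estimate(hr_original, hr_reduced):
    """CIE (%) = (HR after subtracting a component - original HR) / original HR * 100."""
    return (hr_reduced - hr_original) / hr_original * 100


if __name__ == "__main__":
    # Illustrative values: an inverse association moving from 0.91 towards the null.
    print(round(change_in_estimate(0.91, 0.94), 1))  # prints: 3.3
```

With this convention a positive CIE for an inverse association means the HR moved towards 1 after the component was removed, that is, the association was attenuated.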

Statistical analyses were performed with SAS (Version 9.2, Enterprise Guide 4.3; SAS Institute, Cary, NC, USA), except for meta-analyses and meta-regressions, which were conducted using Stata 12 (StataCorp, College Station, TX, USA).


Median z-transformed dietary-pattern scores for each country are shown in Table 3. Table 4 shows baseline participant characteristics for the lowest and highest quintiles of the dietary-pattern scores. High scores correspond to favourable adherence. Most notably, participants with high aHEI and DASH scores were more likely to be women and never smokers, while there was no strong association with body size. Participants with high scores for all three RRR patterns tended to be older, were more likely to be women and had a lower body size and higher educational level. Furthermore, macronutrient composition and intakes of alcohol, fibre, meat, fruits/vegetables and coffee were clearly associated with the dietary-pattern scores.

Table 3 Dietary-pattern scores by country in the subcohort of the EPIC-InterAct study
Table 4 Baseline characteristics for extreme quintiles of the dietary-pattern scores in the subcohort of the EPIC-InterAct study (n = 12,595)

Table 5 shows HRs for the association of quintiles of the dietary-pattern scores with diabetes in the pooled study population. We observed linear inverse associations of the aHEI and DASH scores with diabetes after correction for age, sex, study centre, sociodemographic factors and lifestyle characteristics (model 2). However, these associations lost statistical significance after additional adjustment for BMI and waist circumference (model 4). For the three RRR scores, we observed relatively strong linear inverse associations with diabetes in our model 2. After additional adjustment for BMI and waist circumference, these linear inverse associations were modestly attenuated but remained statistically significant.

Table 5 HRs for developing type 2 diabetes according to quintiles of the dietary-pattern scores, EPIC-InterAct study (n = 21,616)

We used continuous variables of the dietary-pattern scores in a meta-analytical approach to investigate country heterogeneity in the association with diabetes. Figure 2 shows country-specific HRs (1-SD increment, model 4 adjustments) and combined estimates obtained from random-effects meta-analyses. The combined effect estimates did not indicate a meaningful association of aHEI and DASH with diabetes. We observed inverse associations of all three RRR scores with diabetes, although the combined HR for RRR2 did not reach statistical significance (combined HR [95% CI]: for RRR1 0.91 [0.86, 0.96]; RRR2 0.92 [0.84, 1.01]; RRR3 0.87 [0.82, 0.92]). There was moderate country heterogeneity for DASH (I² = 39.4%), RRR1 (I² = 47.8%) and RRR3 (I² = 52.2%), whereas higher I² values were observed for aHEI and RRR2 (>70%). Omitting single countries from the meta-analysis revealed that heterogeneity was mainly introduced by Spain for RRR1 (I² without Spain = 22.4%), by the two UK centres, Norfolk and Oxford, for RRR2 (I² without Norfolk and Oxford = 0%) and by Norfolk for RRR3 (I² without Norfolk = 22.4%). For the aHEI and DASH scores, heterogeneity was not introduced by single countries.

Fig. 2 HRs (95% CIs) for developing type 2 diabetes for a 1-SD increment in the dietary-pattern scores (a, aHEI; b, DASH; c, RRR1; d, RRR2; e, RRR3) stratified by country and meta-analysed using a random-effects model, EPIC-InterAct study (n = 21,616). Note that the scale of the x-axis is non-linear. Model 4 adjustments were applied (stratified by age and study centre [applicable for country-specific analyses only] and adjusted for sex, physical activity, smoking status, education, total energy intake, BMI and waist circumference). In (d) the German study population is labelled ‘Heidelberg’ because Potsdam was excluded since it was used in the derivation of RRR2

In subsequent meta-regression analyses, we investigated whether mean age and BMI were related to the country-specific HRs. We detected a significant inverse association between mean age and the HR for the aHEI (p = 0.0004, ESM Fig. 1). There were no clear associations between mean age and country-specific HRs for the DASH and RRR scores. Similarly, there was no clear association between mean BMI and country-specific HRs for any of the five dietary patterns. For the RRR scores we observed that certain centres introduced heterogeneity in the diabetes association and so we further explored the risk contributions and intake distributions of single food components in these centres. There was a clearly higher mean intake and wider distribution of reported wine consumption in Spain (mean 136 g/day, SD 239 g/day) compared with the overall EPIC-InterAct study population (mean 82 g/day, SD 160 g/day). When subtracting wine from the RRR1 score, the HR for Spain (0.95 [95% CI 0.87, 1.05]) was more comparable with that of the other EPIC countries. For RRR2, subtracting fruits resulted in more similar HRs for the UK centres (Norfolk 0.82 [95% CI 0.70, 0.97], Oxford 1.06 [95% CI 0.74, 1.53]) compared with other countries. For RRR3, the stronger association for Norfolk than for the other centres was not explainable by any single food component.
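The idea behind the meta-regression can be illustrated with an inverse-variance weighted regression of country-specific log HRs on a country-level covariate such as mean age. This is a deliberately simplified, fixed-effect sketch: the meta-regression run in Stata [36] may additionally estimate a between-country variance component, which is not modelled here. The function name and the example data are hypothetical.

```python
def weighted_meta_regression(x, log_hrs, ses):
    """Inverse-variance weighted least squares of log HR on a covariate x.

    Returns (intercept, slope). A negative slope means the HR decreases
    as the covariate increases.
    """
    w = [1 / se ** 2 for se in ses]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, log_hrs))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


if __name__ == "__main__":
    # Made-up country-level values, for illustration only.
    mean_age = [50.0, 55.0, 60.0]
    log_hr = [0.0, -0.05, -0.10]
    se = [0.05, 0.04, 0.06]
    print(weighted_meta_regression(mean_age, log_hr, se))
```

In this toy example the log HR falls by 0.01 per year of mean age, the same direction as the inverse association reported for the aHEI.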

None of the sensitivity analyses resulted in a material change of the effect estimates. Also, undertaking analyses separately for men and women did not reveal appreciable differences (results not shown).

We sequentially subtracted components from the RRR scores to analyse their importance for diabetes (Table 6). The subtraction of coffee (CIE 3.3%), and also of processed meat (CIE 2.2%) and sugar-sweetened soft drinks (CIE 1.1%), weakened the observed association for the RRR1 score. Similarly, excluding sugar-sweetened soft drinks and processed meat (CIE 3.3%, respectively), but also fruits (CIE 4.3%), red meat (CIE 2.2%), legumes (CIE 1.1%) and white bread (CIE 1.1%) from the RRR2 score resulted in attenuated HRs. For RRR3, we observed slight attenuations in the HR after excluding honey/jam/sugar, processed meat, white bread and dressing sauces (CIE 1.1–3.4%). The results were materially the same when we repeated these analyses with adjustment for the subtracted component, respectively.

Table 6 Pooled HRs (95% CIs) for developing type 2 diabetes for a 1-SD increment in the RRR dietary-pattern scores and after alternate subtraction of each of its components; EPIC-InterAct study (n = 21,616)


In this large European case-cohort study, the adherence to several RRR-derived dietary patterns was related to a lower risk of type 2 diabetes. There was no significant association between the aHEI or DASH dietary pattern and risk, independent of body size.

Our observation of a stronger relevance of the RRR dietary patterns for diabetes compared with the diet-quality scores aHEI and DASH is probably due to the fact that the RRR patterns were specifically derived to explain variation in diabetes-relevant biomarkers. The aHEI was originally created to predict chronic disease risk with a focus on cardiovascular disease and cancer [5], whereas the DASH diet was designed to lower blood pressure [31]. Still, some previous studies detected a significant inverse relation of these diet-quality scores to diabetes risk [11–13]. It appears plausible that the RRR3 score showed the strongest risk relationship among the RRR dietary patterns because it was originally derived to explain variation in the HOMA-IR. Insulin sensitivity may be more closely linked to diabetes risk than inflammation or dyslipidaemia, which were the responses used to derive the other two RRR dietary patterns.

We observed important similarities between the three RRR dietary patterns with regard to their dietary components. Most notably, processed meat and sugar-sweetened soft drinks loaded negatively on all three patterns. In addition, excluding these components from the scores led to an attenuation of the HRs. These findings are supported by recent meta-analyses that showed that higher consumption of processed meat [37] and sugar-sweetened beverages [38] is associated with development of type 2 diabetes. Furthermore, white bread or refined grains constituted important components of all three diabetes-related dietary patterns in our study. Notably, the RRR patterns also showed differences in their composition, which resulted from the use of different responses, reflecting different pathomechanisms. Despite these differences, an association of all three RRR patterns with diabetes appears plausible given that distinct metabolic pathways are involved.

The RRR3 dietary pattern is also characterised by high intakes of dressing sauces and honey/jam/sugar, which might seem surprising. These foods may not be causally related to diabetes risk but may rather represent markers of other foods with which they are consumed [19]. McNaughton et al emphasised correlations of salad dressings with salad vegetable intake and of jam consumption with wholegrain bread in the British Whitehall II study [19]. Similarly, we observed a correlation between intake of dressings and vegetables in EPIC-InterAct (r = 0.23). The unavailability of specific intake data for wholegrain bread in the individual EPIC countries precluded us from further evaluating whether jam/honey may be a marker of this food in our study. Using non-white bread as an alternative revealed correlations in specific countries (Denmark, Netherlands, UK; r = 0.11–0.23). Furthermore, it may appear counterintuitive that legumes score negatively on the EPIC-Potsdam-derived RRR2 pattern. A possible explanation, in this German population, is that legumes were mostly consumed in the form of stew, often accompanied by processed meat [18].

Our study did not confirm earlier findings of significant inverse associations of the aHEI [11, 12, 39] and the DASH score [12, 13] with diabetes risk after adjustments including body size. Of note, these earlier studies were all performed in American settings. Liese et al observed different risk relations for the DASH score between white and black/Hispanic populations [13]. Therefore, it can be speculated that relations between certain dietary-pattern scores and diabetes risk are somewhat population-specific, possibly because of different distributions in dietary intakes. Furthermore, the scores were not created identically across the studies. Our aHEI score did not consider trans-fatty-acid intake or multivitamin supplement use. However, this should not explain our null finding because recent studies on trans-fatty-acid biomarkers do not support a direct association with diabetes risk [40–43] and including multivitamin supplement use in the aHEI did not materially change our results (data not shown). Still, our findings do not exclude the possibility that adherence to the aHEI and DASH diet lowers diabetes risk, at least in some individuals. As we had to rely on self-reported dietary intakes, measurement error may have attenuated the observed statistical associations [44]. Furthermore, a recent meta-analysis of intervention studies suggests that the DASH diet can improve insulin sensitivity independent of weight loss [45].

We detected some degree of heterogeneity between EPIC countries in the association of the dietary patterns with diabetes. Reasons for this heterogeneity may include differences in dietary assessment tools, distributions of dietary intake and confounders as well as general cohort characteristics. This may explain the somewhat divergent results for the Oxford cohort, which includes many vegans, lacto-ovo vegetarians and other health-conscious people.

We aimed to explore sources for this heterogeneity between countries. Meta-regression analyses revealed an inverse association of country-specific mean age with the country-specific HRs for the aHEI. A similar observation was also made for the Mediterranean dietary pattern in EPIC-InterAct [9].

For the three RRR dietary patterns, single centres were responsible for heterogeneity in the association with diabetes risk. Descriptive analyses revealed a clearly higher mean intake and wider distribution of reported intake of wine in Spain, which probably explained the absence of an inverse risk relation of the RRR1 dietary-pattern score for Spain. Because a lower risk for diabetes has especially been observed in the moderate range of alcohol intake [46], a high wine intake at the population level may exert a detrimental rather than a beneficial effect on risk. Indeed, when subtracting wine from the RRR1 score, the effect estimate for Spain was more comparable with the other EPIC countries. For RRR2, heterogeneity was mainly introduced by the two UK centres. When investigating single RRR2 components, we found that subtracting fruits from the RRR2 score resulted in more similar effect estimates for the UK centres compared with the other countries. This agrees with an earlier investigation of EPIC-InterAct, which reported a significant inverse association of fruit intake with diabetes only for the UK [47]. For RRR3, which was originally derived in the British Whitehall II study, we observed a clearly stronger association in the British Norfolk cohort compared with the other EPIC countries. Similarly, RRR2, which was derived in the German EPIC-Potsdam cohort, showed the strongest association in the German EPIC-Heidelberg cohort in our study. It appears plausible that associations between specific dietary patterns and disease may be better generalisable to populations with comparable dietary habits and intake distributions.
Consistent with this, an investigation of the American Framingham Offspring Study on the generalisability of RRR dietary patterns associated with diabetes risk found relatively good generalisability for the American NHS-derived RRR1 dietary pattern, whereas the risk association for the European-derived RRR2 and RRR3 dietary patterns was much weaker [21]. However, such comparisons of studies are complicated by the application of different dietary questionnaires that are specific to the regional dietary habits and the use of different food groupings. In our study, we observed overall relatively good reproducibility of inverse associations between RRR-derived dietary patterns and diabetes risk, even for the American NHS-derived RRR1 dietary pattern.

Major strengths of our study include the prospective design and the large number of incident cases of type 2 diabetes. The EPIC study was designed to include countries from various areas in Europe, which enabled us to study heterogeneous populations with wide variations in dietary habits, as also reflected by the country differences in adherence to the dietary patterns. A major limitation is that dietary intake was assessed with self-reported questionnaires. Imprecision in the estimated dietary intakes may have led to an attenuation of the association between dietary patterns and diabetes [44]. Further, although our dietary questionnaires showed reasonable validity [28, 29], differential misreporting (a common problem in nutritional epidemiologic studies) may have distorted our findings. However, there was no apparent change in our results when we excluded participants in the top or bottom 1% of the energy intake/energy requirement ratio.

In conclusion, this study on the verification of relations of predefined dietary patterns to diabetes risk suggests that diet quality is of high relevance for primary prevention of type 2 diabetes. We were able to confirm findings from earlier prospective studies showing that adherence to specific RRR-derived dietary patterns, commonly characterised by high intake of fruits or vegetables and low intake of processed meat, sugar-sweetened beverages and refined grains, may lower risk of type 2 diabetes. However, our results do not support existing scientific evidence proposing protective effects of adherence to the aHEI and DASH diet on diabetes risk independent of body size.