Type 2 diabetes (T2D) affects approximately 26 million people in the United States and 366 million people worldwide, and the number will reach an estimated 552 million worldwide by 2030 [1]. Further, management of diabetes and its complications, such as cardiovascular disease, imposes enormous medical and economic burdens [2]. Therefore, primary prevention of T2D has become a public health imperative.

Because of their high content of calcium, magnesium, vitamin D, whey protein and specific fatty acids, dairy products have been suggested to lower risk of T2D [3]. Experimental studies indicated that whey protein has insulinotropic and glucose-lowering properties [4]. Several epidemiologic studies, but not all, have suggested that dairy or calcium consumption was associated with lower risks for hypertension and coronary artery disease [5]. On the other hand, saturated fats in some dairy products might offset the benefits of the potentially protective dairy components [6], whereas other medium-chain saturated and ruminant trans fatty acids could reduce risk [7]. Total dairy product intake was associated with marginally significantly lower risk of T2D in a recent meta-analysis of prospective cohort studies [8]. However, three large Harvard cohorts, the Health Professionals’ Follow-up Study (HPFS), the Nurses’ Health Study (NHS) and NHS II, which account for 38% of participants included in the meta-analysis, have accumulated up to 12 additional years of follow-up since the initial publications, and the analyses have not been updated. Furthermore, the meta-analysis had insufficient data to evaluate most individual dairy subtypes robustly, especially yogurt, which has been linked to lower obesity and may influence gut microbiota through probiotics [9]. In a previous study [10], yogurt consumption was associated with the least weight gain among all the food types in our cohorts. To address these questions, we aimed to update our previous analyses of total dairy and T2D risk with longer duration of follow-up in the three large cohorts and then assess subtypes of dairy consumption in detail. We also conducted an updated meta-analysis of our results from these three cohorts and previous literature.


Study population

We used data from three prospective cohort studies: HPFS, NHS and NHS II. The HPFS was composed of 51,529 US male dentists, pharmacists, veterinarians, optometrists, osteopathic physicians and podiatrists, 40 to 75 years old, who returned a baseline questionnaire in 1986 that inquired about detailed medical history, as well as lifestyle and usual diet. The NHS was initiated in 1976, when 121,700 female registered nurses, 30 to 55 years old, who lived in one of eleven states, completed a baseline questionnaire about their lifestyle and medical history. The NHS II was established in 1989 and consisted of 116,671 younger female registered nurses, 25 to 42 years old, who responded to a baseline questionnaire similar to the NHS questionnaire. Detailed descriptions of the three cohorts have been published elsewhere [11]-[13]. In all three cohorts, questionnaires were administered at baseline and biennially thereafter, to collect and update information on lifestyle practice and occurrence of chronic diseases. The follow-up rates of the participants in these cohorts were all >90%. In the current analysis, we excluded men and women who had diagnoses of diabetes (including type 1 and type 2 diabetes and gestational diabetes only), cardiovascular disease, or cancer at baseline (1986 for HPFS, 1980 for NHS, and 1991 for NHS II, when we first assessed diet in these cohorts) [14]. In addition, we excluded participants who left >70 of the 131 food items blank on the baseline food frequency questionnaire (FFQ) or who reported unusual total energy intakes (that is, daily energy intake <800 or >4,200 kcal/day for men and <500 or >3,500 kcal/day for women). We also excluded participants without baseline information on dairy consumption or follow-up information on diabetes diagnosis date. After exclusions, data from 41,479 HPFS participants, 67,138 NHS participants and 85,884 NHS II participants were available for analysis.
The study protocol was approved by the institutional review boards of Brigham and Women’s Hospital and Harvard School of Public Health. The completion of the self-administered questionnaire was considered to imply informed consent.

Assessment of dairy consumption

In 1980, a 61-item FFQ was administered to the NHS participants to collect information on their usual intake of foods and beverages in the previous year. In 1984, 1986, 1990, 1994, 1998 and 2002, similar but expanded 131-item FFQs were sent to these participants to update their diet records. With the use of the expanded FFQ used in the NHS, dietary data were collected in 1986, 1990, 1994, 1998 and 2002 from the HPFS participants, and in 1991, 1995, 1999 and 2003 from the NHS II participants.

In all FFQs, we asked the participants how often, on average, they consumed a standard portion size of each food. There were nine possible responses, which ranged from ‘never or less than once per month’ to ‘6 or more times per day’. Nutrient intake was calculated by multiplying the frequency of consumption of each food by the nutrient composition in the standard portion size of that food and then summing the nutrient intake from all relevant food items. The food composition database was created primarily from USDA sources [15]. Questionnaire items on dairy products included ‘skim/low fat milk’, ‘whole milk’, ‘ice cream’, ‘yogurt’, ‘cottage/ricotta cheese’, ‘cream cheese’, ‘other cheese’ and ‘cream’. From 1994 in NHS and HPFS and 1995 in NHS II, yogurt consumption was separated into two items, ‘plain yogurt’ (plain or with NutraSweet) and ‘flavored yogurt’ (without NutraSweet). The standard serving size was an 8 oz. glass for skim, low-fat or whole milk; 1 Tbs for cream or sour cream; ½ cup for sherbet, frozen yogurt, ice cream, or cottage or ricotta cheese; and 1 oz. for cream cheese or other cheese. The reproducibility and validity of these FFQs have been shown in detail elsewhere [16]-[20]. The correlation coefficients between the FFQ and multiple dietary records were 0.62 for both low-fat and high-fat dairy foods [20] and, for individual dairy products, ranged from 0.57 for hard cheese to 0.97 for yogurt [16].
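The nutrient calculation described above (frequency multiplied by per-portion content, summed over foods) can be sketched as follows. This is only an illustration: the function name, the food list and the calcium values are hypothetical placeholders, not entries from the USDA-based database used in the study.

```python
def nutrient_intake(servings_per_day, nutrient_per_serving):
    """Sum (frequency x nutrient content per standard portion) over foods.

    servings_per_day: dict mapping food name -> average servings per day.
    nutrient_per_serving: dict mapping food name -> nutrient amount in one
        standard portion (same unit for every food).
    """
    return sum(
        frequency * nutrient_per_serving[food]
        for food, frequency in servings_per_day.items()
    )


# Hypothetical inputs for illustration only (not study or USDA values).
example_frequency = {"skim milk": 1.0, "yogurt": 0.5, "other cheese": 2.0}
example_calcium_mg = {"skim milk": 300.0, "yogurt": 400.0, "other cheese": 200.0}

example_total_mg = nutrient_intake(example_frequency, example_calcium_mg)
```

In practice each of the nine FFQ response categories would first be converted to a servings/day value; that mapping is not given in the text and is therefore left out of the sketch.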

Assessment of covariates

In the biennial follow-up questionnaires, we inquired about and updated information on risk factors for chronic diseases, such as body weight, cigarette smoking, physical activity, medication use and family history of diabetes, as well as history of chronic diseases, including hypertension and hypercholesterolemia. Among NHS and NHS II participants, we ascertained menopausal status, postmenopausal hormone use and oral contraceptive use.

Assessment of diabetes

A supplementary questionnaire about symptoms, diagnostic tests, and hypoglycemic therapy was mailed to participants who reported that they had received a diagnosis of diabetes. In accordance with National Diabetes Data Group criteria, a case of T2D was considered confirmed if at least one of the following was reported on the supplementary questionnaire [21]: 1) one or more classic symptoms (excessive thirst, polyuria, weight loss, hunger) and fasting plasma glucose concentrations ≥7.8 mmol/L or random plasma glucose concentrations ≥11.1 mmol/L; 2) ≥2 elevated plasma glucose concentrations on different occasions (fasting concentrations ≥7.8 mmol/L, random plasma glucose concentrations ≥11.1 mmol/L, and/or concentrations of ≥11.1 mmol/L after ≥2 hours shown by oral-glucose-tolerance testing) in the absence of symptoms; or 3) treatment with hypoglycemic medication (insulin or oral hypoglycemic agent). The diagnostic criteria were changed by the American Diabetes Association in June 1998, and the threshold for the diagnosis of diabetes became a fasting plasma glucose of 7.0 mmol/L, instead of 7.8 mmol/L [22]. Only cases confirmed by the supplemental questionnaires were included. The validity of the supplementary questionnaire for the diagnosis of diabetes has been documented previously. Of the 59 T2D cases in HPFS and 62 cases in NHS who were confirmed by the supplementary questionnaire, 57 (97%) and 61 (98%) were reconfirmed by medical records [23],[24]. Deaths were identified by reports from next of kin or postal authorities, or by searching the National Death Index. At least 98% of deaths among the study participants were identified [10].
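The three confirmation criteria can be written as a small decision function. The sketch below is an assumption-laden illustration: the `SupplementaryReport` record and its field names are hypothetical, and each list entry is assumed to be a glucose measurement from a separate occasion. The fasting threshold is a parameter so that the 7.8 mmol/L cut-off can be replaced by 7.0 mmol/L for cases after June 1998.

```python
from dataclasses import dataclass, field
from typing import List

RANDOM_OR_OGTT_THRESHOLD = 11.1  # mmol/L, as stated in the criteria


@dataclass
class SupplementaryReport:
    """Hypothetical record of one participant's supplementary questionnaire."""
    has_classic_symptom: bool = False
    fasting_glucose: List[float] = field(default_factory=list)  # mmol/L
    random_glucose: List[float] = field(default_factory=list)   # mmol/L
    ogtt_2h_glucose: List[float] = field(default_factory=list)  # mmol/L
    on_hypoglycemic_medication: bool = False


def is_confirmed_t2d(report, fasting_threshold=7.8):
    """Return True if at least one of the three criteria is met."""
    n_fasting = sum(g >= fasting_threshold for g in report.fasting_glucose)
    n_random = sum(g >= RANDOM_OR_OGTT_THRESHOLD for g in report.random_glucose)
    n_ogtt = sum(g >= RANDOM_OR_OGTT_THRESHOLD for g in report.ogtt_2h_glucose)

    # 1) classic symptom plus an elevated fasting or random glucose
    criterion_1 = report.has_classic_symptom and (n_fasting + n_random) >= 1
    # 2) two or more elevated values on different occasions, no symptoms
    criterion_2 = (not report.has_classic_symptom) and (
        n_fasting + n_random + n_ogtt
    ) >= 2
    # 3) treatment with insulin or an oral hypoglycemic agent
    criterion_3 = report.on_hypoglycemic_medication
    return criterion_1 or criterion_2 or criterion_3
```

For example, `is_confirmed_t2d(report, fasting_threshold=7.0)` applies the revised fasting cut-off to the same record.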

Statistical analysis

We calculated each individual’s person-years from the date of return of the baseline questionnaire to the date of diagnosis of T2D, death, or the end of the follow-up (31 January 2010 for HPFS, 30 June 2010 for NHS or 30 June 2009 for NHS II), whichever came first. We used time-dependent Cox proportional hazards regression to estimate the hazard ratio (HR) for dairy consumption in relation to the risk of T2D. Our basic model (Model 1) simultaneously controlled for age, calendar time with updated information at each two-year questionnaire cycle, body mass index (BMI), and total energy intake. Model 2 also adjusted for various potential confounding factors, including race, smoking, physical activity, alcohol consumption, menopausal status and menopausal hormone use (NHS and NHS II participants only), oral contraceptive use (NHS II participants only), family history of diabetes and diagnosed hypertension or hypercholesterolemia at baseline. Model 3 further adjusted for trans-fat, glycemic load, and intakes of red and processed meat, nuts, sugar-sweetened beverages (SSBs) and coffee. For individual dairy foods, we additionally adjusted for other types of dairy in Model 3.
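The person-time rule in the first sentence above (follow-up ends at diagnosis, death or the administrative end date, whichever comes first) can be sketched as below. The function name is hypothetical, and dividing by 365.25 days per year is an assumption made for the illustration; the text does not state how days were converted to years.

```python
from datetime import date
from typing import Optional

DAYS_PER_YEAR = 365.25  # assumed conversion, not stated in the text


def person_years(
    baseline: date,
    end_of_followup: date,
    diagnosis: Optional[date] = None,
    death: Optional[date] = None,
) -> float:
    """Years from baseline to the earliest of diagnosis, death or study end."""
    candidates = [d for d in (diagnosis, death, end_of_followup) if d is not None]
    exit_date = min(candidates)
    return (exit_date - baseline).days / DAYS_PER_YEAR


# Hypothetical HPFS participant: baseline returned 31 January 1986,
# T2D diagnosed 31 January 1998, administrative end 31 January 2010.
example_py = person_years(
    baseline=date(1986, 1, 31),
    end_of_followup=date(2010, 1, 31),
    diagnosis=date(1998, 1, 31),
)
```

The Cox models themselves were fitted in SAS; this sketch covers only the bookkeeping of follow-up time.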

We used the cumulative average of dietary intakes from baseline to the censoring events in order to best represent long-term diet and minimize within-person variation [14]. In our primary analysis we stopped updating dietary intake when participants developed coronary heart disease, stroke or cancer because changes in diet after development of these conditions may confound the relationship between diet and diabetes [14],[25]. In a further analysis, we also stopped updating dietary information after self-reported diagnosis of hypertension or hypercholesterolemia during follow-up, because these diagnoses appeared to alter consumption of dairy products (see Results).
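A cumulative average with a stop-updating rule can be sketched as follows. The function and the `last_ffq_year_used` parameter are hypothetical names; how an FFQ returned in the same cycle as an intermediate diagnosis is handled is an assumption here, since the text does not spell it out.

```python
def cumulative_averages(intakes, last_ffq_year_used=None):
    """Cumulative mean intake at each FFQ cycle.

    intakes: chronological list of (ffq_year, servings_per_day).
    last_ffq_year_used: if given, FFQs after this year are ignored and the
        last cumulative average is carried forward (diet no longer updated).
    Returns a list of (ffq_year, cumulative_average).
    """
    results = []
    running_total = 0.0
    n_used = 0
    for year, value in intakes:
        if last_ffq_year_used is None or year <= last_ffq_year_used:
            running_total += value
            n_used += 1
        if n_used == 0:
            raise ValueError("no FFQ available at or before the stop year")
        results.append((year, running_total / n_used))
    return results


# Hypothetical yogurt intakes (servings/day) on three FFQs.
example_ffqs = [(1986, 1.0), (1990, 2.0), (1994, 3.0)]
example_updated = cumulative_averages(example_ffqs)
example_stopped = cumulative_averages(example_ffqs, last_ffq_year_used=1990)
```

With the stop rule, the 1994 questionnaire no longer changes the exposure, which is the mechanism used in the text to limit the influence of diet changes made after a diagnosis.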

The proportional hazards assumption was tested by including an interaction term between dairy intake and months to event as a time-dependent variable (P >0.05 for all tests). To test for linear trend, the median value was assigned to each quintile and this value was modeled as a continuous variable. All the analyses were conducted separately in each cohort, and we also conducted meta-analyses to summarize the estimates of association across the three studies. No significant heterogeneity was observed when the results were pooled across the three cohorts; therefore, fixed-effect models were used. All statistical tests were two-sided and performed using SAS version 9.2 for UNIX (SAS Institute Inc, Cary, NC, USA).
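Fixed-effect pooling of cohort-specific HRs is usually done by inverse-variance weighting on the log scale. The sketch below assumes that approach and recovers each standard error from the reported 95% CI; it is an illustration in Python, not the SAS code used for the analysis, and the input values are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling of hazard ratios.

    estimates: list of (hr, ci_low, ci_high) tuples, one per cohort.
    Returns (pooled_hr, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, ci_low, ci_high in estimates:
        # Standard error of log(HR) recovered from the CI width.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical example: three cohorts reporting the same estimate.
example_pooled = pool_fixed_effect([(0.90, 0.80, 1.0125)] * 3)
```

When the cohorts agree, the pooled HR equals the common estimate and the interval narrows, which is the expected behavior of a fixed-effect model.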

Updated meta-analysis on dairy products and risk of incident T2D

We further conducted an updated meta-analysis that incorporated our new results from the three cohorts into the findings of previous studies. This meta-analysis was conducted following a review protocol [26]. For study selection, we included prospective studies with cohort, case-cohort or nested case-control design investigating the association between intake of dairy products and risk of T2D. The two recent meta-analyses involved searches of the literature up to March 2013 [27] and June 2013 [8]. Thus, we conducted additional literature searches on MEDLINE [28] and EMBASE [29] from June 2013 to October 2013 [see Additional file 1]. In studies that reported the intakes by grams, we used 177 g as a serving size for total dairy products, and 244 g as a serving size for milk and yogurt intake to recalculate the intakes to a common scale (servings/day).
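The conversion to servings/day can be sketched with the serving sizes given above. The second function, which rescales a relative risk reported per some number of grams to a per-serving relative risk, assumes a log-linear dose-response relationship; the text does not say how reported estimates were rescaled, so that part is an assumption for illustration.

```python
# Serving sizes in grams, as stated in the text.
SERVING_GRAMS = {"total dairy": 177.0, "milk": 244.0, "yogurt": 244.0}


def grams_to_servings(grams_per_day, food):
    """Convert an intake in g/day to servings/day for the given food group."""
    return grams_per_day / SERVING_GRAMS[food]


def rescale_rr(rr, reported_grams, target_grams):
    """Rescale an RR per `reported_grams` to an RR per `target_grams`.

    Assumes log(RR) is linear in intake (an assumption of this sketch).
    """
    return rr ** (target_grams / reported_grams)


example_servings = grams_to_servings(354.0, "total dairy")
# An RR reported per 200 g of yogurt, expressed per 244 g serving.
example_rr_per_serving = rescale_rr(0.78, 200.0, SERVING_GRAMS["yogurt"])
```

Because a 244 g serving is larger than 200 g, a protective RR per 200 g maps to a slightly smaller RR per serving under this assumption.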


We documented a total of 15,156 cases of incident diabetes mellitus (DM), including 3,364 cases during a maximum of 24 years of follow-up in the HPFS, 7,841 cases during a maximum of 30 years in the NHS, and 3,951 cases during a maximum of 16 years in the NHS II. For both men and women, total dairy intake was inversely associated with smoking, hypertension and hypercholesterolemia, but positively associated with physical activity, and fruit and vegetable intakes (Table 1). Different types of dairy products were moderately correlated (Spearman correlation coefficients from -0.13 to 0.27 in the three cohorts).

Table 1 Baseline age-adjusted characteristics of participants in the three cohorts according to quintile of total dairy consumption a

Total dairy consumption was not associated with risk of T2D in age- and multivariate-adjusted models across the three cohorts (all P for trend >0.05), as shown in Table 2. In the pooled analysis of estimates from the three studies that used fixed-effect models, in the age-, BMI- and energy-adjusted model, one serving/day increment of dairy consumption was significantly associated with a 4% lower risk (95% confidence interval (CI): 2%, 6%); however, further adjustment for lifestyle and other dietary factors attenuated the association to null with the HR of a one serving/day increase of 0.99 (95% CI: 0.98, 1.01). The cohort-specific and combined spline analyses (Figure 1) based on multivariate models also indicated a null association between total dairy consumption and T2D risk. No interactions of total dairy consumption with age, BMI, vitamin D level, physical activity level and diabetes family history were observed [see Additional file 1: Table S1].

Table 2 HRs (95% CI) of type 2 diabetes risk according to quintile of total dairy intake in HPFS, NHS I and NHS II
Figure 1

Dose-response relationship between total dairy intake and risk of type 2 diabetes in HPFS, NHS I and NHS II using the multivariate model. a) NHS. b) NHS II. c) HPFS. d) Pooled. HPFS, Health Professionals Follow-up Study; NHS, Nurses’ Health Study.

When we examined the association with dairy products categorized by their fat contents, there were no significant associations between either low-fat or high-fat dairy intake and risk of T2D (Table 3). The associations between individual types of dairy products and risk of T2D were further assessed, as shown in Table 4. All subtypes of dairy products were mutually adjusted for each other in the multivariate models. In the pooled analysis of estimates from the three studies that used fixed-effects models, each one serving/day increase of skim milk, cheese and whole milk was associated with a 2% (95% CI: -1%, 4%), 7% (95% CI: 3%, 11%) and 10% (95% CI: 4%, 16%) higher risk of T2D, respectively (All P for trend <0.05). Conversely, greater yogurt and ice cream intakes were significantly associated with lower risk of T2D with an HR of 0.83 (95% CI: 0.75, 0.92) and 0.78 (95% CI: 0.71, 0.86), respectively. No significant interaction of yogurt consumption with baseline BMI was observed [see Additional file 1: Table S3]. In addition, higher consumption of either plain (HR for each serving: 0.96, 95%CI: 0.88, 1.06) or flavored yogurt (HR: 0.88, 95%CI: 0.77, 1.01) was associated with a nonsignificantly lower risk of T2D.

Table 3 Relative risk (RR) of type 2 diabetes among men according to low-fat versus high-fat dairy food intake
Table 4 Multivariate relative risk (RR) of type 2 diabetes among men and women according to intakes of dairy foods

In a further analysis, we additionally stopped updating dietary information after self-reported diagnosis of hypertension or hypercholesterolemia during follow-up, because consumption of ice cream decreased and consumption of skim milk increased after diagnosis of these conditions in our three cohorts [see Additional file 1: Table S4]. As shown in Table 5, the significant associations of skim milk, cheese and whole milk with risk of T2D became null, with corresponding HRs of 1.01 (95%CI: 0.99, 1.03), 1.03 (95%CI: 0.99, 1.07) and 1.03 (95%CI: 0.99, 1.07), respectively (all P-trend >0.05). The inverse association between ice cream and T2D risk was attenuated but remained significant, with an HR of 0.89 (95%CI: 0.83, 0.96). In contrast, the inverse association between yogurt intake and risk of T2D remained significant, with an HR of 0.86 (95%CI: 0.78, 0.94) per one serving/day increment.

Table 5 Multivariate relative risk (RR) of type 2 diabetes among men and women according to specific dairy foods using different methods of updating diets

By incorporating our new results from the three cohorts together with the findings of previous studies, we conducted an updated meta-analysis. Our updated search on MEDLINE and EMBASE found 513 potential citations, of which one study [30] met the inclusion criteria, in addition to the citations in the two previous meta-analyses. Therefore, a total of eleven prospective studies [30]-[40] for total dairy and six [32]-[34],[36]-[38] for yogurt were included in our updated meta-analysis, along with results from our current analysis. The characteristics of the included studies are shown in Additional file 1: Table S3. Total dairy intake was not significantly associated with risk of T2D whereas yogurt intake was associated with a significantly lower risk of T2D, as shown in Figures 2 and 3. Significant heterogeneity was shown for both total dairy (I² = 58.8%; P = 0.003) and yogurt (I² = 63.2%; P = 0.005). The RRs (95% CIs) from the random-effects model for one serving/day of total dairy intake and one serving/day of yogurt intake were 0.98 (0.96, 1.01) and 0.82 (0.70, 0.96), respectively. The RRs (95% CIs) from the fixed-effects model for one serving/day of total dairy intake and one serving/day of yogurt intake were 0.99 (0.98, 1.00) and 0.84 (0.78, 0.90), respectively.
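The heterogeneity statistic and a random-effects summary can be sketched as below. The sketch computes Cochran's Q, I² and a DerSimonian-Laird between-study variance; the choice of the DerSimonian-Laird estimator is an assumption, since the text does not name the random-effects method, and the inputs shown are hypothetical rather than the study-level estimates.

```python
import math


def random_effects_summary(log_rrs, standard_errors):
    """Cochran's Q, I-squared (%) and a DerSimonian-Laird pooled RR."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    weight_sum = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_rrs)) / weight_sum

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of between-study variance (tau^2).
    c = weight_sum - sum(w ** 2 for w in weights) / weight_sum
    tau_squared = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau_squared) for se in standard_errors]
    pooled = sum(w * y for w, y in zip(re_weights, log_rrs)) / sum(re_weights)
    return {
        "q": q,
        "i_squared": i_squared,
        "tau_squared": tau_squared,
        "rr": math.exp(pooled),
        "se": math.sqrt(1.0 / sum(re_weights)),
    }


# Hypothetical inputs: two studies with clearly different estimates.
example_summary = random_effects_summary(
    [math.log(0.8), math.log(1.0)], [0.05, 0.05]
)
```

When the studies disagree, tau² is positive and the random-effects weights become more equal across studies, which widens the pooled interval relative to the fixed-effects model.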

Figure 2

HRs for one serving/day of total dairy consumption and type 2 diabetes. The RR of each study is represented by a square, and the size of the square represents the weight of each study in the overall estimate. The 95% CIs are represented by the horizontal lines, and the diamond represents the overall estimate and its 95% CI. HPFS, Health Professionals Follow-up Study; NHS, Nurses’ Health Study; CI, confidence interval; HRs, hazard ratios; RR, relative risk.

Figure 3

HRs for one serving/day of yogurt consumption and type 2 diabetes. The RR of each study is represented by a square, and the size of the square represents the weight of each study in the overall estimate. The 95% CIs are represented by the horizontal lines, and the diamond represents the overall estimate and its 95% CI. HPFS, Health Professionals Follow-up Study; NHS, Nurses’ Health Study; CI, confidence interval; HRs, hazard ratios; RR, relative risk.

After two studies [30],[31] that did not adjust for total energy intake and other main dietary confounders were excluded, the heterogeneity for total dairy decreased (I² = 35.6%; P = 0.11); additionally, heterogeneity for yogurt was reduced when one study [36] was removed (I² = 40.7%; P = 0.063). Neither exclusion materially changed the summary RRs. On the basis of a funnel plot [see Additional file 1: Figure S2] and Begg’s test, no significant publication bias was shown for the association between total dairy (P = 0.19) or yogurt (P = 0.92) intake and risk of T2D.

We also classified studies according to duration of follow-up as short-term (≤10 years) or long-term (>10 years). Total dairy consumption was marginally significantly associated with a lower T2D risk in the short-term studies (RR: 0.95, 95%CI: 0.91, 1.00) but not in the long-term studies (RR: 1.00, 95%CI: 0.98, 1.02). Yogurt consumption was associated with a lower T2D risk in both short-term (RR: 0.86, 95%CI: 0.69, 1.08) and long-term studies (RR: 0.76, 95%CI: 0.58, 0.98), although the confidence interval for the short-term studies included 1.


In three prospective cohorts of US men and women, we found that intakes of total dairy products were not significantly associated with the risk of T2D, but higher consumption of yogurt was significantly associated with a lower risk. An updated meta-analysis of our cohorts and published literature suggests a marginally lower risk of T2D with higher dairy consumption and a consistent inverse association between yogurt consumption and T2D risk.

Dairy is a complex food with many bioactive compounds that have divergent health effects, and its association with T2D has attracted much attention [5]. Our results on total dairy intake and T2D risk are consistent with some [33],[34],[36],[37],[40], but not all previous studies. Total dairy consumption was associated with a lower risk of T2D in our earlier investigations in HPFS [11] and NHS [12], but not in NHS II [13]. The discrepancy between our earlier and current results is probably due to longer follow-up (10 more years) of the NHS and HPFS cohorts, and our meta-analysis suggests that potential benefits of dairy were less evident with long-term follow-up. For yogurt consumption, we observed a consistent and robust inverse association with T2D in our cohorts and the meta-analysis. A previous meta-analysis [8] reported a similar but nonsignificant risk estimate for development of T2D associated with 200 g yogurt consumption, with an RR of 0.78 (95%CI: 0.60, 1.02). Our updated meta-analysis suggested that each one serving/day increase in yogurt was significantly associated with an 18% lower risk. Surprisingly, consumption of ice cream was inversely associated with T2D risk whereas skim milk was associated with higher T2D risk; however, these associations either became null or were attenuated in further analysis when we stopped updating dietary information after self-reported diagnosis of hypertension or hypercholesterolemia during the follow-up. Since consumption of ice cream decreased but consumption of skim milk increased after diagnosis of hypertension or hypercholesterolemia in our three cohorts [see Additional file 1: Table S4], reverse causation may explain the findings that did not take into account changes in diet after diagnosis of these conditions.

Certain components in dairy products, such as calcium, vitamin D, magnesium, lactose and dairy protein, have been suggested to have a favorable impact on metabolic factors, including body weight, hypertension [41],[42] and glucose homeostasis [43]. Calcium supplementation was shown to produce a small but significant reduction in body weight compared with placebo in a recent meta-analysis [44] of seven trials, but the largest study [45] included in that meta-analysis did not find a significant effect of two-year calcium supplement use compared with placebo. Conjugated linoleic acid, created by bacteria in the gut of ruminants, has been shown to reduce body weight in animals [46]. However, findings from randomized trials did not provide clear support for a role of dairy products in weight reduction [47]. Milk proteins, such as whey, may have insulinotropic properties with a relatively low glycemic load (GL), which may improve glucose tolerance [48]. Circulating trans-palmitoleate concentrations [7] have been inversely associated with insulin resistance, atherogenic dyslipidemia and incident diabetes. Whole-fat dairy product consumption was strongly associated with higher trans-palmitoleate, which may offset the unfavorable effect of saturated fat in high-fat dairy product intake.

Several mechanisms may explain the inverse association between yogurt intake and risk of T2D. Probiotic bacteria have been shown to improve lipid profile and antioxidant status in T2D patients [49],[50] and have beneficial effects on cholesterol levels [51]. In addition, our previous study [10] of the three cohorts showed that increased consumption of yogurt was inversely associated with weight gain. However, adjusting for BMI in the multivariate model did not alter the inverse association between yogurt intake and T2D risk.

The strengths of the current study include a large sample size, high rates of follow-up and repeated assessments of dietary and lifestyle variables. The current study was subject to several limitations as well. First, our study populations primarily consisted of health professionals of European ancestry. Although the homogeneity of socioeconomic status helps reduce confounding, the observed associations may not be generalizable to other populations. However, the relatively high educational status is an advantage because high quality and reliable data can be collected from our study participants. Second, because diet was assessed by FFQs, some measurement error of dairy intake assessment is inevitable. However, the FFQs used in these studies were validated against multiple diet records, and reasonable correlation coefficients between these assessments of dairy intake were observed. Moreover, we calculated cumulative averages for dietary variables to minimize the random measurement error caused by within-person variation and to accommodate diet changes over time. Nonetheless, since we did not specifically assess types or brands of yogurt consumed by the participants, it is difficult to attribute the observed benefits to various components of yogurt. Lastly, because of the observational nature of our cohorts, the observed associations do not necessarily mean causation; although we adjusted for established and potential risk factors for T2D, unmeasured and residual confounding is still possible. This is especially true for yogurt consumption, which is typically associated with a healthy diet and lifestyle.


We found that higher intake of yogurt is associated with a reduced risk of T2D, whereas other dairy foods and consumption of total dairy are not appreciably associated with incidence of T2D. The consistent findings for yogurt suggest that it can be incorporated into a healthy dietary pattern. However, randomized clinical trials are warranted to further examine the causal effects of yogurt consumption as well as probiotics on body weight and insulin resistance.

Authors’ contributions

MC, QS, EG, DM, JEM, WCW, and FBH conceived and designed the experiments. MC, QS, EG, DM, JEM, WCW, and FBH performed the experiments. MC analyzed the data. MC, QS, EG, DM, JEM, WCW, and FBH contributed reagents/materials/analysis tools. MC wrote the first draft of the manuscript. MC, QS, EG, DM, JEM, WCW, and FBH contributed to the writing of the manuscript. All authors revised the manuscript for important intellectual content. All authors read and approved the final manuscript.

Additional file