Physical activity and incident type 2 diabetes mellitus: a systematic review and dose–response meta-analysis of prospective cohort studies

Aims/hypothesis Inverse associations between physical activity (PA) and type 2 diabetes mellitus are well known. However, the shape of the dose–response relationship is still uncertain. This review synthesises results from longitudinal studies in general populations and uses non-linear models of the association between PA and incident type 2 diabetes. Methods A systematic literature search identified 28 prospective studies on leisure-time PA (LTPA) or total PA and risk of type 2 diabetes. PA exposures were converted into metabolic equivalent of task (MET) h/week and marginal MET (MMET) h/week, a measure only considering energy expended above resting metabolic rate. Restricted cubic splines were used to model the exposure–disease relationship. Results Our results suggest an overall non-linear relationship; using the cubic spline model we found a risk reduction of 26% (95% CI 20%, 31%) for type 2 diabetes among those who achieved 11.25 MET h/week (equivalent to 150 min/week of moderate activity) relative to inactive individuals. Achieving twice this amount of PA was associated with a risk reduction of 36% (95% CI 27%, 46%), with further reductions at higher doses (60 MET h/week, risk reduction of 53%). Results for the MMET h/week dose–response curve were similar for moderate intensity PA, but benefits were greater for higher intensity PA and smaller for lower intensity activity. Conclusions/interpretation Higher levels of LTPA were associated with substantially lower incidence of type 2 diabetes in the general population. The relationship between LTPA and type 2 diabetes was curvilinear; the greatest relative benefits are achieved at low levels of activity, but additional benefits can be realised at exposures considerably higher than those prescribed by public health recommendations. 
Electronic supplementary material The online version of this article (doi:10.1007/s00125-016-4079-0) contains peer-reviewed but unedited supplementary material, which is available to authorised users.


Introduction
High fasting plasma glucose was recently ranked as the fifth leading risk for death [1] and 6.8% of global excess mortality was attributed to diabetes [2]. Prevalence of this metabolic disorder is predicted to reach nearly 600 million cases by 2035 [3], posing both a substantial morbidity and mortality burden and a large financial cost on individuals and healthcare systems [4,5].
Evidence on the effects of physical activity (PA) on risk of diabetes arises from interventional [6][7][8][9] and observational studies [10][11][12][13][14]. Prevention trials conducted in patients with impaired glucose tolerance provide some understanding of the extent to which PA may confer a preventive effect on progression to type 2 diabetes in high-risk populations [6][7][8][9][15]. However, the majority of these studies include both diet and PA interventions, and isolation of the impact of PA itself is rarely possible. It is also difficult to evaluate the benefit of the whole PA exposure continuum from trials, as most intervention studies focus on shifting participants' behaviours towards the recommended levels of exercise rather than assessing the benefits of changes at the lowest ends of the normal PA spectrum, or the additional benefits gained at the highest level. Therefore, although associated with a higher risk of confounding, evidence from cohort studies in the general population can provide complementary evidence of the dose-response relationship between PA and diabetes, independent of diet.
Public health guidelines [16,17] recommend a minimum of 150 min of moderate to vigorous PA (MVPA) or 75 min vigorous PA (VPA) a week to maintain general health. Self-report data suggest that around a third of adults globally are not meeting these targets [18]. A fundamental consideration in the formulation of PA guidelines, however, is the nature of the dose-response relationship between PA and noncommunicable disease incidence.
Dose-response curves for PA and health outcomes, ranging from cardiovascular disease to all-cause mortality, suggest a non-linear dose-response shape [19][20][21][22][23][24], often with large gains when low activity is compared with completely sedentary but much smaller additional benefits beyond that. A recent review suggested a non-linear relationship between PA and diabetes. However, it found differently shaped dose-response curves based on the different ways in which PA was reported in the original studies [25]. Each of the dose-response analyses only included a small portion of the total studies available in this area of research, owing to a lack of data harmonisation and leaving considerable uncertainty about the relative risk for any given exposure since not all of the evidence could be considered.
Providing quantitative estimates regarding the dose-response relationship is essential for approximating how changes in levels of PA in the general population would impact disease incidence, and would support more nuanced guidance to the public and evidence-based dialogue in clinical settings.
Calculating the dose of PA is associated with considerable uncertainty and can be achieved using a variety of methods. In deciding how to equate activities of varying intensity, one issue is whether to include the resting metabolic rate. In this review we investigate the dose-response relationship between PA and type 2 diabetes via a systematic review and dose-response meta-analysis. We report results quantifying PA dose, both via inclusion and exclusion of the resting metabolic rate in the summation of PA volume.

Methods
Search strategy PubMed and EMBASE were searched for prospective cohort studies on the association between PA and type 2 diabetes using a combination of medical subject heading (MeSH) and indexed terms (details in electronic supplementary material [ESM] Fig. 1). Search filters for observational studies were applied to refine the search output. The reference lists of past systematic reviews were manually searched for further studies [26][27][28][29][30][31][32]. No restrictions on date of publication were set and new results were included up until December 2015.
Eligibility criteria Prospective studies were included if they: (1) followed a cohort of adults; (2) excluded individuals with type 2 diabetes at baseline; (3) ascertained levels of leisure-time PA (LTPA) or total PA at baseline; and (4) reported RRs, ORs or HRs for incidence of type 2 diabetes. Exclusion criteria were: (1) studies which reported insufficient detail of PA assessment to estimate PA dose in metabolic equivalent of task (MET) h/week; (2) studies using measures of fitness as the exposure; (3) studies reporting PA as a dichotomous variable; and (4) duplicate data.
Two researchers (ADS and BR-S) screened titles and abstracts for eligibility according to the pre-specified criteria. When eligibility was ambiguous, the full text was retrieved. To ensure no duplicate data were included, cohort name, recruitment periods or protocols were compared, and only the most complete publication was included. A third researcher (O. Olayinka, London School of Hygiene and Tropical Medicine, London, UK) assessed the identified articles and any disagreements were discussed until consensus was reached. A breakdown of the literature search is shown in ESM Fig. 2.
Data extraction and exposure harmonisation Data were extracted (by ADS) from eligible studies on first author, publication date, geographical location, cohort size, sex and age characteristics, cumulative incidence or incidence rate of type 2 diabetes, case count per category of PA exposure, total persons or person-years per PA category, method and unit of PA assessment, reported levels of PA exposure, ORs/RRs/HRs for type 2 diabetes with 95% CIs for each PA category, and covariates for which the analyses were adjusted. Overall study quality score was derived using the Newcastle Ottawa Scale (NOS); inter-rater reliability (between ADS and O. Olayinka) was 86% (full NOS results are shown in ESM Table 1).
In prospective studies where HRs or ORs for type 2 diabetes were reported, we assumed these approximated the RR [33]. We pooled the most adjusted risk estimates both including and excluding adjustment for BMI. Initially we harmonised group-level exposure estimates to the common unit of MET h/week, thereby allowing integration of activities differing in intensity and duration amassed over the course of a week. For the assignment of specific intensities to categories of PA exposure, average intensity of MVPA and VPA was defined as 4.5 and 8 METs (or 3.5 and 7 marginal METs [MMETs]), respectively [34]. Studies reporting data independently for men and women [35][36][37][38][39] or for multiple cohorts within a study [35] were treated as separate observations. Studies reporting risk estimates relative to the highest category of PA [36,40-42] were re-calculated to set the lowest PA category as the referent [43].
When not directly reported, classic PA volume (MET h/week) was calculated by multiplication of the median or mid-point duration of the reported category with its assigned gross MET value. Open-ended categories for average LTPA duration were converted to point estimates by assuming that the median of the open-ended category was equidistant from the lower category boundary as half the interval width in the neighbouring category [44]. For one study that reported PA as PA level (PAL, a measure of energy expenditure expressed as a multiple of 24 h resting metabolic rate), an approximation of LTPA MET h/week was performed using descriptions of typical PA levels for each category [45]. If PA was reported only as frequency of sessions per week, a single session was assumed to consist of 45 min in the main analysis with an assumption of 30 min tested in sensitivity analysis. Likewise, if only average duration for PA (e.g. walking, cycling) was reported, we assumed this was undertaken at an intensity of 4.5 METs. Marginalised PA volume (MMET h/week) was calculated by discounting the resting metabolic rate of 1 MET in the quantification of PA intensity. An overview of dose assignment calculations is shown in ESM Table 2. For summary data, we subtracted 1 MET h from each 1 h increment over which total reported activity was performed. When the required data were not reported in the original articles we emailed authors from the identified cohorts to acquire further details, e.g. on duration of PA and number of type 2 diabetes cases for each PA exposure category. Following correspondence, updated follow-up data [11,13] and further details on PA behaviour [11,38,46,47] were obtained.
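The dose assignment rules described above can be illustrated with a minimal sketch. The function and constant names below are hypothetical and are not taken from the original analysis; the intensity values (4.5 and 8 METs), the 45 min session assumption and the 1 MET resting discount are those stated in the text.

```python
# Illustrative sketch only: all names are hypothetical, not the authors' code.

MVPA_MET = 4.5      # gross intensity assigned to moderate-to-vigorous PA
VPA_MET = 8.0       # gross intensity assigned to vigorous PA
RESTING_MET = 1.0   # resting metabolic rate discounted in the MMET metric


def met_hours_per_week(hours_per_week, gross_met=MVPA_MET):
    """Classic PA volume: weekly duration multiplied by gross intensity."""
    return hours_per_week * gross_met


def mmet_hours_per_week(hours_per_week, gross_met=MVPA_MET):
    """Marginal PA volume: intensity counted above the 1 MET resting rate."""
    return hours_per_week * (gross_met - RESTING_MET)


def mmet_from_summary(met_hours, hours_per_week):
    """For summary data: subtract 1 MET h for every hour of activity."""
    return met_hours - RESTING_MET * hours_per_week


def open_ended_point(lower_bound, neighbour_lower, neighbour_upper):
    """Point estimate for an open-ended top category: its lower bound plus
    half the interval width of the neighbouring closed category."""
    return lower_bound + (neighbour_upper - neighbour_lower) / 2.0


def hours_from_sessions(sessions_per_week, minutes_per_session=45):
    """Weekly duration when only session frequency is reported
    (45 min per session in the main analysis, 30 min in sensitivity analysis)."""
    return sessions_per_week * minutes_per_session / 60.0


if __name__ == "__main__":
    hours = 150 / 60                       # 150 min/week of moderate activity
    print(met_hours_per_week(hours))       # classic volume in MET h/week
    print(mmet_hours_per_week(hours))      # marginal volume in MMET h/week
```

For example, 150 min/week at 4.5 METs corresponds to 11.25 MET h/week and 8.75 MMET h/week under these rules, which matches the equivalence quoted in the Results.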
Statistical analysis Generalised least-squares (GLS) regression was performed to estimate study-specific dose-response curves. GLS regression estimates the linear dose-response coefficients taking into account the covariance for each exposure category within each study, as they are estimated relative to a common referent PA exposure category [48,49]. Study-specific dose-response coefficients were pooled using the DerSimonian-Laird estimator in a random-effects model [50]. First, a linear association was assumed; study-specific RR estimates were calculated per 10 MET h/week increment and subsequently pooled. Two cohorts [51,52] did not provide sufficient data to be included in this model. However, variance-weighted least-squares regression analysis was used to estimate linear associations for both of these studies, allowing us to quantify the influence of excluding these on the overall effect estimates.
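As an illustration of the pooling step, the following is a minimal sketch of the DerSimonian-Laird random-effects estimator applied to study-specific log RRs (for example, per 10 MET h/week). It is not the Stata routine used in the analysis, the function name is hypothetical, and the input values in the usage example are invented for demonstration rather than taken from the included cohorts.

```python
import math


def dersimonian_laird(log_rrs, std_errors):
    """Pool study-specific log RRs with the DerSimonian-Laird estimator.

    Returns (pooled_log_rr, pooled_se, tau2, i_squared_percent).
    """
    k = len(log_rrs)
    w = [1.0 / se ** 2 for se in std_errors]            # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rrs)) / sum_w

    # Cochran's Q and the moment estimate of between-study variance (tau^2)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rrs))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_rrs)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))

    # I^2: share of total variation attributable to heterogeneity
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, tau2, i2


if __name__ == "__main__":
    # Invented example values, not data from the review
    est, se, tau2, i2 = dersimonian_laird([-0.10, -0.30], [0.10, 0.10])
    lo, hi = est - 1.96 * se, est + 1.96 * se
    print(math.exp(est), math.exp(lo), math.exp(hi), tau2, i2)
```

The pooled log RR is exponentiated to give the RR per increment, and the I² statistic returned alongside it is the heterogeneity measure reported with the pooled estimates.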
Sensitivity analyses were conducted by consecutive removal of individual studies from the summary risk estimate and via restriction to high-quality studies. The impact of duration and intensity assumptions (when necessary) was assessed by applying lower values. Subgroup analysis by sex, study location, cohort size and follow-up time was undertaken. Mediation by BMI was explored according to the degree of adjustment (BMI adjusted vs non-BMI adjusted) and participant obesity (BMI < 30 vs > 30 kg/m²). To further reduce heterogeneity, we separately pooled risk estimates that either focused on LTPA or the more inclusive measures of total PA. Significance of subgroup and sensitivity analysis was judged by the p value for heterogeneity [53].
In addition, we examined possible non-linear associations by modelling PA using restricted cubic spline with three knots located at the 25th, 50th and 75th percentiles of the distribution. Only studies reporting risk estimates for at least three PA exposure levels for incident type 2 diabetes [54] were included in this analysis. Departure from linearity of the final cubic spline model was assessed using the Wald test for nonlinearity [55].
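To make the spline specification concrete, the sketch below builds a restricted cubic spline basis with three knots, using the standard truncated-power construction (linear below the first knot and beyond the last). The function names are hypothetical; the scaling by the squared knot range is a common convention and is an assumption here, as is the use of Python's default quantile method, which may not match the percentile definition used in Stata.

```python
import statistics


def quartile_knots(doses):
    """Knots at the 25th, 50th and 75th percentiles of the dose distribution."""
    return tuple(statistics.quantiles(doses, n=4))


def rcs_basis_three_knots(x, knots):
    """Return (linear_term, nonlinear_term) of a three-knot restricted
    cubic spline evaluated at dose x."""
    t1, t2, t3 = knots

    def cube_plus(u):
        # Truncated cubic: u^3 when u > 0, otherwise 0
        return u ** 3 if u > 0 else 0.0

    nonlinear = (
        cube_plus(x - t1)
        - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
        + cube_plus(x - t3) * (t2 - t1) / (t3 - t2)
    ) / (t3 - t1) ** 2
    return x, nonlinear


if __name__ == "__main__":
    knots = quartile_knots([1, 2, 3, 4, 5, 6, 7])   # invented dose values
    for dose in (1.0, 3.0, 5.0, 8.0):
        print(dose, rcs_basis_three_knots(dose, knots))
```

With three knots the model has only two dose terms per study, so a test that the coefficient on the non-linear term is zero serves as the test for departure from linearity.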
Publication bias was investigated by funnel plot and Egger's test for asymmetry. All reported p values were two sided. All analyses were performed using Stata 13.1 (Stata Corp, College Station, TX, USA). Interactive dose-response curves were visualised using R (R Foundation for Statistical Computing, Vienna, Austria) [56].
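For readers unfamiliar with Egger's test, the following minimal sketch shows the classic regression formulation: each study's standardised effect (log RR divided by its standard error) is regressed on its precision (the inverse of the standard error), and an intercept that departs from zero indicates funnel plot asymmetry. The function name is hypothetical, the sketch is not the Stata command used in the analysis, and the example inputs are invented.

```python
import math


def egger_regression(log_rrs, std_errors):
    """Ordinary least-squares fit of standardised effect on precision.

    Returns (intercept, slope, se_intercept). Requires at least three
    studies with standard errors that are not all identical.
    """
    y = [est / se for est, se in zip(log_rrs, std_errors)]   # standardised effects
    x = [1.0 / se for se in std_errors]                      # precisions
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = resid_ss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, se_intercept


if __name__ == "__main__":
    # Invented example values, not data from the review
    print(egger_regression([-0.25, -0.15, -0.30, -0.10], [0.05, 0.10, 0.20, 0.08]))
```

The ratio of the intercept to its standard error would be compared against a t distribution with n - 2 degrees of freedom to obtain the two-sided p value.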

Results
Literature search In total, 28 eligible cohort studies were identified which returned a total of 32 independent observations on PA and incidence of type 2 diabetes. The majority of studies (24 cohorts) yielded information on the association between LTPA and type 2 diabetes (28 observations), while four cohorts [39,57-59] reported findings on total PA. Overall, this review includes 1,261,991 individuals and 84,134 incident cases of type 2 diabetes.
Age was the only variable for which all cohorts had adjusted their findings, with adjustment for other confounders varying considerably. Four cohorts [14,36,58,64] did not adjust for BMI, a key variable believed to mediate the effect of PA on type 2 diabetes. Overall, inverse associations between PA and incident type 2 diabetes were observed for all identified cohorts.
Linear association between PA and incidence of type 2 diabetes Study-specific linear RRs (95% CI) for 10 MET h/week increments of PA, sorted by PA domain and publication year, are shown in Fig. 1.
The mean pooled risk reduction for type 2 diabetes was 13% (95% CI 11%, 16%) per 10 MET h/week increment of PA, albeit observed in the presence of high heterogeneity (I² 93.5%, p for heterogeneity < 0.001). Consecutive removal of single studies indicated no significant impact of any one study on the overall heterogeneity in the model (I² 88.3-92.3%, p for heterogeneity < 0.001). Likewise, restriction to studies rated as high quality did not substantially influence model heterogeneity (I² 82%, p for heterogeneity < 0.001, n = 17).
Risk reductions for type 2 diabetes were considerably more pronounced for LTPA compared with the benefits estimated for total PA. Each 10 MET h/week increment of LTPA reduced type 2 diabetes risk by 17% (95% CI 13%, 21%) compared with 5% (95% CI 2%, 7%) for each 10 MET h/week increment of total PA. Benefits from VPA integrated over time to MET h/week were much larger, with a decrease in risk of type 2 diabetes of 56% (95% CI 16%, 77%) per 10 MET h/week increment.
The effects appeared to be more pronounced in women, with a pooled RR of 0.83 (95% CI 0.77, …) (… Table 2).
Non-linear dose-response analysis In total, data from 23 cohorts were included in the restricted cubic spline analysis and the ensuing pooling in a two-stage multivariate dose-response model. A significant non-linear dose-response is shown in Fig. 2a (p for non-linearity < 0.001), with greater risk reduction at moderate exposures compared with higher ones.
Results from the cubic spline model suggest that individuals who accumulate 11.25 MET h/week (equivalent to meeting the recommended guidelines of 150 min/week of activity at 4.5 MET) have a reduced risk of developing type 2 diabetes equal to 26% (95% CI 20%, 31%) relative to completely inactive individuals.
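The arithmetic behind the guideline-equivalent dose, and the contrast with the linear model, can be sketched as follows. The function names are hypothetical; the RR of 0.87 per 10 MET h/week is simply the pooled 13% linear risk reduction reported above, and extrapolating it to other doses assumes a log-linear relationship, which is an illustration rather than an analysis performed in the review.

```python
import math


def risk_reduction_percent(rr):
    """Convert a relative risk into a percentage risk reduction."""
    return (1.0 - rr) * 100.0


def log_linear_rr(dose_met_h, rr_per_10=0.87):
    """RR at a given dose if risk fell log-linearly with MET h/week."""
    return math.exp(math.log(rr_per_10) * dose_met_h / 10.0)


if __name__ == "__main__":
    guideline_dose = (150 / 60) * 4.5     # 150 min/week at 4.5 METs
    print(guideline_dose)                 # dose in MET h/week
    print(risk_reduction_percent(log_linear_rr(guideline_dose)))
```

Under the log-linear assumption, the guideline dose of 11.25 MET h/week would correspond to a risk reduction of roughly 14-15%, compared with the 26% estimated by the spline model. The gap illustrates how a linear summary understates the benefit at low doses when the true curve is steeper near zero.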
We found no indication of a substantial threshold effect or plateau for the obtained benefit across increasing levels of PA. Being active at a level corresponding to double that of the recommended minimal PA (22.5 MET h/week) was associated with a reduced risk of type 2 diabetes of 36% (95% CI 27%, 46%) with further reductions at higher doses (60 MET h/week, risk reduction of 53%), in the cubic spline model. For 8.75 MMET h/week (equivalent to 11.25 MET h/week at a mean gross intensity of 4.5 MET) the pooled RR for type 2 diabetes was 0.74 (95% CI 0.69, 0.80), with risk being 0.64 (95% CI 0.56, 0.73) for those doing twice as much. Point risk estimates of the pooled dose-response relation for LTPA (in MET h/week) and type 2 diabetes are tabulated in Fig. 2 (also available online as an interactive version at http://epiweb.mrcepid.cam.ac.uk/meta-analyses/pa/diabetes/).
Sensitivity analyses were run to assess the effect of assumptions regarding duration or intensity of the PA exposure used in the LTPA dose assignment procedure for those studies where this information was not directly available; see Fig. 2b–d and ESM Fig. 4. The shape of the dose-response curve was similar under these different assumptions. Benefits were larger for a given exposure if duration and intensity were assumed to be smaller in the original studies where these assumptions were needed. Furthermore, we repeated the final cubic spline model including variance-weighted linear dose-response gradients of the two identified studies that could not be used in the main model because of incomplete data. The impact of excluding these studies was minimal on the overall final result, with a risk reduction of 24% (95% CI 19%, 29%) at … (… Table 3 and ESM Fig. 4).
Table footnote: This is an abridged version of ESM Table 4, which includes details of the method of PA assessment and additional comments. (a) Doses were assigned from descriptions identified within the individual studies or from correspondence with study authors. Full details of MET h dose assignment are listed in ESM Table 2.

Discussion
Our results from a comprehensive literature search identifying relevant longitudinal studies indicate an inverse association between PA and incidence of type 2 diabetes, which was consistently observed across the identified cohorts. Using the restricted cubic splines model, accumulating an activity volume which is commensurate with adherence to the current public health recommendations of 150 min of MVPA per week compared with sedentary individuals was associated with a reduction in the risk of type 2 diabetes by 26% (95% CI 20%, 31%) in the general population.
Our results suggest that the benefits of higher activity levels extend considerably beyond the minimum recommendations. Using the restricted cubic spline model we found that a doubling of activity volume from 11.25 MET h/week to 22.5 MET h/week would reduce the risk of type 2 diabetes by a further 10 percentage points, to a total risk reduction of 36% compared with being inactive. For an intensity of 4.5 MET, our results were very similar under the MMET analysis. However, a greater benefit would be gained from using MMETs for more intensive activity, whereas less intensive activity would gain smaller benefits.
Central to any dose-response analysis for assessing PA in relation to health is the issue of uncertainty in the way by which PA was assessed in free-living individuals. Self-reported PA generally correlates significantly but weakly with objective methods of PA ascertainment, with approximately 10% shared variance [60]. A further crucial issue which may have affected our findings is the substantial heterogeneity in the measurement and reporting of PA behaviour, resulting from questionnaires ascertaining different domains, timeframes and/or units of PA. Methods of outcome assessment were also not consistent across the identified cohorts and it is possible that diagnostic bias may have distorted the results of some of the studies because of differences in diabetes detection accuracy.
When interpreting the findings, the fact that most studies were primarily conducted in samples of well-educated white populations in high-income countries must be taken into account. In the context of type 2 diabetes, earlier studies have found that dose-response curves may be different for Asian Indians, who may require more PA to be protected from their relatively higher susceptibility to develop type 2 diabetes [72,73].
A potential strength of our present analyses is the expression of PA exposure dose in MMET h/week rather than just MET h/week. There is a fine distinction between these two measures; an individual expending 3 METs on a given activity is using double the activity-related energy above rest compared with an individual performing an activity at 2 METs. By setting the starting point of the PA volume at 0 MMET h/week, better mathematical properties (proportionality) of the exposure variable are taken into account, allowing different intensities of activity to be more fairly equated, both within and across individuals and populations. This calculation gives a relatively higher weighting to time spent in more vigorous activity compared with classic METs. This means that doing more intensive activity would equate to a relatively larger dose in the MMET model than under the MET model. Most cohorts were not designed to specifically investigate PA and the resulting paucity of comprehensive data on all PA behaviours may have hindered our analysis. We used aggregated exposure measures across a range of reported activities from each study, which relied on the originally assigned intensity values for each activity by the primary study analysis alongside aggregated durations; however, it is likely that more accurate MMET h estimates could be calculated with access to individual-level raw PA data.
Nevertheless, expressing PA in marginal MET units is a promising method to account for activities of differing intensity and would be aided by better reporting of intensity and duration characteristics for each exposure group. As a restricted cubic spline regression model was used to study the shape of the dose-response relationship, we were able to improve precision as to how the association between PA and incident type 2 diabetes varies at different exposure levels [49].
An earlier systematic review [25] also conducted dose-response meta-analyses for PA and type 2 diabetes. However, this review achieved far less data harmonisation than in our paper. Aune et al. report results separately for MET h/week (five studies), hours per week (ten studies) and energy expenditure (four studies). They found a larger benefit (based on an assumption of moderate intensity activity) and a more linear dose-response curve using the time-based measure compared with the MET h measure. Our results, which are derived from 23 studies, suggest considerably larger benefits for the same PA exposure level, e.g. RR of 0.65 vs RR of 0.76 at 20 MET h/week. Given that our more extensive approach to harmonisation requires more assumptions, it is encouraging that our sensitivity analysis found relatively small differences in the size of the effects, and little difference in the shape of the dose-response curve.
Previous research into PA and other health outcomes has often provided evidence in favour of a strongly curvilinear dose-response relationship [20][21][22][23][74]. This curvilinear association has been the basis for further health impact modelling studies [75] and, as such, is used to estimate how much gain there would be in population health from different PA interventions or scenarios. Uncertainty about the dose-response shape has been found to contribute substantially to uncertainty about the final results of partaking in PA for disease prevention. Our results indicate that for type 2 diabetes prevention, while probably curvilinear over a much wider exposure range, the relationship is much closer to linearity than that found previously for all-cause mortality or ischaemic heart disease [21]. Our effect estimates are likely to be conservative, given the diluting impact that exposure measurement error stemming from a single self-report measure of activity will have on the observed associations.
Even so, our results suggest a major potential for PA to slow down or reverse the global increase in type 2 diabetes prevalence and should prove useful for health impact modelling, which frequently forms part of the evidence base for policy decisions (e.g. WebTAG for transport [76]).
Increasingly, PA research is incorporating the use of objective data, e.g. UK Biobank has recently collected accelerometry data in 100,000 individuals who are also followed up over time to link these data with health outcomes. However, before such studies accrue enough major clinical events to examine prospective relationships, self-report data may be calibrated against objective measures to enhance translation of findings based on self-report into public health action [77].
Given the non-linear nature of the dose-response curve between LTPA and type 2 diabetes, the effects of LTPA are likely to depend on the exposure to non-leisure activity. Our finding of a smaller effect for total PA is unexpected but was based on a much smaller evidence base and may reflect differences in measurement properties between domains. Assuming, however, that the non-linear relationship holds across all domains, the marginal effect of LTPA will be greater in a population that is less active in other domains and vice versa. One way to address this would be to conduct a meta-analysis of LTPA by level of non-leisure PA, e.g. occupational grouping.
The results from this dose-response meta-analysis provide evidence in support of the clinically meaningful role of PA in the primary prevention of type 2 diabetes in the general population. We highlight the necessity for progress in PA measurement and reporting of PA of different intensities and duration in cohort studies. Additionally, we recommend investigations to consider the dose-response relationship of PA and type 2 diabetes prevention in more ethnically diverse population groups.
Overall, we found the dose-response curve for PA and incident type 2 diabetes is curvilinear. Our study suggests that notable health benefits of PA can be realised even at relatively low levels of PA but also that considerable additional decreases in risk for type 2 diabetes are afforded when substantially exceeding the current PA guidelines.
Our meta-analysis supports the generally accepted notion of a graded association between PA and health maintenance [78,79]. It favours a 'some is good but more is better' guideline, in which specific targets are mainly used for a psychological effect. There is no clear cut-off at which benefits are not achieved and health protection increases at activity levels well beyond current recommendations. Enabling cultures and built environments to increase PA at the population-wide level may prevent substantial personal suffering and economic burden. Given the current obesity and diabetes epidemic, the utility of such a strategy may reach beyond any present-day approaches to improve population health.
Public Health Research Centre of Excellence which is funded by the British Heart Foundation, Cancer Research UK, the Economic and Social Research Council, the Medical Research Council, the National Institute for Health Research, and the Wellcome Trust. ADS was partly supported by an MRC PhD studentship. JW is an MRC Population Health Scientist fellow. SB is supported by a program grant from the MRC (MC_UU_12015/3).
Access to research materials Information about how the data can be accessed is available from the corresponding author.
Duality of interest The authors declare that there is no duality of interest associated with this manuscript.
Contribution statement JW and SB conceived this study. ADS and