Introduction

Type 2 diabetes is an increasing worldwide health problem. More than half a million people in Finland (about 10% of the population) in 2008 [1] and about 170 million people worldwide in 2004 [2] were estimated to have type 2 diabetes. The worldwide figure is estimated to double by 2030 [2].

It has been shown that obesity is a major risk factor for type 2 diabetes [3, 4] and that lifestyle interventions, including diet modification and physical activity, are effective in preventing diabetes [5–7]. Prospective follow-up studies [8–12] and a randomised controlled trial [13] suggest that physical activity has an independent role in the prevention of type 2 diabetes. The evidence suggests that any physical activity may be better than none in the prevention of type 2 diabetes, but better results are achieved if individuals engage in moderate-intensity exercise, preferably daily [14].

It is known that physical fitness and the ability to achieve high levels of physical activity have a genetic component [15–17]. Type 2 diabetes has been clearly shown to be an environmental disease, but it also has a genetic component [18], based on family, twin and genome-wide association studies [19]. Twin pairs nearly always share the same childhood family environment. Dizygotic (DZ) pairs (like sibling pairs) share, on average, half of their segregating genes, while monozygotic (MZ) pairs are genetically identical at the sequence level. By studying outcomes in twin pairs discordant for an exposure, such as physical activity, the possible confounding role of genetic and early childhood experiences can be assessed.

The main aim of this study was to investigate whether physical activity predicts the development of type 2 diabetes during almost 30 years of follow-up, when controlled for genetic predisposition and childhood family environment (co-twin-control design). Another aim of the study was to see whether the effect of physical activity is independent of BMI.

Methods

Participants

The Finnish Twin Cohort comprises virtually all the same-sex twin pairs born in Finland before 1958 and with both co-twins alive in 1967 [20]. In 1975, a baseline questionnaire (described below in detail) was sent to twin pairs with both members alive. The response rate was 89%. After excluding the participants with diagnosed diabetes at baseline, those of undefined zygosity and those who had moved abroad before 1976, the cohort consisted of 23,585 individuals with self-reported baseline data on education, social and occupational class, alcohol consumption, physical activity and BMI [19]. The final cohort for the present study included 20,487 individuals, with 8,182 complete twin pairs, who had complete physical activity information available for metabolic equivalent (MET) index calculations (see explanation below). Of the total sample, 9,842 were male and 10,645 female, and 6,399 were monozygotic twin individuals and 14,087 were dizygotic twin individuals. Determination of zygosity was based on an accurate and validated questionnaire method [21].

To remove the confounding factors due to disease, we studied a subgroup of 13,291 presumably healthy individuals. Participants with chronic diseases (such as angina pectoris, myocardial infarction, stroke, diabetes, cardiovascular disease, chronic obstructive pulmonary disease and malignant cancer) affecting weight and ability to engage in leisure physical activity prior to 1982 had been identified by a questionnaire in 1981 and by medical records as described in detail by Kujala et al. [22]. Type 2 diabetes [23] and some other diseases can remain subclinical and undiagnosed for some time after the onset of symptoms. Therefore, we set a 6 year period in order to ensure that any undiagnosed cases in 1975 would have been diagnosed by 1981. Thus, we obtained a true cohort of participants free of clinical co-morbidities.

The participants were informed about the purposes of the overall cohort study when given the baseline questionnaire in 1975. In responding to the questionnaire, participants also gave informed consent. The record linkages were also approved by the appropriate authorities responsible for the registers and the Ethics Committee of the Department of Public Health, University of Helsinki.

Baseline physical activity and covariate assessment

The 1975 questionnaire included questions on medical history, education, occupation, physical activity and other health habits. Assessment of leisure-time physical activity volume (MET index) was based on a series of structured questions on leisure-time physical activity (monthly frequency, mean duration and mean intensity of sessions) and commuting physical activity. The index was calculated by first assigning a multiple of resting metabolic rate (MET value) to one of four categories defined according to the strenuousness of the activity [22]. After assigning the MET value, the product of the activity was calculated as follows: MET value × duration × frequency. The MET index was expressed as the sum score of leisure MET h/day (1 MET h/day corresponds to about 30 min walking every other day). The MET index thus established was then divided into quintiles. The same quintiles were used as in our earlier study on mortality [22]. For cut-off points see Table 1. For further analyses the index was dichotomised as sedentary <0.59 MET h/day (QI) and active ≥0.59 MET h/day (combined QII–V).
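As a minimal sketch of the index calculation described above, the following Python snippet multiplies MET value, duration and frequency for each activity, sums the products and assigns the result to a quintile. The MET values per intensity category and the 30-day month are illustrative assumptions (the actual category values are defined in reference [22]); the cut-off points are those given in the Fig. 1 legend.

```python
# Hypothetical MET values for the four intensity categories (assumed for
# illustration; the actual values are defined in reference [22]).
ASSUMED_MET_BY_CATEGORY = {1: 4.0, 2: 6.0, 3: 10.0, 4: 13.0}
ASSUMED_DAYS_PER_MONTH = 30.0  # assumption used to convert monthly frequency

# Lower bounds of quintiles II-V in MET h/day; anything below 0.59 is QI.
QUINTILE_LOWER_BOUNDS = [(4.50, "QV"), (2.50, "QIV"), (1.30, "QIII"), (0.59, "QII")]


def met_hours_per_day(activities):
    """Sum MET value x duration (h) x frequency (sessions/month), per day."""
    total = 0.0
    for category, duration_h, sessions_per_month in activities:
        met_value = ASSUMED_MET_BY_CATEGORY[category]
        total += met_value * duration_h * sessions_per_month / ASSUMED_DAYS_PER_MONTH
    return total


def quintile(met_index):
    """Return the quintile label for a MET index in MET h/day."""
    for lower_bound, label in QUINTILE_LOWER_BOUNDS:
        if met_index >= lower_bound:
            return label
    return "QI"


def is_active(met_index):
    """Dichotomise: sedentary (<0.59 MET h/day) versus active (>=0.59)."""
    return met_index >= 0.59


# 30 min of walking (category 1 here) every other day, i.e. 15 sessions/month.
walking = [(1, 0.5, 15)]
index = met_hours_per_day(walking)
print(index, quintile(index), is_active(index))
```

Under these assumptions the walking example yields 1 MET h/day, in line with the equivalence stated in the text, and falls in QII.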

Table 1 Baseline characteristics of 20,487 individuals according to MET quintiles in 1975

The MET index was validated in a previous study by our group [24] by comparing the MET index with a 12 month detailed physical activity questionnaire conducted by telephone interview. The intraclass correlation between the MET index and the detailed 12 month physical activity MET index was 0.68 (p < 0.001) for leisure-time physical activity and 0.93 (p < 0.001) for commuting.

Baseline self-reported weight and height were used to calculate BMI, which was used as a covariate in the study. In another study of Finnish twins the correlation between self-reported and measured BMI was very high [25].

Self-reported smoking status, use of alcohol, work-related physical activity and social class at baseline in 1975 were also used as covariates. Smoking status was coded into four categories, determined from responses to detailed smoking history questions: never smoked; former smoker; occasional smoker; and current daily smoker [26]. Alcohol use was coded as a dichotomous index of binge drinking and defined by whether the participant had drunk at least five drinks on a single occasion, at least monthly [27]. Alcohol was also used as a continuous variable expressed as grams consumed daily, as described in detail earlier [27]. Six categories were used to describe social class and the classification was based on self-reported job titles according to the criteria used by the Central Statistical Office of Finland [28]. Work-related physical activity was used as a categorical variable with a four-point ordinal scale [16].

Diabetes assessment

Type 2 diabetes information for 1976–1996 was collected from death certificates, the National Hospital Discharge Register and the Medication Register of the Social Insurance Institution by linking this information to the personal identification assigned to all residents of Finland [19]. The Social Insurance Institution of Finland (KELA) is the agency responsible for the provision of basic social security [19, 29]. KELA reimburses whole or part of the cost of necessary medications to patients who are certified by a physician as having a diagnosed severe chronic disease [30]. Although the register is not sensitive to cases of mild disease, it has very high validity and false-positive cases are unlikely [29]. The relevant medical records for 1976–1996 were reviewed and cases classified as type 2 diabetes, type 1 diabetes, gestational diabetes, secondary diabetes or other diagnoses as described by Kaprio et al. [31]. The date of onset of disease symptoms was determined and used in the analyses. The diabetes information for 1996–2004 was collected solely from the Medication Register and individuals were presumed to have type 2 diabetes, given their age [19]. For this period the date of being granted the right to reimbursable medication was used in the analysis as the date of disease onset. We have not yet extended the data collection to 2005–2009, partly because the national programme of screening for pre-diabetes and diabetes, followed by preventive interventions (for example, dietary modification and physical activity), was intensive during those years and could bias our prospective long-term follow-up if that period were included.

Data analysis

Cox proportional hazard regression was used to estimate the hazard ratios, with 95% CI, for the incidence of type 2 diabetes by MET quintile. The inactive category (QI: <0.59 MET h/day) was used as the reference group. The follow-up for type 2 diabetes ended at the time of diagnosis and for the others at the time of death, emigration from Finland or end of follow-up (31 December 2004). First, the Cox regression model was conducted as an individual analysis and second, the analyses were done as pairwise analyses, in which the data were stratified by pair and thus the risk estimates were within-pair estimates. For the individual analysis, the Cox regression model was adjusted for age and sex, and additionally for BMI. The pairwise analyses controlled by design for age and sex (co-twin-control design), but the models were also adjusted for BMI and were run separately for MZ and DZ pairs if the numbers permitted. The basic individual analysis was additionally adjusted for work-related physical activity, social class, alcohol use and smoking. In the individual-level analyses, lack of statistical independence of co-twins was taken into account by computing robust variance estimators for cluster-corrected data [32] to yield correct standard errors and p values. Data management and analysis were performed using the Stata statistical software, version 9.0.
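The pairwise analysis stratifies the Cox model by twin pair, so that each twin is compared only with his or her co-twin. The following is a minimal Python sketch of that idea for a single binary exposure; it is not the authors' Stata code. The function names, the toy data and the ternary-search optimiser are illustrative assumptions, and adjustment for BMI is omitted.

```python
import math


def pair_stratified_loglik(beta, pairs):
    """Cox partial log-likelihood with one stratum per twin pair.

    Each pair is two tuples (follow_up_time, had_event, exposure).
    Ties within a pair are handled with the Breslow approximation.
    """
    loglik = 0.0
    for twin_a, twin_b in pairs:
        for subject, co_twin in ((twin_a, twin_b), (twin_b, twin_a)):
            time, had_event, exposure = subject
            if not had_event:
                continue
            # Risk set within the stratum: the case, plus the co-twin if the
            # co-twin was still under follow-up at the case's event time.
            denominator = math.exp(beta * exposure)
            if co_twin[0] >= time:
                denominator += math.exp(beta * co_twin[2])
            loglik += beta * exposure - math.log(denominator)
    return loglik


def maximise(func, low=-5.0, high=5.0, iterations=200):
    """Ternary search for the maximum of a concave one-dimensional function."""
    for _ in range(iterations):
        left = low + (high - low) / 3
        right = high - (high - low) / 3
        if func(left) < func(right):
            low = left
        else:
            high = right
    return (low + high) / 2


# Toy data (invented): exposure 1 = active, 0 = sedentary.
# In 3 pairs the active twin is the case; in 6 pairs the sedentary twin is.
toy_pairs = (
    [((5.0, True, 1), (20.0, False, 0))] * 3
    + [((20.0, False, 1), (8.0, True, 0))] * 6
)
beta_hat = maximise(lambda b: pair_stratified_loglik(b, toy_pairs))
print("within-pair hazard ratio:", round(math.exp(beta_hat), 3))
```

Only pairs discordant for exposure, with the co-twin still at risk when the case occurs, contribute information. In the toy data the estimate therefore converges to 3/6 = 0.5, the ratio of the two kinds of informative pair.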

Results

Table 1 shows the baseline characteristics of the participants according to the physical activity MET quintiles. The sedentary participants in QI were, as expected, the oldest, had the highest BMI and smoked the most.

A total of 535,000 person-years were accumulated during the follow-up from 1976 to 2004. During this period, 1,082 new type 2 diabetes cases occurred among the 20,487 participants. The hazard ratios and 95% confidence intervals for type 2 diabetes between the different MET quintiles for all individuals are presented in Fig. 1a (see also Electronic supplementary material [ESM] Table 1). The individual analyses showed that the participants in physical activity quintiles III–V had significantly lower age- and sex-adjusted hazard ratios during the follow-up compared with the sedentary individuals in QI. Analysis of healthy participants with no known medical constraints on physical activity (n = 13,291 individuals) also showed similar hazard ratios (ESM Table 1). After adjusting the model for all individuals for work-related physical activity, social class, smoking and alcohol use (all separately), the hazard ratios remained similar. When the model was adjusted for BMI, the differences in the hazard ratios between the quintiles were no longer significant (Fig. 1b). In the individual-level analyses, risk did not differ by zygosity.
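As a small worked example, the crude overall incidence rate implied by the figures above can be computed as cases divided by person-years. This is a back-of-the-envelope illustration with a hypothetical helper function, not an estimate reported in the paper.

```python
def incidence_per_1000_person_years(cases, person_years):
    """Crude incidence rate expressed per 1,000 person-years."""
    return 1000.0 * cases / person_years


# 1,082 new cases over roughly 535,000 person-years of follow-up.
rate = incidence_per_1000_person_years(1082, 535000)
print(round(rate, 2))  # prints 2.02
```

This corresponds to about two new cases per 1,000 person-years across the whole cohort, without any adjustment for age, sex or activity level.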

Fig. 1 HRs and 95% CIs for type 2 diabetes according to different MET quintiles for all participants: (a) individual analyses; (b) individual analyses adjusted for age, sex and BMI; (c) pairwise analyses; and (d) pairwise analyses adjusted for BMI. QI <0.59 MET h/day; QII 0.59–1.29 MET h/day; QIII 1.30–2.49 MET h/day; QIV 2.50–4.49 MET h/day; and QV ≥4.50 MET h/day

The pairwise analysis indicated (Fig. 1c) that the participants in physical activity quintiles II to V were significantly less likely to have type 2 diabetes (QII HR 0.61, 95% CI 0.41–0.90; QIII 0.59, 0.39–0.87; QIV 0.61, 0.41–0.91; QV 0.61, 0.40–0.94) during the follow-up than their co-twins in the sedentary quintile (ESM Table 2). This analysis takes into account all pairs discordant for physical activity across all the quintiles. The hazard ratios (QII HR 0.50, 95% CI 0.32–0.78; QIII 0.50, 0.32–0.78; QIV 0.57, 0.37–0.88), except that for QV, were reduced even further when the model was adjusted for BMI (Fig. 1d). Similar results were found for both zygosities, with the MZ twins showing the lowest hazard ratios (ESM Table 2). Although numerically the lowest, the hazard ratios for the MZ pairs were not all statistically significant, as the MZ pairs also had the lowest number of informative discordant pairs. Again, the results of the subgroup analysis of the healthy participants with no known constraints on physical activity showed similar hazard ratios. The BMI-adjusted hazard ratios for type 2 diabetes remained statistically significant in all quintiles.

Of all the twin pairs, 1,919 pairs were discordant for physical activity when sedentariness (QI <0.59 MET h/day) was compared with any activity category (combined quintiles II–V) and 809 pairs were discordant for type 2 diabetes. Of these, 146 pairs were discordant for both baseline physical activity and follow-up type 2 diabetes. In 85 of the 146 pairs, the co-twin who was diagnosed with diabetes during the follow-up was sedentary at baseline, while the active co-twin remained healthy; in 61 pairs the converse was true. Among the MZ pairs the corresponding numbers were 21 and 10.
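The doubly discordant pair counts above lend themselves to a simple matched-pair calculation: the number of pairs in which the active twin became the case divided by the number in which the sedentary twin did. The sketch below, with a hypothetical helper and a rough Wald-type interval on the log scale, is a crude illustration only. It ignores follow-up time, censoring and BMI, so it is not expected to reproduce the Cox estimates reported in the paper.

```python
import math


def discordant_pair_ratio(active_cases, sedentary_cases, z=1.96):
    """Matched-pair ratio with an approximate 95% interval on the log scale."""
    ratio = active_cases / sedentary_cases
    se_log = math.sqrt(1.0 / active_cases + 1.0 / sedentary_cases)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# All pairs: the active twin was the case in 61 pairs, the sedentary twin in 85.
all_pairs = discordant_pair_ratio(61, 85)
# MZ pairs: the active twin was the case in 10 pairs, the sedentary twin in 21.
mz_pairs = discordant_pair_ratio(10, 21)

for label, (ratio, lower, upper) in (("all", all_pairs), ("MZ", mz_pairs)):
    print(label, round(ratio, 2), round(lower, 2), round(upper, 2))
```

The point estimates are about 0.72 for all pairs and 0.48 for MZ pairs. The much smaller number of MZ pairs gives a wider interval, which is consistent with the text's remark that the MZ results are numerically low but less precise.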

Further pairwise analyses showed that the BMI-adjusted hazard ratio (0.54; 95% CI 0.37–0.78) was lower in the members of the twin pairs who were physically active (combined Q II–V: ≥0.59 MET h/day) compared with their inactive (QI: <0.59 MET h/day) co-twins (Table 2). The results of the BMI-adjusted pairwise analyses were significant for all the analysed subgroups, except that for MZ pairs, which was marginally non-significant. However, the MZ pairs showed a similar or even lower hazard ratio than the other groups.

Table 2 Risk for type 2 diabetes during 1976–2004 in active members of twin pairs (≥0.59 MET h/day) compared with their sedentary co-twins (<0.59 MET h/day)

Discussion

Our 28 year prospective follow-up study in twins showed that leisure-time physical activity reduces the risk for type 2 diabetes when controlled for genetic predisposition and childhood home environment. This was seen in the pairwise analyses among both MZ and DZ pairs, including those using BMI-adjusted data. It can therefore be assumed that physical activity independently protects against type 2 diabetes, as many unmeasured confounding factors (both genetic and environmental) are controlled for by the twin design. These findings are consistent with those of earlier population-based studies [10–12]. However, our study had a longer follow-up and we were able to investigate the issue in genetically controlled participants.

On the one hand, high BMI may lead to inactivity and then to more type 2 diabetes; on the other hand inactivity may lead to higher BMI and then to type 2 diabetes. However, use of BMI as a covariate is problematic as both high muscle mass and high fat mass can contribute to high BMI. Our previous twin studies have shown that despite the lack of statistically significant differences in BMI between physically active and inactive members of twin pairs, physical activity reduces waist [24] and high-risk body fat (ectopic fat stores, liver fat and visceral fat) but maintains skeletal muscle mass and function [33], leading to lowered type 2 diabetes risk independent of BMI. It is also possible that the results from BMI-adjusted analyses are over-adjusted, as physical activity may reduce type 2 diabetes by independently reducing BMI.

Chronic exposure of pancreatic beta cells to elevated glucose and fatty acid levels may impair their function and lead to type 2 diabetes [34, 35]. Both endurance and resistance exercise training have been shown to act on various mechanisms that enhance the insulin sensitivity of skeletal muscles [36] and thus diminish glycaemic stress. More specifically, physical activity or exercise training has been shown to reduce visceral fat [33], improve skeletal muscle insulin sensitivity [37, 38] and increase the oxidative capacity of skeletal muscle, which correlates with insulin sensitivity [39]. It also leads to increased or modified fat oxidation, which is most likely to prevent lipid-mediated insulin resistance [40].

The evidence to date on the dose-response relationship regarding the amount of physical activity needed to prevent type 2 diabetes remains conflicting [14]. In our study any amount of physical activity seemed to reduce the risk for type 2 diabetes, as seen in the pairwise analyses. As little physical activity as 0.6–1.3 MET h/day (4.2–9.1 MET h/week) produced significant results compared with sedentariness. Four MET h/week are equivalent to 1 h of moderate-intensity exercise weekly and 9 MET h/week are equivalent to about 2 h of moderate-intensity exercise weekly, which is still less than the generally advised 150 min of moderate-intensity exercise per week [14]. The hazard ratios in the pairwise analyses were similar across all the physical activity quintiles (II–V), indicating that total inactivity in particular is a predictor of future type 2 diabetes. However, it may be that during our long-term follow-up those individuals who exercised most at baseline decreased their exercise levels. The dose-response relation between physical activity and occurrence of type 2 diabetes, and particularly the role of the intensity of activity, still remain unanswered.
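The unit conversions in this paragraph can be checked with a few lines of Python. The sketch assumes that moderate-intensity exercise corresponds to about 4 METs, a value inferred from the stated equivalence of 4 MET h/week to 1 h of moderate exercise; the function and constant names are illustrative.

```python
# Assumption inferred from the text: 4 MET h/week ~ 1 h of moderate exercise.
ASSUMED_MODERATE_METS = 4.0


def met_h_per_week(met_h_per_day):
    """Convert a MET index from MET h/day to MET h/week."""
    return met_h_per_day * 7


def moderate_hours_per_week(met_h_week):
    """Hours of moderate-intensity exercise matching a weekly MET volume."""
    return met_h_week / ASSUMED_MODERATE_METS


for daily in (0.6, 1.3):
    weekly = met_h_per_week(daily)
    print(daily, round(weekly, 1), round(moderate_hours_per_week(weekly), 1))

# The advised 150 min/week of moderate exercise expressed in MET h/week.
guideline_met_h_week = ASSUMED_MODERATE_METS * 150 / 60
print(guideline_met_h_week)
```

Under this assumption the 150 min/week recommendation corresponds to 10 MET h/week, so the QII range of 4.2–9.1 MET h/week lies below it.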

Strengths and limitations

The main strengths of the present study are a very long follow-up period, the twin study design and a large sample size. The twin design enabled us to control for both genetic predisposition and childhood family environment. The large sample included a very large proportion of all the same-sex twin pairs born in Finland before 1958 and therefore can be expected to be a good representation of the Finnish general population of that generation. Another important strength of the study is the use of hospital discharge and death registers and information on reimbursable medication for type 2 diabetes assessment, which provide data on outcomes on all participants. There were very few, if any, false-positive cases of type 2 diabetes among our data [29].

However, the registers also have a limitation: diagnoses of type 2 diabetes tend to be delayed, which in turn delays the granting of the right to reimbursable medication. This would bias the results if the delay differed by physical activity category. Biochemical assessment of all participants for follow-up status would have been ideal. In practice, repeated measurement of glucose metabolism in all participants is not possible, and it may also lead to participation bias based on presence of diabetes or related symptoms. Self-reported data on physical activity habits and BMI also have known limitations. However, these physical activity questions correlated well with the results of a detailed interview [24] and predicted mortality [22] consistently with other studies that have used different measures of physical activity. As stated earlier, correlation between self-reported and measured BMI is very high [25]. Another limitation in our study relates to the use of baseline BMI as a covariate. This does not control for the changes in BMI over time that are possible during such a long follow-up. More detailed measures of body composition in 1975 would have been desirable but were not available.

Conclusion

Our longitudinal twin pair study established that leisure-time physical activity protects against type 2 diabetes after taking genetic effects into account. On the basis of our co-twin-control design, even small amounts of physical activity, compared with sedentariness, play a significant role in reducing or postponing the occurrence of type 2 diabetes.