Introduction

Type 2 diabetes has become a major global public health problem through an increasing burden of complicated morbidity and mortality [1–3]. It has been hypothesised that prenatal development might influence the susceptibility to type 2 diabetes in later life [4, 5]. Low birthweight, a widely used indicator of retarded fetal growth and intrauterine malnutrition, has been consistently related to an increased risk of type 2 diabetes in observational studies [6–9]. However, it remains unclear whether intrauterine growth plays a causal role in the development of type 2 diabetes, as it is difficult to fully eliminate confounding by socioeconomic status and lifestyle factors in observational studies.

Genetic association analysis is less likely to be affected by confounding and has been increasingly employed to inform causality [10]. A recent genome-wide association study (GWAS) in up to 69,308 individuals of European ancestry identified several genetic variants associated with birthweight [11]. Birthweight is a widely used surrogate for restricted intrauterine growth, which has been suggested to affect the risk of type 2 diabetes in later life. The genetic variants influencing birthweight may therefore serve as proxies for exposures that restrict intrauterine growth. We hypothesised that the exposures influencing intrauterine growth might lie in the causal pathway for susceptibility to type 2 diabetes. Although intrauterine risks are difficult to measure directly, the exposures that affect intrauterine growth are reflected in birthweight. It is therefore possible to estimate the potential causal relation between intrauterine risks and type 2 diabetes by using the genetic variants related to birthweight.

In the present study, we first examined the evidence for causality by testing whether the genetic predisposition to low birthweight, which was evaluated by the GWAS-identified genetic variants, was associated with type 2 diabetes. We then performed a Mendelian randomisation analysis to estimate the possible causal effect size of low birthweight on type 2 diabetes.

Methods

Participants

The Nurses’ Health Study (NHS) is a prospective cohort study of 121,700 US female registered nurses aged 30–55 years at the study inception in 1976 [12]. Between 1989 and 1990, 32,826 women provided blood samples. The Health Professionals Follow-Up Study (HPFS) is a prospective cohort study of 51,529 US male health professionals aged 40–75 years at the study inception in 1986 [13]. Between 1993 and 1999, 18,159 men provided blood samples. In both cohorts, information about medical history and lifestyle has been collected biennially by self-administered questionnaires since inception. Both studies were approved by the human research committee at the Brigham and Women’s Hospital (Boston, MA, USA), and all participants provided written informed consent. For this analysis, we used 1986 as the baseline date for the NHS and HPFS. The present analysis included 5928 men and 10,673 women of European ancestry with genotype data available based on previous GWASs [14–19].

Ascertainment of type 2 diabetes

Diabetes was defined as self-reported diabetes, with information collected by a questionnaire and confirmed by a validated supplementary questionnaire [20, 21]. For the diagnosis of diabetes before 1998, we used the National Diabetes Data Group criteria to define diabetes [22], which included one of the following: one or more classic symptoms (excessive thirst, polyuria, weight loss, hunger, pruritus or coma) plus a fasting plasma glucose level of ≥7.8 mmol/l, a random plasma glucose level of ≥11.1 mmol/l or an OGTT 2 h plasma glucose level of ≥11.1 mmol/l; at least two elevated plasma glucose levels on different occasions in the absence of symptoms; or treatment with hypoglycaemic medication (insulin or an oral hypoglycaemic agent). For diagnosis of diabetes from 1998 onwards, we used the ADA diagnostic criteria [23]. These criteria were the same as those proposed by the National Diabetes Data Group, except for the elevated fasting plasma glucose criterion, for which the cut-off point was changed from 7.8 mmol/l to 7.0 mmol/l.

Participants with diagnosed type 2 diabetes from the cohort baseline to follow-up until 2012 for the NHS and 2010 for the HPFS were included as cases. The validity of self-reported diabetes was verified in two subsamples from the NHS and HPFS, respectively. A physician blinded to the information reported on the supplementary questionnaire reviewed the medical records according to the diagnostic criteria. The medical record review’s confirmation rate of diabetes as reported by the supplementary questionnaire was 98% for the NHS and 97% for the HPFS [20, 21]. Control participants were defined as those free of diabetes through the follow-up.

Assessment of covariates

Participants in the NHS and HPFS were requested to provide their birthweight on the 1992 and 1994 questionnaires, respectively, within categories (in kg) of <2.26, 2.27–2.49, 2.50–3.15, 3.16–3.82, 3.83–4.4, ≥4.5 and unknown in the NHS, and <2.50, 2.50–3.15, 3.16–3.82, 3.83–4.4, ≥4.5 and unknown in the HPFS. The validity of the self-reported birthweight data has previously been described [24, 25]. Self-reported birthweight was highly correlated with recorded birthweight (r = 0.74) and the birthweight reported by the participants’ mothers (r = 0.71). BMI was calculated as the weight in kilograms divided by the square of height in metres. Information about smoking status and alcohol intake was derived from the baseline questionnaires [12, 13]. Physical activity was expressed as metabolic equivalents (METs) per week using the reported time spent carrying out various activities, weighting each activity by its intensity level. The validity of the self-reported body weight and physical activity data has previously been described [26, 27]. Self-reported and measured weights were highly correlated at 0.97 for men and 0.97 for women [26].

Genotyping and genetic risk score calculation

Single nucleotide polymorphism (SNP) genotyping and imputation have previously been described in detail [15]. In brief, samples were genotyped and analysed using the Affymetrix Genome-Wide Human SNP Array 6.0 (Santa Clara, CA, USA) and the Birdseed calling algorithm (www.broadinstitute.org/mpg/birdsuite/birdseed.html). All samples used in the present study achieved a call rate of >98%. We used MACH (www.sph.umich.edu/csg/abecasis/mach) to impute SNPs on chromosomes 1–22, with National Center for Biotechnology Information build 36 of phase II HapMap Centre d’Etude du Polymorphisme (CEU: Utah residents with northern and western European ancestry) data (release 22; http://hapmap.ncbi.nlm.nih.gov/) as the reference panel.

Seven SNPs have been identified as being associated with birthweight by a previous GWAS [11]. We excluded two SNPs lying in the ADCY5 and CDKAL1 loci that have been implicated by GWASs in susceptibility to type 2 diabetes [28, 29], and calculated a genetic risk score (GRS) on the basis of the other five SNPs to estimate the genetic variation of low birthweight. The GRS was calculated using a weighted method according to each SNP’s relative effect size (β coefficient), obtained from the GWAS data [11]. The calculation equation was: weighted GRS = (β1 × SNP1 + β2 × SNP2 + … + β5 × SNP5) × (5/sum of the β coefficients), where SNPi is the number of risk alleles associated with low birthweight at the ith SNP, coded as 0, 1 or 2.
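The weighted score above can be sketched in a few lines of Python. This is an illustration only: the function name `weighted_grs` and the β coefficients in the example are hypothetical placeholders, not the values taken from the birthweight GWAS [11], and the analyses in this study were run in SAS.

```python
from typing import Sequence


def weighted_grs(betas: Sequence[float], allele_counts: Sequence[float]) -> float:
    """Weighted GRS = (sum of beta_i * SNP_i) * (n / sum of beta_i).

    betas:         per-allele effect sizes on low birthweight (all > 0 here)
    allele_counts: number of low-birthweight alleles per SNP (0 to 2;
                   imputed dosages may be fractional)
    """
    if len(betas) != len(allele_counts):
        raise ValueError("betas and allele_counts must have the same length")
    if any(c < 0 or c > 2 for c in allele_counts):
        raise ValueError("allele counts must lie between 0 and 2")
    total_beta = sum(betas)
    if total_beta <= 0:
        raise ValueError("sum of betas must be positive")
    weighted_sum = sum(b * c for b, c in zip(betas, allele_counts))
    # Rescale so that the score ranges from 0 to 2 * n, like an unweighted count.
    return weighted_sum * (len(betas) / total_beta)


# Hypothetical effect sizes for five SNPs (placeholders, NOT the published values).
example_betas = [0.07, 0.05, 0.05, 0.04, 0.03]
example_person = [2, 1, 0, 1, 2]  # risk-allele counts for one participant
example_score = weighted_grs(example_betas, example_person)
```

The rescaling factor (5/sum of the β coefficients) keeps the score on the same 0–10 scale as a simple count of risk alleles, so each point corresponds to one average risk allele.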

Statistical analysis

Statistical analyses were performed in SAS version 9.3 software (SAS Institute, Cary, NC, USA).

We assigned a median value to each category of birthweight to model this variable as a continuous variable in analyses. General linear models were applied to examine the relationships of the GRS and individual SNPs to birthweight and covariates. Quartiles of the GRS were categorised according to quartile cut-off points for the control participants. Comparisons of the proportions and mean values across quartiles of the GRS were calculated by χ² test and ANOVA, respectively. We tested the associations of the GRS and individual SNPs with type 2 diabetes by logistic regression models. Results from the NHS and HPFS were pooled using an inverse-variance-weighted, fixed-effect meta-analysis (all p for heterogeneity >0.05). A restricted cubic spline regression model, which can help to prevent problems resulting from inappropriate assumptions on linearity, was used to test the linear relationship between the GRS (as the continuous variable) and the risk of type 2 diabetes [30].

We further conducted Mendelian randomisation analysis using the method previously described [10, 31, 32]. A schematic presentation of the Mendelian randomisation analysis is shown in Fig. 1. The β3 estimates for each SNP to evaluate the association between low birthweight and risk of type 2 diabetes can be calculated from the direct measurements β1 (the estimate of effect size of each SNP on birthweight) and β2 (the estimate of effect size of each SNP on type 2 diabetes) as: β3 = β2/β1. The SE of β3 is given by: S3 = \( \sqrt{\frac{1}{\upbeta_{1}^{2}\,\mathrm{S}_{2}^{-2}}} \) = S2/|β1|, where S2 is the SE of β2. In this study, β1 is the estimate of effect size of each SNP on birthweight standardised using z score transformation derived from the birthweight GWAS [11], and β2 is the logₑ OR estimate of type 2 diabetes for each SNP calculated from the NHS and HPFS using an inverse-variance-weighted, fixed-effects meta-analysis (all p for heterogeneity >0.05). We first calculated β3 estimates for each SNP at CCNL1, LCORL, ADRB1 and HMGA2 loci, and on chromosome 5q11.2, and then combined β3 estimates for all five SNPs analysed using inverse-variance-weighted, fixed-effects meta-analysis to obtain an overall estimate of genetically determined low birthweight associated with type 2 diabetes (all p for heterogeneity >0.05). In a Mendelian randomisation design, the overall β3 estimate (logₑ OR) would be considered to be an estimate for the causal association between low birthweight and type 2 diabetes. The OR for type 2 diabetes associated with each 1 SD lower genetically determined birthweight can be given by exp(overall β3). We also performed a two-sample Mendelian randomisation analysis using the latest genetic association summary statistics for the trans-ethnic type 2 diabetes GWAS meta-analysis [33].
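The per-SNP ratio estimate and its combination across SNPs can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the SAS code used in the study: the function names `wald_ratio` and `ivw_fixed_effect` and all numerical inputs are hypothetical, and β1 is assumed to be expressed per low-birthweight allele in SD units of lower birthweight. As in the SE formula above, the uncertainty in β1 is ignored.

```python
import math
from typing import List, Sequence, Tuple


def wald_ratio(beta1: float, beta2: float, se2: float) -> Tuple[float, float]:
    """Per-SNP estimate beta3 = beta2 / beta1 with SE S3 = S2 / |beta1|."""
    if beta1 == 0:
        raise ValueError("beta1 must be non-zero")
    return beta2 / beta1, se2 / abs(beta1)


def ivw_fixed_effect(estimates: Sequence[float],
                     ses: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance-weighted, fixed-effects pooled estimate and its SE."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical inputs for three SNPs: (beta1 in SD, beta2 = log OR, SE of beta2).
snps = [(0.07, 0.08, 0.03), (0.04, 0.03, 0.03), (0.03, 0.05, 0.03)]

beta3s: List[float] = []
se3s: List[float] = []
for b1, b2, s2 in snps:
    b3, s3 = wald_ratio(b1, b2, s2)
    beta3s.append(b3)
    se3s.append(s3)

overall_beta3, overall_se = ivw_fixed_effect(beta3s, se3s)
overall_or = math.exp(overall_beta3)  # OR per 1 SD lower birthweight
```

Because each weight is 1/S3², SNPs with larger effects on birthweight (larger |β1|) contribute more to the overall estimate.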

Fig. 1

Schematic representation of the Mendelian randomisation analysis. β1 is the effect size estimate of each low-birthweight-related SNP on birthweight derived from the GWAS of birthweight reported by Horikoshi et al [11]. β2 is the logₑ OR estimate of the type 2 diabetes for each SNP derived from the NHS and HPFS using inverse-variance-weighted, fixed-effects meta-analysis (all p for heterogeneity >0.05). β3 is calculated from β1 and β2 for each SNP: β3 = β2/β1. The SE of β3 is given by: S3 = \( \sqrt{\frac{1}{\upbeta_{1}^{2}\,\mathrm{S}_{2}^{-2}}} \) = S2/|β1|, where S2 is the SE of β2. We then combined β3 estimates for all five SNPs analysed using inverse-variance-weighted, fixed-effects meta-analysis to obtain an overall estimate of the relation between genetically determined low birthweight and type 2 diabetes. In a Mendelian randomisation study, the overall β3 estimate would be considered to be an estimate of the causal association between low birthweight and type 2 diabetes. The OR for type 2 diabetes associated with each 1 SD lower genetically determined birthweight can be given by exp(overall β3)

Results

Characteristics of the participants at baseline

The range of the GRS was 0–10 for both women and men; the mean value (SD) of the GRS was 4.45 (1.57) and 4.55 (1.57) among women and men, respectively. As expected, GRS was associated with birthweight in both women (β = −0.014 kg, p = 0.001) and men (β = −0.018 kg, p = 0.001; see electronic supplementary material [ESM] Fig. 1). The baseline characteristics of 16,601 participants from the NHS and HPFS according to quartiles of the low birthweight GRS are presented in Table 1. Mean values for age, BMI, alcohol intake, physical activity and total energy intake, and the proportions of current smokers, did not differ across the quartiles of the GRS for either men or women (all p >0.05). In addition, the GRS was not associated with waist circumference, prevalent hypertension or hypercholesterolaemia at baseline (all p >0.05; ESM Table 1).

Table 1 Baseline characteristics among 5928 men and 10,673 women according to quartiles of the low-birthweight GRS

Low-birthweight GRS and type 2 diabetes

The present study included 3627 participants with and 12,974 participants without type 2 diabetes. The OR (95% CI) for type 2 diabetes per 1 SD lower birthweight was 1.35 (1.21, 1.50) and 1.24 (1.06, 1.45) in the NHS and HPFS, respectively. As shown in Table 2, the association between each 1 point increment in the GRS and risk of type 2 diabetes was stronger in women (OR 1.09; 95% CI 1.05, 1.13) than men (OR 1.02; 95% CI 0.97, 1.07), with multivariable adjustment for age, BMI, smoking status, alcohol intake, physical activity, total energy intake and source of genotyping data. In a meta-analysis of the results from women and men, each 1 point increase in the GRS was associated with a 6% (95% CI 3%, 9%) increased risk of type 2 diabetes; the OR (95% CI) for type 2 diabetes was 1.16 (1.01, 1.31), 1.18 (1.03, 1.33) and 1.24 (1.08, 1.39) for the second, third and fourth quartiles of the GRS, respectively, compared with the lowest quartile (p for trend <0.001). Combining the data for both women and men, the low-birthweight GRS showed a linear relation with an increased risk of type 2 diabetes (p for linearity = 0.001; Fig. 2).
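As a rough check on the pooled per-point estimate, the sex-specific ORs reported above can be combined with an inverse-variance-weighted, fixed-effect calculation. The Python sketch below is illustrative only: the helper names are hypothetical, and the SEs are back-calculated from the rounded 95% CIs, so the result can only approximate the published figure.

```python
import math
from typing import Sequence, Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_or_and_se(or_point: float, ci_low: float, ci_high: float) -> Tuple[float, float]:
    """Convert an OR and its 95% CI to a log OR and an approximate SE."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(or_point), se


def pool_fixed_effect(log_ors: Sequence[float],
                      ses: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance-weighted, fixed-effect pooled log OR and its SE."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


# ORs per 1 point increment in the GRS, as reported in the text.
women = log_or_and_se(1.09, 1.05, 1.13)
men = log_or_and_se(1.02, 0.97, 1.07)

pooled_log_or, pooled_se = pool_fixed_effect([women[0], men[0]], [women[1], men[1]])
pooled_or = math.exp(pooled_log_or)
pooled_ci = (math.exp(pooled_log_or - Z_95 * pooled_se),
             math.exp(pooled_log_or + Z_95 * pooled_se))
```

Under these assumptions the pooled OR comes out at about 1.06, in line with the 6% increase in risk per GRS point reported for the meta-analysis; the confidence limits differ slightly from the published ones because of rounding in the inputs.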

Table 2 Association between the low-birthweight GRS and type 2 diabetes
Fig. 2

Relationship between the low-birthweight GRS and type 2 diabetes. Data are ORs (solid lines) and 95% CIs (dashed lines), based on the combined data of the NHS and HPFS, adjusting for age and sex. p for linearity = 0.001

Individual SNPs in relation to birthweight and type 2 diabetes

Characteristics of individual SNPs and their relation to birthweight and type 2 diabetes are depicted in Table 3. The results of the NHS and HPFS were pooled using inverse-variance-weighted, fixed-effects meta-analysis (all p for heterogeneity >0.05). Of the five SNPs, CCNL1 rs900400 had the strongest association with a lower birthweight (p < 0.001); CCNL1 rs900400 and 5q11.2 rs4432842 showed dose–response associations with risk of type 2 diabetes, with OR (95% CI) values of 1.09 (1.03, 1.16) and 1.09 (1.02, 1.16), respectively.

Table 3 Characteristics of the genetic variants and the association of increasing number of low-birthweight-related alleles with birthweight and type 2 diabetes

Mendelian randomisation analysis

We further performed Mendelian randomisation analysis to estimate the causal effect size of genetically determined low birthweight on type 2 diabetes (Fig. 3). β3 is the logₑ OR estimate of the association between genetically determined low birthweight and type 2 diabetes for each SNP. Of the five SNPs, CCNL1 rs900400 and 5q11.2 rs4432842 showed statistically significant associations with type 2 diabetes (β3 = 1.21 and 2.56, respectively; both p <0.05). We then combined β3 estimates for all five SNPs using inverse-variance-weighted, fixed-effects meta-analysis to obtain an overall β3 estimate (p for heterogeneity = 0.318). The overall β3 estimate was 1.08 (95% CI 0.53, 1.64). By exponentiating the overall β3, we calculated an OR of 2.94 (95% CI 1.70, 5.16; p < 0.001), indicating that each 1 SD lower genetically determined birthweight was associated with a 194% (95% CI 70%, 416%) increase in the odds of type 2 diabetes.
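The conversion from the overall β3 to the OR can be reproduced directly from the reported numbers. The short Python sketch below is illustrative: the SE is back-calculated from the rounded 95% CI under a normal approximation, which is an assumption made here rather than something stated in the analysis.

```python
import math

# Overall beta3 (log OR) and its 95% CI, as reported in the text.
beta3, ci_low, ci_high = 1.08, 0.53, 1.64

or_point = math.exp(beta3)
or_low = math.exp(ci_low)
or_high = math.exp(ci_high)

# Approximate SE recovered from the CI width (normal approximation, assumed).
se = (ci_high - ci_low) / (2 * 1.959964)
z = beta3 / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value

print(f"OR = {or_point:.2f} (95% CI {or_low:.2f}, {or_high:.2f})")
```

Rounded to two decimal places, this gives the OR of 2.94 (95% CI 1.70, 5.16) quoted above, and the approximate p value is consistent with p < 0.001.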

Fig. 3

Mendelian randomisation estimate of the association of low birthweight with risk of type 2 diabetes. The forest plot shows β3 estimates (logₑ ORs) of the effect of low birthweight on the risk of type 2 diabetes for each low-birthweight-related SNP. The overall β3 estimate was obtained by using inverse-variance-weighted, fixed-effects meta-analysis (p for heterogeneity = 0.318), and can be interpreted as an OR of 2.94 (95% CI 1.70, 5.16) for type 2 diabetes per 1 SD lower genetically determined birthweight (p < 0.001)

We also performed a Mendelian randomisation analysis using the latest genetic association summary statistics from a recent type 2 diabetes GWAS [33]. A meta-analysis of the associations between the five SNPs and type 2 diabetes showed an OR of 1.03 (95% CI 1.01, 1.04); each 1 SD lower genetically determined birthweight showed a Mendelian randomisation OR of 1.70 (95% CI 1.32, 2.19; p < 0.001) for type 2 diabetes (ESM Table 2).

Discussion

We tested the potential causal effect of low birthweight on type 2 diabetes in two large prospective cohorts of US men and women. We found an association between the overall genetic susceptibility to low birthweight, which was estimated by a GRS, and an increased risk of type 2 diabetes. We further applied the Mendelian randomisation approach to provide supportive evidence for the causality, and the findings were confirmed using the summary statistics from the latest GWAS.

Evidence from both population and experimental studies has suggested that restricted early life development has a long-term structural and functional influence on individuals’ predisposition to an increased risk of metabolic diseases such as type 2 diabetes. Famine studies from the Netherlands, China and Ukraine have shown that individuals with retarded intrauterine growth are more susceptible to type 2 diabetes later in life [34–36]. As a widely used marker of fetal malnutrition and growth retardation, low birthweight has been consistently associated with a high risk of type 2 diabetes. A meta-analysis of 14 studies showed a U-shaped relation between birthweight and risk of type 2 diabetes [6]. A more recent larger meta-analysis of 31 studies confirmed the relation but suggested that the overall relation between birthweight and type 2 diabetes was likely to be linear, i.e. that lower birthweight is related to an increased risk of type 2 diabetes [7]. In the present study, we also reported a linear relation between the low-birthweight GRS and type 2 diabetes, which was in line with the previous observational findings. Although the genetic association in men appeared to be weaker than that in women, the sex difference was not statistically significant (p > 0.05).

According to Mendel’s second law, the alleles of one gene sort into gametes independently of the alleles of another gene. Therefore, genetic associations are less likely to be affected by confounding and have been increasingly used in causal inference [10, 31, 32]. Based on the positive findings of genetic associations, we further used a Mendelian randomisation approach, which can provide an unconfounded estimate of a causal relationship between exposure and disease outcome [37], to test the possible causality. Notably, birthweight itself may not be an exposure relevant to type 2 diabetes; instead, the exposures influencing intrauterine growth are more likely to play a causal role in the development of the disease. As an indicator of intrauterine growth, birthweight reflects the influence of intrauterine exposures on fetal growth. Using the genetic variants related to birthweight, our results lent support to a causal relation between low birthweight and increased risk of type 2 diabetes. Interestingly, the findings of the Mendelian randomisation analysis suggested that the association between genetically determined low birthweight and type 2 diabetes was mainly driven by the SNPs at the CCNL1 locus and on chromosome 5q11.2, which have so far not been related to any known risk factors for diabetes. Therefore, the findings are less likely to be affected by the pleiotropic effects of the genetic variants. Moreover, we performed a two-sample Mendelian randomisation analysis using the summary statistics for the latest type 2 diabetes GWAS, and both the summarised genetic association and Mendelian randomisation analysis results supported a potential causal relation.

The major strengths of the present study include the prospective design, the high-quality genetic data, the use of genetic association and Mendelian randomisation approaches, and replication using well-powered GWAS summary statistics. To our knowledge, this study is the first to investigate the potential causal relation between low birthweight and risk of type 2 diabetes. We also acknowledge several limitations. First, the study included only white participants, and future investigations in other ethnic populations are needed to verify our findings. Second, although we excluded the likely pleiotropic variants at the CDKAL1 and ADCY5 loci from the analysis, little is known about the mechanisms underlying the other five loci. It is possible that some or all of these loci could also influence the processes leading to type 2 diabetes independently of intrauterine growth. Third, we used individual SNPs rather than a GRS as instrumental variables in the Mendelian randomisation analysis, and the potential low power and weak instrument bias could affect the results. Moreover, as the Mendelian randomisation analysis was relatively small, future studies with larger sample sizes are warranted to confirm the findings.

In conclusion, we demonstrated that genetically lowered birthweight was associated with an increased risk of type 2 diabetes. Our findings validate the epidemiological observation of an inverse association between birthweight and type 2 diabetes, and provide new evidence of a role for intrauterine exposures in the pathogenesis of type 2 diabetes.