Introduction

Type 2 diabetes, a lifelong debilitating disorder with rapidly increasing prevalence [1], is characterised by chronic hyperglycaemia due to impaired insulin secretion and sensitivity. This epidemic has been ascribed to a collision between genetic susceptibility and change in the environment [2].

Whereas environmental triggers are well known, it has been more difficult to dissect the genetic causes of type 2 diabetes. Heritability of type 2 diabetes estimated as sibling relative risk (λS) is ∼3 in most unrelated populations but as high as 8 if there are several affected siblings in the family [3, 4]. Lifetime risk of diabetes in an offspring of a parent with type 2 diabetes is ∼40% and higher if the mother rather than the father has type 2 diabetes [5]. We have previously observed sex-specific parent-of-origin effects (POE) on intermediate traits like insulin secretion and lipids in offspring of parents with type 2 diabetes [5, 6]. Genome-wide association studies (GWAS) have revealed >65 loci associated with type 2 diabetes and related traits, although they account for less than 15% of the heritability of type 2 diabetes [2, 7–12]. Part of the missing heritability could be explained by POE, wherein a certain allele might confer risk when inherited from one parent but might be neutral or even protective when inherited from the other parent [13]. Such a situation would be missed in regular GWAS where it is assumed that both parental alleles are transmitted equally. The most plausible explanation for POE is methylation and, consequently, imprinting of one of the parental alleles. If this occurs during the fetal period, it could influence development of functional beta cell mass, reducing the ability of an individual to increase insulin secretion when exposed to an affluent westernised environment. Kong et al reported POE of four imprinted regions, including rs4731702 (KLF14), rs2334499 (near MOB2), rs231362 and rs2237892 (KCNQ1) [14], and this finding has been replicated in subsequent studies [15].

Whereas information on imprinting was the starting point for the above studies, little information is available on whether other single-nucleotide polymorphisms (SNPs) associated with type 2 diabetes or glycaemic traits show POE transmission and could thereby contribute to the excess risk observed with maternally transmitted diabetes. Such studies have been hampered by the paucity of family-based cohorts. To address this question, we assessed POE transmission of 72 SNPs associated with type 2 diabetes from previous GWAS studies (based upon results up to 2012) corresponding to 65 unique loci in two large family studies, the Botnia study from western Finland and southern Sweden and the Hungarian Transdanubian Biobank (HTB).

Methods

Study population and measurements

The Botnia study was initiated in 1990 in healthcare centres in the Botnia region in western Finland and subsequently extended to other parts of Finland and southern Sweden [5]. For the current study, all individuals from complete trios (DNA from both parents and at least one child) were selected, yielding 4,211 individuals, forming 2,322 trios from 1,083 families after exclusion of patients with type 1 diabetes, glutamic acid decarboxylase autoantibody (GAD) positivity or MODY (Table 1). Of these, 25.7% had a type 2 diabetes diagnosis based on the WHO 1998 criteria. Height, weight and waist and hip circumference were recorded and OGTT performed as reported previously [3]. Blood samples were drawn 10 min before the OGTT and then at 0, 30, 60 and 120 min. Plasma insulin was measured by ELISA (Dako, Ely, UK) [3, 5].

Table 1 Clinical and metabolic characteristics of the study population

The HTB was initiated in 1992 at the Hungarian Heart Center in Balatonfüred. Samples and data were collected from type 2 diabetes patients and their families. The HTB includes 9,279 individuals from 1,022 families from the west side of the Danube area in Hungary. For the current study, type 1 diabetes, GAD-positive and MODY individuals were excluded and 1,463 individuals were genotyped, of whom 29.3% had a type 2 diabetes diagnosis based on the WHO 1998 criteria (Table 1).

Informed consent was obtained from all the participants. The study protocols were approved by the ethics committees of Helsinki University Central Hospital and Lund University for the Botnia study and by local ethics committees for the HTB.

Glucose tolerance status

Glucose tolerance and type 2 diabetes were defined according to the WHO criteria [16]. Hyperglycaemia included type 2 diabetes + impaired fasting glucose (IFG) + impaired glucose tolerance (IGT) (fasting glucose ≥6.1 mmol/l or 2 h plasma glucose ≥7.8 mmol/l). Normal glucose tolerance was defined as FPG <6.1 mmol/l and 2 h plasma glucose <7.8 mmol/l. For intermediate phenotype definitions, see electronic supplementary material (ESM) Methods.
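
The cut-offs above can be written as a small classifier. This is a minimal sketch in Python rather than the analysis code used in the study; the function name `classify_glucose` and its argument names are hypothetical. It only separates hyperglycaemia from normal glucose tolerance using the thresholds quoted in this section and does not distinguish type 2 diabetes, IFG and IGT from one another.

```python
def classify_glucose(fasting_mmol_l, two_hour_mmol_l):
    """Classify glucose tolerance from fasting and 2 h OGTT plasma glucose.

    Thresholds are those quoted in the text: hyperglycaemia if fasting
    glucose >= 6.1 mmol/l or 2 h glucose >= 7.8 mmol/l; otherwise normal
    glucose tolerance. Function and argument names are illustrative.
    """
    if fasting_mmol_l >= 6.1 or two_hour_mmol_l >= 7.8:
        return "hyperglycaemia"
    return "normal glucose tolerance"


# Example: a fasting value below 6.1 with a 2 h value at the 7.8 cut-off
status = classify_glucose(5.4, 7.8)
```

Because the hyperglycaemia definition is an "or" condition, a single raised measurement is sufficient to leave the normal glucose tolerance group.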

Genotyping

Seventy-two SNPs from 65 loci which have shown consistent association with type 2 diabetes and related traits based upon GWAS data available in 2012 were selected for this study (ESM Table 1). Genotyping was performed on the Sequenom MassARRAY iPLEX Platform. The average genotyping success rate was >98% with >98% concordance in ∼6.5% replication samples. SNPs rs12779790, rs553668, rs231362 and rs2237892 were genotyped with an allelic discrimination assay-by-design method on ABI 7900 (Applied Biosystems, Warrington, UK). All SNPs were in Hardy–Weinberg equilibrium in unaffected parents (p >0.001) with the exception of SNPs rs17782313 and rs6467136 (p ≤0.001), which were excluded from analyses.
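
A Hardy–Weinberg check of the kind described above can be sketched as follows. The text does not state which test was used, so this example assumes the textbook one-degree-of-freedom chi-square goodness-of-fit test; the function name `hwe_chi2_p` and the genotype counts are hypothetical.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes and returns
    (chi2, p). Assumption: a plain goodness-of-fit test; exact tests
    are often preferred when the minor allele is rare.
    """
    n = n_aa + n_ab + n_bb
    p_a = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q_a = 1.0 - p_a
    if p_a in (0.0, 1.0):
        raise ValueError("monomorphic marker: HWE test is undefined")
    expected = (n * p_a * p_a, 2 * n * p_a * q_a, n * q_a * q_a)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts with a marked deficit of heterozygotes
chi2, p_value = hwe_chi2_p(40, 20, 40)
exclude_snp = p_value <= 0.001  # exclusion threshold quoted in the text
```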

Statistical analyses

Family-based tests for association

Association was tested by family-based association tests (FBATs), which accommodate any type of genetic model and family construction [17, 18], and transmission disequilibrium test (TDT) using the PLINK software (http://pngu.ngh.harvard.edu/~purcell/plink/) [19, 20]. A conservative power calculation on the present set of families suggests that the power to detect ORs of 1.5 using TDT at levels of type I error of 0.05 under the null hypothesis is 58%, 69% and 63% for risk alleles with frequencies of 0.1, 0.2 and 0.3, respectively [21].
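
For orientation, the basic TDT statistic can be sketched in a few lines. This is the standard McNemar-type form, (T − U)²/(T + U), computed from the numbers of times heterozygous parents transmit (T) or do not transmit (U) the risk allele to affected offspring; it is not PLINK's implementation, and the function name `tdt` and the counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Basic transmission disequilibrium test from heterozygous parents.

    Returns (chi2, p, odds_ratio), where chi2 has 1 degree of freedom
    and the odds ratio is estimated as T / U.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / total
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square, 1 df
    odds_ratio = (transmitted / untransmitted
                  if untransmitted else float("inf"))
    return chi2, p_value, odds_ratio


# Hypothetical example: 60 transmissions versus 40 non-transmissions
chi2, p_value, odds_ratio = tdt(60, 40)
```

Only heterozygous parents contribute to T and U, which is why the test loses power when the risk allele is rare.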

Mendelian errors within families were identified by Pedstats and PLINK. To increase power we used ParenTDT [22]. To obtain empirical p values, 10,000 gene-dropping adaptive permutations were performed.
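
The idea behind an empirical p value can be illustrated with a simplified permutation scheme. The sketch below is an assumption-laden stand-in, not the gene-dropping adaptive procedure used here: it treats each heterozygous parent as an independent fair coin flip under the null hypothesis, ignores pedigree structure and runs a fixed number of permutations. The (r + 1)/(n + 1) estimator and the function names are likewise illustrative choices.

```python
import random


def tdt_chi2(transmitted, untransmitted):
    """TDT statistic (T - U)^2 / (T + U)."""
    return (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)


def permutation_p(transmitted, untransmitted, n_perm=10_000, seed=1):
    """Empirical p value for the TDT by random re-assignment of transmissions.

    Under the null, each heterozygous parent transmits either allele
    with probability 0.5. Returns (r + 1) / (n_perm + 1), where r is the
    number of permuted statistics at least as large as the observed one.
    """
    rng = random.Random(seed)
    n = transmitted + untransmitted
    observed = tdt_chi2(transmitted, untransmitted)
    n_extreme = 0
    for _ in range(n_perm):
        t = sum(rng.random() < 0.5 for _ in range(n))
        if tdt_chi2(t, n - t) >= observed:
            n_extreme += 1
    return (n_extreme + 1) / (n_perm + 1)


# Hypothetical counts; fewer permutations than in the text to keep it fast
p_emp = permutation_p(60, 40, n_perm=2000)
```

Adaptive schemes stop permuting early for markers that are clearly non-significant, which saves computation without changing the interpretation of the resulting p value.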

To evaluate the association of SNPs with continuous traits, non-diabetic nuclear families were considered; all measures of glycaemia, insulin secretion and sensitivity were log-transformed. The number of alleles shared identical-by-descent (IBD estimates) was computed using Merlin (Merlin 1.1.2, http://csg.sph.umich.edu//abecasis/merlin/tour/ibd.html). Variance components including environmental, polygenic and additive were used to model the phenotypic similarities using maximum likelihood in the orthogonal model [23]. All measurements were adjusted for age, sex and BMI.

Parent-of-origin test

Parent-of-origin tests were performed to assess distorted transmission of risk alleles from each of the parents separately to the affected offspring using PLINK [20]. The difference between the transmissions of alleles from the father and mother was computed as a Z statistic and resulting p values were subjected to 10,000 gene-drop adaptive permutations to generate empirical p values while controlling for familial relationships.
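
A Z statistic comparing paternal and maternal transmissions can be sketched as below. This assumes one common formulation, the difference between the paternal and maternal log odds ratios divided by its approximate standard error; it is not claimed to reproduce PLINK's exact computation, and the function name `poe_z` and the counts are hypothetical.

```python
import math


def poe_z(pat_t, pat_u, mat_t, mat_u):
    """Z test for a difference between paternal and maternal transmission.

    pat_t / pat_u: risk alleles transmitted / not transmitted by
    heterozygous fathers; mat_t / mat_u: the same for mothers.
    Returns (z, two-sided p). Negative z means the maternal odds ratio
    exceeds the paternal one.
    """
    if min(pat_t, pat_u, mat_t, mat_u) <= 0:
        raise ValueError("all four counts must be positive")
    log_or_diff = math.log(pat_t / pat_u) - math.log(mat_t / mat_u)
    se = math.sqrt(1 / pat_t + 1 / pat_u + 1 / mat_t + 1 / mat_u)
    z = log_or_diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p
    return z, p_value


# Hypothetical counts: excess maternal, deficit of paternal transmission
z, p_value = poe_z(pat_t=20, pat_u=40, mat_t=40, mat_u=20)
```

An allele can show a POE by this measure even when the pooled TDT is null, because opposite parental effects cancel out when transmissions are combined.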

To assess maternal and paternal effects on continuous traits separately in nuclear families, the orthogonal model was implemented in qTDT after modelling environmental, polygenic and additive variances by maximum likelihood [23]. Analyses for BMI were adjusted for age and sex while other measurements were adjusted for age, sex and BMI. For all analyses, 10,000 Monte Carlo permutations were performed to obtain empirical p values [23]. Haplotype analyses for association with type 2 diabetes and parental asymmetry tests were performed using FAMHAP [24] using three-marker and four-marker haplotypes (for KCNQ1 variants).

Methylation data

Methylation analyses were performed on peripheral blood lymphocytes and islets and expression studies on the latter (see ESM Methods). The THADA variant showing a genetic POE was selected for analyses. In trios, β values were used to assess association of methylation with genotype taking parental origin into account. Differences in methylation between the two groups at individual CpG sites were assessed using the limma package in R Bioconductor [25] for larger groups and the Wilcoxon rank-sum test for small test groups. The β values at CpG sites were mapped to the genes of interest and plotted to provide an overview of methylation status for the region of interest. To assess how distinctive the pattern was for a given gene, the number of differentially methylated CpG probes was related to the total number of CpG probes, taking into account genes with fewer CpG probes, as follows: (number of differentially methylated CpG probes)/√(total number of CpG probes).
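
The per-gene ratio defined above is straightforward to compute. The sketch below uses hypothetical probe counts and a hypothetical function name, `dm_ratio`, to show how dividing by the square root of the probe count, rather than the count itself, behaves for genes with many versus few probes.

```python
import math


def dm_ratio(n_differential, n_total):
    """Differentially methylated CpG probes / sqrt(total CpG probes)."""
    if n_total <= 0:
        raise ValueError("a gene needs at least one CpG probe")
    return n_differential / math.sqrt(n_total)


# Hypothetical genes: well covered (36 probes) and poorly covered (4 probes)
well_covered = dm_ratio(9, 36)    # plain proportion would be 0.25
poorly_covered = dm_ratio(2, 4)   # plain proportion would be 0.50
```

With a plain proportion, the poorly covered gene would rank higher on the strength of only two probes; with the square-root denominator the well-covered gene scores higher, which is in keeping with the stated aim of accounting for genes with fewer probes.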

Disease association using case–control data

The THADA variant showing POE was also assessed for association with type 2 diabetes, metabolic, anthropometric and dietary traits in case–control studies from MCC, PPP and the Leipzig cohorts. See ESM Methods.

Results

Family-based replication of variants previously associated with type 2 diabetes in case–control studies

We first assessed whether the effects of the reported loci could be replicated in our families. Of the 72 type 2 diabetes risk variants tested (ESM Table 2) in 4,189 individuals from 1,083 families from the Botnia study, 18 variants showed association with type 2 diabetes and 16 with hyperglycaemia (type 2 diabetes + IFG + IGT) in at least one of the tests (TDT or FBAT) (Table 2 and ESM Table 2 a, b). With the exception of rs9470794 (intron of ZFAND3), the direction of risk was concordant with previously reported GWAS studies for all phenotypes tested. The rs163184 (KCNQ1) variant showed a strong association with type 2 diabetes (p FBAT  = 0.0028, p TDT  = 0.002) and hyperglycaemia (p FBAT  = 0.021, p TDT  = 0.014) in all the tests performed. rs7903146 (TCF7L2) and rs10885122 (near ADRA2A) showed an association with type 2 diabetes (p FBAT  = 0.007 and 0.021 and p TDT  = 0.01 and 0.03, respectively) in all the tests, with the former also being associated with hyperglycaemia (p COM = 0.012). rs10010131 (WFS1) and rs6017317 (HNF4A) were associated with type 2 diabetes and hyperglycaemia in FBAT and TDT (p FBAT  = 0.016 and 0.024 and p TDT  = 0.02 and 0.02, respectively) (Table 2).

Table 2 Replication of association with type 2 diabetes and hyperglycaemia

The rs7957197 variant in OASL was associated with risk of type 2 diabetes (p FBAT  = 0.028) (Table 2), whereas the rs864745 variant in JAZF1 and rs1801282 in the PPARG gene were associated with hyperglycaemia (p COM = 0.019) (Table 2) in at least one of the tests.

POE on risk of type 2 diabetes and hyperglycaemia

We first studied locus-by-locus POE transmissions of all selected alleles to offspring (first, with type 2 diabetes and, second, with hyperglycaemia) among 4,189 individuals from 1,083 families from the Botnia study.

Of the 72 SNPs tested, variants in THADA, KCNQ1 and CRY2 showed nominal POE on transmission to diabetic offspring. Additionally, other variants in KCNQ1 showed POE only on transmission to offspring with type 2 diabetes and can therefore be considered as a positive control, whereas those in PRC1 and near DGKB/TMEM5, and CDC123/CAMK1D genes showed POE on transmission to hyperglycaemic offspring only. In the following sections, we have analysed these variants in more detail (Table 3).

Table 3 Parental specific transmissions to diabetic offspring and hyperglycaemic offspring in families from the Botnia study

KCNQ1

Consistent with previously reported data [14, 15], the G allele in rs163184 showed an increased transmission from mothers to diabetic offspring (p MAT  = 0.008) and to hyperglycaemic offspring (p MAT  = 0.004, p POE  = 0.016) (Table 3 and Fig. 1). The rs2237895 SNP showed a stronger maternal than paternal transmission of risk allele C to diabetic offspring (p MAT  = 0.04, p POE  = 0.037) and hyperglycaemic offspring (p MAT  = 0.026, p POE  = 0.018) (Table 3 and Fig. 1). Another KCNQ1 intronic variant rs2237892 showed excess maternal transmission of risk allele C to hyperglycaemic offspring (p MAT  = 0.041) (Table 3 and Fig. 1). The four SNPs were in weak-to-moderate linkage disequilibrium (LD) suggesting that they may represent independent signals (ESM Fig. 1). Haplotype associations with three-marker (rs163184, rs2237895, rs2237892) and four-marker haplotypes showed no association, although parental asymmetry tests were significant for haplotypes based on four markers (Table 4).

Fig. 1

POE of KCNQ1 on type 2 diabetes and hyperglycaemia. (a, b) Transmission of maternal and paternal alleles of rs2237895 to diabetic offspring (a) and hyperglycaemic offspring (b). Risk allele C shows a statistically significant maternal effect and POE in both. (c, d) Transmission of maternal and paternal alleles of rs163184 to diabetic offspring (c) and hyperglycaemic offspring (d). Risk allele G shows statistically significant maternal effect. Black bars, risk alleles; white bars, non-risk alleles. *p < 0.05 and **p < 0.01

Table 4 Haplotype analyses based on rs2237892, rs163184 and rs2237895 and on rs231362, rs2237892, rs163184 and rs2237895

THADA

In the Botnia families, the risk T allele of the THADA variant rs7578597 showed a nominal POE on type 2 diabetes risk with excess transmission from mothers to diabetic offspring (p POE  = 0.01) (Table 3).

To assess whether this effect was restricted to the diabetes phenotype, we examined parental specific transmissions: (1) to all 1,945 offspring regardless of phenotype and (2) to trios with non-diabetic offspring using permutation to obtain empirical p values. The POE was neither seen when all 1,945 offspring were considered nor when the analysis was restricted to non-diabetic offspring trios, suggesting that POE was related to the hyperglycaemic phenotype (ESM Table 3).

The same pattern was seen in the Hungarian families, which showed higher transmission of the T allele from the mother, while the other allele, C, showed higher transmission from the father to type 2 diabetic offspring, with a statistically significant replication of the POE (p POE  = 0.045) (Table 5 and Fig. 2). A combined analysis of the two studies showed a significant POE for the SNP (p POE  = 0.0006) (Table 5 and Fig. 2).

Table 5 Parental specific transmissions of risk allele T in the missense coding SNP rs7578597 in THADA to diabetic offspring

Fig. 2

POE of rs7578597 (THADA) on type 2 diabetes. Risk allele T shows an excess transmission from the mother while the C allele shows an opposite trend with excess transmission from the father to diabetic offspring in Botnia (a), Hungary (b) and Botnia (lower bars) and Hungary (upper bars) combined (c). POE is significant in both studies and in the combined analyses. *p < 0.05, **p < 0.01, ***p < 0.001

Given the bidirectional effect of each of the parental alleles, we also tested whether the SNP or the region would show differences in DNA methylation pattern in peripheral blood lymphocytes (PBL) from a subset of trios and in human pancreatic islets from unrelated cadaver donors. No clear difference in methylation pattern was seen in blood between carriers of the maternal and paternal T allele but the low frequency (6.3%) of the minor (non-risk) C allele precluded statistical tests in the numbers studied. In contrast, 12 out of 43 CpG sites spanning the THADA gene tested showed differential methylation between diabetic and non-diabetic donor islets. Five CpG sites remained after correction for multiple testing, including the two CpG sites flanking rs7578597 (cg03647861 [p = 0.035] and cg25938803 [p = 0.033]) (Fig. 3). To assess how unusual this pattern of methylation was, we calculated the ratio of differentially methylated CpG probes to the square root of total number of CpG probes, accounting also for the poorly covered genes, considering the bias in the probe distribution on the chip. From this, we found that THADA had a p value of 0.075, with a ratio of 1.8, and scored right in between KCNQ1 (ratio = 4.5, p = 0.0021) and KLF14 (ratio = 0.4, p = 0.58) (ESM Figs 2 and 3). We also found differences in methylation at the same CpG sites between diabetic carriers of CT and TT genotypes, with CT genotype carriers showing lower methylation than TT genotype carriers (ESM Fig. 4).

Fig. 3

Methylation status for CpG sites spanning the THADA gene in PBLs from trios (Botnia) (a) and human islets (b). Forty-three CpG sites spanning the THADA gene were tested for differences in methylation status between cases (with type 2 diabetes) and controls, mapped to gene locations. β values showing degree of methylation are plotted on the y-axis. White squares with black lines, cases; black diamonds with dotted lines, controls. None of the CpG sites showed any difference in methylation status between cases and controls in the PBLs. Five CpG sites in the gene body region of donor islets showed a significant difference in methylation status (Δβ), as indicated by the arrows. Interestingly, two CpG sites flanking the SNP rs7578597 were among them. *p adj (p value after adjusting for multiple testing) <0.05

Next, we tested whether the SNP would influence the expression of THADA or a nearby gene PLEKHH2 in cis in four human tissues (skeletal muscle, adipose tissue, blood and islets of Langerhans) as the nine-base-pair insertion introduces a binding site for the transcription factor C/EBP. No significant effect of the variant was observed on expression of these two genes (37–47 individuals analysed per tissue; heterozygote n = 6, homozygote derived n = 36, p>0.05).

We did not observe any significant POE of THADA variants on glucose, insulin secretion and action, lipids, BMI, waist-to-hip circumference ratio, blood pressure or liver enzymes in the Botnia families (ESM Table 4). To expand the number of potential phenotypes, we also explored whether there would be differences in 27 different phenotypes between unrelated C and T allele carriers, but none of them reached significance (ESM Table 5), with the exception of a robust association with type 2 diabetes in a case–control cohort (n = 38,069 individuals) (Fig. 4).

Fig. 4

Forest plot of the meta-analysis of the association with type 2 diabetes in the four studies: Leipzig, Malmö Case Control (MCC), Malmö Diet and Cancer (MDC) and Botnia Prevalence, Prediction and Prevention of Diabetes (PPP). A significant association was observed for rs7578597, with an OR of 1.24 (95% CI 1.12, 1.36, p = 1.96 × 10⁻⁵)

Variants in CRY2 and PRC1, and near DGKB/TMEM5 and CDC123/CAMK1D

Variants in CRY2 showed POE on type 2 diabetes and hyperglycaemia whereas those in PRC1 and near DGKB/TMEM5, and CDC123/CAMK1D showed POE on hyperglycaemia, with the risk allele being transmitted from the father (Table 3).

Discussion

Heritability of a disease is determined by segregation of disease-promoting alleles in families. Given the paucity of family materials, most genetic studies have been carried out in outbred populations of unrelated individuals where heritability estimates cannot be calculated in a classical way. The present study is one of the first family-based studies assessing specific parental transmissions of alleles in relation to type 2 diabetes risk. Several previous observations have pointed to the possibility of distorted parental transmission of metabolic traits. We [5] and others [26, 27] have demonstrated higher risk of type 2 diabetes in offspring of diabetic mothers than in those of diabetic fathers. Furthermore, we reported that a parental history of type 2 diabetes influenced the insulin response to an oral glucose load, with male offspring of diabetic mothers showing the lowest insulin values [5].

The TDT is robust to population stratification but suffers from low power if the frequency of the risk allele is low, as the analyses are restricted to heterozygous parents. Combining the parental phenotypes in the form of the parental discordance tests somewhat enhances the power. We also applied the FBAT, which allows analysis of complex family structures and inclusion of all offspring. Only about one-quarter of variants previously associated with type 2 diabetes/hyperglycaemia in unrelated individuals showed association with type 2 diabetes in at least one of the tests in the present study. While this is most likely due to limited power, it cannot be ruled out that common variants detected in GWAS explain little of the clustering of type 2 diabetes seen in some families.

Among the reported GWAS associations, the strongest signal has been the TCF7L2 SNP rs7903146 [28]. This was also the strongest signal for association with type 2 diabetes in our parenTDT analysis (Table 2). Rare mutations in the HNF1A, HNF4A and WFS1 genes account for a substantial proportion of cases with familial MODY, while common variants are associated with type 2 diabetes [9, 29, 30]. Interestingly, variants in these genes were among those showing association with type 2 diabetes in our studied families.

For POE, we found no global signal when all type 2 diabetes risk loci were considered collectively. However, in a locus-by-locus analysis, we found nominal evidence for POE at the KCNQ1 and THADA loci. Here, we used KCNQ1 as a positive control, as variants in KCNQ1 have shown consistent parental specific associations with type 2 diabetes and hyperglycaemia in previous studies [9, 31]. Consistent with the findings of previous studies [14, 15], the risk allele showed a maternal transmission to affected offspring. We also observed significant differences in methylation status between diabetic and non-diabetic pancreatic islets, but we were not able to analyse which parental allele was methylated and possibly imprinted. KCNQ1 is known to be an imprinted gene, but this seems to be restricted to a relatively brief window of time during the fetal period. In a previous study, we observed that KCNQ1 and KCNQ1OT1 were mono-allelically expressed in fetal tissues but bi-allelically expressed in adult islets [32]. It is thus possible that fetal programming of genes involved in beta cell function could result in reduced functional beta cell mass, which could predispose to type 2 diabetes by impairing, later in life, the capacity to compensate for increased needs imposed by insulin resistance and obesity.

One of the key findings of the present study was an indication for POE with maternal transmission of the non-synonymous risk allele T of SNP rs7578597 in the THADA gene. Notably, the maternal effect was restricted to hyperglycaemic offspring and was not seen in non-diabetic offspring, indicative of a disease-specific effect. Imprinting often occurs through DNA methylation, so to explore patterns of methylation at the THADA locus we assessed the degree of DNA methylation in both PBLs and human islets. In both tissues tested, the gene showed distinctive patterns of methylation. Notably, five CpG sites out of 43 sites analysed (11.6%), including those flanking the rs7578597 SNP, showed a difference in the methylation status between diabetic and non-diabetic donor islets (more than expected by chance alone); no difference was seen in blood. A link between methylation and imprinting of the THADA gene is further supported by data showing allelic imbalance for THADA expression in adult human pancreatic islets [33]. However, expression of the THADA gene is not restricted to beta cells but is also seen in alpha and exocrine cells [34].

The parent-of-origin tests are based on the TDT, which is robust but suffers from reduced power when being analysed for transmission of rare alleles from heterozygous parents. While this parent-specific association was nominally significant in the families from Botnia, replication in families from Hungary, supported by methylation and allelic imbalance in expression in the islets, lends further support to the parent-of-origin associations.

The function of THADA (thyroid adenoma associated), which contains an ARM repeat for protein–protein interactions, is not known, but it has been shown to be the target of translocations in thyroid adenomas [35], suggesting a potential role in growth. Disruption of orthologues of THADA led to sucrose-dependent toxicity in Drosophila [36]. The THADA gene has also been ascribed an interesting role during evolution; THADA is located in a chromosomal region that was suggested to be subject to positive selection from Neanderthals to modern humans, as the Neanderthal genome was depleted of derived alleles [37].

If THADA was a ‘thrifty’ gene, positively selected during evolution, which phenotype would it influence? To address this question we related THADA variants to different phenotypes, including glycaemic, both in the trios and in unrelated individuals. Among possible ‘thrifty traits’, determinants of energy metabolism and muscle function have often been reported [38, 39]. However, of the traits tested (ESM Tables 4, 5), no clear POE was observed. Interestingly, a recent study showed THADA to be one of the strongest signals for cold adaptation [40], but such phenotypes were not available in the trios.

While there are very few reported imprinted genes, the true number of human imprinted genes is not known. Recent studies in mice and other species have shown that the actual number of imprinted genes is much higher than originally thought and that factors like tissue specificity and temporal effects of imprinting status need to be taken into account [41]. While THADA and KCNQ1 are examples of genes with such potential functions, future research should find an answer to the question: what benefits might such POE have for the offspring?