Background

Asthma is a multifactorial disease that is influenced by the interplay between genetic and environmental factors [1]. Studies have shown that the asthma prevalence in girls increases with puberty [1–4]. The mechanism behind this increase is not yet clear, though we propose that endocrine effects may be involved. Asthma and lung function may vary during different phases of the menstrual cycle, further suggesting a role of sex hormones in asthma [5–7]. In addition, it has been reported that both exogenous and endogenous sex hormones influence the occurrence of asthma in young women [8].

Because estrogen and progesterone are known to decrease the contractility of airway smooth muscle, their positive correlation with asthma is more likely driven by their effects on the immune system [9]. Specifically, progesterone stimulates IL-4 production and promotes T helper 2 (Th2) differentiation [10]. The immunological effects of estrogen include increased production of TNF-α by the lungs, increased production of IL-4 by the bone marrow, and thus migration of eosinophils during allergic inflammation [11]. Furthermore, estrogen decreased expression of T-regulatory cells [12], increased expression of IL-5 and IL-13 [13], increased the differentiation of naive CD4+ cells into Th2 cells [13], and also increased Th2 responses by augmenting the production of dendritic cells [14]. Hence, estrogen and progesterone may be linked to potential immunological effects and variation in airway responses.

Oral contraceptive pills (OCPs) are exogenous sex hormone preparations used primarily for birth control, but also for irregular menstruation, hirsutism, polycystic ovarian disease, and dysmenorrhea. Some studies find a positive association between OCP use and asthma [15, 16], while other studies find the opposite relationship [17, 18], and yet other studies identified no significant association [19, 20]. In addition, some studies have found early menarche to be associated with the risk of adult asthma [8, 15, 16], while other studies have reported no association [17, 18]. Overall, there is a lack of understanding of the association between age at menarche, sex hormones, and asthma.

The GATA3 gene, located on chromosome 10, encodes a transcription factor that is a master regulator of Th2 cell differentiation [19] and plays an important role in the production of cytokines [20, 21]. A study by Wada et al. in a mouse model of asthma demonstrated increased production of antigen-induced Th2 cytokines in the bronchial lymph node cells of female mice compared to male mice, which was associated with enhanced GATA3 expression [2], suggesting a possible role for sex in regulating the activity of GATA3.

The term epigenetics refers to the changes in phenotype or expression of genes that are not due to changes in the sequence of DNA [22]. Epigenetics is considered to play an important role in regulation and differentiation of T cells and asthma pathogenesis [23, 24]. In particular, DNA methylation (DNA-M) may regulate genes associated with asthma and allergy [25]. Some single nucleotide polymorphisms (SNPs) can act as methylation quantitative trait loci (methQTLs) to influence DNA-M at specific CpG sites, and may be conditional on environmental exposure [26–28]. To reflect both the genetic and environmental influences, we call these loci conditional methQTLs.

The role of sex hormones and the potential gender-related activity of GATA3 in asthma motivated us to study a possible interaction between oral contraceptives and GATA3 and further its association with asthma. Therefore, we hypothesize that exogenous or endogenous sex hormone exposure in interaction with genetic variants could be associated with DNA-M of GATA3, which in turn affects the risk of asthma at the age of 18 years. It is also important to understand whether the change in DNA-M is a cause or a consequence of the disease. To address this issue we use a two-stage model proposed by Karmaus et al., which incorporates both methQTLs and genetic variants [29]. In stage 1, we identify the conditional methQTLs (influenced by the use of OCPs) that may result in a change of the DNA-M of specific CpG sites of the GATA3 gene. These CpG sites differentially methylated depending on OCP use subsequently may modify the penetrance of certain SNPs, which then are called modifiable genetic variants (modGVs) [30, 31]. In stage 2, we evaluate the interaction of differentially methylated CpG sites with modGVs on asthma at the age of 18 years.

Age at menarche is related to changes in endogenous sex hormones and reflects body changes. In girls, the earlier the onset of puberty, the longer the exposure to sex hormones. Hence, we additionally ran our stage 1 model using age at menarche as an alternate indicator of a possible endocrine effect. Agreement between both exposures would further support our hypothesis.

Results

There were no significant differences in the prevalence of asthma, BMI, smoking at 18 years, maternal history of asthma, socioeconomic status, and median age at menarche between female offspring of the study group of this birth cohort that participated in the 18 year exam (n = 660) and those who were randomly selected for the DNA-M analysis (n = 245; Table 1). However, OCP missingness was different between the study group at 18 years and the 245 randomly selected girls. The difference is related to more missing information in the female study group at 18 years (n = 32, 4.9%) compared to the random selection of 245 with blood samples (n = 2, 0.8%). Ignoring missingness, the proportion of OCP use did not differ (P = 0.75). The reason for the missingness seems to be parental control at 18 years. In general, participants were interviewed separately in the study center; however, they could also mail the questionnaire or answer some questions on the phone. Fifteen of 16 girls with a mailed questionnaire did not answer the question on OCP use.

Table 1 Characteristics of subjects with available methylation data compared to the female participants of the total cohort

Among the female participants with methylation data, 12.2% had maternal history of asthma, 19.2% had mothers that smoked during pregnancy, 14.3% had asthma at 18 years, and 47.8% used OCPs at 18 years (44.4% of the girls in the study group at the 18-year exam). The median age at menarche was 13 years. Use of oral contraceptives and age at menarche in our sample are associated (Wilcoxon test: P = 0.001); 62% of the participants with age at menarche ≤11 years, 45.8% of those between 12 and 14 years, and 36% of those with ≥14 years used OCPs.

Of the thirteen GATA3 SNPs that were genotyped, seven SNPs (rs1269486, rs3802604, rs3824662, rs422628, rs434645, rs12412241, and rs406103) were selected for further analysis since these were uncorrelated (Figure 1). Of the seven SNPs that were analyzed, rs1269486 was located in the promoter, followed by four SNPs (rs3802604, rs3824662, rs422628, and rs406103) in introns, and two SNPs (rs434645, and rs12412241) downstream of the GATA3 gene (Table 2). The mean methylation levels (β value) of six of the 14 CpG sites of the GATA3 gene were low (<0.10; Table 3), four were highly methylated (>0.90), while four CpG sites showed wider variation in methylation between individuals with mean methylation between >0.10 and <0.55.

Figure 1

Linkage disequilibrium of GATA3 single nucleotide polymorphisms, standard (D’/LOD) color scheme; D’ LD values displayed.

Table 2 Single nucleotide polymorphisms (SNPs) for GATA3 and their genotypes
Table 3 Distribution of methylation on CpG sites of GATA3 gene

In stage 1, after controlling for cell type composition in peripheral blood, the interaction term ‘OCP use × rs1269486’ was found to be associated with differential methylation of cg17124583 (P value = 0.002; FDR P value = 0.04; Table 4), indicating that this SNP represents a conditional methQTL. OCP users with minor allele (AA) (the difference in a logit scale is -0.86; P value = 0.03) and heterozygous (AG) (the difference in a logit scale is -0.57; P value = 0.002) genotypes for rs1269486 had lower average methylation than those with the major (GG) genotype. The association was adjusted for potential confounders including socioeconomic status, smoking at 18 years, and BMI at 18 years. However, none of these potential confounders changed the interaction effect by more than 10%.

Table 4 Assessment of interaction of single nucleotide polymorphisms with oral contraceptive use, and with age at menarche on the methylation of the CpG site cg17124583 using linear regression a

To replicate the OCP usage model with an alternate indicator for endocrine effects, age at menarche was investigated (Table 4). Indeed, cg17124583 was differentially methylated by the interaction of the same SNP, rs1269486, and age at menarche (P = 0.0017). In girls with the minor and heterozygous genotypes for rs1269486, methylation levels at cg17124583 were found to be higher if age at menarche was higher. The interaction was statistically significant only in those with the minor genotype (the difference in logit scale is 0.42; P value = 0.003). Hence, both OCP use and age at menarche in interaction with rs1269486 were associated with differential methylation of cg17124583.

Interestingly, in a small sample of 34 paired DNA-M measurements, the differentially methylated CpG site cg17124583 showed some variability from 10 to 18 years (test for stability: ICC = 0.39, P = 0.01) with mean methylation levels of 0.06 and 0.05, respectively. This CpG site shows both stability and variability, but its variance was not explained by OCP use or by age at menarche (data not shown).

In the second stage, we analyzed whether methylation of cg17124583 modifies the association between SNPs and asthma at 18 years. We tested the interaction between seven SNPs and the methylation levels of cg17124583 (differentially methylated in stage 1), and its association with asthma at 18 years. We found statistically significant interactions of the SNPs rs434645 and rs422628 with cg17124583 that modify the risk of asthma at 18 years (Table 5). For rs434645, the minor (AA) and heterozygous (AG) genotypes were combined since the direction of effect on methylation was the same for both. The interaction of cg17124583 and rs434645 (AA/AG vs. GG) with asthma at 18 years was then tested using the common genotype (GG) as the reference. The interaction was significant (P = 0.01; Table 5); however, it did not survive correction for multiple testing with FDR. For the SNP rs422628, an additive genetic model was used to compare participants who had the minor (GG) and heterozygous (AG) genotypes with those who had the common (AA) genotype. The interaction term ‘cg17124583 × rs422628’ was statistically significantly associated with asthma in those with the heterozygous genotype after adjusting for multiple comparisons (P = 0.006; FDR adjusted P = 0.05; Table 5). The consecutive flow of assessments and its results is outlined in Figure 2.

The range of DNA-M for cg17124583 was 0.01 to 0.46. Since the number of participants at methylation levels of <0.02 and >0.14 was low, we grouped lower methylation levels into ≤0.02 (n = 4) and larger into ≥0.14 (n = 9). Descriptively, 157 participants had average methylation levels of 0.05 and less at this CpG, 71 participants had 0.06 to 0.09, and 17 participants had 0.10 to 0.46. For subjects with AG and GG genotypes, we examined the RRs for asthma at different levels of DNA methylation. Here we present RRs for the AG genotype for the following levels: 0.02, 0.04, 0.06, 0.08, 0.10, and 0.12 relative to subjects with AA genotype. For the AG genotype, the corresponding RRs of asthma are 0.31, 0.63, 1.31, 2.71, 5.62, and 11.65 (Figure 3). The respective 95% CI are found in the legend of Figure 3. Figure 3 shows that the relative risk (RR) for the rs422628 AG genotype relative to AA was higher when cg17124583 was more methylated.
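As an informal consistency check, the RRs reported above for the AG genotype can be compared step by step. The sketch below assumes only the published values; the interpretation that a roughly constant ratio between equally spaced methylation levels is what a model log-linear in methylation would produce is our reading, not a statement taken from Table 5.

```python
import math

# Methylation levels and RRs for rs422628 AG vs. AA, as reported in the text.
levels = [0.02, 0.04, 0.06, 0.08, 0.10, 0.12]
rrs = [0.31, 0.63, 1.31, 2.71, 5.62, 11.65]

# Ratio of each RR to the previous one (each step is 0.02 in methylation).
ratios = [later / earlier for earlier, later in zip(rrs, rrs[1:])]

# Implied change in log(RR) per 0.01 increase in methylation, per step.
log_rr_per_001 = [math.log(r) / 2.0 for r in ratios]

for level, ratio, slope in zip(levels[1:], ratios, log_rr_per_001):
    print(f"up to {level:.2f}: ratio = {ratio:.2f}, log(RR) per 0.01 = {slope:.3f}")
```

Each 0.02 step roughly doubles the RR, and the RR crosses 1 between methylation levels of 0.04 and 0.06.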

Table 5 Log-linear models of interaction between genetic variants (rs434645 and rs422628) with DNA methylation of cg17124583 in the GATA3 gene on the prevalence of asthma at 18 years a
Figure 2

Consecutive assessments of stage 1 (conditional methylation quantitative trait locus) and stage 2 (modifiable genetic variant).

Figure 3

Risk ratio of asthma at 18 years versus methylation at different genotypes of GATA3 rs422628: AG and GG compared to AA [reference].

Discussion

Of the 14 CpGs and the seven SNPs that were analyzed, we identified a conditional methQTL (rs1269486) interacting with OCP usage and with age at menarche leading to a differential DNA-M of cg17124583 (Figure 2). The same differentially methylated CpG site cg17124583 in interaction with another SNP, rs422628 (modGV), was found to modify the risk of asthma at 18 years. This association remained statistically significant after adjusting for multiple comparisons using FDR. To our knowledge, this is the first study to identify those SNPs in the GATA3 gene that in interaction with OCP use, and with age at menarche, are associated with differential methylation of GATA3 CpG sites and consecutively with asthma.

Although the CpG site cg17124583 is located 13,768 base pairs away from rs422628, we can see that the risk of asthma is modulated by this CpG site. It is possible that rs422628 is in linkage disequilibrium with another genetic variant, which is responsible for the functional effect on asthma risk and is adjacent to cg17124583, as has been previously observed for another gene [32].

The probability of a selection bias seems to be negligible as the study participants were randomly selected for the DNA-M analysis and for all but one variable there were no significant differences between the study population and the cohort girls who participated at 18 years. However, the proportion of missing information about the use of oral contraceptives at 18 years of age was higher in female cohort members. This was likely due to parental control, since nearly all girls whose questionnaire was mailed had missing information. We do not consider that this missingness related to parental control biases our results.

As the information on the use of OCPs is self-reported by the participants, there is a possibility of misclassification. However, previous studies have shown high agreement between questionnaire data and medical records for any OCP use, current use, and time since first use [33, 34]. In addition, since menarche is an important event in a woman’s life, misclassification of age at menarche is unlikely [35]. As the women were aware of neither their SNPs nor their methylation status, any recall bias would result in a non-differential misclassification and likely underestimate the true association. We repeated the analyses with a different exposure marker for endocrine effects, namely age at menarche, which showed a similar result. It therefore seems highly likely that the significant effects we observe on asthma risk represent an authentic link to endocrine events via differential methylation of the GATA3 gene.

The DNA-M in our study was obtained using the Illumina Infinium HumanMethylation450 beadchip array, which has been demonstrated to have high validity and high reproducibility [36]. As DNA-M can be tissue-specific, it is also important to consider whether the DNA-M obtained from peripheral blood represented methylation profiles in other tissues. This issue is currently under debate [37–40]. In addition, peripheral blood leukocytes represent a mixture of cells [41]. Using CpG site information, we estimated the relative contribution of cell type composition in peripheral blood using the Houseman approach [41]. The estimated cell type composition had only a minor influence on the DNA-M of GATA3 CpG sites (Table 4), suggesting that differences in the proportions of different leukocytes do not underlie the effects reported here.

In the regression models, we observed that, although the main effects of OCP use, age at menarche, and SNPs were not significantly associated with DNA-M of cg17124583, their interactions were found to be significantly associated even after penalizing for multiple testing. Similarly, no main effects were seen for the association of OCP use, cg17124583, and the SNPs on the risk for asthma at 18 years. However, the interaction of the SNP and DNA-M was found to be statistically significant. The importance of genome-epigenome interactions in disease is increasingly recognized [42]. For example, DNA-M at the IL4R locus interacts with a local SNP to increase the RR of asthma much more dramatically than does either genotype or methylation alone [43]. Likewise, DNA-M and genotype at the IL13 locus interact to influence lung function [44]. It is therefore of great importance to consider not only the disease risk imparted by the genome sequence, but how this is modified by DNA-M, which itself may be affected by environmental exposures.

Asthma being considered mainly a ‘Th2 disease’, we focused on the GATA3 gene because it is known to be the master regulator of Th2 cell differentiation [19] and has been linked to endocrine responses [45]. Estrogen is an immune modulator and is known to stimulate the production of Th2 cytokines, which include IL-4, IL-5, and IL-13 [11, 13]. Our findings show that OCP use modifies the DNA-M of GATA3 gene. We speculate that OCPs, which contain estrogen and progesterone [46, 47], may influence Th2 cytokine production via the differential methylation of GATA3 gene. Similar findings are seen with age at menarche altering the DNA-M of GATA3 gene. Statistically, although early age at menarche is related to use of OCPs, the two variables are not in complete agreement and seem to measure different features. Age at menarche is related to endogenous sex hormones [48], whereas OCPs are exogenous sex hormones. We believe that the agreement of our stage 1 findings between OCP use and age at menarche provides credence to our results. Our two-stage model suggests a potential pathway in which sex hormone-related exposures such as OCP use and age at menarche alter the DNA-M within GATA3 to subsequently affect the risk for asthma in girls at 18 years. We believe that using the two-stage model prevents reverse associations, namely that asthma initiates changes of CpG sites. In the first stage, cg17124583 was the only CpG site selected due to its relation with the interaction term of OCP use with one genetic variant (rs1269486) of the GATA3 gene and corroborated with age at menarche. Then only this CpG site was tested for an association with asthma at 18 years. However, it is not likely, but still possible that three variables (oral contraceptive use, asthma, and rs1269486) interacted in concert to influence the methylation of cg17124583.

A limitation of our study is that the RRs at methylation levels larger than 9% are only based on a limited number of individuals (n = 17). Another limitation is the lack of availability of the information on the type of OCP (estrogen/progesterone only pills or a combined pill), and length of time on the OCP which can further help to elucidate the role of either estrogen/progesterone or both in DNA methylation, genetic polymorphisms, and asthma.

Conclusions

This study represents the first report of an interaction of genetic variation and DNA-M of GATA3 on the risk for asthma at 18 years, which is modified by the use of OCP and age at menarche. The findings suggest a potential pathway in which OCP exposure and age at menarche, presumably via sex hormones, can alter the DNA-M of a GATA3 CpG site, which subsequently, in conjunction with genetic variants, influences the risk of asthma at 18 years. These findings provide a possible explanation for the increase in asthma prevalence in girls/women after puberty. Our results should motivate other researchers to search for interactions between genetic variants, sex hormones, and DNA-M.

Methods

Study design and population

A whole population birth cohort was established in the Isle of Wight, UK in 1989 to prospectively study the natural history and etiology of asthma and allergic conditions. The local research ethics committee (NRES Committee South Central - Hampshire B) approved the study and written informed consent was obtained from 1,456 children (January 1989 to February 1990), who were followed up at 1, 2, 4, 10, and 18 years. This Caucasian birth cohort has been described in detail elsewhere [49]. Questionnaires were completed for each child at every follow-up. Blood or saliva samples were collected at the ages of 10 and 18 years for genetic analysis.

Exposures

Information on OCP use was collected at 18 years. The question was: ‘Are you on the contraceptive pill?’ Age at menarche was assessed using the National Institute of Child and Human Development (NICHD) questionnaire from the Study of Early Child Care and Youth Development, which is based on the Pubertal Development Scale (PDS) method [50]. Among other questions on pubertal signs, the questionnaire asked: ‘How old were you when you started to menstruate?’

Outcome

Asthma information was collected using the International Study of Asthma and Allergies in Childhood (ISAAC) questionnaire [51]. The questions for assessing asthma were as follows: ‘History of physician diagnosed asthma?’, ‘Wheezing or whistling in the chest in the last 12 months?’ and ‘Asthma treatment in the last 12 months?’ Based on the answers to these questions, asthma at 18 years was defined by physician diagnosis of asthma plus current symptoms and/or currently on asthma medication.
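The case definition above can be written as a small rule. This is a minimal sketch; the function and argument names are hypothetical and simply mirror the three questionnaire items.

```python
def asthma_at_18(physician_diagnosed, wheeze_last_12m, treatment_last_12m):
    """Asthma at 18 years: physician diagnosis plus current symptoms
    and/or current asthma treatment (names are illustrative only)."""
    return bool(physician_diagnosed and (wheeze_last_12m or treatment_last_12m))


# A participant with a diagnosis but neither current wheeze nor treatment
# does not meet the definition.
print(asthma_at_18(True, False, False))
print(asthma_at_18(True, False, True))
```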

Genotyping

Genomic DNA was isolated from blood samples by using QIAamp DNA Blood Kits (Qiagen, Valencia, CA, USA) or the ABI PRISM 6100 Nucleic Acid PrepStation (Applied Biosystems, Foster City, CA, USA). In some cases genomic DNA was isolated from saliva using Oragene DNA Self Collection Kits (DNA Genotek, Ottawa, ON, Canada). Polymorphisms in the GATA3 gene were examined using the SNPper and Applied Biosystems databases. Genotyping was conducted by fluorogenic 5’ nuclease chemistry PCR using Assays-on-Demand kits cycled on a 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA, USA), or biotin-streptavidin-based pyrosequencing performed on PSQ-6 instrumentation (Biotage AB, Uppsala, Sweden). SNPs (n = 17) that tagged the GATA3 gene were identified using a tagger implemented in Haploview 4.2 using Caucasian Hapmap data, including 10 kb upstream and downstream of the GATA3 gene [52]. Estimates of linkage disequilibrium (LD) between SNPs were calculated using D’ and r2. An r2 value of 0.85 was the threshold for tagging, and seven SNPs were selected (1 SNP from each of the 5 haplotype blocks and 2 SNPs that did not have strong linkage disequilibrium with other SNPs, Figure 1).
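To illustrate the two LD measures, the sketch below computes D’ and r2 for a pair of biallelic SNPs from allele and haplotype frequencies. It assumes the haplotype frequency is already known; Haploview estimates it from unphased genotypes, which is not reproduced here. The frequencies in the example are hypothetical.

```python
def ld_stats(p_a, p_b, p_ab):
    """Return (|D'|, r2) for two biallelic SNPs.

    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2.
    p_ab: frequency of the A-B haplotype (assumed known).
    """
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D is scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d_prime, r2


# Hypothetical frequencies: complete LD by D' but r2 well below 0.85,
# so neither SNP would serve as a tag for the other at that threshold.
d_prime, r2 = ld_stats(p_a=0.3, p_b=0.2, p_ab=0.2)
print(round(d_prime, 2), round(r2, 2), r2 >= 0.85)
```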

DNA Methylation

Stored blood samples collected at 10 and 18 years were available on the Isle of Wight, UK. For the measurement of DNA methylation at 18 years in girls, the team in the United States provided a list of 245 random identification numbers to be selected from the samples on the Isle of Wight, UK. Then for additional DNA methylation analyses of samples when these 245 women were 10 years of age, we randomly selected 34 blood samples with 16 girls with asthma and 18 girls without. DNA methylation was assessed using Illumina Infinium HumanMethylation450 BeadChips (Illumina, Inc., San Diego, CA, USA). The 18-year samples were processed in one batch. In addition, DNA methylation data were available for a sample of 34 girls at 10 years of age processed in another batch. DNA from blood samples was extracted for methylation arraying using a salting out procedure. One microgram of DNA was bisulfite-treated for cytosine to thymine conversion using the EZ 96-DNA methylation kit (Zymo Research, CA, USA), following the manufacturer’s standard protocol. Arrays were processed using a standard protocol as described elsewhere [53]. The BeadChips were scanned using a BeadStation, and the methylation level (beta (β) value) was calculated for each queried CpG locus using the Methylation module of GenomeStudio software.

Covariates

Maternal history of asthma and maternal smoking during pregnancy were assessed by a questionnaire administered after birth. Information about the child’s active smoking status and body mass index (BMI) was collected from the 18-year questionnaire and anthropometric measurements conducted at the age of 18 years. Also assessed was ‘family social status cluster’, which is a composite variable derived from a combination of family income, parental occupation (socioeconomic status), and number of children in a child’s bedroom [54].

In addition, since DNA methylation found in peripheral blood cells depends on cell types, we adjusted all stage 1 models for cell mixture using the method proposed by Houseman et al. [41]. This method identifies CpGs within differentially methylated regions known to distinguish six types of white blood cells and then utilizes β values at these CpGs to predict the proportions of CD8+ T-cells, CD4+ T-cells, natural killer cells, B-cells, monocytes, and granulocytes for each blood sample. The rationale for using this method in stage 1 is to estimate the change in DNA-M that is due to differential methylation and not to a change in peripheral blood cell composition. Once we have identified such differential DNA-M, in stage 2 we are more interested in the concert of cells and their methylation level on the outcome asthma.

Statistical analysis

Preprocessing of the DNA-M data was undertaken using the IMA [55] package implemented in the R statistical computing package [56]. To identify tag-SNPs, LD between SNPs was calculated using D’ and r2 [57] and they were tested for Hardy-Weinberg equilibrium using Haploview 3.2 software [52]. DNA-M levels were quantified using β values that represent the proportion of methylated (M) over the sum of methylated and unmethylated (U) allele intensities (β = M/[c + M + U]), with c being a constant to prevent dividing by zero [58]. As the β value method has severe heteroscedasticity, it is recommended to use M-values (logit-transformed β values) for differential methylation analysis [59]. A logit transformation was employed for all β values to normalize their distribution. To assess whether the subset population (n = 245) represents the total cohort of girls at 18 years, χ2 tests were used.
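The β value and its logit transformation can be sketched as follows. Two choices here are assumptions rather than details given in the text: the offset c = 100, and base 2 for the logarithm, which is the usual convention for M-values. The intensities are hypothetical.

```python
import math


def beta_value(meth, unmeth, offset=100.0):
    """beta = M / (c + M + U); the offset c = 100 is an assumed default."""
    return meth / (offset + meth + unmeth)


def m_value(beta, base=2.0):
    """Logit-transform a beta value in (0, 1); base 2 is an assumption."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log(beta / (1.0 - beta), base)


# Hypothetical methylated and unmethylated intensities for one CpG site.
beta = beta_value(meth=500.0, unmeth=9400.0)  # 500 / (100 + 500 + 9400)
print(round(beta, 3), round(m_value(beta), 3))
```

A β value of 0.5 maps to an M-value of 0, and low β values such as those at cg17124583 map to negative M-values.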

In this study, 16 CpG sites that spanned the GATA3 gene were analyzed, out of which one CpG site was removed due to the presence of a probe SNP. A probe SNP is a single nucleotide polymorphism in the probe of 50 base-pairs used to determine the location of a methylated CpG site. A SNP in the 50 base-pair probe may interfere with the DNA-M measurement. A second CpG site had an average methylation level <0.05; we removed this site from analysis because CpGs that are either very highly (>0.95) or very lowly (<0.05) methylated have too little variance to be explained statistically. We added analyses on whether chip and positions had an influence on the M-values in the 245 samples. Neither chip nor positions showed significant effects or any substantial changes. In addition, in 34 female participants, stability of DNA-M in blood between 10 and 18 years was estimated using intraclass correlation coefficients (ICCs).
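For paired measurements such as those at 10 and 18 years, an ICC can be computed from a one-way analysis of variance. The text does not state which ICC form was used, so the one-way random-effects version below is an assumption, and the example values are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for n subjects with k measurements each.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-subject and within-subject mean squares.
    """
    n = len(pairs)
    k = len(pairs[0])
    grand_mean = sum(sum(p) for p in pairs) / (n * k)
    subject_means = [sum(p) / k for p in pairs]
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for p, m in zip(pairs, subject_means) for x in p
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical (age 10, age 18) methylation values for four participants.
example = [(0.05, 0.04), (0.08, 0.06), (0.03, 0.05), (0.10, 0.07)]
print(round(icc_oneway(example), 2))
```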

The aim of the first stage of the two-stage model was to detect CpG sites that were affected by an interaction of SNPs and OCP usage. We ran linear regression models, in which each of the 14 CpG sites was modeled against seven SNPs, each interacting with OCP use. Since we performed 98 tests (14 × 7), we adjusted for multiple testing by controlling the overall false discovery rate (FDR; overall FDR = 0.05) [60].
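A common way to control the FDR across a family of tests is the Benjamini-Hochberg step-up procedure. The sketch below assumes that procedure; the exact implementation used for the 98 stage 1 tests is not described here, and the P values in the example are hypothetical.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that adjusted
    # values never decrease as the raw P value increases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values from four interaction tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
print([round(p, 3) for p in adj])
print([p <= 0.05 for p in adj])
```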

Focusing on the CpG sites with significant interactions with OCPs, we then reran the analyses of stage 1 using age at menarche as exposure to determine if similar associations occurred with this marker of endocrine changes. If statistically significant associations are observed for both sex hormone exposures, then this strengthens the evidence that the association is related to sex hormones.

In the second stage model with asthma as the dependent variable, we used log-linear models (GENMOD procedure in SAS 9.3) to estimate statistical interactions between the methylation levels of CpG sites selected in stage 1 and GATA3 SNPs on the risk for asthma at age 18 years. These models included the following potential confounders: maternal history of asthma, maternal smoking during pregnancy, BMI at 18 years, smoking at 18 years, and socioeconomic status. Those confounders that changed the association of interest by 10% or more were retained as confounders in the final model. All hypotheses tested were corrected for multiple testing using the FDR. The statistical analyses were performed using the SAS statistical package (version 9.3; SAS Institute, Cary, NC, USA).
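The change-in-estimate rule for retaining confounders can be expressed as a simple check. This sketch assumes the relative change is measured against the unadjusted estimate; the text does not specify the denominator or the scale of the estimate, and the values shown are hypothetical.

```python
def changes_estimate(unadjusted, adjusted, threshold=0.10):
    """True if adding a covariate changes the estimate of interest by at
    least `threshold`, relative to the unadjusted estimate (an assumption)."""
    return abs(adjusted - unadjusted) / abs(unadjusted) >= threshold


# Hypothetical interaction estimates before and after adding a covariate.
print(changes_estimate(2.0, 2.3))  # 15% change: covariate retained
print(changes_estimate(2.0, 2.1))  # 5% change: covariate dropped
```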