Interaction between genetic and epigenetic variation defines gene expression patterns at the asthma-associated locus 17q12-q21 in lymphoblastoid cell lines

Phenotypic variation results from variation in gene expression, which is modulated by genetic and/or epigenetic factors. To understand the molecular basis of human disease, interaction between genetic and epigenetic factors needs to be taken into account. The asthma-associated region 17q12-q21 harbors three genes, the zona pellucida binding protein 2 (ZPBP2), gasdermin B (GSDMB) and ORM1-like 3 (ORMDL3), that show allele-specific differences in expression levels in lymphoblastoid cell lines (LCLs) and CD4+ T cells. Here, we report a molecular dissection of allele-specific transcriptional regulation of the genes within the chromosomal region 17q12-q21 combining in vitro transfection, formaldehyde-assisted isolation of regulatory elements, chromatin immunoprecipitation and DNA methylation assays in LCLs. We found that a single nucleotide polymorphism rs4795397 influences the activity of ZPBP2 promoter in vitro in an allele-dependent fashion, and also leads to nucleosome repositioning on the asthma-associated allele. However, variable methylation of exon 1 of ZPBP2 masks the strong genetic effect on ZPBP2 promoter activity in LCLs. In contrast, the ORMDL3 promoter is fully unmethylated, which allows detection of genetic effects on its transcription. We conclude that the cis-regulatory effects on 17q12-q21 gene expression result from interaction between several regulatory polymorphisms and epigenetic factors within the cis-regulatory haplotype region.


Introduction
Phenotypic variation is largely dependent on variation in gene expression levels. To identify the genetic determinants of phenotypic variation (including complex disease) in the human population, several genome-wide studies of genetically deWned diVerences in gene expression levels succeeded to map cis-regulatory polymorphisms for a proportion of genes with variable expression (Dixon et al. 2007;Ge et al. 2009;Goring et al. 2007;Pastinen et al. 2004;Verlaan et al. 2009b;Yan et al. 2002). In a number of regions, including the chromosomal region 17q12-q21, genetic cis-eVects act over several neighboring genes Lluis et al. 2011;Verlaan et al. 2009a, b). Genome-wide association studies (GWAS) of gene expression in LCLs (Verlaan et al. 2009a, b) detected allele-speciWc diVerences in the expression of three genes: zona pellucida binding protein 2 (ZPBP2), ORM1-like 3 (S. cerevisiae) (ORMDL3) and gasdermin B (GSDMB) located in 17q12-q21 (Fig. 1a). This genomic interval is also associated with predisposition to early onset asthma, Crohn disease, ulcerative colitis and rheumatoid arthritis Barrett et al. 2008;MoVatt et al. 2007MoVatt et al. , 2011Stahl et al. 2010). A cis-regulatory region responsible for the observed allele-speciWc diVerences in expression in CEPH LCLs has been mapped to a 160-kb long genomic interval that overlaps IKAROS family zinc Wnger 3 (Aiolos) (IKZF3), ZPBP2, GSDMB and ORMDL3 (Verlaan et al. 2009a) (Fig. 1a). Two common cis-regulatory haplotypes, the asthma-associated HapA and the non-asthma associated HapB (also harboring the risk alleles for Crohn disease, ulcerative colitis and rheumatoid arthritis) have been delineated (Verlaan et al. 2009a) (Fig. 1a, b). HapA is associated with higher expression of ORMDL3 and GSDMB and lower expression of ZPBP2 whereas HapB is associated with an opposite pattern of gene expression, i.e. lower expression of ORMDL3 and GSDMB and higher expression of ZPBP2. Expression of IKZF3 is similar for both haplotypes (Verlaan et al. 2009a). Elucidation of the regulatory mechanism that underlies the eVect of common polymorphisms on gene regulation is essential for the understanding of pathogenesis of asthma and other autoimmune diseases; therefore a search for functional cisregulatory polymorphisms was undertaken. This search identiWed SNP rs12936231 that modiWes a CTCF-binding site and inXuences nucleosome occupancy (Verlaan et al. 2009a). Suggestive functional results were found for several other SNPs from the candidate regulatory region. To further elucidate the transcriptional control of this asthma-associated locus, we focused on the interaction between genetic and epigenetic factors in the promoter regions of the three genes whose expression depends upon the cis-regulatory haplotype.

Materials and Methods
Cell culture of lymphoblastoid cell lines HapMap LCLs were purchased from the Coriell Cell Repositories (Camden, NJ) and grown in T75 Xasks in 1£ RPMI 1640 Media (Invitrogen, Carlsbad, CA) (with 2 mM L-glutamine, 15% fetal bovine serum and 1% penicillin/ streptomycin) at 37°C with 5% CO 2 . For formaldehydeassisted isolation of regulatory elements (FAIRE) and chromatin immunoprecipitation (ChIP) assays, LCLs were grown to 90% conXuence. Two independent cultures of cells were used for the FAIRE assay (input and FAIRE-treated cells).

Transient transfection assays
To test for allelic activity, haplotype-speciWc constructs were subcloned into a pGL3 vector containing a WreXy luciferase reporter gene either without a promoter or with an SV40 promoter (Promega, Madison, WI) using a previously published method (Belanger et al. 2005). All constructs were tested in Wve diVerent human immortalized cell lines: cervical cancer (HeLa), choriocarcinoma (Jeg3), hepatocellular liver carcinoma (HepG2), osteosarcoma (MG-63) and CD4+ T-cell lymphoblast-like (Jurkat). These cell lines were transfected using lipofectamine ™ 2000 according to the manufacturer's protocol (Invitrogen, Carlsbad, CA). To control for transfection eYciency, the measurement of the WreXy luciferase was normalized to the measurement of the Renilla luciferase. Experiments were performed in quadruplicate, the activities of the two luciferases were measured 24 h after transfection and allelic haplotypes for each SNP were compared. Statistical signiWcance (P value) was determined using an unpaired Student's t test.
Formaldehyde-assisted isolation of regulatory elements (FAIRE) assays The FAIRE procedure was performed as described (Giresi et al. 2007) with some modiWcations (Verlaan et al. 2009a). To test for FAIRE enrichment of speciWc SNP regions, 200-400 ng of DNA was ampliWed by PCR. For SNP regions that showed FAIRE enrichment, normalized Sanger sequencing was done. FAIRE-treated DNA samples were compared to the input DNA samples and normalized allelic ratios were calculated. The primers used for FAIRE analysis are listed in supplementary Table 2S.

Sodium bisulWte sequencing methylation analysis
To establish the methylation patterns of regulatory regions, 0.5-2 g of DNA was treated with sodium bisul-Wte as previously described (Clark et al. 1994) with modi-Wcations (Saferali et al. 2010). Assays were designed for each of the regions of interest. Nested PCR was performed for each of the loci. PCR products were puriWed using the MinElute gel extraction kit (Qiagen, Hilden, Germany) and cloned using the TOPO TA cloning kit (Invitrogen, Carlsbad, CA). The sequencing was done by the sequencing platform of the McGill University and Genome Quebec Innovation Centre. On average, 20 clones per sample were sequenced. Characteristics of regions, primers and PCR conditions are summarized in Supplementary Table 1S.  (hg18, chr17: 35,179,985-35,339,296) associated with allelic expression of ORMDL3 and GSDMB (Verlaan et al. 2009a). b Positions of the common SNPs that form the cis-regulatory haplotype. For each SNP, the genotype associated with the haplotype A and B (HapA and HapB, respectively) is indicated on the top. HapA harbors the asthma-associated alleles and HapB harbors the non-asthma associated alleles. c Relative positions of the regions analyzed for in vitro promoter or enhancer activity. d Allelic diVerences in ZPBP2 promoter activity in vitro. The ZPBP2 promoter (region 2) that contains the rs4795397-A allele shows stronger promoter activity in vitro. A pGL3Basic plasmid has been used as negative control. The Y-axis indicates fold increase in transcription. Statistically signiWcant allelic diVerences are indicated by asterisks; e allele-speciWc nucleosome occupancy detected by FAIRE at the rs4795397 region. Chromatograms for the input and FAIRE-enriched samples are shown. The position of the SNP rs4795397 is indicated by an arrow

Allelic diVerences in ZPBP2 and ORMDL3 promoter activity
To determine to what extent allelic diVerences in gene expression levels in the 17q12-q21 region were deWned by genetic polymorphisms within gene promoters, the activity of annotated promoter regions of ZPBP2, GSDMB and ORMDL3 was tested in in vitro transfection assays in Wve diVerent cell types (Table 1; Fig. 1c). The annotated GSDMB promoter region did not show signiWcant promoter activity in any of the cell types tested (region 4, Table 1). Two putative ORMDL3 promoter regions were tested. The promoter region for the major ORMDL3 isoform showed high promoter activity in all tested cell lines with no allelic eVect (region 7, Table 1), whereas the putative promoter region for the minor isoform of ORMDL3 (region 6) that included SNP rs12603332 (C/T) showed promoter activity in MG63 cells with a strong allelic eVect. The construct that carried the haplotype HapA-associated rs12603332-C allele had higher promoter activity (P < 0.01, Student's t test) (Table 1). However, exome sequencing data suggest that this promoter is not active in LCLs ). The construct including both promoters maintained high promoter activity; however, the allelic eVect was lost (Table 1).
In the ZPBP2 promoter region, the construct that carried the HapA-associated rs4795397-A allele in HeLa, Jeg3, HepG2 and Jurkat cells showed higher promoter activity (P < 0.01, Student's t test) ( Fig. 1d and region 2; Table 1). A partially overlapping construct that contained the transcriptional start site, exons 1 and 2 of ZPBP2 was active only in JEG3 cells and showed a signiWcant allelic eVect (P < 0.05, Student's t test) (region 3, Table 1).
In conclusion, the asthma-associated HapA haplotype variants of the ZPBP2 promoter region and the putative promoter for the minor ORMDL3 isoform had higher in vitro promoter activity compared to the variants associated with the HapB haplotype.

Allele-speciWc regulatory elements
Cis-regulatory allelic eVects may arise from allele-speciWc diVerences in transcription factor binding and enhancer activity (Agueda et al. 2011;Bickel et al. 2011;Colombo et al. 2011;Harmon et al. 2010;Mertens et al. 2010). Transcription factor binding to a regulatory DNA element usually results in repositioning of nucleosomes. Allelic eVects of putative regulatory SNPs on nucleosome positioning were explored using the FAIRE assay that identiWes DNA regions with reduced nucleosome occupancy, i.e. regions potentially associated with transcription factors (Giresi  al. 2007). Two of the 22 tested SNP regions, rs12936231 and rs4795397, showed both an overall FAIRE enrichment and allelic diVerences in nucleosome occupancy (Verlaan et al. 2009a). The eVect of SNP rs12936231 on nucleosome occupancy and CTCF-binding has been described in detail elsewhere (Verlaan et al. 2009a). The SNP rs4795397 residing in the proximal promoter region of ZPBP2 also inXuenced FAIRE enrichment in Wve of six heterozygous LCLs tested. The rs4795397 A-allele had about twofold higher FAIRE enrichment than the rs4795397-G allele (Fig. 1e). The A-allele is part of the asthma-associated haplotype HapA and is associated with lower expression level of ZPBP2 in CEU LCLs. However, it shows higher promoter activity in vitro (Fig. 1d). Hence, overall our data indicate that the allele that confers higher promoter activity in in vitro gene reporter assays and is associated with reduced nucleosome occupancy, i.e. with transcription factors in vivo, surprisingly, is the same allele that is associated with lower expression levels of the ZPBP2 gene.
The Encode ChIP-sequencing results show enrichment of at least twelve transcription factors within the rs4795397 region (Myers et al. 2011;Raney et al. 2011). These include the nuclear factor of kappa light polypeptide gene enhancer in B-cells 1 (NF B) p65 subunit, which is a central player in inXammation and immunity, RNA polymerase II (RNA POL II), and the transcriptional co-activator E1A binding protein p300 (EP300) (supplementary Fig. 1S). Analysis of the DNA sequence of the rs4795397 region (Transcription Element Search System database, http:// www.cbil.upenn.edu/cgi-bin/tess) predicts binding sites for Yin and Yang 1 (YY1) and the CCAAT/enhancer binding protein (C/EBP) alpha transcription factors that overlap the SNP. To determine if SNP rs4795397 inXuences transcription factor binding in vivo, enrichment with NF B, EP300, YY1, C/EBP alpha, RNA POL II and insulator protein CTCF was tested in LCLs that were homozygous for either the rs4795397-A or the rs4795397-G allele using ChIP. The region was enriched with NF B, YY1, EP300 and RNA POL II (Table 2). High inter-individual variation between cell lines with respect to transcription factor enrichment and no statistically signiWcant eVect of the genotype were observed. We tested the allelic eVect on NF B alpha and RNA POL II enrichment using ChIP followed by Sanger sequencing in two heterozygous cell lines. No signiWcant allelic diVerences in enrichment were detected (Table 2). We conclude that NF B, RNA POL II, YY1 and EP300 bind both alleles in the rs4795397 region.
The rs4795397 region was also highly enriched for the active histone mark H3Ac, and showed low enrichment for the inactive histone mark H3K27me3 that were also independent from genotype (Table 2). C/EBP alpha ChIP results were not conclusive as enrichment was not detected in any of the regions tested including positive controls, perhaps due to antibody speciWcity.
The transcriptional control of genes within the 17q12-q21 chromosomal region is poorly understood and enhancers that regulate ORMDL3 and GSDMB expression have not been yet identiWed. To locate putative enhancers, we searched the publically available data [UCSC database (Raney et al. 2011;Myers et al. 2011)] for genomic regions that were enriched for enhancer-speciWc epigenetic marks e.g. histones H3K4me1 and H3K27Ac; and/or the transcriptional co-activator E1A binding protein p300 (EP300) (supplementary Fig. 1S). These regions were tested for in vitro enhancer activity ( Fig. 1; Table 3 and supplementary Fig. 1S). The candidate enhancer region overlapping with the 5Ј region of the ZPBP2 gene was too large and had to be tested as 3 separate overlapping constructs (regions 1-3 in Table 3). Enhancer activity was detected for the ZPBP2 promoter region (region 2) in Jeg3 and MG63 cells, for region 1 in Jeg3 cells; for the ORMDL3 promoter (region 6) and 3Ј regions in MG63 cells; for the ORMDL3 promoter (region 7) in all cell lines except Jurkat cells (Table 3). SigniWcant allelic eVects were observed for regions 2 and 3 (Table 3).
Collectively, our data demonstrate that the common SNP rs4795397 is a regulatory polymorphism that aVects promoter activity, nucleosome positioning and is part of an enhancer region. Monoallelic expression of certain X-linked and imprinted genes results from allelic diVerences in promoter methylation. To determine if promoter methylation had an eVect on the expression of the 17q12-q21 genes in LCLs, methylation proWles of the annotated IKZF3, ZPBP2, GSDMB, ORMDL3 and GSDMA promoters and Wrst exons were determined (Fig. 2). The ORMDL3 and IKZF3 promoters were unmethylated in all tested cell lines independent from their genotypes (supplementary Figs. 2S, 3S). The annotated GSDMB promoter and exon 1 of isoform 2 were highly methylated in all genotypes [11 LCLs were tested, (supplementary Fig. 4S)] suggesting that transcription of the major annotated isoform 2 of GSDMB was suppressed in LCLs, which is in agreement with the exome sequencing data (Fig. 2a). It is worth noting, however, that the haplotype HapA contains polymorphisms that abolish three out of seven CG sites in the annotated GSDMB promoter. Moreover, this region had slightly lower mean methylation levels in LCLs that were homozygous for the HapA haplotype (n = 3; mean methylation level 79.7%) compared to LCLs that were heterozygous (n = 4, mean methylation level 95.7%) or homozygous for the HapB haplotype (n = 4, mean methylation level 95.6%). For all 11 LCLs, the methylation level of the GSDMB promoter and exon 1 was inversely correlated with RNA abundance (Pearson's correlation coeYcient r = ¡0.63, = 0.05). In contrast to ORMDL3 and GSDMB promoters, the ZPBP2 promoter and exon 1 region showed highly variable DNA methylation patterns both within and between cell lines (Fig. 3a). Sixteen LCLs were tested. The ZPBP2 promoter was highly methylated in cell lines homozygous for the asthma-associated HapA haplotype (n = 5) and in heterozygous cell lines (n = 6), but had lower methylation levels in cell lines that were homozygous for the non-asthma associated HapB haplotype (n = 5). To determine if ZPBP2 methylation depended upon the parental origin of the allele, we compared the methylation proWles of maternal and paternal alleles in two LCL DNA samples, NA10838 and NA12878. No signiWcant parental origin eVect was detected.
Comparison of ZPBP2 promoter and exon 1 methylation and expression levels showed a strong inverse correlation between methylation of exon 1 and ZPBP2 RNA levels (Pearson's correlation coeYcient R = ¡0.88, = 0.05) (Fig. 3b). Hence, methylation levels of ZPBP2 exon 1 inXuence ZPBP2 RNA levels and explain the apparent contradiction between high in vitro activity and FAIRE enrichment of the ZPBP2 promoter region and lower expression of ZPBP2 in LCLs that carry the asthma-associated haplotype HapA.
GSDMA is not expressed in LCLs (Fig. 2a), therefore LCLs are not the appropriate model for testing genetic Table 3 Enhancer activity of putative regulatory regions tested using in vitro transfection assays * SigniWcant allelic eVect P cis-regulatory eVects on its expression. However, increased expression of GSDMA was found in cord blood lymphocytes of individuals that carry the asthma-associated 17q12-q21 alleles (Lluis et al. 2011), suggesting that GSDMA cannot be excluded from the list of putative asthma genes. Hence, to obtain a complete picture of promoter methylation in the 17q12-q21 region we determined the methylation proWle of the GSDMA promoter region in LCLs and found interindividual variation among LCLs with respect to methylation levels (supplementary Fig. 5S).
We also tested the methylation proWle of the rs4795397 region for allelic eVects and found that it was unmethylated independent of genotype (supplementary Fig. 6S).
Overall, the methylation proWles of promoter regions show a good correlation with the expression levels of respective genes, i.e. highly expressed transcripts such as IKZF3 and ORMDL3 have completely unmethylated promoters, while genes with even partial promoter methylation show a considerably reduced transcriptional activity.

Discussion
The asthma-associated chromosomal region 17q12-q21 harbors several genes that show allelic diVerences in expression in LCL. Our data suggest that allelic variation in expression arises from the interaction between several genetic polymorphisms and epigenetic factors. We have previously reported the eVect of the common SNP rs12936231 on CTCF binding and nucleosome occupancy (Verlaan et al. 2009a). In the present study, we demonstrate that another common SNP, rs4795397 that is part of the cisregulatory haplotype and is located within the promoter region of the ZPBP2 is a putative functional polymorphism that shows allele-speciWc nucleosome occupancy and in vitro promoter activity. The rs4795397 region is enriched with YY1 and co-activator protein EP300. YY1 and EP300 are known to form regulatory complexes that may repress (Galvin and Shi 1997;Lee et al. 1995) or activate (Mokrani et al. 2006, Baumeister et al. 2005 gene transcription in response to diVerent stimuli including endoplasmic reticulum stress and viral infection. The rs4795397 region is enriched with the active histone mark H3K9Ac, but not the repressive chromatin mark H3K27me3, an observation which is consistent with the histone acetyltransferase activity of EP300 (Ogryzko et al. 1996). Overall, the ChIP and FAIRE results indicate an active chromatin state at the rs4795397 region. Furthermore, our data show that although rs4795397 has a strong inXuence on promoter activity in vitro, in LCLs, its eVect on ZPBP2 transcription methylation results for the 17q12-21 region. Red rectangles below the diagram reXect the relative RNA abundance for genes in the region is masked by DNA methylation of exon 1 of the ZPBP2 gene. Moreover, DNA methylation levels of the ZPBP2 exon 1 seem to depend upon the cis-regulatory haplotype as only LCLs that are homozygous for the HapB haplotype have lower exon 1 methylation and higher ZPBP2 RNA levels (Fig. 3).
In summary, our data show that most allele-speciWc regulatory eVects such as nucleosome occupancy, DNA methylation, and in vitro promoter and enhancer activity localize in a 5.3-kb region overlapping with the ZPBP2 gene at least 31 kb away from the ORMDL3 gene that shows allelic diVerences in expression [ (Verlaan et al. 2009a) and this work]. The sum of our data suggests that this region harbors a strong enhancer. Our conclusions are also consistent with the Chromatin State Segmentation by HMM mapping results (http://genome.ucsc.edu/EncodeBroadHmm) . It remains to be determined if the ZPBP2 enhancer region exerts a long-range regulatory eVect that extends beyond the ZPBP2 gene and contributes to the allele-speciWc diVerences in the expression of ORMDL3 and other genes in the region (Verlaan et al. 2009a).
The functional SNP rs4795397 is located within the promoter region of ZPBP2, a gene whose importance for fertilization and male fertility has been demonstrated in both mice and humans (Lin et al. 2007;Redgrove et al. 2011). The rs4795397-A allele that boosts the ZPBP2 promoter activity in vitro is also part of the asthma-associated haplotype HapA. The exon 1 of ZPBP2 is unmethylated in human sperm (S. Berlivet and A. Naumova, unpublished) and cannot block the allelic eVect of rs4795397 on gene expression. Therefore, it is conceivable that spermatozoa from male carriers of the asthma-associated rs4795397-A allele have a higher supply of the ZPBP2 protein and potentially that are located within exon 1 of ZPBP2 and have greater variability in DNA methylation are shown in the Wgure. b Negative correlation between ZPBP2 exon 1 methylation and ZPBP2 RNA abundance. ZPBP2 expression was evaluated using real-time RT-PCR and normalized to the 18S RNA levels as described in (Verlaan et al. 2009a) an increased fertilization capacity. This may provide a slight advantage at the population level and lead to an increased transmission of the asthma-associated haplotype from fathers to oVspring.
Our results provide an example where inter-individual variation in DNA methylation acts as a modiWer of genetic inXuences on gene expression and may interfere with genetic mapping of cis-regulatory polymorphisms by attenuating the genetic eVect on transcription and thereby the signiWcance of genetic association results as in the case of the ZPBP2 gene. Based on our results, we speculate that promoters and Wrst exons of genes that show genetic cis-eVect on expression levels with genome-wide statistical signiWcance are likely not methylated at all, or else may have allele-speciWc methylation, where the low expressing allele would have high promoter methylation and vice versa.
DNA methylation of promoter and enhancer elements varies with cell type and/or developmental stage (Eckhardt et al. 2006;Ghosh et al. 2010). Therefore, cell-type speciWc DNA methylation patterns have to be taken into account in the search for candidate disease gene. Several lines of evidence point to ORMDL3 as the best 17q21 candidate causal gene for childhood asthma. Its expression levels show association with genotype in LCLs (MoVatt et al. 2007;Verlaan et al. 2009a, b), T lymphocytes (Murphy et al. 2010) and cord blood lymphocytes (Lluis et al. 2011). ORMDL3 is also expressed in bronchial epithelial cells and its RNA levels are slightly higher in asthmatic subjects compared to controls, whereas GSDMB and ZPBP2 are practically not expressed (Bochkov et al. 2010). Whether or not ORMDL3 is the causal gene responsible for predisposition to asthma remains to be addressed using other approaches. It is important to note, however, that while the current experimental evidence excludes IKZF3 (IKZF3 is not aVected by the haplotype eVect in LCLs or T lymphocytes), it is not suYcient for ruling out ZPBP2, GSDMB or GSDMA as contributors to predisposition to asthma. Allelic transcription of GSDMB in LCLs and T lymphocytes has been previously demonstrated (Verlaan et al. 2009a, b;Murphy et al. 2010). As for ZPBP2 and GSDMA, it is possible that in certain cell types their promoters may be unmethylated and their transcription may also depend on the haplotype. To exclude ZPBP2, GSDMB and GSDMA and narrow down the list of candidate genes for predisposition to asthma expression studies in cell types that are relevant for the etiology of asthma are necessary.