Background

The aryl hydrocarbon receptor (AhR) is a ligand activated transcription factor (TF) belonging to the basic-helix-loop-helix-PAS (bHLH-PAS) family of proteins that serve as environmental sensors [1]. 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) is the prototypical AhR ligand, a ubiquitous environmental contaminant that elicits diverse species-specific effects, including tumor promotion, teratogenesis, hepatotoxicity, modulation of endocrine systems, immunotoxicity and enzyme induction [2, 3]. These effects result from alterations in gene expression mediated by the AhR [4]. Several studies have demonstrated the requirement for the AhR in mediating TCDD-elicited responses. For example, mice carrying low-affinity AhR alleles are less susceptible to the effects elicited by TCDD [5]. Additionally, AhR-null mice fail to induce responses typically observed following treatment with TCDD and related compounds [6].

TCDD binding to the cytosolic AhR results in a conformational change and translocation to the nucleus. The activated AhR complex heterodimerizes with the aryl hydrocarbon receptor nuclear translocator (ARNT), another bHLH-PAS family member, and binds dioxin response elements (DREs) containing the substitution intolerant 5'-GCGTG-3' core sequence to regulate changes in gene expression [4, 7]. Computational searches for all DRE cores in the human, mouse and rat genomes identified the highest density of DREs proximal to a transcriptional start site (TSS) [8]. However, a significant number of DRE cores and putative functional DREs have been identified in distal regions within non-coding intergenic segments of the genome. It has been proposed that enrichments for other TFs on outlying regions may be functionally relevant through tertiary looping of genomic DNA and/or via protein tethering mechanisms [9].

The role of specific transcriptional regulators has been studied on a gene-by-gene basis, primarily focusing on regions proximal to the TSS. However, the coupling of chromatin immunoprecipitation with either genomic tiling microarrays (ChIP-chip) or next-generation sequencing (ChIP-seq) has facilitated genome-wide analysis of protein-DNA interactions for a variety of receptors [10-16], TFs [17-20] and components of the basal transcriptional machinery [10, 21, 22]. Genome-wide location analyses further suggest that TF binding at cis-regulatory enhancers in intergenic DNA regions of the genome may also have functional significance [10, 17, 23, 24].

Several studies have investigated AhR-mediated gene expression responses using various technologies [25-30]. Although studies of AhR-DNA interactions have primarily focused on the regulation of CYP1A1 [4, 31], recent global ChIP studies have extended our knowledge of AhR-DNA interactions by examining promoter region binding profiles using in vitro and in vivo models [32-35] (Lo et al., in submission). Our study provides a comprehensive analysis by examining TCDD-induced AhR binding across the entire mouse genome. In addition, we examined AhR binding within chromosomes, intragenic and intergenic DNA regions, and in specific genic regions (i.e., 10 kb upstream of a TSS, 5' and 3' untranslated regions [UTRs], coding sequence [CDS]). Global AhR enrichment data are also integrated with computational DRE core analysis [8], and complementary whole-genome gene expression profiling to provide a more comprehensive evaluation of the hepatic AhR regulatory network elicited by TCDD.

Results

Identification and Characterization of TCDD-Elicited AhR Enrichment

In order to identify regions of AhR enrichment induced by TCDD across the genome, ChIP-chip assays were performed using hepatic tissue from immature ovariectomized mice orally gavaged with 30 μg/kg TCDD for 2 or 24 hrs. CisGenome [36] analysis identified 22,502 and 12,677 enriched regions at 2 and 24 hrs, respectively. Applying a conservative FDR of 0.01 resulted in 14,446 and 974 significant AhR enriched regions at 2 and 24 hrs, respectively (Additional Files 1 and 2 provide a complete list of enriched regions). Ligand activation of the AhR in vivo triggers its own rapid degradation, causing a significant reduction of AhR levels [37, 38]. This is reflected in the significantly lower number of TCDD-induced AhR enriched regions at 24 hrs as compared to 2 hrs. The distribution, location and enrichment values for each tiled probe across the Cyp1a1 gene (represented by RefSeq sequences NM_009992 and NM_001136059) are summarized in Figure 1. MA value plots visualize the profile of the enriched region and log2 fold-enrichment values for each probe are also illustrated (Figure 1). Note that the probes are unevenly tiled throughout the genome, resulting in gaps in coverage that may coincide with DRE core locations and thereby affect the identification of AhR enriched regions. For example, two enriched regions were associated with Cyp1a1 (Figure 1, red bars). However, the MA plots for 2 and 24 hrs suggest that there is only one large region of enrichment divided into two as a result of the uneven tiling. Consequently, uneven tiling and the lack of tiling in regions that contain DREs may affect the estimated number of AhR enriched regions.

Figure 1

Summary of AhR enrichment within Cyp1a1 genic region at 2 and 24 hrs. Cyp1a1 is represented by two RefSeq sequences (NM_009992 and NM_001136059, dark blue tracks) that have different TSSs (dark blue box at far left). The rectangles and lines represent exons and introns, respectively, and the UTRs are depicted as the thinner rectangles. Arrowhead direction indicates the orientation of the gene. The grey boxes above represent the Affymetrix 2.0R mouse tiling array probe locations across the Cyp1a1 genic regions. The location and matrix similarity (MS) scores of the consensus DREs are represented by the purple histogram. The highlighted yellow box identifies bona fide functional DREs (matrix similarity (MS) score ≥ 0.8473) involved in AhR-mediated Cyp1a1 gene expression. The red boxes identify regions of significant AhR enrichments (FDR < 0.01) based on the moving average (MA) profile by TileMap. The green histogram plots the log2 fold enrichment values for each individual probe.

Genomic regions with significant AhR enrichment were mapped to intragenic (10 kb upstream of a TSS plus the transcribed gene of mature RefSeq sequences) and non-coding intergenic regions (Table 1; Additional File 3). Most regions were enriched 5.7-fold with values ranging from 1.7- to 111.4-fold (Figures 2A-B). Enriched regions varied in width from 108 to 6,990 bp (Figure 2C) with 90.5% spanning ≤ 1,500 bp. There was no correlation between fold enrichment and region width (data not shown). Of the 974 significantly enriched regions at 24 hrs, 899 overlapped with a 2 hr enriched region (Figure 2D), consistent with reports of constant shuttling of the AhR between the nucleus and cytoplasm [39], and AhR promoter occupancy of targeted genes in untreated cells [34]. Relaxing the FDR to 0.05 increased the overlap to 906, while reducing the number of 24 hr specific enriched regions to 68. Comparable overlaps were identified in promoter-specific ChIP-chip studies of TCDD-induced AhR enrichment at 2 and 24 hrs in the livers of intact C57BL/6 mice, which identified 1,397 genes with an overlap of 403 (Lo et al., in submission). Further analysis of the 899 enriched regions found that the fold enrichment values from both time points were positively correlated (Pearson correlation coefficient = 0.4853, two-tailed p-value < 0.0001; Figure 2E).
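The overlap count between the 2 and 24 hr enriched regions can be sketched as an interval-intersection test. This is a minimal illustration rather than the procedure used in the study: the function names, the half-open coordinate convention and the toy coordinates are all assumptions made for the example.

```python
from bisect import bisect_left
from collections import defaultdict
from itertools import accumulate


def build_index(regions):
    """Index (chrom, start, end) regions per chromosome for overlap queries.

    Coordinates are assumed to be half-open: [start, end).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        # Running maximum of end coordinates over the start-sorted intervals.
        max_end = list(accumulate((e for _, e in intervals), max))
        index[chrom] = (starts, max_end)
    return index


def overlaps_any(index, chrom, start, end):
    """True if [start, end) intersects at least one indexed region."""
    if chrom not in index:
        return False
    starts, max_end = index[chrom]
    i = bisect_left(starts, end)  # regions [0, i) begin before `end`
    return i > 0 and max_end[i - 1] > start


def count_overlapping(query_regions, reference_regions):
    """Number of query regions that intersect any reference region."""
    index = build_index(reference_regions)
    return sum(overlaps_any(index, c, s, e) for c, s, e in query_regions)


# Hypothetical toy coordinates, not data from the study.
regions_2h = [("chr9", 100, 600), ("chr9", 5000, 5400), ("chr1", 50, 300)]
regions_24h = [("chr9", 550, 900), ("chr9", 2000, 2300), ("chr2", 10, 90)]
n_shared = count_overlapping(regions_24h, regions_2h)
```

Counting 24 hr regions against an index of 2 hr regions mirrors how the overlap is reported in the text (24 hr regions that coincide with a 2 hr region). Whether a minimum overlap length was required is not stated, so any intersection counts here.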

Table 1 Distribution and density analysis of TCDD-induced AhR enriched regions in the mouse genome.
Figure 2

Characterization of TCDD-induced AhR enriched regions at 2 and 24 hrs (FDR < 0.01). Frequency analysis of enriched regions relative to log2 fold enrichment at 2 hr (A) and 24 hr (B) illustrating enrichment values in intragenic (light green) and intergenic (dark green) DNA regions. Distribution of enriched regions relative to region width (C) at 2 hrs (light red) and 24 hrs (dark red) identified 90.5% of enriched sites were ≤ 1,500 bp. Comparison of AhR enriched regions at 2 and 24 hrs identified 899 overlapping regions (D). Analysis of the fold enrichment values for the 899 overlapping regions at 2 and 24 hrs identified a positive correlation (two-tailed P-value < 0.0001, Pearson correlation coefficient = 0.4853; E).

Although only 40% of the mouse genome consists of intragenic DNA, 71.8% and 64.7% of all sites with significant AhR enrichment at 2 hrs and 24 hrs, respectively, were within this region. The density of AhR enrichment (per million base pairs [Mbp]) was calculated across the entire genome in order to consider the cumulative intergenic and intragenic DNA region lengths (Table 1). Genome and chromosomal analyses (Additional Files 4 and 5) revealed increased enrichment within intragenic regions compared to non-coding intergenic regions further illustrating a bias for gene encoding regions. However, these values may be inflated due to incomplete probe coverage in the intergenic regions and sequence gaps in the genome. Specific analysis of the 10 kb upstream, 5' and 3' UTRs and CDS regions revealed the highest density of AhR enrichment was proximal to the TSS (Table 1 and Additional Files 4 and 5). AhR enrichment density was greatest within ± 1.5 kb at 2 and 24 hrs (Figures 3A-B), coinciding with proximal promoter DRE core densities [8] and RNA polymerase II binding at the TSSs [10]. Interestingly, there is a notable cleft in AhR enrichment 200 bp directly upstream and downstream of the TSSs, possibly to accommodate general transcription machinery. Both global and proximal promoter density analyses illustrate TCDD-induced AhR enrichments are more prominent in regions directly associated with a gene. Nevertheless, there are a significant number of distally located enrichment sites that may also be functionally relevant.
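The density comparison described above reduces to simple arithmetic, sketched below. The helper names are hypothetical, and the example uses only the rounded percentages quoted in the text; it does not correct for probe coverage or sequence gaps, so it is not expected to reproduce the values in Table 1.

```python
def density_per_mbp(n_regions, span_bp):
    """Enriched regions per million base pairs of the interrogated span."""
    if span_bp <= 0:
        raise ValueError("span_bp must be positive")
    return n_regions / (span_bp / 1_000_000)


def relative_density(frac_regions_in, frac_genome_in):
    """Ratio of region density inside a compartment to the density outside it.

    Both arguments are fractions in (0, 1). The total genome length cancels
    out, so only the two fractions are needed.
    """
    inside = frac_regions_in / frac_genome_in
    outside = (1 - frac_regions_in) / (1 - frac_genome_in)
    return inside / outside


# Percentages quoted in the text: intragenic DNA is about 40% of the genome
# and holds 71.8% (2 hrs) and 64.7% (24 hrs) of the enriched regions.
ratio_2h = relative_density(0.718, 0.40)
ratio_24h = relative_density(0.647, 0.40)
```

Both ratios come out well above 1, which is the intragenic bias the paragraph describes. A ratio near 1 would indicate that enrichment is spread in proportion to compartment size.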

Figure 3

TCDD-induced AhR enrichment (FDR < 0.01) densities in the proximal promoter (10 kb upstream and 5 kb downstream of a TSS) at 2 hrs (A) and 24 hrs (B). The bars represent the number of enriched regions in each 200 bp window. The number of DRE cores in 100 bp non-overlapping windows is superimposed (line) illustrating the overlap between AhR enriched regions and DRE cores in the proximal promoter region.

Confirmation of AhR ChIP-chip Enrichment Analysis

Selected regions of AhR enrichment identified by ChIP-chip analysis at 2 hrs were confirmed by ChIP-PCR (Figure 4). Three representative ChIP-chip enrichments from each genomic region (intergenic, 10 kb upstream of a TSS, 5' UTR, CDS and 3' UTR) were selected to validate AhR enrichments with and without a DRE core at different positions relative to the TSS. ChIP-PCR and ChIP-chip analysis of DRE containing regions exhibited similar levels of AhR enrichment relative to IgGTCDD controls and were significantly greater than vehicle controls relative to IgGvehicle. AhR enriched regions without the DRE core were also verified, demonstrating that the AhR can associate with DNA independently of a DRE core, although this does not exclude AhR interaction through DNA looping or protein tethering. Interestingly, the fold enrichment values for regions without the DRE core were consistently lower than those with a DRE core, suggesting AhR interactions are stronger in regions containing a DRE.

Figure 4

Confirmation of hepatic TCDD-induced AhR enrichment identified by ChIP-chip analysis (FDR < 0.01) at 2 hrs by ChIP-PCR. Selected regions were chosen for verification based on position relative to a TSS, ChIP-chip fold enrichment and the presence or lack of a DRE core within the region of enrichment (A). Immunoprecipitated DNA was measured by QRTPCR and AhR enrichment was calculated as fold induction above IgG controls. The color intensity of each box represents the mean value of three independent replicates. NS = not significant compared to IgG controls (p < 0.05). 2 hr ChIP-chip enrichment values are provided in Additional File 1.

DRE Analysis of AhR Enriched Regions

TCDD-elicited changes in gene expression are mediated through AhR signaling via binding to the substitution intolerant DRE core sequence (5'-GCGTG-3'). Overlaying TCDD-induced AhR enrichment with DRE core locations throughout the mouse genome [8] identified that 57.8% and 48.5% of the enriched regions did not contain a DRE core at 2 and 24 hrs, respectively (Table 2 and Figures 5A-B). Other promoter-specific ChIP-chip studies have also reported DRE cores in ~50% of the AhR enriched regions [33, 35]. The remaining enriched regions possessed at least one and as many as 16 DRE cores (Table 2). AhR enriched regions with or without a DRE core exhibited similar widths and levels of enrichment.
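Checking an enriched region for the DRE core amounts to scanning both strands for 5'-GCGTG-3'. The sketch below is one plausible way to do this and is not the genome-wide search reported in [8]; the function names are hypothetical, and the input is assumed to be a plain string of A, C, G and T.

```python
import re

DRE_CORE = "GCGTG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_dre_cores(seq):
    """Return sorted (position, strand) hits of the DRE core in `seq`.

    Positions are 0-based starts on the forward strand. A lookahead is used
    so that overlapping occurrences are all reported.
    """
    seq = seq.upper()
    hits = []
    for motif, strand in ((DRE_CORE, "+"), (reverse_complement(DRE_CORE), "-")):
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), strand))
    return sorted(hits)


def has_dre_core(seq):
    """True if the sequence carries at least one DRE core on either strand."""
    return bool(find_dre_cores(seq))
```

Applying `has_dre_core` to the sequence under each enriched region would split the regions into the "with" and "without" DRE core groups discussed above. Counting the hits per region gives the per-region core number.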

Table 2 Distribution of DRE cores in AhR enriched regions.
Figure 5

Mapping TCDD-induced AhR enriched regions (FDR < 0.01) with DRE locations. Regions of enrichment identified in the intergenic (purple) and intragenic (blue) DNA regions of the genome at 2 hrs (A) and 24 hrs (B) were searched for high scoring (putative functional) DRE sequences (matrix similarity score ≥ 0.8473; dark blue and dark purple segments) and low scoring DRE sequences (matrix similarity score < 0.8473; mid blue and mid purple segments) using a position weight matrix developed from bona fide functional DREs [8]. Light blue and light purple segments represent regions with no DRE core sequence. A total of 6,595 enriched regions (6,093 at 2 hrs and 502 at 24 hrs) contained at least one DRE core (5'-GCGTG-3'). 50% of these regions were within 135 bp of a DRE core (based on the location of maximum enrichment within the enriched region; C).

Matrix similarity (MS) scores have been calculated for each 19 bp DRE sequence within the mouse genome using a position weight matrix (PWM) constructed from bona fide functional DREs [8]. Of the 6,595 significant AhR enriched regions containing a DRE core (6,093 from 2 hr and 502 from 24 hr), 90.7% were within 500 bp of a DRE core (i.e. distance of maximum enrichment within the region to an underlying DRE core) with half of these positions located within 135 bp of a DRE core. However, only 8.3% and 17.8% of the AhR enriched regions at 2 and 24 hrs, respectively, possessed a putative functional (high scoring) DRE sequence (MS score ≥ 0.8473) suggesting the AhR may bind other degenerate sequence elements.
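The distance statistics in this paragraph (the share of regions whose point of maximum enrichment lies within 500 bp of a DRE core, and the median distance) can be sketched as a nearest-neighbour lookup. The names and the toy coordinates below are assumptions, and all positions are taken to lie on a single chromosome; a real analysis would repeat this per chromosome.

```python
from bisect import bisect_left
from statistics import median


def nearest_distance(position, sorted_sites):
    """Distance in bp from `position` to the closest entry of `sorted_sites`."""
    if not sorted_sites:
        raise ValueError("no DRE core positions supplied")
    i = bisect_left(sorted_sites, position)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_sites[max(i - 1, 0):i + 1]
    return min(abs(position - site) for site in candidates)


def summarize_distances(peak_positions, dre_positions, cutoff=500):
    """Median peak-to-DRE distance and the fraction of peaks within `cutoff` bp."""
    sites = sorted(dre_positions)
    distances = [nearest_distance(p, sites) for p in peak_positions]
    within = sum(d <= cutoff for d in distances) / len(distances)
    return median(distances), within


# Hypothetical positions of maximum enrichment and DRE cores (same chromosome).
toy_peaks = [1000, 2500, 9000]
toy_dres = [900, 2600, 4000]
toy_median, toy_fraction = summarize_distances(toy_peaks, toy_dres)
```

Using the position of maximum enrichment rather than the region midpoint follows the definition given in the text.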

AhR binding to an alternate response element (5'-CATGN6C[T|A]TG-3') has also been reported [40, 41]. Of the 8,353 and 472 enriched regions at 2 and 24 hrs, respectively, that did not contain a DRE core, 482 and 237, respectively, contained the alternate DRE sequence (5.8% and 50.2%, respectively). The higher incidence of AhR enriched regions at 24 hrs containing the alternate response element may represent tertiary AhR binding sites resulting from conformational changes and crowding of the promoter with the general transcription machinery [42, 43].
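The alternate response element 5'-CATGN6C[T|A]TG-3' maps directly onto a regular expression, with N6 read as any six bases. The following sketch is an illustration under that reading; the pattern names and the function are hypothetical, and the search tool actually used for this step is not described in the text.

```python
import re

# 5'-CATGN6C[T|A]TG-3' as a regular expression; N6 is any six bases.
# The lookahead reports overlapping matches; group 1 holds the matched site.
ALT_FORWARD = re.compile(r"(?=(CATG[ACGT]{6}C[TA]TG))")
# Reverse complement of the same element, for sites on the opposite strand.
ALT_REVERSE = re.compile(r"(?=(CA[TA]G[ACGT]{6}CATG))")


def find_alternate_elements(seq):
    """Return sorted (position, strand, site) hits for the alternate element."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in ALT_FORWARD.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in ALT_REVERSE.finditer(seq)]
    return sorted(hits)
```

A site that reads the same on both strands is reported once per strand, which may or may not match how the counts in the text were tallied.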

Transcription Factor Binding Site Over-Representation Analysis

Significant AhR enriched regions were computationally analyzed for over-represented response elements for known TF binding site (TFBS) families using RegionMiner (Genomatix). DREs as well as other sites for early growth response (EGR), E2F, nuclear respiratory factor 1 (NRF1), nuclear receptor subfamily 2 factors (NR2F/COUP-TF) and peroxisome proliferator-activated receptor (PPAR) were over-represented within AhR enriched regions (Table 3; a complete list of over-represented TFBSs is provided in Additional Files 6 and 7). Many of these TF sites were enriched proximally to a DRE core (i.e. within 10-50 bp; Additional File 8) suggesting possible interactions. Studies have previously reported interactions between AhR and many of these TFs [34, 44, 45]. For example, AhR complexes with EGR-1 following treatment of human HUVEC cells with high glucose concentrations [45]. In addition, AhR aggregates with E2F1 to inhibit E2F1-induced apoptosis [46]. AhR also directly interacts with COUP-TF to repress ER-mediated gene expression [47].

Table 3 Significantly over-represented transcription factor module families in TCDD-induced AhR enriched regions.

De Novo Motif Analysis

Approximately 50% of enriched regions lacked the DRE core sequence (Figures 5A-B) suggesting AhR interacts with DNA using alternate strategies. De novo motif analysis of these regions using the Gibbs motif sampler in CisGenome identified over-representation of comparable repetitive elements in both the intergenic and intragenic DNA regions (Additional File 9). Comparison of over-represented non-repetitive motifs to existing TF binding motifs in JASPAR and TRANSFAC [48, 49] using STAMP [50] identified similarities to COUP-TF, hepatocyte nuclear factor 4 (HNF4), liver receptor homolog 1 (LRH1/NR5A2) and PPAR binding sites (Figure 6). Interestingly, COUP-TF and HNF4 belong to the NR2F family identified in the TFBS over-representation analysis of all AhR enriched regions (Table 3). The presence of these binding motifs in non-DRE containing regions of AhR enrichment further suggests that AhR-DNA interactions occur through a tethering mechanism involving other TFs or by tertiary looping of DNA.

Figure 6

De novo motif analysis of intragenic ( A ) and intergenic ( B ) AhR enriched regions lacking a DRE core. The non-repetitive over-represented motifs from each region are shown with their consensus and reverse complement sequence, and the Gibbs motif sampler score. Over-represented motifs were associated with specific TFBSs in JASPAR and TRANSFAC based on the consensus sequence alignments and E-value scores.

Gene Level Analysis of AhR Enrichment

Of the 10,369 enrichments identified in the intragenic DNA regions, 43.8% (4,544/10,369) contained a DRE core at 2 hrs, and 52.4% (332/634) at 24 hrs (Figure 5, areas shaded blue). These intragenic AhR enriched regions mapped to 5,307 and 591 unique genes at 2 and 24 hrs, respectively (AhR targeted genes are provided as gene annotated enriched regions in Additional Files 1 and 2). Molecular and cellular functional analysis using Ingenuity Pathway Analysis (IPA) found these genes to be associated with lipid and carbohydrate metabolism, small molecule biochemistry, cell cycle and gene expression based on a Fisher's Exact Test p-value < 0.01 (Figure 7; Additional Files 10 and 11 list the most significant over-represented biological functions at 2 and 24 hrs). Furthermore, 63.5% and 56.2% of the genes associated with AhR enrichment at 2 and 24 hrs, respectively, contained a DRE core within the region of enrichment (Figure 8). The higher percentage of genes containing a DRE core compared to enriched regions with a DRE core is due to multiple regions of AhR enrichment associated with a single gene (as illustrated for Cyp1a1 in Figure 1). The remaining genes (36.5% and 43.8% at 2 and 24 hrs, respectively) with significant AhR enrichment were targeted independently of a DRE core.

Figure 7

Molecular and cellular functions over-represented by genes associated with significant AhR enrichment (FDR < 0.01) containing a DRE core. The 4,544 and 332 unique genes with AhR enrichment with a DRE core at 2 hrs (A) and 24 hrs (B), respectively, were analyzed using Ingenuity Pathway Analysis for enriched biological functions using Fisher's Exact Test (p < 0.01; orange line). The blue bars represent the log Odds value calculated from the p-value of each functional group.

Figure 8

Mapping TCDD-induced AhR enriched regions (FDR < 0.01) and DRE analysis to genes. The 10,283 and 660 AhR enrichments within the intragenic DNA regions at 2 and 24 hrs (blue shaded areas in Figures 5A-B) mapped to 5,307 (A) and 591 (B) distinct genes based on the refGene data from the UCSC Genome Browser. These genes were searched for the presence of high (matrix similarity (MS) score ≥ 0.8473; dark grey areas) and low (MS score < 0.8473; light grey areas) scoring DRE sequences, and the absence of a DRE core (white areas) within the region of AhR enrichment. Comparing 2 and 24 hr data identified 575 overlapping genes with AhR enrichment and 513 of these genes contained a DRE core within the region of enrichment (C).

At both 2 and 24 hrs, 575 genes had AhR enrichment, with 513 possessing DRE cores in the AhR enriched region (Figure 8C). Only 16 genes exhibited AhR enrichment solely at 24 hrs, with three containing a DRE core. In contrast, 4,732 genes possessed significant AhR enrichment solely at 2 hrs, with 60.4% (2,856) containing a DRE core within the region of enrichment. Due to the large overlap of enriched regions at 2 and 24 hrs, the remaining analysis focuses predominantly on the AhR enrichment at 2 hrs.

Comparison of Transcriptional Responses with AhR Enrichment

Gene expression analysis at 2, 4, 8, 12, 18, 24, 72, and 168 hrs identified 1,896 unique differentially expressed genes (|fold change| ≥ 1.5 and P1(t) > 0.999) at one or more time points. Of the 1,896 TCDD-responsive genes, 900 genes (47.5%) possessed significant AhR enrichment within the intragenic region (10 kb upstream of the TSS to the end of the transcript). Moreover, of the 900 genes exhibiting AhR enrichment at 2 hrs, 625 contained a DRE core sequence, suggesting these responses are AhR-mediated. The remaining 275 differentially expressed genes were not associated with an AhR enriched region containing a DRE core, and may be secondary responses. In order to concisely visualize the integration of the DRE, ChIP-chip and gene expression analyses, Circos plots were generated for the genome and individual chromosomes (Figure 9 and Additional File 12). The plots further illustrate the diversity in AhR enrichment locations in relation to the genomic position of dysregulated genes. Further analysis of the responsive genes found that most were induced by TCDD (Table 4) at all time points. Greater than 82% of the induced genes at 2 or 4 hrs had significant AhR enrichment, and more than 62% of them contained at least one DRE core, suggesting that regulation occurs in a DRE-dependent fashion. In contrast, only 35% of the 691 genes induced at 168 hrs exhibited AhR enrichment, with 26% possessing a DRE core, suggesting that these are secondary gene expression responses. Interestingly, down-regulated genes associated with AhR enrichment were relatively consistent across all time points. Approximately one third of the down-regulated genes appear to be AhR regulated with DRE involvement.
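The integration step described above (filtering responsive genes, then splitting them by AhR enrichment and DRE core status) can be sketched with set operations. Everything in this example is illustrative: the function names, the data layout, the signed fold-change convention and the toy values are assumptions, not the pipeline or data used in the study.

```python
def is_differentially_expressed(fold_change, p1t, fc_cutoff=1.5, p_cutoff=0.999):
    """Apply the |fold change| >= 1.5 and P1(t) > 0.999 criteria from the text.

    `fold_change` is assumed to be signed, e.g. -2.0 for a two-fold repression.
    """
    return abs(fold_change) >= fc_cutoff and p1t > p_cutoff


def classify_genes(expression, enriched_genes, dre_enriched_genes):
    """Split responsive genes by AhR enrichment and DRE core status.

    expression: dict of gene -> list of (fold_change, p1t), one per time point.
    enriched_genes: genes with an AhR enriched region.
    dre_enriched_genes: genes whose enriched region contains a DRE core.
    """
    responsive = {
        gene for gene, points in expression.items()
        if any(is_differentially_expressed(fc, p) for fc, p in points)
    }
    with_ahr = responsive & set(enriched_genes)
    with_dre = with_ahr & set(dre_enriched_genes)
    return {
        "responsive": responsive,
        "ahr_with_dre": with_dre,
        "ahr_without_dre": with_ahr - with_dre,
        "no_ahr": responsive - with_ahr,
    }


# Hypothetical genes and values, for illustration only.
toy_expression = {
    "GeneA": [(2.1, 0.9995), (1.2, 0.5)],
    "GeneB": [(-1.8, 0.9999)],
    "GeneC": [(3.0, 0.95)],
    "GeneD": [(1.6, 0.9992)],
}
toy_groups = classify_genes(
    toy_expression,
    enriched_genes={"GeneA", "GeneB", "GeneC"},
    dre_enriched_genes={"GeneA"},
)
```

In the terms used in the text, `ahr_with_dre` corresponds to the 625 genes, `ahr_without_dre` to the 275 genes, and their union to the 900 responsive genes with AhR enrichment.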

Figure 9

Circos plots integrating DRE analysis, AhR enrichment (2 hrs; FDR < 0.01) and heatmaps for hepatic differential gene expression responses (|fold change| ≥ 1.5 and P1(t) > 0.999) induced by TCDD across the genome (A) and chromosome 9 (B). The inset legend image provides information represented by each data ring. DRE matrix similarity (MS) scores and AhR enrichment values increase radially outward. The time points for the gene expression heatmaps also increase radially outward. The arc of each heatmap wedge maps directly to the location of the gene in the genome. The arc length is proportional to the length of the transcribed region. Circos plots for the other chromosomes are provided in Additional File 12.

Table 4 Distribution and AhR enrichment and DRE analysis of differentially expressed genes elicited by TCDD.

Functional analysis of the 900 differentially expressed genes associated with AhR enrichment was performed using DAVID [51]. The most over-represented functions were associated with lipid metabolic processes (enrichment score of 7.34, Table 5), consistent with the induced fatty liver phenotype [52, 53]. IPA analysis of these genes also identified lipid metabolism as an enriched molecular and cellular function (Fisher's Exact Test p-value < 0.01; Figure 10; Additional File 13 provides a list of the most significant biological functions). In addition, de novo motif analysis (Figure 6) identified binding sites for TFs associated with lipid metabolism and transport. The induction of AhR regulated xenobiotic enzymes, such as cytochrome P450s, glutathione S-transferases (Gsts) and UDP-glucuronosyltransferases (Ugts), hallmarks of TCDD exposure, were also identified as an enriched cluster (enrichment score of 3.54).

Table 5 Functional enrichment analysis of differentially regulated genes with AhR enrichment using DAVID.
Figure 10

Molecular and cellular functions over-represented by differentially regulated genes (|fold change| ≥ 1.5, P1(t) > 0.999) associated with significant AhR enrichment (FDR < 0.01) at 2 hrs. The 900 differentially regulated genes with AhR enrichment were analyzed using Ingenuity Pathway Analysis for enriched biological functions using Fisher's Exact Test (p < 0.01; orange line). The blue bars represent the log Odds value calculated from the p-value of each functional group.

Although AhR mediates the expression of xenobiotic metabolizing enzymes, including NAD(P)H dehydrogenase, quinone 1 (Nqo1) and UDP-glucose dehydrogenase (Ugdh) as well as several Ugt and Gst isoforms, they are also regulated by nuclear factor, erythroid derived 2, like 2 (Nrf2) via antioxidant response elements in response to oxidative stress [54, 55]. Recent studies with AhR and Nrf2 null mice report that TCDD induction of Nqo1 is AhR and Nrf2 dependent [56]. Furthermore, specific Ugt and Gst isoforms induced by TCDD require Nrf2. Collectively, these responses are referred to as the "TCDD-inducible AhR-Nrf2 gene battery." ChIP-chip and gene expression analysis indicates that Nqo1, Gstm1, Gstm2, Ugdh and Nrf2 induction is associated with AhR enrichment. Although supportive of the Nrf2-dependency model, these data do not distinguish whether these are secondary responses mediated by Nrf2 alone, or involve an AhR-Nrf2 interaction. In contrast, Gsta1 and Ugt2b35 induction occurred independently of AhR enrichment, suggesting they may only be dependent on Nrf2 [56].

Immune cell accumulation following a single acute dose of TCDD at 168 hrs is presumed to be a secondary response to hepatic injury or fatty acid accumulation [52, 53]. DAVID analysis of genes induced at 168 hrs identified multiple over-represented immune-related clusters (enrichment scores > 2). However, several of the genes including complement component 1, q subcomponent, beta polypeptide (C1qb), CD36 antigen (Cd36), complement component 4A (C4a) and interferon regulatory factor 8 (Irf8), did not exhibit accompanying AhR enrichment within their intragenic region (10 kb upstream of the TSS to the end of the 3' UTR). Only 26 out of 105 differentially regulated genes in the enriched immune clusters exhibited AhR enrichment. Collectively, these data suggest that gene expression associated with immune function is a consequence of immune cell infiltration into the liver.

Discussion

This study further elucidates the role of the AhR in mediating the hepatic effects of TCDD in C57BL/6 mice. Recent studies have mapped AhR binding using promoter-focused ChIP-chip arrays and found that ~50% of the AhR enriched regions were devoid of the DRE core [32-34]. The lack of a DRE core in regions of AhR enrichment was also reported in an AhR genome-wide ChIP-chip study performed in mouse CH12.LX cells [57]. ChIP-seq experiments for other TFs have also demonstrated enrichment in remote genome regions, which may serve important regulatory roles [10, 11, 14, 17]. Collectively these data suggest the AhR uses different mechanisms to regulate gene expression. Moreover, the integration of genome-wide in silico DRE search, with de novo motif analysis and TCDD-elicited hepatic temporal gene expression data has further elucidated the hepatic AhR gene regulatory network.

ChIP-chip analysis identified 14,446 TCDD-induced AhR regions at 2 hrs and 974 regions at 24 hrs, consistent with the rapid nuclear export and subsequent degradation of the AhR following TCDD activation [37]. Approximately half of these regions were within intragenic regions (10 kb upstream of a TSS to the end of the 3' UTR). Furthermore, 25% of these enriched regions at 2 hrs and 19% at 24 hrs were within 2 kb of a TSS, indicating that a large subset of AhR enrichment occurs adjacent to a TSS. Unlike other studies that report a normal distribution of TF binding centered around the TSS [15, 58-60], the AhR density profile exhibited a cleft immediately adjacent to the TSS, possibly to accommodate recruited transcriptional machinery.

Although most AhR enrichment regions are intragenic, a significant number are located in distal intergenic regions (i.e. 4,163 of 14,446 at 2 hrs and 344 of 974 at 24 hrs). Studies with the ER, p53 and forkhead box protein A1 [10, 11, 14, 17] suggest distal TF binding may have distinct regulatory roles. Binding proximal to the TSS is presumed to stabilize the general transcriptional machinery, while distal binding regulates transcription by a looping mechanism or by altering chromatin structure [9, 61, 62]. Consequently, AhR binding outside of the proximal promoter region may have important regulatory roles that remain largely uninvestigated.

Comparing AhR enriched regions with DRE cores revealed that their intergenic, intragenic and genic (10 kb upstream, UTRs, and CDS) density distributions were similar. The greatest density of AhR enrichment associated with a DRE core occurred within the proximal promoter. Both exhibited comparable distribution profiles except for the cleft in enrichment at the TSS. The decrease in AhR enrichment at the TSS coincides with RNA polymerase II binding at the TSSs [10] of transcriptionally responsive genes. Although TCDD-elicited differential gene expression is thought to be mediated by the substitution intolerant DRE core sequence (5'-GCGTG-3'), only ~50% of the AhR enriched regions contained a DRE core, consistent with findings in other promoter targeted AhR ChIP-chip studies [33, 35] (Lo et al., in submission). Moreover, relatively few alternative AhR response elements (5'-CATGN6C[T|A]TG-3') [40, 41] were identified in AhR enriched regions lacking a DRE core sequence. Enrichment in regions lacking DRE cores provides additional evidence of AhR-DNA interactions that do not involve the basic bHLH domain [63], such as tethering to other DNA interacting TFs and/or tertiary interactions with looping DNA.

Integration of gene expression, ChIP-chip, and DRE distribution data suggests that ~35% of all differentially expressed hepatic genes are mediated by direct AhR binding to a DRE. Consequently, 65% of the gene expression responses elicited by TCDD do not involve direct AhR binding to a DRE. However, TF binding analyses based on tiling arrays are limited by the extent of probe coverage (Figure 1). Genomic regions lacking probe coverage may falsely inflate the number of DRE-absent AhR enriched regions, thus underestimating the number of AhR regulated genes involving a DRE. Furthermore, the analyses may not be exhaustive due to the technical limitations of the ChIP-chip assay coupled with the conservative FDR threshold used to identify statistically significant signals, which may have excluded some positive signals. These limitations of the technology could be addressed in ChIP-seq experiments, which have greater resolution and sensitivity [64, 65]. The shorter sequence reads would improve resolution, but may also identify fewer regions containing a DRE. The higher sensitivity of ChIP-seq could also identify additional regions of AhR enrichment. ChIP-seq studies could also confirm AhR binding in these genomic regions in either a DRE-dependent or -independent manner.

TCDD induces hepatic vacuolization and lipid accumulation with differential gene expression associated with fatty acid metabolism and transport [25, 53]. Independent functional annotation analysis of differentially expressed genes with significant AhR enrichment using DAVID and IPA identified over-represented processes related to fatty acid and lipid metabolism. Computational analysis also identified over-represented binding motifs for TFs involved in the regulation of lipid and cholesterol metabolism, including sites for HNF4, LXR, PXR, PPAR and COUP-TF. COUP-TF is a potent repressor that antagonizes transcriptional responses mediated by other nuclear receptors including HNF4, PPAR, ER, RAR and VDR [66]. For example, COUP-TF antagonizes HNF4α-mediated responses by binding HNF4α response elements [67-71]. Furthermore, AhR interactions with COUP-TF repress ER-mediated gene expression responses [47]. Therefore, AhR interactions with COUP-TF may regulate lipid and fatty acid metabolism by blocking HNF4α target gene expression (Figure 10A). Coincidentally, the HNF4 binding motif is over-represented within AhR enriched regions lacking a DRE core.

Consistent with this proposed mechanism, several HNF4α regulated genes, including Cyp7a1 and Gck, exhibited AhR enrichment and were repressed by TCDD. Cyp7a1 is the rate-limiting enzyme in the bile acid biosynthetic pathway that converts cholesterol into bile acids. Transgenic mice over-expressing Cyp7a1 are protected from high-fat diet induced obesity, fatty liver and insulin resistance [72]. Moreover, a genetic deficiency of Cyp7a1 in humans results in hyperlipidemia [73]. Gck phosphorylates glucose in the initial step of glycolysis. Mutations in Gck that reduce kinase activity are associated with insulin resistance and maturity onset diabetes of the young 2 (MODY2) in humans [74-76]. Furthermore, mice over-expressing Gck are resistant to MODY2 [77]. The down-regulation of Cyp7a1 and Gck, possibly due to AhR-COUP-TF interactions at HNF4α response elements, is consistent with TCDD-induced hepatic lipid accumulation in mice. Interestingly, TCDD exposure has been linked to diabetes and metabolic syndrome in humans [78-84]. AhR-COUP-TF interactions and their effects on HNF4 target gene expression are being investigated further.

Conclusion

This study identified the genome-wide locations of TCDD-induced hepatic AhR enrichment in vivo and incorporated DRE distribution and differential gene expression data to further elucidate the hepatic AhR regulatory network. In addition to identifying interactions in regions associated with genes, AhR enrichment in distal non-coding intergenic regions was characterized. The functional significance of these distal interactions is unknown, but intergenic binding has been reported for other TFs and warrants further investigation. Moreover, only ~50% of all AhR enriched regions involved a DRE, suggesting that indirect AhR binding to DNA plays a significant role in the AhR regulatory network.

Methods

Animal Handling and Treatment

Hepatic tissue samples from immature female ovariectomized C57BL/6 mice obtained from a previous study [53] were used for both ChIP assays at 2 and 24 hrs, and gene expression analyses across all time points. Briefly, mice were orally gavaged with 30 μg/kg TCDD and sacrificed by cervical dislocation at 2, 4, 8, 12, 18, 24, 72 or 168 hrs postdose. Tissues were removed, weighed, and multiple samples (~100 mg each) were flash frozen in liquid nitrogen and stored at -80°C until further use.

Chromatin Immunoprecipitation (ChIP) and ChIP-chip Experiments

ChIP assays were performed as previously described [33] with the following changes. Approximately 100 mg of mouse liver was homogenized in 1% formaldehyde and incubated for 10 min at room temperature. The tissue homogenate was centrifuged at 10,000 RPM for 3 min at 4°C. The pellet was washed in ice-cold PBS, centrifuged, and resuspended in 900 μL of TSEI (20 mM Tris-HCl [pH 8.0], 150 mM NaCl, 2 mM EDTA, 1% Triton X-100, 0.1% sodium dodecyl sulfate) + 1× Protease Inhibitor Cocktail (Sigma, St. Louis, MO). Samples were sonicated 12 times for 10 s each time at 25% amplitude using a Branson 450 sonifier. The supernatant was transferred to fresh microcentrifuge tubes and incubated with rabbit IgG (5 μg; Sigma) and anti-AhR (5 μg; SA-210, Biomol) overnight at 4°C under gentle agitation. ChIP samples were washed and the DNA was isolated as previously described [33]. For ChIP-chip experiments, DNA immunoprecipitated with anti-AhR from liver extracts of TCDD-treated mice was linearly amplified using a whole genome amplification kit according to the manufacturer's instructions (Sigma). Linearly amplified DNA (7.5 μg) was fragmented by limited DNase I digestion and hybridized to Affymetrix GeneChip® mouse 2.0R tiling arrays (Affymetrix, Santa Clara, CA) as previously described [33]. The hybridization and washing steps were performed according to the manufacturer's protocol at the Centre for Applied Genomics (Toronto, Canada). Data were normalized and analyzed using CisGenome and mapped against mouse genome version mm9 [36]. Enriched regions with a false discovery rate (FDR) of 1.0% (0.01) were determined by comparing triplicate samples of AhRTCDD to triplicate IgGTCDD using a moving average (MA) approach with default settings in TileMap v2 [85]. Regions were merged if the gap between them was < 300 bp and the number of probes failing to reach the cut-off was < 5. Regions were discarded if they were < 120 bp or did not contain at least 5 continuous probes above the cut-off. ChIPed DNA was purified using the PCR purification kit from BioBasic Inc. (Markham, ON) and quantified using quantitative real-time PCR (QRTPCR) (KAPA SYBR Fast qPCR Master Mix; KAPA Biosystems, Toronto, ON) (ChIP-PCR). Fold enrichment values were calculated relative to IgG controls. ChIP-PCR primer sequences are provided in Additional File 14.
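The region merging and length filtering rules described above can be illustrated with a minimal sketch. It is not the TileMap or CisGenome implementation: it applies only the gap (< 300 bp) and minimum length (120 bp) criteria, and it omits the probe-count criteria because those require per-probe statistics. The function names and the half-open (start, end) coordinate convention are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) on a single chromosome, end exclusive.
Region = Tuple[int, int]


def merge_regions(regions: List[Region], max_gap: int = 300) -> List[Region]:
    """Merge regions whose separating gap is smaller than max_gap bp."""
    merged: List[Region] = []
    for start, end in sorted(regions):
        if merged and start - merged[-1][1] < max_gap:
            # Gap to the previous region is below the threshold: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def filter_regions(regions: List[Region], min_length: int = 120) -> List[Region]:
    """Discard regions shorter than min_length bp."""
    return [(s, e) for s, e in regions if e - s >= min_length]


# Made-up coordinates, for illustration only.
example = [(100, 200), (450, 600), (1000, 1050)]
kept = filter_regions(merge_regions(example))
```

In this example the first two regions are separated by 250 bp and are merged, while the third region is only 50 bp long and is discarded.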

ChIP-chip Location Analysis

The mouse genomic assembly (mm9) and associated annotation within the refGene and refLink databases were downloaded from the UCSC Genome Browser [86]. Individual segments of a gene region (i.e. the 10 kb sequence upstream of a TSS, the 5' and 3' UTRs and the CDS) for each mature gene encoding reference sequence (RefSeqs with NM prefixed identifiers) were determined using the genomic coordinates within the refGene database (Additional File 3). Intragenic DNA regions within the genome were computationally identified by merging overlapping gene regions (Additional File 3) from both strands of the genome, and the DNA between adjacent intragenic regions was defined as the non-transcribed intergenic DNA regions (Additional File 3). AhR enrichment densities were calculated as the number of significantly enriched regions occurring in an interrogated region type (e.g. intergenic DNA region or 5' UTR) divided by the total length of all regions of that type. Gene annotation associated with each RefSeq sequence was derived from the refLink database in the UCSC Genome Browser.
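The derivation of intragenic and intergenic regions and the density calculation can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: intervals are half-open (start, end) tuples on a single chromosome, strand is ignored (gene regions from both strands are pooled), and an enriched region is counted if it overlaps an interrogated region by at least one base, which is a choice the text does not specify. All function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), end exclusive; hypothetical convention


def merge_overlapping(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or abutting gene regions into intragenic regions."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intergenic_gaps(intragenic: List[Interval]) -> List[Interval]:
    """Return the DNA between adjacent (sorted, merged) intragenic regions."""
    return [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(intragenic, intragenic[1:])]


def enrichment_density(enriched: List[Interval], regions: List[Interval]) -> float:
    """Enriched regions overlapping the interrogated regions, per bp of those regions."""
    total_length = sum(end - start for start, end in regions)
    if total_length == 0:
        return 0.0
    count = sum(
        1 for e_start, e_end in enriched
        if any(e_start < r_end and e_end > r_start for r_start, r_end in regions)
    )
    return count / total_length


# Made-up coordinates, for illustration only.
gene_regions = [(0, 100), (50, 200), (500, 700)]
intragenic = merge_overlapping(gene_regions)
intergenic = intergenic_gaps(intragenic)
density = enrichment_density([(250, 300), (600, 650)], intergenic)
```

Here the two overlapping gene regions collapse into one intragenic region, the single intergenic gap spans 300 bp, and one of the two enriched regions falls inside it.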

Transcription Factor Motif Analysis

The locations of AhR enrichment were compared against 5'-GCGTG-3' DRE core sequence locations in the mouse genome [8]. Identification of TF motifs over-represented in regions containing a DRE core was performed using the default parameter settings in RegionMiner, a program within the Genomatix suite of applications (http://www.genomatix.de) that contains an extensive database of TF binding motifs. Identified module families and individual matrices with z-scores > 3 were considered significant [87]. De novo motif discovery was performed using the Gibbs motif sampler in CisGenome on the sequences of AhR enriched regions not containing a DRE. Matrices for over-represented motifs were compared to existing TF binding motifs in JASPAR and TRANSFAC [48, 49] using STAMP [50].
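As an illustration of how DRE cores can be located within an enriched region, the sketch below scans a sequence for the 5'-GCGTG-3' core on both strands by also searching for its reverse complement, 5'-CACGC-3'. It is a simplified stand-in for the genome-wide DRE search in [8], which is not reproduced here. The function names and the 0-based, forward-strand position convention are assumptions.

```python
from typing import List, Tuple

DRE_CORE = "GCGTG"      # core sequence on the forward strand
DRE_CORE_RC = "CACGC"   # reverse complement, i.e. a core on the reverse strand


def find_dre_cores(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based forward-strand start, strand) for every DRE core match."""
    seq = sequence.upper()
    hits: List[Tuple[int, str]] = []
    for strand, motif in (("+", DRE_CORE), ("-", DRE_CORE_RC)):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            # Advance by one base so overlapping matches are not missed.
            start = seq.find(motif, start + 1)
    return sorted(hits)


def contains_dre_core(sequence: str) -> bool:
    """True if the sequence holds at least one DRE core on either strand."""
    return bool(find_dre_cores(sequence))


# Made-up sequence, for illustration only.
hits = find_dre_cores("ttGCGTGaaCACGCtt")
```

A region for which `contains_dre_core` returns False would fall into the DRE-absent set used for de novo motif discovery.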

Comparison with Microarray Gene Expression

Results from the ChIP-chip and DRE analyses were integrated with whole-genome gene expression profiling data from mice orally gavaged with 30 μg/kg TCDD using 4 × 44 k whole-genome oligonucleotide arrays from Agilent Technologies (Santa Clara, CA) [8]. The genomic locations of the differentially responsive genes (|fold change| ≥ 1.5 and P1(t) > 0.999) were obtained for each RefSeq sequence associated with the gene from the refGene database in the UCSC Genome Browser. Circos plots [88] were generated to visualize the locations of DRE cores, regions of AhR enrichment and heatmaps of temporal gene expression responses.
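The differential expression criteria can be written as a simple filter. This is a sketch only: it assumes fold changes are stored as signed values in which repression is negative (e.g. -2.0 for a two-fold decrease), and the record layout, field names and gene labels are hypothetical.

```python
from typing import Dict, List


def is_differentially_expressed(fold_change: float, p1t: float,
                                fc_cutoff: float = 1.5,
                                p1t_cutoff: float = 0.999) -> bool:
    """Apply the |fold change| >= 1.5 and P1(t) > 0.999 criteria."""
    return abs(fold_change) >= fc_cutoff and p1t > p1t_cutoff


# Made-up records, for illustration only.
records: List[Dict] = [
    {"gene": "geneA", "fold_change": 2.3, "p1t": 0.9999},
    {"gene": "geneB", "fold_change": -1.8, "p1t": 0.9995},
    {"gene": "geneC", "fold_change": 1.2, "p1t": 0.9999},
    {"gene": "geneD", "fold_change": 3.0, "p1t": 0.95},
]
responsive = [r["gene"] for r in records
              if is_differentially_expressed(r["fold_change"], r["p1t"])]
```

Only the first two records pass: the third fails the fold change cut-off and the fourth fails the P1(t) cut-off.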

Functional Annotation and Pathway Analysis

Functional annotation clustering of Gene Ontology (GO) terms for genes associated with significant AhR enrichment was performed using DAVID (Database for Annotation, Visualization, and Integrated Discovery) [51]. In addition, the genes associated with these regions were analyzed using Ingenuity Pathway Analysis (IPA; http://www.ingenuity.com/) to identify over-represented molecular and cellular functions based on a Fisher's exact test p-value < 0.01.
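The over-representation test can be illustrated with a small standard-library sketch. It computes a one-sided (right-tailed) Fisher's exact test p-value from the hypergeometric distribution; treating the test as one-sided is an assumption, since the text states only that a Fisher's exact test p-value < 0.01 was used. The function name, argument names and counts are hypothetical and do not reflect IPA internals.

```python
from math import comb


def fisher_right_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the reference set
    K: reference genes annotated to the function
    n: genes in the input list
    k: input genes annotated to the function
    """
    total = comb(N, n)
    # comb() returns 0 when the lower index exceeds the upper index,
    # so impossible tables contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Made-up counts, for illustration only.
p_value = fisher_right_tail(k=5, n=5, K=5, N=20)
is_over_represented = p_value < 0.01
```

With these counts every gene in the input list carries the annotation, so the only table at least as extreme is the observed one and the p-value is 1/15504.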