EnhancerFinder is a computational tool that predicts developmental enhancers based on positive examples of biologically active developmental enhancers and negative examples from genomic background. This method uses a multiple kernel learner (similar to a support vector machine) and characterizes genomic regions through an integrated profile of a large number of genetic and epigenetic data sources. Training EnhancerFinder on in vivo-validated examples and integrating hundreds of sequence motifs and functional genomics experiments make this approach more accurate at identifying biologically active enhancers than other approaches. For this study, we started with the 84,301 candidate developmental enhancers predicted by EnhancerFinder across the human genome. We then examined the genome-wide distribution of EnhancerFinder’s predicted enhancers and found that they cluster near loci that contain important developmental genes. Since developmentally active genes typically rely on tight regulation to exhibit robust spatio-temporal expression patterns, these enhancers likely play a role in coordinating normal development. Genes with the highest number of nearby enhancers in the human genome (Supplemental Table 1) are enriched for several biological functions related to development, including epithelial cell development, arterial development, and dorsal/ventral neural tube patterning.
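The ranking of genes by number of nearby predicted enhancers can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name, the use of enhancer midpoints, the symmetric 500-kb window around each transcription start site, and the toy coordinates are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def count_nearby_enhancers(genes, enhancers, window=500_000):
    """Count predicted enhancers whose midpoint lies within `window` bp of each gene's TSS.

    genes:     list of (name, chrom, tss) tuples
    enhancers: list of (chrom, start, end) tuples
    The window size is an assumed parameter, not a value taken from the study.
    """
    # Index enhancer midpoints by chromosome and sort them for binary search.
    midpoints = defaultdict(list)
    for chrom, start, end in enhancers:
        midpoints[chrom].append((start + end) // 2)
    for positions in midpoints.values():
        positions.sort()

    counts = {}
    for name, chrom, tss in genes:
        positions = midpoints.get(chrom, [])
        lo = bisect_left(positions, tss - window)
        hi = bisect_right(positions, tss + window)
        counts[name] = hi - lo
    return counts


# Toy coordinates, invented purely to exercise the function.
genes = [("geneA", "chr1", 1_000_000), ("geneB", "chr1", 5_000_000), ("geneC", "chr2", 1_000_000)]
enhancers = [
    ("chr1", 900_000, 901_000),
    ("chr1", 1_200_000, 1_201_000),
    ("chr1", 1_600_000, 1_601_000),
    ("chr1", 4_999_000, 5_000_000),
    ("chr2", 3_000_000, 3_001_000),
]
counts = count_nearby_enhancers(genes, enhancers)
# Genes ordered from most to fewest nearby enhancers.
ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
```

Sorting the midpoints once per chromosome keeps each gene lookup logarithmic, which matters when tens of thousands of predicted enhancers are scanned against every annotated gene.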
Many genes essential for normal development fall within clusters of EnhancerFinder’s predicted developmental enhancers. Neighboring genes FOXC1 and GMDS are found in one of the densest enhancer clusters in the genome, with 72 predicted nearby enhancers over a 1-Mb range. These genes and 104 predicted enhancers fall within a single topologically associating domain (TAD) of local chromatin interactions in human embryonic stem cells, indicating that these regions have 3D structural interactions during embryonic development. GMDS and FOXC1 are the only two protein-coding genes fully contained within this TAD. Three long non-coding RNA genes are also encoded in this domain, as well as the 3′ end of myosin light-chain kinase MYLK4. Boundaries of topological domains are often consistent across cell types and evolution, suggesting that the topological domain that contains FOXC1 and GMDS is present in many tissues during development (Fig. 1).
We tested seven candidate enhancer (CE) regions and validated five novel developmental enhancers near FOXC1 and GMDS using a transgenic mouse enhancer assay. These five enhancers are active at E11.5 in various embryonic tissues. We chose embryonic day E11.5 because it is an active stage of brain patterning and development and is 1 day prior to when differentiated structures are apparent. Figure 2 shows representative whole-embryo images of CE1–4 from the transgenic mouse assay, highlighting enhancer activity in different tissues. Associated tables detail the expression patterns of these CEs in the transgenic enhancer assay. A given anatomical region of the embryos was noted in bold when it showed expression in more than half of the X-gal-positive (+) embryos. For instance, CE1 showed expression in the developing limb in three out of five X-gal-positive embryos (Fig. 2). Additional tissues that showed expression in more than half of the X-gal-positive embryos included the eye (CE1, 2), the spinal cord (CE1–4), and the midbrain (CE1). Expression in the hindbrain neural tube was also seen in embryos for each enhancer, although it was seen most frequently in CE4 (Figs. 2a, 3a). Interestingly, a few embryos had expression in the cortical mesenchyme (CE1, 3, 4) where Foxc1 is expressed endogenously (see Figs. 2b, 3b), although this expression was not consistent across many embryos. CE4 is more likely to be regulating GMDS, as GMDS is expressed in the developing hindbrain neural tube, whereas Foxc1 is restricted to the surrounding mesenchyme (Fig. 2c, d). CE5 showed consistent expression in the eye (data not shown).
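The criterion for marking a tissue in bold (expression in more than half of the X-gal-positive embryos) amounts to a simple majority rule, sketched below. The function name and data layout are assumptions; only the limb count (three of five embryos for CE1) comes from the text, and `tissue_x` is a placeholder with an invented count.

```python
from collections import Counter


def consistent_tissues(embryo_annotations):
    """Return tissues stained in more than half of the X-gal-positive embryos.

    embryo_annotations: one set of stained tissues per X-gal-positive embryo.
    """
    n_embryos = len(embryo_annotations)
    # Count each tissue at most once per embryo.
    tally = Counter(t for tissues in embryo_annotations for t in set(tissues))
    return {tissue for tissue, k in tally.items() if k > n_embryos / 2}


# Five X-gal-positive embryos: limb staining in three (as reported for CE1),
# placeholder "tissue_x" in two (invented for illustration).
ce1_embryos = [{"limb", "tissue_x"}, {"limb"}, {"limb"}, {"tissue_x"}, set()]
bold_tissues = consistent_tissues(ce1_embryos)
```

With five embryos the threshold is strictly greater than 2.5, so a tissue seen in three embryos qualifies and one seen in two does not.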
The four brain CEs (CE1–4) contain binding motifs for hundreds of transcription factors. To better understand the region’s transcriptional regulation in brain development, we filtered this list to include transcription factors and co-factors expressed in the developing brain that have at least 12 predicted binding motifs in at least one of the enhancers. This filter retained a large number of candidates, widening the pool of potential transcriptional regulators of FOXC1/GMDS expression. The neurodevelopmental transcription factors ZIC1 and ZIC3 are among the genes with the highest numbers of predicted binding motifs in the CEs (Table 1). Figure 4a shows the enhancer landscape near FOXC1/GMDS and the location of predicted ZIC binding sites. ZIC genes, including Zic1 and Zic4, are known to be active in the developing hindbrain, suggesting that they may regulate GMDS hindbrain expression.
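The filtering step described above (factors expressed in the developing brain with at least 12 predicted motifs in at least one enhancer) can be expressed as a short sketch. The function name, input layout, and the placeholder factor names and counts are hypothetical; only the threshold of 12 and the "at least one enhancer" rule come from the text.

```python
MIN_MOTIFS = 12  # threshold stated in the text


def filter_regulators(motif_counts, brain_expressed, min_motifs=MIN_MOTIFS):
    """Keep factors expressed in the developing brain with >= min_motifs in any one enhancer.

    motif_counts:    {factor: {enhancer: predicted motif count}}
    brain_expressed: set of factors expressed in the developing brain
    Returns {factor: highest per-enhancer count}, ordered from highest to lowest.
    """
    kept = {}
    for factor, per_enhancer in motif_counts.items():
        if factor not in brain_expressed:
            continue
        # The rule is per enhancer, so counts are not summed across enhancers.
        best = max(per_enhancer.values(), default=0)
        if best >= min_motifs:
            kept[factor] = best
    return dict(sorted(kept.items(), key=lambda kv: kv[1], reverse=True))


# Placeholder factors and counts, invented for illustration only.
motif_counts = {
    "TF_A": {"CE1": 14, "CE2": 3},
    "TF_B": {"CE1": 11, "CE2": 11},  # 22 in total, but never 12 in a single enhancer
    "TF_C": {"CE3": 20},             # many motifs, but not brain-expressed
    "TF_D": {"CE4": 12},
}
brain_expressed = {"TF_A", "TF_B", "TF_D"}
candidates = filter_regulators(motif_counts, brain_expressed)
```

Taking the maximum rather than the sum reflects the stated requirement that the motifs occur within a single enhancer, which is why `TF_B` is excluded in this toy example.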
To further investigate the role of ZIC proteins in the regulation of our identified enhancers, we made mutant enhancer constructs that lacked ZIC binding sites. We then used our transgenic mouse enhancer assay to test whether removing these sites affects the expression patterns shown in Fig. 2. We found diminished neural expression in three of the four brain CEs (Fig. 4). For the mutant CE4 construct, we did not recover enough embryos to draw substantive conclusions (only two X-gal-positive embryos were recovered). Of note, many regions where at least half of the X-gal-positive embryos had expression with the wild-type (WT) enhancer no longer had robust expression. This included expression in the eye (CE1 and 2), limb (CE1), spinal cord (CE2 and 3), and midbrain (CE1). In addition, a number of regions gained expression that was not seen in the WT embryos, including the branchial arch and facial mesenchyme. While mouse ZIC1 and ZIC3 do not appear to be expressed in these structures, ZIC2 is expressed in the developing upper and lower jaws (Figure S1). Thus, ZIC2 may be inhibiting craniofacial expression in the WT context, and the lack of ZIC binding sites in the mutant enhancers may lead to disinhibition and activation of expression in craniofacial tissues.
Although the CEs are located very close to FOXC1 and GMDS, they may also regulate other nearby genes. FOXF2 and FOXQ1 are located 215 and 300 kb, respectively, upstream of FOXC1 and the CE cluster. These genes are expressed in many of the same mouse tissues as the CEs (as seen in in situ images of E11.5 mouse embryos from the Allen Brain Map). Although these other potential target genes are not in the same TAD as the CEs in human embryonic stem cells, the genomic distance separating them from the CE cluster is well within the range of enhancer function. Together, FOXC1, GMDS, FOXF2, and FOXQ1 may represent loci for enhancer-driven changes in gene expression that could affect the developing embryo.