Identification of Regulatory Elements in Primary Sensory Neurons Involved in Trauma-Induced Neuropathic Pain

Chronic pain is a significant public health issue that is often refractory to existing therapies. Here we use a multiomic approach to identify cis-regulatory elements that show differential chromatin accessibility and reveal transcription factor (TF) binding motifs with functional regulation in the rat dorsal root ganglion (DRG), which contain cell bodies of primary sensory neurons, after nerve injury. We integrated RNA-seq to understand how differential chromatin accessibility after nerve injury may influence gene expression. Using TF protein arrays and chromatin immunoprecipitation-qPCR, we confirmed C/EBPγ binding to a differentially accessible sequence and used RNA-seq to identify processes in which C/EBPγ plays an important role. Our findings offer insights into TF motifs that are associated with chronic pain. These data show how interactions between chromatin landscapes and TF expression patterns may work together to determine gene expression programs in rat DRG neurons after nerve injury. Supplementary Information The online version contains supplementary material available at 10.1007/s12035-023-03673-5.


Introduction
Nerve injury-induced chronic pain is characterized by complex activity-dependent plasticity and heightened excitability of sensory neurons which is mediated by changes to gene transcription.This context-dependent regulation of enhanced and/or repressive gene expression requires Kimberly E. Stephens, Yun Guan, and Sean D. Taverna contributed equally.coordinated transcription factor binding to cis-regulatory elements (CREs).However, the regulatory landscape which orchestrates these transcriptional changes in the context of nerve injury is just starting to be addressed [1].
Epigenetic mechanisms are well-established regulators of a wide variety of physiological and pathological processes [2].One major pathway of epigenetic modulation is the targeted addition or removal of small chemical post-translational modifications (PTMs) on individual nucleosomes [3,4].These histone PTMs can help alter the positioning of individual nucleosomes and, therefore, facilitate access to CREs.Mono-methylation of the lysine residue at position 4 on histone H3 (e.g., H3K4me1) is one such histone PTM.H3K4me1 is enriched at CREs and facilitates recruitment of the cohesin complex and other remodeling machinery which increases the accessibility of the targeted DNA sequence to relevant transcription factors [5,6].While some pioneer transcription factors can recognize and bind to CREs within nucleosomal DNA, recognition of CREs by non-pioneer transcription factors requires the target sequence to be unobscured [7].Therefore, potential active CREs could be identified by assays that identify accessible chromatin, such as the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) [8,9].
Here, we comprehensively mapped H3K4me1 enrichment and chromatin accessibility at CREs in the lumbar DRGs in a well-established rat model of neuropathic pain induced by chronic constriction injury (CCI) of the sciatic nerve and naïve rats.We further integrated RNA-seq profiles to understand how differential chromatin accessibility after physical nerve injury may influence gene expression.We confirmed binding by the CCAAT enhancer binding protein gamma (C/ EBPγ) to a CRE with differential accessibility in our nerveinjury model and identified the biological processes in which C/EBPγ plays an important role.These data provide valuable resources for our understanding and the further investigation of injury-induced changes in the regulatory landscape in primary sensory neurons.

Genome-Wide Identification of Chromatin Accessibility in Naïve and Injured DRGs
Adult (i.e., 8-10 weeks old) female Sprague Dawley rats underwent CCI surgery or were left unperturbed (i.e., naïve control).At 14 days following surgery, the ipsilateral lumbar (L4-L6) DRGs were removed from both groups.To identify regions of chromatin accessibility, we first performed ATAC-seq on the DRGs from naive rats and CCI rats (Fig. 1).Two (naïve group) or three (CCI group) biological replicates were processed for a total of five ATAC-seq libraries.These libraries were sequenced to an average depth of 50.6 million total reads and generated 32 million unique reads that aligned to the rat genome (Online Resource 4).We visualized the extent of similarity/dissimilarity of chromatin accessibility of the individual samples using the first two principal components from the principal component analysis of all genes (Online Resource 1A).The first two principal components accounted for approximately 62% of the total variance among the samples and produced distinct clusters of the samples by treatment group (i.e., naïve vs. CCI).We used MACS2 to call peaks that represent genomic regions of chromatin accessibility in each sample.A total of 118,329 unique regions of chromatin accessibility were identified in the CCI rats and 123,738 in the naive group.To ensure the stringency of our analysis, we only considered reproducible regions (Methods; Online Resource 1B).A total of 62,854 unique genomic regions of chromatin accessibility were identified across both naïve and CCI rats.Consistent with known enrichment profiles of active regulatory elements, the distance between chromatin-accessible regions and TSS of the nearest gene suggests that these regions are concentrated in CREs (i.e., introns, intergenic regions) (Online Resource 1C, D).A smaller proportion of accessible regions was found in promoters and TSS of annotated genes.However, the ATAC-seq peaks near the TSS were of greater intensity Fig. 1 Chromatin accessibility in the rat DRG.Schematic of the experimental approach than non-TSS peaks, which supports previous studies that find greater accessibility of chromatin around the TSS than in surrounding genomic regions [10].

Differential Accessibility at Gene Promoters Is Associated with Nociceptive Processes
Analyses of our ATAC-seq on the DRGs from naive and CCI rats identified a union of 6809 unique regions of open chromatin in annotated gene promoters between the groups (Fig. 2a).Most (96.7%) of these promoter regions were accessible in both the CCI and naive groups (Fig. 2a).The promoter regions of 108 genes contained open chromatin that was CCI-group specific, whereas 117 were naïve group-specific.Gene ontology (GO) analysis revealed that the 108 genes with CCI-specific accessibility were enriched for pain-and sensory-related processes (Fig. 2a).Genes associated with the top 5 biological processes (as ranked by the p-values) have well-established roles in persistent pain (e.g., the sodium voltage-gated channel Scn11a, and the transient receptor potential cation channels Trpv1 and Trpa1) (Fig. 2b).For example, DRGs from CCI rats showed higher accessibility at the Scn11a promoter than the DRGs from naïve rats (left panel, Fig. 2c), and this accessibility was associated with the higher gene expression (right panel; Fig. 2c).
Because most chromatin-accessible regions were shared in both groups, we used DiffBind to identify quantitative differences in accessibility between CCI and naïve rats.Of the 6809 promoters that contained a region of chromatin accessibility, 331 promoters were significantly more accessible and 145 were significantly less accessible in the CCI group when compared to the naïve group (Fig. 2d; Online Resource 5).GO analysis of the nearest annotated genes for these 476 differentially accessible regions (DARs) identified biological processes enriched in neuropathic pain (e.g., protein transport and assembly, cell signaling, response to stimulus, cell projection organization) (Fig. 2e).

Multi-omics Analysis to Identify CREs in Distal Intergenic Regions
To identify CREs, we used ChIP-seq targeting H3K4me1.We pooled H3K4me1-enriched regions from biological duplicates of both naive and CCI rats and identified 211,440 unique peaks.Most of these peaks were predominantly located in intergenic and intronic regions with the remaining peaks located near or at an annotated TSS (Online Resource 2a).A total of 58,446 (27.6%) of these 211,440 regions overlapped with one or more accessible chromatin regions identified by our ATAC-seq and were therefore included for further analyses (Fig. 3a).
The differential analysis identified 2145 (3.67%) of the 58,446 consensus regions that showed increased or decreased accessibility between CCI and naïve groups (Fig. 3b; Online Resource 6).PCA of only those reads contained within the 2145 DARs produced distinct clusters of the samples by group (Online Resource 2B).Hierarchical clustering also showed that these 2145 regions alone resulted in samples from the same treatment group clustering together (Fig. 3c), which provides strong evidence that we successfully identified those genomic regions that were important in distinguishing those genomic regions affected by CCI.
Of the 2145 DARs, 999 were located in intergenic regions (Online Resource 2C).Of these 999 intergenic regions, 519 (40%) had increased accessibility after CCI, and 480 had decreased accessibility.Motif analysis of the 519 DARs with increased accessibility showed enrichment for transcription factors within the bHLH/HLH, bZIP, and RUNT families (Fig. 3d; Online Resource 7).The 480 DARs with decreased accessibility after CCI were enriched in motifs from members of the high-mobility group (HMG) and the interferon regulatory factors (IRF) families (Fig. 3d; Online Resource 8).These data suggest that these transcription factors may impact susceptibility to neuron excitability through altered accessibility to their consensus DNA binding sequence.

Changes in Accessibility at CREs Are Associated with Gene Expression
To identify DARs associated with an increase or decrease in gene expression, we integrated RNA-seq data obtained from a cohort of rats that used the same experimental design [11].We found 109 DARs with increased accessibility after CCI were located near 79 genes that were upregulated after CCI (Fig. 4a).GO analysis of these 79 genes largely represented molecular functions, biological processes, and cellular compartments associated with neuronal activation and synaptic signaling (Fig. 4b).A total of 39 DARs with decreased accessibility after CCI were located near 29 genes whose expression also decreased following CCI (Fig. 4c).Examples of intergenic DARs associated with coordinate changes in the expression of the nearest gene are shown in Fig. 4 d.These findings suggest that changes in DNA accessibility at putative CREs regions following CCI can alter the expression of genes involved in nociceptive pathways.

Functional Evaluation of CREs
To determine whether the DARs were able to alter gene expression, we selected regions associated with increased and decreased accessibility in intergenic regions.We cloned a single copy of a selected DAR into the pGL3 promoter vector which also contains the SV40 minimal promoter.
Each construct was co-transfected with pGL4.74Renilla into the 50B11 (immortalized rat nociceptor) cell line.Four constructs showed increased luciferase expression, and three showed decreased luciferase expression as compared to the empty pGL3 promoter vector (Fig. 5(a)).These findings indicate that the DARs are associated with both enhancer (e.g., constructs A, C, E, F) and repressor (e.g., constructs B, D, G) activities, and function to increase (e.g., constructs A, C, G) or decrease gene expression (e.g., constructs B, D, E, F) in the context of physical injury.
We then determined whether the DARs showing regulatory capabilities in the luciferase assay could also bind to transcription factors by using a human transcription factor protein array [12].YBX1, HMGA1, CSDA, C/EBPγ, YBX2, CREB1, TBX3, and RHOXF1 showed significant binding to chr12:12,086,663-12,086,700 (Fig. 5(b-c)).EMSA confirmed C/EBPγ binding to this sequence (Fig. 5(d)).ChIP-qPCR showed significant enrichment of C/EBPγ in this DAR in 50B11 cells with a fold enrichment of 2.72 ± 0.47 over IgG (Fig. 5(e)).Together, these studies suggested a functional role for C/EBPγ in DRG neurons in gene regulation following CCI.

Identification of C/EBPγ-Regulated Genes in 50B11 Cells
To identify the regulatory role of C/EBPγ in DRG neurons, we knocked down Cebpg expression in 50B11 cells using shRNA and used RNA-seq to analyze gene expression changes.Western blots confirmed that Cebpg shRNA reduced C/EBPγ protein levels by >80%, as compared to control cells treated with scrambled shRNA (Online Resource 3).Compared to 50B11 cells treated with scrambled shRNA, we found 5260 differentially expressed genes (adjusted p-value < 0.01) in cells transduced with shRNA against Cebpg (Fig. 5(f)).As expected, in Cebpg shRNAtreated cells, Cebpg was among the 2636 downregulated genes.REVIGO was used to summarize results from GO enrichment analysis (Fig. 5(g)).Knockdown of Cebpg in 50B11 cells produced differentially expressed genes involved in various biological processes, including ion transport, transmembrane receptor signaling, regulation of transsynaptic signaling, and regulation of cell communication.

Discussion
The binding of enhancers, repressors, and recruited chromatin remodeling proteins to accessible regulatory elements initiates an epigenetic pathway that directs specific transcriptional changes in cells following a subsequent stimulus [13].In this study, we determined the dynamic changes in chromatin accessibility in rat DRG cells after peripheral nerve injury, which may play an important role in neuropathic pain development and maintenance.

Nerve Injury-Induced Reciprocal Changes in Gene Expression
We highlight an example of reciprocal changes in chromatin accessibility at distinct CREs, which may act cooperatively to alter gene transcription (Fig. 4d).Following CCI, we found a 26% increase in chromatin accessibility at a 684-bp region located 34.8 kb downstream of the gene Ctsb (chr15:46,281,215-46,281,898), which showed significantly increased expression.In contrast, after CCI, we also found reduced chromatin accessibility at a 973-bp region located 31kb upstream of Cst3 (chr3:143,254,480-143,255,452), which showed an increase in gene expression.Ctsb encodes Cathepsin B, a cysteine protease that promotes chronic inflammatory pain through activation of pro-caspase-1 and secretion of mature IL-1b [14], while Cst3 encodes Cystatin C, a highly efficient cysteine protease inhibitor.The N-terminal region of Cystatin C competitively and reversibly binds to the Cathepsin B active site, thereby preventing access to potential substrates [15].Not only do Cathepsin B and Cystatin C interact directly, but genetic ablation of Cystatin C increases Cathepsin B expression in, and promotes synaptic plasticity of, hippocampal neurons [16].Because Cathepsin B has been nominated as a promising therapeutic target to treat Alzheimer's disease [17], future therapies directed towards correcting chromatin accessibility at respective CREs elements, like those for Ctsb and Cst3, may be able to limit potential side effects from diseases involving aberrant protein expression.

Accessibility of TF Motifs in Differentially Accessible Regions
We identified the enrichment of transcription factor motifs present in differential accessible intergenic regions after CCI.In general, the enrichment of individual motifs was shared across other TFs within the same family.Individual TFs within the IRF and high-mobility group families were enriched exclusively in the regions of decreased accessibility while members of the HLH and RUNT families were Gene Ontology Biological Processes identified from 108 peaks associated only with annotated promoters in DRGs from CCI rats (bottom).The threshold for statistical significance was set to p-value < 0.01 (black horizontal line).b Heatmap that shows the normalized accessibility and hierarchical clustering of the gene promoters that were enriched in the top 5 GO terms from a. c Normalized chromatin accessibility tracks at the Scn11a gene promoter (chr8:128,519,864-128,522,250).The individual and averaged ATAC-seq signal of the normalized bigwig files for each sample as displayed from the Integrated Genomics Viewer (left).The "difference" track was created by subtracting the average track from the naïve group from the average track of the CCI group.Green indicates an increase in accessibility in the CCI group compared to naïve.The region highlighted in yellow (chr8:128,521,066-128,521,432) indicates the differentially accessible region between the naïve and CCI samples.A box plot of the normalized, log 2 transformed gene expression of Scn11a for each sample is provided on the right.d Heatmap of read density for all DARs at the promoter in the CCI and naïve groups with increased accessibility (top panels) and decreased accessibility (bottom panels).Each row represents one promoter region and the regions are aligned to the transcription start site for each gene.The color intensity represents the magnitude of chromatin accessibility.The average read density across all regions for each heatmap is shown on the right.e Bar plot of the gene ontology analysis of the biological processes identified in promoter DARs after CCI versus naïve ◂ enriched in regions of increased accessibility.In addition, we found enrichment of several AP-1 transcriptional factor family motifs in differentially accessible regions following CCI (i.e., Atf3, Fos, Fra1, Fra2, c-Jun, JunB, and JunB) as well as increased transcription of Atf3, Fos, and Jun.Our data is consistent with prior work which used electroconvulsive stimulation in dentate granule neurons to mimic neural injury [18].In this model, the expression of AP-1 pioneer transcription factors increased and promoted alterations in chromatin accessibility at Fos-Jun subfamily motifs.Interestingly, ATF3 can drive increased chromatin accessibility at distal regulatory elements in human umbilical vein endothelial cells [19].Our findings suggest that increased expression of AP-1 transcription factors in DRG cells following CCI is altering the chromatin accessibility at key regulatory elements, and therefore, providing an avenue for prolonged neuronal plasticity following nerve injury.

Peripheral Nerve Injury May Facilitate C/ EBP-Mediated Gene Repression
We show that sciatic CCI is associated with an increased potential for C/EBPγ-mediated gene repression in the lumbar DRGs.C/EBPγ is a ubiquitously expressed member of the CCAAT enhancer binding (C/EBP) family, a group of bZIP transcription factors, which is involved in the regulation of major physiologic processes (e.g., control of cellular proliferation, induction of the integrated stress response, and regulation of metabolism) [20].C/EBPγ possesses broad regulatory capacity and can activate or repress gene transcription based on its cellular context, differential chromosomal binding, and its heterodimerization with other bZIP transcription factors.In the present study, we found that CCI increased chromatin accessibility at a 222-bp region of chromosome 12 in the lumbar DRG.We further show that this same region repressed luciferase expression and bound C/EBPγ in a peripheral nociceptor cell line.Interestingly, Cebpg gene expression was unchanged 14 days following CCI while several bZIP transcription factors that are known to form heterodimers with C/EBPγ (e.g., Atf4, C/EBPβ) were upregulated.Future functional studies may help determine whether known heterodimeric partners for C/ EBPγ bind with C/EBPγ at regions of increased chromatin accessibility and alter the neuronal response to injury.
Given the context-dependent activity of C/EBPγ, its functional roles in developing and mature nervous system cells are poorly understood.To broadly identify genes that could be regulated by C/EBPγ in the context of nerve injury, we knocked down Cebpg in a rat nociceptor cell line and found differentially expressed genes associated with a variety of pain-related biological processes including carboxylic acid transport, trans-synaptic signaling, signaling receptor activity, and ion homeostasis.Our data support existing literature regarding the involvement of C/EBPγ in mediating the development of chronic pain [20].In the sciatic crush nerve injury model, Cebpg expression was significantly upregulated in the DRG at approximately 1 day post-injury before returning to baseline levels after 3 days [21].Similarly, downregulation of Cebpg in the DRG and spinal cord reduced mechanical hypersensitivity following sciatic nerve injury [22].Further research is needed to better understand the impact of C/EBPγ-mediated gene repression in the DRG following CCI and during the development and maintenance of pain hypersensitivity as well as in nerve regeneration.
Our study is the first multiomic assessment of the regulatory landscape of DRG cells following nerve injury.Despite the small number of cells in each DRG, we were able to generate ChIP-seq libraries to identify putative regulatory elements in addition to assessing chromatin accessibility and gene expression in CCI and naïve rats.However, our study has some limitations.Our findings mainly reflect those of bulk tissues which are a heterogeneous mixture of sensory neurons and microsatellite glia, but lack of cell-type specific information.In addition, we also conducted our study using exclusively young adult female rats where the regulatory landscape may differ from male rats and in elderly animals of both sexes.
In conclusion, our multiomic assessment of nerve-injured and naïve DRG provides novel insights into the role of chromatin structure at regulatory elements and activities of transcription factors that together may modulate neuronal excitability and nerve regeneration.Our improved understanding of how epigenetic perturbations alter transcription in response to nerve injury, and their relationship to chronic pain, will facilitate the development of novel classes of analgesics that can target these and other similar mechanisms upstream of transcription.

Animals
Adult female Sprague-Dawley rats (12-16 weeks old) (Harlan Bioproducts for Science, Indianapolis, IN) were housed 2-3 per cage in centralized animal care facilities with a 12-h light/dark cycle.Animals were allowed to acclimate for a minimum of 48 h prior to any procedures and given ad libitum access to food and water.All procedures involving animals were reviewed and approved by the Johns Hopkins Animal Care and Use Committee and are performed in accordance with the NIH Guide for the Care and Use of Laboratory Animals.

CCI of the Sciatic Nerve
CCI surgery on the sciatic nerve was performed on all rats as previously described [23].Under 2-3% isoflurane, a small incision was made at the level of the mid-thigh.The sciatic nerve was exposed by blunt dissection through the biceps femoris.The nerve trunk proximal to the distal branching point was loosely ligated with four 4-O silk sutures placed approximately 0.5mm apart until the epineuria was slightly compressed and minor twitching of the relevant muscles was observed.The muscle layer was closed with 4-O silk suture and the wound was closed with metal clips.All surgical procedures were performed by the same individual to avoid variation in technique.Hypersensitivity of the hind paws was verified by von Frey monofilaments [24] on day 14 post-injury.

Experimental Design
Rats were assigned randomly to either receive CCI surgery or no procedure (i.e., naive control) (Fig. 1a).On postoperative day 14, rats were euthanized by an overdose of isoflurane and decapitation after which the ipsilateral L4-L6 DRGs were quickly dissected.For ChIP-seq and RNA-seq, DRGs were immediately submerged in liquid nitrogen and stored at −80°C until processing.

ATAC-seq Library Preparation
Immediately following dissection, the ipsilateral L4-L6 DRGs from one rat were transferred directly to cold lysis buffer (0.32M sucrose, 5mM CaCl 2 , 3mM mg (acetate), 10mM Tris-HCl, pH 8.0, 0.1% Triton X-100, 1mM DTT, 5mM Na Butyrate, 1mM PMSF).Nuclei were isolated through dounce homogenization of the tissue in lysis buffer followed by ultracentrifugation through a sucrose cushion (1.8M sucrose, 3mM mg (acetate), 1mM DTT, 10mM Tris-HCl, pH 8.0, 5mM Na Butyrate, 1mM PMSF) at 139,800 × g at 4° C for 2 h to remove mitochondrial DNA.The nuclei were resuspended in 1× PBS and counted 3 times using a Neubauer chamber.Tagmentation by Tn5 was performed using reagents from the Nextera DNA Sample Preparation Kit (FC-121-1030, Illumina; San Diego, CA) as previously described [8].Each 50μl reaction contained 50,000 nuclei, 25μl 2× Tagmentation Buffer, and 2.5μl Tn5 enzyme and incubated at 37° C for 30 min.Tagmented DNA was immediately purified using the Clean and Concentrate-5 Kit (Zymo, Irvine, CA) and eluted in 10μl elution buffer.Tagmented DNA fragments were amplified using Nextera Index adapters, PCR primer cocktail, NPM PCR master mix, and 10 cycles of PCR.Each library was purified using Agencourt AMPure XP beads (Beckman Coulter; Atlanta, GA).The fragment distribution of each library was assessed using the High Sensitivity DNA Kit on an Agilent 2100 Bioanalyzer (Agilent Technologies; Palo Alto, CA).Libraries were quantified prior to sequencing using the Qubit DNA HS kit (ThermoScientific, Waltham, MA) and normalized to 2nM, and pooled in equimolar concentrations.Libraries were sequenced using paired-end, dual-index sequencing on an Illumina HiSeq 2500 (Illumina, San Diego, CA) which produced 50bp reads.ATAC-seq was performed on 3 biological replicates for the CCI group and 3 biological replicates for the naive group.

RNA-seq Dataset
We performed RNA-seq from DRGs from a similar cohort of adult, naïve female Sprague-Dawley rats or after CCI and validated this data using qPCR in biological replicates.Details regarding sample acquisition, RNA library preparation, and RNA-seq data processing have been published [11].RNA-seq data are available under accession #GEO100122.

50B11 Cell Culture
The 50B11 cells were a gift from Ahmet Höke of the Johns Hopkins University, Department of Neurosurgery.Cells were maintained in Neurobasal medium (ThermoScientific, Waltham, MA) supplemented with 2% B27 (Sigma; St. Louis, MO), 550μM glutamine (Sigma; St. Louis, MO), 12mM glucose, and 10% fetal bovine serum and incubated at 37°C in a humidified environment containing 5% CO 2 .Cells with low passage numbers (i.e., <20) were used for all experiments.

Cloning
Luciferase reporter constructs were generated by cloning a candidate enhancer region into the pGL3 promoter vector (Promega; Madison, WI).Each region was inserted using standard restriction enzyme-based cloning techniques.The regions were obtained by PCR of rat genomic DNA.The 5′ end of the primers was modified to contain BglII (forward primer) and MluI (reverse primer) restriction sites (Online Resource 9).PCR was performed using the Pfu Turbo polymerase (Agilent Technologies; Palo Alto, CA) and touchdown thermocycling.The PCR products were digested and ligated into the BglII (AGA TCT ) and MluI (ACG CGT ) restriction enzyme sites of the pGL3-Promoter luciferase vector (Promega; Madison, WI).The ligated products were transformed into chemically competent DH5α cells using ampicillin (100mcg/ml) to select for the recombinant plasmid-positive colonies.All constructs were verified by diagnostic restriction enzyme digestion and Sanger sequencing.

Transfection and Luciferase Assays
50B11 cells were seeded at 10,000 cells/well in 48-well plates in 250μl of complete media and grown to 60-80% confluence.Cells were then transfected with each reporter construct (450ng) and 50ng pGL4.74Renilla luciferase expression vector (Promega; Madison, WI) using ViaFect Transfection Reagent (Promega; Madison.WI) in 25ul Opti-MEM (ThermoScientific, Waltham, MA) with a 4:1 ratio in 250μl complete medium.The transfection efficiency of 50B11 cells was evaluated by transfecting cells with EGFP-N1 (Clontech; Mountain View, CA) in parallel reactions.At 48 h post-transfection, Firefly and Renilla luciferase were measured using the Dual-Glo Luciferase reporter assay system (Promega).The amount of Firefly luciferase normalized to the Renilla luciferase and expressed as the relative fold difference of the empty pGL3 promoter vector.Each enhancer construct was tested in quadruplicate.

Transcription Factor Protein Microarrays
Human transcription factors were purified from yeast as GST fusion and arrayed on FAST slides in duplicate as previously described [12,25].The microarrays were probed with a 39-nucleotide sequence from differentially accessible regions used in the luciferase assays.Sixty base oligonucleotides were synthesized by IDT and contained this 39-nucleotide sequence followed by the reverse T7 sequence.The DNA probes were converted to double-stranded DNA with either Cy3-or Cy5-labeled T7 primer as the 5′-end.The labeled double-stranded T7 sequence was chosen as a negative control for T7-specific binding.Each sequence was tested in duplicate arrays with alternating fluorescent labels.The slides were then washed and scanned with a GenePix 4000B scanner (Molecular Devices, Sunnyvale, CA), and the binding signals were acquired using the GenePix 6.0 software.GenePix 6.0 was used to align the spot-calling grid and record the foreground and background intensities for every protein spot.The raw binding intensity for each probe was defined as F ij /B ij , where F ij and B ij are the median values of foreground and background signals of the probes at site (i,j) on the microarray, respectively.We normalized the raw signal of each probe based on the median value of the raw signals of its neighboring probes.The Z-score of each binding assay was calculated by Z i,j = (R' I,j − Ñ)/std(N), where R' I,j is the locally normalized intensity of probe (I,j) on the microarray, Ñ and std(N) are mean value and standard deviation, respectively, of noise distribution on the microarray.Since each protein is printed in duplicate on the microarray and each binding assay was performed in duplicate, a protein was identified as a positive hit only when all of its 4 spots produced a Z-score > 3.0.

EMSA
The binding reaction was carried out with 100 fmol of biotinylated dsDNA probe and 1 pmol of purified C/EBPγ protein in 20μl of binding buffer as previously described [12,25].Tenfold unlabeled T7 was added to the competition assay.The C/EBPγ expression clone used in the EMSA was verified by DNA sequencing.

ChIP-qPCR in 50B11 Cells
ChIP-qPCR was used to quantify Cebpg binding to the putative enhancer element (rn6:chr12:12,085,353-12,088,386) in 50B11 cells.Five million 50B11 cells were cross-linked in 1% formaldehyde for 10 min at room temperature.A final concentration of 0.125M glycine was added to quench the crosslinking reaction.The cells were washed in chilled 1× PBS and subjected to dounce homogenization in lysis buffer (5mM HEPES, 85mM KCl, 1% IGEPAL CA-630 in 1× PBS, 1× Roche Complete protease inhibitors, 5mM Na butyrate).The homogenate was centrifuged for 10 min at 10,00×g at 4°C to pellet the nuclei.Nuclei were resuspended in nuclei lysis buffer (5mM Tris-HCl, pH 8.0, 10mM EDTA, 1% SDS in 1× PBS, 1× Roche Complete protease inhibitors, 5mM Na butyrate) and incubated at room temperature for 10 min.The chromatin was then sheared using a Qsonica Q800R2 (Qsonica, Newtown, CT) at 75% power for 15 cycles of 30 s on/30 s off to produce DNA fragments with lengths between 200 and 600 base pairs.Chromatin was then stored at −80°C until immunoprecipitation.Sheared chromatin was incubated with anti-CEBPG or anti-rabbit IgG control antibody attached to protein A Dynabeads (Invitrogen) and incubated overnight at 4°C.An aliquot of sheared chromatin was taken as the input sample (i.e., preprecipitation control).Following incubation, each immunoprecipitation reaction was washed 3 times with low salt wash buffer (50mM Tris-HCl, pH 7.4, 10mM EDTA, 0.1% SDS, 1% Triton X-100, 150mM NaCl, 1mM PMSF), five times with high salt wash buffer (50mM Tris, pH 7.4, 500mM NaCl, 1% Triton-X-100, 0.1% SDS, 10mM EDTA, 1mM PMSF), and once with 250mM LiCl, 50mM Tris, pH 7.4.The DNA-histone complexes were eluted from the Dynabeads (1% SDS, 100mM NaHCO 3 ) at 65°C.Cross-links between the DNA fragments and histones were reversed, and the DNA fragments were recovered using the ChIP DNA Clean & Concentrator Kit (Zymo, Irvine, CA) and quantified by Qubit.Each 10 μl qPCR reaction consisted of 2× SSoAdvanced Universal SYBR green qPCR Master Mix (Bio-Rad, Hercules, CA), 200 nM each forward and reverse primer, and 3.2 μl ChIP DNA.PCR of each target was performed in triplicate using the CFX384 Touch Real-Time PCR Detection System (Bio-Rad, Hercules, CA) with the following thermocycling conditions: initial denaturation at 98°C for 3 min followed by 40 cycles of 98°C for 15 s and 60°C for 22 s.Nuclease-free water was included as the notemplate control.Ct values were used to calculate the percent input/fold enrichment above the IgG/NTC controls.Primers used to amplify the 5′ region of Cebpg were CCT TGA GGG TTC TTC GGC TG (forward) and CTG TGG TGT GCT CGA GTG AT (reverse).

Generation of CEBPG Knockdown in 50B11 Cells
Stable knockdown of Cebpg in 50B11 cells was done by transfecting 50B11 cells with pGFP-C-shLenti carrying shRNA against Cebpg (1μg/ml; TL709448; Origene, Rockville, MD) following the manufacturer's recommendations.A 29-mer scrambled shRNA cassette in the pGFP-C-shLenti vector (TR30021; Origene) was used as the off-target control.Transduced cells then underwent puromycin selection and visible confirmation for GFP.Cebpg knockdown efficiency compared to control was verified by Western blot (Online Resource 3).

RNA-seq
Total RNA was extracted from transgenic Cebpg knockdown and control 50B11 cell lines using trizol-chloroform.The RNA Clean and Concentrate-5 kit was used to clean up RNA from the aqueous phase with on column DNAseI digestion.RNA concentration was measured using the NanoDrop One Spectrophotometer (Thermo Fisher Scientific, Waltham, MA) and RNA integrity was assessed using RNA Nano Eukaryote chips in an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA).One microgram of total RNA was used to construct sequencing libraries as previously described [1,11].Strandspecific RNA libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina with NEBNEXT poly(A) mRNA Isolation Module (New England Biolabs) according to the manufacturer's recommendations.Samples were barcoded using the recommended NEBNext Multiplex Oligos (New England Biolabs).The size range and quality of libraries were verified on the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA).RNA-seq libraries were quantified by qPCR using the KAPA library quantification kit (KAPA Biosystems, Wilmington, MA).Each library was normalized to 2nM and pooled in equimolar concentrations.Single-end sequencing (i.e., 1X75bp) was performed in a single run on an Illumina NextSeq 500 (Illumina, San Diego, CA).Three independent experimental replicates were run for the Cebpg shRNA-mediated knockdown and shRNA control transgenic lines for a total of 6 libraries.

Data Analysis
ChIP-seq Raw fastq files were aligned to rat RN6 genome using Bowtie2 [26].Duplicated reads were removed using Picard tools [27].Peak calling was performed with MACS2 [28].Regions of H3K4me1 enrichment were identified by MACS2 for the CCI biological duplicates and separately for the Naive biological duplicates using input samples as controls and the following settings: --keep-dup all -B --SP RM --nomodel --broad.All regions found to be enriched in either group were merged into a single list and overlapping regions reduced to be expressed as a single region.

ATAC-seq
Paired-end reads were trimmed using Trimmomatic [29] to remove adaptors.The trimmed reads were then aligned to rat genome rn6 using Bowtie2 [26] with the following parameters: X2000-no-mixed-no-discordant.Reads with a mapping quality score less than 10 were removed using SAMtools [30] and duplicated reads were removed using the MarkDuplicates function in Picard.The genomic coordinates for each read were then shifted 4 bases (positive strand) or 5 bases (negative strand) 5′ relative to the reference genome to adjust each read for the Tn5 binding footprint [8].Each read was then trimmed to produce a single base located at the 5′ end of each read.Each single base read was extended 75 bases in each direction so that the Tn5 insertion site was located at the center of a 150-base read.These shifted reads were provided as input for peak calling with MACS2 using the following parameters: --nomodel --extsize 150 -B --keepdup all --call-summits.The read density was calculated in 300 nucleotide bins across the genome for each sample.
Visualization Tracks for each sample were created for visualization in IGV.The number of slopped insertion sites for each sample was downsampled to 30 million and converted to bigWig files using ucsctools.To create an aggregated track for each treatment group, all slopped insertion sites were concatenated into a single file and downsampled to 125 million sites for each group.These files were then converted into bigWig format.In IGV, these aggregated tracks were subtracted to create an additional track to visualize regions of increased and decreased DNA accessibility.

Identification of Differentially Accessible Regions
Only regions where both H3K4me1 enrichment and accessibility in one or more samples by ATAC-seq were used to identify regions of differential accessibility.To obtain a consensus list of candidate regions, ATAC-seq peaks from each sample were subsetted by the regions of H3K4me1 enrichment.These regions were evaluated for differential accessibility using DiffBind [31].To increase the stringency of the analysis, an ATAC-seq peak must have been present in a minimum of 2 samples per group.Therefore, the ATAC-seq peak was required to be present in both CCI samples and 2 of the 3 naive samples for inclusion into analysis.Because the contribution of any single region to the CCI phenotype is predicted to be relatively small, we conducted the differential analysis using a permissive significance threshold of p< 0.05 to avoid missing potentially important contributors.The results were annotated using HOMER according to default parameters and merged with the RNA-seq data.Regions that were associated with increased or decreased gene expression were identified.Gene lists were used as input for GO analysis of biological function using the ToppGeneSuite [32].Motif analysis was conducted using the findMotifsGenome.plcommand in HOMER and the rn6 genome with the following parameters: -size 200 -len 6, 8, 10, 12.
RNA-seq of 50B11 Transgenic Lines Sequencing reads were aligned to annotated RefSeq genes in the rat reference genome (rn6) using HISAT2 [33] and filtered to remove ribosomal RNA.A gene count matrix that contained raw transcript counts for each annotated gene was generated using the featureCounts function of the Subread package in R [34] against the Ensemble rn6 transcriptome.This count matrix was then filtered for low count genes so that only those genes with >0 reads across all samples were retained.We relied on the automatic and independent filtering used by DESeq2 to determine the most appropriate threshold for removing genes with low counts [35].To identify genes that were differentially regulated with Cepbg shRNA-mediated knockdown, raw transcript counts were normalized, log 2 transformed, and analyzed using the default procedures in DESeq2 [35].Adjusted p-values were corrected using the Benjamini-Hochberg method.An adjusted p-value <0.05 and an absolute log 2 fold change > 0.5 were used to define differentially expressed genes between knockdown and control.REVIGO was used to reduce and visualize GO enrichment data [36].

Fig. 2
Fig.2Changes in chromatin accessibility at gene promoters.a Venn diagram shows the overlap of naive and CCI accessible regions at annotated gene promoters (top).Bar plot showing the top 10 enriched Gene Ontology Biological Processes identified from 108 peaks associated only with annotated promoters in DRGs from CCI rats (bottom).The threshold for statistical significance was set to p-value < 0.01 (black horizontal line).b Heatmap that shows the normalized accessibility and hierarchical clustering of the gene promoters that were enriched in the top 5 GO terms from a. c Normalized chromatin accessibility tracks at the Scn11a gene promoter (chr8:128,519,864-128,522,250).The individual and averaged ATAC-seq signal of the normalized bigwig files for each sample as displayed from the Integrated Genomics Viewer (left).The "difference" track was created by subtracting the average track from the naïve group from the average track of the CCI group.Green indicates an increase in accessibility in the CCI group compared to naïve.The region highlighted in yellow (chr8:128,521,066-128,521,432) indicates the differentially accessible region between the naïve and CCI samples.A box plot of the normalized, log 2 transformed gene expression of Scn11a for each sample is provided on the right.d Heatmap of read density for all DARs at the promoter in the CCI and naïve groups with increased accessibility (top panels) and decreased accessibility (bottom panels).Each row represents one promoter region and the regions are aligned to the transcription start site for each gene.The color intensity represents the magnitude of chromatin accessibility.The average read density across all regions for each heatmap is shown on the right.e Bar plot of the gene ontology analysis of the biological processes identified in promoter DARs after CCI versus naïve

Fig. 3
Fig. 3 Differentially accessible regions in putative cis-regulatory regions.a A consensus peakset of 58,446 regions showed chromatin accessibility in regions deposited with the H3K4me1 histone modification.Heatmap of the read density of each of these regions in the naïve and CCI ATAC-seq samples and H3K4me1 ChIP-seq.Each row represents one region and the regions are aligned by its center.The color intensity represents the magnitude of read coverage.b Volcano plot of differential accessibility of the 58, 446 accessible regions

Fig. 4 Fig. 5
Fig. 4 Gene expression associated with DARs. a Scatterplot shows the FC accessibility of each region of increased accessibility and increased expression of the nearest annotated gene.Significant DARs are highlighted in red.b Gene ontology analysis of the 79 genes with increased expression nearest to 109 regions of significantly increased chromatin accessibility.c Scatterplot shows the FC accessibility of each region of increased accessibility and increased expression of the nearest gene by RNA-seq.Significant DARs are highlighted in red.d Chromatin accessibility at 2 intergenic DARs (yellow highlight):