Abstract
Although our understanding of the involvement of heterochromatin architectural factors in shaping nuclear organization is improving, there is still ongoing debate regarding the role of active genes in this process. In this study, we utilize publicly-available Micro-C data from mouse embryonic stem cells to investigate the relationship between gene transcription and 3D gene folding. Our analysis uncovers a nonmonotonic - globally positive - correlation between intragenic contact density and Pol II occupancy, independent of cohesin-based loop extrusion. Through the development of a biophysical model integrating the role of transcription dynamics within a polymer model of chromosome organization, we demonstrate that Pol II-mediated attractive interactions with limited valency between transcribed regions yield quantitative predictions consistent with chromosome-conformation-capture and live-imaging experiments. Our work provides compelling evidence that transcriptional activity shapes the 4D genome through Pol II-mediated micro-compartmentalization.
Introduction
Chromosome conformation assays such as Hi-C have unveiled the hierarchical organization of chromosomes within eukaryotic nuclei1,2. In metazoans, Mbp-scale “checkerboard” patterns in contact maps reveal the spatial segregation of chromosomes into a euchromatic “A” compartment and a heterochromatic “B” compartment3,4. At a smaller scale (~100s of kbp), chromosomes fold into topologically associating domains (TADs) and loops5,6,7. The prevailing model8 suggests that compartments emerge from the micro-phase separation of epigenomic regions mediated by chromatin-binding architectural proteins9,10, while most TADs result from cohesin-driven chromatin loop extrusion with CTCF acting as a barrier11,12. Hi-C and live imaging experiments indicate that depleting CTCF or cohesin disrupts TADs and weakens CTCF-mediated loops, but has limited effects on compartmentalization7,13,14,15,16. Nevertheless, some loops or TADs remain unaffected by these treatments14,17, likely originating from distinct mechanisms.
3D chromosome organization regulates gene expression during interphase18,19,20. Notably, colocalization of promoters and enhancers within TADs can directly influence transcription initiation, potentially increasing transcription rates20,21. Conversely, recent studies suggest that genes serve as central units of the 3D genome and that transcription itself plays a role22,23,24,25. High-resolution contact maps in mammalian and fly cells reveal transcription-dependent fine structures, such as loops between active gene promoters, promoters and enhancers, or transcriptional start (TSS) and termination (TTS) sites of the same gene13,26,27,28,29. However, the mechanistic origins of these fine structures, despite their potential significance in gene regulation, remain controversial.
Indeed, on the one hand, some experiments in Drosophila and mice indicate higher 3D contacts within expressed gene bodies compared to repressed ones29,30. Remodeling of chromatin structure around genes during mouse thymocyte maturation often coincides with transcriptional changes31. RNA polymerase II (Pol II) molecules also form distinct foci and higher-order clusters known as transcription factories32,33,34,35, and active genes tend to colocalize within transcriptionally active subcompartments27,36. On the other hand, there are cases in mammalian cells where significant unfolding of genes occurs after strong transcriptional activation37,38,39, and acute depletion of Pol I, II, and III has minimal effects on large-scale genome folding40. In budding yeast, gene activity inversely correlates with local chromatin compaction25. Live-cell imaging experiments highlight the relationship between gene transcription and chromatin dynamics41,42,43,44,45, revealing enhanced gene mobility upon Pol II elongation inhibition41,43,44 or gene activation42 and correlated motions between active regions44,45.
Complementing experiments, biophysical and polymer models have also explored the complex interplay between transcription and genome dynamics35,37,46,47,48,49. Elongating or backtracked Pol IIs may act as barriers for SMC-mediated extrusion, indirectly impacting genome organization38,46,47,50,51,52,53,54. Transcription-dependent changes in the local chromatin fiber rigidity and contour length may lead to extended conformations for highly transcribed genes37. Computational models considering Pol II-mediated interactions35, interactions with transcription factories or condensates48,49, or P-TEFb interactions43 suggest that attractive interactions between active regions may capture inter-gene contacts observed in Hi-C and the gene mobility observed in live-imaging.
Overall, the evidence presents a complex understanding of the genome spatio-temporal dynamics in response to gene transcription, necessitating a comprehensive framework to reconcile these observations. In this study, we analyze publicly available Micro-C data for mouse embryonic stem cells (mESCs) and develop observables to characterize transcription-dependent 3D gene folding. Our analysis reveals a nonmonotonic relationship between intragenic contact density and gene transcription, potentially reconciling contradictory data. By dissecting the contributions of loop extrusion and transcription-associated factors, we propose Pol II occupancy as a key determinant of gene folding. Using a traffic model for gene activity and a 3D polymer model55,56, we demonstrate that transcriptionally active subcompartments and intragenic contact enrichment may arise from Pol II-mediated phase separation. Furthermore, we suggest that Pol II-mediated condensation, coupled with transcriptional bursting, may slow down gene mobility, aligning with experimental observations.
Results
RNA Pol II occupancy and gene length correlate with intra-gene condensation
In this study, we aimed to investigate the potential role of transcriptional activity in the local organization of genes within the genome. To accomplish this, we focused on mouse embryonic stem cells (mESC) as they provide abundant quantitative data. Specifically, we utilized publicly available high-resolution Micro-C and Pol II ChIP-seq data13. For our analysis, Micro-C contact maps were distance-normalized to examine contact enrichment compared to a sequence-averaged null behavior, resulting in the observed over expected (obs/exp) contact map. We introduced two scores for each gene (Fig. 1A and Methods): (i) Intra-gene contact enrichment (IC), which represents the mean obs/exp values calculated for all pairs of loci within the gene, capturing the level of self-association and overall gene condensation. (ii) Intra-gene RNA Pol II enrichment (IR), which corresponds to the mean normalized Pol II ChIP-seq profile within the gene and reflects gene transcriptional activity, correlating with RNA-seq data (Supplementary Fig. 1).
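As an illustration of the two steps described above (distance normalization, then averaging within a gene), the following minimal sketch computes an obs/exp map and an IC-like score on a toy contact matrix. The function names `obs_over_exp` and `intra_gene_score`, the toy matrix, the bin indices and the choice to exclude the main diagonal are assumptions made for this example; the exact procedure used in the study is the one given in Methods.

```python
def obs_over_exp(contacts):
    """Divide each entry by the mean contact count at the same genomic separation."""
    n = len(contacts)
    expected = []
    for sep in range(n):
        diagonal = [contacts[i][i + sep] for i in range(n - sep)]
        expected.append(sum(diagonal) / len(diagonal))
    return [
        [contacts[i][j] / expected[abs(i - j)] if expected[abs(i - j)] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def intra_gene_score(oe_map, start, end):
    """Mean obs/exp over all bin pairs i < j inside the gene [start, end).

    Assumption: the main diagonal is excluded and no log transform is applied.
    """
    values = [oe_map[i][j] for i in range(start, end) for j in range(i + 1, end)]
    return sum(values) / len(values)


def toy_map(n_bins, gene=None, boost=1.0):
    """Toy symmetric map with 1/(1+s) decay; contacts inside `gene` are multiplied by `boost`."""
    matrix = [[1.0 / (1 + abs(i - j)) for j in range(n_bins)] for i in range(n_bins)]
    if gene is not None:
        start, end = gene
        for i in range(start, end):
            for j in range(start, end):
                matrix[i][j] *= boost
    return matrix


# A gene spanning bins 2..5 whose internal contacts are doubled relative to background.
ic_enriched = intra_gene_score(obs_over_exp(toy_map(8, gene=(2, 6), boost=2.0)), 2, 6)
# The same region on a map with no enrichment gives a score of about 1.
ic_neutral = intra_gene_score(obs_over_exp(toy_map(8)), 2, 6)
```

An IR-like score could be obtained in the same spirit by averaging a normalized ChIP-seq track over the bins of the gene.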
A Observed Micro-C contact map (top), observed/expected map (middle) and several ChIP-seq profiles (bottom) of the genomic region including the Ipo5 gene (chr14:120,874-120,984 kb) in mESC. The intra-gene contact (IC) and RNA Pol II (IR) enrichments are illustrated. B Scatterplot of IC versus IR for all genes longer than 1 kb (24,363 genes). Colors refer to the density of dots. C Boxplots of IC after clustering together the genes (dots in (B)) with similar IR. The number of genes in each cluster from left (IR score = −3) to right (IR score = 7) are, respectively, 175, 1629, 6603, 4515, 3926, 3673, 2463, 946, 265, 109, and 37. Boxplots present the median and 25th and 75th percentile, with the whiskers extending to 1.5 times the interquartile range. Two-tailed t-tests were performed between the two last clusters 6 and 7, p-value = 0.0001. Source data are provided as a Source Data file.
Fig. 1B shows a significant positive correlation (Spearman’s ρ = 0.56, t test p-value < 1e-200) between IC and IR scores, indicating that increased transcriptional activity is associated with enhanced intra-gene condensation. This result is consistent with prior research on mouse ESC29, Drosophila30 and mouse DP and DN3 thymocytes31. We also checked that this correlation remains largely independent of the phosphorylation status (Ser5P and Ser2P) of Pol II (Supplementary Fig. 2), which is associated with different dynamical and interacting states of the polymerase57. Additionally, a similar correlation between IC and intra-gene H3K36me3 (a histone mark related to Pol II elongation) content was detected (Supplementary Fig. 1). As a control, we observed weak, negative correlations between IC and repressive marks (H3K27me3 and H3K9me3) (Supplementary Fig. 3). Clustering genes based on similar IR scores revealed a nonlinear and non-monotonic relationship (Fig. 1C): IC generally increases with IR, except at very high Pol II levels where a slight but significant relative decrease in contact frequency occurs. Importantly, this behavior cannot be solely attributed to the inherent tendency of Micro-C experiments to detect more or fewer contacts depending on the molecular crowding on DNA25, since similar behavior was observed using mESC Hi-C data19 (Supplementary Fig. 4). Interestingly, this correlation between IR and IC holds true regardless of gene compartment (A or B) (Supplementary Fig. 5) or the number of exons19 (Supplementary Fig. 6). However, genes with higher exon counts tend to exhibit more intra-gene contacts compared to those with fewer exons. Moreover, IC scores for A-compartment genes are generally higher than those for B-compartment genes, which may suggest an interference between the segregation of heterochromatin and the condensation of Pol II-enriched genes.
Interestingly, this difference becomes more pronounced for genes with high IR scores, where the drop in IC is more significant for B-genes.
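The two summary statistics used in this paragraph, a rank correlation between IC and IR and the median IC within groups of genes of similar IR, can be sketched as follows. This is only an illustration: grouping genes by rounding IR to the nearest integer is an assumption suggested by the integer cluster labels in Fig. 1C, and the helper names `spearman_rho` and `median_ic_by_ir_bin` are hypothetical.

```python
from collections import defaultdict
from statistics import median


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def median_ic_by_ir_bin(ir_scores, ic_scores):
    """Group genes by IR rounded to the nearest integer (assumed binning); return median IC per group."""
    bins = defaultdict(list)
    for ir, ic in zip(ir_scores, ic_scores):
        bins[round(ir)].append(ic)
    return {label: median(values) for label, values in sorted(bins.items())}
```

Comparing the per-group medians, rather than relying on a single correlation coefficient, is what makes a non-monotonic trend at high IR visible.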
In mammals, highly active genes are typically smaller, and larger genes, when active, are usually lowly expressed (Supplementary Fig. 7). Hence, we investigated whether gene size may be a confounding factor or, on the contrary, could be a determining factor by classifying genes based on both their IR and genomic length (Methods). Figure 2B displays the average IC score for each category, revealing a positive correlation between IC and gene length: longer genes exhibit stronger intra-gene contact frequency at a given Pol II occupancy density. These findings suggest a cooperative effect in intra-gene folding, wherein both Pol II density and gene size play integral roles58.
A Pileup meta-gene analysis (PMGA, see Methods) of the obs/exp map around genes clustered based on their length (horizontal axis) and Pol II enrichment (vertical axis). The number of genes of each cluster is indicated above on each map. Maps for clusters with less than 25 representative genes were not drawn, due to lack of statistics. B Average IC scores for each cluster in (A). C PMGA of different chromatin tracks: in each subplot, all the average profiles of the different Pol II clusters for genes of the same length range are shown (from left to right: from small to large genes); different colors correspond to the different Pol II clusters, from low (blue) to high (red) IR score (respectively, −2,−1,0,1,2,3,4,5,6). Source data are provided as a Source Data file.
To gain deeper insights into the contact patterns and profiles within and surrounding genes, we conducted a pile-up meta-gene analysis (PMGA), aggregating the rescaled obs/exp maps and ChIP-seq profiles of genes with similar size and transcriptional activity (Fig. 2A, C and Methods). PMGA uncovered a strong correlation between Pol II profiles and certain structural features of contact maps: intra-gene contact maps were nearly uniform, consistent with the constant Pol II levels observed within genes; stripes of preferential interactions were observed between Pol II-rich promoters/TSSs and gene bodies (stripes); and loops were formed between Pol II-rich TSSs and TTSs (TSS-TTS loops). Notably, the correlation between Ser2P Pol II (which has no peak at the TSS), Ser5P Pol II (only weak peaks at the TSS and TTS) and H3K36me3 (strong depletion around TSSs) profiles, taken individually, and Micro-C patterns such as TSS-TTS loops and stripes was less apparent (Fig. 2A, C and Supplementary Fig. 2). Regarding the dependency on gene size, we found that promoters of short genes are often located at domain borders, while larger genes tend to form their own insulated domains separate from surrounding regions.
To further investigate the role of Pol II occupancy, we analyzed two publicly available datasets in which mESCs were treated with transcriptional inhibitor drugs: triptolide (TRP), which inhibits Pol II initiation, and flavopiridol (FLV), which inhibits Pol II elongation29. First, we confirmed the significant reduction in the intensity of Pol II-mediated loops after both treatments (Supplementary Fig. 8). Consistent with the observed loss of intra-gene Pol II occupancy in all genes, particularly highly transcribed ones (Supplementary Fig. 9 and Supplementary Fig. 10), intra-gene interactions were weaker in the TRP and FLV cases compared to the normal condition (Fig. 3A, Supplementary Fig. 11 and Supplementary Fig. 12), resulting in a 12% reduction in IC for large active genes post-treatment. These results align with a previously reported 25% reduction in the intensities of gene stripes following Pol II inhibition29. Moreover, there is a notable correlation between the fold-changes (treated vs untreated) in IC and IR scores (Fig. 3C): the greater the reduction in Pol II occupancy for a given gene, the more likely its intra-gene folding is affected. Interestingly, when re-clustering genes based on their new IR scores measured in TRP- and FLV-treated cells, we still observed an average increase in IC as a function of IR similar to the untreated case (Fig. 3D), suggesting that the remaining intra-gene interactions observed after transcription inhibition may be attributed to residual Pol II occupancy. This ‘master curve’ provides further evidence that the Pol II level alone is predictive, on average, of intra-gene folding regardless of the condition (treated or untreated).
A Comparison between PMGA of untreated WT cells and cells treated with transcription inhibitors for genes with size of 64-128 kb, IRwt > 1 in WT condition and with a reduced IR score in treated cells (IRtreat.<IRwt). B PMGA for genes with size of 64-128 kb and 1<IRwt < 2 in conditions of reduced CTCF, RAD21 or WAPL levels or for a subset of genes with low SMC1a level in WT cells (WT no SMC1a, leftmost). C Scatterplot of the fold change in intra-gene contact enrichment against the fold change in Pol II occupancy after TRP treatment for genes >64 kb, IRwt > 1 in WT condition and with a reduced IR score in treated cells (IRtreat.<IRwt). The Spearman correlation is given. D The intra-gene contact enrichment upon acute depletion of RAD21, CTCF and WAPL, by IAA treatment of an engineered ES cell line, or by treatment with triptolide (TRP), as a function of IR score in the treated cells. The Spearman’s correlation between average IC and IR scores of WT is 0.97. Data are presented as mean values ± SD and were computed over a number of genes always higher than 10 (median number = 1735). E Scheme summarizing the different determinants of structures observed inside or around active genes. Source data are provided as a Source Data file.
In summary, our findings demonstrate a correlation between intra-gene condensation, interaction patterns, local transcriptional activity, gene length, and Pol II occupancy.
Cohesin-mediated loop-extrusion activity plays a minor role in intra-gene condensation
Recent studies, both experimental and theoretical, have proposed that the loop extrusion mechanism, which plays a crucial role in the formation of TADs, might have an impact on the transcription machinery23,38,46,47,54. Interestingly, we observed a significant correlation between the occupancy of CTCF and cohesin, the main players in loop extrusion, and the intra-gene contact enrichment and Pol II occupancy (Fig. 2C and Supplementary Fig. 1). This observation led us to investigate whether cohesin-mediated loop extrusion could drive the correlation between transcriptional activity and intra-gene folding discussed earlier.
To address this question, we analyzed our original dataset from wild-type mESCs and excluded genes with high SMC1a (a cohesin subunit) occupancy (Methods). The remaining genes, clustered based on IC and gene length, showed significantly lower levels of CTCF and cohesin, while Pol II profiles remained largely unchanged (Supplementary Fig. 13C). Even within this subset of genes, the IC and IR scores still exhibited a strong correlation at a level similar to the full wild-type set (Fig. 3D, Supplementary Fig. 13A, B). Additionally, PMGA revealed that the typical interaction patterns observed within genes were still visible for cohesin-poor genes, although certain features such as stripes, which are known to be footprints of loop extrusion activity near extrusion barriers38,46, were absent outside the genes (black arrows in Fig. 3B left).
Furthermore, we utilized three publicly available mESC datasets where CTCF (∆CTCF), the cohesin subunit RAD21 (∆RAD21), or the cohesin unloader WAPL (∆WAPL) were acutely depleted13. These treatments led to significant alterations in CTCF and cohesin occupancies throughout the genome13, as well as changes in TAD folding and CTCF-CTCF loops7,14, such as a strong reduction in loop intensity in ∆CTCF and ∆RAD21 and reinforcement in ∆WAPL (Supplementary Fig. 8 bottom). However, most gene expressions remained unaltered13, and the majority of loops between Pol II peaks were unaffected (Supplementary Fig. 8 top). Surprisingly, despite the acute changes in intra-gene CTCF and cohesin profiles (Supplementary Fig. 14C, Supplementary Fig. 15C and Supplementary Fig. 16C), we observed only minimal effects on intra-gene interactions (Figs. 3B, 3D, Supplementary Fig. 12, Supplementary Figs. 14–16A, B). The most noticeable—yet weak—changes in IC scores occurred in highly active genes (high IR), with an average 5% reduction in ∆RAD21 (Fig. 3B). However, the changes in IC between WT and ∆RAD21 conditions did not exhibit a clear correlation with changes in RAD21 occupancy (Supplementary Fig. 17). Similar to cohesin-poor genes in WT, the structural features associated with loop extrusion outside genes were lost or significantly reduced in ∆CTCF and ∆RAD21 (and enhanced in ∆WAPL) (Fig. 3E).
Collectively, these results indicate that cohesin-mediated loop extrusion does not significantly affect the specific organization of transcribed genes, suggesting the presence of an independent mechanism.
A biophysical model to investigate the role of transcription in gene folding
Our data analysis strongly suggests that Pol II occupancy drives the 3D organization of genes, independently of cohesin activity. Moreover, recent in vitro and in vivo experiments suggest that Pol IIs could form liquid-like droplets either directly through a phase-separation process mediated by weak interactions between their carboxy-terminal domains36,59,60,61,62 or indirectly via the formation of Mediator condensates triggered by nascent RNAs63. In the following, we developed a biophysical model to better characterize the phenomenology of Pol II-mediated gene folding by investigating how effective self-attractions between Pol II-occupied loci may shape the spatio-temporal dynamics of genes.
First, we built a stochastic model to describe Pol II occupancy and dynamics at a gene using a standard Totally Asymmetric Simple Exclusion Process (TASEP)64,65,66. In this model (Fig. 4A, Methods), Pol IIs are loaded onto chromatin at the TSS with rate \(\alpha\), transcription elongation initiates with rate \({\gamma }_{0}\), and Pol IIs then progress along the gene at rate \(\gamma\) until they unbind from chromatin at the TTS with rate \(\beta\). During this process, Pol IIs cannot overlap or bypass each other. We systematically varied the parameters of the TASEP model in order to predict different Pol II profiles along the gene at steady-state (Supplementary Fig. 18). For example, by varying model parameters (\({\gamma }_{0}/\gamma=1\), \(\beta /\gamma=1-\alpha /\gamma\)), we reproduced uniform average profiles of Pol II occupancy along the gene, ranging from low (~0.02) to high (~0.80) densities (Fig. 4B, C).
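To make the exclusion dynamics concrete, here is a minimal random-sequential Monte Carlo sketch of an open-boundary TASEP. It is not the authors' implementation (see Methods for that): the function name `simulate_tasep`, the update scheme, the default run lengths and the treatment of \({\gamma }_{0}\) as the rate of the first hop away from the TSS are assumptions made for illustration, with the loading, elongation and unbinding rates written as `alpha`, `gamma` and `beta`.

```python
import random


def simulate_tasep(n_sites, alpha, beta, gamma=1.0, gamma0=None,
                   sweeps=5000, burn_in=1000, seed=0):
    """Return the time-averaged occupancy of each site of an open-boundary TASEP.

    alpha: loading rate at the first site (TSS); beta: unbinding rate at the last
    site (TTS); gamma: hopping rate between neighboring sites; gamma0: rate of the
    first hop away from the TSS (assumed reading of "elongation initiation").
    """
    rng = random.Random(seed)
    gamma0 = gamma if gamma0 is None else gamma0
    max_rate = max(alpha, beta, gamma, gamma0)
    occupied = [0] * n_sites
    totals = [0] * n_sites
    n_moves = n_sites + 1  # entry move, n_sites - 1 internal hops, exit move
    for sweep in range(sweeps):
        for _ in range(n_moves):
            move = rng.randrange(n_moves)
            if move == 0:
                # Loading at the TSS, only if the first site is free.
                if occupied[0] == 0 and rng.random() < alpha / max_rate:
                    occupied[0] = 1
            elif move == n_sites:
                # Unbinding at the TTS.
                if occupied[-1] == 1 and rng.random() < beta / max_rate:
                    occupied[-1] = 0
            else:
                # Hop from site move-1 to site move; blocked if the target is occupied.
                rate = gamma0 if move == 1 else gamma
                if (occupied[move - 1] == 1 and occupied[move] == 0
                        and rng.random() < rate / max_rate):
                    occupied[move - 1], occupied[move] = 0, 1
        if sweep >= burn_in:
            for i in range(n_sites):
                totals[i] += occupied[i]
    n_samples = sweeps - burn_in
    return [t / n_samples for t in totals]


# Example on the line beta/gamma = 1 - alpha/gamma with gamma0 = gamma.
profile = simulate_tasep(n_sites=50, alpha=0.3, beta=0.7)
mean_density = sum(profile) / len(profile)
```

On the parameter line quoted in the text, the standard TASEP steady state is a flat profile with density close to \(\alpha /\gamma\), so the example should give a mean density near 0.3 up to sampling noise.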
A Schematic representation of the TASEP-decorated polymer model for gene transcription and 3D folding. B Pol II profiles along a 100kbp-long gene (L = 50 monomers) for parameters tuned to generate a uniform occupancy along the gene, from low (blue) to high (red) densities (respectively, 0.016, 0.024, 0.037, 0.058, 0.089, 0.139, 0.215, 0.333, 0.516, and 0.800). The solid and dashed curves are predictions from Monte-Carlo simulations and analytical calculations, respectively (see Methods). C Normalized transcription rate (top), defined as the average number of Pol II unloadings from the TTS per time unit divided by its maximum, and normalized effective elongation rate (bottom), defined as the inverse of the time needed for one Pol II to fully transcribe a gene, as a function of Pol II density. D Predicted contact maps around a 100kbp-long gene for different Pol II densities and valencies. Corresponding IC scores are given. E IC versus IR curves as a function of the elongation rate \(\gamma\) (top), strength of interaction E (middle) and valency (bottom). F IC scores against IR scores (Pol II density) and gene length for two different valencies. The color bar is presented in a log2 scale, while the values are given in a linear scale. G Examples of non-uniform Pol II profiles having significant accumulations at TSS and TTS. H (Top) PMGA analysis of the contacts around the TSS-TTS loop for 64-128 kb-long genes with increasing IR scores (from left to right) taken from the ∆RAD21 dataset. (Bottom) Model predictions around the TSS-TTS loop for the non-uniform cases described in (G). TSS-TTS loops and promoter stripes are shown with black and red arrows, respectively. Source data are provided as a Source Data file.
Next, to assess the spatial organization of a gene, we integrated the TASEP in a 3D polymer model of chromatin fiber55,56 (Fig. 4A, Methods). Briefly, we represented a 20 Mbp-long section of chromatin as a self-avoiding chain (1 monomer = 2kbp = 50 nm). We focused on a region of size L in the middle of the chain, which represents the gene of interest. Each monomer within the gene is characterized by a random binary variable indicating the local Pol II occupancy, whose dynamics is described by the TASEP. To investigate the impact of Pol IIs density and dynamics on gene folding, we assumed that monomers occupied by Pol II at a given time may self-interact at short-range with energy strength E. All the other monomers are considered non-interacting, neutral particles. The coupled stochastic spatio-temporal dynamics of the Pol II occupancies and 3D positions of the monomers are then simulated using kinetic Monte Carlo (Methods).
Self-attraction between Pol II-bound genomic regions drives the intra-gene spatial organization
We quantified generic structural properties of the model and investigated the relationship between intra-gene condensation (IC scores) and Pol II density (IR scores) with respect to model parameters. In particular, we varied IR scores (via \(\alpha /\gamma\)) while keeping other TASEP parameters constant, achieving uniform Pol II occupancies along the gene (as in Fig.4B, C), and we monitored the corresponding IC scores at steady-state.
For fixed gene length L and elongation rate \(\gamma\), IC is an increasing function of both the Pol II density (Fig. 4D, upper panels) and the strength E of self-attraction (Fig. 4E, mid panel): the gene’s polymeric subchain undergoes a theta-like collapse67 towards a globular state when the Pol II occupancy reaches a critical value (Supplementary Movie 1), with the transition occurring at lower threshold densities for stronger interactions (larger |E|). Similar to standard self-interacting homopolymers68,69, intra-gene contacts strengthen with increasing gene length at a fixed average Pol II level (Fig. 4F, left panel). This reflects the cooperative nature of the theta-collapse70,71.
At a constant average Pol II density, IC is a decreasing function of the Pol II elongation rate (Fig. 4E, upper panel). Indeed, the capacity of Pol II-bound monomers to stably interact depends on the out-of-equilibrium dynamics of the elongating Pol IIs: shorter residence time of Pol II on a monomer (compared to typical polymer diffusion time) results in more transient Pol II-mediated interactions between monomers. Notably, biologically relevant elongation rates (~2 kb/min72) correspond to the slow elongation regime, maximizing gene condensation.
Overall, our model qualitatively recapitulates the global Pol II and gene length trends observed experimentally (Fig.2B). However, the predicted strengths of intra-gene contact enrichment are much stronger than expected (Fig. 4F, left panel). For instance, a 128kbp-long gene shows a ~6-fold increase in IC score with a ~8-fold rise in Pol II density across the theta-collapse (for \(\gamma=\)2 kb/min and E = -3 kT), whereas experimentally the same change in average Pol II occupancy yields only ~35% increase in IC. We verified that reducing |E| does not resolve the problem as the theta-transition remains sharp and cooperative (Supplementary Fig. 19, left panel).
Intra-gene condensation in mESC is consistent with a limited valency of Pol II-Pol II interactions
In our initial model, unrestricted interactions were allowed among the Pol II-occupied monomers in close proximity in the 3D space. However, such molecular interactions are mediated by only a restricted set of accessible residues and thus one monomer may have only a limited valency (number of simultaneous interactions).
Reducing the valency led to a global, sharp drop in intra-gene contact enrichment (Fig. 4D, 4E, lower panels, Supplementary Movie 2). For instance, at high Pol II occupancy, a ~11-fold reduction in IC score was observed for valency 2 compared to unlimited valency. At lower valencies (2 or 3), the levels of contact enrichment aligned with experimental values (Fig. 4F, right panel, Supplementary Fig. 19, right panel) while still preserving the overall dependence on Pol II density and gene length seen with unlimited valency.
However, an intriguing exception emerged: the IC score now displays a non-monotonic dependency with Pol II levels (Fig. 4E, lower panel), as actually observed experimentally at high IR scores (Fig.1C). Within our framework, this behavior arises from a screening effect on long-range interactions. At high Pol II density, the neighboring Pol II-occupied monomers along the chain are likely to engage in interactions, limiting the ability of a monomer to interact with distantly located monomers and consequently reducing large-scale intra-gene condensation.
Nonuniform Pol II profiles lead to intra-gene architectural details
We previously focused on average gene folding properties by considering flat, homogeneous Pol II densities. However, experimental Pol II profiles show distinct peaks at TSS and TTS. By adjusting the TASEP parameters, we generated qualitatively similar peaked profiles of increasing density (Fig. 4G, Methods). Using interaction parameters (E = −3kT, valency = 2) compatible with the experimental IC vs IR relationship, we obtained for these nonuniform profiles very similar correlations between IC and IR scores and gene length (Supplementary Fig. 20). Additionally, we predicted the formation of a stable loop between TSS and TTS in the contact map, as well as promoter-gene stripes within the gene body, at high Pol II occupancy (Fig. 4H, lower panels). Interestingly, off-diagonal pileup analysis of mESC Micro-C datasets around TSS-TTS anchors exhibits similar patterns independent of the cohesin loop-extrusion mechanism (Fig. 4H, upper panels, Supplementary Fig. 21), implying that such architectural details are driven by Pol II occupancy and effective Pol II-Pol II interactions.
Stochastic dynamics of gene folding in response to transcription bursting
Most mammalian genes undergo discontinuous transcription in bursts73,74,75. To address the impact of such bursting kinetics on the gene spatio-temporal dynamics, we modified the TASEP model minimally: the promoter can stochastically switch between an on state, enabling Pol II binding and transcription, and an off-state refractory to Pol II binding, with rates \({k}_{{on}}\) and \({k}_{{off}}\) (Fig. 5A). These rates define the effective Pol II binding rate (\({\alpha }_{{eff}}=\alpha {k}_{{on}}/({k}_{{on}}+{k}_{{off}})\)), the burst frequency (\({=k}_{{on}}{k}_{{off}}/({k}_{{on}}+{k}_{{off}})\), mean number of bursts per time unit) and the train size (\(=\alpha /{k}_{{off}}\), mean number of Pol II binding and elongating during one burst). For simplicity, we assumed \({k}_{{on}}={k}_{{off}}\equiv k\), allowing variation in burst properties from rare, long trains (k = 0.01/min) to frequent, short ones (k = 0.04/min) (Fig. 5B, C), while maintaining an almost constant average Pol II density profile (Fig. 5D).
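The three burst quantities defined above follow directly from the switching rates. The short sketch below evaluates them; the function name `burst_parameters` is hypothetical and the value of `alpha` used in the example is an arbitrary placeholder, not a parameter taken from the study.

```python
def burst_parameters(alpha, k_on, k_off):
    """Effective binding rate, burst frequency and train size of an on/off promoter."""
    total = k_on + k_off
    return {
        # Loading rate weighted by the fraction of time spent in the on state.
        "alpha_eff": alpha * k_on / total,
        # Mean number of bursts per unit time.
        "burst_frequency": k_on * k_off / total,
        # Mean number of Pol II loaded during one on period.
        "train_size": alpha / k_off,
    }


# Symmetric switching, k_on = k_off = k, for the two extreme values quoted in the text (per min).
alpha = 1.0  # placeholder loading rate (per min), assumed for illustration only
rare_long = burst_parameters(alpha, k_on=0.01, k_off=0.01)
frequent_short = burst_parameters(alpha, k_on=0.04, k_off=0.04)
```

With \({k}_{{on}}={k}_{{off}}\equiv k\), the effective binding rate reduces to \(\alpha /2\) whatever the value of k, while the burst frequency is k/2 and the train size is \(\alpha /k\). This is why changing k reshapes the bursts without changing the mean loading rate.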
A Schematic representation of transcriptional bursting, where the TSS alternately switches on and off. B Three different examples of bursty gene activity ranging from long (k = 0.01/min) to short (k = 0.04/min) train size. C Probability distributions of the number of trains elongating on a gene at the same time for the three bursty regimes depicted in (B). D Average Pol II density profiles for the three bursty regimes depicted in (B) and in the absence of burst. E Predicted contact maps with (lower left triangular part) and without burst (upper right triangular part) for long (top) and short (bottom) trains. F Pol II density profiles when TSS is “on” (solid lines) or “off” (dashed lines) for long (blue lines) and short (green lines) trains. G Predicted contact maps for conditions similar to (F). Color scale is the same as panel (E). H (Top to bottom) Time evolution of the radius of gyration (RG) of a gene, TSS state, Pol II density along the gene and the number of trains elongating along the gene for k = 0.01/min. Examples of 3D gene conformation are drawn when the gene is more or less condensed. Bars = 200 nm. I Violin plots of RG in the “off” and “on” states for the three burst regimes in (B). The black dashed lines show the predictions for the homopolymer model (i.e. zero interaction case). J Boxplot of RG as a function of the Pol II density for k = 0.01/min. Boxplots present the median and 25th and 75th percentile, with the whiskers extending to 1.5 times the interquartile range. They were computed over a number of snapshots always higher than 10 (median number ~105). K A typical snapshot of gene 3D conformation (gene in light blue, flanking regions in dark blue) in the presence of two trains. The 1D representation shows the locations of Pol II-bound monomers for each train (orange and red dots). All simulations were done for a 100-kb gene with valency = 2, \(E=-3{k}_{B}T\). Source data are provided as a Source Data file.
By averaging over all configurations, we observed a weak—but significant—decrease in intra-gene condensation in the presence of bursting (Fig. 5E). However, when considering the promoter’s on/off states separately, the impact of bursting became apparent with overall more intra-gene contacts and more pronounced TSS-TTS loops and promoter-gene stripes in the on-state (Fig. 5G). This effect was more pronounced for low burst frequency as the difference in Pol II occupancy between the on/off-states became more prominent (Fig. 5F). Similarly, more elongating trains lead to increased condensation (Supplementary Fig. 22).
These findings suggest a time-correlation between transcriptional bursting and gene folding, where dynamical changes in the gene’s radius of gyration (RG) are preceded by changes in Pol II occupancy along the gene (Fig. 5H). Indeed, we observed an overall negative correlation between instantaneous Pol II density and RG, which was more pronounced for low burst frequencies (Fig. 5I, J). Interestingly, when multiple trains are present simultaneously along the gene, dynamic looping between them could give rise to the formation of ‘factories’ where they colocalize (Fig. 5K, Supplementary Movie 3).
Transcription slows down gene mobility
Live-imaging experiments have indicated that chromatin motion is enhanced after Pol II inhibition or reduced after gene activation41,43,44, suggesting a connection between transcription and reduced gene mobility. To assess whether our biophysical model aligns with these observations, we computed for each monomer the mean-squared displacement (MSD), which measures the typical space explored by a locus over a time-lag Δt. We observed that \({MSD} \sim D\Delta {t}^{\delta }\), where \(D\) and \(\delta\) are the diffusion constant and exponent, respectively (Fig. 6A). The exponent \(\delta \sim 0.5\) is independent of Pol II occupancy (Fig. 6B) and its value is consistent with live imaging experiments in mESC15,16 and standard polymer dynamics76,77. Conversely, \(D\) depends on Pol II density and gene length (Fig. 6C), following a trend exactly opposite to that of intra-gene condensation (Fig. 4F, right): the more condensed the gene, the less mobile it is77. For example, a 40-70% increase in intra-gene contacts corresponds to a 10-15% decrease in \(D\), consistent with experiments (Fig. 6C, D).
A Mean-squared displacement \({MSD} \sim D\Delta {t}^{\delta }\) vs time-lag Δt for different Pol II densities for a 256kbp-long gene. B Diffusion exponent \(\delta\) as a function of gene size and Pol II density. C As in (B) but for the diffusion constant D normalized by its value D0 in the absence of transcription. The color bar is presented in a log2 scale, while the values are given in a linear scale. D The ratio of MSD with (MSD) and without (MSD0) Pol II at t = 9.3 s as a function of Pol II density for a 256kbp-long gene. Color scale as in (A) and data are presented as mean values ± SD and were computed over 20 different trajectories. All simulations were done with valency = 2, \(E=-3{k}_{B}T\). Source data are provided as a Source Data file.
Transcription-associated long-range contacts correlate with Pol II occupancy
Our analysis of intra-gene folding and dynamics suggests that similar mechanisms may explain the role of Pol II occupancy in distal inter-gene interactions. On the Micro-C map of mESC, we observed selective contact enrichments between distal highly active genes (Fig. 7, Supplementary Fig. 23). For instance, the average contact frequency between the 811 kb-distant large active genes Ahctf1 and Parp1 is 3.2-fold higher than expected at similar genomic distance (Fig. 7A). Both genes belong to the same A compartment, indicating that strongly transcribed genes may further colocalize within A. To test this hypothesis, we clustered all the 32-64 kb-long genes into three categories based on their IR score (Low, Mid and High) and performed PMGA (Methods) of the inter-gene contacts for pairs of genes distant by more than 128 kb but less than 2 Mb (Fig. 7B, Supplementary Fig. 24). When both genes are transcribed (Mid and High clusters in Fig. 7B), a strong promoter-promoter interaction is detected, as already observed in several studies13,27,29,78. In addition, PMGA highlights that highly active genes (High-High) also exhibit significant contact enrichment between their gene bodies compared to the surrounding background in a transcription-dependent and loop extrusion-independent manner (Supplementary Fig. 24). Contact enrichment between inactive genes (Low-Low) is similar to background and can be attributed to their location in the more compact B-compartment30.
A Micro-C contact map of a ~1 Mb region of mESC chromosome 1, with corresponding gene annotation and ChIP-seq profiles below. Inset shows a zoom between the long, highly-active genes of Ahctf1 and Parp1 (respectively, 58.7 kb and 32.3 kb-long and an expression of 22.4 FPKM and 151.4 FPKM). B Inter-gene pileup meta-gene analysis of the contact enrichment between two distant genes as a function of their intra-gene Pol II enrichment. C Model predictions for contacts between 60-kb-long genes for three different Pol II densities. D Examples of simulated 3D configurations illustrating the inter-gene interactions at various Pol II densities (gene regions in yellow and red, surrounding genomic regions in blue). All simulations were done with valency=2, \(E=-3{k}_{B}T\). Source data are provided as a Source Data file.
To rationalize these observations with our biophysical model, we conducted simulations for two 60 kbp-long genes distant by 460 kb, exhibiting similar steady-state Pol II profiles (Fig. 7C). We observed that Pol II-mediated interactions not only affect intra-gene contacts but also drive the formation of inter-gene contacts between TSS and TTS and between gene bodies, whose strengths increase with Pol II density. We obtained similar results for longer genes and shorter inter-gene distances (Supplementary Fig. 25). Interestingly, interacting genes tend to colocalize and segregate from the rest of the simulated polymeric chain79,80 (Fig. 7D, Supplementary Movie 4).
Discussion
In this study, we analyzed publicly available Micro-C data of mESC13,29 to investigate the relationship between transcriptional activity and chromosome organization. Our findings align notably with previous studies29,30,31. Specifically, we showed that, on average, at the single-gene level (2 kbp-1 Mbp), intra-gene contact enrichment, structural patterns (gene-loops, promoter-stripes, Fig. 3E) and the degree of insulation from the surrounding genomic regions correlate positively with Pol II occupancy along the gene (Figs. 1, 2). Moreover, our study revealed that these observed features also exhibit a positive correlation with gene length (Fig. 2), suggesting a cooperative mechanism for gene folding. Nevertheless, we noted a considerable degree of heterogeneity, implying that specific genes may deviate from the average behavior. These results stand in contrast with the very local structure of the chromatin fiber (<600 bp) that is increasingly open as transcription rate increases25.
For highly expressed genes, we observed reduced contacts within the gene body (Fig. 1C), which aligns, albeit to a lesser extent, with the extended gene conformations observed for very long, highly expressed tissue-specific genes in mice37,39.
Consistent with prior research47, our results underscore the role of the loop extrusion process, recognized for driving the formation of loops and TADs11 and reported to interfere with transcriptional elongation38,46,47, in shaping structural features outside the gene domain (Fig. 3). However, in good agreement with recent high-precision Capture Micro-C data27, we demonstrated that the intragenic structure-function relationship between gene condensation and gene transcription is not directly associated with loop extrusion79 (Fig. 3). At the inter-gene level, we observed long-range contacts between active genes, not only between gene promoters as already characterized81,82, but also between gene bodies (Fig. 7), here also closely tied to Pol II profiles and independent of loop extrusion (Supplementary Fig. 24).
Altogether, our findings suggest that active genes are central units of the 3D genome25 and form a subcompartment27,79,80, driven by gene activity, Pol II binding and elongation. This observation likely holds true for other cell types, as we recently showed that intra-gene folding during mouse thymocyte maturation is, on average, also associated with changes in transcription levels31. The mechanisms described here are also likely to be broadly conserved in animals. Indeed, we analyzed the correlation between IC and IR (Spearman’s ρ = 0.48) in whole-embryo Drosophila data at embryonic nuclear cycle 14 (Supplementary Fig. 26)83. Drosophila is interesting as its chromosome organization is believed to be mainly driven by the spatial segregation of the epigenome instead of cohesin loop-extrusion processes4. We found a similar nonmonotonic dependence of IC on IR as well as Pol II-related intra-gene interaction patterns. One exception is the effect of gene length, which is less clear. Interestingly, in the bacterium Escherichia coli, higher transcription is also associated with more intra-gene contacts84; in yeast and dinoflagellates, TAD-like structures are associated with (blocks of) active genes25,85.
To better characterize the underlying mechanisms behind the correlations between Pol II activity and the transcriptionally active subcompartment, we introduced a simple biophysical framework that accounts for the 1D dynamics of Pol II along genes coupled to the 3D polymer organization of chromosomes (Fig. 4). Previous biophysical models have already addressed some aspects of Pol II-mediated phase separation via attractive35,48 or active forces86, focusing on large-scale inter-gene condensation, but without investigating intra-gene organization or explicitly accounting for transcription dynamics. By assuming self-attractive, short-range interactions between genomic loci bound to Pol II35,48, our approach is able to recapitulate qualitatively the overall increase in intra-gene contacts associated with higher Pol II density inside the gene body and with longer genes, consistent with a standard cooperative coil-globule transition observed for finite-size chains71,87,88,89. Our model suggests that limiting the number of possible interactions per Pol II-bound region to low values (e.g., 2 or 3) allows us to align our predictions quantitatively with experiments, leading to percolated but less condensed 3D domains90,91. Interestingly, this constraint also explains the weak decompaction observed for highly transcribed genes, as interactions between distant positions along the genes (mediating the large-scale condensation of the gene) are screened by (more frequent) interactions between nearest-neighbor Pol II-bound sites. This screening mechanism may also contribute to the formation of the extended transcription loops observed in long highly transcribed genes37, along with the potential stiffening of the chromatin fiber caused by the high density of nascent ribonucleoprotein complexes along the genes, as originally proposed.
Furthermore, our model predicts a strong coupling between gene structure and dynamics: transcription bursts may regulate the stochasticity of intra- and inter-gene contacts at the single-cell scale (Fig. 5)92; such dynamical contacts may in turn locally reduce gene mobility (Fig. 6) and lead to long-range coherent motion between active regions56,93, in good agreement with live-microscopy observations41,43,45.
What are the molecular determinants of the putative attractive interaction between Pol II-bound loci hypothesized in our model? It is likely that several sources may directly or effectively participate in its regulation. The C-terminal domain (CTD) of Pol II can form liquid condensates in vitro under physiological conditions, which become unstable upon CTD phosphorylation59. This mechanism may thus promote direct attractions in vivo between non-elongating Pol II, bound at promoters for example35. CTDs may also interact with co-factors that can themselves phase-separate both at the transcriptional initiation34,94,95 and elongation60,96 stages, like FUS, BRD4, Mediator, P-TEFb or splicing factors. For example, the observed correlation between intra-gene condensation and the number of exons19 at similar Pol II occupancy (Supplementary Fig. 4) suggests a role for splicing-related condensates96. In addition, transcription-generated supercoiling84,97 or specific histone marks deposited along the gene bodies (that may regulate putative nucleosome-nucleosome interactions98) may contribute to transcription-dependent effective interactions.
The limited valency of interactions in our model aligns with a restricted number of simultaneously accessible residues involved in the aforementioned sources of Pol II-Pol II attraction. It is also possible that the screening effect observed at high transcription rates could be explained by the strength of interaction depending on local Pol II concentration and/or the length of nascent transcripts (Supplementary Notes), as RNA size and concentration can impact the stability of transcription-related condensates63.
In conclusion, our results demonstrate the significant impact of Pol II binding and elongation on the spatiotemporal organization of the active genome through an out-of-equilibrium phase-separation process coupling the time-dependent dynamics of transcription to the formation of gene micro-domains and of a transcriptionally active subcompartment27,61,79,99. This extends the concept of transcription factories100, typically associated with inter-gene contacts, to the internal organization of long genes having multiple trains of transcribing Pol IIs. Consistent with our findings, recent works proposed that interactions between Pol IIs may also facilitate promoter-enhancer communications23,101. However, our approach provides only an “average” picture of the role of transcription in 3D chromosome organization and does not account for the various epigenetic, genomic and spatial factors that may interplay with Pol II-mediated phase separation47 around specific genes, potentially explaining the variability of behaviors observed after transcription (de)activation31.
Future investigations should aim to further elucidate the biological function(s) of such transcription-dependent micro-compartmentalization. Indeed, colocalization of active genomic regions may enhance the recycling of Pol II or transcription co-factors102,103 by increasing their local concentrations. Investigating precisely such a “structure-function” coupling between the binding and assembly of transcription-associated components and condensates and the spatial folding of the genome remains an intriguing challenge and would require further developments both at the experimental and modeling levels.
Methods
Experimental data analysis
Datasets
The processed Micro-C data for mESCs (wild-type and mutants) and Drosophila in multi-resolution format mcool were downloaded from National Center for Biotechnology Information’s Gene Expression Omnibus (GEO) through accession no: GSE130275, GSE178982 and ArrayExpress accession E-MTAB-9306.
The ChIP-seq tracks, including Pol II, Pol II Ser5P and 2P, CTCF, RAD21, H3K27me3, H3K9me3 and H3K36me3, for wild type and different mutants in BigWig format were downloaded from GEO through accession no: GSE130275, GSE178982, GSE90893, GSE90994, GSE16013, GSE85191, GSE195830.
Pileup meta-gene analysis (PMGA)
Contact maps
We used the cooltools package (https://github.com/open2c/cooltools)104 to compute the obs/exp maps from the balanced contact maps, at various resolutions ranging from 100 bp to 50 kb.
To perform intra-gene PMGA, for each gene \(i\) with size \({l}_{i}\) (\({l}_{i} > 20\times\) resolution), we considered a domain of size \(3{l}_{i}\) around it, including the gene body and the two upstream and downstream flanking regions, each of size \({l}_{i}\). To ensure consistency and facilitate pileup analysis, we rescaled each corresponding \((3{l}_{i})\times (3{l}_{i})\) obs/exp matrix to a \(60\times 60\) pseudo-matrix by averaging the original matrix elements. An example of this rescaling process can be seen in Supplementary Fig. 27. We then aligned all the rescaled matrices in the transcription forward direction to maintain uniformity. Finally, we aggregated all the data of genes belonging to a given cluster (clustered by gene length, IR score, etc.).
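The rescaling, alignment and aggregation steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code of the published notebooks: the function names are hypothetical, masked pixels are assumed to be NaN and are ignored when averaging, and the cluster aggregate is assumed to be a NaN-aware mean.

```python
import numpy as np

def rescale_to_pseudo_matrix(mat, n_out=60):
    """Average-pool a square obs/exp matrix into an (n_out, n_out) pseudo-matrix."""
    mat = np.asarray(mat, dtype=float)
    n_in = mat.shape[0]
    if n_in < n_out:
        raise ValueError("matrix must have at least n_out bins per side")
    # Integer bin edges: output bin a covers input indices edges[a] .. edges[a+1]-1.
    edges = (np.arange(n_out + 1) * n_in) // n_out
    out = np.full((n_out, n_out), np.nan)
    for a in range(n_out):
        for b in range(n_out):
            block = mat[edges[a]:edges[a + 1], edges[b]:edges[b + 1]]
            if not np.all(np.isnan(block)):
                out[a, b] = np.nanmean(block)  # assumption: ignore masked (NaN) pixels
    return out

def orient_forward(pseudo, strand):
    """Flip both axes for minus-strand genes so that transcription runs left to right."""
    return pseudo[::-1, ::-1] if strand == "-" else pseudo

def pileup(pseudo_matrices):
    """Aggregate the pseudo-matrices of one cluster (assumed aggregator: NaN-aware mean)."""
    return np.nanmean(np.stack(pseudo_matrices), axis=0)
```

Because genes are required to span more than 20 bins, the \(3{l}_{i}\) domain always contains more than 60 bins, so every output pixel averages at least one input pixel.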
For inter-gene PMGA, for each pair of genes, we considered the off-diagonal region of the obs/exp map of size \((3{l}_{1})\times (3{l}_{2})\) and centered at (\({m}_{1},{m}_{2}\)), with \({l}_{1}\) and \({m}_{1}\) the size and genomic position of the middle of gene 1 (same for gene 2). Then, similarly, we rescaled this region to a \(30\times 30\) pseudo-matrix, aligned the genes in parallel forward direction and aggregated the pseudo-matrices belonging to the same cluster.
ChIP-seq tracks
Using pyBigWig (https://github.com/deeptools/pyBigWig), for each gene, we discretized the \(3l\) domain (see above) into 60 bins and computed the coverage for each bin. Then, we aligned the domains in the transcription forward direction and aggregated over all genes in the same cluster.
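The binning and strand alignment of a coverage track can be illustrated with a short NumPy sketch. It operates on a per-base coverage array that is assumed to have been read beforehand; the function name `bin_coverage` and the use of a NaN-aware mean per bin are assumptions made for illustration.

```python
import numpy as np

def bin_coverage(values, n_bins=60, strand="+"):
    """Average a per-base coverage array over n_bins nearly equal bins.

    For minus-strand genes the profile is reversed so that all genes read
    in the transcription forward direction.
    """
    values = np.asarray(values, dtype=float)
    if values.size < n_bins:
        raise ValueError("domain must contain at least n_bins positions")
    chunks = np.array_split(values, n_bins)           # nearly equal-sized bins
    binned = np.array([np.nanmean(c) for c in chunks])
    return binned[::-1] if strand == "-" else binned
```

With pyBigWig, comparable binned means can be requested directly through the `nBins` argument of `stats`, for example `bw.stats(chrom, start, end, type="mean", nBins=60)`; whether the original analysis used that option is not stated in the text.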
ChIP-seq peak calling and calculation of peak contacts
For each ChIP-seq track, we transformed BigWig to bedGraph, used MACS software105 version 2 to call the peaks in the “no model” mode and merged the results from different replicates. Then, we sorted them by fold-change score (compared to input) and selected the most significant peaks (top 1/3). Finally, for every pair of peaks with a genomic distance between 160 and 320 kb, we used the off-diagonal pileup module of cooltools to compute the average peak contacts.
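The selection of the strongest peaks and the enumeration of peak pairs in the 160-320 kb distance window can be sketched as below. This is only an illustration of the filtering logic, assuming peaks are given as `(chrom, position, fold_change)` tuples and that the distance bounds are inclusive; the helper names are hypothetical and the actual contact averaging was done with cooltools.

```python
def top_third(peaks):
    """Keep the top 1/3 of peaks ranked by fold-change (third tuple element)."""
    ranked = sorted(peaks, key=lambda p: p[2], reverse=True)
    return ranked[: max(1, len(ranked) // 3)]

def peak_pairs(peaks, dmin=160_000, dmax=320_000):
    """List all same-chromosome peak pairs separated by dmin..dmax bp (inclusive)."""
    by_chrom = {}
    for chrom, pos, _ in peaks:
        by_chrom.setdefault(chrom, []).append(pos)
    pairs = []
    for chrom, positions in by_chrom.items():
        positions.sort()
        for a, p in enumerate(positions):
            for q in positions[a + 1:]:
                if q - p > dmax:
                    break                      # positions are sorted: no further match
                if q - p >= dmin:
                    pairs.append((chrom, p, q))
    return pairs
```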
Insulation score and compartments analysis
To compute the insulation score, we analyzed contact maps at 800-bp, 1600-bp and 3200-bp resolutions with the dedicated module of cooltools, using sliding windows 3, 5, 10 and 25 times larger than the given resolution, e.g. 2.4, 4, 8 and 20-kb windows for 800-bp resolution. For the compartment analysis, we used the eigs_cis module of cooltools to compute the first eigenvector of the Pearson correlation matrix of the contact map, taking as inputs the 6.4-kbp resolution Micro-C maps and the GC coverage computed from the mm10 reference genome.
Intra-gene contact (IC) and Pol II (IR) scores
For each gene, the IC score is defined as the average value of the obs/exp map (\({c}_{{ij}}\)) inside the gene body: \({IC}=\frac{{\sum}_{i < j}{c}_{{ij}}}{N(N-1)/2}\), where \(i,{j}\) both represent bins within the same gene and \(N\) is the total number of bins along the gene, given by the gene size divided by the contact map resolution. Similarly, the IR score of a gene is defined as the average value of the RNA Pol II ChIP-seq signal (\({p}_{i}\)) within the gene body: \({IR}=\frac{{\sum}_{i}{p}_{i}}{N}\).
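As a worked illustration of these two definitions, the following NumPy sketch computes IC from the strictly upper triangle of the gene's obs/exp sub-matrix and IR as a plain mean. The function names and the bin-index interface are assumptions; handling of masked (NaN) pixels is not specified in the text and is not addressed here.

```python
import numpy as np

def ic_score(obs_exp, start_bin, end_bin):
    """Mean obs/exp over all bin pairs i < j inside the gene body [start_bin, end_bin)."""
    sub = np.asarray(obs_exp, dtype=float)[start_bin:end_bin, start_bin:end_bin]
    n = sub.shape[0]
    if n < 2:
        raise ValueError("gene must span at least two bins")
    upper = sub[np.triu_indices(n, k=1)]       # the N(N-1)/2 pairs with i < j
    return upper.sum() / (n * (n - 1) / 2)

def ir_score(pol2_signal):
    """Mean Pol II ChIP-seq signal over the N bins of the gene body."""
    p = np.asarray(pol2_signal, dtype=float)
    return p.sum() / p.size
```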
Biophysical model
We previously introduced a self-avoiding semi-flexible polymer model for chromosomes55,56. In this study, we employed a coarse-graining approach to represent a 20-Mbp-long chromatin fiber using 10,000 monomers. Each monomer corresponds to approximately 2-kbp of the genome and has a size of 50 nm (Fig. 4A). Within this chain, we inserted a 100-kbp-long gene (composed of 50 monomers), where TSS and TTS are located at the first and last monomers of the gene, respectively.
TASEP model
Each monomer \(i\) within a gene (of total size \(n\)) is characterized by a binary state \({s}_{i}\in \{0,1\}\), depending on whether a Pol II complex is bound to it (\({s}_{i}=1\)) or not (\({s}_{i}=0\)). We simulated the stochastic dynamics of Pol II binding, unbinding and elongation using a simple kinetic Monte Carlo framework: each Monte Carlo step (MCS) consisted of (i) one attempt to bind a Pol II with rate \(\alpha\) at the TSS if unoccupied (\({s}_{1}=0\to 1\)), (ii) one attempt to unbind Pol II with rate \(\beta\) at the TTS if occupied (\({s}_{n}=1\to 0\)), and (iii) \(n-1\) elongation attempts, each consisting of randomly picking one monomer \(i\) in \([1:n-1]\) and, if it is occupied, moving the Pol II with rate \(\gamma\) to the adjacent downstream monomer \(i+1\), provided that monomer is not already occupied (\([{s}_{i}=1,{s}_{i+1}=0]\to [0,1]\)).
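The three move types of one MCS can be sketched in plain Python as follows. This is a minimal illustration rather than the published simulation code: rates are interpreted as per-attempt acceptance probabilities (so they are assumed to be at most 1), indices are 0-based, and the names `tasep_step` and `mean_profile` are hypothetical.

```python
import random

def tasep_step(s, alpha, beta, gamma, rng):
    """Apply one Monte Carlo step in place to the occupancy list s (s[0]=TSS, s[-1]=TTS)."""
    n = len(s)
    # (i) one initiation attempt at the TSS
    if s[0] == 0 and rng.random() < alpha:
        s[0] = 1
    # (ii) one termination attempt at the TTS
    if s[-1] == 1 and rng.random() < beta:
        s[-1] = 0
    # (iii) n-1 elongation attempts on randomly chosen monomers
    for _ in range(n - 1):
        i = rng.randrange(n - 1)               # 0-based index in [0, n-2]
        if s[i] == 1 and s[i + 1] == 0 and rng.random() < gamma:
            s[i], s[i + 1] = 0, 1              # hop to the next monomer (exclusion rule)

def mean_profile(n=50, alpha=0.5, beta=0.5, gamma=1.0,
                 burn_in=2000, steps=20000, seed=0):
    """Time-averaged Pol II occupancy per monomer after a burn-in period."""
    rng = random.Random(seed)
    s = [0] * n
    acc = [0] * n
    for t in range(burn_in + steps):
        tasep_step(s, alpha, beta, gamma, rng)
        if t >= burn_in:
            for i in range(n):
                acc[i] += s[i]
    return [a / steps for a in acc]
```

For a 100-kbp gene represented by 50 monomers, `mean_profile(n=50, ...)` returns an estimate of the average Pol II density profile along the gene for the chosen rates.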
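In a simple case where the initiation rate is equal to the elongation rate (i.e. \({\gamma }_{0}=\gamma\)), the time evolution of the ensemble-averaged occupancy \(\langle {s}_{i}\rangle\) of monomer \(i\) follows
\[\frac{d\langle {s}_{i}\rangle }{dt}=\gamma \langle {s}_{i-1}(1-{s}_{i})\rangle -\gamma \langle {s}_{i}(1-{s}_{i+1})\rangle \qquad (1)\]
Assuming that the state of each monomer is independent from the states of its neighbors and taking the continuum limit \(\langle {s}_{i}\rangle (t)\equiv \rho (x,t)\) with \(x=i/n\), expanded to second order in \(1/n\), leads to the Fokker-Planck-like equation
\[{\partial }_{t}\rho =\frac{\gamma }{2{n}^{2}}{\partial }_{x}^{2}\rho -\frac{\gamma }{n}(1-2\rho ){\partial }_{x}\rho \qquad (2)\]
At steady state (\({\partial }_{t}\rho (x,t)=0\)), \(\rho \left(x\right)\) is given by solving the differential equation:
\[\frac{1}{2n}\frac{{d}^{2}\rho }{d{x}^{2}}-(1-2\rho )\frac{d\rho }{dx}=0 \qquad (3)\]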
with \(\rho (0)=\alpha /\gamma\) and \(\rho (1)=1-\alpha /\gamma\). For the boundary condition \(\alpha /\gamma=1-\alpha /\gamma\), the solution is uniform \(\rho=\alpha /\gamma\) along the gene. Equation (3) for other conditions can be solved numerically and compared to the results of Monte-Carlo simulations (Supplementary Fig. 18).
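One simple way to obtain a mean-field steady-state profile numerically is to relax the discrete occupancy equations, under the same independence assumption, until they stop changing. The sketch below does this with explicit Euler steps; it is an illustrative scheme and not necessarily the solver used for Supplementary Fig. 18. The function name, the time step and the use of a separate termination rate \(\beta\) at the last monomer are assumptions.

```python
import numpy as np

def mean_field_steady_state(n=50, alpha=0.2, beta=0.8, gamma=1.0,
                            dt=0.1, n_steps=20000):
    """Relax the mean-field TASEP occupancies rho_i towards their steady state."""
    rho = np.zeros(n)
    for _ in range(n_steps):
        # Mean-field current across each bond i -> i+1 (neighbors assumed independent)
        j_bulk = gamma * rho[:-1] * (1.0 - rho[1:])
        j_in = alpha * (1.0 - rho[0])          # initiation at the first monomer
        j_out = beta * rho[-1]                 # termination at the last monomer
        inflow = np.concatenate(([j_in], j_bulk))
        outflow = np.concatenate((j_bulk, [j_out]))
        rho = rho + dt * (inflow - outflow)    # explicit Euler update
    return rho
```

In this discrete setting, a uniform profile \(\rho =\alpha /\gamma\) is stationary when \(\alpha /\gamma=1-\beta /\gamma\), which the default parameters satisfy.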
Polymer model
The polymer chain undergoes local movements on an FCC lattice with periodic boundary conditions under the Metropolis criterion, as described in our previous works55. The total Hamiltonian of a given configuration can be expressed as follows:
The first term accounts for the stiffness of the chain with \(\kappa\) the bending rigidity and \({\theta }_{i}\) the local bending angle at monomer \(i\). The second term represents the Pol II-Pol II interaction, where \(E\) denotes the attractive interaction strength, and \({f}_{{ij}}\) equals 1 if monomers \(i\) and \(j\) occupy nearest neighboring sites on the lattice.
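For orientation, a Hamiltonian with the two terms just described can be written in the general form below. This is a sketch inferred from the description of the terms only: the cosine form of the bending energy, the absence of additional numerical prefactors and the summation ranges are assumptions.

```latex
H = \kappa \sum_{i} \left(1-\cos\theta_{i}\right) + E \sum_{i<j} f_{ij}\, s_{i}\, s_{j}
```

With an attractive strength \(E<0\) (e.g. \(E=-3{k}_{B}T\)), each nearest-neighbor lattice contact between two Pol II-bound monomers (\({s}_{i}={s}_{j}=1\)) lowers the energy by \(|E|\), while unbound monomers do not contribute to the second term.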
For simulations with a limited valency number, we defined an interaction list for each monomer with \({s}_{i}=1\). This list stores the genomic positions of the other monomers it interacts with and is constrained not to exceed the given valency number. It is updated after any polymer or TASEP moves.
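The bookkeeping of a valency-limited interaction list can be sketched as follows. This is a hypothetical illustration: the text does not specify in which order candidate partners are considered, so the greedy scan used here, the `are_neighbors` callback standing in for the lattice nearest-neighbor test, and the function names are all assumptions.

```python
def update_interactions(bonds, s, are_neighbors, valency):
    """Refresh the interaction lists after a polymer or TASEP move.

    bonds: dict mapping each monomer index to the set of monomers it interacts with.
    s: Pol II occupancy (0/1) per monomer.
    are_neighbors(i, j): True if monomers i and j sit on nearest-neighbor lattice sites.
    """
    # 1) Drop bonds that are no longer valid (Pol II left, or monomers moved apart).
    for i in list(bonds):
        for j in list(bonds[i]):
            if not (s[i] and s[j] and are_neighbors(i, j)):
                bonds[i].discard(j)
                bonds[j].discard(i)
    # 2) Form new bonds while both partners still have free valency (greedy order).
    bound = [i for i, si in enumerate(s) if si]
    for a, i in enumerate(bound):
        for j in bound[a + 1:]:
            if (j not in bonds[i] and are_neighbors(i, j)
                    and len(bonds[i]) < valency and len(bonds[j]) < valency):
                bonds[i].add(j)
                bonds[j].add(i)
    return bonds

def interaction_energy(bonds, E):
    """Total Pol II-Pol II energy: E times the number of distinct bonds."""
    return E * (sum(len(partners) for partners in bonds.values()) / 2)
```

With valency = 2, a Pol II-bound monomer surrounded by many bound neighbors still contributes at most two bonds to the energy.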
Note that, due to the relatively high stall force of Pol II (~25-30 pN106), we assumed that Pol II-Pol II interactions do not impede Pol II elongation.
Numerical simulations
In our study, we set \(\kappa \sim 1.2\,{k}_{B}T\) and a lattice volume density of 50% to account for a chromatin fiber with a Kuhn length of 100 nm69,107 and a typical base-pair density found in mammalian and fly genomes (~0.01 bp/nm³)108. Simulations were initialized from unknotted configurations55 and performed with a kinetic Monte Carlo algorithm. In addition to the TASEP moves (see above), each MCS contains N local polymer trial moves. For each parameter set, 20 independent trajectories were simulated, discarding the first 10⁶ MCS of each trajectory to allow the system to reach a steady state. Snapshots of the system were then saved every 10³ MCS over 10⁷ MCS and analyzed (see below). Since the characteristic spatial and time scales of the phenomenon under study (~100 nm, ~min) are well beyond the discretization scales (50 nm, 3 ms) imposed by the lattice and the kinetic Monte Carlo algorithm, the obtained results are not expected to depend qualitatively on the underlying modeling and simulation frameworks109.
Data analysis
The radius of gyration (RG) provides a measure of the typical spatial extent of a gene, reflecting its overall span in 3D space. In a given configuration, the position of monomer \(i\) can be defined as \({\vec{r}}_{i}\equiv \left({x}_{i},{y}_{i},{z}_{i}\right)\). The RG is then calculated as follows:
\[{R}_{G}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({\vec{r}}_{i}-{\vec{r}}_{m}\right)}^{2}}\]
where the sum runs over the \(n\) monomers of the gene and \({\vec{r}}_{m}\equiv \left({x}_{m},{y}_{m},{z}_{m}\right)\) is the mean position of all these monomers.
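The radius of gyration, i.e. the root-mean-square distance of the monomers from their mean position, can be computed from a snapshot as in this short NumPy sketch. The function name is hypothetical, and the coordinates are assumed to be unwrapped, meaning that periodic images have already been resolved so the gene is not split across the simulation box.

```python
import numpy as np

def radius_of_gyration(positions):
    """Root-mean-square distance of monomers from their mean position."""
    r = np.asarray(positions, dtype=float)     # shape (n_monomers, 3)
    r_m = r.mean(axis=0)                       # mean position of all monomers
    return np.sqrt(((r - r_m) ** 2).sum(axis=1).mean())
```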
To extract the diffusion coefficient (\({D}_{i}\)) and exponent (\({\alpha }_{i}\), denoted \(\delta\) in the Results) for monomer \(i\), we first computed the time-averaged and ensemble-averaged mean-squared displacement, \(\langle {MSD}_{i}\rangle\), as a function of the time-lag, \(\Delta t\). Subsequently, we performed a power-law fit of the form \({D}_{i}{\Delta t}^{{\alpha }_{i}}\) to the resulting curve using the NumPy function numpy.polyfit(\(\log \Delta t\),\(\log \langle {MSD}_{i}\rangle\),1).
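The fitting step can be illustrated as follows: a linear fit in log-log space returns the exponent as the slope and the logarithm of the diffusion coefficient as the intercept. The helper names are hypothetical, and the time-averaged MSD shown here is for a single trajectory; the ensemble average over trajectories would be taken before fitting.

```python
import numpy as np

def time_averaged_msd(traj, lags):
    """Time-averaged MSD of one trajectory (shape (T, 3)) for integer lags >= 1."""
    traj = np.asarray(traj, dtype=float)
    return np.array([
        np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
        for lag in lags
    ])

def fit_power_law(dt, msd):
    """Fit msd = D * dt**exponent by linear regression in log-log space."""
    slope, intercept = np.polyfit(np.log(dt), np.log(msd), 1)
    return np.exp(intercept), slope            # (D, exponent)
```

For example, `fit_power_law(lags, time_averaged_msd(traj, lags))` returns the pair (D, exponent) for one monomer trajectory, with D expressed in the units of the input positions and lags.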
Furthermore, to establish a correspondence between simulation (MCS) and real (seconds) times, we compared our predictions with the typical MSD observed in yeast (~\(0.01\,(\mu {\rm{m}}^{2}/{\rm{s}}^{0.5})\,{\Delta t}^{0.5}\), with \(\Delta t\) in seconds)76, leading to 1 MCS ~ \(3\,{\rm{ms}}\).
Statistics and reproducibility
In Fig. 1B, C, genes smaller than 1 kb are excluded due to resolution limitations. In Fig. 2, clusters with fewer than 25 representative genes are excluded due to inadequate statistical significance. No additional statistical methods were applied to the analyses.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Processed data (intra-gene contact, RNA-seq and ChIP-seq enrichments, compartments and exon numbers for each gene > 1 kbp) and source data of the figures are publicly available from the Zenodo repository (Hossein Salari, 2024) at https://zenodo.org/records/10998192 (ref. 110). We also use publicly available data from Gene Expression Omnibus (GEO) through accession nos. GSE130275, GSE178982, GSE90893, GSE90994, GSE16013, GSE85191, GSE195830 and ArrayExpress accession E-MTAB-9306.
Code availability
Python notebooks for PMGA analysis and simulation codes are available on GitHub (https://github.com/physical-biology-of-chromatin/Transcription).
References
Eagen, K. P. Principles of chromosome architecture revealed by Hi-C. Trends Biochem. Sci. 43, 469–478 (2018).
Jerkovic, I. & Cavalli, G. Understanding 3D genome organization by multidisciplinary methods. Nat. Rev. Mol. Cell Biol. 22, 511–528 (2021).
Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
Rowley, M. J. et al. Evolutionarily conserved principles predict 3D chromatin organization. Mol. Cell 67, 837–852 (2017).
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
Rao, S. S. P. et al. Cohesin loss eliminates all loop domains. Cell 171, 305–320 (2017).
Mirny, L. A., Imakaev, M. & Abdennur, N. Two major mechanisms of chromosome organization. Curr. Opin. Cell Biol. 58, 142–152 (2019).
Wang, L. et al. Histone modifications regulate chromatin compartmentalization by contributing to a phase separation mechanism. Mol. Cell 76, 646–659 (2019).
Zenk, F. et al. HP1 drives de novo 3D genome reorganization in early Drosophila embryos. Nature 593, 289–293 (2021).
Fudenberg, G. et al. Formation of chromosomal domains by loop extrusion. Cell Rep. 15, 2038–2049 (2016).
Guo, Y. et al. CRISPR inversion of CTCF sites alters genome topology and enhancer/promoter function. Cell 162, 900–910 (2015).
Hsieh, T.-H. S. et al. Enhancer–promoter interactions and transcription are largely maintained upon acute loss of CTCF, cohesin, WAPL or YY1. Nat. Genet. 54, 1919–1932 (2022).
Schwarzer, W. et al. Two independent modes of chromatin organization revealed by cohesin removal. Nature 551, 51–56 (2017).
Gabriele, M. et al. Dynamics of CTCF- and cohesin-mediated chromatin looping revealed by live-cell imaging. Science 376, 496–501 (2022).
Mach, P. et al. Cohesin and CTCF control the dynamics of chromosome folding. Nat. Genet. 54, 1907–1918 (2022).
Wutz, G. et al. Topologically associating domains and chromatin loops depend on cohesin and are regulated by CTCF, WAPL, and PDS5 proteins. EMBO J. 36, 3573–3599 (2017).
Lupiáñez, D. G. et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015).
Bonev, B. et al. Multiscale 3D genome rewiring during mouse neural development. Cell 171, 557–572 (2017).
Zuin, J. et al. Nonlinear control of transcription through enhancer-promoter interactions. Nature 604, 571–577 (2022).
Chen, H. et al. Dynamic interplay between enhancer-promoter topology and gene activity. Nat. Genet. 50, 1296–1303 (2018).
Barshad, G. et al. RNA polymerase II dynamics shape enhancer-promoter interactions. Nat. Genet. 55, 1370–1380 (2023).
Zhang, S., Übelmesser, N., Barbieri, M. & Papantonis, A. Enhancer-promoter contact formation requires RNAPII and antagonizes loop extrusion. Nat. Genet. 55, 832–840 (2023).
Hilbert, L. et al. Transcription organizes euchromatin via microphase separation. Nat. Commun. 12, 1360 (2021).
Hsieh, T.-H. S. et al. Mapping nucleosome resolution chromosome folding in yeast by micro-C. Cell 162, 108–119 (2015).
Aljahani, A. et al. Analysis of sub-kilobase chromatin topology reveals nano-scale regulatory interactions with variable dependence on cohesin and CTCF. Nat. Commun. 13, 2139 (2022).
Goel, V. Y., Huseyin, M. K. & Hansen, A. S. Region Capture Micro-C reveals coalescence of enhancers and promoters into nested microcompartments. Nat. Genet. 55, 1048–1059 (2023).
Balasubramanian, D. et al. Enhancer-promoter interactions can form independently of genomic distance and be functional across TAD boundaries. Nucleic Acids Res. 52, 1702–1719 (2023).
Hsieh, T.-H. S. et al. Resolving the 3D landscape of transcription-linked mammalian chromatin folding. Mol. Cell 78, 539–553.e8 (2020).
Rowley, M. J. et al. Condensin II counteracts cohesin and RNA polymerase II in the establishment of 3D chromatin organization. Cell Rep. 26, 2890–2903.e3 (2019).
Chahar, S., Zouari, Y. B., Salari, H., Molitor, A. M. & Kobi, D. Context-dependent transcriptional remodeling of TADs during differentiation. PLoS Biol. 21, e3002424 (2023).
Li, G. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012).
Cisse, I. I. et al. Real-time dynamics of RNA polymerase II clustering in live human cells. Science 341, 664–667 (2013).
Cho, W.-K. et al. Mediator and RNA polymerase II clusters associate in transcription-dependent condensates. Science 361, 412–415 (2018).
Pancholi, A. et al. RNA polymerase II clusters form in line with surface condensation on regulatory chromatin. Mol. Syst. Biol. 17, e10272 (2021).
Hnisz, D., Shrinivas, K., Young, R. A., Chakraborty, A. K. & Sharp, P. A. A phase separation model for transcriptional control. Cell 169, 13–23 (2017).
Leidescher, S. et al. Spatial organization of transcribed eukaryotic genes. Nat. Cell Biol. 24, 327–339 (2022).
Heinz, S. et al. Transcription elongation can affect genome 3D structure. Cell 174, 1522–1536.e22 (2018).
Winick-Ng, W. et al. Cell-type specialization is encoded by specific chromatin topologies. Nature 599, 684–691 (2021).
Jiang, Y. et al. Genome-wide analyses of chromatin interactions after the loss of Pol I, Pol II, and Pol III. Genome Biol. 21, 158 (2020).
Germier, T. et al. Real-time imaging of a single gene reveals transcription-initiated local confinement. Biophys. J. 113, 1383–1394 (2017).
Gu, B. et al. Transcription-coupled changes in nuclear mobility of mammalian cis-regulatory elements. Science 359, 1050–1055 (2018).
Nagashima, R. et al. Single nucleosome imaging reveals loose genome chromatin networks via active RNA polymerase II. J. Cell Biol. 218, 1511–1530 (2019).
Shaban, H. A., Barth, R., Recoules, L. & Bystricky, K. Hi-D: nanoscale mapping of nuclear dynamics in single living cells. Genome Biol. 21, 95 (2020).
Barth, R. & Shaban, H. A. Spatially coherent diffusion of human RNA Pol II depends on transcriptional state rather than chromatin motion. Nucleus 13, 194–202 (2022).
Brandão, H. B. et al. RNA polymerases as moving barriers to condensin loop extrusion. Proc. Natl Acad. Sci. USA 116, 20489–20499 (2019).
Banigan, E. J. et al. Transcription shapes 3D chromatin organization by interacting with loop extrusion. Proc. Natl Acad. Sci. USA 120, e2210480120 (2023).
Cook, P. R. & Marenduzzo, D. Transcription-driven genome organization: a model for chromosome structure and the regulation of gene expression tested through simulations. Nucleic Acids Res. 46, 9895–9906 (2018).
Larkin, J. D., Papantonis, A., Cook, P. R. & Marenduzzo, D. Space exploration by the promoter of a long human gene during one transcription cycle. Nucleic Acids Res. 41, 2216–2227 (2013).
Lengronne, A. et al. Cohesin relocation from sites of chromosomal loading to places of convergent transcription. Nature 430, 573–578 (2004).
Busslinger, G. A. et al. Cohesin is positioned in mammalian genomes by transcription, CTCF and Wapl. Nature 544, 503–507 (2017).
Valton, A.-L. et al. A cohesin traffic pattern genetically linked to gene regulation. Nat. Struct. Mol. Biol. 29, 1239–1251 (2022).
Zhang, S. et al. RNA polymerase II is required for spatial chromatin reorganization following exit from mitosis. Sci. Adv. 7, eabg8205 (2021).
Rivosecchi, J. et al. RNA polymerase backtracking results in the accumulation of fission yeast condensin at active genes. Life Sci. Alliance 4, e202101046 (2021).
Ghosh, S. K. & Jost, D. How epigenome drives chromatin folding and dynamics, insights from efficient coarse-grained models of chromosomes. PLoS Comput. Biol. 14, e1006159 (2018).
Salari, H., Di Stefano, M. & Jost, D. Spatial organization of chromosomes leads to heterogeneous chromatin motion and drives the liquid- or gel-like dynamical behavior of chromatin. Genome Res. 32, 28–43 (2022).
Bartkowiak, B. & Greenleaf, A. L. Phosphorylation of RNAPII: To P-TEFb or not to P-TEFb? Transcription 2, 115–119 (2011).
Belaghzal, H. et al. Liquid chromatin Hi-C characterizes compartment-dependent chromatin interaction dynamics. Nat. Genet. 53, 367–378 (2021).
Boehning, M. et al. RNA polymerase II clustering through carboxy-terminal domain phase separation. Nat. Struct. Mol. Biol. 25, 833–840 (2018).
Lu, H. et al. Phase-separation mechanism for C-terminal hyperphosphorylation of RNA polymerase II. Nature 558, 318–323 (2018).
Rippe, K. & Papantonis, A. Functional organization of RNA polymerase II in nuclear subcompartments. Curr. Opin. Cell Biol. 74, 88–96 (2022).
Phatnani, H. P. & Greenleaf, A. L. Phosphorylation and functions of the RNA polymerase II CTD. Genes Dev. 20, 2922–2936 (2006).
Henninger, J. E. et al. RNA-mediated feedback control of transcriptional condensates. Cell 184, 207–225.e24 (2021).
Bressloff, P. C. & Newby, J. M. Stochastic models of intracellular transport. Rev. Mod. Phys. 85, 135–196 (2013).
Schadschneider, A., Chowdhury, D. & Nishinari, K. Stochastic Transport in Complex Systems: From Molecules to Vehicles (Elsevier, 2010).
Mines, R. C., Lipniacki, T. & Shen, X. Slow nucleosome dynamics set the transcriptional speed limit and induce RNA polymerase II traffic jams and bursts. PLoS Comput. Biol. 18, e1009811 (2022).
de Gennes, P.-G. Scaling Concepts in Polymer Physics (Cornell University Press, 1979).
Lesage, A., Dahirel, V., Victor, J.-M. & Barbi, M. Polymer coil–globule phase transition is a universal folding principle of Drosophila epigenetic domains. Epigenetics Chromatin 12, 28 (2019).
Socol, M. et al. Rouse model with transient intramolecular contacts on a timescale of seconds recapitulates folding and fluctuation of yeast chromosomes. Nucleic Acids Res. 47, 6195–6207 (2019).
Grassberger, P. & Hegger, R. Simulations of three‐dimensional θ polymers. J. Chem. Phys. 102, 6881–6899 (1995).
Caré, B. R., Carrivain, P., Forné, T., Victor, J.-M. & Lesne, A. Finite-size conformational transitions: a unifying concept underlying chromosome dynamics. Commun. Theor. Phys. 62, 607 (2014).
Jonkers, I., Kwak, H. & Lis, J. T. Genome-wide dynamics of Pol II elongation and its interplay with promoter proximal pausing, chromatin, and exons. eLife 3, e02407 (2014).
Fukaya, T., Lim, B. & Levine, M. Enhancer control of transcriptional bursting. Cell 166, 358–368 (2016).
Tunnacliffe, E. & Chubb, J. R. What is a transcriptional burst? Trends Genet. 36, 288–297 (2020).
Dar, R. D. et al. Transcriptional burst frequency and burst size are equally modulated across the human genome. Proc. Natl Acad. Sci. USA 109, 17454–17459 (2012).
Hajjoul, H. et al. High-throughput chromatin motion tracking in living yeast reveals the flexibility of the fiber throughout the genome. Genome Res. 23, 1829–1838 (2013).
Tortora, M. M., Salari, H. & Jost, D. Chromosome dynamics during interphase: a biophysical perspective. Curr. Opin. Genet. Dev. 61, 37–43 (2020).
Schoenfelder, S. et al. The pluripotent regulatory circuitry connecting promoters to their long-range interacting elements. Genome Res. 25, 582–597 (2015).
Miron, E. et al. Chromatin arranges in chains of mesoscale domains with nanoscale functional topography independent of cohesin. Sci. Adv. 6, eaba8811 (2020).
Gelléri, M. et al. True-to-scale DNA-density maps correlate with major accessibility differences between active and inactive chromatin. Cell Rep. 42, 112567 (2023).
Joshi, O. et al. Dynamic reorganization of extremely long-range promoter-promoter interactions between two states of pluripotency. Cell Stem Cell 17, 748–757 (2015).
Zhao, L. et al. Chromatin loops associated with active genes and heterochromatin shape rice genome architecture for transcriptional regulation. Nat. Commun. 10, 3640 (2019).
Ing-Simmons, E. et al. Independence of chromatin conformation and gene regulation during Drosophila dorsoventral patterning. Nat. Genet. 53, 487–499 (2021).
Bignaud, A. et al. Transcriptional units form the elementary constraining building blocks of the bacterial chromosome. bioRxiv https://doi.org/10.1101/2022.09.16.507559 (2022).
Nand, A. et al. Genetic and spatial organization of the unusual chromosomes of the dinoflagellate Symbiodinium microadriaticum. Nat. Genet. 53, 618–629 (2021).
Shin, S., Shi, G., Cho, H. W. & Thirumalai, D. Transcription-induced active forces suppress chromatin motion. Proc. Natl Acad. Sci. USA 121, e2307309121 (2024).
Abdulla, A. Z., Tortora, M. M. C., Vaillant, C. & Jost, D. Topological constraints and finite-size effects in quantitative polymer models of chromatin organization. Macromolecules 56, 8697–8709 (2023).
Conte, M. et al. Dynamic and equilibrium properties of finite-size polymer models of chromosome folding. Phys. Rev. E 104, 054402 (2021).
Caré, B. R., Emeriau, P.-E., Cortini, R. & Victor, J.-M. Chromatin epigenomic domain folding: size matters. AIMS Biophys. 2, 517–530 (2015).
Ryu, J.-K. et al. Bridging-induced phase separation induced by cohesin SMC protein complexes. Sci. Adv. 7, eabe5905 (2021).
Zeng, X. & Pappu, R. V. Developments in describing equilibrium phase transitions of multivalent associative macromolecules. Curr. Opin. Struct. Biol. 79, 102540 (2023).
Giorgetti, L. et al. Predictive polymer modeling reveals coupled fluctuations in chromosome conformation and transcription. Cell 157, 950–963 (2014).
Di Stefano, M. et al. Transcriptional activation during cell reprogramming correlates with the formation of 3D open chromatin hubs. Nat. Commun. 11, 2564 (2020).
Sabari, B. R. et al. Coactivator condensation at super-enhancers links phase separation and gene control. Science 361 (2018).
Kwon, I. et al. Phosphorylation-regulated binding of RNA polymerase II to fibrous polymers of low-complexity domains. Cell 155, 1049–1060 (2013).
Guo, Y. E. et al. Pol II phosphorylation regulates a switch between transcriptional and splicing condensates. Nature 572, 543–548 (2019).
Portman, J. R., Brouwer, G. M., Bollins, J., Savery, N. J. & Strick, T. R. Cotranscriptional R-loop formation by Mfd involves topological partitioning of DNA. Proc. Natl Acad. Sci. USA 118, e2019630118 (2021).
Gibson, B. A. et al. Organization of chromatin by intrinsic and regulated phase separation. Cell 179, 470–484.e21 (2019).
Nozaki, T. et al. Condensed but liquid-like domain organization of active chromatin regions in living human cells. Sci. Adv. 9, eadf1488 (2023).
Cook, P. R. A model for all genomes: the role of transcription factories. J. Mol. Biol. 395, 1–10 (2010).
Barshad, G. et al. RNA polymerase II and PARP1 shape enhancer-promoter contacts. Nat. Genet. 55, 1370–1380 (2023).
Chiang, M. et al. Gene structure heterogeneity drives transcription noise within human chromosomes. bioRxiv https://doi.org/10.1101/2022.06.09.495447 (2022).
Semeraro, M. et al. A multicolour polymer model for the prediction of 3D structure and transcription in human chromatin. bioRxiv https://doi.org/10.1101/2023.01.16.524198 (2023).
Open2C et al. Cooltools: Enabling high-resolution Hi-C analysis in Python. PLoS Comput. Biol. 20, e1012067 (2024).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Wang, H. Y., Elston, T., Mogilner, A. & Oster, G. Force generation in RNA polymerase. Biophys. J. 74, 1186–1202 (1998).
Arbona, J.-M., Herbert, S., Fabre, E. & Zimmer, C. Inferring the physical properties of yeast chromatin through Bayesian analysis of whole nucleus simulations. Genome Biol. 18, 81 (2017).
Milo, R., Jorgensen, P., Moran, U., Weber, G. & Springer, M. BioNumbers—the database of key numbers in molecular and cell biology. Nucleic Acids Res. 38, D750–D753 (2009).
Halverson, J. D., Kremer, K. & Grosberg, A. Y. Comparing the results of lattice and off-lattice simulations for the melt of nonconcatenated rings. J. Phys. A: Math. Theor. 46, 065002 (2013).
Salari, H., Fourel, G. & Jost, D. Transcription regulates the spatio-temporal dynamics of genes through micro-compartmentalization. Zenodo https://doi.org/10.5281/zenodo.10998192 (2024).
Acknowledgements
The authors are grateful to Xavier Darzacq’s lab for sharing processed data; Marco Di Stefano, Guillermo Orsi, and Aurèle Piazza for critical reading of the manuscript; Kerstin Bystricky, Tom Sexton, Giacomo Cavalli, Cédric Vaillant and the members of the Jost lab for fruitful discussions. We acknowledge Agence Nationale de la Recherche [ANR-18-CE12-0006-03, ANR-18-CE45-0022-01, ANR-21-CE45-0011-01] for funding. We thank PSMN (Pôle Scientifique de Modélisation Numérique) of the ENS de Lyon for computing resources.
Author information
Contributions
H.S. and D.J. designed the research; D.J. supervised the project; H.S. developed analytical tools and performed the research; H.S. and D.J. analyzed the data; G.F. provided conceptual advice; H.S. and D.J. wrote the paper with input from G.F.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Salari, H., Fourel, G. & Jost, D. Transcription regulates the spatio-temporal dynamics of genes through micro-compartmentalization. Nat Commun 15, 5393 (2024). https://doi.org/10.1038/s41467-024-49727-7
DOI: https://doi.org/10.1038/s41467-024-49727-7