An ensemble of structures

How are meters of DNA packed inside the 5-μm-diameter nucleus of a cell? Recent developments in imaging (Strickfaden et al. 2010; Müller et al. 2004; Berger et al. 2008; Cremer and Cremer 2010; Yokota et al. 1995) and chromosome capture techniques (Ohlsson and Göndör 2007; Miele and Dekker 2009; Van Berkum and Dekker 2009; Duan et al. 2010) have provided new insights into this problem. Before looking at specific observations, however, it is worth asking a question: what kind of DNA structures do we expect to find in this packing?

Fifty years of research in Structural Biology has provided tens of thousands of protein and nucleic acid structures resolved to a fraction of a nanometer. Such high resolution is possible because billions of copies of a particular protein or nucleic acid all have precisely the same shape in the individual cells of a crystal. Moreover, NMR spectroscopy has demonstrated that protein structure in solution largely resembles that in a crystal and, more surprisingly, that the vast majority of copies of a protein freely floating in solution have about the same structure. Several unstructured regions (i.e., regions that have different conformations in individual molecules and/or rapidly interconvert) have recently attracted great attention (Uversky and Dunker 2010; Vendruscolo 2007) in the protein sciences. Do we expect that most DNA/chromatin has a stable, well-defined spatial structure analogous to the situation with proteins? How different are these structures in individual cells and how rapidly do they move around, fold, and unfold?

While some specific loci may have stable conformations that are the same in all cells, we do not expect the majority of the chromatin fibers to be folded in exactly the same way in different cells. The entropic cost of ordering such gigantic molecules as chromosomal DNA in eukaryotes can run too high to achieve precise folding. The resulting cell-to-cell variability in chromatin structure is, however, averaged over millions of cells by methods like chromosome conformational capture (Ohlsson and Göndör 2007; Naumova and Dekker 2010), which provide detailed information about the probability of possible interactions in the ensemble. While such variability makes it difficult to build precise 3D models, some first models based on 5C data have been developed (Baù et al. 2010). Experimental data also allow the study of some general features of chromatin architecture and principles that govern its organization. How can one characterize chromatin folding if no unique structure is attainable?

One productive approach offered by statistical mechanics and used in structural biology of proteins (Vendruscolo 2007) is to consider an ensemble of conformations (not necessarily in equilibrium) found in different cells and/or at different timepoints during an experiment. Statistical properties of the ensemble can reveal the principles that govern DNA packing. Thus, the aim of building a single 3D model consistent with the measurements is replaced with the goal of finding a physical model of folding that produces an ensemble of conformations whose properties resemble those of the ensemble studied experimentally by chromatin capture and/or optical techniques. In the search for such a model, we turn to the statistical physics of polymers, which is concerned with characterizing states of a polymer that emerge as a result of interactions between the monomers, the solvent, and surrounding surfaces.

Below, I will describe classical equilibrium states of the polymer and their biologically relevant and measurable statistical properties. Next, I will focus on a non-equilibrium state, the fractal globule, originally proposed in 1988 (Grosberg et al. 1988) and originally named the crumpled globule (here we adopt the former notation). Later, this state was suggested as a model for DNA folding inside a cell (Grosberg et al. 1993) and recently brought into the spotlight by the discovery that such a state is indeed consistent with Hi-C data obtained for human cells (Lieberman-Aiden et al. 2009). I will then present a summary of our recent work aimed at characterizing biophysical properties of the fractal globule, and the relevance of this architecture for a range of biological functions. Finally, I will discuss our expectations regarding the possibility of finding the fractal globule architecture of chromatin in yeast and bacteria.

Chromatin as a polymer

The approach of statistical physics frequently deals with a coarse-grained “beads-on-a-string” representation of a polymer (Grosberg and Khokhlov 1994; Gennes 1979; Rubinstein and Colby 2003). The power of this approach is that it describes an ensemble of polymer conformations that emerges at scales much greater than the size of the individual monomers and irrespective of their fine structure: whether the monomer is a single chemical group, an amino acid, or a nucleosome.

Several approximations have to be made to model the chromatin fiber as a homopolymer, i.e., a polymer with all monomers interacting in the same way, having the same size and uniform flexibility along the chain.

As a first approximation, eukaryotic chromatin can be considered as a polymer fiber formed by DNA wrapped around nucleosomes and separated by linkers of about 40–60 bp (Routh et al. 2008). This fiber has a diameter of about 10 nm and a flexibility which emerges as a result of the flexibility of the linkers and partial unwrapping of nucleosomal DNA. This permits estimating its persistence length. Given that the persistence length of DNA is 150 bp, about three to four linkers would provide the flexibility corresponding to this persistence length of the fiber. Steric interactions between nucleosomes and possible occupancy of linkers by other DNA-binding proteins, however, can make the fiber less flexible, leading to the estimate that about five to six nucleosomes form a persistence length fragment. Thus, each “bead” is not a single nucleosome, but a few neighboring nucleosomes. The arrangement of neighboring nucleosomes within such a bead determines its size but is of less concern for large-scale architecture: it can be some sort of regular zig-zag pattern or an irregular blob whose fold is determined by linker lengths and nucleosome phasing (Routh et al. 2008). If the fiber is modeled as a freely jointed chain, each segment of the chain should have a length twice the persistence length, i.e., 10–12 nucleosomes, which corresponds to 2–2.5 kbp of DNA. Thus, a chromosome/region of 10 Mb can be modeled as a chain of 4,000–5,000 freely jointed segments. Each “bead” then consists of 10–12 nucleosomes and, depending on their arrangement, will have a volume exceeding that of its comprising DNA and histones by a factor of three to four, i.e., \( v \approx 15 \times {10^3}\;{\hbox{nm}}^3 \), allowing it to be modeled as a sphere of about 20–40 nm in diameter.
Alternatively, one can model chromatin as a homopolymer of the 30-nm fiber that has been observed in vitro but whose presence in vivo is debated (Van Holde and Zlatanova 2007), or a heteropolymer with a polymorphic structure and flexibility, which depends on local nucleosome density (Diesinger et al. 2010). If polymorphisms of the fiber are local, the overall architecture at much greater length scales could be independent of a structure of the fiber and determined primarily by its polymeric nature and long-range interactions.
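The coarse-graining estimates for the 10-nm fiber can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers quoted above; the function and variable names are hypothetical, and the bead is assumed to be a sphere.

```python
import math

# Numbers quoted in the text for the 10-nm fiber model.
REGION_BP = 10_000_000             # a 10-Mb chromosome region
SEGMENT_BP_RANGE = (2_000, 2_500)  # DNA per freely jointed segment (10-12 nucleosomes)
BEAD_VOLUME_NM3 = 15e3             # approximate bead volume, v ~ 15 x 10^3 nm^3


def segments_in_region(region_bp, segment_bp):
    """Number of freely jointed segments needed to represent a region."""
    return region_bp / segment_bp


def sphere_diameter(volume):
    """Diameter of a sphere with the given volume, from V = pi * d^3 / 6."""
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)


n_max = segments_in_region(REGION_BP, SEGMENT_BP_RANGE[0])  # shortest segments
n_min = segments_in_region(REGION_BP, SEGMENT_BP_RANGE[1])  # longest segments
bead_diameter_nm = sphere_diameter(BEAD_VOLUME_NM3)

print(f"segments: {n_min:.0f}-{n_max:.0f}")
print(f"bead diameter: {bead_diameter_nm:.1f} nm")
```

The segment count reproduces the 4,000–5,000 range, and the equivalent-sphere diameter falls inside the quoted 20–40 nm window.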

The nature of interactions between the monomers remains to be discovered. These can include DNA bridging and packing by specific structural proteins, like cohesin (Nasmyth and Haering 2009), CTCF (Phillips and Corces 2009) or RNA molecules (Ng et al. 2007), and long-range interactions between enhancers and promoters mediated by assembly of the transcription machinery (Alberts 2008b). Interactions between the chromatin fiber and the rest of the nucleus may involve steric confinement by the lamina, anchoring to the nuclear matrix or protein-mediated bridging to the nuclear lamina (Kind and Van Steensel 2010).

More complicated statistical modeling can also consider specific interactions, replacing a homopolymer with a heteropolymer of several types of beads interacting differently with each other and the lamina, e.g., regions of open and closed chromatin. One can also take into account how the local density of nucleosomes influences local flexibility of the chain and size of the “beads” (Alberts 2008a); for example, loss of nucleosomes in a regulatory region can make it much more flexible. While it may be tempting to model bacterial chromatin as naked DNA subject to interactions mediated by DNA bridging and structural proteins like H-NS (Fang and Rimsky 2008) and MukBEF (Petrushenko et al. 2010), local supercoiling can lead to formation of non-trivial DNA packing. The fluctuating filament model recently introduced by Wiggins et al. (2010) models bacterial DNA as packed into a uniform-density filament of some yet-unknown structure that is likely to include a stack of plectonemic supercoiled loops. In principle, a polymer decorated by supercoiled loops can be modeled as a branched polymer. Chromatin containing many crosslinks, e.g., mitotic chromosomes (Marko 2008), can be considered as a polymer gel.

In summary, to the first approximation, chromatin can be modeled as a homopolymer formed by DNA wrapped into nucleosomes. Such a polymer is assumed to have a constant diameter, DNA density, and flexibility along the chain, with monomers experiencing excluded volume and other interactions as well as spatial confinement. More detailed models of a heteropolymer may include heterogeneity of density, interactions, and shapes of the monomers.

Equilibrium states of a single polymer

The properties of a homopolymer under different conditions are presented in detail in several excellent books which should satisfy both an expert (Grosberg and Khokhlov 1994; Gennes 1979; Rubinstein and Colby 2003) and a novice (Grosberg and Khokhlov 1997) in the field. Here, we provide a quick summary, focusing on biologically relevant quantities that are measured by optical and chromosome capture experiments. Two characteristics of primary interest are (1) the mean spatial distance \( R(s) \) between two loci that are a genomic distance s apart along the chain, a quantity that is measured by fluorescence in situ hybridization (FISH); and (2) the probability of contact \( {P_c}(s) \) between two loci that are a distance s apart, which can be calculated from the chromatin capture data. Both quantities are averaged over the conformational ensemble in polymer physics and over a population of cells in an experiment.
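To make the two ensemble averages concrete, here is a minimal sketch of how they could be computed from a set of conformations. The function name, the contact cutoff `contact_radius`, and the two toy chains are assumptions made for illustration; the distance is reported as a root-mean-squared value.

```python
import math


def mean_distance_and_contact(conformations, s, contact_radius):
    """Ensemble averages for loci separated by s monomers along the chain.

    conformations: list of chains, each a list of (x, y, z) tuples.
    Returns (R(s), P_c(s)): the root-mean-squared spatial distance and
    the fraction of locus pairs no farther apart than contact_radius.
    """
    sum_sq, contacts, pairs = 0.0, 0, 0
    for chain in conformations:
        for i in range(len(chain) - s):
            d = math.dist(chain[i], chain[i + s])
            sum_sq += d * d
            contacts += d <= contact_radius
            pairs += 1
    return math.sqrt(sum_sq / pairs), contacts / pairs


# Toy ensemble (hypothetical): one extended chain and one U-shaped chain.
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
folded = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]

r_s, p_c = mean_distance_and_contact([straight, folded], s=3, contact_radius=1.0)
print(r_s, p_c)
```

In this toy ensemble only the U-shaped chain brings its ends into contact, so half of the sampled pairs count as contacts.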

The random coil

A polymer in which monomers that are far apart along the chain do not interact, even when approaching each other in space, is called an ideal chain. Under certain conditions, the behavior of real chains can be well approximated by an ideal chain. Irrespective of the local mechanisms of chain flexibility (e.g., a worm-like chain, a freely jointed chain, etc.), the behavior of sufficiently long fragments of the chain resembles a 3D random walk. The characteristic size of the polymer R, which can be defined as either its root-mean-squared end-to-end distance \( \sqrt {{\left\langle {{R_{ee}}^2} \right\rangle }} \) or its mean radius of gyration \( {R_g} \), scales with the polymer length N as

$$ R(N) \sim {N^{1/2}}. $$

The end-to-end distance of a subchain of length s has the same scaling, i.e., \( R(s) \sim {s^{1/2}} \). Here and below, polymer length is measured in the units of the polymer's persistence length \( {\ell_p} \), i.e., \( N = L/{\ell_p} \), which depends on the local mechanism of flexibility and for naked DNA was measured to be \( {\ell_p} \approx 150\;{\hbox{bp}} \). Alternatively, one can use the Kuhn length b, which is defined as the length of a bond in a freely jointed chain that has the same end-to-end distance. For the worm-like chain model, the Kuhn length is about twice the persistence length, \( b = 2{\ell_p} \), and the size of the polymer (the root-mean-squared end-to-end distance) is

$$ R(N) = b{N^{1/2}}, $$

where N = L/b.

Characteristic for the ideal chain is the exponent \( \nu = 1/2 \) in \( R(s) \sim {s^\nu } \). This scaling of the end-to-end distance with s can be tested by FISH experiments, where two loci, a distance s apart, are labeled and visualized in individual cells, allowing the measurement of spatial distance between them. A recent review (Emanuel et al. 2009) suggests that significant cell-to-cell variability, however, makes it hard to obtain reliable estimates of \( \nu \).
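The exponent of 1/2 can be illustrated numerically. The sketch below is an assumption-laden toy, not the simulations discussed in this review: it averages simple lattice random walks (steps along the six lattice directions) and fits the slope of log R(s) versus log s. All names are hypothetical.

```python
import math
import random

# Unit steps along the six directions of a cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def rms_distance_by_s(n_walks, s_values, rng):
    """RMS end-to-end distance of lattice random walks at the given lengths."""
    max_s = max(s_values)
    wanted = set(s_values)
    sum_sq = {s: 0.0 for s in s_values}
    for _ in range(n_walks):
        x = y = z = 0
        for step in range(1, max_s + 1):
            dx, dy, dz = rng.choice(STEPS)
            x, y, z = x + dx, y + dy, z + dz
            if step in wanted:
                sum_sq[step] += x * x + y * y + z * z
    return {s: math.sqrt(sum_sq[s] / n_walks) for s in s_values}


def fit_exponent(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


rng = random.Random(0)  # fixed seed so the run is repeatable
s_values = [10, 30, 100, 300, 1000]
rms = rms_distance_by_s(400, s_values, rng)
nu = fit_exponent(s_values, [rms[s] for s in s_values])
print(f"fitted exponent: {nu:.2f}")
```

With a few hundred walks the fitted exponent should land close to 1/2; averaging over fewer walks makes the estimate noisier, which mirrors the difficulty of estimating the exponent from a limited number of cells.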

Chromosome capture methods (Miele and Dekker 2009; Ohlsson and Göndör 2007; Van Berkum and Dekker 2009; Lieberman-Aiden et al. 2009) in turn, can provide data on the probability of contact between loci distance s apart along the genome. For the ideal chain one can obtain

$$ {P_c}(s) \sim {s^{ - 3/2}}. $$

Note that a polymer in this state is rather expanded and has a low density. For example, a random coil of the Escherichia coli genome has a size of \( R = b\sqrt {N} = \sqrt {{bL}} = \sqrt {{300 \cdot 4.6 \cdot {{10}^6}}} \;{\hbox{bp}} \approx 12\;\mu {\hbox{m}} \), which is much greater than the size of the E. coli bacterium. If excluded volume interactions between the monomers are taken into account, then the scaling of the polymer size changes to \( R \sim {N^{3/5}} \), a case referred to as the “swollen coil”. The swollen coil has a size even larger than that of the random coil and is unlikely to be a relevant model for DNA packing.
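The E. coli estimate can be reproduced step by step. This is a sketch under stated assumptions: the conversion of 0.34 nm per base pair (the rise of B-form DNA) is not given in the text and is introduced here to pass from base pairs to micrometers, the prefactor of the swollen coil is taken to be the Kuhn length, and all names are hypothetical.

```python
import math

KUHN_LENGTH_BP = 300      # b = 2 * persistence length of naked DNA (150 bp)
GENOME_LENGTH_BP = 4.6e6  # E. coli genome length used in the text
NM_PER_BP = 0.34          # assumed rise per base pair of B-form DNA


def coil_size_bp(kuhn_bp, length_bp, exponent=0.5):
    """Coil size R = b * N**exponent in base-pair units, with N = L / b.

    exponent=0.5 is the ideal (random) coil; exponent=0.6 is the swollen
    coil, with the prefactor assumed equal to b for illustration.
    """
    n_segments = length_bp / kuhn_bp
    return kuhn_bp * n_segments ** exponent


ideal_bp = coil_size_bp(KUHN_LENGTH_BP, GENOME_LENGTH_BP)
ideal_um = ideal_bp * NM_PER_BP / 1000.0
swollen_um = coil_size_bp(KUHN_LENGTH_BP, GENOME_LENGTH_BP, 0.6) * NM_PER_BP / 1000.0

print(f"ideal coil: {ideal_um:.1f} um, swollen coil: {swollen_um:.1f} um")
```

Under these assumptions the ideal coil comes out at roughly 12–13 μm, and the swollen coil is larger still.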

The equilibrium globule

If attraction between the monomers dominates over excluded volume repulsion, or if the polymer is confined to a sufficiently small volume, the polymer undergoes a coil-globule transition into an equilibrium globule. The size of the equilibrium globule scales with the polymer length as

$$ R \sim {N^{1/3}}. $$

Hence, the volume occupied by the polymer scales linearly with polymer length: \( V \sim {R^3} \sim N \), i.e., monomers fill a fixed fraction of the volume, and the density of monomers \( \rho \equiv N/V \sim const \) is independent of the polymer length and is uniform inside the globule. This uniform density contrasts with that of an ideal chain, where the volume populated by the polymer is \( V \sim {R^3} \sim {N^{3/2}} \) and the density decreases with N as \( \rho \sim N/V \sim {N^{ - 1/2}} \), resulting in monomers occupying a tiny fraction of the volume of the coil. Relevant for FISH experiments is the scaling of the end-to-end distance of a subchain with its length s. This scaling in the globule differs from that of the whole chain. In fact, according to the Flory theorem (Grosberg and Khokhlov 1994), interactions of a chain in a dense melt are screened by other chains, making that chain behave almost like an ideal chain (i.e., a random walk). In other words, a chain inside a globule behaves like a random walk (\( R(s)\sim {s^{1/2}} \)), until the “walker” hits the boundary of the confining volume (or the boundary of the globule). After such a “collision” the walker starts a new random walk from the boundary inside the globule. After several such collisions, the volume of the globule becomes filled with random walks that are uncorrelated with each other. Since the volume is filled by the chain uniformly, the walker that experienced several collisions (i.e., \( {s^{1/2}} > R(N)\sim {N^{1/3}} \)) is equally likely to be found anywhere within the volume. The end-to-end distance of a subchain then scales as

$$ R(s) \sim \left\{ {\begin{array}{ll} {s^{1/2}} & {\hbox{for}}\;\;s \leqslant {N^{2/3}} \\ {\hbox{const}} & {\hbox{for}}\;\;s > {N^{2/3}}. \end{array} } \right. $$

Note that a similar ideal regime of chains is observed in other dense polymer systems such as melts of many individual polymers. The contact probability for a subchain of an equilibrium globule scales (Lua et al. 2004) approximately as

$$ {P_c}(s) \sim \left\{ {\begin{array}{ll} {s^{ - 3/2}} & {\hbox{for}}\;\;s \leqslant {N^{2/3}} \\ {\hbox{const}} & {\hbox{for}}\;\;s > {N^{2/3}}. \end{array} } \right. $$
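The two piecewise scalings for the equilibrium globule can be written as small functions. This is a sketch with all prefactors set to 1 and with the two regimes matched continuously at the crossover \( s = {N^{2/3}} \); both choices are assumptions made for illustration, and the function names are hypothetical.

```python
def crossover_length(n):
    """Subchain length N**(2/3) at which the random walk reaches the globule size."""
    return n ** (2.0 / 3.0)


def equilibrium_globule_distance(s, n):
    """R(s) up to a prefactor: s**(1/2) below the crossover, constant above it."""
    return min(s, crossover_length(n)) ** 0.5


def equilibrium_globule_contact(s, n):
    """P_c(s) up to a prefactor: s**(-3/2) below the crossover, constant above it."""
    return min(s, crossover_length(n)) ** -1.5


N = 32_000  # chain length used for the simulated globules in Fig. 1
print(f"crossover at s ~ {crossover_length(N):.0f} monomers")
for s in (10, 100, 1_000, 10_000):
    print(s, equilibrium_globule_distance(s, N), equilibrium_globule_contact(s, N))
```

With this matching, the plateau value of R(s) equals \( {N^{1/3}} \) and the plateau of the contact probability equals \( {N^{ - 1}} \), as expected for a chain that fills the globule volume uniformly.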

Figure 1 shows these scaling behaviors for simulated equilibrium globules. Interestingly, FISH data for yeast chromosomes labeled at the centromere and telomere show a similar roll-over into a plateau, a characteristic feature of the equilibrium globule (see figures in Therizols et al. 2010; Emanuel et al. 2009).

Fig. 1

a Root-mean squared end-to-end distance R(s) as a function of the genomic distance s between the ends of a subchain (in the units of \( \ell \)) for globules of N = 32,000 monomers. Blue, equilibrium globule; green, fractal globule. At small s, both globules show scaling characteristic of the self-avoiding random walk (3/5), followed by 1/2 of the ideal coil. Notice there is a plateau for the equilibrium globule. b The probability of a contact as a function of genomic distance s for the equilibrium globule (blue) and the fractal globule (green). Notice the robust scaling of −1 which spans two orders of magnitude for the fractal globule

Another important property of the equilibrium globule is its entanglement. Computer simulations (Virnau et al. 2005; Lieberman-Aiden et al. 2009) and theoretical calculations (Metzler et al. 2002; Grosberg 2000) have demonstrated that a long polymer folded into an equilibrium globule is highly knotted. Such knots can hamper folding and unfolding processes (Bölinger et al. 2010), making knotted conformations rare among naturally occurring protein structures (Virnau et al. 2006; Lua and Grosberg 2006). Because of the high degree of entanglement of the globule, folding into such knotted conformations requires a polymer to thread its ends through different loops many times. Since its slithering motion is rather slow and diffusive (polymer ends move equally forward and backward), formation of the entangled equilibrium globule is a very slow process (with equilibration time \( \sim {N^3} \); Grosberg et al. 1988).

The fractal (crumpled) globule

According to de Gennes, polymer collapse proceeds by the formation of crumples of increasing sizes: first, small crumples are folded, leading to formation of an effectively thicker polymer-of-crumples, which next forms large crumples itself, etc. Grosberg et al. (1988) demonstrated that this process should lead to formation of a long-lived state that they called a crumpled globule (recently referred to as a fractal globule). They also conjectured that such a globule is characterized by a hierarchy of crumples thus forming a self-similar structure (Grosberg et al. 1988). These crumples emerge due to topological constraints: every sufficiently long chain experiences such constraints imposed by other parts of the polymer and collapses into a crumple subject to these confining interactions (Khokhlov and Nechaev 1985).

Since the fractal globule is space-filling, its volume scales linearly with polymer length the same way the equilibrium globule does:

$$ R(N) \sim {N^{1/3}}. $$

Since, according to the conjecture by Grosberg et al. (1988), the fractal globule consists of globules (crumples) formed on all scales (Figs. 2 and 3), the scaling of the size of a subchain of length s should follow the same law:

$$ R(s) \sim {s^{1/3}} $$

Fig. 2

Conformations of the fractal (a) and equilibrium (b) globules. The chain is colored from red to blue in rainbow colors as shown on the top. The fractal globule has a striking territorial organization, which strongly contrasts with the mixing observed in the equilibrium globule. Territorial organization of the fractal globule (c) is evident when two chains of 1,000 monomers each are outlined. The equilibrium globule (d), in contrast, has two chains mixed together in space

Fig. 3

The fractal globule (a) consists of dense globules formed on all scales. Subchains of 100, 300, 1,000, and 3,000 monomers (left to right) are shown by a red tube in a globule of N = 32,000 monomers. For comparison, the same regions of the equilibrium globule (b) are diffuse inside the globule

This applies to sufficiently long subchains, i.e., \( s > {N^*} \), where \( {N^*} \) is the minimal length of the polymer that can form a spontaneous knotted structure and is believed to be about 10–20 Kuhn lengths (Grosberg et al. 1988). The self-similar conformation of the fractal globule resembles a statistical fractal with a fractal dimension of 3 (for comparison, the Gaussian coil formed by an ideal chain has a fractal dimension of 2). Comparison of these equations with the scaling for the equilibrium globule (Eq. 5) reveals two major differences: (1) the scaling of the end-to-end distance has a power of 1/3 for the fractal globule, rather than 1/2 for the equilibrium globule; and (2) the plot of R(s) vs s for the fractal globule does not have a plateau like that present for the equilibrium globule (see Fig. 1a). Such differences could be detected by high-resolution DNA FISH experiments with averaging over a sufficiently large number of cells (e.g., Yokota et al. 1995), but significant cell-to-cell variability can make it hard to distinguish the powers of 1/2 and 1/3 (Emanuel et al. 2009).

The contact probability for the fractal globule was not computed in the original contribution (Grosberg et al. 1988) and was difficult to compute analytically without making drastic simplifications. Our group used simulations to obtain the scaling of \( {P_c}(s) \) in the fractal globule. We used traditional Monte Carlo simulations of a freely jointed polymer chain modeled as spherical impenetrable beads with diameter b (Lieberman-Aiden et al. 2009; Imakaev and Mirny 2010). The simulations took care not to violate topological constraints, as tested by computing Alexander polynomials on reduced chains (Virnau et al. 2006). Simulating collapse from a polymer coil by applying a confining spherical cage or via pairwise interactions, we obtained fractal globules for chains as long as N = 500,000 monomers (Imakaev and Mirny 2010). The resultant fractal globules show a robust scaling

$$ {P_c}(s) \sim {s^{ - 1}} $$

over a broad range of polymer lengths (N = 4,000–500,000) and subchain lengths (\( s = {10^1} \) to \( {10^5} \)). Comparison with the contact probability for the corresponding equilibrium globule (Eq. 6) shows a significant difference in the exponent (−1 vs −3/2) and the lack of a plateau for large s, which is present in the equilibrium globule (Fig. 1b). These features have been used in the recent analysis of the human Hi-C data (Lieberman-Aiden et al. 2009).

The fractal globule in human chromatin architecture

Recently, chromosomal contacts in human cells have been characterized by the Hi-C experiments (Lieberman-Aiden et al. 2009). Among several important observations brought to light by this study is the dependence of the contact probability \( P_c^{{ \exp }}(s) \) on genomic distance s:

$$ P_c^{{ \exp }}(s) \sim {s^\alpha },\;\;\;\;\alpha \approx - 1, $$

for s in the range from 0.5 Mb to about 6 Mb. The original paper made this statement based on a linear fitting of \( \log P_c^{{ \exp }}(s) \) vs \( \log s \). Our more recent analysis using a maximum-likelihood estimator and systematic model selection has (1) confirmed that the Hi-C contact probability \( P_c^{{ \exp }}(s) \) is best fit by a power law with α very close to −1, and (2) shown that such a fit could extend beyond the 5–10-Mb range (Fudenberg and Mirny 2010).
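A linear fit of log contact probability against log genomic distance can be sketched as follows. The data here are synthetic (a power law with exponent −1 plus multiplicative noise), not Hi-C measurements, and the function name is hypothetical; this illustrates the simple log-log regression only, not the maximum-likelihood estimator.

```python
import math
import random


def fit_power_law_exponent(s_values, p_values):
    """Least-squares slope of log P versus log s (the linear log-log fit)."""
    lx = [math.log(s) for s in s_values]
    ly = [math.log(p) for p in p_values]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


# Synthetic data only: P ~ s^-1 with multiplicative noise, for s between
# 0.5 Mb and 6 Mb (the range quoted in the text). Not real Hi-C counts.
rng = random.Random(1)
n_points = 30
s_values = [0.5e6 * 12 ** (i / (n_points - 1)) for i in range(n_points)]
p_values = [(1.0 / s) * math.exp(rng.gauss(0.0, 0.1)) for s in s_values]

alpha = fit_power_law_exponent(s_values, p_values)
print(f"fitted alpha: {alpha:.2f}")
```

A least-squares fit in log-log space weights all distance bins equally regardless of how many contacts they contain, which is one reason a likelihood-based estimator may be preferable for count data.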

The scaling of \( {s^{ - 1}} \) is easy to intuit. First, it means that loci twofold farther apart are twofold less likely to interact. Second, if contacts are interpreted as chromatin loops, then there is no mean or characteristic loop length: loops of all lengths are present and the mean is not well defined for \( {s^{ - 1}} \) scaling. This observation contrasts with the earlier loop models of chromatin packing (Münkel et al. 1999; Sachs et al. 1995).
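Both intuitions can be checked numerically. The sketch below assumes that loop lengths follow a density proportional to 1/s between two cutoffs, `s_min` and `s_max`, which are hypothetical parameters introduced here; for such a density the mean is (s_max − s_min)/ln(s_max/s_min), so it keeps growing as the upper cutoff is raised.

```python
import math


def contact_ratio(factor, alpha=-1.0):
    """Ratio P_c(factor * s) / P_c(s) for a power law P_c(s) ~ s**alpha."""
    return factor ** alpha


def mean_loop_length(s_min, s_max):
    """Mean of s under a density proportional to 1/s on [s_min, s_max]."""
    return (s_max - s_min) / math.log(s_max / s_min)


print("twofold farther apart:", contact_ratio(2.0))
for s_max in (1e6, 1e7, 1e8):
    # Lower cutoff fixed at 10 kb (an arbitrary choice for illustration).
    print(f"s_max = {s_max:.0e} bp -> mean loop = {mean_loop_length(1e4, s_max):.3g} bp")
```

The mean is set almost entirely by the upper cutoff rather than by any intrinsic scale, which is what is meant by the absence of a characteristic loop length.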

Another important feature of \( P_c^{{ \exp }}(s) \) obtained by Hi-C experiments is the lack of a clear plateau at large s, which would be indicative of the equilibrium globule (see Fig. 1b). While some rise in the slope is observed for \( s \gtrsim 50 \)–100 Mb, it is not statistically significant due to the fact that some chromosomes are shorter than this range and due to the lack of dynamic range which would allow extremely low-frequency interactions to be resolved at such distances.

Another recently introduced polymer model of chromatin is the random loop model (Mateos-Langerak et al. 2009; Bohn et al. 2007). In this model, any pair of monomers has a fixed probability of having an attractive interaction. Such attractive interactions lead to polymer collapse. Since the interactions are uniformly distributed, the final configurations show all the features of the equilibrium globule, most prominently, the saturation in the size of fragments: \( R(s) = {\hbox{const}} \) for large s (Mateos-Langerak et al. 2009). Such a plateau in R(s) is consistent with FISH data for a range beyond 25 Mb (for chromosome 11). The plateau in R(s), however, will inevitably lead to a plateau in the contact probability (i.e., \( {P_c}(s) = {\hbox{const}} \) for large s), which is in striking disagreement with Hi-C data for human cells.

The fractal globule, in turn, is the only ensemble of conformations that is consistent with (a) the 1/s scaling of the contact probability for distances in the range of up to 10 Mb, and (b) the lack of a plateau in \( {P_c}(s) \). Therefore, over this megabase range, the Hi-C results exclude several models for higher order chromatin structure, including the equilibrium globule, the random loop model, the swollen or ideal coils, and any regular arrangement of open loops. While available FISH data generally lack sufficient precision to discriminate between exponents of 1/2 and 1/3 for R(s) (Emanuel et al. 2009), data for large distances s > 10 Mb in human chromosome 4 (Yokota et al. 1995) are best fit by \( R(s) \sim {s^{0.32}} \) (Münkel et al. 1999), which is consistent with the fractal globule's 1/3.

It is presently difficult to propose an adequate model for chromatin packing for s > 25 Mb, due to a significant scatter in the FISH data and a small frequency of contacts at this range in the Hi-C data. However, forthcoming high-resolution chromosome capture data combined with optical methods could change this.

Folding, unfolding, and loop opening

Beyond being the only model that fits Hi-C data, the fractal globule has several important properties that make it an attractive way of organizing chromatin in a cell. The fractal globule is easy to form: as we showed by simulations (Lieberman-Aiden et al. 2009; Imakaev and Mirny 2010) a non-specific collapse of a polymer naturally leads to a fractal globule conformation, provided that topological constraints are in place, i.e., the chain cannot cross itself. For a chromatin fiber, such a collapse of ~5–10-Mb loops could be induced by DNA-binding condensing/linking proteins like cohesin (Nasmyth and Haering 2009), CTCF (Phillips and Corces 2009; Ohlsson et al. 2010), or structural RNAs (Ng et al. 2007) and can span large chromosomal domains.

The fractal globule is unentangled, i.e., it contains no knots since it maintains the topology of an open state. Dynamics of chromatin opening from the unentangled fractal conformation are very different from those of the knotted conformation of the equilibrium globule, as we demonstrated by simulations. Figure 4a shows opening of a region of about 1 Mb in two types of globules of 8 Mb each. While in a fractal globule the region can easily unfold if the molecular crosslinks (i.e., attractive interactions) that keep it condensed are removed, a similar region of the equilibrium globule does not fully open up as it remains trapped by multiple entanglements (Fig. 4b).

Fig. 4

Opening of a loop that is a part of the fractal globule (a), and the equilibrium globule (b). Globules of 32,000 monomers were folded by pairwise attractive interactions. The fractal globule was formed by Molecular Dynamics, which keeps track of topological constraints, while the equilibrium globule was equilibrated by Monte Carlo simulations (Reith and Virnau 2010) that violate topological constraints, leading to significant entanglement. On the next step of molecular dynamics simulation, attractive interactions for a region of 3,000 monomers were removed, allowing the region to open up due to the chain entropy. In the fractal globule, the region opened up, forming a large loop (a). The same region failed to open from the equilibrium globule (b) due to chain entanglements in this state

Such ability to rapidly unfold can be of great importance for gene activation, which has been shown to cause decondensation of large (0.5–2 Mb) genomic regions (Hubner and Spector 2010; Müller et al. 2001). Our model suggests that displacement or modification of crosslinking proteins/RNAs in a spatially small area of a condensed loop is sufficient to trigger its large-scale decondensation. Modification or displacement of crosslinking proteins can be accomplished by some components of the transcription machinery or polymerase complex that are recruited to the activated locus. This can explain why decondensation depends on the presence of transcription factor activation domains (Carpenter et al. 2005) or polymerase activity (Müller et al. 2001). In simulations, the unfolded loop can rapidly move around, allowing it to sample space. Such dynamics can allow chromatin loops to search the nuclear environment for transcription factories (Cope et al. 2010; Hubner and Spector 2010). Since the unfolded loop has a size of \( R(s) \sim {s^{1/2}} \) as compared to the size of the folded loop \( R(s) \sim {s^{1/3}} \), the unfolded region can exceed in size a much longer domain folded into a compact fractal globule (Fig. 4). This argument is consistent with the experiments of Müller et al. (2001), which demonstrated that a 0.5-μm spot decondenses into a 1–10-μm spot.
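The size comparison between an open loop and a compact domain follows from the two exponents alone. In this sketch all prefactors are set to 1 and lengths are measured in monomers, both of which are simplifying assumptions; the function names are hypothetical, and the 3,000-monomer example is the region size from Fig. 4.

```python
def size_ratio_unfolded_to_folded(s):
    """R_unfolded(s) / R_folded(s) = s**(1/2) / s**(1/3) = s**(1/6)."""
    return s ** 0.5 / s ** (1.0 / 3.0)


def folded_length_with_same_size(s):
    """Length s' of a compact domain as large as an open loop of length s.

    Solves s'**(1/3) = s**(1/2), giving s' = s**(3/2).
    """
    return s ** 1.5


loop = 3_000  # monomers in the opened region of Fig. 4
print(f"expansion on unfolding: {size_ratio_unfolded_to_folded(loop):.1f}x")
print(f"equivalent compact domain: {folded_length_with_same_size(loop):.0f} monomers")
```

Under these assumptions a few-fold increase in linear size is expected when a region of this length opens up, and the compact domain of equal size is much longer than the loop itself.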

Thus, the fractal globule architecture allows rapid and large-scale opening of genomic loci as well as their spatial motion in the unfolded state. Importantly, all these events happen spontaneously in response to local removal/modification of crosslinking proteins and are driven by the entropy of the polymer chain.

The fractal globule, topological constraints, and chromosomal territories

Our recent computer simulations of polymer folding (Lieberman-Aiden et al. 2009; Imakaev and Mirny 2010) demonstrated that when a chain is folded into a fractal globule, each sequential region of the chain occupies a distinct spatial region (see Figs. 2 and 3). This segregation of subchains is akin to the segregation of polymer rings that occurs due to topological constraints and was suggested as a mechanism that leads to formation of chromosomal territories (Rosa and Everaers 2008; Dorier and Stasiak 2009; De Nooijer et al. 2009; Grosberg et al. 1988; Vettorel et al. 2009). In contrast to chromosomal territories that separate chromosomes into spatially distinct regions, spatial segregation in the fractal globule occurs on all scales (Fig. 3). This suggests the presence of genomic territories where a continuous genomic region is spatially compact, and different regions occupy different locations (Fig. 2). The fractal globule suggests the presence of genomic territories in a broad range of scales: from tens of kilobases to tens of megabases. While subchromosomal domains have been visualized as non-overlapping spatial entities (Visser and Aten 1999), a more systematic study of genomic territories can test predictions made by the fractal globule model.

While scaling of the contact probability suggested the presence of fractal globules for genomic regions of up to 5–10 Mb long, it is possible that full chromosomes and their relative packing follow the same principle of fractal globule architecture. Several recent studies have suggested that topological constraints can lead to the emergence of chromosomal territories (Rosa and Everaers 2008; Dorier and Stasiak 2009; De Nooijer et al. 2009; Visser and Aten 1999; Vettorel et al. 2009). By simulating chromosomes as polymer chains or rings of various lengths that are either confined to a small volume (Dorier and Stasiak 2009; Visser and Aten 1999; De Nooijer et al. 2009) or equilibrated in a melt of other chromosomes (Vettorel et al. 2009), these studies have observed spatial segregation of chains. Such segregation closely resembles chromosomal segregation observed by optical microscopy (Cremer and Cremer 2010). Moreover, Vettorel et al. (2009) demonstrated that polymer rings equilibrated in a high-density melt have statistical properties resembling those of the fractal globule. Dorier and Stasiak (2009) have shown that topological constraints are more important than excluded volume in inducing spatial segregation of rings. However, the unrealistically short rings and extreme confinement used by Dorier and Stasiak (2009) necessitate further studies of this phenomenon. Rosa and Everaers (2008) examined the equilibrium and kinetics of polymer rings and chains. They report observing robust \( R(s) \sim {s^{1/3}} \) scaling for equilibrated rings and as a transient, long-lived intermediate of confined polymer chains. They note that this scaling is consistent with \( R(s) \sim {s^{0.32}} \) obtained for human chromosome 4 using FISH techniques (Yokota et al. 1995). It was also noted (Rosa and Everaers 2008) that an equilibrium conformation of a compact polymer does not exhibit such self-similar behavior.

In summary, these studies demonstrated that topological constraints, the same ones that lead to formation of the fractal globule, lead to spatial segregation of chromosomes. Segregation in the fractal globule leads to the emergence of “genomic territories” on all scales above some \( N^{*} \). Biologically, this means that any region of the genome folded into a fractal globule is spatially compact, rather than spatially spread. Decondensation and spreading could be caused either by active displacement of crosslinking proteins/RNAs (e.g., during gene activation) or by violation of the topological constraints (e.g., by topoisomerase II or unrepaired double-stranded DNA breaks, see below).

Mixing and crosstalk

Despite the territorial organization created by topological constraints, there is a great deal of interaction between individual regions of the fractal globule. Figure 5 presents two neighboring crumples inside a fractal globule, showing a great deal of interdigitation of the two crumples. In fact, our study demonstrated (Lieberman-Aiden et al. 2009) that the number of interactions M(s) a region of length s has with the rest of the fractal globule scales as

$$ M(s) \sim s \sim {R^3}(s) \sim V(s), $$

i.e., linearly with its volume, rather than its surface area. As we showed analytically, this scaling follows directly from the \( P(s) \sim {s^{ - 1}} \) scaling of the contact probability. This means that individual regions deeply penetrate into each other's volumes (see Fig. 5), rather than touch each other on the surface, as spheres, polyhedra, or other squishy but impenetrable objects would do. In other words, a crumple of a fractal globule has a fixed (independent of its size) fraction of its volume that is involved in interactions.

Fig. 5

Despite having an organized territorial architecture, spatially neighboring regions of the fractal globule (shown in red and blue) have a large number of interactions between them, deeply penetrating into each other's volumes. The number of interactions of a crumple scales linearly with its volume (see Eq. 12). Thus a fixed fraction of a crumple's volume (rather than its surface) is involved in interactions.

Moreover, in the fractal globule the number of contacts between two crumples of lengths \( {s_1} \) and \( {s_2} \) (\( {s_{1,2}} \ll N \)) that are separated by a distance \( l \) along the chain scales as

$$ {M_{1,2}}(l) \sim \frac{{{s_1}{s_2}}}{l} \sim \frac{{{V_1}{V_2}}}{l}. $$

Thus, the number of interactions is proportional to the product of the crumples' volumes. Such penetration means a great deal of possible crosstalk between individual regions of all sizes (loci, chromosomal arms, etc.) despite their spatial segregation. Thus, the fractal globule simultaneously provides two seemingly contradictory features: spatial segregation of genomic regions on all scales and their extensive crosstalk.
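The scaling of the number of contacts between two crumples can be checked numerically by summing the contact probability over all pairs of monomers. The sketch below assumes \( P(d) = 1/d \) with the prefactor set to one, places the first crumple at chain positions 0 to s1 - 1 and the second at positions l to l + s2 - 1, and takes l much larger than both crumple lengths; these conventions and the function name `contacts_between_crumples` are illustrative assumptions rather than definitions taken from the original analysis.

```python
def contacts_between_crumples(s1, s2, l):
    """Sum P(d) = 1/d over all monomer pairs (i in crumple 1, j in crumple 2).

    Crumple 1 occupies chain positions [0, s1); crumple 2 occupies [l, l + s2).
    The prefactor of P(d) is set to 1, so only the scaling is meaningful.
    """
    if l < s1:
        raise ValueError("crumples must not overlap: require l >= s1")
    total = 0.0
    for i in range(s1):
        for j in range(l, l + s2):
            total += 1.0 / (j - i)
    return total

s1, s2, l = 50, 100, 5000
measured = contacts_between_crumples(s1, s2, l)
predicted = s1 * s2 / l  # the s1 * s2 / l scaling from the text

# Doubling the length of one crumple should roughly double the contacts.
doubled = contacts_between_crumples(2 * s1, s2, l)

print(f"pairwise sum: {measured:.4f}, s1*s2/l: {predicted:.4f}")
print(f"ratio after doubling s1: {doubled / measured:.3f}")
```

The agreement holds only while the separation is large compared with the crumple lengths; for neighboring crumples the pairwise sum acquires corrections that this simple estimate ignores.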

Stability of the fractal globule

While providing a number of advantages, the fractal globule is a long-lived intermediate on the way to becoming an equilibrium globule. What are the factors that determine its metastability? How can cells maintain the fractal globule organization of chromatin for a long time?

The original theory of the fractal globule (Grosberg et al. 1988) suggested that (1) the lifetime of the fractal globule is determined by the time (\( \sim {N^3} \)) required to thread the ends of the polymer through the whole globule, allowing the formation of a sufficiently knotted state; and (2) a chain with attached ends (e.g., a loop or a polymer ring) should remain in the fractal globule state. We tested these conjectures by simulations demonstrating that equilibration of the fractal globule is indeed a very slow process (see Fig. 6), with the time exceeding \( \sim {N^3} \). Rosa and Everaers estimate that it would take more than 500 years for a chromosomal fiber to equilibrate (Rosa and Everaers 2008). We also found that, contrary to the second conjecture, a chain with confined ends nevertheless slowly interconverts into an equilibrium globule, while remaining unentangled (Imakaev and Mirny 2010).

Fig. 6

Equilibration of the fractal globule. A series of snapshots obtained at four logarithmically spaced timepoints of long equilibration simulations. Notice the gradual loss of the territorial organization characteristic of the fractal globule, and the increasing mixing that leads to formation of the equilibrium globule. Since the ends of the globule remain attached to the surface while being able to slide on it, the structure remains unentangled. This equilibration is very slow. The details of these simulations will be published elsewhere.

The lifetime of the fractal globule naturally depends on the stringency of the topological constraints. Such constraints can be violated in the cell by the DNA topoisomerase II enzyme (topo II). Topo II cuts both strands of one DNA double helix, passes another unbroken DNA helix through it, and then religates the cut DNA. In doing so, it can knot and unknot DNA (Vologodskii 2009). To test the role of topo II, we performed simulations where occasional strand passing was allowed. These simulations show rapid equilibration of the fractal globule into an equilibrium one. This result suggests that active topo II during interphase could destroy the fractal globule architecture and chromosomal territories, so that some other mechanisms are required for their stabilization.

It was proposed that formation and maintenance of chromosomal territories requires topological constraints (Dorier and Stasiak 2009). Such constraints would be preserved if topo II were unable to act on nucleosomed chromatin fibers. Recent experiments, however, demonstrated that, at least in vitro, topo II is able to act on nucleosomed DNA as efficiently as on naked DNA, reducing its positive supercoiling (Salceda et al. 2006). The ability of topo II to facilitate the passage of two chromatin fibers through each other in vivo, as well as the activity of the topo II enzyme during interphase, remains to be studied.

Stabilization of the fractal globule could involve anchoring as well as reversible and irreversible crosslinking of DNA by proteins or RNA molecules. Simulations show that while reversible crosslinking cannot prevent eventual equilibration, it can significantly slow it down (Imakaev and Mirny 2010).

To manifest in the cell, the fractal globule and topological territories need not be stable indefinitely. They should persist at least for the duration of a single cell cycle, as chromosomal architecture is re-established upon mitosis. Recent photo-activation experiments beautifully demonstrated that chromosomal architecture is maintained for 10–15 h and is completely reset upon mitosis (Strickfaden et al. 2010). Mechanisms that suppress strand passing and otherwise stabilize the fractal globule, as well as the rest of chromatin architecture, during interphase are yet to be established.

Other organisms

The relevance of the fractal globule architecture to chromosome organization in other organisms depends on the space available for chromosomes in the nucleus during interphase, i.e., on the DNA density. The DNA density in a diploid human cell is approximately 6,000 Mb packed in the nuclear volume of \( \approx 300{{\mu }}{{\hbox{m}}^3} \), i.e., \( 20{\hbox{Mb}}/{{\mu }}{{\hbox{m}}^3} \). Baker's yeast, in contrast, has a density of about \( 12{\hbox{Mb}}/3{{\mu }}{{\hbox{m}}^3} = 4{\hbox{Mb}}/{{\mu }}{{\hbox{m}}^3} \). At least a fivefold difference in density could entail very different chromatin architectures. While a significant fraction of the yeast nucleus is occupied by nucleoli that are inaccessible to the chromatin (Therizols et al. 2010), the remaining volume may be almost sufficient for a loose coil formed by an ideal or swollen chain. For example, a long yeast chromosomal arm of 0.5 Mb corresponds to about \( N = 200 \) Kuhn segments (with Kuhn length \( b = 30 \) nm) and has a characteristic size \( R(N) \approx b\sqrt {N} \approx 0.4\,{{\mu }}{\hbox{m}} \) for an ideal chain or \( R(N) \approx b{N^{3/5}} \approx 0.7\,{{\mu }}{\hbox{m}} \) for a swollen one, which easily fits inside the available volume of the yeast nucleus and matches high-resolution FISH measurements (Therizols et al. 2010). Chromosome capture (Ohlsson and Göndör 2007; Van Berkum and Dekker 2009; Lieberman-Aiden et al. 2009; Duan et al. 2010) measurements in yeast (Duan et al. 2010) will provide critical information about the scaling of the contact probability \( {P_c}(s) \), revealing whether chromatin in yeast is packed into a fractal globule or not. Topological constraints in yeast may nevertheless lead to segregation of chromosomes into less pronounced chromosomal territories (Haber and Leung 1996; Berger et al. 2008; Therizols et al. 2010).
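The density and coil-size estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the numbers quoted in the text; the radius of the yeast nucleus is derived under the added assumption of a spherical nucleus, which the text does not state, and all variable names are illustrative.

```python
import math

# Genome content (Mb) and nuclear volume (cubic micrometers) quoted in the text.
human_genome_mb, human_volume_um3 = 6000.0, 300.0
yeast_genome_mb, yeast_volume_um3 = 12.0, 3.0

human_density = human_genome_mb / human_volume_um3  # Mb per cubic micrometer
yeast_density = yeast_genome_mb / yeast_volume_um3
density_ratio = human_density / yeast_density

# A 0.5-Mb yeast chromosomal arm as N = 200 Kuhn segments of length b = 30 nm.
kuhn_length_um = 0.030
n_kuhn = 200
r_ideal_um = kuhn_length_um * math.sqrt(n_kuhn)     # ideal chain, R ~ b N^(1/2)
r_swollen_um = kuhn_length_um * n_kuhn ** (3 / 5)   # swollen chain, R ~ b N^(3/5)

# Assumption not made in the text: treat the yeast nucleus as a sphere.
yeast_radius_um = (3 * yeast_volume_um3 / (4 * math.pi)) ** (1 / 3)

print(f"human: {human_density:.1f} Mb/um^3, yeast: {yeast_density:.1f} Mb/um^3")
print(f"density ratio: {density_ratio:.1f}")
print(f"coil size: {r_ideal_um:.2f} um (ideal) to {r_swollen_um:.2f} um (swollen)")
print(f"yeast nuclear diameter if spherical: {2 * yeast_radius_um:.2f} um")
```

Both coil-size estimates come out well below the diameter obtained from the spherical approximation, in line with the statement that such an arm fits inside the yeast nucleus.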

Recent experiments have shed light on the organization of the bacterial chromosome (Wiggins et al. 2010; Toro and Shapiro 2010). Wiggins et al. (2010) have demonstrated linear organization of the bacterial chromosome in E. coli. The origin of replication was found to be positioned close to the cell center, while the two “arms” extend symmetrically. This study demonstrated that the spatial distance between any locus and the origin scales linearly with the genomic distance between the two. This organization and the fluctuations in loci positions are explained by a mechanical Fluctuating Spring model, which represents the whole DNA-filled nucleoid as a fluctuating elastic filament, rather than resolving how DNA is organized inside the nucleoid. Another statistical model of DNA packing in bacteria (Jun and Wright 2010) is concerned with a potential mechanism of DNA segregation upon cell division, suggesting that chain entropy is sufficient for spontaneous segregation of two DNA chromosomes. A statistical polymer model that can explain the observed linear scaling of spatial and genomic distance has yet to be developed. We conjecture that a fractal globule confined to the elongated geometry of the E. coli nucleoid can exhibit such linear scaling due to segregation of subchains. Again, chromosome capture can provide data complementary to optical measurements, yielding a clearer understanding of the principles that govern folding of the bacterial chromosome.

The fractal globule, topological constraints, and cancer

There are a few interesting connections between the concept of the fractal globule and cancer. From the historic work of Boveri (Boveri 1914) to recent characterization of cancer genomes (International Cancer Genome Consortium 2010), it has been known that cancer cells carry numerous genomic rearrangements. Chromatin structure could play a role in molecular mechanisms involved in formation of genomic rearrangements and influence the distribution of rearrangements observed in cancer.

Recent characterization of somatic copy-number alterations across many human cancers (Beroukhim et al. 2010) has provided a high-resolution map of such events and revealed two classes of rearrangements: global, such as deletions or amplifications of a complete chromosomal arm; and focal, which occur on much smaller scales. The abundance of such events and significant sample-to-sample differences in the patterns of observed alterations suggest that the vast majority of these events are passenger mutations, i.e., random genetic events. Strikingly, the frequency of an alteration (insertion or deletion) of a genomic region of length s scales as

$$ f(s)\sim {s^{ - 1}} $$

for the range of \( 0.1 \leqslant s \leqslant 5\,{\hbox{Mb}} \). This resembles the scaling of the probability of contact between two loci a distance s apart obtained by Hi-C for human chromosomes (Lieberman-Aiden et al. 2009). We conjecture that these two scaling laws are connected: if two loci form a spatial contact, they are more likely to be subject to a recombination/repair event that leads to deletion or amplification of the formed loop. This way, the 1/s scaling in the contact probability leads to the same scaling in the frequency of genomic alterations. Such a connection between chromatin structure and the frequency of chromosomal alterations has not been reported earlier and constitutes another experimentally testable hypothesis that stems from the fractal globule model.
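The statistical content of this conjecture can be illustrated with a toy sampling experiment. The sketch below assumes, purely for illustration, that each alteration removes or amplifies a loop whose length is drawn with probability proportional to \( 1/s \) over the quoted range of 0.1 to 5 Mb; it says nothing about the underlying repair mechanism, and the function names and sample size are hypothetical. A density proportional to \( 1/s \) is uniform in \( \log s \), so logarithmically spaced bins should receive roughly equal counts.

```python
import math
import random

def sample_loop_length(s_min, s_max, rng):
    """Inverse-transform sample from a density proportional to 1/s on [s_min, s_max)."""
    return s_min * (s_max / s_min) ** rng.random()

def log_bin_counts(samples, s_min, s_max, n_bins):
    """Count samples in n_bins bins of equal width in log(s)."""
    counts = [0] * n_bins
    log_range = math.log(s_max / s_min)
    for s in samples:
        k = int(n_bins * math.log(s / s_min) / log_range)
        counts[min(k, n_bins - 1)] += 1
    return counts

rng = random.Random(0)  # fixed seed so the run is reproducible
s_min_mb, s_max_mb = 0.1, 5.0
n_samples, n_bins = 100_000, 5

samples = [sample_loop_length(s_min_mb, s_max_mb, rng) for _ in range(n_samples)]
counts = log_bin_counts(samples, s_min_mb, s_max_mb, n_bins)

expected = n_samples / n_bins
for k, c in enumerate(counts):
    print(f"log-bin {k}: {c} events (expected about {expected:.0f})")
```

Comparing observed alteration lengths against this flat log-binned profile is one simple way to check for \( 1/s \) behavior, although real data would also require correcting for detection limits at short and long lengths.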

Another interesting connection between the fractal globule and cancer stems from the fact that double-stranded DNA breaks can lead to strand passing and hence to violation of topological constraints. Double-stranded breaks are widespread in certain forms of cancer and are produced by deficiencies of repair and recombination machineries (Weinberg 2007). Topological constraints, on the other hand, are central for the maintenance of the fractal globule and chromosomal territories. Abundant double-stranded breaks are likely to cause partial opening of domains folded into fractal globules, leading to some degree of chromosome decondensation. Note that if the equilibrium globule were the state of the chromatin, double-stranded breaks would have little effect, since the highly knotted conformation of the equilibrium globule constrains motion of the fiber. Consistent with these conjectures are experimental findings of local chromatin decondensation at the sites of double-stranded breaks (Kruhlak et al. 2006) and global chromatin decondensation upon malignant transformation (Ye et al. 2001). Such decondensation, in turn, can help cancer cells to reverse chromatin condensation and gene silencing associated with cell differentiation (Weinberg 2007). One provocative hypothesis is that cancer cells may use double-stranded breaks to support the dedifferentiation process.

Double-stranded breaks can also lead to faster equilibration of the globule and melting of the boundaries of chromosomal territories. Note that the Hi-C data (Lieberman-Aiden et al. 2009) discussed above were obtained for a karyotypically normal lymphoblastoid cell line (GM06990) and a leukemia cell line (K562), both showing contact probabilities characteristic of the fractal globule. Further chromosome capture and fluorescence microscopy experiments on cells subjected to different levels of double-stranded break induction could test these predictions.

Summary and outlook

Introduced about 20 years ago and proposed then as a model for DNA packing (Grosberg et al. 1988; Grosberg et al. 1993), the concept of the fractal globule is an attractive model of chromatin organization during interphase in human cells. It is the only statistical polymer model that is consistent with both chromosome conformational capture data and FISH scaling: it delivers the experimentally observed \( P(s) \sim {s^{ - 1}} \) scaling (Lieberman-Aiden et al. 2009) and provides scaling of the end-to-end distance close to the FISH scaling of \( R(s) \sim {s^{0.32}} \) (Rosa and Everaers 2008; Yokota et al. 1995). The span of genomic lengths over which the fractal globule persists has yet to be established, as chromosome capture data fit the fractal globule for \( 0.1 \lesssim s \lesssim 10\,{\hbox{Mb}} \), while FISH data show close to 1/3 scaling on longer scales, \( s \gtrsim 10\,{\hbox{Mb}} \) (Rosa and Everaers 2008). High-resolution single-molecule single-cell microscopy methods may be able to overcome current limitations of the FISH method caused, in part, by significant cell-to-cell variability of spatial distances.

Several biophysical properties of the fractal globule make it a particularly appealing model of chromatin organization.

  • The fractal globule forms spontaneously upon chromatin condensation, due to topological constraints, and is able to maintain its topological state for a long time.

  • By virtue of being largely unknotted, any region of the fractal globule can easily and rapidly unfold and translocate (Fig. 4), becoming accessible to transcriptional and other protein machinery of the cell.

  • Folding into the fractal globule leads to formation of genomic territories (Fig. 2), i.e., a conformation where any specific genomic locus is folded into compact crumples (Fig. 3), and distinct loci occupy distinct spatial locations. Despite this territorial organization, folded loci form a very large number of interactions with each other (Fig. 5, with the number of interactions proportional to the volumes of interacting crumples). When expanded to the scale of whole chromosomes, these features of the fractal globule correspond to chromosomal territories and suggest extensive crosstalk between the chromosomes.

  • The fractal globule is a long-lived intermediate that gradually converts into an equilibrium globule (Fig. 6), which lacks many of the properties of the fractal globule and is not consistent with the experimental data. Activity of the topo II enzyme significantly accelerates this process, while crosslinking between remote chromosomal loci slows it down. Mechanisms that help to maintain the fractal globule state are yet to be found.

Many of these properties, predicted theoretically and observed in simulations, can be tested experimentally and can help to better characterize the state of the chromatin inside a cell. For example, genomic territorial organization can be tested using high-resolution optical microscopy by methods like PALM or STORM (Betzig et al. 2006; Rust et al. 2006).

The role of the topo II enzyme in the organization of interphase chromosomes is intriguing. Its ability to facilitate passage of nucleosomed chromosomal fibers, thus violating topological constraints, can be further studied experimentally. Similarly, stability of chromosomal/genomic territories despite the activity of the topo II enzyme in vivo can be assayed by induced topo II overexpression. Chromatin-pulling experiments (Marko 2008) can help to test the degree of DNA entanglement and to characterize contributions of topological constraints and crosslinking by proteins to the folded state. These questions are central to understanding the role of topological constraints in the formation and support of chromosomal organization during interphase (Rosa and Everaers 2008; Dorier and Stasiak 2009; De Nooijer et al. 2009; Vettorel et al. 2009; Lieberman-Aiden et al. 2009). Chromosome-capture methods (Dekker 2008) can reveal how chromatin is organized in different organisms, in different tissues, and at different stages of the cell cycle, and can be used to observe the evolution of its structural state upon differentiation or malignant transformation.

Solving a precise structure of chromatin akin to the structure of a folded protein may not be feasible, as chromatin structures can differ significantly from cell to cell. However, approaches based on the statistical physics of polymers and high-quality experimental measurements can help characterize the state of the chromatin as a conformational ensemble, revealing basic organizing principles behind chromatin folding and dynamics.