Long-range chromatin interactions can occur over many megabases, either between regions of the same chromosome (cis) or between different chromosomes (trans). Many chromatin clustering events involve preferential interactions between genomic loci and are cell type specific, indicating a functional role of genome organization in regulating gene expression. Many mechanisms are involved in establishing global organization, including transcription by specific sets of transcription factors or gene repression among similar epigenetically marked domains. Here, we discuss several examples of specific spatial organization patterns from transcriptionally active and silent chromatin and the potential mechanisms involved in their establishment.

Long-range chromatin interactions influence function

A growing number of specific long-range chromatin interactions have been identified, indicating that the three-dimensional organization of chromatin within the nucleus is not random. These interactions have been found using tools such as RNA and DNA fluorescence in situ hybridization (FISH) and the chromatin proximity-ligation assay chromosome conformation capture (3C) and its derivatives [1]. In 3C, genomic regions in spatial proximity are cross-linked and digested with a restriction enzyme while in the nucleus. After nuclear lysis, the cross-linked chromatin complexes are diluted and ligated such that ends of restriction fragments in the same cross-linked complex form novel ligation junctions that can be detected by various methods. One of the best known and studied long-range interactions occurs between the erythroid-specific β-globin gene and its long-range enhancer, the distal locus control region (LCR). The mammalian β-globin LCR consists of five DNase I hypersensitive sites (HS1-HS5) distributed over 15 kb, located approximately 50 kb upstream of the β-globin gene. The LCR regulates β-globin gene transcription during erythroid development by physically interacting with the β-globin gene, leaving the intervening 50 kb of DNA looped out [2, 3] (Figure 1a). Deletion of the LCR, or ablation of specific transcription factors or cofactors required for the interaction, leads to dramatic decreases in β-globin gene transcription levels, highlighting the functional significance of the interaction [4-8].
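The logic of the 3C readout described above can be illustrated with a toy calculation. The following is a minimal sketch, not an analysis pipeline: it assumes each cross-linked complex is simply a list of restriction fragment names, and it counts every possible pairing within a complex as one ligation junction (in a real experiment each fragment end ligates to only one partner). The function name `count_ligation_junctions` and the fragment labels are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import combinations


def count_ligation_junctions(complexes):
    """Count fragment pairs that could form a junction in each complex.

    complexes: iterable of lists of fragment names, one list per
    cross-linked complex (a simplifying assumption for this sketch).
    """
    counts = Counter()
    for fragments in complexes:
        # Only fragments held in the same cross-linked complex can be
        # ligated to each other after dilution.
        for a, b in combinations(sorted(set(fragments)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical complexes; the names are illustrative, not measured data.
complexes = [
    ["LCR", "Hbb"],
    ["LCR", "Hbb", "geneX"],
    ["geneX", "geneY"],
]
junctions = count_ligation_junctions(complexes)
print(junctions[("Hbb", "LCR")])
```

In this toy input the LCR and Hbb fragments share two complexes, so their junction count is higher than that of any other pair, which is the kind of enrichment that 3C-based methods interpret as spatial proximity.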

Figure 1

Intra- and inter-chromosomal interactions. (a) The β-globin gene, located approximately 50 kb downstream of the locus control region (LCR), is activated during erythropoiesis. The β-globin gene interaction with the LCR ensures high and efficient β-globin transcription, with the intervening sequence looping out. (b) Naïve T cells show a trans association between the TH2 LCR, on chromosome 11, and the IFN-γ promoter, on chromosome 10. This interaction is lost in favor of specific intra-chromosomal interactions following differentiation into TH1 or TH2 effector cells.

Long-range interactions are also required for V(D)J recombination of T cell receptor and immunoglobulin genes in T cells and B cells. V(D)J recombination involves the selection of one of each gene from the V, D and J gene families of the immunoglobulin gene locus. A single V gene is selected from over 190 different V genes distributed over 2.5 Mb and is brought into close spatial proximity and physically linked to a previously recombined (D)J gene, creating a functional immunoglobulin gene [9]. These findings show that chromatin or genes distally arranged on the same chromosome can interact in close physical proximity in three-dimensional space.

Interchromosomal or trans interactions have also been proposed to regulate gene activity. In murine naïve T cells the T helper cell 2 (TH2) LCR on chromosome 11 interacts with the interferon-γ (IFN-γ) promoter located on chromosome 10 [10, 11]. Following differentiation to effector TH1 or TH2 cells, these trans interactions are lost in favor of cis interactions: TH1 cells have interactions between the IFN-γ promoter and regulatory elements located upstream to promote high levels of IFN-γ expression, whereas in TH2 cells the TH2 LCR interacts with three nearby interleukin (IL) genes, IL-4, IL-5 and IL-13, to enhance their expression (Figure 1b). In another example, the H19 imprinting control region, located on chromosome 7 in mice, drives the silencing of the maternally inherited insulin-like growth factor 2 (Igf2) allele and has been shown to interact in trans with up to four different chromosomes in embryonic tissue [12].

In the examples of the TH2 LCR and H19 imprinting control region mentioned above, deletion of genetic elements on one chromosome affected the expression of interacting genes on other chromosomes, indicating the functional significance of interchromosomal interactions. In contrast, conflicting reports surround the function of the mouse homology (H) enhancer, which engages in cis and trans interactions with odorant receptor genes. The H enhancer is located within the MOR28 odorant receptor gene cluster on mouse chromosome 14, while other odorant receptor gene clusters are scattered on multiple chromosomes. It has been proposed [13] that the choice of expression of a single mouse odorant receptor gene in a sensory neuron is determined by an interaction in cis or trans between the H enhancer and a single odorant receptor gene. However, two later reports [14, 15] showed that deletion of the H enhancer abolished expression of three flanking odorant receptor genes in the MOR28 cluster with no demonstrable effect on odorant receptor gene expression in trans.

Trans interactions may also be indirectly linked to diseases resulting from chromosomal translocations [16]. The Myc and IgH loci (encoding a transcription factor and an immunoglobulin, respectively), which are located on different mouse chromosomes, are frequent breakpoints in chromosomal translocations, in which two different chromosomes are fused together through inappropriate DNA repair. In mouse B cells, Myc and IgH are found in close proximity in the nucleus only when transcribed, suggesting that transcriptional organization could affect their frequency of translocation [17]. This finding is analogous to recent data indicating that, for androgen-receptor-regulated genes, a combination of irradiation-induced DNA breakage and transcription-induced proximity synergistically increases their chromosomal translocation frequency [18].

Architecture of association

Examination of nucleolar structure and function provides some of the first evidence for how clustering of specific genes in three-dimensional space could be achieved. Nucleoli are assembled through association of the nucleolar organizing regions (NORs) and various nucleolar proteins. Each of the five human NORs is composed of many tandemly repeated rRNA genes located on the acrocentric chromosomes 13, 14, 15, 21 and 22 (Figure 2). As cells exit mitosis, NORs are bound by the essential transcription protein upstream binding factor (UBF) [19] and coalesce into between one and four nucleolar structures. The NORs that are transcriptionally quiescent are not bound by UBF and are excluded from nucleoli, indicating that this transcription factor may be fundamental in the organization of these structures [20]. Transcription is also fundamental to the organization of nucleoli. Inhibition of the nucleolar RNA polymerase (RNAPI) with actinomycin D (which intercalates into DNA that is being transcribed and immobilizes the polymerase) results in the formation of 'mini-nucleoli' when cells exit mitosis [21]. Mini-nucleoli contain NORs, but other nucleolar components are distributed to discrete structures, or 'caps', on the mini-nucleolar surface. Removal of actinomycin D and the initiation of RNAPI transcription restores nucleolar morphology, showing that transcription itself has an important role in the organization of nuclear architecture. The nucleolus may represent the first observed specialized 'transcription factory' that can form a trans interaction network with a specific function.

Figure 2

NORs cluster as cells exit mitosis. (a) The short arms of acrocentric chromosomes 13, 14, 15, 21 and 22 contain NORs, which are separated during mitosis. (b) As cells exit mitosis and the nuclear membrane begins to reform, chromosomes begin to decondense. (c) Loops of chromatin may extend away from the core of the territory. (d) As G1 phase is established and nucleoli form, loops of NOR-containing chromatin co-associate with the other components of the nucleolus and ribosomal DNA gene transcription is initiated.

RNA polymerase II (RNAPII)-transcribed genes, which represent the majority of protein coding genes, also engage in long-range transcription-dependent associations [22, 23]. Transcriptionally active genes, such as those genes involved with globin synthesis and regulation, have been shown to colocalize with shared RNAPII foci [22, 24] (Figure 3a). Co-regulated genes in cis and in trans share RNAPII foci with each other at higher frequencies than they do with other transcribed genes, suggesting the presence of large-scale transcription networks [24]. These preferential interactions occur at nuclear subcompartments containing high local concentrations of hyperphosphorylated RNAPII, called transcription factories. Described as protein rich structures of about 10 MDa with an average diameter of about 87 nm, transcription factories contain multiple active RNAPII complexes at one time [25-27]. Gene interactions at transcription factories rely on active transcription: heat-shock treatment, which blocks initiation and elongation, resulted in release of genes from factories and disruption of their long-range associations [23]. Treatment with 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole (DRB), which interferes with phosphorylation of the carboxy-terminal domain of RNAPII and thus inhibits transcriptional elongation but not initiation, did not affect the frequency of gene co-associations [23]. Transcription initiation is therefore critical for the long-range association of genes that are being transcribed. Transcription factories remained after heat shock, consistent with previous results suggesting that factories are meta-stable structures [28]. These findings indicate that the structure and function of transcription factories are fundamental to long-range interactions between genes being transcribed.

Figure 3

Colocalization of like-regulated genes and specialized transcription factories. (a) Quadruple-label RNA immuno-FISH of three genes that are being transcribed and their association with RNAPII transcription factories. RNAPII staining is shown on the left and an overlay of the RNAPII staining showing the contributions of the genes is on the right. The side panels show the enlarged images of colocalizing FISH signals, showing that transcription factories can simultaneously transcribe at least three genes, located on different chromosomes. (b) Immunofluorescence detection of Klf1 (red) and RNAPII transcription factories (green), showing the selective and specialized nature of transcription factories. (c) Triple-label RNA immuno-FISH for Hbb and Epb4.9, showing association of these genes at Klf1 foci. All images show definitive erythroid cells and the scale bar in each panel represents 2 μm. Reproduced, with permission, from [24].

Gene clustering through specialized transcription factories

The idea of transcription factories being specialized to transcribe a specific subset of genes in order to achieve high-level gene transcription seems reasonable, because no two regions within the nucleus will contain the same genes or proteins. Early investigations in human cells into the spatial distribution of certain transcription factors (glucocorticoid receptor, Oct1 and E2f-1) revealed only a slight overlap with RNAPII and sites of transcription [29, 30], which the authors interpreted as evidence against transcription factory specialization. Contrary to this, the Oct1/PTF/transcription (OPT) domain was the first example of a nuclear compartment to be shown to contain high concentrations of interacting transcription factors (PTF1 and Oct1) at a transcription factory, which specifically recruited regions from human chromosomes 6 or 7 in early G1 phase [31]. This suggests that specialization of transcription factories could provide a level of control over genome organization by encouraging specific genes to reside in the same factory. This, along with other studies, gives strong evidence in favor of transcription factory specialization. Examination of cotransfected plasmids in COS7 monkey cells showed that constructs with identical promoters colocalized to the same transcription factory to a higher degree than those with heterologous promoters [32]. Furthermore, the finding that the erythroid transcription factor Klf1 mediates preferential co-associations of Klf1-regulated genes at Klf1-specialized transcription factories provided the first functional evidence that transcription factors could be responsible for the organization of a specific subset of genes at transcription factories [24] (Figure 3b,c).

Despite recent demonstrations of spatial clustering in three dimensions by 3C-based methods and RNA and DNA FISH [12, 24, 33, 34], it is still unclear whether association influences gene transcriptional output. Hu et al. [35] noted the appearance of larger RNA FISH signals in primary human breast epithelial cells from spatially associated genes induced by estrogen receptor (ER)α, suggesting increased transcriptional output from clustered alleles. In addition, long-range association of transcription factor binding sites or co-regulated genes correlated with an increased probability that the clustered alleles were transcriptionally active [24, 36].

Spatial organization of silent chromatin

There are obvious potential incentives to cluster specific genes and chromatin regions. For example, clustering of co-regulated genes in specialized factories may be more efficient in terms of the machinery needed for their expression. The clustering of silent chromatin in the nucleus could also decrease the amount of machinery needed for maintenance. Indeed, heterochromatin has long been observed to form clusters that are distinct from euchromatin within the nucleoplasm. For example, centromeres cluster into chromocenters, visualized by staining with the DNA stain 4',6-diamidino-2-phenylindole (DAPI) or immuno-labeling of centromeric proteins. Clustering of centromeres is unusually pronounced in rodent rod cells, where these regions are gathered in the center of the nucleus surrounded by heterochromatin, which is suggested to reduce diffraction and permit more efficient passing of photons [37]. This clustering demonstrates an extraordinary spatial organization of chromatin for a specific function. Silenced genes have also been observed clustering with pericentromeric heterochromatin [38]. For example, the non-functional, rearranged IgH locus is recruited to centromeres concurrent with transcriptional silencing of its V genes in B cells [39, 40]. This relocalization correlates with dramatic deacetylation of the locus [41], but it is currently unclear whether this deacetylation occurs before or after localization to chromocenters. Telomeres are regions of transcriptionally silent chromatin and have been reported to cluster throughout the nucleoplasm [42]. However, human telomeres with NORs located in their short acrocentric arms cluster separately at the perinucleolar compartment [43], again highlighting spatial localization.

Chromatin clustering may also be mediated through long non-coding RNAs (lncRNAs) such as Xist, Air and Kcnq1ot1, which range in size from 17 to 108 kb. The most studied of these lncRNAs is Xist. Transcription of Xist [43, 44] from one of the two X chromosomes results in the inactivation of that X chromosome in female mammals. The Xist RNA (about 17 kb in length) interacts with the future inactive X chromosome to create a nuclear domain devoid of RNAPII and basal transcription factors such as TFIIH and TFIIF. X-linked genes are recruited into this nuclear domain, correlating with their transcriptional silencing [45]. This internal repositioning of previously active genes is the first structural change following Xist accumulation. Intriguingly, genes that escape X-inactivation are located on the periphery of, or outside the Xist domain [45], presumably interacting with RNAPII and various transcription factors.

lncRNAs have also been implicated in the regulation of imprinted gene clusters. Imprinted genes show effects specific to the parent of origin, in which a single allele (maternal or paternal) is epigenetically silenced during development. Imprinted repression of a selected allele may occur by a mechanism similar to that of Xist. For example, the murine Air (antisense to Igf2r) lncRNA is essential for imprinted allele-specific silencing of the cis-linked solute carrier genes Slc22a3 and Slc22a2 together with Igf2r from the paternal chromosome 17 [46]. The Air RNA forms a cloud within nuclei and interacts, by an unknown mechanism, with the Slc22a3 promoter. Air is also required to target the histone H3 lysine 9 histone methyltransferase G9a to the Slc22a3 promoter [47]. It seems plausible that the Air cloud recruits specific genes into the volume it occupies to induce silencing. Unlike Xist, which induces silencing over the entire X chromosome, Air's influence is restricted to a cluster of genes spanning a 300 kb region immediately adjacent to the Air gene. The structural aspects of how Air functions and what restricts the size of the Air compartment remain unclear. This effect is mirrored by the Kcnq1ot1 lncRNA, which also seems to create a repressive domain that is responsible for repression of a variable number of cis-linked genes in embryonic and placental tissues [48-51]. Kcnq1ot1 is an imprinted 50 kb lncRNA transcribed in the antisense direction from within the potassium voltage-gated channel gene, Kcnq1, on mouse chromosome 7. The Kcnq1ot1 repressive domain is larger in placental tissue than in embryonic tissue, and this may be correlated with a higher number of silenced genes in the placenta [49, 50].

lncRNA repression may also occur in trans. The 2.2 kb HOTAIR ncRNA, expressed from the HOXC locus on chromosome 12 in humans, has been shown to be necessary for repression of the HOXD locus, present on chromosome 2 [52]. Although loss of the HOTAIR lncRNA results in the reactivation of the HOXD locus, indicating a potential trans mechanism of gene repression [52], no direct interaction between HOTAIR and the HOXD locus has been observed.

Establishing spatial organization

Spatial genome organization implies movement. The tissue-specific clustering of specific genomic elements requires that at some stage chromatin regions must move towards each other, in either a directed or a passive way. As cells exit mitosis and chromosomes decondense, large-scale movements of chromatin domains have been observed [53, 54]; these may result in the repositioning of chromosomal and sub-chromosomal regions to their generalized relative positions. Constrained diffusion [55] or chromatin movements mediated by nuclear actin and myosin [35, 56-58] may have a role in refining these positions throughout interphase (Figure 4).

Figure 4

Schematic summary of some of the processes and structures that influence the spatial organization of the genome. Although not exhaustive, the figure depicts: (a) chromosome territories; (b) nucleoli and genomic regions clustering through nucleolar organizing regions (NOR); (c) the X chromosome and Xist RNA; (d) regulatory proteins such as CTCF, transcription factors and Polycomb repressive complexes (PRCs) that can induce loops between genomic elements; (e) transcription factories (blue) and specialized transcription factories (red); (f) the potential role of nuclear actin in mediating long-range chromatin movement; and (g) the interactions of chromatin regions with the nuclear lamina. These processes, along with others described in this article and many more, are likely to be important in dynamically shaping the spatial environment and organization of the genome.

The organization of the genome as it is transcribed is achieved to a large extent through interactions of genes with transcription factories. Although it is not known how factories form, the pulsatile nature of individual gene transcription during interphase [59, 60], which seems to involve dynamic gene associations with factories [17, 22], suggests two possible models to describe how specialized factories are established. In a deterministic factory model, specific key transcription factors (such as Klf1) are directed to or become concentrated at a subset of factories. Genes requiring that particular factor for transcription would then need to move to those factories to become active. In the second model, referred to as the self-organization model, genes and their bound regulatory factors stochastically engage factories in their local environment. Specialization may occur when several similarly regulated genes associate with the same factory simultaneously. This may stabilize their presence at the shared factory through factor sharing: the increased local concentration of specific regulatory factors may increase occupancy at key regulatory sites on the clustered genes, thus promoting their reinitiation and stabilizing their co-association. There is little evidence in favor of either model at the moment. The deterministic model requires some mechanism to direct specific factors to a subset of factories, suggesting that differences in factories must precede their specialization. In the self-organization model, all factories may start out being equal but then may become specialized, perhaps transiently, by virtue of the character of the transcription units engaged there.
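The self-organization model described above can be caricatured as a simple stochastic process. The sketch below is purely illustrative and is not drawn from any of the cited studies: it assumes that all genes are co-regulated, that a gene's probability of remaining at its current factory rises linearly with the number of other genes sharing that factory (a stand-in for factor sharing), and that a released gene re-engages a factory chosen at random. The function name and every parameter (`n_genes`, `n_factories`, `steps`, `base_stay`, `sharing_bonus`, `seed`) are hypothetical.

```python
import random


def simulate_self_organization(n_genes=6, n_factories=3, steps=200,
                               base_stay=0.5, sharing_bonus=0.2, seed=0):
    """Toy model: co-regulated genes stochastically engage factories.

    Returns a list giving the factory index occupied by each gene
    after the final step. All rates are arbitrary assumptions.
    """
    rng = random.Random(seed)
    # Initially, every gene engages a randomly chosen factory.
    location = [rng.randrange(n_factories) for _ in range(n_genes)]
    for _ in range(steps):
        gene = rng.randrange(n_genes)
        # Count other genes currently sharing this gene's factory.
        neighbours = sum(
            1 for other, factory in enumerate(location)
            if other != gene and factory == location[gene]
        )
        # Assumed factor sharing: more neighbours, more likely to stay.
        p_stay = min(1.0, base_stay + sharing_bonus * neighbours)
        if rng.random() >= p_stay:
            # The gene is released and engages a random factory.
            location[gene] = rng.randrange(n_factories)
    return location


final = simulate_self_organization(seed=1)
print(final)
```

Under these assumptions, no factory is special at the start; any clustering in the final assignment arises only from the stabilizing effect of shared occupancy. A deterministic-factory variant would instead mark some factories as factor-rich before the simulation begins.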

Evidence in favor of the self-organization model can be seen in a population of virally infected cells: the quickest cells to respond by producing IFN-β are those in which the IFN-β gene is in close physical proximity with other genetic loci that bind the NF-κB transcription factor [36]. NF-κB induces the formation of the enhanceosome multiprotein complex, which binds upstream of the IFN-β promoter and interacts with the transcriptional machinery necessary for the induction of the IFN-β gene. The formation of the enhanceosome at the IFN-β promoter is more likely to occur if one NF-κB-dependent gene is in close physical proximity to another NF-κB-dependent gene, thereby enabling these loci to establish an environment that favors transcription [36]. This supports a role for transcription factors mediating chromosomal interactions specific for the tissue and stimulus involved. Such transcriptional organization of genes may also be mediated by other proteins that are not part of the core transcriptional apparatus, such as the CCCTC-binding factor (CTCF) and Polycomb repressive complexes (PRCs).

Some proteins may have a structural role in maintenance of genome conformation. CTCF is a highly conserved vertebrate transcription regulator that has been reported to bind at many thousands of sites in multiple genomes [61-65]. This binding does not seem to correlate with specific networks of genes, but CTCF has been suggested to mediate chromatin interactomes [66]. Indeed, CTCF binding has been suggested to silence the maternally inherited Igf2 allele [67], form active chromatin hubs [68], and establish cytokine-induced loops within the human MHC class II locus [69]. Furthermore, CTCF interacts with a large number of nuclear proteins ranging from transcription factors to structural proteins [70]. Cohesin, which is a key component for holding sister chromatids together and which is implicated in several diseases, has been shown to bind to about 70% of all CTCF sites in the human genome [71]. Specifically, CTCF mediates cohesin binding [72], and this interaction has been suggested to impart cell-type-specific intrachromosomal interactions at the developmentally regulated human cytokine locus IFN-γ [72] and the apolipoprotein A1/C3/A4/A5 gene region on human chromosome 11 [73]. These processes suggest a multifunctional role of CTCF in the organization of the genome, adding another organizational layer of complexity.

Repressive domains and complexes may also provide a structural component for establishing long-range interactions and organizing the genome. For example, genome-wide studies have revealed that PRCs associate with promoter regions of some developmentally regulated and silenced genes [74, 75]. Evidence to support long-range interactions through PRCs comes from studies investigating Polycomb response elements (PREs), which allow the recruitment of PRCs to target genes through DNA binding proteins [76]. Fab-7 is a Drosophila regulatory element containing a PRE that contributes to regulated spatial transcription of the Abdominal-B gene of the Drosophila bithorax complex [77, 78]. The endogenous Fab-7 PRE has been shown to interact with transgenic Fab-7 elements inserted at heterologous sites [79], highlighting specific long-range PRE-mediated chromatin interactions. Similarly, Mcp, another PRE-containing regulatory element from the Drosophila bithorax complex, can interact with other remote copies of Mcp elements in the genome [80]. These results provided direct evidence that regulatory elements can promote sequence-specific long-range chromosomal interactions, suggesting that PRCs are likely to provide another mechanism for organizing the genome.

Recently, the roles of nuclear actin and myosin have generated considerable interest in the organization of the mammalian genome. Data strongly indicate that nuclear actin is involved in gene transcription by all three polymerases [81]. Long-range directed interphase chromatin movement seems to require actin polymerization, as the expression of mutant actin that cannot polymerize prevents chromatin relocation [56, 57]. Nuclear actin and nuclear myosin I have also been implicated in mediating interchromosomal interactions between the ERα-dependent genes [35] and in repositioning of selected chromosomes during serum starvation [58].

Spatial organization and the future

Here, we have focused on the relationships between transcription, silencing and the three-dimensional organization of the genome (Figure 4). This is at the expense of other structures that also contribute to the genome's organization, such as the nuclear lamina [82, 83]. In summary, it is apparent that the genome is arranged in a non-random, cell- and tissue-specific manner that is suited for various nuclear functions. Highly expressed housekeeping genes are often organized in the linear genome in RIDGES (regions of increased gene expression) [84], but linear clustering of tissue-specific genes is not evident [85]. Although clustering of housekeeping genes may be favored in a two-dimensional arrangement along the chromosome, clustering of tissue-specific genes is evident only in three dimensions across the nucleus [12, 24, 33], presumably reflecting transcriptional and other regulatory requirements. It is clear that the local folding of chromatin, for example between a gene and long-range enhancer or between PREs, is a critical determinant of gene expression. The way these regions interact with other regions of the same chromosome, some of which may be similarly regulated, also seems to be important for function. Similarly, the way these chromosomal regions interact with regions on other chromosomes will undoubtedly affect spatial genome organization, but it may also be important in contributing to tissue-specific gene expression programs. It is likely that three-dimensional organization is an important missing link in understanding how the genome is regulated; unraveling this organization is a major challenge for the future.