Introduction

In placental mammals, individuals carrying an X and a Y chromosome develop as males, whereas XX animals develop as females. The Y chromosome contains only a small number of genes, most of them male-specific. The X chromosome, in contrast, contains around 1,000 genes, posing an enormous copy number imbalance between the sexes. The potentially detrimental effects of copy number imbalances are evidenced by autosomal copy number changes, which invariably result in embryonic lethality or severe developmental defects. Nevertheless, the twofold difference in X chromosome number between males and females is part of normal development, and inheritance of aberrant X copy numbers (as in e.g. XXX or XO females, or XXY males) results in phenotypes that are relatively mild compared with autosomal aneuploidy. The reason for this exceptional behavior of X chromosomes is that, while an X- and Y-chromosomal system evolved to discriminate between the sexes, dosage compensation mechanisms co-evolved to counteract the detrimental effects of the associated copy number variations in hundreds of X chromosomal genes.

Dosage compensation is a mechanism that corrects for the sex-chromosomal dosage differences between the sexes. In 1961, Mary Lyon was one of the first to suggest that dosage compensation in mice occurs by genetic inactivation of one of the two X chromosomes in female cells (Lyon 1961). Earlier studies had reported that nerve cell nuclei from female cats contain one structurally distinct chromosome, visible as a dense heterochromatic region and also known as the Barr body (Barr and Bertram 1949). Other experiments revealed that in female rat liver cells, the Barr body represents one X chromosome whereas the other appears euchromatic like the autosomes (Ohno et al. 1959). Furthermore, mice with a single X chromosome (XO) were found to be phenotypically normal, suggesting that one X is sufficient for normal viability (Welshons and Russell 1959). Lyon suggested that one of the two X chromosomes in female mice is subject to genetically programmed, random inactivation. This theory was supported by the mosaic appearance of female mice that are heterozygous for an X-linked fur color gene: random inactivation of one X chromosome in each cell in the early embryo, followed by clonal expansion, accounts for this observation (Lyon 1961).

X chromosome inactivation (XCI) in females thus leads to similar transcription levels of X-chromosomal genes between males and females, who now both express genes from a single X chromosome. However, not all genes on the X are inactivated: genes in the pseudo-autosomal region (PAR), the region of the X homologous to the Y and responsible for XY-pairing during meiosis, as well as a fair number of individual genes on the X are not inactivated. The latter genes are called escapers and it has been estimated that 15–20% of human X-linked genes completely escape inactivation, and another 10% escape partially (Carrel and Willard 2005). PAR genes are expressed from two copies in both males and females, whereas escapers that lack a functional Y homolog are differentially expressed between the sexes. The PAR genes together with the escapers likely account for the phenotypes observed in for example XO and XXX females.

XCI occurs in all marsupials and placental mammals during early development. Interestingly, XCI is not the only solution to compensate for sex chromosome dosage differences; other species have developed completely different approaches to solve the same problem (Fig. 1a–d). In the fruit fly Drosophila melanogaster, male (XY) individuals increase expression of their single X chromosome twofold to meet expression levels of their female (XX) counterparts (Gelbart and Kuroda 2009). In the nematode Caenorhabditis elegans, males have a single X chromosome (XO) and XX individuals are hermaphrodites. Here, XX hermaphrodites reduce transcription levels from both X chromosomes by half to achieve similar transcription levels as in XO males (Meyer et al. 2004). Despite the different approaches, mice, worms and flies have in common that in one sex, specialized, X-specific complexes (dosage compensation complex, DCC) composed of RNA and/or protein target an entire chromosome for stable and inheritable changes in transcription levels through epigenetic modifications (Fig. 1a–d).

Fig. 1

Mammalian, Caenorhabditis elegans and Drosophila melanogaster dosage compensation. a Table comparing features of dosage compensation between the three species. Proteins indicated with an asterisk are not specific for dosage compensation and are also a part of other complexes or cellular processes. b–d Images showing X chromosome-wide localization of dosage compensation components. b Mouse differentiating ES cell; the inactive X chromosome is coated by Xist RNA (green; RNA FISH for Xist). c Both X chromosomes in a C. elegans embryonic hermaphrodite nucleus are bound by the dosage compensation complex (DCC) (green; DCC component DPY-27, courtesy of Te Wen Lo and Barbara J. Meyer). d Male Drosophila cell with polytene chromosomes, showing the DCC targeting the X chromosome in green (green; DCC component MSL2, courtesy of Ina Dahlsveen and Peter Becker)

In this review, we discuss what is currently known about the initiation and establishment of XCI in placental mammals. We focus on the role of Xist, XCI’s central player, and briefly discuss how knowledge from invertebrate species may help to gain new insight in mammalian dosage compensation.

XCI initiation

In mice, XCI is initiated in the early embryo in two rounds. At an early developmental stage, around the 4- to 8-cell stage, the paternal X chromosome is inactivated in all cells of the developing embryo (imprinted XCI) (Huynh and Lee 2003; Okamoto et al. 2004). Later in development, this chromosome becomes reactivated in the inner cell mass, but remains inactive in the extra-embryonic tissues. A second round of XCI then occurs in the developing embryo proper around embryonic day 5.5. In inbred mouse strains, the choice of the X to be inactivated is random this time; the paternal and maternal X chromosomes now have equal chances of becoming inactivated (random XCI). In interspecies crosses, preferred inactivation of either the paternal or maternal X chromosome (skewing) may occur. Once the choice of the X chromosome to inactivate has been made, the inactive X is propagated clonally to daughter cells.
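
The combination of random choice and clonal propagation described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the labels `Xm` and `Xp` (maternal and paternal X), the number of founder cells and the number of divisions are hypothetical values chosen for illustration, and both X chromosomes are given equal chances, as in inbred strains.

```python
import random


def random_xci(n_founders=8, divisions=3, seed=1):
    """Toy model of random XCI followed by clonal propagation."""
    rng = random.Random(seed)
    # Each founder cell independently chooses which X to inactivate (equal odds).
    choices = [rng.choice(("Xm", "Xp")) for _ in range(n_founders)]
    # Clonal propagation: all 2**divisions descendants inherit the founder's choice.
    return [[xi] * (2 ** divisions) for xi in choices]


clones = random_xci()
for i, clone in enumerate(clones):
    print(f"clone {i}: {len(clone)} cells, inactive X = {clone[0]}")
```

Each clone is uniform for its inactive X, while the population as a whole is a mosaic of both choices. Skewing could be mimicked by replacing the equal-odds choice with a weighted one.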

Xist and Tsix, the master regulators of X inactivation

Central to XCI in mammals is the long, non-coding RNA Xist (X-inactive specific transcript). It is transcribed from the Xist gene, which lies in a region on the X chromosome called the X inactivation center (Xic), containing clustered genes and regulatory sequences involved in the X inactivation process (Fig. 2a). Xist is spliced and polyadenylated and, during XCI onset, becomes transcribed only from the future inactive X chromosome (Xi) (Borsani et al. 1991; Brockdorff et al. 1991, 1992; Brown 1991). The processed Xist transcript coats the Xi in cis (Brown et al. 1992) and recruits chromatin remodeling complexes including PRC2, which trimethylates lysine 27 on histone H3 (H3K27me3), a hallmark of facultative heterochromatin (Chadwick and Willard 2004; Mak et al. 2002; Plath et al. 2003; Silva et al. 2003; Zhao et al. 2008). Xist is absolutely essential for initiation of XCI and coats the Xi in all differentiated somatic cells, where the Xi-associated Xist RNA forms typical “clouds” when visualized by RNA FISH (Fig. 1b) (Brown et al. 1992). Once the Xi has been completely silenced, the silent state is stably inherited and cannot be reversed. Tight regulation of Xist transcription to ensure inactivation of a single X chromosome only in females is, therefore, essential.

Fig. 2

Features of the X-inactivation center (Xic) and Xist in mouse. a Schematic overview of the location of the X inactivation center (Xic) on X (top panel) and the genes contained in this region (second panel). The third panel shows the overlapping transcripts Xist and Tsix. Tsix has two annotated promoters, the one downstream being the major promoter. The last panel is a schematic overview of the Xist transcript with repetitive domains indicated in yellow. b Overview of the repetitive regions in Xist indicated in a

In mice, antagonizing Xist function is Tsix RNA, which is transcribed in the antisense orientation relative to Xist and fully overlaps with the Xist gene (Fig. 2a) (Lee et al. 1999). Tsix is also a non-coding RNA, is transcribed from the active X (Xa) before and during XCI onset (Lee et al. 1999), and inhibits Xist expression in cis by several mechanisms. First, inhibition may occur by transcriptional interference (Luikenhuis et al. 2001; Sado et al. 2006; Shibata and Lee 2004). Second, Xist/Tsix duplex RNA formation and processing by the RNA interference pathway may play a role by siRNA-mediated deposition of chromatin remodeling complexes (Ogawa et al. 2008). Also, recruitment of chromatin remodeling complexes by the Tsix RNA to the Xist promoter has been postulated as a possible mechanism for Tsix-mediated repression of Xist (Sun et al. 2006). Finally, Tsix is involved in pairing of the two X chromosomes, a process which has been implicated in initiation of XCI (Bacher et al. 2006; Xu et al. 2006). Deletion or truncation of Tsix leads to up-regulation of Xist in cis and skewed XCI with preferential inactivation of the mutated allele (Lee and Lu 1999). Impaired transcription of Tsix has also been reported to lead to ectopic XCI in male cells (Luikenhuis et al. 2001; Sado et al. 2002; Vigneau et al. 2006), although one study indicated absence of XCI in Tsix mutant male ES cells (Lee and Lu 1999). The discrepancy between these studies is most likely caused by differences in differentiation protocols, which have recently been shown to lead to altered expression levels of key XCI regulators, including OCT4 (Ahn and Lee 2010).

Xist and Tsix are the master regulatory switch genes in XCI. Interestingly, in female cells with a heterozygous deletion encompassing both genes that includes Xite, a positive regulator of Tsix located upstream of Tsix, XCI is still initiated on the wild type X chromosome (Monkhorst et al. 2008). This suggests that activation of Xist is regulated by other factors, but how? In C. elegans, autosomal and X-linked regulators play a key role in the counting process, by determining the relative number of X chromosomes. Here, initiation of dosage compensation is determined by the balance between autosomal and X-chromosomal signal elements (Powell et al. 2005). X-linked activators thus counteract the effect of autosomal inhibitors of dosage compensation. When the ratio of X-linked versus autosomal signal elements is 1, as in XX hermaphrodites, the dosage compensation machinery is turned on. However, if the ratio is lower than 1, as in XO males (X:A ratio 0.5), the concentration of X-linked activators is not sufficient to overcome repression of the dosage compensation machinery by autosomal inhibitors (Meyer 2000).
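
The X:A counting logic described for C. elegans can be written down as a small calculation. The following is a minimal sketch, not a model of the actual signal elements: the function names are hypothetical, the ratio is simply the number of X chromosomes divided by the number of autosome sets, and the cut-off of 1 is an assumption for illustration, since only the two cases XX (ratio 1) and XO (ratio 0.5) are given in the text.

```python
from fractions import Fraction


def x_to_a_ratio(n_x, n_autosome_sets):
    """Ratio of X chromosomes to sets of autosomes."""
    if n_autosome_sets <= 0:
        raise ValueError("need at least one autosome set")
    return Fraction(n_x, n_autosome_sets)


def dosage_compensation_on(n_x, n_autosome_sets, cutoff=Fraction(1)):
    """Assumed rule: the machinery is on when the X:A ratio reaches the cut-off."""
    return x_to_a_ratio(n_x, n_autosome_sets) >= cutoff


# XX hermaphrodite: two X chromosomes, two autosome sets.
print(x_to_a_ratio(2, 2), dosage_compensation_on(2, 2))
# XO male: one X chromosome, two autosome sets.
print(x_to_a_ratio(1, 2), dosage_compensation_on(1, 2))
```

Intermediate ratios are not addressed by this sketch; where exactly the switch lies between 0.5 and 1 would have to come from experimental data.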

Activators of X chromosome inactivation

Several recent findings support a role for X-linked activators and autosomally encoded inhibitors in the regulation of mammalian XCI. The first indications came from studies with triploid and tetraploid mouse ES cell lines generated by cell fusion experiments. Analysis of XXXX, XXXY and XXYY tetraploid ES cells after differentiation showed that a single X chromosome remains active for each diploid autosome set (Monkhorst et al. 2008; Takagi 1983, 1993), as was found for mouse tetraploid embryos (Webb et al. 1992). Comparison of XCI kinetics in these different tetraploid, and also XXY triploid ES cells, indicated an important role for the X:A ratio in the probability to initiate XCI, suggesting the presence of an X-encoded activator of XCI (Monkhorst et al. 2009). The first activator, the E3 ubiquitin ligase RNF12/RLIM, was recently discovered and is one of the few known protein-coding—rather than RNA—regulators of Xist (Jonkers et al. 2009). The Rnf12 gene is located approximately 500 kb upstream of Xist (Fig. 2a) and the encoded protein stimulates Xist expression in a dose-dependent manner. RNF12 expression from a single X chromosome in males is insufficient to activate Xist, whereas the double dose in females is sufficient to initiate XCI. In contrast to Xist and Tsix, RNF12 acts in trans and activates Xist on both X chromosomes. Once the inactivation process is started on one X and silencing spreads over the chromosome, Rnf12 will also become silenced in cis. Given a relatively short half-life for RNF12, this results in an RNF12 expression level that equals that in male cells, and this is too low to activate Xist on the other X. Because initiation of XCI is driven by stochastic processes, and the feedback after XCI initiation is rapid, most XX female cells will initiate XCI on a single X chromosome only (Monkhorst et al. 2008). 
As would be expected for a trans-acting activator, overexpression of Rnf12 triggers XCI in male cells and leads to inactivation of both X chromosomes in a high percentage of female cells (Jonkers et al. 2009). Also, only very few Rnf12−/− cells initiate XCI (Barakat et al. 2011). A different study also reported impaired imprinted XCI in cells carrying a maternally inherited Rnf12 deletion, but observed a milder effect on random XCI (Shin et al. 2010), possibly as a consequence of differences in expression of other XCI-activators and -inhibitors. Although the target of the E3 ubiquitin ligase RNF12 remains elusive, transgenic studies indicate that Xist is the major downstream target of RNF12 (Barakat et al. 2011). Unexpectedly, XCI is skewed toward the mutated X chromosome in Rnf12+/− female ES cells (Barakat et al. 2011), despite the absence of a phenotype for this mutation in male Rnf12−/Y mice (Shin et al. 2010). This suggests that RNF12 is required for persistent Xist expression, at least during the window when XCI is established (Wutz and Jaenisch 2000). A continuous requirement for RNF12 for the activation of Xist and for maintenance of XCI may also explain why Rnf12+/− female embryos which maternally inherit the mutated allele display severe growth defects: maternal inheritance of the mutated allele would result in an Rnf12 null embryo because the wild type paternal X becomes inactivated during imprinted XCI. Due to the complete absence of Rnf12 expression that follows, these embryos may then be unable to maintain Xist expression and imprinted XCI, explaining the reported early lethality of these mice (Shin et al. 2010).
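
The dose-dependent activation and rapid feedback described above can be explored with a toy stochastic simulation. This is a sketch under stated assumptions, not the published model: activator dose is taken to equal the number of active X chromosomes plus any transgenic extra dose, initiation is only possible when the dose reaches a fixed threshold of 2, and the per-step initiation probability `p_init`, the number of steps and the function names are all hypothetical.

```python
import random


def simulate_cell(n_x, extra_dose=0, p_init=0.05, dose_needed=2, steps=200, rng=random):
    """Return how many X chromosomes one cell has inactivated."""
    active = n_x
    for _ in range(steps):
        # Assumption: activator dose = active X copies + transgenic extra dose.
        dose = active + extra_dose
        if active == 0 or dose < dose_needed:
            break  # activator too low (or nothing left) to initiate XCI
        # Each active X independently initiates XCI with probability p_init.
        initiated = sum(rng.random() < p_init for _ in range(active))
        # Feedback: a silenced X no longer contributes activator in the next step.
        active -= initiated
    return n_x - active


def summarize(n_x, extra_dose=0, n_cells=2000, seed=0):
    """Count cells by number of inactivated X chromosomes."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_cells):
        k = simulate_cell(n_x, extra_dose=extra_dose, rng=rng)
        counts[k] = counts.get(k, 0) + 1
    return counts


print("XY:", summarize(1))
print("XX:", summarize(2))
print("XY + extra activator:", summarize(1, extra_dose=1))
print("XX + extra activator:", summarize(2, extra_dose=1))
```

In this sketch male cells never initiate XCI, and most female cells inactivate exactly one X because both chromosomes rarely initiate within the same step. Adding a constant extra dose, standing in for a transgene that is not silenced in cis, lets male cells initiate and drives female cells toward inactivating both X chromosomes.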

Many other positive regulators of Xist can be found in the Xic region and include the non-coding RNAs Ftx and Jpx, and the pairing element Xpr (Augui et al. 2007; Chureau et al. 2011; Sun et al. 2006; Tian et al. 2010). The region of the Xic upstream of Xist, including Ftx, Jpx and Xpr, has been shown to be enriched for H3K9 and H3K27 di- and trimethylation, respectively (Heard et al. 2001; Rougeulle et al. 2004). These epigenetic marks may contribute to transcriptional regulation of one or more genes in the region, including Xist itself, and may result from ongoing bidirectional transcription in this region as well as from immediate recruitment of PRC2 by nascent Xist transcripts. The Xpr region is involved in pairing of the two X chromosomes at the onset of XCI (Augui et al. 2007). Pairing of the Xist/Tsix region has been proposed to play an important role in the initiation of XCI (Bacher et al. 2006; Xu et al. 2006), although a regulatory role for direct interaction of two X chromosomes in XCI remains to be determined.

Deletion of Ftx in male mouse ES cells is associated with reduced transcription of Xist, Tsix and Jpx (Chureau et al. 2011). These findings could indicate a direct role for Ftx in Xist activation, but may also be explained by Ftx-mediated global activation of the Xic region. A similar role may be attributed to Jpx/Enox, which encodes another long non-coding RNA and is located just upstream of Xist. Jpx/Enox has been shown to activate Xist in trans, possibly by interfering with Tsix (Tian et al. 2010). However, unlike Rnf12 transgenic lines, male cell lines with Ftx or Jpx/Enox transgenes did not show induction of XCI on the endogenous X chromosome (Jonkers et al. 2009), arguing against a role in trans for these genes. Similar results were obtained with Xpr transgenic male ES cell lines (Jonkers et al. 2009). Nevertheless, studies with Xist YAC transgenic male ES cell lines covering both Ftx and Jpx/Enox showed induction of XCI on the endogenous X chromosome in a small percentage of cells, but only in multicopy ES cell lines (Heard et al. 1999). These findings suggest that Ftx and Jpx/Enox may require additional factors for their trans-activating properties, and also indicate that RNF12 is a more potent activator of the XCI process.

Inhibitors of X chromosome inactivation

Other important regulators of XCI are the key pluripotency factors NANOG, OCT4, KLF4, REX1 and SOX2, and the reprogramming factor cMYC, acting as autosomally encoded inhibitors of XCI (Donohoe et al. 2009; Navarro et al. 2008, 2010). Binding of different combinations of these pluripotency factors at different locations in the locus can either result in repression of Xist or in activation of Tsix or Xite. NANOG, OCT4 and SOX2 bind the intron 1 region of Xist and binding of these factors has been implicated in direct repression of Xist (Navarro et al. 2008). OCT4 is also recruited to the Tsix regulatory region, and binds the Xite promoter region together with SOX2 (Donohoe et al. 2009). Recruitment of these factors has been implicated in X chromosome pairing and in activation of Tsix, although binding of OCT4 and SOX2 to these specific regions has been disputed by others (Navarro et al. 2010). REX1, KLF4, and cMYC are recruited to the DXPas34 region, a regulatory region in Tsix, and are involved in Tsix activation (Navarro et al. 2010), together with YY1 and CTCF (Donohoe et al. 2007). The repression of Xist is released upon differentiation, as the concentration of pluripotency factors drops, linking XCI to the pluripotent state and differentiation (Navarro et al. 2008). Interestingly, a deletion encompassing the Xist intron 1 region that recruits OCT4, SOX2 and NANOG, only has a mild effect on XCI, leading to skewed XCI at later stages of ES cell differentiation (Barakat et al. 2011). Xist expression in undifferentiated heterozygous intron 1+/− female ES cells was not affected, indicating that ES cell specific transcription factors act in concert to inhibit Xist expression by binding to various sites throughout the Xist, Tsix and Xite genes. Finally, NANOG, OCT4 and SOX2 also have a repressive effect on XCI by binding and inhibiting Rnf12 (Navarro et al. 2011). 
Altogether, the inhibitors are involved in setting a threshold that has to be overcome by the XCI-activators to induce XCI (Barakat et al. 2010). Indeed, gene ablation experiments of these different factors resulted in ectopic activation of XCI in mutated male cells, supporting a crucial role for these factors in maintaining the threshold for XCI.

Despite this wide range of known XCI regulators (see Fig. 3 for an overview), it is likely that more remain to be identified. Evidence for undiscovered activators of XCI comes from the observation that heterozygous Rnf12+/− female mouse ES cells still initiate XCI, albeit at much lower levels (Jonkers et al. 2009), and even Rnf12 null cells still display occasional XCI (Barakat et al. 2011).

Fig. 3

Overview of regulators of Xist and XCI. Left activators and inhibitors act through the Xist/Tsix switch region to regulate initiation of XCI. Right XCI is established and maintained by a plethora of histone modifications, bound protein and protein complexes, and RNA specific for the Xi

XCI establishment

As discussed above, the end result of Xist regulation is expression from a single X chromosome in females, which is then to become the Xi. The first events observed following Xist expression are coating of the Xi by Xist RNA, exclusion of RNA polymerase II (polII) from the Xist compartment (see below), and gradual accumulation of Xi-specific epigenetic marks.

X chromosome coating by Xist and formation of a nuclear compartment

Spreading of Xist must be restricted to prevent the aberrant inactivation of another X chromosome or possibly even an autosome. Thus, Xist must act in cis and should not spread onto other chromosomes. Xist-tagging experiments have shown that Xist RNA never leaves the territory of the X chromosome from which it is transcribed (Jonkers et al. 2008). How diffusion of Xist is restricted is not known, but the Xist domain forms a distinct nuclear compartment from which RNA polII is excluded. X chromosomal genes are recruited into this domain and subsequently become silenced. The formation of a compartment suggests that structural nuclear factors play a role (Chaumeil et al. 2006; Nakagawa and Prasanth 2011), and at least two proteins that are thought to be components of the nuclear matrix, SAF-A and SATB1, are involved in establishing this compartment (Nakagawa and Prasanth 2011). The nuclear matrix is defined as the nuclear structure that is resistant to detergent and high salt treatment and that remains after treatment with nucleases (Berezney 1991). Its composition and even its mere existence are heavily debated. Heterogeneous nuclear ribonucleoproteins (hnRNPs) are important nuclear structural proteins and possible constituents of a nuclear matrix, and are thought to organize the genome by binding to putative matrix-associated regions (MARs) in the DNA. One such hnRNP, SAF-A, colocalizes with Xist and is important for the formation of the Xist nuclear compartment (Fackelmayer 2005; Hasegawa et al. 2010; Helbig and Fackelmayer 2003; Pullirsch et al. 2010). Cells depleted of SAF-A by RNAi fail to form Xist clouds and show lower levels of H3K27me3 (Hasegawa et al. 2010). Another nuclear protein, SATB1, a DNA binding protein involved in nuclear architecture and chromatin looping (Cai et al. 2003, 2006), also contributes to the framework of the Xist compartment (Agrelo et al. 2009). 
Although Xist and SATB1 do not colocalize or interact directly, SATB1 forms a ring-like structure that contains Xist RNA in thymic cells. SATB1 is required for correct localization of Xist because SATB1 knockdown leads to impaired formation of Xist clouds in these cells (Agrelo et al. 2009). The expression pattern of this protein overlaps with the permissive window of XCI initiation, and overexpression of SATB1 together with Xist in differentiated cells allows gene silencing to occur (Agrelo et al. 2009). The structural function of SAF-A and SATB1 suggests that they contribute to XCI through the formation of the silent nuclear compartment. A role for nuclear structural proteins is further supported by the finding that the compartment is formed independently of DNA, since DNase treatment does not disrupt the Xist localization pattern (Clemson et al. 1996). However, the nature of the molecular interactions between SAF-A, SATB1 and Xist remains to be determined.

The Xist nuclear compartment is depleted of RNA polII, and initial gene silencing is established by relocalization of active genes into this RNA polII-depleted area, followed by the epigenetic modifications that contribute to long-term silencing (Chaumeil et al. 2006). Thus, genes cease to be transcribed before the appearance of silent chromatin marks such as H3K27me3. Interestingly, it was recently shown that lack of transcription is sufficient to trigger chromatin modifications. Treatment of cells with actinomycin D, which binds the transcription initiation complex and inhibits elongation, markedly reduces the size of the nuclear territory occupied by the Xa, so that it approaches the size of the Xi. Nevertheless, the basal packaging of chromatin in 30 nm fibers remains unaffected. This shows that transcription inhibition leads to chromatin compaction and that this occurs at a higher level of compaction than the 30 nm fiber (Naughton et al. 2010). Relocalization into an RNA polII-depleted area may thus be sufficient to establish some of the chromatin marks associated with the Xi, although clearly additional factors and events are required for stable maintenance of the Xi.

Epigenetic marks associated with XCI

During the establishment of the nuclear compartment, PRC2 (polycomb repressive complex 2) is recruited to the Xi by Xist. PRC2 is composed of the protein subunits SUZ12, EED and EZH2, and it trimethylates lysine 27 of histone H3 (H3K27me3), a hallmark of inactive chromatin. This inactivating mark first appears on active genes (Marks et al. 2009). Active genes are generally characterized by H3K4me3 enrichment in their promoters, and the deposition of H3K27me3 here may thus result in the simultaneous occurrence of opposing (active vs. inactive) chromatin marks on the Xa and Xi, or the two marks may even be present together transiently on promoters of active genes on the future Xi. Interestingly, such a bivalent state is typical for many developmental genes in ES cells (Azuara et al. 2006; Bernstein et al. 2006; Pan et al. 2007; Pasini et al. 2008). Next, the H3K4me3 mark is gradually lost (Marks et al. 2009), followed by incorporation of histone macroH2A (Costanzi and Pehrson 1998; Mermoud et al. 1999), enrichment for H3K9me2 (Heard et al. 2001; Mermoud et al. 2002; Peters et al. 2002), ubiquitylation of H2AK119 (de Napoles et al. 2007; Smith et al. 2004), H4K20 methylation (Kohlmaier et al. 2004), DNA methylation (Norris et al. 1991) and hypoacetylation of histone H4 (Jeppesen and Turner 1993) (Fig. 3). The Xi also becomes enriched for PRC1, which is associated with PRC2 and H3K27me3 (Plath et al. 2004). Surprisingly, Xist is not required for the maintenance of the Xi once the inactive state is established. Conditional deletion of Xist in differentiated cells leads to loss of macroH2A incorporation and loss of H3K27me3, but the Xi is not reactivated (Csankovszki et al. 1999; Kohlmaier et al. 2004). In fact, no conditions, except reprogramming, have been found to date that can completely reactivate the Xi. 
Only harsh and highly artificial conditions involving conditional knockout of Xist, combined with chemical treatments that remove DNA methylation and inhibit histone deacetylation, have been found to lead to limited reactivation of a GFP transgene on the Xi (Csankovszki et al. 2001). Thus, some epigenetic modifications persist in the absence of Xist and are sufficient to maintain the inactive state.

The differential regulation of XCI initiation, establishment and maintenance is further supported by the finding that overexpression of Xist cannot induce XCI in cells once they have differentiated (Kohlmaier et al. 2004; Wutz and Jaenisch 2000). This led to the suggestion that stem cells and early embryos go through an “XCI permissive state” during early differentiation. Only within this time window is Xist required for, and capable of inducing, XCI. Proteins that are key to stem cell identity, such as OCT4, REX1, SOX2 and NANOG, may play a role in defining this window, as these also regulate Xist and Tsix expression directly and indirectly (Navarro et al. 2008, 2010).

Conservation of Xist RNA structure and functional elements

Besides a role for nuclear structural proteins in preventing diffusion of Xist throughout the nucleus, self-aggregation properties may also be a feature of Xist. Although predictions have been made regarding the secondary structure of repetitive sequences in Xist, no models currently exist for folding of the complete Xist RNA. Whether Xist’s overall structure is conserved thus remains unknown, but conserved sequence elements may aid in understanding the intriguing characteristics of Xist. Poor overall sequence conservation of Xist between mice, humans and other mammals provides little information on functionality, and suggests that secondary structure is more important for Xist function than the primary sequence. In Drosophila dosage compensation, two non-coding RNAs, roX1 and roX2, play a central role in targeting the DCC to the X chromosome (Fig. 1). Here too, secondary structure seems important: although these two RNAs are fully redundant, they share almost no sequence similarity (Meller and Rattner 2002). Nevertheless, some parts of Xist are conserved at the sequence level, suggestive of functional elements. Some of these are highly repetitive and are referred to as repeats A–E (Fig. 2b). In addition, the fourth exon of Xist is well conserved.

The conserved repeat A is composed of nine A-rich repeats and is the best characterized region of Xist (Fig. 2a, b) (Nesterova et al. 2001). Several predictions have been made regarding the secondary structure of this repeat, all involving the formation of hairpin structures. It is required for the silencing function of Xist by serving as a recognition and binding site for PRC2 (Maenner et al. 2010; Zhao et al. 2008). Also, the region may dimerize with other A-repeats (Duszczyk et al. 2008). It is unlikely that repeat A contributes to aggregation and localization of Xist, since studies using constructs expressing a mutant form of Xist have shown that Xist localizes normally in the absence of the A-repeat, but lacks silencing activity (Royce-Tolland et al. 2010; Wutz et al. 2002). Others have targeted the endogenous Xist locus for deletion of the A-repeat and showed that this sequence is also required for Xist expression, and deletion leads to ectopic Tsix expression in the pre-implantation embryo (Hoki et al. 2009). Interestingly, a 1.6 kb-long non-coding RNA called RepA is transcribed from the A-repeat region. This RNA recruits the PRC2 complex through interaction with the EZH2 subunit, and may itself be involved in the initiation of XCI (Zhao et al. 2008).

Seemingly contradictory data have been published regarding the role of repeat C (Fig. 2b). Wutz et al. (2002) showed that an Xist transgene with a deletion of repeat C localizes normally to DNA, and is also still capable of silencing. However, another study recently reported that blocking the same repeat C with an LNA probe (locked nucleic acid, an antisense probe with high melting temperature that stably binds its target sequence) completely disrupts Xist localization (Sarma et al. 2010). Since deletion of the same repeat does not affect localization, the mislocalization of LNA-targeted Xist may represent an indirect effect: the LNA may interfere with secondary structure formation and affect the global folding of the molecule, leading to impaired localization, whereas the C-repeat itself is not required for localization.

The other conserved sequences in Xist are less extensively characterized and seem to serve redundant functions (Fig. 2b). Individual deletions of repeats B, C, D and E did not affect localization patterns. Even combined deletion of several repeats simultaneously barely affects the localization pattern (Fig. 2b) (Wutz et al. 2002). Only deletion products lacking repeat A in combination with one or more of the other repeats show impaired localization. For repeat B, an effect on promoter activity has been reported (Hendrich et al. 1997). An Xist allele containing an inversion that includes repeat D (Xist INV) showed compromised localization of the mutant Xist and reduced silencing efficiency. Although random XCI initially occurs in heterozygous Xist INV/WT cells, the mutant Xist cannot sustain XCI and cells that initially inactivated this allele are gradually selected against and lost from the cell population (Senner et al. 2011). Similar to what was discussed for repeat C, deletion of repeat D was found to lead to less severe phenotypes (Wutz et al. 2002), suggesting that the overall structure of Xist is affected by the inversion. Another highly conserved region in Xist is exon IV. Deletion of this exon does not lead to detectable XCI phenotypes, except for a slight reduction in expression of the mutant Xist transcript (Caparros et al. 2002). However, this does not result in skewing such as observed for mutations in Tsix (Lee and Lu 1999).

Another conserved feature of Xist RNA is its length. A fair number of non-coding RNAs have been identified, many of them functioning in the establishment of epigenetic modifications, imprinting of genes and allele-specific expression. However, measuring 17 kb in mouse and over 19 kb in humans, Xist is the longest functional non-coding RNA described, making the poor sequence conservation even more mysterious. Secondary structure formation and size may both contribute to Xist function by unknown mechanisms. Speculatively, these mechanisms include aggregation and intermolecular interactions, or Xist might even function as a ribozyme. Finally, the size of Xist combined with a complicated secondary structure could act to limit its diffusion through the nucleus and restrict its localization to the chromosome from which it is transcribed—the nuclear matrix possibly serving as a physical barrier.

Spreading of Xist

How does Xist recognize and bind to the X chromosome? In mammals, no specific sequences that designate the X chromosome for dosage compensation are known. In Drosophila and C. elegans, however, sequences important for X chromosome identity have been defined that are involved in targeting the DCC to the X chromosomes. The mechanisms of X-recognition, binding and spreading show striking overlap in these species despite the opposing effects of dosage compensation (activation vs. silencing) (Fig. 1a–d). Sites that specify the X chromosome are necessary in both species because dosage compensation is regulated by trans-acting factors. For example, although the roX genes are X-linked in Drosophila, they act in trans: roX gene expression from autosomal transgenes drives correct assembly and spreading of the DCC on X, even though autosomal spreading is also observed (Meller and Rattner 2002). Targeting to the X chromosome is driven by high affinity DCC binding sites called chromatin entry sites (CES), which share a 150 bp motif (Oh et al. 2003, 2004). Although the motif itself is only slightly enriched on the X chromosome compared with autosomes, its positioning downstream of active genes is highly specific for the X chromosome. Higher concentrations of the DCC have been demonstrated to enable occupation of lower-affinity binding sites (Fagegaltier and Baker 2004; Park et al. 2002). Strikingly, the highest affinity binding sites are the roX genes themselves (Kelley et al. 1999). This suggests that complexes are assembled at the site of roX RNA transcription, after which mature DCCs can spread to other sites on the X chromosome (Smith et al. 2001). Interestingly, spreading of the DCC onto autosomal sequences is highly correlated with gene activity, suggesting a role for active chromatin modifications, RNA polII or open chromatin in DCC spreading (Larschan et al. 2007) (Fig. 1a). Sequence motifs for DCC binding have also been identified in C. elegans (McDonel et al. 2006). The sites were mapped by extensive analysis of extra-chromosomal arrays carrying X-sequences for their ability to recruit the DCC (Csankovszki et al. 2004). Termed rex (recruiting element on X) sites, they are capable of DCC recruitment when integrated on autosomes. Their specificity was later confirmed and extended by ChIP-chip analysis, which showed preferred binding of the DCC to promoter regions (Ercan et al. 2007; Jans et al. 2009). The rex sites cooperate with so-called dox sites (dependent on X), which cannot recruit the DCC when detached from X, but are essential for spreading of the DCC (Jans et al. 2009).

The use of trans-acting initiation factors and the lack of a cis-acting factor like Xist in flies and worms create the need to target dosage compensation to X chromosomes by X-specific sequences. Importantly, flies and worms lack an equivalent of mammalian choice: once the decision to initiate dosage compensation has been made, all X chromosomes in the nucleus are subjected to dosage compensation. Studies of hermaphrodite worms with aberrant X chromosome numbers show that, as in mammals, the X:A ratio determines whether dosage compensation is on or off (Meyer 2000). However, no choice has to be made since all worm X chromosomes are subjected to dosage compensation. A similar situation exists in male flies. In contrast, mammals inactivate a single X and the other X chromosome remains active. The cis-specificity of Xist in mammals may be sufficient to determine which chromosome is to be inactivated and to prevent inactivation in trans. Mono-allelic expression might be the mechanism that obviates the need for X-specific sequence elements that recruit the dosage compensation machinery, as in worms and flies. In mammals, specification of the X chromosome for targeting dosage compensation may thus not be required. Nevertheless, sequence elements that promote spreading are potentially important.

LINE-1 or LINE elements (long interspersed elements) are retrotransposons that constitute a large part of mammalian genomes, e.g., 17% in humans (Cordaux and Batzer 2009). Because of their relatively high density on human X chromosomes, they have been suggested to play a role in promoting spreading of Xist (Lyon 1998). Several findings further support a role for LINEs in XCI. First, Xist does not spread efficiently when expressed from autosomes, which are relatively LINE-poor, and spreading is especially inhibited into LINE-poor autosomal areas (Popova et al. 2006; Tang et al. 2010). Second, LINEs seem to be expressed specifically from the Xi (Chow et al. 2010), and Xist interacts with LINE elements directly (Murakami et al. 2009). Finally, a computational analysis of human X chromosomes found that LINE elements are particularly enriched in the 5′-region of genes that are silenced during XCI (Wang et al. 2006).

The LINE hypothesis remains heavily disputed, as many findings argue against it. First, LINE enrichment could merely be a passive and non-functional consequence of the reduced meiotic recombination rate of the X chromosome compared with autosomes, in view of the lack of X recombination in the male germline. Second, whereas LINE elements are enriched on human X chromosomes, especially around the Xic region (Bailey et al. 2000), no such enrichment is observed for mouse X chromosomes (Chureau et al. 2002). Also, a South American rodent, Oryzomys palustris, appears to lack LINE elements in its genome, and this is not associated with more rapid mutations in Xist (Cantrell et al. 2009). Furthermore, LINE-poor areas are generally gene-rich, and gene-poor areas tend to be LINE-rich. Impaired spreading on autosomes at LINE-poor areas is now attributed to the low LINE-density, but could instead result from selection against cells that efficiently silence these regions; since these same regions are also gene-rich, their silencing may severely affect cell viability. Cells that fail to silence gene-rich autosomal areas may thus be more efficiently propagated due to selection effects. Moreover, Xist spreading at gene-poor LINE elements seems to contradict the observation that hallmarks of XCI are first found on active genes (Marks et al. 2009). Finally, as discussed above, the cis-acting properties of Xist may make the need for spread elements redundant.

Chromatin changes and transcription effects

In mammals, many histone modifications and other chromatin changes cooperate to establish and maintain the silent state. These are well characterized and, as mentioned above, include H3K27me3, macroH2A incorporation, H2Aub and DNA methylation, among others. How these modifications are targeted to specific genes and how other genes escape silencing is unknown. In flies and worms, epigenetic modifications seem to be less extensive, possibly because dosage compensation here involves fine-tuning of expression levels to a twofold change instead of global silencing, and therefore requires different mechanisms and different chromatin modifications. In flies, the dosage compensated X chromosome in males is characterized by increased acetylation of lysine 16 on histone 4 (H4K16ac), a hallmark of active chromatin (Turner et al. 1992). Notably, this enrichment is biased toward 3′-ends of genes, whereas H4K16ac normally accumulates at 5′-gene promoters (Gelbart et al. 2009; Kind et al. 2008; Smith et al. 2001). It was recently shown that RNA polII density is also enhanced at 3′-gene ends (Larschan et al. 2011), and the current model thus proposes that the Drosophila DCC acts to enhance transcription elongation, rather than initiation. In addition to the bias toward 3′-gene ends, the DCC also preferentially targets active genes for H4K16ac modification on the X chromosome, even when they are of autosomal origin and inserted as transgenes on X (Gorchakov et al. 2009). This bias may be responsible for fine-tuning to a twofold increase in expression for differentially expressed genes.

In C. elegans, where transcription of both hermaphrodite X chromosomes is reduced by half, no chromatin modifications associated with the dosage compensated X chromosomes have yet been found (Fig. 1). Depletion of the histone variant H2A.Z (HTZ-1 in C. elegans) from the X chromosome is the only nucleosomal change identified so far. This seems to contribute indirectly to dosage compensation because the relative enrichment on autosomes prevents spreading of the DCC onto autosomes (Petty et al. 2009). One subunit of the DCC contains SMC (structural maintenance of chromosomes) family proteins, and thereby resembles condensin, a protein complex that condenses chromatin in preparation for cell division. This suggests that changes in higher order chromatin structure likely play a role in transcriptional silencing in C. elegans (Csankovszki 2009). Despite the highly different systems, recent findings show some parallels between mammalian and C. elegans dosage compensation. First, an SMC protein was recently found to colocalize with the Xi in mice and was implicated in DNA methylation (Blewitt et al. 2008). Furthermore, a recent study showed that one DCC component, DPY-30, is also part of the gene activating MLL/COMPASS complex (Pferdehirt et al. 2011), which stimulates H3K4me3 in mammals (Jiang et al. 2011). Both complexes bind the same genes in C. elegans despite their opposing effects on transcription (Pferdehirt et al. 2011). Speculatively, DPY-30 may be involved in targeting the DCC to active genes by switching between the two complexes.

Despite the different mechanisms of dosage compensation, the preference for targeting active genes seems to be shared by all three species. In C. elegans, DCC affinity for genes depends on transcriptional activity (Ercan et al. 2009). The DCC is targeted to active genes in Drosophila by the chromodomain subunit of MSL3, which recognizes the active H3K36 trimethyl mark (Larschan et al. 2007; Sural et al. 2008). Chromatin immunoprecipitation (ChIP) against H3K27me3 in differentiating mouse ES cells followed by high-throughput parallel sequencing revealed that the H3K27me3 mark appears first at active promoters, indicating that, also in mammals, active genes are preferred targets of dosage compensation (Marks et al. 2009). Hence, some properties of active genes might serve as a basis for targeting dosage compensation. These properties may involve active chromatin marks, ongoing transcription by RNA polII, or a combination of events.

Perspectives

Many questions about X chromosome inactivation in mammalian species remain to be answered, and knowledge about dosage compensation in worms and flies may help to address several of them. In mammalian XCI, unresolved issues include, for example, the mechanism that underlies the cis-acting specificity of Xist. Also, it is likely that more activators and inhibitors of dosage compensation remain to be identified. Furthermore, the role of LINE elements as cis-acting booster elements for spreading of Xist needs to be confirmed. Finally, the key factors that contribute to setting the XCI window and to achieving stable and irreversible silencing need to be elucidated. Identifying these factors would not only provide more insight into X chromosome dosage compensation, but potentially also into gene regulation mechanisms in general. X-inactivation research is likely to yield many more insights in the future that may well extend beyond the field of dosage compensation.