Review

The difference in sex-chromosome make-up between mammalian males (XY) and females (XX) has led to the evolution of two main dosage-compensation mechanisms: upregulation of the active X chromosome (Xa) in both sexes to balance X expression with the autosomes; and inactivation of one X chromosome in females to avoid X hyperexpression and correct for the difference in gene dosage between the sexes [1–3] (see Table 1). These mechanisms evolved to compensate for the presence of only one copy (haploinsufficiency) of X-linked genes in males due to degeneration of the Y chromosome from its origin as an X homolog [4]. Suppression of recombination between the sex chromosomes was apparently mediated by large Y inversions, as deduced from remnant X/Y homology. This led to Y degeneration due to accumulation of mutations and inability to restore the correct DNA sequence [5, 6]. Only small regions of homology and pairing between the sex chromosomes remain, called pseudoautosomal regions (PARs) because genes within these regions behave like autosomal genes.

Box 1 Regulation of the X chromosome in eutherian mammals

Initiation of X inactivation in female embryos depends on the transcription of the long noncoding RNA XIST/Xist (X-inactive specific transcript) from one chromosome (which will become the inactive X (Xi)) and recruitment of a protein complex important for X-chromosome silencing and heterochromatin formation [7, 8]. In humans, XIST (17 kb in size) is located in the long arm of the X chromosome, whereas in mice, whose X chromosome has only one arm, Xist (15 kb in size) is in the middle of the chromosome. Xist RNA spreads along the X chromosome in cis and recruits a protein complex responsible for deposition of repressive histone modifications onto the Xi [9–11]. As a result the Xi becomes heterochromatic, silent and condensed. Before implantation, X inactivation is imprinted, with the paternal X chromosome always being silenced. At the blastocyst stage, the paternal X reactivates and random X inactivation takes place (see Table 1).

Although most genes on the Xi are silenced, some genes remain expressed from both the Xa and the Xi. Not surprisingly, genes that retain a Y-linked copy - for example, Kdm5c and its Y-linked paralog Kdm5d (which encode histone demethylases) - escape X inactivation and thus have two expressed alleles in both male and female somatic tissues. However, not all 'escaping' genes have a Y copy; Car5b (carbonic anhydrase), for example, does not. Recent reports have shown striking differences between human and mouse regarding the identity and number of these 'escape' genes in somatic tissues [12, 13]. Why are there such species differences? Structural differences between the X chromosomes may play a role as well as selective pressure to maintain sex differences.

Escape from X inactivation is not limited to female somatic cells. Indeed, another type of silencing of the X takes place in male germ cells and is known as meiotic sex chromosome inactivation (MSCI; see Table 1). MSCI results in silencing of protein-coding messenger RNAs from the X chromosome, but a majority of the X-linked microRNAs (miRNAs) escape MSCI, suggesting that they play a role in male meiosis [14]. How do genes escape silencing on the heterochromatic X chromosome, whether in somatic or germ cells? Many studies have shown that epigenetics plays a crucial role in X inactivation and escape [7, 15]. In this review, we will summarize recent progress made in the field of escape from X inactivation, compare the number and distribution of human and mouse escape genes, and discuss possible molecular mechanisms involved in genes escaping X inactivation.

Differences in escape genes between humans and mice

We shall first deal with the main type of X inactivation - that is, random X-chromosome inactivation in female somatic cells (see Table 1). In humans, about 15% of X-linked genes consistently escape this type of X inactivation, as determined from their expression in rodent × human hybrid cells that retain the human Xi, and from measurements of relative expression of allelic polymorphisms in primary fibroblasts [12]. Many human genes escaping X inactivation have already lost their corresponding Y copy. This suggests either that establishment of X inactivation may lag behind Y degeneration, or that specific mechanisms may exist to maintain expression of a subset of genes from the Xi as the result of selective advantages. In the mouse, we have recently used next-generation RNA sequencing to survey allele-specific expression of X-linked genes and shown that only 3% of genes escape X inactivation. We derived a cell line from a mouse resulting from a cross between two species of mice, Mus spretus and Mus musculus, which are separated by as much as 7 million years of evolution and thus differ by numerous DNA sequence variants (about one variant for every 100 base pairs). These variant sequences were exploited to determine expression from each allele of X-linked genes after RNA sequencing. Because X inactivation is random, we selected for cells with the M. musculus X chromosome inactive to achieve 100% skewing of X inactivation [13]. Following this approach, any gene with RNA sequence reads from both species of mice was classified as an escape gene. From this study we conclude that compared to humans, X inactivation in the mouse is more complete (Figure 1).
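The classification rule described above (a gene counts as an escape gene when reads are recovered from both the active and the inactive X) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the names `AlleleCounts`, `classify_gene`, `xi_fraction` and the `min_xi_reads` threshold are hypothetical, and the read counts are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleCounts:
    """Allele-specific read counts for one X-linked gene (illustrative only)."""
    gene: str
    xa_reads: int  # reads carrying variants of the active X (M. spretus in this cross)
    xi_reads: int  # reads carrying variants of the inactive X (M. musculus in this cross)


def classify_gene(counts: AlleleCounts, min_xi_reads: int = 1) -> str:
    """Label a gene 'escape', 'subject', or 'not_assessed'.

    min_xi_reads is a hypothetical threshold. The default of 1 mirrors the
    rule "reads from both species"; a real analysis would likely need a
    stricter cut-off to allow for sequencing and mapping errors.
    """
    if min_xi_reads < 1:
        raise ValueError("min_xi_reads must be at least 1")
    if counts.xa_reads == 0:
        # No reads from the active X: the gene is not informative here.
        return "not_assessed"
    if counts.xi_reads >= min_xi_reads:
        return "escape"
    return "subject"


def xi_fraction(counts: AlleleCounts) -> float:
    """Fraction of allele-assigned reads that come from the inactive X."""
    total = counts.xa_reads + counts.xi_reads
    if total == 0:
        raise ValueError("no allele-assigned reads")
    return counts.xi_reads / total


# Invented read counts, for illustration only (not data from the study).
examples = [
    AlleleCounts("Kdm5c", xa_reads=80, xi_reads=20),
    AlleleCounts("Iqsec2", xa_reads=100, xi_reads=0),
]
for c in examples:
    print(c.gene, classify_gene(c), round(xi_fraction(c), 2))
```

Because the cell line is fully skewed, every read assigned to the M. musculus allele can be attributed to the Xi; without that skewing, reads from the two alleles could not be mapped to active and inactive chromosomes in this simple way.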

Figure 1

More genes escape X inactivation in humans than in the mouse. Distribution of genes subject to X inactivation (blue) and of 'escape' genes (orange) in human and mouse. The positions of the pseudoautosomal regions (PAR1 and 2 in human, PAR in mouse), of the centromeres (cen, purple bar), and of the X-inactivation center encoding the long noncoding RNA XIST/Xist (black bar) are indicated. Note that as the centromere is located at one end of the mouse X chromosome, there is no short arm or long arm. Data from Carrel and Willard [12] and Yang et al. [13].

Escape from X inactivation in other mammalian species has not been extensively characterized. Nonetheless, escape genes have been identified in marsupials, which differ from eutherian mammals in terms of key features of X inactivation - Xist is absent and the paternal X is always silenced. At least four X-linked genes encoding glucose-6-phosphate dehydrogenase (G6PD), hypoxanthine guanine phosphoribosyl transferase (HPRT), phosphoglycerate kinase (PGK1), and a monocarboxylic acid transporter (SLC16A2) show incomplete silencing in a tissue- and species-dependent manner in marsupial females [16, 17].

Significant differences exist in terms of the distribution of escape genes in human and mouse. In humans, most escape genes are located on the X short arm. One reason for this could be that the short arm has most recently diverged from the Y, and so these genes have only recently (in evolutionary terms) lost their Y paralogs [5, 6, 12]. Alternatively, the centromeric heterochromatin might exert a barrier effect that would prevent sufficient spreading of XIST RNA, which is generated from the X-inactivation center located in the long arm [18]. In contrast, escape genes are randomly distributed along the mouse X chromosome, which has its centromere located at one end [13]. In humans, escape genes are clustered (as many as 13 adjacent genes in large domains ranging in size between approximately 100 kb and 7 Mb), whereas in mouse, single genes are embedded in regions of silenced chromatin (Figure 2a). This suggests that escape from X inactivation in mouse is controlled at the level of individual genes rather than chromatin domains [12, 13, 19].

Figure 2

Silenced and escape regions have distinct chromatin marks. (a) Chromatin containing escape genes is excluded from the condensed heterochromatic body of the Xi. In mouse, individual escape genes are surrounded by inactivated chromatin. In contrast, human escape genes exist in domains comprising clusters of genes. Orange bars represent escape genes and blue bars inactivated genes. (b) Silenced chromatin in the Xi is coated by Xist RNA potentially via specific DNA motifs (green). Repressive histone modifications and histone variants (for example, H3K27me3, H3K9me3, H4K20me3, and macroH2A1) are recruited and DNA methylation modifies the CpG islands. This type of chromatin structure prevents transcription (blue bar below). In contrast, escape gene regions are enriched for permissive histone marks (for example, H3K4me3, and H3 and H4 acetylation) and RNA polymerase II (RNA pol II) and are hypomethylated at their CpG islands. Insulator sites bound by the insulator protein CTCF, together with unknown factors (as denoted by the '?'), may separate inactivated genes (blue bar) from active genes (orange bar). CTCF binding may block CpG methylation and the spread of repressive chromatin and/or may organize the chromatin into loops.

In both human and mouse, many of the genes that escape X inactivation are expressed more strongly in females. In fact, one study has identified escape genes on the basis of expression levels in women with different numbers of X chromosomes [20]. However, in both humans and mice, differences in levels of expression of the escape genes between males and females are small, indicating partial repression of the escape genes on the Xi [21, 22]. This was confirmed by measuring allele-specific expression of escape genes in humans and in mice [12, 13]. We hypothesize that the Xi allele is either partially silenced by adjacent repressive modifications or might lack modifications associated with X upregulation of the Xa. As we do not know yet what these modifications are, this hypothesis remains to be tested. It is expected that, compared with mice, men and women would demonstrate greater sex differences in X-linked gene expression as a result of the large number of escape genes. Whether such sex differences provide an evolutionary advantage remains to be explored. Possible evolutionary advantages would be, for example, higher expression in female reproductive organs or in neurological tissues, which could influence behavior. It should be noted that most studies about escape from X inactivation have been done using cell lines; thus, tissue-specific effects have not been fully addressed.
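The link between partial repression on the Xi and a small female-to-male expression difference can be made concrete with a short Python sketch. It rests on simplifying assumptions that are not taken from the cited studies: the gene has no expressed Y copy, the Xa allele is expressed at the same level in both sexes, and the hypothetical parameter `xi_relative_expression` gives Xi expression as a fraction of Xa expression.

```python
def female_to_male_ratio(xi_relative_expression: float) -> float:
    """Expected female:male expression ratio for an X-linked gene.

    Assumptions (illustrative, not from the cited studies):
    - Xa expression is normalized to 1.0 and is equal in males and females.
    - The gene has no expressed Y-linked copy.
    - xi_relative_expression is Xi output relative to Xa (0 = fully silenced,
      1 = expressed as strongly as the Xa allele).
    """
    if not 0.0 <= xi_relative_expression <= 1.0:
        raise ValueError("xi_relative_expression must be between 0 and 1")
    male = 1.0                             # one Xa
    female = 1.0 + xi_relative_expression  # one Xa plus a partially active Xi
    return female / male


# A fully silenced gene gives a ratio of 1; full escape would give a ratio of 2.
for level in (0.0, 0.25, 1.0):
    print(level, female_to_male_ratio(level))
```

Under these assumptions, a female-to-male ratio well below 2 for an escape gene implies that its Xi allele is expressed at a lower level than its Xa allele.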

Role of escape genes in disease

Escape genes play important roles in human diseases as women with a single X chromosome (X-chromosome monosomy; 45,X) have Turner syndrome, with severe phenotypes including ovarian dysgenesis, short stature, webbed neck, and other physical abnormalities [23]. In addition, as many as 99% of 45,X embryos die in utero [24]. Deficiency in escape genes is thought to play a major role in phenotypes observed in Turner patients [25]. Because the Y chromosome protects men from these deficiencies, the most likely candidate genes would have a Y copy, except for genes that control female-specific phenotypes such as ovarian failure and thus, by definition, would not affect men. So far, the pseudoautosomal gene SHOX (SHORT STATURE HOMEBOX), which encodes a homeodomain transcription factor, is the only gene directly implicated in the short-stature phenotype [26]. Interestingly, early lethality of 45,X embryos may be due to a defect in placenta differentiation, which is supported by the finding that many placental genes have much higher expression in 46,XX versus 45,X cells in differentiated human embryonic stem (ES) cells [27]. Notably, the pseudoautosomal gene CSF2RA (colony-stimulating factor 2 receptor, alpha), which encodes a receptor for a hematopoietic differentiation factor, has more than ninefold higher expression in 46,XX versus 45,X cells, suggesting that this gene may be involved in placenta differentiation defects [27]. In contrast, X0 mice have a near-normal phenotype and are fertile, although the number of oocytes is reduced, potentially as a result of the lack of sex-chromosome pairing [28]. Meiotic arrest due to lack of pairing could be attenuated in mouse compared with human single-X oocytes because of self-pairing of the X in mouse [29].

The fact that few escape genes exist in the mouse is consistent with the significant differences in the impact of X-chromosome monosomy in female mice and in women [13]. Genes that escape from X inactivation in humans but are subject to X inactivation in the mouse may be good candidates for genes responsible for Turner syndrome severe phenotypes. Pseudoautosomal genes may play a prominent role in these phenotypes, as already demonstrated for SHOX, and possibly for CSF2RA. Indeed, the mouse pseudoautosomal region contains only one gene, Sts (steroid sulfatase) [30], whereas all genes located in the pseudoautosomal region in humans are autosomal in the mouse and thus are not affected in X0 mice [31].

Another potential role for escape from X inactivation is in aging. Inappropriate reactivation of an X-linked gene, Otc, which encodes a urea cycle enzyme called ornithine transcarbamoylase, has been reported in mouse tissues [32]. Furthermore, a recent study has found epigenetic alterations including X reactivation in a mouse model of accelerated aging due to telomere shortening [33]. So far, no such reactivation of X-linked genes has been observed in humans. It will be important to determine whether environmental factors could cause inappropriate escape from X inactivation due to changes in epigenetic marks.

Chromatin modifications and escape from X inactivation

The Xi is distinguishable from its active counterpart by its epigenetic marks, including coating with Xist RNA. This is the earliest event in X inactivation during embryogenesis, and gene silencing follows within one or two cell cycles [7]. Interestingly, Xist-induced silencing can only be achieved in early differentiating ES cells, and reaches a point of irreversibility. Just how Xist RNA is spread along the Xi is still not fully understood. One hypothesis suggests that long interspersed repetitive element (L1) repeats are overrepresented on the X and may serve as 'booster' elements by anchoring Xist RNA to the chromosome, thus aiding spreading [34]. Consistent with this hypothesis, human genes that escape X inactivation have fewer L1 repeats [6, 35, 36]. These genes are also enriched in specific sequence motifs such as Alu repeats and short motifs containing ACG/CGT at their 5' ends [37]. In the mouse, another type of repeat - long terminal repeats (LTRs) - appears to be depleted on escape genes [19]. These observations imply that Xist RNA coating could be deficient at genes escaping X inactivation. This was recently demonstrated in mouse myoblasts using an RNA tagging and recovery of associated DNA (modified TRAP) method for identification of targets [38]. In this study, escapees Kdm5c and Kdm6a, which encode chromatin-modifying histone lysine demethylases, were shown to be devoid of Xist RNA coating over their promoters and transcribed regions. Conversely, genes subjected to X inactivation, and L1 repeat elements themselves, recruited Xist RNA [38] (Figure 2b). Taken together, these studies support the idea that specific DNA sequence motifs are involved in recruitment of Xist RNA to the Xi.

While Xist RNA coating is important in the initiation of X inactivation, many other epigenetic modifications follow to silence the X and maintain silencing. An early repressive chromatin mark, tri-methylation of lysine 27 on histone H3 (H3K27me3), is deposited by the Polycomb complex of chromatin-modifying proteins, resulting in compaction of the silenced portion of the Xi (Figure 2a). Other repressive marks include H3K9me3 and the histone variant macroH2A1, which are also enriched on the Xi (Figure 2b) [7, 39]. Concomitantly, 'active' marks such as acetylation of histone H3 and H4 are lost from the silenced chromatin [7, 40]. Modifications characteristic of silenced genes contrast with those within escape genes, which remain euchromatic and harbor histone H3 and H4 acetylation [7, 41]. H3K4me3, another mark associated with transcriptional activity, is absent from most of the Xi except at discrete regions corresponding to areas of escape, as shown in female lymphoblasts [42] (Figure 2b). We recently demonstrated a lack of H3K27me3 at escape genes in mouse, in complete concordance with escape status in the cell line used to assay allelic expression [13].

The existence of discrete areas of 'escape chromatin' adjacent to silenced chromatin suggests the need for boundary elements, such as insulator sequences, that may block the spreading of heterochromatin into escape regions or prevent repressive marks from being added to escape domains (Figure 2). Supporting this idea are our findings that the insulator protein CTCF (CCCTC-binding factor), which binds known insulator sequences, binds to the transition region between the escape gene Kdm5c and the inactivated gene Iqsec2 (IQ motif and SEC7 domain-containing protein 2) in mouse, whereas in humans, the corresponding region between the same genes, which both escape X inactivation, does not bind CTCF [43]. Furthermore, we have found that the CpG island at the 5' end of Kdm5c remains hypomethylated throughout mouse development, possibly because it is rendered inaccessible to DNA methyltransferases by CTCF binding (Figure 2b). CTCF-binding sites were also identified in other transition areas between escape and inactivated genes, suggesting that CTCF may play a role in the insulation of escape domains [43]. However, a subsequent study showed that insertion of CTCF-binding sites from the HS4 insulator site (from the chicken β-globin gene cluster) at each end of a short reporter gene was not sufficient to protect it from silencing when inserted within an inactivated gene on the Xi in mouse cells [44]. A more recent study reported that a bacterial artificial chromosome clone containing Kdm5c and its flanking regions retains its properties of escape even when inserted at other sites that are normally inactivated on the Xi in mouse cells [45]. CTCF-binding sites may turn out not to be sufficient for insulation, and other elements within or around escape genes may be important.

In particular, the structure of chromatin may have an important role in insulation by looping specific regions out of the condensed Xi (Figure 2a) [46]. Our recent X-chromatin profiles show a discontinuous distribution of the repressive chromatin mark H3K27me3 along the Xi, consistent with the presence of insulator elements and/or specific attachment sites for looped chromatin [13]. However, in human × mouse hybrid cell lines, where the human X can be distinguished from the rodent background, repressive chromatin marks were found to be progressively diminished in the intergenic region between the inactivated RBM10 (RNA-binding motif protein 10) and the escape gene UBA1/UBE1 (ubiquitin-like modifier activating enzyme). Specifically, H3K9me3 and another histone modification associated with gene silencing, H4K20me3, were enriched in the last RBM10 exon but were already depleted approximately 2 kb upstream of UBA1/UBE1 [41].

Escape from X inactivation can vary between different tissues and/or individuals and the escape status can also be developmentally regulated. In humans, about 10% of X-linked genes show variation in escape in different tissues and/or individuals [12, 47]. Some escape genes may have a different chromatin structure throughout development, as suggested by the lack of promoter-restricted H3K4me2 in undifferentiated ES cells before X inactivation [48]. Other escape genes may be initially silenced, and only reactivate in some tissues or with aging [33]. Individual cells may also vary: in an analysis of single-cell allelic expression of Kdm5c in mouse, significant silencing in individual embryonic cells was observed in contrast to consistent expression from both alleles in adult cells [49]. Differences in H3K27me3 enrichment on some genes in a tissue- and developmental-stage-specific manner also suggest variability in escape [13]. For example, enrichment in H3K27me3 along Mid1 (midline 1) in mouse embryos but not in adult liver suggests removal of the repressive mark in a tissue-specific manner. It is possible that the recently identified histone demethylases KDM6A and KDM6B may facilitate the removal of H3K27me3 at escape genes [50–52].

Escape from early imprinted paternal X inactivation

Imprinted X inactivation silences the paternal X during the preimplantation stage (see Table 1). This imprinting is reversed in the inner cell mass, and is followed by random X inactivation [7]. It is not known whether imprinted X inactivation occurs in humans and the mechanisms for imprinted X inactivation in mice are still unclear. Are there genes that escape the initial imprinted X inactivation? Several recent studies have addressed this question by profiling transcriptional activity from the paternal X during early development. A specific set of genes apparently does escape imprinted X inactivation at the two-cell stage [53, 54]. However, another subset of genes shows a variable escape status during development and in a lineage-specific manner. For example, Huwe1 (HECT, UBA and WWE domain containing 1) shows no evidence of silencing during pre-implantation stages but is efficiently silenced after implantation, whereas Kdm5c is partially inactivated during the preimplantation stage but escapes fully throughout the rest of development, and Atrx (alpha thalassemia/mental retardation syndrome X-linked) is expressed from both alleles in extraembryonic ectoderm but not in trophectoderm (the precursor of some extraembryonic tissues in the preimplantation embryo), or in later embryos [13, 49, 53].

Escape from male-specific meiotic sex-chromosome inactivation

In male spermatogenesis, yet another type of X-chromosome silencing takes place - MSCI [55] (see Table 1). Unlike X inactivation in female somatic cells, where extensive analyses have catalogued the proportion of genes that escape silencing, no such study has been done so far for MSCI. However, the permissive mark H3K4me3 is present in discrete regions of the X in mouse pachytene spermatocytes. Furthermore, immunofluorescence staining for RNA polymerase II in these cells revealed several regions of transcriptional activity, suggesting areas of escape from MSCI [42]. Another study revealed that up to 86% of the 72 known X-encoded miRNAs escape MSCI at different times during spermatogenesis. Some of the miRNAs were upregulated during MSCI and either downregulated or maintained in the context of postmeiotic sex chromatin [14]. Recent evidence suggests that repression of the X chromosome due to MSCI persists, at least in part, into the mature sperm [56], which could be important for suppression of oogenesis-specific genes and/or dosage compensation by potentially enabling transmission of a partially inactivated paternal X [57]. However, not all sex-linked genes remain inactivated following MSCI and evidence points to maintenance of post-meiotic X-chromosome repression being incomplete. In fact, about 18% of X-linked genes, especially multicopy genes, are expressed in postmeiotic cells [58].

X inactivation is an important process required to balance gene dosage in males and females. Equally important are those genes that escape X inactivation. Why is there a far greater number of X-linked genes that escape X inactivation in humans than in mice? Not only does the number of escape genes differ but also their location. Human escape genes exist in large domains of escape whereas mouse escape genes are scattered along the X chromosome. Their location in recent evolutionary strata in humans suggests a major role of sex chromosome evolution in the retention of escape genes. However, their retention may also be linked to their inherent ability to cause sex-specific differences in gene expression levels. We propose that the complexity of dosage compensation in mammals, which involves X upregulation, X inactivation, and escape from X inactivation, may have specific advantages in providing opportunities to modulate gene expression between the sexes in specific tissues. This may be especially advantageous in reproductive organs. Whether sex differences do lead to physiological effects remains to be determined. Specific epigenetic mechanisms may have evolved to ensure maintenance of escape from X inactivation. These may include the accumulation of repeats and DNA motifs to recruit or repel the silencing complex, as well as specific boundary elements. Future studies are needed to further characterize the chromatin structure of escape domains and to understand their role in evolution.