Background

Due to suppression of X-Y recombination, the eutherian X chromosome has not undergone major reorganization for over 100 million years and retains an ancestral state [1–3]. Our ability to identify the chromosomal components that gave rise to the X chromosome prior to the mammalian radiation has been limited both by the incomplete state of avian genome assemblies and by the ancestral teleost whole-genome duplication and subsequent chromosome reshuffling that occurred during the long (>400 MY) period since divergence of amniote and fish lineages [4].

Within Theria, the human X chromosome long arm and a proximal portion of the short arm correspond to genes on the marsupial X chromosome. This domain, the X-conserved region (XCR), is shared by sex chromosomes of all live-bearing mammals. In contrast, the remainder of the human X short arm, the X-added region (XAR), is autosomal in marsupials [5], and was translocated to eutherian sex chromosomes between the divergence of marsupials and placental mammals ~148 million years ago and the eutherian radiation ~100 million years ago [6]. The most basal extant mammal, the egg-laying monotreme platypus, has five pairs of X and Y chromosomes, but these show no homology to the human X. Rather, platypus autosome 6 shares synteny with the entire therian XCR [7, 8], including the SOX3 gene from which the testis-determining gene SRY evolved, consistent with this part of the genome being the progenitor of X and Y [9]. The XAR maps to platypus autosomes 15q and 18p.

Broader comparisons within amniotes show that the human XAR is co-linear with a region on chicken chromosome 1, with much of XCR syntenic to the short arm of chicken chromosome 4 [10]. Controversially, another analysis detected synteny between two XCR regions and chicken chromosome 12 plus several microchromosomes, suggesting a third building block in genesis of the human X chromosome [11]. However, subsequent studies argue that putative X orthologs on chicken chromosome 12 and microchromosomes are actually paralogs, and true orthologs of many genes, especially at the border between XCR and XAR, are missing from the current chicken genome assembly [12].

Since the chicken genome assembly remains incomplete, and the duplicated genomes of teleosts have experienced frequent linkage disruptions [13] fragmenting their X chromosome orthology, a different outgroup is required to elucidate tetrapod chromosomal evolution. Recently, the amphibian Xenopus tropicalis has been the subject of a genome sequence assembly [14] and a meiotic linkage map [15]. The genome of X. tropicalis, unlike those of teleost fish and other Xenopus frogs, displays a canonical diploid vertebrate organization, preserving a high degree of synteny to amniote genomes [14]. The present study uses the X. tropicalis genome assembly and linkage map in combination with cytogenetic localization to clarify the deep evolutionary origin of the mammalian X chromosome.

Results and discussion

Homologies between human X and X. tropicalis chromosomes

We identified putative orthologs of human X chromosome genes in the X. tropicalis genome assembly, and obtained the chromosomal locations of 454 of these in two ways. Many X ortholog-containing sequence scaffolds could be directly assigned to linkage groups/chromosomes using the meiotic map. Cytogenetic locations of a subset of these genes, as well as X orthologs from scaffolds not represented on the meiotic map, were also determined by fluorescence in situ hybridization (FISH). In total, 442 (97%) of these X orthologs were found on chromosomes 2 and 8 (Additional file 1); the remaining 12 orthologs are scattered throughout other X. tropicalis chromosomes. Intriguingly, many of the scaffolds that had not been localized by genetic mapping were placed by FISH on the short arm of chromosome 2, which is known to be missing from the published meiotic linkage map (Additional file 2) [15]. The known positions of scaffolds containing human X orthologs are displayed in Figure 1.

Figure 1

Positions of scaffolds containing orthologs of human X chromosome genes in Xenopus tropicalis chromosomes. Of 454 amphibian orthologs of human X-borne genes identified in this study, 442 (97%) localized to X. tropicalis chromosomes 2 and 8. Mosaic distribution of X orthologs (blue = XAR; green = XCR) suggests internal rearrangements after chromosomal fusions. cM – centimorgans, rdc – relative distance from centromere, cen – centromere.

On X. tropicalis chromosomes 2 and 8, scaffolds containing large blocks of human X orthologs are interrupted by gene clusters corresponding to other human chromosomes (Additional file 1). Chromosome 2 contains nearly all the XAR genes (134), with 306 XCR orthologs found on chromosome 8 (Figure 2, Additional file 1). These results confirm the remarkable evolutionary conservation of chromosomal content noted in the genome assembly analysis [14], despite some bias due to easier identification of frog orthologs in synteny blocks where neighbouring gene identities are also conserved. The exceptions to the XCR and XAR conservation are two XCR genes found on chromosome 2 (scaffold 422). This is likely to result from a translocation in Amphibia, since the chicken and opossum orthologs of these two genes reside as expected on chromosome 4 and the X chromosome, respectively. The XAR–XCR boundary is located between the RGN and PCTK1 genes, an interval containing the NDUFB11–RBM10 border previously suggested by human-marsupial comparisons [16]. The X. tropicalis sex-determining locus has been mapped [17] to the neighbourhood of scaffolds 494, 605 and 735 on chromosome 7, and does not appear to be linked to amphibian orthologs of human X-borne genes.
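The counts quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes that the 442 orthologs on chromosomes 2 and 8 break down into the 134 XAR orthologs on chromosome 2, the 306 XCR orthologs on chromosome 8 and the two XCR genes on scaffold 422, a breakdown the text implies but does not state outright. The variable names are hypothetical.

```python
# Ortholog counts as reported in the text. Splitting the 442 chromosome 2/8
# orthologs into these three groups is an assumption made for this check.
xar_on_chr2 = 134
xcr_on_chr8 = 306
xcr_on_chr2 = 2      # the two exceptions on scaffold 422
elsewhere = 12       # orthologs scattered over other chromosomes

on_chr2_or_chr8 = xar_on_chr2 + xcr_on_chr8 + xcr_on_chr2
total_mapped = on_chr2_or_chr8 + elsewhere
fraction = on_chr2_or_chr8 / total_mapped

print(on_chr2_or_chr8, total_mapped, f"{fraction:.1%}")  # 442 454 97.4%
```

Under that assumption the three groups sum to the reported 442 of 454 mapped orthologs, which rounds to the 97% given above.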

Figure 2

Regions of homology between human X and Xenopus tropicalis, Gallus gallus and Monodelphis domestica chromosomes. Only gene blocks larger than 0.7 Mbp are shown. Data are from Additional file 1 and the Comparative Genomics display of the Ensembl database [18]. cen – centromere, chr. – chromosome.

Is there a third evolutionary stratum in human X?

Comparisons of human X with the chicken genome have reached differing conclusions. The third evolutionary stratum on the human X identified by Kohn et al. [11] consists of the gene-rich regions Xp11 and Xq28, which were apparently conserved with chicken chromosome 12 and microchromosomes. However, these putative syntenic regions in the chicken genome have since been shown to consist of avian paralogs, with many true orthologs present in EST collections or unanchored contigs but missing from the chicken genome assembly [12], challenging the hypothesis of the third evolutionary stratum.

Our data strongly support a common ancient origin for the entire XCR. We identified 33 orthologs of Xq28 between BGN and IKBKG in the frog genome, of which 32 localize to chromosome 8 together with the remainder of the XCR (Additional file 1). Similarly, 21/22 Xp11 orthologs between GPKOW and FAAH2 localized to frog chromosome 8. Between RGN and GPKOW in Xp11, synteny comparisons among amniotes have been problematic. While this region is well represented on the opossum X chromosome, data from other species comprise only six platypus genes from chromosome 6 and a mixture of orthologs and paralogs from chicken chromosomes 1, 12 and 4 [19]. We used FISH to obtain cytological locations for 13 orthologous frog genes from this area, with all placing to the short arm of chromosome 8.

Overall, we were able to locate putative frog orthologs of 68 human Xp11 and Xq28 genes. 66 of these are located on chromosome 8 together with the rest of the XCR orthology, indicating that the whole mammalian XCR shares ancestry with a single X. tropicalis chromosome. Delbridge et al. [12] also hypothesized that the Xp11 and Xq28 regions could have arisen from an ancient genome by segmental duplication, since paralogous regions exist on single autosomes in human, rat, opossum and chicken. Orthologs of human genes from both Xp11 and Xq28 were found together in the same frog scaffolds (154, 456, 507 and 690) as shown in Additional file 1. This is consistent with the Xp11 and Xq28 regions being located near each other deep in evolution, followed by segmental duplication before divergence of amniotes and amphibians.
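The totals of 68 located and 66 chromosome 8 orthologs appear to be the sum of the three Xp11 and Xq28 intervals described above. The following sketch checks that reading. It assumes the three intervals do not overlap and together make up the 68 genes, which the text does not state explicitly. The interval labels are informal shorthand.

```python
# (orthologs located, orthologs on frog chromosome 8) per human X interval,
# taken from the text. Treating the three intervals as a partition of the
# 68 Xp11/Xq28 orthologs is an assumption.
intervals = {
    "Xq28, BGN to IKBKG": (33, 32),
    "Xp11, GPKOW to FAAH2": (22, 21),
    "Xp11, RGN to GPKOW (FISH)": (13, 13),
}

located = sum(n for n, _ in intervals.values())
on_chr8 = sum(k for _, k in intervals.values())

print(located, on_chr8)  # 68 66
```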

X chromosome deep evolution

Kohn et al. [11] have already suggested that XCR existed as an individual autosome in an amniote ancestor, because it persists as the single chromosome 4A in birds (except Galliformes) [20–22]. As mentioned above, this ancestral autosome likely acquired the sex-determining gene SRY after divergence of Prototheria and Theria, then fused with the XAR in the eutherian lineage. The distribution of human orthologs in frog chromosomes supports this single-chromosome origin for the XCR. In X. tropicalis, chromosome 8 contains not only XCR homology, but also homology to other human chromosomes (Figure 1). This suggests that in the amphibian lineage, the putative ancestral XCR fused with another autosome to form an initial frog chromosome 8, which was then reshaped by intrachromosomal rearrangements.

The history of the XAR is more complex. In all non-eutherian vertebrates studied, the regions corresponding to the XAR do not exist as separate cytological entities, but are present within chromosomes, surrounded by other conserved gene blocks that are autosomal in eutherians [8, 14, 23]. In order to trace the broader chromosomal context of XAR evolution, we examined homology of these nearby gene blocks in non-eutherian vertebrate genomes. Regions surrounding the identified XAR homology on opossum chromosomes 4 and 7, chicken chromosome 1, and frog chromosome 2 were compared to the human genome; incomplete genomic data for wallaby, platypus and the anole lizard preclude synteny analysis. Strikingly, these XAR-neighbouring regions of opossum chromosomes 4 and 7 showed coherent and complementary stretches of homology to parts of human chromosomes 2, 3, and 13 (Figure 3 and Additional file 3) previously hypothesized to derive from fission of a single predecessor [8, 23]. The homology of these three human autosomes to both opossum 4 and 7 allows us to trace the genesis of the XAR in mammals. Localized human genome homology to both marsupial autosomes strongly supports a single pre-XAR chromosome, whose gene content was nearly identical to opossum chromosomes 4 and 7, which underwent a simple fission event to give these two autosomes in the marsupial lineage (Figure 4, second row). Human chromosomes 2, 3, and 13 show homology to both opossum chromosomes 4 and 7, and thus identify breakpoints in chromosomal rearrangement events following the divergence of marsupials from Eutheria. Human chromosomes with homology to either opossum chromosome 4 or 7, but not both (human chromosomes 11, 13, 15, and 21, Figure 3) are less informative since they do not evince breakpoints. 
The most parsimonious way to obtain the observed arrangement of homologies (including three breakpoints) in the eutherian lineage is a single internal translocation or inversion event in the pre-XAR, followed by fragmentation of the pre-XAR and fusion with XAR to form the eutherian X and autosomes (Figure 4, top row).

Figure 3

Regions of human chromosomes homologous to frog, chicken and opossum chromosomes. Opossum chromosomes 4 (orange) and 7 (purple) both show regions of homology on single human chromosomes 2,3, and 13, supporting their origin from a single ancestral chromosome. Amphibian (blue) and bird (grey). Figure summarizes data from Additional file 3. cen – centromere.

Figure 4

Proposed origin of X-added region of human X chromosome. A protochromosome (blue, left), present in the progenitor of Synapsida and Sauropsida, fused with several chromosomes (pink) to form the precursor of the eutherian X-Added Region (‘Pre-XAR’), now extant as most of opossum chromosomes 4 and 7. The Pre-XAR subsequently fragmented and fused with the therian X-Conserved Region (XCR and opossum X, yellow) to produce the eutherian X chromosome, as well as with other partners (unshaded) contributing to autosomes. Fusion partners of the proto-XAR in amphibian (green) and avian (orange) lineages are also shown.

Our comparison of tetrapod genomes supports the following model for X evolution (Figure 4). The pre-XAR ancestral chromosome (see Figure 4, pink and blue second tier from top) can be defined as the sum of opossum chromosomes 4 and 7 (minus a region of chromosome 4 orthologous to human chromosome 19, which may represent a subsequent marsupial-specific fusion event). Differences in gene block order between human and opossum in this region suggest that in the period following divergence of marsupials but prior to the eutherian radiation, rearrangements inside the pre-XAR could have taken place. Further evolution of the eutherian karyotype then involves fragmentation of the pre-XAR chromosome, with the mature XAR joining the XCR (Figure 4, yellow) to complete the mammalian X, and the remaining pre-XAR fragments contributing to human chromosomes 1, 2, 3, 11, 13, 15 and 21.

Analysis of synteny data from frog and chicken genomes (Additional file 3) shows that deeper in evolution, regions corresponding to the pre-XAR share almost identical gene blocks with each other, but contain substantially fewer genes than the therian pre-XAR. We therefore infer the existence of a single proto-XAR chromosome, ancestral to the pre-XAR, in the progenitor of Synapsida and Sauropsida (Figure 4, blue). In the amphibian lineage, the proto-XAR region probably fused with another chromosome (Figure 4, green) to form frog chromosome 2. In birds, the proto-XAR now forms a major portion of chicken chromosome 1, plus a small region of chicken chromosome 23 (homologous to part of human chromosome 1 derived from the pre-XAR). Future availability of a more detailed anole genome may help identify differences in chromosomal evolution between birds and other lines of Sauropsida.

Fusion partners of the proto-XAR (Figure 4, pink), identified by their presence in opossum chromosomes, are found in chicken chromosomes 7, 9, 21 and 24 (Additional file 3). In therians, these fusions formed the pre-XAR, which then fragmented to give rise to large areas of human chromosomes 2, 3 and 11 homologous with opossum chromosomes 4 and 7. The structure of the human XAR differs from the corresponding part of the proto-XAR retained in frog by only a single translocation associated with inversion. Orthologs from frog scaffold 253 lie at the start and end of the XAR (Additional file 1), while their counterparts form a continuous region of chromosome 1 in chicken.

Evolution of gene content

The human X chromosome is highly enriched for reproduction- and brain-related genes [24–26]. However, the human genome project detected negligible gene movement to the human X chromosome from autosomes [27], and brain-related genes on the human X and syntenic chicken chromosomes share an ancient origin [28]. Our analysis confirms minimal gene traffic from other chromosomes onto the mammalian X, as only 1.5% of human X chromosome single protein-coding genes are found on X. tropicalis chromosomes other than chromosome 2 or 8, although identification of new frog orthologs could affect this ratio.
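One reading consistent with numbers given elsewhere in this paper is that the 1.5% figure is the 12 orthologs mapped outside chromosomes 2 and 8 divided by the 809 human X genes evaluated (182 from the XAR plus 627 from the XCR, see Methods). The text does not state how the ratio was computed, so the choice of numerator and denominator below is an assumption made for illustration.

```python
# Gene counts from the Methods and Results sections. Using all evaluated
# human X genes as the denominator is an assumption, not a stated method.
xar_genes = 182
xcr_genes = 627
off_target = 12      # orthologs on frog chromosomes other than 2 and 8

evaluated = xar_genes + xcr_genes
share = off_target / evaluated

print(evaluated, f"{share:.1%}")  # 809 1.5%
```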

Segmental duplications resulting in tandemly arrayed genes (TAGs) are a source for emergence of new genes in mammalian and primate evolution [29–31]. Selection pressure sometimes results in repeated duplication of multigene segments. In yeast, the number of tandemly repeated units containing genes for metallothionein and an unrelated gene [32] increases or decreases via non-reciprocal recombination in response to intensity of selection by copper. An intriguing convergent feature of human X and chicken Z chromosomes is the presence of TAGs with elevated expression in testis, while expression of single-copy conserved genes shows no sex bias [19]. These findings point to a central role for TAGs in evolution of human X chromosome gene content.

In the case of the human X chromosome, 5 of 15 tandemly arrayed multigene families have single known orthologs in the X. tropicalis genome (Additional file 1). For example, the MAGE superfamily is represented by a single gene on X. tropicalis chromosome 8 [33]. The same frog chromosome bears single orthologs of the SAGE1 and CT45 families. The gene families BEX, TCEAL (WEX), NXF and GPRASP (GASP) evolved by gene conversion from a common ancestral GPRASP-like gene [34, 35]. The frog genome contains a single ortholog (ENSXETG00000019743) of the ARMCX- and GPRASP-related gene family on chromosome 3, implying the existence of a new superfamily located as a single block in human X. In addition to such single-gene ancestors of amniote TAGs, ancient and conserved TAG clusters such as the ARSD family (Additional file 1) are also seen in the X. tropicalis genome.

Conclusions

The comparison of amphibian and amniote genomes presented here traces the constituents of the human X chromosome back more than 300 million years to the common ancestor of the tetrapod lineage. Chromosomal fusion partners and breakage events giving rise to the X-conserved and X-added regions and other domains can be inferred from extant genomic and cytogenetic evidence. This analysis demonstrates robust conservation of these chromosomal blocks and unambiguously confirms a 2-component model for the origin of the eutherian X chromosome.

Methods

Homo sapiens Genome Build 37.1 [36] served as the source of the human gene list. We excluded pseudogenes, gene models, microRNAs and miscellaneous RNAs from the evaluation, leaving 182 genes in the human XAR and 627 from the XCR. X. tropicalis orthologs were then identified individually in Ensembl [37] and Xenbase [38] databases. Xenbase, the principal source of X. tropicalis orthologs, contains 4705 manually annotated and 10,833 machine-annotated gene pages. Entries on the gene list were based on e-values of 1e-10 with a minimum 55% identity and 65% coverage [39]. Chromosomal locations of some scaffolds (JGI X. tropicalis genome assembly 4.1) containing identified orthologs were obtained from the existing X. tropicalis linkage map [15]. Information about blocks that are homologous between human, opossum and chicken (Additional file 3) is from the Comparative Genomics display of the Ensembl database [18]. Orthologs of human genes in gaps between synteny blocks were identified in databases [37, 38] and the X. tropicalis linkage map [15]. Some families of human duplicated genes have single known ancestral orthologs, which were only counted once. In total, we were able to identify chromosomal locations in the X. tropicalis genome for 454 human X orthologs.
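The ortholog acceptance criteria quoted above (e-value of 1e-10, at least 55% identity, at least 65% coverage) can be expressed as a simple filter. The sketch below is a minimal illustration, not the pipeline used by Xenbase or in this study. It assumes the e-value is an upper bound, and the `Hit` record, its field names and the example hits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one human-to-frog similarity hit."""
    human_gene: str
    frog_gene: str
    evalue: float
    identity_pct: float
    coverage_pct: float


# Thresholds from the Methods text; treating 1e-10 as a maximum is an assumption.
MAX_EVALUE = 1e-10
MIN_IDENTITY_PCT = 55.0
MIN_COVERAGE_PCT = 65.0


def passes_thresholds(hit: Hit) -> bool:
    """Return True only if the hit meets all three criteria."""
    return (
        hit.evalue <= MAX_EVALUE
        and hit.identity_pct >= MIN_IDENTITY_PCT
        and hit.coverage_pct >= MIN_COVERAGE_PCT
    )


# Made-up hits for illustration only; these are not data from the study.
hits = [
    Hit("GENE_A", "frog_a", 1e-50, 78.0, 92.0),  # meets all criteria
    Hit("GENE_B", "frog_b", 1e-8, 80.0, 90.0),   # e-value too high
    Hit("GENE_C", "frog_c", 1e-30, 50.0, 95.0),  # identity too low
    Hit("GENE_D", "frog_d", 1e-12, 60.0, 40.0),  # coverage too low
]

kept = [h.human_gene for h in hits if passes_thresholds(h)]
print(kept)  # ['GENE_A']
```

Requiring all three criteria at once keeps short, highly similar fragments and long, weakly similar alignments out of the candidate list.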

For certain X. tropicalis sequence scaffolds not represented on the current linkage map, we also obtained cytological locations by fluorescence in situ hybridization coupled with tyramide signal amplification (FISH-TSA), using selected scaffold-specific cDNA probes [40, 41]. Probes for chromosomal in situ hybridization were generated from cDNAs of frog orthologs of human X chromosome genes described in Additional file 4. Images of chromosomes visualized with the fluorophores tetramethylrhodamine and diamidinophenylindole were collected at two different wavelengths (U-MWV and U-MWIY filters) on an Olympus BX40 microscope with a 100x objective and a Sony SPT-M320CE camera. Contrast and brightness were adjusted using the ACC program (Sofo, Brno) and the images were merged in pseudocolor.