In experiment 1, we used RNAseq to survey gene expression in samples of XY male urogenital ridge across embryonic days 11–14 (E11–E14), with low sample coverage at each embryonic day, to give a preliminary assessment of the timing of expression of Sry and other genes known to be involved in testis differentiation in other mammalian species. Based on those results, E13 was identified as the time with the highest expression of Sry. In experiment 2, we performed a well-powered analysis of gene expression at E13 in the testis alone, without contamination from adjacent non-gonadal tissues. Variant mapping between the different copies of Sry then allowed us to determine the percentage of total Sry that each copy contributes, which was used to apportion the total Sry mapping among the specific copies.
Differentially expressed genes
Principal components analysis of the RNAseq samples shows clustering along principal components (PC) 1 and 2, with E11 samples clustering separately from E12/13 samples (experiment 1, Fig. 1a); PC1 stratifies the two experiments and PC2 stratifies day of embryonic development. The samples from E13 (21 ts) of experiment 2 cluster differently from those of experiment 1 (Fig. 1a). This is likely due to the difference in tissues sampled in the two experiments. Previous studies have shown that testis differentiation begins at E12 in rats [24]. Preliminary gene analysis comparing E11 (n = 2) and E12/E13 samples (n = 3) from experiment 1 using DESeq2 (adjusted p < 0.01) revealed significant gene ontology enrichment, relative to the entire genome, for genes involved in gonadal development (12 genes: Amhr2, Bcl2, Cga, Cyp1b1, Inhbb, Lhx9, Mgst1, Nr0b1, Nr5a1, Ren, Sohlh2, Wt1, FDR = 0.0015), kidney development (9 genes: Agtr2, Egr1, Hrsp12, Hspa8, Nphs2, Ren, Sdc4, Sulf1, Wt1, FDR = 0.023), and core histones (11 genes: ENSRNOG00000034127, Hist1h2aa, Hist1h2an, Hist1h2bd, Hist1h2bf, Hist1h2bk, Hist1h2bo, Hist1h3c, Hist2h2ab, LOC690131, rCG_23123, FDR = 7.8 × 10⁻⁷). Because of the differential clustering of E12/13 relative to either E11 or E14, we plotted two comparisons, E12/13 vs E11 (x-axis) and E12/13 vs E14 (y-axis), identifying multiple genes that, like Sry, have increased expression on E12/13 relative to either E11 or E14 (Fig. 1b, p < 0.001, group 1); genes that are higher on all days except E11 (Fig. 1b, p < 0.001, group 2); and genes that are lower on E12/13 than on either E11 or E14 (Fig. 1b, p < 0.05, group 3). Although these measurements are preliminary given the low sample size, numerous genes known from other studies to be activated at testis differentiation showed the expected time course (Fig. 1c), suggesting that E13 was the optimal age for further analysis with a higher sample size in experiment 2.
Because of the low sample size at each age, the absolute values of gene expression in experiment 1, and the apparent elevation of several genes at E12/13, require replication in future studies.
The measurement of gene expression exclusively from gonadal tissue at E13 (21 ts) in experiment 2 suggests that many of the genes unique to the E12/13 groups originate from the genital ridge and not from the mesonephros (Fig. 1b, colored dots). Several markers of sexual differentiation of the gonads fall within the groups whose expression begins to increase at E12 in experiment 1 (Sry, Sox9, Wt1, Amhr2, Cyp1b1, Lhx9, Mgst1, Nr0b1, Nr5a1; Fig. 1c), and experiment 2 (21 ts) further confirms the presence of these markers. Amh expression increased later, at E14. The ovary markers Foxl2 and Rspo1 appeared to be downregulated by E14 (Fig. 1c). Our bulk dissection across developmental days (experiment 1) confirms that E13 corresponds to the beginning of testis determination. Fine dissections of 21 ts embryos (n = 5) validate the presence of Sry, allowing for repeated independent assessments of Sry copies at testis differentiation.
Expression of different Sry genes
Expression of Sry increased at E12 and E13, correlating with the initiation of testis differentiation, which suggests that these RNAseq runs have the power to segregate copies of different Sry genes by sequence during this period. We began by measuring the read depth for Sry in each of the RNAseq files (Fig. 2a). A total of 16 variants were identified to differentiate the Sry genes (Fig. 2b), 10 of which were found within the protein-coding region (Fig. 2c). Compiling all reads of Sry genes across all RNAseq runs, the average depth at any site within the gene is 563 reads, with > 150 read coverage at nucleotide positions 9–779 (Fig. 2d), where 1 is the first base of the transcript. In the compiled data, 25 locations show base variation within the reads, and all of our markers show variation (Fig. 2e) and high coverage (197–1007 reads). We removed markers that fall below a read depth of 150, which excluded many variants at the 3′ end of the transcript. Using the remaining markers, we calculated the percent of Sry reads that map to each of the genes. At E11 (experiment 1), prior to testis differentiation, the ubiquitously expressed Sry2 represented 100% of reads (Fig. 2f), even though the absolute number of reads was low at this time point (Fig. 2a, gray and black). Over the period E12–E14 (experiment 1), we identified Sry1, Sry3C, Sry4A, and “other Sry3s” (transcripts for which reads did not allow further assignment to specific Sry3 genes) (Fig. 2f). Analysis of data from E13 in experiment 2 confirmed that Sry4A was expressed at the highest level, together with other Sry3s, Sry2, Sry3C, and Sry1 (Fig. 2g). Very low levels of expression on all days were found for Sry3A, Sry4, and the nonHMG Sry, suggesting these genes have minimal involvement in testis differentiation.
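The marker filtering and percent calculation described in this paragraph can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name, the input layout (per-marker read counts already assigned to a copy label), and the pooling of reads across markers are assumptions made for illustration. Only the 150-read depth cutoff is taken from the text.

```python
# Hypothetical sketch: the data layout and pooling rule are assumptions,
# not the method used in the study.
MIN_DEPTH = 150  # read-depth cutoff stated in the text


def copy_percentages(marker_counts, min_depth=MIN_DEPTH):
    """Estimate the percent of Sry reads assigned to each copy.

    marker_counts maps a marker position to a dict of
    {copy_label: number of reads carrying that copy's allele}.
    Markers whose total depth is below min_depth are ignored.
    """
    totals = {}
    for position, counts in marker_counts.items():
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # drop low-coverage markers
        for copy_label, reads in counts.items():
            totals[copy_label] = totals.get(copy_label, 0) + reads

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # no marker passed the depth filter
    return {label: 100.0 * reads / grand_total
            for label, reads in totals.items()}


# Made-up counts for illustration only (not data from the study).
example_markers = {
    120: {"Sry2": 100, "Sry4A": 300},  # depth 400, kept
    450: {"Sry2": 50, "Sry4A": 150},   # depth 200, kept
    900: {"Sry2": 90, "Sry4A": 10},    # depth 100, removed by the filter
}
example_percent = copy_percentages(example_markers)
```

With these made-up counts, the third marker is excluded by the depth filter and the remaining reads are split 25% to 75% between the two labels.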
Normalizing the globally mapped Sry TPM value of each RNAseq dataset by the percent composition of each copy shows relatively high values for Sry4A, other Sry3s, Sry3C, and Sry1 starting at E12, continuing at E13, and decreasing at E14 (Fig. 2h).
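As a worked illustration of this normalization, the sketch below splits one sample's total Sry TPM across copies in proportion to their percent composition. The function name and the numbers are hypothetical and are not taken from the study.

```python
# Hypothetical sketch: names and values are illustrative assumptions.
def per_copy_tpm(total_sry_tpm, percent_by_copy):
    """Apportion a sample's total Sry TPM among copies.

    percent_by_copy maps a copy label to its percent of Sry reads
    (values expected to sum to about 100).
    """
    return {label: total_sry_tpm * pct / 100.0
            for label, pct in percent_by_copy.items()}


# Made-up example: a total Sry TPM of 40 split 25% / 75%.
example_tpm = per_copy_tpm(40.0, {"Sry2": 25.0, "Sry4A": 75.0})
```

In this made-up case the two labels receive 10 and 30 TPM, so the per-copy values always sum back to the sample's total Sry TPM when the percentages sum to 100.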