Introduction

Speciation is a complex process resulting from the divergence of two populations from an ancestral lineage by reproductive barriers capable of preventing gene flow1,2. Among these barriers, postzygotic isolation mechanisms contribute to hybrid incompatibility, and their consequences can be observed by the presence of two main traits, hybrid sterility and/or hybrid inviability, which can evolve at different rates. Overall, hybrid sterility evolves faster than hybrid inviability, mainly when considering the different sexes, since these barriers evolve faster in males than in females3. Indeed, several studies considering intraspecific and interspecific hybridization have shown that the germline is primarily affected, and sterility is often detected4,5,6,7,8,9,10. Although the process of speciation has been widely studied, the causes of postzygotic incompatibility in hybrids are not fully understood. According to the model of Dobzhansky–Muller11,12, fixed mutations in genetically isolated populations can result in deleterious epistatic interactions, disrupting regulatory networks and leading to serious consequences in hybrids2,13,14,15. In addition, according to Haldane’s rule16, sterility in hybrids is more likely to affect the heterogametic sex, and the degree and extent of these genetic incompatibilities are related to the time of divergence and are likely the result of divergent regulatory sequences13,15,17,18.

Most of the hybrid incompatibility genes identified so far seem to have species-specific effects leading to hybrid incompatibility, indicating that this is a complex polygenic trait that has different epistatic effects, depending on the pair of species, which is likely related to the rapid accumulation of genetic changes over time. Studies of recently diverged species presenting different phenotypes regarding postzygotic isolation mechanisms can help to clarify early disruptions in gene regulation and expression that may influence the speciation process. Drosophila mojavensis and D. arizonae are appropriate for such a study. They are sibling species with divergence time estimates from 0.66–0.99 mya19 to 4.2 ± 0.99 mya20, depending on the study. The most recent data, based on more than 5000 genes, indicate ~ 1.5 mya21 of divergence between the two species. This pair of species is widely used in speciation studies due to their ability to produce hybrids in the laboratory; however, introgression has not yet been found in nature despite the favourable ecological conditions for hybridization, mainly between sympatric populations22. D. mojavensis is endemic to the southwestern United States and northwestern Mexico. It is composed of four ecologically distinct subspecies, which are distributed in four different geographic regions, and each subspecies uses a specific host cactus as a feeding and breeding site, with no evidence of recent gene flow, constituting well-structured populations19,23,24,25,26,27. In crosses between D. mojavensis subspecies and D. arizonae, hybrids can be produced in both directions. Nevertheless, they present incomplete and asymmetric postzygotic isolation, since in crosses with D. mojavensis females, the sterility of the male hybrids is variable, whereas in crosses with D. arizonae females, the male hybrids are always sterile because they do not have motile sperm10,28,29.

With this in mind, we sought to assess the magnitude of differential gene expression (DEG) in male and female offspring from D. arizonae × D. mojavensis crosses and its association with hybrid phenotypes. We showed that in fertile hybrid females, very few DEGs were identified. However, in male hybrids, the degree of deregulation of gene expression was related to the severity of the sterile phenotype because males without motile spermatozoa had several DEGs, with a bias for underexpression, and the functions of these genes were related to spermatogenesis. By contrast, in sterile hybrids with motile spermatozoa, the degree of deregulation was lower and had a bias toward overexpression, and the gene functions were not directly related to spermatogenesis.

Results

The transcriptomes of the ovaries and testes of three parental allopatric strains (D. m. mojavensis, D. m. wrigleyi and D. arizonae) and their reciprocal hybrids (Fig. 1) were sequenced. We were able to recover 11,654 coding genes from D. mojavensis r1.04 transcripts. Of those, 9321 (80%) genes were expressed in D. arizonae, D. m. mojavensis and H♀ari♂mmoj ovaries, and 9700 (83.3%) were expressed in D. arizonae, D. m. wrigleyi, H♀mwri♂ari and H♀ari♂mwri female gonads. In testes, we found 11,146 (95.6%) genes expressed in D. arizonae, D. m. mojavensis and H♀ari♂mmoj and 11,223 (96.3%) expressed in D. arizonae, D. m. wrigleyi, H♀mwri♂ari and H♀ari♂mwri. The read alignment rate ranged from 81.7 to 86% in ovaries and from 76.4 to 80.8% in testes (Supplementary Table S1). A similar alignment rate for D. arizonae, D. mojavensis subspecies and their reciprocal hybrids indicated that the genome of the D. mojavensis r1.04 reference can be used in analyses involving D. arizonae and hybrids. Additionally, according to Lopez-Maestre et al.30, the average nucleotide divergence between D. arizonae and D. mojavensis is very low, less than 2%; thus, the sequence alignments of parental species and their hybrids with the r1.04 genome of D. mojavensis should not affect mapping efficiency. Principal component analysis (PCA) was performed to verify the variance of the biological replicates. Within ovaries and testes, the replicates were grouped together (see Supplementary Fig. S1), indicating low variance between replicates.

Figure 1
figure 1

Crosses performed between D. arizonae and D. mojavensis subspecies and differential gene expression between parental lines and hybrids. Geographic distribution of D. m. wrigleyi, D. m. mojavensis and D. arizonae lines used in this study and their respective cross directions with offspring identifications (http://www.d-maps.com/carte.php?num_car=1404&lang=en). H♀mwri♂ari and H♀ari♂mwri are offspring from crosses between D. arizonae and D. m. wrigleyi, and H♀ari♂mmoj are hybrids from crosses between D. arizonae and D. m. mojavensis.

Differential gene expression in D. arizonae and D. mojavensis gonads

We compared the transcriptomes of D. arizonae and D. m. mojavensis and those of D. arizonae and D. m. wrigleyi and identified 501 (5.3%) and 619 (6.3%) DEGs in ovaries, respectively, and the majority of the genes were overexpressed in D. arizonae (X2 = 220, p = 2.2e−16; X2 = 67.231, p = 2.415e−16, respectively) (Fig. 2A, Supplementary Table S6). In testes, we observed that 16% of genes were differentially expressed (Fig. 2E) when comparing D. arizonae with both D. mojavensis subspecies, and there was a higher proportion of overexpressed genes in both comparisons (X2 = 4.34, p = 0.03759; X2 = 11.783, p = 0.0005976) (Fig. 2E, see Supplementary Table S6). Gene enrichment analyses between parental lines showed that in ovaries, the DEGs were mainly enriched for nervous system, metabolism, chemotaxis, and extracellular matrix organization, among others. In testes, DEGs were also enriched for metabolic function, sensory perception, nervous system, gene expression, and behaviour, among other functions (Fig. 3, Supplementary Tables S2 and S3 and Supplementary Fig. S2 for cellular component and molecular function enrichment). Among the DEGs with functions in metabolic processes, those involved in processes of oxidation–reduction and metabolism of carbohydrates and lipids can be highlighted (see Supplementary Tables S2, S3, S4 and S5).

Figure 2
figure 2

Differential gene expression between D. arizonae and D. mojavensis subspecies and between hybrids and their respective parental lines. Scatter plots representing differential gene expression in ovaries (left) and testes (right) between parental lines, as well as between hybrids and their respective parental lines. Genes were considered differentially expressed when they presented twofold differences and FDR-corrected p-values < 0.01. Red dots = overexpressed genes; blue dots = underexpressed genes; grey dots = not differentially expressed. (A) Differential expression in ovaries of D. m. mojavensis × D. arizonae and D. m. wrigleyi × D. arizonae hybrids. (B) Differential expression in ovaries of H♀mwri♂ari × D. m. wrigleyi hybrids and D. arizonae. (C) Differential expression in ovaries of H♀ari♂mwri × D. m. wrigleyi hybrids and D. arizonae. (D) Differential expression in ovaries of H♀ari♂mmoj × D. m. mojavensis hybrids and D. arizonae. (E) Differential expression in testes of D. m. mojavensis × D. arizonae and D. m. wrigleyi × D. arizonae hybrids. (F) Differential expression in testes of H♀mwri♂ari × D. m. wrigleyi hybrids and D. arizonae. (G) Differential expression in testes of H♀ari♂mwri × D. m. wrigleyi hybrids and D. arizonae. (H) Differential expression in testes of H♀ari♂mmoj × D. m. mojavensis hybrids and D. arizonae. The scatter plot was generated using the ggplot2 package (version 3.3.3) in R79. O: overexpressed genes; U: underexpressed genes.

Figure 3
figure 3

GO term enrichment analysis of DEGs between D. arizonae and D. mojavensis subspecies. Dot plot representing GO enrichment for the biological process category for DEGs in (A) ovaries and (B) testes between D. arizonae- D. m. mojavensis and D. arizonae- D. m. wrigleyi (FDR < 0.05). The dot plot was generated using the ggplot2 package (version 3.3.3) in R79.

Gene expression in hybrid female gonads is very similar to that in the parental lines

We compared the patterns of gene expression between hybrids and their parental lines. In hybrids between D. m. wrigleyi and D. arizonae, the percentage of DEGs between the hybrids and the parental lines ranged from 1.27 to 4.2% (Supplementary Table S7). For all the comparisons, the proportion of overexpressed genes was significantly higher than that of underexpressed genes and was always higher in female hybrids from D. arizonae-D. m. wrigleyi crosses (Fig. 2A–C, see Supplementary Table S7). In hybrids between D. m. mojavensis and D. arizonae, the percentage of DEGs between the hybrids and the parental lines was 0.26 and 1.93% when compared with D. arizonae and D. m. mojavensis, respectively (Supplementary Table S7). The proportion of overexpressed versus underexpressed genes was significantly different and dependent on the cross (Supplementary Table S7). When comparing H♀ari♂mmoj with D. arizonae, there was a higher proportion of underexpressed genes than overexpressed genes (X2 = 10.24, p = 0.001374), contrary to the comparison between H♀ari♂mmoj and D. m. mojavensis, for which we identified a higher proportion of overexpressed genes in the hybrids (X2 = 66.006, p = 4.497e−16) (Fig. 2D).

By filtering the genes that were over- or underexpressed relative to both parental lines for all crosses, we greatly reduced our set of DEGs. From the DEGs in H♀mwri♂ari ovaries, 16 genes were overexpressed in relation to both parental lines (Supplementary Table S8); however, no significant GO enrichment was found. Moreover, no underexpressed genes were found. H♀ari♂mwri ovaries showed 13 overexpressed genes (Supplementary Table S9) and only one underexpressed gene (FBgn0135298, with unknown function) when compared with both parental lines. Additionally, H♀mwri♂ari and H♀ari♂mwri ovaries showed only one shared overexpressed gene, FBgn0145754 (unknown function) (Fig. 4A). In H♀ari♂mmoj ovaries, no overexpressed genes were found, while only three genes were underexpressed. These genes included FBgn014602, which corresponds to alpha-Est5 and has carboxylic ester hydrolase activity; FBgn028050, corresponding to the Maverick gene, which has a role in signalling pathways, the regulation of neuromuscular junctions, dendrite development and imaginal disc-derived wing size; and FBgn0146651, orthologous to CG11854 in D. melanogaster, which is involved in reproduction, more specifically in courtship behaviour31,32.

Figure 4
figure 4

Shared DEGs between hybrids. (A) Number of shared overexpressed genes between H♀mwri♂ari and H♀ari♂mwri females. (B) Number of shared overexpressed genes between H♀mwri♂ari, H♀ari♂mwri and H♀ari♂mmoj males. (C) Number of shared underexpressed genes between H♀mwri♂ari, H♀ari♂mwri and H♀ari♂mmoj males. A Venn diagram was generated using the VennDiagram package (version 1.6.20) in R79.

Gene expression in the hybrid male germline is transgressive when compared to the parental lines

In comparison to ovaries, hybrid testes always presented a larger number of DEGs in relation to parental lines (Fig. 2E, Supplementary Table S7). In H♀mwri♂ari testes, we found that 8% and 5.8% of genes were differentially expressed compared with D. arizonae and D. m. wrigleyi, respectively, which displayed a significant bias for overexpression (X2 = 42.637, p = 6.59e−11; X2 = 122.7, p = 2.2e−16, Fig. 2F, Supplementary Table S7). In hybrid testes from the reciprocal cross (H♀ari♂mwri), we found that 6.3% and 11.75% of genes were differentially expressed compared with D. arizonae and D. m. wrigleyi (Fig. 2G, Supplementary Table S7). Unlike H♀mwri♂ari testes, no bias towards over- or underexpression was observed in the comparison with D. arizonae (X2 = 2.244, p = 0.1341) (see Supplementary Table S7). In testes from H♀ari♂mmoj, we observed that 11.9% and 16% of genes were differentially expressed compared with D. arizonae and D. m. mojavensis, respectively, with a significant proportion of underexpressed genes, 67.5% and 66.9%, respectively (X2 = 162.24, p = 2.2e−16, X2 = 203.91, p = 2.2e−16) (Fig. 2H, Supplementary Table S7).

Regarding the DEGs in H♀mwri♂ari testes, 32 were overexpressed and 14 were underexpressed when compared with both parental species, while in H♀ari♂mwri testes, 56 and 128 genes were over- and underexpressed, respectively, when compared with D. arizonae and D. m. wrigleyi. Similar to those of H♀ari♂mwri, testes of H♀ari♂mmoj displayed more underexpressed genes (519) than overexpressed genes (57) when compared to both parental species.

Spermatogenesis-related functions of DEGs in testes

In the male gonads, the functions of differentially expressed genes differed depending on the direction of the cross. In H♀mwri♂ari testes, of the 46 DEGs found relative to both parental species, 39 had an orthologue in D. melanogaster. Most of the DEGs presented functional annotations associated with several functions, but no significant enrichment for GO terms was observed (see Supplementary Tables S10 and S11). On the other hand, in H♀ari♂mwri testes, 183 genes were differentially expressed in both species, and 131 D. melanogaster orthologues were recovered. Functional annotation of these genes showed that several of them were related to processes such as reproduction and cellular division, including spermatogenesis, cilium movement, microtubule-based movement and nucleation, sperm motility, chromosome segregation and mitotic spindle elongation, as well as other functions related to several metabolic processes (Cyp6a9/Cyp6a21, Ugt50B3, Gpo2, CG7140, P5CDh2, inaE), transcription (zen2, hb, Hr3, vnd, ap) and sensory perception (Obp8a, Or98a) (see Supplementary Tables S12 and S13). In H♀ari♂mmoj testes, 576 genes were identified as differentially expressed, and 440 had a D. melanogaster orthologue. In these hybrids, most of the underexpressed genes were functionally annotated as having similar functions as those found in H♀ari♂mwri, but it is important to emphasize some other functions, such as spermatid differentiation and development (Supplementary Tables S14 and S15).

We searched for shared DEGs between hybrids and found that H♀mwri♂ari and H♀ari♂mwri testes shared only one overexpressed gene (Fig. 4B). However, when comparing male gonad expression from crosses of H♀ari♂mmoj and H♀ari♂mwri, which have the same D. arizonae mother but two different D. mojavensis subspecies fathers, we identified 33 shared overexpressed genes (Fig. 4B, Supplementary Table S16) and 107 shared underexpressed genes (Fig. 4C, Supplementary Table S17). Interestingly, some of these genes exhibited GO term enrichment for spermatogenesis-related functions, such as regulation of cytokinesis and cell division, protein localization to the microtubule plus-end, microtubule-based process, male germline cyst formation and cilium movement, in addition to various metabolic processes (Fig. 5). Based solely on these reproductive genes, it is remarkable that some of them might have a direct role in reproduction, since they participate in male meiosis, male courtship, and mating behaviour, and play a role in gene silencing and pre-miRNA processing. Moreover, for many of these shared DEGs, the mutant phenotype has already been described for other Drosophila and corresponds to male sterility (Table 1)33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49.

Figure 5
figure 5

Functions of shared over- and underexpressed genes between H♀ari♂mwri and H♀ari♂mmoj males. Dot plot representing GO enrichment for the biological process, cellular component and molecular function categories. The plot was generated using the ggplot2 package (version 3.3.3) in R79.

Table 1 DEG in male hybrids gonads with reproductive functions.

To better understand the results obtained for hybrid males, we examined some life-history traits, such as viability, sperm motility and fertility. Our analyses revealed that the viability of the hybrid was significantly reduced when compared to that of the parental lines (see Supplementary Table S18), since the average viability ranged from 3.85 to 14.2% in H♀ari♂mmoj, H♀mwri♂ari and H♀ari♂mwri, while in the parental lines, it ranged from 29.58 to 86.8% (Fig. 6, Supplementary Table S18). Additionally, we found that all hybrid females were fertile. Regarding F1 males from the two types of crosses, we observed two phenotypes. H♀mwri♂ari males produced motile sperm but were sterile, since no offspring were produced in backcrosses and F1xF1 crosses. In contrast, H♀ari♂mwri and H♀ari♂mmoj males produced amotile sperm and were also sterile (Fig. 6, Supplementary Table S18). These results suggest a link between the severity of the sterile phenotype and the number and functions of DEGs in male hybrids from crosses with D. arizonae mothers.

Figure 6
figure 6

Average viability and fertility of intraspecific and interspecific offspring. Average viability (%) of D. arizonae, D. m. wrigleyi and D. m. mojavensis and their respective hybrids, H♀mwri♂ari, H♀ari♂mwri and H♀ari♂mmoj. Sperm motility and fertility are represented by—and + signals. +/+: motile sperm/fertile, +/−: motile sperm/sterile, −/− amotile sperm/sterile. The boxplot was generated using the ggplot2 package (version 3.3.3) in R79.

Inheritance of gene expression

We compared the level of gene expression in H♀ari♂mmoj, H♀mwri♂ari and H♀ari♂mwri with that in each parental line, following the six inheritance categories of McManus et al.15. Most of the genes in the ovaries and testes of the interspecific hybrids showed conserved expression (Fig. 7, Supplementary Tables S19 and S20). H♀ari♂mmoj showed conserved expression for 97.8% of the genes in ovaries and 78.5% of those in testes (Fig. 7, Supplementary Table S19). Few genes were classified as having additive expression in the female and male gonads (0.05 and 1.45%, respectively). In the dominant category, H♀ari♂mmoj exhibited an overrepresentation of genes with D. arizonae-like expression (1.8%, in ovaries and 9.5% in testes), and few genes were D. mojavensis-dominant (Fig. 7, Supplementary Table S19). In the overdominant and underdominant categories, almost no genes were found for ovaries (0.03% of DEGs). However, the testes showed several genes in these categories, reaching 0.5% for overdominant and 4.6% for underdominant genes (Fig. 7, Supplementary Table S19).

Figure 7
figure 7

Inheritance of gene expression patterns in ovaries and testes of H♀mwri♂ari, H♀ari♂mwri and H♀ari♂mmoj. Gene expression patterns were classified into six categories of inheritance depending on the significance of the differential expression measured by performing pairwise comparisons in all conditions, according to McManus et al.13. These categories are as follows: Conserved, when the level of gene expression in hybrids is similar to that of both parental lines (not show in the plot). Additive, when the expression levels are different between the two parental lines, but the hybrid expression level is intermediate. D. arizonae-dominant or D. mojavensis-dominant, when the hybrid expression level is similar to that of only one parental line. Overdominant, when the expression level in the hybrids is significantly higher than that in both parental lines. Underdominant, when the hybrid expression level is significantly lower than that in both parental lines. The bar plot was generated using the ggplot2 package (version 3.3.3) in R79.

Similar to H♀ari♂mmoj, most of the expressed genes in H♀mwri♂ari and H♀ari♂mwri female and male gonads had conserved expression, but in testes, the level of conserved inheritance was lower than that in ovaries (Fig. 7, Supplementary Table S20). Additionally, very few genes in the ovaries (0.3 and 0.22%) and testes (1.1 and 1.7%) of H♀mwri♂ari and H♀ari♂mwri displayed additive expression (Fig. 7, Supplementary Table S20), as was observed for H♀ari♂mmoj. An interesting finding is that most of the genes classified in dominant categories displayed a D. arizonae-like pattern of expression in ovaries, but in testes, this pattern was related to the maternal line (Fig. 7, Supplementary Table S20). Considering the overdominant and underdominant categories, few genes were found for these categories in ovaries (0.16% in H♀mwri♂ari and 0.13% in H♀ari♂mwri) (Fig. 7, Supplementary Table S20). Similarly, in hybrid testes, very few genes were classified as over- (0.28% in H♀mwri♂ari and 0.5% in H♀ari♂mwri) or underdominant (0.12% in H♀mwri♂ari and 1.1% in H♀ari♂mwri) (Fig. 7, Supplementary Table S20).

Comparing all inheritance categories of H♀mwri♂ari and H♀ari♂mwri, significant differences in the expression profiles of reciprocal hybrids were observed in ovaries (X2 = 146.67, p < 0.001), which were mainly influenced by paternal effects, since 201 H♀mwri♂ari genes and 23 H♀ari♂mwri genes showed paternally dominant inheritance. In testes, significant differences in the expression profiles were also found between reciprocal hybrids (X2 = 823.2, p < 0.001); however, unlike in the ovaries, these differences were mainly influenced by maternal and underdominant inheritance. More specifically, 140 and 695 genes were found to have exclusive maternal inheritance and 14 and 128 genes had underdominant expression in H♀mwri♂ari and H♀ari♂mwri, respectively.

Discussion

In our study, the target species D. m. mojavensis, D. m. wrigleyi and D. arizonae showed approximately 6% of genes being differentially expressed in the ovaries and approximately 16% of genes being differentially expressed in the testes, agreeing with previous findings from our research group for hybrid female transcriptomes30 and with results obtained for hybrid male gonads7. This proportion of DEGs between species was intermediate when compared to other pairs of species presenting higher or lower divergence times. For example, D. yakuba and D. melanogaster (~ 6.1 mya divergence time20) have a proportion of DEGs varying from 29.59 to 42.58%50. In more recently diverged species, fewer DEGs are generally observed, such as in males of D. yakuba and D. santomea (~ 1.18 mya51, DEG: 19%7) and those of D. p. pseudoobscura and D. p. bogotana (~ 0.25 mya52, DEG: 14.6%53).

The differences in the proportion of DEGs in ovaries and testes could be related to the faster evolution of male-biased genes, as has been previously reported in other Drosophila species. According to previous studies, during the divergence process, male-biased genes display higher evolutionary rates, driven by positive selection, and most of them are preferentially expressed in gonad tissues54,55,56,57. This is the case in D. arizonae and D. mojavensis and can explain the high proportion of DEGs in testes when compared with the parental lines58.

Hybrid female gonads exhibited few DEGs when compared with both parental lines, and most of these genes were overexpressed. In agreement with our results, previous studies in hybrids of D. arizonae-D. m. mojavensis30 and D. melanogaster-D. simulans13 have reported a bias towards overexpression in female offspring. However, this bias was not observed in D. buzzatii and D. koepferae (~ 4.49 mya) female hybrids14. The absence of massive gene deregulation in ovaries could be related to the higher stability of gene expression in females, which is mainly influenced by the slower evolutionary rates of female-biased genes and the presence of two X-chromosomes, as was reported in other Drosophila species17,57, and this phenomenon could be important in the fertile phenotype observed in these hybrids. Thus, we suggest that in hybrid ovaries from recently diverged species, DEGs tend to be overexpressed, but over time, global deregulation will increase due to the accumulation of genetic changes, and the number of over- and underexpressed genes will become symmetric.

Male hybrids from crosses performed in different directions exhibited distinct proportions of genes with transgressive expression (over- or underexpressed relative to the parental species). In hybrids in which the mother was D. m. wrigleyi, we observed an overexpression of genes when compared to the parental lines. In contrast, if the mother was D. arizonae, there was either no difference between the proportion of over- or underexpressed genes or we observed a bias towards underexpression in relation to the parental lines. The differences observed between the reciprocal crosses could be due to the origin of the sex chromosomes. It has been reported that some specific epistatic factors among sex chromosomes and autosomes play a role in the differential expression profiles of interspecific hybrids59, which could explain the differences we observed.

Moreover, differences in sex-autosome interactions between reciprocal hybrids could also affect the severity of the sterility phenotype. There is evidence that in hybrids between D. arizonae females and D. m. mojavensis males, amotile sperm can result from interactions between the D. m. mojavensis Y chromosome and the 3rd autosome and/or the X chromosome from D. arizonae60. However, in the reciprocal cross, male hybrid sterility was associated with interactions between the Y chromosome of D. arizonae and the 4th autosome of D. mojavensis61. Our analyses of sperm motility and fertility showed that hybrid males from crosses between D. arizonae females and D. m. mojavensis or D. m. wrigleyi males had amotile sperm and were sterile, as expected. The sperm of hybrid males from crosses between D. mojavensis females and D. arizonae males was motile, exhibiting normal tails, but the individuals were sterile, indicating that the disruption of spermatogenesis also occurred in these males. These findings are in agreement with Hardy et al.62, who observed that D. arizonae-D. m. mojavensis hybrids display abnormal spermatid development and disruption of the spermatid tails.

The differential gene expression pattern in hybrid testes that we observed for crosses with D. arizonae mothers, with a tendency for an excess of underexpressed genes in hybrids when compared to the parental lines, is thus associated with a more severe male sterile phenotype. The functions associated with these underexpressed genes are related to spermatogenesis and could indicate a breakdown in gene regulation during spermatogenesis55,56,57,63,64,65. Moreover, we also identified genes related to metabolism, which supports the reduction of hybrid viability. These findings are in agreement with several studies that reported that most of the DEGs in male hybrids are underexpressed, and their functions are associated with reproduction, such as meiotic arrest, spermatogenesis4,6,8 and spermiogenesis, which are often observed in sterile hybrids66,67,68,69.

Analyses of the inheritance of gene expression patterns showed that most of the genes had conserved expression in hybrid ovaries and testes. This result is similar to previous findings for D. arizonae-D. m. mojavensis female hybrids30 but is quite different from those found in D. sechellia-D. melanogaster and D. simulans-D. melanogaster hybrids, since only 6% and 11% of their genes had conserved expression, respectively9,15. Furthermore, given that D. simulans-D. melanogaster and D. arizonae-D. mojavensis may have similar divergence times21,70, the number of genes classified as conserved is quite different (11% vs. ~ 95%), indicating that the interactions between different genetic changes accumulated in distinct species might have stronger or weaker effects on hybrids, causing disturbances in gene expression to variable degrees.

Among the DEGs with dominant inheritance in the ovaries and testes of all hybrids, most showed D. arizonae-like expression, regardless of the crossing direction. These findings could indicate that in the gonads of hybrids, the effect of the D. arizonae genome is more important than the maternal effects. Therefore, we speculate that for hybrid ovaries, the D. arizonae genome has a stronger effect on hybrid gene expression, likely due to regulatory sequences. This result corroborates our previous findings30 that among the genes classified in the dominant category, for both crossing directions, the D. arizonae-dominant inheritance pattern was stronger. Likewise, the expression profile in the ovaries of hybrids from crosses between D. melanogaster females and D. sechellia males showed that 49% of the genes were classified as dominant, and among these, 84% showed D. sechellia-like expression15, indicating that the maternal species does not influence the expression profile of the hybrids. In the target species of the current study, the hybrid female gonads showed very few genes classified as over- or underdominant, agreeing with previously reported results30 and demonstrating the greater stability of their gene expression. In testes, on the other hand, the number of differentially expressed genes classified in these categories was higher, with a higher number of underdominant genes for one specific cross direction. In hybrid male gonads, the underexpression of genes, mainly related to reproduction, has often been observed15,56,71,72, indicating that this can be related to disruptions in regulatory networks driven by the rapid divergence of male-biased genes, leading to sterility7,56.

Therefore, here, we showed that in hybrids from recently diverged species, such as D. arizonae and D. mojavensis, the degree of misregulated gene expression is related to the severity of the sterile phenotype. In female hybrids, which are fertile, very few DEGs were identified when compared with both parental lines. In contrast, in male hybrids, the degree of misregulated gene expression was dependent on the subspecies of D. mojavensis and, most importantly, the cross direction. Hybrid males from D. mojavensis females, which are sterile but had motile sperm, displayed a smaller number of misregulated genes, with a bias towards overexpression; these genes are not directly related to the sterile phenotype, since few genes acting on reproduction were found. Nevertheless, in male hybrids carrying the X chromosome of D. arizonae, which are sterile with amotile sperm, the disruption of gene expression was higher and presented a bias towards underexpression. Surprisingly, most of these genes were directly related to spermatogenesis-related functions, as well as sperm movement. However, more analyses must be undertaken to clarify the regulatory differences between this pair of species. These analyses include investigating the divergence of male-biased gene sequences, regulatory studies (cis–trans regulatory changes and the impact of microRNAs on gene expression), and functional analyses of the genes identified in the current study. Thus, we can obtain a better understanding of their relationship with the sterile phenotype in the initial steps of hybrid incompatibility.

Methods

Drosophila strains and RNA sequencing

Intraspecific and interspecific reciprocal crosses were performed between D. arizonae from Metztitlan, Hidalgo, México (stock number: 15081-1271.17) and two subspecies of D. mojavensis: D. m. mojavensis from the Anza Borrego Desert, California, USA (stock number: 15081-1352.01) and D. m. wrigleyi from Catalina Island, California, USA (stock number: 15081-1352.22). Crosses were performed with 3-day-old flies, ten males and ten females, in 2.3 × 9.5 cm vials containing standard Drosophila medium supplemented with yeast under the same temperature (23 °C) and humidity conditions. One-day-old virgin female and male offspring (control and F1 hybrids) were collected after hatching and were isolated until they reached sexual maturity. The male and female reproductive tracts of 9- to 12-day-old flies were dissected in PBS (phosphate-buffered saline) and stored at − 80 °C until use for RNA extraction.

To verify the hybrid status of the F1 offspring of interspecific crosses, 10 individuals of each cross were randomly collected for DNA extraction, and PCR for the ribosomal ITS-1 (internal transcribed spacer 1) from the 18S gene region (NCBI Reference Sequence: EU306666.1)73 was performed. The oligonucleotide primer for ITS-1 amplified 500 bp and 550 bp amplicons in D. arizonae and D. mojavensis, respectively. Therefore, in hybrids, two different fragments corresponding to D. arizonae and D. mojavensis alleles were expected.

After confirming the hybrid status of the offspring, 30 pairs of ovaries and 50 pairs of testes were used to perform total RNA extraction with two biological replicates using the RNeasy kit (Qiagen). The samples were treated with DNase (DNA-free Kit, Ambion) and stored at − 80 °C. The samples were quantified by fluorescence in a Bioanalyzer 2100 (Agilent). Sequencing was performed using the GenomEast platform by a member of the France Génomique consortium (ANR-10-INBS-0009) with an Illumina HiSeq 4000. The samples were sequenced in 2 × 100 paired-end reads, and the average size of the inserts was 300 base pairs.

Twelve transcriptomes were sequenced with two biological replicates each: D. arizonae, D. m. mojavensis, and D. m. wrigleyi (controls, ovaries and testes) and hybrids from crosses between D. arizonae and both D. mojavensis subspecies (ovaries and testes). The hybrid transcriptomes from D. m. mojavensis female and D. arizonae male crosses were not sequenced because the hybrid incompatibility in this direction was very high, and it was not possible to obtain enough material to perform RNA extraction. The low number of replicates is due to the high index of prezygotic reproductive isolation between D. arizonae and D. mojavensis subspecies (ranging from 0.56 to ~ 0.7025) and the low average viability of the hybrid offspring, which limited the number of hybrids obtained to perform RNA extraction.

Mapping and quantification of expression

The sequenced transcriptomes were trimmed using UrQt74 to remove polyA tails and low-quality nucleotides. The sequence quality was then checked with FastQC software75. The transcriptomes were aligned against all annotated coding sequences (CDSs) of the D. mojavensis r1.04 public genome76 (available at http://flybase.org/). Overall, 21,915 Ref-Seq sequences were downloaded from https://www.ncbi.nlm.nih.gov/refseq/. From those sequences, 20,110 corresponding to mRNA were used as a reference to perform the alignments. This approach was used because the public genome of D. mojavensis presents the best quality of sequences and because D. mojavensis and D. arizonae are recently diverged species, a large divergence in their coding protein genes was not expected. Kallisto77 was used to map the reads from parental and hybrid transcriptomes against the D. mojavensis r1.04 reference transcripts. Kallisto is able to perform rapid pseudoalignment to quickly determine the compatibility of the reads with their respective targets. The pseudoalignment of reads preserves the key information needed for quantification and is robust against errors, presenting a similar accuracy as other alignment tools77. After the mapping procedure, BioMart78, an R (3.6.1)79 Bioconductor package, was used to recover the gene names corresponding to each transcript from the reference. This was possible because the BioMart database is maintained by Ensembl80, providing direct access to a diverse set of data and enabling a wide range of powerful online queries, from gene annotation to database mining. Subsequently, due to several genes displaying different isoforms, the package tximport was used to summarize the transcript level estimation for the gene level analysis, allowing us to use these data for the differential expression of gene-level counts.

Differential expression analyses

Differential expression analyses were performed using DESeq281, an R (3.6.1) package79, using raw read counts to identify differentially expressed genes in the hybrids compared to the parental species (controls lines) for each gonad tissue. This package normalizes the counts using size factors that are estimated according to the median counts taken for all genes. Additionally, DESeq2 estimates the means and variances of raw read counts and tests for differential expression based on a model using a negative binomial distribution and uses Benjamini–Hochberg multiple test correction (FDR level of 0.01). The low number of replicates can influence the statistical support of differential expression analyses. Therefore, to be conservative, we implemented stringent statistical thresholds. Genes were classified as significantly differentially expressed when the p-value, which was adjusted by FDR level, was below 0.01 and a higher than twofold change in expression was observed (corresponding to log2(FC) >|1|. Transcripts that presented fewer than ten mapped reads in all conditions tested were excluded from the analyses. The number of over- and underexpressed genes was analysed by a proportion test (prop.test) with R software79.

Functional annotation and gene ontology enrichment analyses

Functional annotation was performed for all DEGs identified in the ovaries and testes of parents and hybrids. For this analysis, an orthologous gene table for Drosophila species was downloaded from http://flybase.org/. The D. melanogaster orthologs corresponding to DEGs in the hybrids were submitted to DAVID GO82,83. In addition, gene enrichment was investigated using a list of specific DEGs. Thus, a target gene list was compared with a reference list, which contained all the genes in the PANTHER Classification System platform84,85,86 (available at http://geneontology.org/) for a selected organism, using Fisher’s exact tests with FDR corrections or p-values. Then, we selected all significant GO terms from our target gene list and submitted them to the REVIGO web server87. By using REVIGO, we were able to summarize and remove redundant GO terms.

Inheritance classifications

The R (3.6.1) package79 was used to sort genes in terms of differences in their expression levels between each parental line and the reciprocal hybrids separately, according to McManus et al.15. The expression data were transformed into log percentages, and a threshold of twofold change and adjusted p-value < 0.01 were set to determine the significance of differentially expressed genes. Genes that were not differentially expressed were considered to have the same expression level as the parental lines, thus being considered conserved. Genes considered differentially expressed were classified as additive, dominant, underdominant or overdominant. Additive expression means that the expression level of a given gene is different between the two parental lines but intermediate in the hybrid. D. arizonae-dominant or D. mojavensis-dominant expression is when the hybrid expression is similar to only one parental line. Overdominant expression means that expression in the hybrids is significantly higher than that in both parental lines, while underdominant expression means that the hybrid expression of a given gene is significantly lower than that in both parental lines. Chi-square statistical tests were performed in the R (3.6.1) package79.

Viability and sterility analyses

Virgin males and females of each strain were separated by sex 10 h after eclosion and stored separately in yeasted cactus-banana vials, with 10 flies per vial, until flies were sexually mature (9 days of age). Crosses were performed between D. m. mojavensis, D. m. wrigleyi and D. arizonae, as well as within the parental lines, as control crosses. Five replicates were performed per cross with 10 couples in each vial, which favours mating30. Mating was performed for 72 h under similar temperature (23°) and light/dark (10/14 h) conditions. After 72 h, males from all crosses were discarded, and females (in pairs) were transferred to new fresh vials to lay eggs. This process was repeated five times every 48 h, so the females laid eggs for 10 days. Immediately after removing the females, the eggs that were laid were counted under a stereomicroscope, and after eclosion (~ 19 days after crossing), the number of imagoes was verified once a week for four weeks. The offspring viability (adults/eggs × 100) was calculated based on the number of eggs and adults.

For sterility analysis, interspecific and control crosses were performed using three-day-old virgin flies to obtain as many hybrids as possible. In previous tests, we noticed an increased production of hybrids when the two species were kept together before they reached sexual maturity. All crosses were performed with five replicates under the same temperature and light/dark conditions for 12 days. Then, the parents were discarded, and the imagoes were separated by sex daily. The offspring were maintained in yeasted food vials until they reached 10 days of age (sexually mature). Sperm motility analyses were carried out for 20 F1 male testes and seminal vesicles of each control and interspecific cross, according to Reed et al.59. No statistical analyses were performed because for each cross, all males presented the same phenotype, motile or amotile sperm. Additionally, fertility analyses were carried out by backcrossing female and male hybrids with their respective parents, D. arizonae, D. m. mojavensis and D. m. wrigleyi. Crosses were performed with five couples per replicate with five replicates by cross. To ensure that the absence of offspring was not due to possible prezygotic, post-mating-prezygotic or postzygotic isolation mechanisms, we increased the crossing time and allowed the couples to mate for 15 days. After that, all parents were discarded, and fertility was evaluated based on the presence or absence of offspring, as reported by Carnelossi et al.28. F1 × F1 crosses were also performed using the offspring of each interspecific cross under the same conditions as the backcrosses. To check that tubes containing only eggs would not produce offspring, the tubes were maintained for 20 days after parent removal and then discarded. Statistical analyses were performed for average fecundity by and viability for each replicate of intraspecific and interspecific crosses by using R (3.6.1) package79. Normality and variance tests (Shapiro–Wilk and Levene’s test, respectively) were carried out, and when we obtained significant p-values (non-normal distribution), a nonparametric Kruskal–Wallis test was performed. Then, a post hoc Wilcoxon test was performed to determine significant differences between the treatments. For results with no significant p-values for normality and variance tests, one-way ANOVA was performed using Tukey’s post hoc test.