UCE sequencing-derived mitogenomes reveal the timing of mitochondrial replacement in Malagasy shrew tenrecs (Afrosoricida, Tenrecidae, Microgale)

Malagasy shrew tenrecs (Microgale) have increasingly been used to study speciation genetics over the last years. A previous study recently uncovered gene flow between the Shrew-toothed shrew tenrec (M. soricoides) and sympatric southern population of the Pale shrew tenrec (M. fotsifotsy). This gene flow has been suggested to be accompanied by complete mitochondrial replacement in M. fotsifotsy. To explore the temporal framework of this replacement, we assembled mitogenomes from publicly available sequencing data of ultra-conserved elements. We were able to assemble complete and partial mitogenomes for 19 specimens from five species of shrew tenrecs, which represents a multifold increase in mitogenomic resources available for all tenrecs. Phylogenetic inferences and sequence simulations support the close relationship between the mitochondrial lineages of M. soricoides and the southern population of M. fotsifotsy. Based on the nuclear divergence of northern and southern populations of M. fotsifotsy and the mitochondrial divergence between the latter and M. soricoides, there was a mean time window for replacement of ~ 350,000 years. This timeframe implies that the effective size of the ancestral M. fotsifotsy southern population was less 70,000.

Tenrecs (Tenrecidae) are small afrotherian mammals endemic to Madagascar. They include 31 species occupying diverse ecological niches (climbing, fossorial, semi-aquatic, hedgehog-like). Shrew tenrecs (Microgale) represent, with 21 currently recognized species, the most speciose genus among all tenrecs. Several species have been described/ elevated in the last two decades (Goodman and Soarimalala 2004;Goodman et al. 2006;Olson et al. 2009), highlighting that this number likely underestimates current Microgale species diversity. In addition, previous genetic data show that speciation is ongoing even in sympatric shrew tenrecs (Everson et al. 2020). These authors uncovered that the Pale shrew tenrec (M. fotsifotsy) consists of two geographic and genetic populations: a northern population in the vicinity of Montagne d'Ambre (the type locality; this population is hereafter depicted with (N) after the species name) and a southern population with a broad distribution across central and southern highlands (S). The southern population is sympatric with the closely related Shrew-toothed shrew tenrec (M. soricoides) across its entire range. Based on ultraconserved elements (UCE), the inferred nuclear topology supports the monophyly of M. fotsifotsy (Everson et al. 2020). UCEs are genomic regions of high conservatism among (even distantly related) species and have been shown to be valuable markers for phylogenetic inferences beyond protein-coding genes (Faircloth et al. 2012). The mitochondrial phylogeny based on the NADH dehydrogenase 2 (ND2) locus, however, suggested M. fotsifotsy (S) specimens to form the sister lineage to M. soricoides rather than to M. fotsifotsy (N). Deeper analysis led the authors to conclude that the mito-nuclear incongruence within M. fotsifotsy is based on limited gene flow between its southern population and M. soricoides. They further conclude that the mitogenomic lineage of M. fotsifotsy (S) has thereby completely been replaced by the introgressing M. soricoides (Everson et al. 2020). To further explore the temporal frame work of this mitogenome replacement in more detail, more complete  (Vieira and Prosdocimi 2019;Allio et al. 2020). We here apply these pipelines to existing sequencing data of shrew tenrecs to (I) increase mitogenomic resources available for the group, and (II) to explore the mitogenome replacement in M. fotsifotsy (S) more deeply. In this study, mitogenomes of five tenrec species were newly assembled. UCE sequencing data for 18 specimens belonging to M. fotsifotsy, M. soricoides, M. cowani and M. drouhardi (Everson et al. 2020) as well as whole-genome sequencing data of Nesogale talazaci (Zoonomia Consortium 2020) ( Table 1) were downloaded from the NCBI Sequence Read Archive (SRA) using the SRAtools 2.10 (http:// ncbi. github. io/ sra-tools/). Adapter sequences and low-quality reads were trimmed from the raw data using cutadapt 2.6 (Martin 2011). Mitogenome sequences were extracted from UCE sequencing data using the recently published MitoFinder pipeline (Allio et al. 2020). Contigs were assembled with MetaSPAdes (Nurk et al. 2017). The mitogenome of Nesogale talazaci was assembled from wholegenome sequencing data using Novoplasty2 (Dierckxsens et al. 2017). Protein-coding genes, rRNAs and tRNAs were annotated using MITOS (Bernt et al. 2013) and manually curated for gene boundaries in AliView (Larsson 2014).
Bayesian phylogenetic analyses were performed using the software BEAST v2.6.3 (Bouckaert et al. 2019). We defined the spiny tenrec Echinops telfairi (NC_002631.2) as the outgroup. The concatenated sequences alignment was based on the protein-coding and rRNA genes. All sequences were aligned per gene using mafft 7.407 (Katoh and Standley 2013).
The best-fit partitioning schemes and models of molecular evolution were first estimated using PartitionFinder v.2 (Lanfear et al. 2017), with linked the branch lengths and applied the greedy search algorithm. For the protein-coding genes, we defined each codon position as one possible partition. The best model was choose based on the Bayesian Information Criterion. Since the effective sample sizes (ESS) for all parameters were low and the runs did not converge under the best-fit partitioning scheme, we alternatively used the HKY substitution model with empirical base frequencies, gamma distributed rate heterogeneity and a proportion of invariant sites to reduce overparameterization. Divergence times were estimated using secondary calibrations from previous studies (Everson et al. 2016(Everson et al. , 2020. We used normal prior distributions with standard deviation of 1 to constrain the split between Echinops and  (Drummond et al. 2006). We ran two independent analyses and performed for each 10 8 generations, sampling every 10 4 generations. The results of the two analyses were combined using LogCombiner v.2.6.3 (Bouckaert et al. 2019) removing the first 10% as burn-in. We used Tracer v.1.7.1 (Rambaut et al. 2018) to check the convergence within and between analyses and to ensure that the ESS for all parameters reached > 3*10 3 . We pruned from the trees 10% as burn-in and generated maximum-clade-credibility trees with node heights set to mean age estimates using TreeAnnotator v.2.6.3 (Heled and Bouckaert 2013).
To evaluate whether the conflicting signal between the nuclear and mitogenome-based phylogenies is supported by significant mitochondrial sequence divergence between M. fotsifotsy (N) and M. fotsifotsy (S), a null distribution of expected mitogenomic sequence divergence was generated in R (R Core Team 2020). For this, mitogenomic sequence divergence was simulated under the nuclear, UCE-based tree topology from Everson et al. (2020) for M. cowani (as outgroup), M. soricoides, M. fotsifotsy (N) and M. fotsifotsy (S). To account for temporal uncertainties of nodes, 100 timetrees were generated with node ages drawn from normallydistributed priors based on 95% confidence intervals in the previously published time tree (Everson et al. 2020) using the compute.brtime function from the ape package (Paradis and Schliep 2019). Sequence evolution was then simulated 100 times for each time tree using the simSeq function of the phangorn package (Schliep 2011), resulting in 10,000 sequences for every species. Sequence lengths in the simulations equaled the length of the empirical, concatenated dataset. Mutation rate was estimated from the uncorrected p-distance of M. cowani and Echinops sequences and their divergence time (Everson et al. 2016). Sequence divergence among M. soricoides, M. fotsifotsy (N) and M. fotsifotsy (S) were calculated in every simulation round using dist.dna function (model = K80, i.e., K2P) from ape package. Finally, observed mitogenomic sequence divergence in the empirical data set was compared to the simulated null distribution. The mean time period necessary for the introgressing mitogenome lineage to reach fixation was computed using the formula from Kimura and Ohta (1969) Mitogenomes could be reconstructed from single contigs for all 19 specimens (Table 1). Complete mitogenomes with full annotation were obtained for 18 specimens. For M. drouhardi (MVZ217026), only a partial mitogenome could be reconstructed and only ten of the protein-coding genes could be annotated. All mitochondrial genomes produced here were submitted to GenBank under the Third Party Annotation (TPA) database (Table 1). Our results increase the number of tenrec mitogenomes from 1 to 20 and mitogenomic resources are now available for six tenrecid species. As the southern population of M. fotsifotsy likely represents a separate species (Everson et al. 2020), complete mitogenomic data are already available for this new candidate.
Our mitogenome phylogeny supports previous findings from a single mitochondrial locus. M. fotsifotsy is paraphyletic as its southern population is sister to M. soricoides, suggesting deep mitochondrial divergence between the northern and southern population of M. fotsifotsy (Fig. 1A). Bayesian dating analysis reveals that the mitochondrial lineages of M. fotsifotsy (N) and the M. fotsifotsy (S)-M. soricoides clade split 4.19 mya (95% CI: 3.29-5.17). M. fotsifotsy (S) and M. soricoides split 2.79 mya (95% CI: 2.13-3.53). Although using the similar age priors as in Everson et al. (2020), node ages to and among outgroups are generally younger than inferred from nuclear or combined data (Everson et al. 2016(Everson et al. , 2020. The divergence between M. fotsifotsy (N) and M. soricoides is nevertheless similar (Fig. 1A; 95% CIs greatly overlap). Simulating sequence evolution further supports deep mitochondrial divergence between the northern and southern population of M. fotsifotsy (Fig. 1B). Empirical sequence divergence is significantly higher than expected under the simulated evolution according a nuclear-based topology (> 95% percentile). In addition, empirical sequence divergence among specimens of M. fotsifotsy (S) and M. soricoides is significantly smaller than expected (< 5% percentile) and implies a closer evolutionary history than suggested by the simulations based on nuclear-based topology (Fig. 1C). To validate our simple model of sequence divergence, we also compared M. fotsifotsy (N) and M. soricoides, whose phylogenetic relationship is unaffected using either nuclear or mitogenomic data. The empirical divergence of their mitochondrial sequences falls close to the mean of the null distribution of expected divergence according the nuclear-based topology (Fig. 1D). It is also in a similar magnitude as the empirical divergence in M. fotsifotsy (N)-M. fotsifotsy (S) (Fig. 1B). This comparison thus serves as a positive control.
Overall, mitogenomic data assembled here and in Everson et al. (2020) imply complete mitochondrial replacement in the southern population of M. fotsifotsy by the M. soricoides mitogenomic lineage. This is in concordance with limited but significant nuclear gene flow reported before (Everson et al. 2020). The replacement must have taken place after the nuclear split between northern and southern population (3.14 mya; Everson et al. 2020) but before the split between the mitochondrial lineages of M. fotsifotsy (S) and M. soricoides (2.79 mya; this study), resulting in a mean time for replacement of ~ 350,000 years (Fig. 1E). This seemingly small temporal window nevertheless covers roughly 100,000 generations in shrew tenrecs. Although empirical data on ancestral population sizes are completely lacking for Malagasy shrew tenrecs, testing different introgression rates with the formula above provides maximum limits for effective population size of M. fotsifotsy (S). Results show that in a population of effective size less than 50,000 the mitogenome is rapidly replaced even under small introgression rates (Fig. 1F). If there have been higher introgression rates, 350,000 years would still be sufficient to fix the introgressing M. soricoides mitogenome lineage under M. fotsifotsy (S)'s effective population size of up to ~ 70,000 (Fig. 1F).
Mitochondrial replacement has frequently been observed in closely related mammal species, for instance in hares, roe deer and chipmunks (Good et al. 2008;Melo-Ferreira et al. 2012;Matosiuk et al. 2014). These studies highlighted the importance of divergence with gene flow in mammals, i.e. when lineage divergence occurs on a shorter timescale than does the completion of reproductive isolation. This phenomenon is most-extensively investigated in chipmunks, which show widespread and multiple mitochondrial introgression among species (Sullivan et al. 2014). It was recently shown that mitogenome introgression and replacement in this group occurred with only little or no nuclear introgression and that species boundaries are largely impermeable to nuclear gene flow (Sarver et al. in press). Gene flow among Microgale species here mirrors the situation in chipmunks as nuclear gene flow was relatively limited despite complete mitochondrial replacement (Everson et al. 2020). Thus, M. fotsifotsy and M. soricoides likely diverged under gene flow permeable for mitochondrial introgression. Signatures of positive directional (adaptive) selection are rarely detectable in cases of mitochondrial replacement (Sarver et al. 2017). More likely, the replacement in M. fotsifotsy (S) arose from a combination of demographic effects associated with range expansion, mate choice and the selection against hybrids (Bonnet et al. 2017). In conclusion, we were able to further elucidate the temporal framework of gene flow among Microgale species that have led to complete mitochondrial replacement. In addition, we highlight the usefulness of existing sequencing data to retrieve other genetic information.
Funding Open Access funding enabled and organized by Projekt DEAL.
Availability of data All sequence read data and newly assembled mitogenomes are publicly available from NCBI GenBank. Accession numbers are listed in Table 1.

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/. Fig. 1 Detailed history of mitogenome replacement in the southern population (S) of M. fotsifotsy. A Dated phylogeny based on newly assembled mitogenomes. Grey bars indicate 95% confidence intervals of node ages. M: Microgale. Nodes used for calibration are indicated with a white star. All nodes have posterior probability of 1 except for the node between the sister specimens FMNH166142 and FMNH170750 (PP < 0.5). B-D Simulated distribution of mitochondrial sequence divergence under a nuclear-based topology among different shew tenrec species/populations. Dotted lines indicate 95% (B) and 5% (C) percentiles, respectively. Grey areas indicate range of empirically estimated sequence divergences among specimens. E Summary phylogeny visualizing differences in nuclear (solid line) and mitochondrial (dotted line) topology and divergence dates. Grey area indicates temporal window for mitogenome replacement. F Time to fixation of introgressing M. soricoides mitochondrial lineage in relation to introgression rate under varying M. fotsifotsy (S) effective population sizes (numbers above lines). Grey area indicates fixations times within the 350,000 years window for mitogenome replacement ◂