Introduction

Mitochondria, crucial organelles in eukaryotes, originated from endosymbiotic Alphaproteobacteria within primitive eukaryotic cells approximately 1.5 billion years ago [1]. They gradually evolved into semi-autonomous organelles through gene transfer to the host cell nucleus [2, 3]. In plants, mitochondria are predominantly maternally inherited, with very few species exhibiting paternal or biparental inheritance [4]. Beyond their role in providing energy for cellular metabolism, mitochondria are closely associated with biological processes such as plant fertility [5], ecological adaptation [6], programmed cell death [7], and intracellular signal transduction [8]. Furthermore, under stress conditions such as drought, salinity, or extreme temperatures, plant mitochondria produce retrograde signals, such as reactive oxygen species, that prompt the regulation of stress response-related genes and cellular activities to restore homeostasis [9].

Plant mitochondrial genomes display distinct sequence specificities and complex physical structures. They vary greatly in size across species (66 kb to 11.7 Mb) [10, 11], a much wider range than that of animal mitogenomes (14–20 kb) [12]. Plant mitogenomes also showcase structural diversity, featuring cyclic, linear, and reticular types [13]. The gene-coding region typically constitutes a small portion of the plant mitogenome, around 10%, and the genome harbours numerous repeated sequences that drive evolution and structural diversity in angiosperm genomes [14, 15]. Gene content also varies: the “fossilised” mitogenome of Liriodendron tulipifera contains 65 genes [16], whereas extant plants may contain 32–67 genes owing to numerous gene transfer or loss events across their evolutionary history [13]. Many RNA-editing events in plant mitogenomes are intricately linked to species evolution [17], phenotypic variation [18], and cytoplasmic male sterility [19]. Additionally, plant mitogenomes can integrate exogenous or migratory DNA sequences through mechanisms like homologous recombination, sequence transfer, or selective domestication, contributing to their rapid evolution [20, 21]. Owing to these characteristics, plant mitogenomes serve as crucial tools for species identification, evolutionary analysis, and trait inheritance research.

Blackcurrant (Ribes nigrum L.), also referred to as black currant, cassis, black cassis, or dry grape, is a perennial deciduous shrub indigenous to northern Europe and northern Asia, belonging to the genus Ribes of the Grossulariaceae family. It is extensively cultivated in temperate and boreal regions, with major producers including Germany, Poland, Norway, China, the United States, and Russia [22, 23]. In 2021, the global planted area and annual production of R. nigrum were approximately 41,860 ha and 118,002 tons of fresh fruit, respectively (International Blackcurrant Association, http://www.blackcurrant-iba.com/agronomy/statistics/). Known for its exceptionally high content of vitamins, γ-linolenic acid, and phenolics, blackcurrant’s bioactive components confer antioxidant and anticancer effects, cardiovascular and cerebrovascular benefits, and immune enhancement [24]. Apart from fresh consumption, R. nigrum is processed into juice, jam, or wine [23, 24], and its seeds are extracted for medicines, oils, etc., often used in cosmetics [25], while the leaves are utilised for tea and functional foods [26, 27]. The Ribes genus boasts rich germplasm resources, with over 160 species worldwide [24], including 59 species and 30 varieties identified in China alone, with significant wild resources distributed in Heilongjiang, Jilin, Liaoning, Nei Mongolia, Xinjiang, and Tibet [28]. A comprehensive understanding of plant genomes, including the mitogenome, is crucial for leveraging these germplasm resources for future development; however, no mitogenome has been reported for the Grossulariaceae family to date.

Integrating second- and third-generation high-throughput sequencing technologies facilitates more effective exploration of plant mitogenomes [29]. In this study, we focused on the widely cultivated Chinese blackcurrant cultivar ‘Hanfeng’. First, we assembled and characterised the mitogenome of this cultivar. Next, we predicted and verified repeated sequences and their involvement in repeat-mediated recombination, and identified RNA-editing events in the mitogenome protein-coding genes (PCGs). We also assembled the plastome and identified DNA sequences homologous between the plastome and the mitogenome. Finally, we examined the evolutionary relationship of R. nigrum with related species. This study lays a scientific foundation for a deeper understanding of the genetic characteristics of R. nigrum, with implications for germplasm identification and evolutionary studies.

Results

R. nigrum mitogenome assembly

The R. nigrum mitogenome was assembled from 11.81 Gb of short reads and 8.73 Gb of long reads using a hybrid assembly strategy. The detailed assembly process for the graphical representation of the mitochondrial genome is depicted in Fig. S1, and a simplified representation is presented in Fig. 1. The genome sketch comprises three contigs (Fig. 1A). Contig1 is the longest at 302,735 bp, with a sequencing depth of 129X and 18,325 reads; contig2 spans 145,234 bp, with a sequencing depth of 128X and 10,710 reads; and contig3 is the shortest at 1,129 bp, with a sequencing depth of 255X and 513 reads. Notably, contig3 forms a double-bifurcating structure in the genome sketch, and its depth is roughly twice that of the other two contigs, as expected for a repeat present in two copies. Using Unicycler to resolve the repeated regions with long reads yielded a complete master circle with a total length of 450,227 bp (Fig. 1B) and a guanine–cytosine (GC) content of 45.82%. This genome is larger than those of Sedum plumbizincicola (212,159 bp), Rhodiola tangutica (257,378 bp), and Rhodiola crenulata (194,106 bp) within the same Order, while exhibiting similar GC contents (44.24–45.21%) (Table S1).
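The contig lengths reported above can be cross-checked against the master circle length with a short calculation. The sketch below assumes that the repeat contig3 is traversed twice in the master circle (once at each junction between contig1 and contig2); the variable names are illustrative and not part of the assembly pipeline.

```python
# Contig lengths (bp) as reported for the assembly graph.
contig_lengths = {"contig1": 302_735, "contig2": 145_234, "contig3": 1_129}

# Assumption: the repeat (contig3) occurs twice in the master circle,
# while contig1 and contig2 each occur once.
copies = {"contig1": 1, "contig2": 1, "contig3": 2}

master_circle_length = sum(
    contig_lengths[name] * copies[name] for name in contig_lengths
)
print(master_circle_length)  # 450227
```

Under this assumption the sum matches the reported master circle length of 450,227 bp.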

Fig. 1

A mitogenome sketch for R. nigrum (A) and the master circle structure (B)

Molecular features

Annotation of the R. nigrum mitogenome identified 61 unique genes (Fig. 2; Table 1), comprising 24 core, 15 non-core, 19 transfer RNA (tRNA), and three ribosomal RNA (rRNA) genes. The coding sequences of the PCGs, tRNA genes, and rRNA genes measured 32,874 bp, 1,690 bp, and 5,432 bp, respectively, together accounting for only 8.88% of the entire genome. Notably, the exons of nad1, nad2, and nad5 are dispersed across the genome or located on different strands, indicating that these genes are trans-spliced (Table S2). Furthermore, ten PCGs in the R. nigrum mitogenome contain introns, a higher number than observed in closely related species (Table S3 and Table S4).

Table 1 Encoding genes of R. nigrum mitogenome

Relative synonymous codon usage (RSCU) analysis, reflective of gene expression optimization through molecular evolution [30, 31], revealed biased codon usage for most amino acids in the R. nigrum mitogenome (Fig. S2 and Table S5); the exceptions were the start codon (AUG) and tryptophan (UGG), each encoded by a single codon (RSCU = 1.00). For instance, alanine prefers GCU (RSCU = 1.58), and the stop codon favours UAA (RSCU = 1.54). Conversely, CAC is under-used for histidine (RSCU = 0.47), whereas tyrosine and phenylalanine show no marked codon preference.

Fig. 2

Mitochondrial genome diagram (outer ring) and repetitive sequence distribution (inner ring) of R. nigrum. The colour line on the C1 circle connects two repeated dispersed repeats. The blue, orange and purple lines represent palindromic, forward, and reverse repeats, respectively. The red line on the C2 circle represents tandem repeats. The black line on the C3 circle represents SSRs

Repeat elements and repeat-mediated recombination

A substantial number of repeated sequences were identified in the R. nigrum mitogenome, with their distribution depicted in Fig. 2. Specifically, this genome harbours 180 simple sequence repeats (SSRs) (Table S6), encompassing 70 monomers, 38 dimers, 19 trimers, 45 tetramers, five pentamers, and three hexamers, where monomers and dimers collectively constitute 60.00% of all SSRs. Notably, thymine repeats comprise 48.57% (n = 34) of all monomers. Additionally, twelve tandem repeats with a match of ≥ 85% and lengths ranging from 10 to 39 bp were identified in this mitogenome (Table S7).

Dispersed repeats, classified as transposable factors, represent a class of repeated sequences capable of changing their position within the genome. In this investigation, we detected 432 pairs of dispersed repeats ≥ 30 bp in length, including 193 pairs of palindromic repeats, 238 pairs of forward repeats, one pair of reverse repeats, and no complementary repeats (Table S8). The longest forward and palindromic repeats measure 1,129 bp and 337 bp, respectively. The cumulative length of dispersed repeats amounts to 43,204 bp, accounting for 9.60% of the mitogenome. This is consistent with the view that repeated sequences are important drivers of mitogenome size expansion in plants [32].

In general, the plant mitogenome represents a complex, dynamically evolving mixture of conformations rather than a single circular structure [33]. Analysis of the aforementioned repeats revealed that the forward repeat R1 (contig3) forms a double-bifurcating structure that potentially mediates genomic recombination. Consequently, long reads were aligned with BLASTn to validate the potential recombinant DNA sequences. As depicted in Fig. 3A, the repeat gives rise to four possible genomic paths: ctg1-ctg3-ctg2 and ctg2-ctg3-ctg1 correspond to the master circular configuration, whereas ctg1-ctg3-ctg1 and ctg2-ctg3-ctg2 correspond to the split configuration of two smaller circles. A total of 35 long reads support the master circle, whereas 170 reads support the split configuration, accounting for 17.07% and 82.93% of the supporting reads, respectively (Table S9). Furthermore, primers were designed to validate the potential structures (Fig. 3A), with the corresponding electrophoresis results depicted in Fig. 3B. Specifically, all borders of the repeated sequences amplified bands of the anticipated size, and the Sanger sequencing comparison results corroborated these findings (Fig. S3). Collectively, both approaches substantiate that the repeated sequence ctg3 can mediate the formation of the two major conformations of the R. nigrum mitogenome: a master circle (Fig. 1B) and two smaller circles (Fig. 3C).
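The reported proportions of long reads supporting each conformation follow directly from the read counts. A minimal sketch of the calculation is given below; the function and key names are illustrative, and only the two totals stated in the text are used.

```python
def support_fractions(read_counts):
    """Return the percentage of supporting long reads for each conformation."""
    total = sum(read_counts.values())
    return {name: round(100 * n / total, 2) for name, n in read_counts.items()}


# Totals of supporting long reads as reported in the text.
reads = {"master_circle": 35, "two_small_circles": 170}
fractions = support_fractions(reads)
print(fractions)  # {'master_circle': 17.07, 'two_small_circles': 82.93}
```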

Fig. 3

Structural prediction of R. nigrum mitogenome. (A) Primer design, (B) PCR validation, (C) two small circle structures

RNA-editing events

RNA editing, a widespread phenomenon in higher plant mitochondria, constitutes a post-transcriptional modification and regulatory process primarily occurring within coding regions [34]. In the R. nigrum mitogenome, we identified RNA-editing events across all 39 PCGs, totalling 731 editing sites, all of which are C-to-U edits (Fig. 4A); 614 sites (83.99%) were edited at a frequency exceeding 0.80 (Table S10). Notably, nad4 hosts the highest number of editing sites among the mitogenome PCGs, with 53 edits (7.25% of the total), followed by ccmB with 47 editing events (6.43%). Further analysis revealed that 415 edits occurred at the second codon position and 238 at the first codon position, constituting 56.77% and 32.56% of the total edits, respectively.

Fig. 4

Number of RNA editing sites (A) and amino acid conversions (B), new start codon and stop codon sites validation (C) and Sanger sequencing sequence comparison (D) in PCGs of the R. nigrum mitogenome

Of all RNA-editing events observed, 653 resulted in non-synonymous codon changes (approximately 89.33%), predominantly involving three types of amino acid alterations: Pro to Leu (164 instances), Ser to Leu (143 instances), and Ser to Phe (97 instances) (Fig. 4B). We predicted three editing sites within these PCGs responsible for creating start codons and two editing sites for generating stop codons (Table 2). Validation via polymerase chain reaction (PCR) yielded results confirming the existence of these five editing sites (Fig. 4C), with Sanger sequencing further validating these findings (Fig. 4D), as detailed in Fig. S4. Notably, the editing efficiency of the site atp6-718 was found to be low, while the site rps10-391 exhibited a larger band in genomic DNA (gDNA) compared to complementary DNA (cDNA) due to the presence of an intron spanning 850 bp.
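The consequence of a C-to-U edit depends on its position within the codon. The sketch below, which assumes the standard genetic code and uses hypothetical function names, classifies a single edit as synonymous, non-synonymous, AUG-creating, or stop-creating. Whether a newly created AUG actually serves as a start codon depends on its location in the transcript, which this sketch does not assess.

```python
from itertools import product

# Standard genetic code in RNA letters; codons are enumerated in UCAG order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_position(cds_position):
    """Map a 1-based coding-sequence position to its codon position (1, 2 or 3)."""
    return (cds_position - 1) % 3 + 1


def edit_effect(codon, position):
    """Apply a C-to-U edit at a 1-based codon position and classify the change."""
    if codon[position - 1] != "C":
        raise ValueError("C-to-U editing requires a C at the edited position")
    edited = codon[:position - 1] + "U" + codon[position:]
    before, after = CODON_TABLE[codon], CODON_TABLE[edited]
    if after == "*":
        kind = "creates_stop"
    elif edited == "AUG":
        kind = "creates_AUG"
    elif before == after:
        kind = "synonymous"
    else:
        kind = "non_synonymous"
    return edited, before, after, kind


print(edit_effect("CCA", 2))  # ('CUA', 'P', 'L', 'non_synonymous')
print(edit_effect("ACG", 2))  # ('AUG', 'T', 'M', 'creates_AUG')
print(edit_effect("CAA", 1))  # ('UAA', 'Q', '*', 'creates_stop')
```

If the numeric suffix of a site name such as atp6-718 is read as a coding-sequence coordinate (an assumption), `codon_position(718)` and `codon_position(391)` both return 1 and `codon_position(2)` returns 2, which is compatible with stop codons arising from first-position edits and AUG codons arising from second-position edits.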

Table 2 Specific RNA editing events predicted in R. nigrum mitogenome

DNA transfer

Using the same sequencing data, we assembled the R. nigrum plastome, which spans 157,459 bp in size (Fig. 5A). Throughout evolution, sequence migration between plant plastomes and mitogenomes has been extensive, leading to the incorporation of mitochondrial plastid DNAs (MTPTs), i.e. chloroplast-derived DNA fragments, into the mitogenome [13, 35]. We identified fourteen MTPTs exhibiting > 80% sequence similarity (Fig. 5B), with a cumulative length of 4,990 bp within the R. nigrum mitogenome. Notably, two fragments surpassed 1,000 bp in length: MTPT13 (1,755 bp) and MTPT14 (1,278 bp). Additionally, twelve complete chloroplast genes were identified among the MTPTs, including six PCGs (petG, petL, psbE, psbF, psbJ, and psbL) and six tRNA genes (trnD-GUC, trnH-GUG, trnM-CAU, trnN-GUU, trnP-UGG, and trnW-CCA). Similarly, eight fragments of chloroplast genes were detected within the MTPTs (Table 3). However, genes within the MTPTs exhibited varying degrees of sequence loss or replacement in the mitogenome, evolving into pseudogenes [13].

Fig. 5

R. nigrum plastome map (A) and sequence migration analysis (B). Note: The blue arcs in Fig. 5B represent the mitogenome and the light green arcs represent the plastome. The connecting lines between the arcs are homologous fragments, where the red line represents sequence similarity equal to 100%, yellow represents sequence similarity 90-100%, and dark blue represents sequence similarity 80-90%

Table 3 Fragments transferred from plastome to mitogenome in the R. nigrum

Genome evolution

Plant mitogenomes have undergone extensive events of PCG loss or gain throughout their evolutionary history [2, 35]. While the fossilized plant mitogenome boasts 41 PCGs, only 18 PCGs are common to both the mitogenomes of R. nigrum and 29 related species. These shared PCGs include atp1, atp4, atp6, atp8, ccmB, ccmC, ccmFC, cox2, cox3, matR, nad1, nad2, nad4L, nad5, nad6, nad7, nad9, and rpl5. The phylogenetic tree constructed in this study reveals R. nigrum’s closest relation to S. plumbizincicola of the Crassulaceae family, within the Order Saxifragales (Fig. 6A). Moreover, the phylogenetic topology based on mitochondrial DNA aligns with the latest classification of the Angiosperm Phylogeny Group IV system [36].

Fig. 6

Phylogenetic relationships (A) and variable gene deletions (B) between R. nigrum and related species. Note: The numbers in Fig. 6A indicate bootstrap support values

Comparisons of variable genes among 11 closely related species within Santalales, Saxifragales, Zygophyllales, and Ranunculales were also conducted. While rpl5 is present across all species, deletions of rps2 and rps11 are more prevalent. Notably, rps11 is exclusive to Pulsatilla chinensis, and rps2 is only found in Tolypanthus maclurei and P. chinensis (Fig. 6B). Intriguingly, rpl2, rps1, rps10, rps19, and sdh3 are absent in S. plumbizincicola, R. tangutica, and R. crenulata but present in R. nigrum.

Furthermore, mitogenome collinearity analysis was performed on eight closely related species within Saxifragales, Zygophyllales, and Ranunculales (Fig. 7 and Table S11). Although numerous homologous collinear blocks were detected between R. nigrum and its Saxifragales relatives, these blocks are relatively short. Specifically, between R. nigrum and S. plumbizincicola, only 65 collinear blocks exceed 300 bp in length, with the longest measuring 3,241 bp and a total collinear length of 69,888 bp, constituting merely 16% of R. nigrum’s mitogenome. Moreover, the order of these collinear blocks varies significantly, indirectly indicating frequent events of homologous recombination and/or genomic rearrangement between R. nigrum and its evolutionary relatives.

Fig. 7

Evolution between R. nigrum and related species. The gray areas and red areas indicate collinear blocks with consistent and inconsistent arrangement orders, respectively. Light green, yellow-gray, orange-red, blue, dark green, purple, indigo, and red in the figure indicate Zygophyllum fabago, Tribulus terrestris, Rhodiola crenulata, Rhodiola tangutica, Sedum plumbizincicola, Ribes nigrum, Pulsatilla chinensis, and Aconitum kusnezoffii, respectively. The proportional numbers on either side indicate the percentage of all homologous sequences between the two species

Discussion

Molecular features and comparison of the R. nigrum mitogenome among close relatives

The assembly of the complete mitogenome of R. nigrum was successful, revealing a length of 450,227 bp, with a GC content of 45.82%, and adopting a circular structure. This structure mirrors that of 29 related species, yet the genome size significantly differs from those of its counterparts (ranging from 194,106 to 900,031 bp), with slight variations observed in GC content (ranging from 42.48 to 46.86%) (Table S1). Although the genomic GC content has historically remained relatively stable during evolution, recent research suggests that GC-rich species exhibit an enhanced ability to thrive in regions characterized by extremely cold winters or seasonal drought [37]. Additionally, during adaptive evolution, the R. nigrum mitogenome has developed its preferences in amino acid codon usage, such as an elevated preference for GCU in alanine (RSCU = 1.58) and CAU in histidine (RSCU = 1.53) (Table S5).

In plants, complete mitochondrial genes typically encompass 24 core and 17 variable genes [16]. Despite their limited number, mitochondrial genes play crucial roles. For instance, Ayabe et al. [38] demonstrated that defects in the mitochondrial gene nad7 of Arabidopsis thaliana can profoundly affect mitochondrial gene expression by altering copy numbers, resulting in severe growth inhibition and, potentially, plant mortality. Throughout angiosperm evolution, the transfer and loss of ribosomal proteins and succinate dehydrogenase genes to the nucleus have been notably frequent and episodic [2, 13]. The R. nigrum mitogenome encompasses all core genes. However, in comparison to the “fossilised” L. tulipifera [16], R. nigrum lacks two variable genes (rps2 and rps11) while harbouring additional genes (rpl2, rps1, rps4, rps10, rps19, and sdh3) in comparison to S. plumbizincicola, a close relative within the same Order [39]. Furthermore, the mitogenomes of land plants feature diverse numbers of introns, some of which encode open reading frames involved in the evolutionary dissemination and/or splicing of introns [13]. In this investigation, introns were also identified in ccmFC, cox2, nad1, nad2, nad4, nad5, nad7, rpl2, rps3, and rps10, potentially serving as scientific references for species identification [40].

Repeat sequences and genome recombination

Repeated sequences are abundant in mitogenomes and play crucial roles in plant adaptive evolution, phenotypic trait variation, gene expression regulation, and genome size expansion [32]. As mitogenome coding sequences exhibit higher conservation compared to chloroplast or nuclear genes, the development of mitochondrial markers for species identification offers higher accuracy [41]. In this investigation, 180 SSRs were identified in the R. nigrum mitogenome, offering ample references for group classification within Ribes species.

Although the structure of the plant mitogenome is complex and variable, it is commonly depicted as a cyclic molecule [13, 33]. Longer repeated sequences (> 1,000 bp) can facilitate genome recombination, resulting in mitogenome rearrangements and alterations in mitogenome conformation [42]. The frequent occurrence of mitogenome rearrangements is associated with the high incidence of DNA repair through non-homologous DNA end joining at non-coding regions. Such imprecise repairs often incorporate nuclear or plastid DNA fragments during the repair process, significantly contributing to the duplication of non-coding regions and increasing the frequency of genome recombination [43, 44]. In higher plants like sugar beets [45] and rice [5], frequent recombination of repeated sequences can lead to cytoplasmic male sterility, and alterations in promoter positions, thereby affecting gene expression [32]. In this study, 432 pairs of dispersed repeats were identified, offering substantial reference information for genetic evolution and epigenetic traits in Ribes and closely related species.

Roles of RNA editing in plant evolution and ecological adaptation

RNA editing is prevalent in mitochondrial transcripts and is crucial for the production of functional proteins [17, 46]. Particularly, RNA editing involves effective non-synonymous edits that tend to produce codons encoding hydrophobic amino acids, thereby promoting protein folding and functionality [34]. For instance, Jiang et al. [47] identified two editing sites in the conserved region of the soybean maintenance line NJCMS2B that could convert hydrophilic serine into hydrophobic leucine, altering the direction of protein transmembrane spanning; conversely, RNA editing did not occur in the atp9 transcript product of the soybean sterility line NJCMS2A. In this study, 731 C-to-U editing sites were identified in 39 PCGs, most of which cause shifts in amino acids toward hydrophobic ones; however, further exploration is needed to understand their functional roles thoroughly.

Additionally, RNA editing typically restores evolutionarily conserved codons, potentially creating new start and stop codons [48] and enhancing the expression of mitochondrial genes [46]. Despite RNA editing producing new stop codons that shorten the original transcript, these truncated transcripts often remain active. For example, Gallagher et al. [19] demonstrated that maize pollen sterility is linked to truncation of the orf77 chimeric open reading frame due to early termination of mitochondrial editing. Intriguingly, new start codons introduced by RNA editing can initiate gene transcripts from new points. For instance, Kadowaki et al. [49] and Quiñones et al. [50] identified new start codons for cox1 in tomato and potato transcripts, respectively. Notably, three editing sites (cox1-2, nad1-2, and nad4L-2) that create new start codons and two editing sites (atp6-718 and rps10-391) that generate new stop codons were predicted; however, further investigation is warranted to elucidate the roles these editing sites might play in the growth and development of R. nigrum.

Gene transfer and plant evolution

Gene transfer within plant mitochondrial DNA is a common phenomenon. In this investigation, we identified 14 plastome fragments within the R. nigrum mitogenome, constituting 1.11% of its total length. Similar occurrences have been noted in other species such as Mangifera indica (7–10 homologous fragments, comprising 0.51–0.61%) [51], Mentha spicata (17 fragments, 3.97%) [52], and Panax notoginseng (12 fragments, 3.11%) [53]. Twelve complete chloroplast genes were discerned in the R. nigrum MTPTs, while most other gene-coding sequences have undergone degradation. Typically, due to base mutations or genomic rearrangements, these genes tend to degenerate into pseudogenes post-integration into the mitogenome [13, 35]. Despite their inability to perform standard functions, these MTPT genes are pivotal in driving adaptive evolution and fostering genetic diversity among terrestrial plants [21].

Evolutionary analysis of R. nigrum and its relatives

Plant mitogenomes exhibit a propensity for integrating exogenous or migratory DNA sequences, resulting in frequent gain or loss of PCGs [2, 35]. Our study revealed that only 18 PCGs were shared between R. nigrum and its relatives, underscoring significant divergence during evolution. Collinearity analysis further demonstrated substantial discrepancies in genome sequences between R. nigrum and related species. Even for S. plumbizincicola, which is considered a closer relative, the collinear sequence length accounted for only approximately 16% of the total length. Multiple factors contribute to sequence disparity among mitogenomes, including homologous recombination among species, repeat-mediated recombination within genomes, and horizontal gene transfer, which may be a primary contributing factor [43, 44]. Therefore, understanding the variation and evolution of the R. nigrum mitogenome, along with its evolutionary lineage, necessitates further investigation.

Conclusions

This study reports the assembly and comprehensive analysis of the mitogenome of R. nigrum, the first complete mitogenome described for the Grossulariaceae family. With a length of 450,227 bp, it harbours 61 unique genes and features a rich presence of repeat sequences, RNA-editing events, and plastome fragments. Notably, it exhibits two major conformations: a master circle and two smaller circles. Despite evolutionary conservation, our findings indicate multiple genomic recombination and/or gene transfer events during the species’ evolutionary history. This study offers valuable insights for deeper investigations into genetic evolution and germplasm identification not only in R. nigrum but also in related species.

Materials and methods

Plant sample preparation and sequencing

Plant material was sourced from a plot cultivating the ‘Hanfeng’ variety of R. nigrum within the National Modern Agricultural Demonstration Area at the Institute of Rural Revitalization Science and Technology (coordinates: 45°50’3’’N; 126°51’14’’E; elevation: 139 m) of the Heilongjiang Academy of Agricultural Sciences, China. Young new shoots were selected, promptly frozen in liquid nitrogen, and preserved at −86 °C in an ultra-low temperature refrigerator (Thermo Fisher Scientific, Massachusetts, USA). DNA was extracted using the Plant Genomic DNA Extraction Kit (Tiangen Biotech Co., Ltd., Beijing, China), and RNA was extracted using the Plant Total RNA Extraction Kit (Tiangen, Beijing, China), in triplicate to ensure robustness. The quality of the extracted samples was assessed using a NanoDrop One Microvolume UV-Vis Spectrophotometer (Thermo Fisher Scientific, Massachusetts, USA).

High-quality samples underwent sequencing procedures at Wuhan Benagen Technology Co., Ltd. (Wuhan, China). Second-generation sequencing was executed utilizing a NovaSeq 6000 (Illumina Inc., San Diego, CA, USA), with subsequent evaluation and quality control of the raw short reads achieved through Trimmomatic v0.35. Third-generation sequencing was conducted employing a Nanopore PromethION sequencer (Oxford Nanopore Technologies, Oxford, UK), and quality control filtering of the raw long reads was accomplished using NanoFilt v2.8.0. For long non-coding RNA analysis, an MGISEQ-2000 (Shenzhen Huada Intelligent Technology Co., Ltd., Shenzhen, China) was employed, and SOAPnuke v2.0 facilitated quality control filtering of the raw reads.

Mitogenome assembly

The long-read sequencing data underwent de novo assembly using Flye v2.9.2 (University of California, San Diego, USA) with default settings [54]. The resulting assembly contained sequences from the nuclear, chloroplast, and mitochondrial genomes. Subsequently, a contig library crucial for subsequent analyses was generated using the makeblastdb utility. To pinpoint contigs containing mitochondrial DNA segments, we employed the BLASTn algorithm, referencing the A. thaliana mitogenome (NC_037304) and applying stringent parameters: “-evalue 1e-5 -outfmt 6 -max_hsps 10 -word_size 7 -task blastn-short” [55]. Next, GetOrganelle v1.7.7.0 (Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China) was utilised to extract short reads specific to the mitochondrial genome, which were then used to construct a graphical representation of the genome via SPAdes v3.15.5 (Saint Petersburg Academic University of the Russian Academy of Sciences, Saint Petersburg, Russia). To validate and consolidate these findings, we employed Unicycler v0.5.0 (The University of Melbourne, Victoria, Australia) and integrated BWA v0.7.11 [56] to align the contigs derived from the graphical genome assembly with the mitochondrial contigs obtained from the long-read assembly. Finally, the assembly results were visualised using Bandage v0.8.1 [57].

Genome annotation and analysis

L. tulipifera (NC_021152.1) and A. thaliana were used as references to annotate R. nigrum mitogenome PCGs using GeSeq v2.03 [58]. Exon and intron information of the encoded genes was manually extracted based on the annotation files. Subsequently, rRNA genes were annotated using BLASTN and tRNA genes were annotated using tRNAscan-SE v2.0.11 [59]. Annotation errors were manually corrected using Apollo v1.11.8 software [60]. The protein-coding sequences of this genome were extracted using PhyloSuite v1.2.2 [61], and MEGA v7.0.26 software was used to analyse codon preference and calculate RSCU values [62]. An RSCU value > 1 indicates that a codon is used more often than expected under equal usage of the synonymous codons for that amino acid; an RSCU value < 1 indicates the opposite.
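For reference, RSCU is conventionally computed as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates this definition under the standard genetic code; it is not the MEGA implementation, and the function name and toy sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in RNA letters; codons are enumerated in UCAG order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(coding_sequences):
    """Compute RSCU per codon from in-frame coding sequences (DNA or RNA letters)."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("T", "U")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1

    # Group synonymous codons by the amino acid (or stop signal) they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed; RSCU is undefined
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy sequence (invented): three GCU and one GCC for alanine, plus AUG and UGG.
values = rscu(["GCUGCUGCUGCCAUGUGG"])
print(values["GCU"], values["GCC"], values["AUG"], values["UGG"])  # 3.0 1.0 1.0 1.0
```

Because the RSCU values within a synonymous family sum to the number of codons in that family, single-codon amino acids always yield 1.00, and a two-codon family such as histidine yields complementary values (for example, 1.53 for CAU and 0.47 for CAC).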

Repeat sequences analysis and repeat-mediated recombination validation

The online tool MISA v2.1 (https://webblast.ipk-gatersleben.de/misa/) [63] was used to identify SSRs with motif lengths of 1–6 bp in the R. nigrum mitogenome using the parameters “1-10 2-5 3-4 4-3 5-3 6-3” (motif length-minimum number of repeats). Tandem repeat sequences were identified using TRF v4.09 (https://tandem.bu.edu/trf/trf.unix.help.html) [64] with the following threshold requirements: match ≥ 85% and length ≥ 7 bp. The REPuter program (https://bibiserv.cebitec.uni-bielefeld.de/reputer/) was used to identify dispersed repeats [65], and the minimum size was set to 30 bp. The location and details of genome distribution were visualised using Circos v0.69-9 and Excel v2021 [66].
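The SSR thresholds can be illustrated with a simple regular-expression search for perfect repeats. This is a simplified sketch and not MISA itself: it ignores compound SSRs, reports only perfect repeats, and skips motifs that are themselves repeats of a shorter unit; the function names and the toy sequence are hypothetical.

```python
import re

# Minimum number of repeats per motif length, as in "1-10 2-5 3-4 4-3 5-3 6-3".
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_reducible(unit):
    """True if the motif is a repetition of a shorter motif (e.g. 'AA', 'ATAT')."""
    n = len(unit)
    return any(n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n))


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples; start is 1-based."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A lookahead lets the scan test every start position.
        pattern = re.compile(
            r"(?=(([ACGT]{%d})\2{%d,}))" % (unit_len, min_rep - 1)
        )
        last_end = 0
        for match in pattern.finditer(sequence):
            start, tract, unit = match.start(), match.group(1), match.group(2)
            if start < last_end or is_reducible(unit):
                continue  # inside a tract already reported, or not a minimal motif
            hits.append((start + 1, unit, len(tract) // unit_len))
            last_end = start + len(tract)
    return sorted(hits)


# Toy sequence (invented) with one mono-, one di-, and one trinucleotide SSR.
toy = "G" + "A" * 10 + "C" + "AT" * 5 + "G" + "AGC" * 4 + "T"
print(find_ssrs(toy))  # [(2, 'A', 10), (13, 'AT', 5), (24, 'AGC', 4)]
```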

Two approaches were used to validate the possible genomic structures mediated by repeats, as follows. (1) Repeat sequences, extended by 500 bp at both ends, were compared against the long reads using BLASTN (parameter: -evalue 1e-10) to obtain all possible paths, and the proportion of each predicted structure was calculated from the filtered alignments. (2) Based on the R. nigrum mitogenome data, repeat sequences and 500 bp upstream and downstream sequences were extracted, and primers were designed using Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast) for validation. Primers were then synthesised by Sangon Biotech Co., Ltd. (Shanghai, China) (Table S12). The PCR amplification system included 1 µL of template gDNA, 2 µL each of upstream and downstream primers (10 µmol/L), 25 µL of 2× Rapid Taq Master Mix, and 20 µL of ddH2O. The amplification program consisted of pre-denaturation at 95 °C for 3 min, followed by 35 cycles of denaturation at 95 °C for 15 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, and a final extension at 72 °C for 10 min [55]. Electrophoresis was performed using 1% agarose gels, and photographs were obtained by a fully automated gel-imaging analysis system (Peijqing JS-2000, Shanghai, China). PCR products were sent to Sangon Biotech for Sanger sequencing, and PDF files were generated using SeqMan Pro v7.1.0 (44.1) (DNASTAR, Madison, USA).

Prediction and validation of RNA-editing sites

To predict RNA-editing sites in R. nigrum, RNA-seq reads were aligned to the mitogenome PCG coding sequence using TopHat2 [67] with a tolerance of up to seven mismatches. Differences between DNA and RNA sequences were analysed to identify potential RNA-editing sites using REDItools v2.0 [68], with stringent criteria: a minimum coverage depth of 50× and an editing frequency of at least 0.1. To validate the predicted RNA-editing sites, we focused on genes where editing sites corresponded to start and stop codons. Primers were designed based on sequences flanking the predicted editing sites, using Primer-BLAST (Table S13). RNA extracted from the samples was reverse transcribed into cDNA, followed by PCR amplification using both gDNA and cDNA templates. The resulting amplicons were sequenced and compared, and differences in base peaks were analysed using SnapGene Viewer v7.0.1 (GSL Biotech LLC, Chicago, IL, USA).
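The coverage and frequency criteria can be expressed as a simple filter. The sketch below is illustrative only: the record fields are hypothetical and do not reproduce the REDItools output format, and the example values are invented.

```python
from dataclasses import dataclass


@dataclass
class EditingCandidate:
    gene: str
    position: int   # 1-based position in the coding sequence
    coverage: int   # RNA-seq reads covering the site
    alt_reads: int  # reads carrying the edited base


def passes_filters(site, min_coverage=50, min_frequency=0.1):
    """Keep sites with coverage >= 50x and editing frequency >= 0.1."""
    if site.coverage < min_coverage:
        return False
    return site.alt_reads / site.coverage >= min_frequency


# Invented candidates for illustration.
candidates = [
    EditingCandidate("geneA", 10, coverage=100, alt_reads=85),  # kept
    EditingCandidate("geneA", 25, coverage=40, alt_reads=36),   # coverage too low
    EditingCandidate("geneB", 7, coverage=200, alt_reads=10),   # frequency 0.05
]
kept = [site for site in candidates if passes_filters(site)]
print([(site.gene, site.position) for site in kept])  # [('geneA', 10)]
```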

Plastome assembly and DNA transfer analysis

The plastome of R. nigrum was assembled by integrating short and long reads, utilising GetOrganelle [69] with specific parameters “-R 15 -k 21,45,65,85,105 -F embplant_pt.” Annotation of chloroplast genes was conducted using CPGAVAS2 [70], with subsequent corrections made using CPGView [71]. Homologous fragments shared between the plastome and mitogenome were identified through BLASTN analysis, applying strict filtering criteria: an e-value threshold of 1e-6, a minimum word size of 7, a sequence length of at least 30 bp, and a sequence similarity of 80% or higher. The distribution pattern of homologous sequences was visualised using the Circos package [66].
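The filtering criteria for homologous fragments can be sketched as a filter over tabular BLAST output. The example assumes the standard 12-column tabular format (`-outfmt 6`), which the text does not state was used for this particular search; the function name and the example hits are invented for illustration.

```python
def filter_hits(lines, min_identity=80.0, min_length=30, max_evalue=1e-6):
    """Filter BLAST tabular hits (assumed -outfmt 6 column order).

    Columns used: 0 query id, 1 subject id, 2 percent identity,
    3 alignment length, 10 e-value.
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        identity = float(fields[2])
        length = int(fields[3])
        evalue = float(fields[10])
        if identity >= min_identity and length >= min_length and evalue <= max_evalue:
            kept.append((fields[0], fields[1], identity, length, evalue))
    return kept


# Invented example hits: only the first satisfies all three criteria.
example = [
    "mito\tplastid\t98.50\t500\t7\t1\t100\t599\t5000\t5499\t0.0\t850",
    "mito\tplastid\t75.00\t200\t50\t0\t1\t200\t1\t200\t1e-20\t150",  # identity < 80%
    "mito\tplastid\t95.00\t25\t1\t0\t1\t25\t1\t25\t1e-7\t40",       # length < 30 bp
    "mito\tplastid\t90.00\t60\t6\t0\t1\t60\t1\t60\t1e-3\t50",       # e-value > 1e-6
]
hits = filter_hits(example)
print(len(hits), hits[0][3])  # 1 500
```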

Phylogenetic and collinearity analysis

For phylogenetic analysis, mitogenome sequences of closely related species were retrieved from NCBI (Table S1), with Pulsatilla chinensis (NC_068017.1) and Aconitum kusnezoffii (NC_053920.1) used as outgroups. Common genes among these genomes were extracted and aligned using MAFFT v7.505 [72]. Phylogenetic reconstruction was performed with IQ-TREE v1.6.12 using the “GTR + F + I + I + R2” model [73], and the resulting tree was visualised using iTOL v6 (https://itol.embl.de/) [74]. The mitogenomes of eight closely related species (A. kusnezoffii, P. chinensis, Rhodiola crenulata, Ribes nigrum, Rhodiola tangutica, S. plumbizincicola, T. terrestris, and Z. fabago) were compared using the BLAST program. Homologous sequences with a length of at least 300 bp were retained for subsequent analysis, and gene synteny was visualised using TBtools v1.09876, based on the MCScanX algorithm [75].