Genomic analysis based on chromosome-level genome assembly reveals Myrtaceae evolution and terpene biosynthesis of rose myrtle

Yang, Ling; Jin, Jingjing; Lyu, Shanwu; Zhang, Fangqiu; Cao, Peijian; Qin, Qiaomei; Zhang, Guanghui; Feng, Chen; Lu, Peng; Li, Huiguang; Deng, Shulin

doi:10.1186/s12864-024-10509-6

Research
Open access
Published: 10 June 2024

Volume 25, article number 578, (2024)

BMC Genomics

Ling Yang^1,2^na1,
Jingjing Jin³^na1,
Shanwu Lyu¹,
Fangqiu Zhang⁴,
Peijian Cao³,
Qiaomei Qin⁴,
Guanghui Zhang⁵,
Chen Feng⁶,
Peng Lu³,
Huiguang Li¹ &
Shulin Deng^1,2

Abstract

Background

Rose myrtle (Rhodomyrtus tomentosa (Ait.) Hassk) is an evergreen shrub species belonging to the family Myrtaceae, which is enriched with bioactive volatiles (α-pinene and β-caryophyllene) with medicinal and industrial applications. However, the mechanism underlying volatile accumulation in rose myrtle is still unclear.

Results

Here, we present a chromosome-level genomic assembly of rose myrtle (genome size = 466 Mb, scaffold N50 = 43.7 Mb) with 35,554 protein-coding genes predicted. Through comparative genomic analysis, we found that gene expansion and duplication had a potential contribution to the accumulation of volatile substances. We proposed that the action of positive selection was significantly involved in volatile accumulation. We identified 43 TPS genes in R. tomentosa. Further transcriptomic and TPS gene family analyses demonstrated that the distinct gene subgroups of TPS may contribute greatly to the biosynthesis and accumulation of different volatiles in the Myrtle family of shrubs and trees. The results suggested that the diversity of TPS-a subgroups led to the accumulation of special sesquiterpenes in different plants of the Myrtaceae family.

Conclusions

The high-quality chromosome-level rose myrtle genome and the comparative analysis of the TPS gene family open new avenues for obtaining a higher commercial value of essential oils in medicinal plants.

Background

Volatile compounds play important roles in nature, such as altering plant-animal interactions and the local abiotic environment. Myrtaceae species are regarded as essential oil producers because of their high concentrations of cyclic mono- and sesquiterpenes [1,2,3]. Eucalyptus oil contains 1,8-cineole as the main component [4,5,6]. The leaf volatiles of Melaleuca alternifolia are dominated by terpinen-4-ol, terpinolene, and 1,8-cineole, and are usually called tea tree oil [7, 8]. Rose myrtle (Rhodomyrtus tomentosa (Ait.) Hassk), belonging to the family Myrtaceae, is a paradigmatic example of a terpene-rich medicinal plant [9, 10]. Rose myrtle is increasingly used in a wide range of applications, including medicine, cosmetics, health food, and industrial purposes [11, 12]. Many volatile compounds have been detected in rose myrtle [13], and the leaves are especially enriched with (+)-α-pinene and β-caryophyllene [14, 15]. These simple and polymeric terpenoids function as photoprotectants, antifeedants, or physical barriers, playing vital roles in plant growth, development, and environmental interaction [16,17,18]. Commercially used essential oils, including myrtle oil, lavender oil and tea tree oil, are mixtures of volatile terpenes [19] and are of growing importance in industrial applications [20, 21].

Volatile terpenes are the largest class of natural products and essentially originate from the C5 substrates dimethylallyl diphosphate (DMAPP) and isopentenyl diphosphate (IPP) [22]. The DMAPP and IPP precursors are produced through the mevalonate (MVA) and methylerythritol phosphate (MEP) pathways [23]. In plants, the MEP pathway typically operates in plastids while the MVA pathway operates in the cytosol [23]. Hemi-, mono-, and diterpenes, as well as carotenoids (tetraterpenes), are produced via the MEP pathway [24]. Terpene synthase (TPS) catalyzes complex carbocation cascade reactions on the prenyl diphosphate substrate, resulting in cyclic or linear terpene backbones [25, 26]. However, the biology of volatile terpene biosynthesis and accumulation in rose myrtle is still unclear.

TPS gene family members are divided into seven subgroups (-a, -b, -c, -d, -e/f, -g, and -h) [27, 28]. TPS-a and TPS-b mainly synthesize sesquiterpenes and monoterpenes, respectively [27, 28]. TPS-g can synthesize monoterpenes, sesquiterpenes and diterpenes [27]. Terpenes and terpenoids play important roles in plant resistance to herbivores and response to environmental stimuli [29]. The genetic basis of terpene synthesis has received wide attention in the family Myrtaceae. The numbers of genes encoding putative terpene synthases in Eucalyptus grandis, M. alternifolia, and Leptospermum scoparium are 113, 37, and 49, respectively [7, 30, 31]. Further research into terpene biosynthesis is needed for the industrial production of essential oils.

A gap-free rose myrtle T2T genome has been reported recently [32], and that genome assembly provides a foundation for investigating the anthocyanin accumulation mechanism of R. tomentosa. However, because R. tomentosa is an undomesticated medicinal resource, the genetics of its special medicinal components and its environmental adaptation strategies require a better understanding, with a view to possible improvement. Here, we assembled a chromosome-level genome for R. tomentosa using third-generation PacBio sequencing in combination with Illumina sequencing and the Hi-C technique. Gene amplification and natural selection shaped the genetic adaptation of R. tomentosa to harsh biotopes. Structural genes involved in the terpenoid synthesis pathway were positively selected. Our study provides a basis for exploring the genetic potential of R. tomentosa that contributes to the accumulation of essential oils.

Materials and methods

Plant sampling

For whole-genome assembly, a mature adult R. tomentosa individual was selected from a natural population from the South China National Botanical Garden (23.1817 N, 113.3671 E, Chinese Academy of Sciences, Guangzhou, China). The voucher specimen was kept at the South China Botanical Garden Herbarium (IBSC 0925721). Fresh leaves were collected for whole genome sequencing with Illumina HiSeq X Ten and PacBio Sequel sequencing platforms. For RNA sequencing in support of gene annotation, young leaves, petal lower lips, young stems, green fruits, and roots were sampled from the same individual.

Genome sequencing, assembly and quality assessment

We extracted and purified total DNA from fresh leaves. For Illumina short-read sequencing, PCR-free libraries with 300 bp, 500 bp, and 10-20 kb paired-end (PE) inserts were prepared and sequenced on the Illumina HiSeq X Ten platform. SMRT long-read sequencing was performed on a PacBio Sequel platform with the Sequel Sequencing Kit 2.1. For Hi-C sequencing, young and fresh leaf tissues were preserved in 1% (vol/vol) formaldehyde, DNA was cross-linked according to the protocol, and a single library (150-bp PE) was sequenced on the Illumina HiSeq X platform. More detailed information on sequencing can be found in Table S1 (see online supplementary material).

Wtdbg2 (v1.3.1) [33] and FALCON (v0.4.1) [34] were used to correct errors in the PacBio long reads with the Illumina short reads and then to generate consensus sequences. These subreads were then assembled into contigs with Flye v0.2.1. We applied SSPACE v1.2.0 [35] to generate scaffolds using Illumina mate-paired reads. Preassembled scaffolds were clustered, ordered, and oriented onto pseudo-chromosomes with ALLHiC software (v0.8.11) [36]. The genome size was estimated by k-mer distribution analysis with GenomeScope (v2.0) [37] using Illumina short reads; no flow cytometry analysis was performed. Hi-C libraries of fresh young leaves were constructed with the NEBNext Ultra II DNA library preparation kit and the DpnII enzyme (Ipswich, MA, USA).
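
The idea behind k-mer based genome size estimation can be illustrated with a minimal sketch. It uses the simple "total k-mers divided by peak depth" estimate rather than the mixture model that GenomeScope fits, and the function name, the error cutoff, and the toy histogram are all assumptions made for illustration, not values from this study.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate haploid genome size from a k-mer depth histogram.

    histogram maps k-mer depth -> number of distinct k-mers at that depth.
    min_depth is an assumed cutoff that discards low-depth k-mers, which
    are mostly sequencing errors.
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers above the depth cutoff")
    # Depth with the most distinct k-mers, taken as the main coverage peak.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences divided by coverage approximates genome size.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram (hypothetical numbers): an error tail plus one peak at 20x.
toy_histogram = {1: 900, 2: 300, 18: 40, 19: 70, 20: 100, 21: 70, 22: 40}
size = estimate_genome_size(toy_histogram)
```

With real data the histogram would come from a k-mer counter, and heterozygosity and repeats would shift the estimate, which is why a model-based tool is preferred in practice.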

Benchmarking Universal Single-Copy Orthologs (BUSCO) v5.6.1 was used to evaluate the accuracy and completeness of the assembled genome. Genome completeness was assessed using the plant dataset of the BUSCO database, with an e-value < 1e-5. Single-copy embryophyta_odb10 homologous genes in BUSCO were used to predict the gene status of the existing sequences in the genome.

Finally, we used Merqury (v1.3) [38] to estimate the consensus QV of the assembly. Augustus [39] was used for de novo gene prediction, while Trinity was used to generate EST evidence from RNA-seq data of five different tissues (root, leaf, flower, stem, and green fruit). The quality of the assembled genome was evaluated by mapping RNA-seq reads from these tissues using Bowtie2 [40].

Chromosome counting and karyotype analysis

Root tips were pretreated with 0.002% hydroxyquinoline at 4 ℃ for 3 h [41]. After a thorough wash, tips were fixed in 1:3 acetic ethanol and digested in HCl (1 M) solution for 45 min in a 37 ℃ water bath. The root tips were stained with carbol-fuchsin solution for 72 h, then cells were crushed onto a glass plate and drawn under an oil immersion lens.

Repeat and noncoding RNA annotation

We performed repeat masking using EDTA (v1.9.4 with parameters: --sensitive 1 --anno 1 --evaluate 1) with cDNA assembled from RNA-seq reads by Trinity. Four types of non-coding RNA genes, including tRNAs, rRNAs, miRNAs, and snRNAs, were predicted in the R. tomentosa genome. The tRNA genes were predicted using tRNAscan-SE with eukaryote parameters. INFERNAL with default parameters was used to annotate miRNA, snRNA, and rRNA.

Structural and functional annotation of genes

A combined strategy of homology-based search, de novo gene prediction, and RNA sequencing-aided annotation was used to annotate gene structure for the R. tomentosa genome. For homolog prediction, we used protein sequences from 13 species: four closely related species from Myrtaceae (E. grandis, L. scoparium, Psidium guajava, Syzygium oleosum), other Myrtales species (Punica granatum, Sonneratia alba, Rhizophora apiculata, Sonneratia caseolaris), some representative species (Arabidopsis thaliana, Solanum lycopersicum, Vitis vinifera, Vaccinium corymbosum), and a monocot species (Oryza sativa). The protein sequences were aligned to the genome using tBlastn with an e-value cut-off of 1e-5. De novo gene structure identification was based on Augustus [42], SNAP [43], and Fgenesh. RNA-seq reads from different tissues were aligned to the genome using Bowtie2 (v3.2.7). Finally, putative protein-coding genes in the R. tomentosa genome were integrated using the Maker package (v 3.01.03).

Functional annotation of the protein-coding genes was conducted by performing BlastP (e-value cut-off 1e-05) searches against entries in the NCBI nr and SwissProt databases. Searches for gene motifs and domains were performed using InterProScan. The GO terms for genes were obtained from the corresponding InterPro or Pfam entry. Pathway reconstruction was performed using KOBAS (v2.0) and the KEGG database.

Phylogenetic analysis and estimation of divergence time

OrthoFinder was used to identify orthologous genes from R. tomentosa and 13 other species including A. thaliana, O. sativa, V. vinifera, E. grandis, L. scoparium, P. granatum, P. guajava, R. apiculata, S. alba, S. caseolaris, S. lycopersicum, S. oleosum, and V. corymbosum. Single-copy orthologous genes were retrieved from these 14 species and aligned using MUSCLE [44] with default parameters, and low-quality alignment regions were removed using Gblocks (v 0.91b) with default parameters. All alignments were combined to produce a super-alignment matrix, which was used to construct a maximum likelihood (ML) phylogenetic tree using RAxML (v8.2.12) with parameters: -f a -x 12345 -p 12345 -# 100 -m PROTGAMMALGX -s ex.fa.gb -n ex -T 30. Divergence times between species were calculated using r8s with the default parameters.
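
Building a super-alignment matrix amounts to joining the per-gene alignments end to end for each species. The sketch below shows one possible way to do this in plain Python; the function name, the species labels, and the toy sequences are hypothetical, and it is not the script used in this study.

```python
def concatenate_alignments(alignments):
    """Join per-gene alignments into one supermatrix.

    alignments is a list of dicts mapping species name -> aligned sequence.
    Every dict must cover the same species, and the sequences within one
    alignment must have equal length.
    """
    species = sorted(alignments[0])
    parts = {name: [] for name in species}
    for gene in alignments:
        if sorted(gene) != species:
            raise ValueError("each alignment must cover the same species")
        if len({len(seq) for seq in gene.values()}) != 1:
            raise ValueError("sequences in one alignment differ in length")
        for name in species:
            parts[name].append(gene[name])
    return {name: "".join(chunks) for name, chunks in parts.items()}


# Two toy single-copy gene alignments for two hypothetical species labels.
genes = [
    {"Rtom": "MKT-A", "Pgua": "MKTLA"},
    {"Rtom": "GGS", "Pgua": "GAS"},
]
matrix = concatenate_alignments(genes)
```

The resulting dictionary could then be written in FASTA or PHYLIP format as input for a tree-building program.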

Gene family expansion and contraction analysis

Gene family expansion and contraction analysis was conducted with CAFE (v4.2.1) [45] using the default settings. Gene families were identified by OrthoFinder. A gene family was considered expanded or contracted when the difference in gene copy number was significant at P-value < 0.01.

Comparative genome analyses

To assess the degree of collinearity, we identified syntenic blocks among R. tomentosa, P. guajava, and E. grandis using MCScanX [46]. A syntenic region was highlighted if it contained at least 30 shared genes.

Identification of TPS gene family

For the identification of TPSs, representative members of the subfamilies of M. alternifolia, P. guajava, and R. tomentosa were used as queries to perform Blastp searches against the protein database of each species with an E-value cut-off of 1e-5. Candidate sequences identified as orthologs were then aligned using MAFFT to remove those that did not contain the intact domain. For phylogenetic analysis, sequences were combined to produce a super-alignment matrix, which was used to construct a maximum likelihood (ML) phylogenetic tree in RAxML (v8.2.12) with parameters: -f a -x 12345 -p 12345 -# 1000 -m PROTGAMMALGX -s ex.phy -n ex -T 30. The resulting phylogenetic tree was displayed and annotated using iTOL software. Conserved motifs were identified by MEME tools, and conserved domains were identified by NCBI Batch CD-search and visualized in TBtools-II [47].
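
The E-value filtering step can be sketched as follows, assuming the Blastp results were saved in the standard 12-column tabular layout (BLAST+ `-outfmt 6`), in which the E-value is the 11th field. The function name and the query and subject identifiers are hypothetical; the output format actually used in this study is not stated.

```python
def filter_blast_hits(lines, max_evalue=1e-5):
    """Keep subjects whose best hit meets an E-value cutoff.

    lines are rows of BLAST tabular output (12 tab-separated columns).
    Returns a dict mapping subject ID -> lowest E-value observed.
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        subject, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue and evalue < best.get(subject, float("inf")):
            best[subject] = evalue
    return best


# Toy rows: geneA has two significant hits, geneB fails the cutoff.
rows = [
    "queryTPS\tgeneA\t62.1\t540\t200\t4\t1\t540\t3\t545\t1e-150\t480",
    "queryTPS\tgeneB\t30.0\t80\t50\t2\t10\t90\t5\t85\t0.002\t35",
    "queryTPS\tgeneA\t58.0\t300\t120\t3\t1\t300\t3\t305\t1e-90\t300",
]
hits = filter_blast_hits(rows)
```

Candidates passing this filter would still need the domain check described above before being counted as TPS genes.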

RNA extraction, library construction, and sequencing

Total RNA was extracted using a Trizol reagent kit (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol. RNA quality was assessed on an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA) and checked using RNase-free agarose gel electrophoresis. The cDNA fragments were purified and ligated to Illumina sequencing adapters. The ligation products were size selected by agarose gel electrophoresis, PCR amplified, and sequenced using Illumina HiSeq™ 4000 with PE 150 bp. The unigene expression was calculated and normalized to TPM (transcripts per million).
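
For readers unfamiliar with TPM, the normalization can be written in a few lines. This is a generic sketch of the standard definition, with hypothetical unigene names, counts, and lengths; it is not the quantification pipeline used in this study.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts maps transcript -> read count; lengths_bp maps transcript ->
    length in base pairs.
    """
    # Reads per kilobase: normalise each count by transcript length.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    total_rpk = sum(rpk.values())
    # Rescale so that the values across all transcripts sum to one million.
    return {gene: value * 1e6 / total_rpk for gene, value in rpk.items()}


# Toy data: the three unigenes have identical read density per kilobase.
counts = {"unigene1": 100, "unigene2": 300, "unigene3": 50}
lengths_bp = {"unigene1": 1000, "unigene2": 3000, "unigene3": 500}
tpm = counts_to_tpm(counts, lengths_bp)
```

Because length is normalized before scaling, the three toy unigenes receive the same TPM even though their raw counts differ.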

Positive selection analysis

For positive selection analysis, we first identified single-copy orthologous genes from R. tomentosa and five related species with assembled genomes: L. scoparium, E. grandis, P. guajava, and S. oleosum (all Myrtaceae), and P. granatum (pomegranate, Lythraceae). For these genes, based on the phylogenetic topology, we employed the branch-site model incorporated in the PAML package v4.9 [48] to detect positively selected genes (PSGs). When one of the five Myrtaceae species was specified as the foreground branch, the other four and the pomegranate branch in the phylogenetic tree were used as background branches. We conducted likelihood ratio tests to determine whether positive selection was operating on the foreground branch. In this study, PSGs were identified only when P < 0.001.
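
A likelihood ratio test of this kind compares the log-likelihood of the null model with that of the alternative model. The sketch below assumes the test statistic is referred to a chi-square distribution with one degree of freedom, a common choice for the branch-site test that is not stated explicitly here; the function name and the log-likelihood values are hypothetical.

```python
import math


def branch_site_lrt(lnl_null, lnl_alt):
    """Likelihood ratio test for two nested models.

    Returns the test statistic 2 * (lnL_alt - lnL_null) and its P-value
    under a chi-square distribution with 1 degree of freedom (assumed).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 degree of freedom, the chi-square survival function equals
    # erfc(sqrt(x / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy log-likelihoods: a large improvement and a small one.
strong_stat, strong_p = branch_site_lrt(-1000.0, -990.0)
weak_stat, weak_p = branch_site_lrt(-1000.0, -999.0)
is_psg = strong_p < 0.001  # threshold taken from the text above
```

In this toy case only the first gene would pass the P < 0.001 threshold.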

Results

De-novo genome assembly and pseudo-chromosome construction

The genome of R. tomentosa, which is commonly grown in South China including Guangzhou (Fig. 1A), was sequenced. The estimated genome size was 459 Mb based on the 31-mer depth distribution analysis of the sequenced short reads (Fig. S1). We obtained around 130 Gb of high-quality data, including ~ 44.5 Gb of short-read sequences from three mate-pair libraries and ~ 85.8 Gb of PacBio sequences, which represent ~ 300X coverage of the genome (Table S1). As shown in Fig. S2, the karyotype consists of 2n = 2x = 22 chromosomes. We assembled a genome of 466 Mb (Table 1, S2, and S3), consisting of 1,143 contigs with an N50 of 1.01 Mb (Table S2). Using Hi-C technology, 99.56% (463.9 Mb) of contigs could be ordered and anchored onto 11 pseudo-chromosomes (Fig. 1B, Table 1 and Fig. S3); the final assembly consisted of 28 scaffolds (11 pseudo-chromosomes and 17 additional scaffolds) with an N50 of 43.7 Mb (Table 1 and Table S2). The GC content of the assembled genome is 40.59% (Table 1), which is similar to those of P. guajava and E. grandis, two species closely related to R. tomentosa.
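
The N50 values quoted above follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch with hypothetical toy lengths, not the assembly statistics tool used in this study:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total.
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


# Toy assembly of seven sequences with a total length of 300.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
```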

Table 1 Summary statistics of the genome assembly and annotation of R. tomentosa

BUSCO analysis detected 1,546 (95.8%) complete and 24 (1.5%) fragmented gene models out of 1,614 BUSCO genes (Table S3). Merqury showed that the integrity of the genome assembly was 86.2%, the QV was 36.7, and the error rate was only 0.021%, indicating that a genome with high integrity and accuracy was constructed (Fig. S4). Moreover, 76.11%-95.50% of RNA-seq reads generated from different tissues were successfully mapped to the assembled genome by HISAT2 (Table S4). We also obtained the RNA-seq datasets of leaf samples reported by He et al. [14], and 75.33%-92.42% of these RNA-seq reads were mapped to the assembled genome. Taken together, these observations suggest the high quality and completeness of the chromosome-level reference genome assembly of R. tomentosa.
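
The reported QV and error rate are two views of the same quantity, since a consensus quality value is Phred-scaled: QV = -10 log10(error rate). The short check below recomputes the error rate and the BUSCO percentages from the numbers given above; the function name is an illustrative choice.

```python
def qv_to_error_rate(qv):
    """Convert a Phred-scaled consensus quality value to a per-base error rate."""
    return 10 ** (-qv / 10)


# Values taken from the text: QV = 36.7; 1,546 complete and 24 fragmented
# gene models out of 1,614 BUSCO genes.
error_percent = qv_to_error_rate(36.7) * 100
busco_complete_percent = 1546 / 1614 * 100
busco_fragmented_percent = 24 / 1614 * 100
```

Rounded, these reproduce the reported 0.021% error rate and the 95.8% and 1.5% BUSCO fractions.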

Repetitive elements and protein-coding gene annotation

Repeat sequence annotation showed that repetitive sequences made up 35.21% of the R. tomentosa genome (Table 1 and Table S5). Among these sequences, long terminal repeats (LTRs) were the most abundant interspersed repeats, occupying 32.06% of the genome, including 18% Gypsy LTRs and 5.33% Copia LTRs (Table S5). TIR repeats and helitron repeats accounted for 2.12% and 0.45%, respectively (Table S5). We confidently annotated 35,554 protein-coding genes, of which 95.7% had a homolog in a suite of functional databases (Table 1 and Table S6). In addition, 2,892 noncoding RNAs, comprising 143 conserved microRNAs, 601 transfer RNAs, 1,754 ribosomal RNAs, and 394 small nuclear RNAs, were identified in the R. tomentosa genome (Table S7). These results indicated that a slightly higher number of genes was annotated in R. tomentosa than in the other species (Table S8). A comparison of gene models between R. tomentosa and other species revealed that exon and intron lengths in R. tomentosa were relatively conserved, whereas introns were slightly shorter in A. thaliana (Fig. S5). However, the average gene length was slightly shorter in R. tomentosa than in the other species (Fig. S5).

Gene family analysis

To identify evolutionary characteristics and gene families, the R. tomentosa genome was compared with 13 published genomes, including four closely related species from Myrtaceae (E. grandis, L. scoparium, P. guajava, S. oleosum), other Myrtales species (P. granatum, S. alba, R. apiculata, S. caseolaris), some representative species (A. thaliana, S. lycopersicum, V. vinifera, V. corymbosum), and the monocot rice (O. sativa) (Table S8). Based on gene family clustering analysis, 31,645 gene families were identified in total, of which 2,913 were shared by all 14 species, and 14 of these shared families were single-copy gene families (Table S9).
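
The distinction between families shared by all species and single-copy families can be made from a table of gene counts per family and species. The sketch below assumes such a table has already been loaded into a dictionary; the function name, family IDs, and species labels are hypothetical.

```python
def classify_families(counts, species):
    """Find gene families shared by all species and the single-copy subset.

    counts maps family ID -> {species name: number of genes}; a species
    missing from the inner dict is treated as having zero genes.
    """
    shared, single_copy = [], []
    for family, per_species in counts.items():
        copies = [per_species.get(name, 0) for name in species]
        if all(n > 0 for n in copies):
            shared.append(family)
            # Single-copy families have exactly one gene in every species.
            if all(n == 1 for n in copies):
                single_copy.append(family)
    return shared, single_copy


# Toy table with three species labels and three gene families.
species = ["Rtom", "Pgua", "Egra"]
counts = {
    "OG0001": {"Rtom": 3, "Pgua": 1, "Egra": 2},
    "OG0002": {"Rtom": 1, "Pgua": 1, "Egra": 1},
    "OG0003": {"Rtom": 2},
}
shared, single_copy = classify_families(counts, species)
```

In the toy table, OG0003 is present in only one species and would count as species-specific.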

Gene family numbers were compared between R. tomentosa and other species. As shown in Fig. 1C, 10,506 gene families were shared between species, and 952 gene families were specific to R. tomentosa. Compared with P. guajava, there were more species-specific genes in R. tomentosa (Fig. 1D). Phylogenetic analysis of a concatenated sequence alignment of R. tomentosa and 13 other plant species indicated that R. tomentosa, as expected, clustered with Myrtaceae species (Fig. 1E). The divergence time between R. tomentosa and the most closely related species, P. guajava, was estimated to be ~ 24.4 million years ago (Fig. 1E). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis revealed that the species-specific genes were especially enriched in terpenoid backbone biosynthesis and pyruvate metabolism (Table S10 and S11, Fig. 2A). Compared with the most recent common ancestor of the 14 plants, gene family sizes differed significantly among Myrtaceae species. Gene families in rose myrtle and eucalyptus tended to expand, whereas those in guava mainly contracted. Functional analysis showed that the significantly expanded genes were over-represented in ontology terms related to pyruvate metabolism, phenylpropanoid biosynthesis and flavonoid biosynthesis (Table S12 and S13, Fig. 2B). However, the contracted gene families showed little specificity, with only marginal enrichment in phenylpropanoid biosynthesis (Table S14 and S15). These results suggested that gene expansion correlated with terpenoid biosynthesis in R. tomentosa.

Gene duplication affected terpenoid synthesis in R. tomentosa

The distribution of synonymous substitutions per synonymous site (Ks) across all paralogous genes (regardless of gene order) showed a peak at Ks = 0.9, and similar peaks were found for P. guajava (Ks = 1.08) and E. grandis (Ks = 1.07) (Fig. S6). As shown in Fig. S6, the whole-genome duplication (WGD) event of R. tomentosa occurred later than that of P. granatum, which was consistent with a previous publication on R. tomentosa [32]. These results provided additional evidence of one WGD event in Myrtaceae after the well-known paleo-hexaploidization event, γ, in the most recent common ancestor (MRCA) of all eudicots. We then analyzed the different origins of gene duplicates. All types of duplication were found: dispersed duplications accounted for the largest proportion (32.4%), followed by proximal (7.8%), tandem (5.4%) and WGD/segmental (0.7%) duplications (Table S16). Further KEGG analysis showed that the duplicated genes were enriched in monoterpenoid biosynthesis, sesquiterpenoid and triterpenoid biosynthesis, pyruvate metabolism, flavonoid biosynthesis, and phenylpropanoid biosynthesis (Fig. 3A). These results provided clues about the potential contribution of gene expansion and duplication to the accumulation of volatile substances.

Positively selected genes (PSGs) in R. tomentosa

To better understand the evolutionary footprint in the accumulation of volatile compounds, we further analyzed the positively selected genes in R. tomentosa. Positive selection analyses for R. tomentosa, P. guajava, E. grandis, and E. citriodora were conducted using the orthologs from P. granatum as the outgroup. We identified 872 genes possibly under positive selection in R. tomentosa among the 3,923 single-copy orthologous genes (P < 0.001; Table S17). A GO functional classification of PSGs indicated that the terms associated with DNA repair, protein ligase, membrane-bounded organelle, intracellular membrane-bounded organelle, and vesicle transport were significantly over-represented (Fig. S7). We found six PSGs involved in terpenoid synthesis (Table S17). Moreover, these PSGs were detectable at the transcriptional level in various stages of development, especially in leaf (Fig. 3B, Table 2). We also identified three positively selected genes related to the stomatal development pathway (Table S17). Taken together, these results indicated that R. tomentosa exhibited a remarkable pattern of adaptive evolution in response to environmental cues.

Table 2 Positively selected genes associated with terpenoid synthesis in R. tomentosa

TPS family genes probably affect terpenoid synthesis

To infer the influence of the TPS family on terpenoid biosynthesis in R. tomentosa, molecular evolutionary analysis was conducted. In total, 43 and 32 TPS genes were identified in R. tomentosa and P. guajava, respectively (Table 3, Table S18), the former including 7 previously reported RtTPS genes [14]. To gain further insights into the RtTPS gene members, we surveyed the evolutionary relationships (Fig. 4A), motifs (Fig. 4B), domains (Fig. 4C), gene structure and chromosomal location of each TPS gene copy (Fig. 4D, Fig. S8). The RtTPS genes were classified into six subgroups based on their conserved domain structures. Gene structure and conserved domain analysis revealed that all TPSs had conserved domains associated with terpene biosynthesis (Fig. 4), which suggests a conserved function in these RtTPSs. These results revealed both conservation and divergence among the RtTPS subfamilies.

Table 3 Expression levels of structural genes associated with terpenoid synthesis in R. tomentosa

To better understand the evolutionary relationships and classification of the RtTPS members, an ML phylogenetic tree was generated based on amino acid sequences of the TPS domains from M. alternifolia, P. guajava, and R. tomentosa (Fig. 5). Compared with P. guajava, we found a specifically expanded TPS-a subgroup in M. alternifolia and R. tomentosa, which may be related to the accumulation of sesquiterpenoid compounds, such as β-caryophyllene. Within the TPS-a subgroup, the a1 subgroup was enriched with RtTPSs, whereas MaltTPSs predominantly clustered in the a2 subgroup. Additionally, the TPS-a3 clade was completely missing in P. guajava. This suggested that the diversity of TPS-a subgroups led to the accumulation of special sesquiterpenes in different plants of the Myrtaceae family. Compared with R. tomentosa and M. alternifolia, P. guajava had more TPS family genes in the TPS-g subgroup.

To probe the underlying mechanism of the terpene accumulation pattern, we drew a predicted terpene biosynthesis pathway with the expression of structural genes in different tissues of R. tomentosa using transcriptome data (Fig. 6). Different structural genes participating in the cytosolic MVA pathway and plastid MEP pathway were identified in this study, exhibiting distinct expression patterns. Tissue-specific expression analysis revealed that the RtTPS genes were differentially expressed in various rose myrtle tissues. RtTPS family genes were predominantly expressed in leaf and showed low expression in root, especially those in the TPS-a1 subgroup. Additionally, we found that two TPS genes, RtTPS03 and RtTPS39, belonging to the TPS-a and -b subgroups, were highly expressed in leaf. The results indicated that RtTPS family genes affected characteristic terpene accumulation through subgroup-specific expansion and functional differentiation.

Discussion

Diversity among genome datasets contributes to comparative genomics analysis

Rose myrtle belongs to the family Myrtaceae [49], which has attracted increased attention recently because of its industrial and economic applications. A gap-free rose myrtle T2T genome was reported while we were preparing this manuscript [32]. The genome size, GC content, genome structure, and gene number of the R. tomentosa genome presented here are quite similar to those of the reported gap-free genome. These results indicated that our assembly was of high quality, and it will provide useful datasets for comparative genomics. Another genome was subsequently reported, but its genome size (442 Mb) is smaller than that of the gap-free genome and our genome [50].

Tandem duplication and specific subfamily expansion of TPS in R. tomentosa

This manual annotation of the rose myrtle genome revealed that genes and pseudogenes from the same TPS subfamily with high sequence similarities were frequently located in close proximity on the same chromosome. This marked clustering of TPS genes into tandem arrays in rose myrtle paralleled the tandem clusters found in M. alternifolia [51]. Our study found that gene duplication through unequal crossing over, with subsequent sub- or neo-functionalization, and the expansion of specific subfamilies were critical mechanisms underpinning the evolution of TPS in rose myrtle. The mechanisms of tandem duplication and specific subfamily expansion are considered contributors to the adaptive diversification of genes [52], such as TPS family genes, as they are more likely to be retained following gene duplication due to stress pressures [53].

The distinct gene subgroups of TPS affect specific terpene accumulation

All angiosperm TPS subfamilies are represented in R. tomentosa, but variations in the size of certain subfamilies relative to the other Myrtaceae were observed [2]. The largest distinctions were evident in subfamilies that produce secondary metabolites and are thus likely to be subject to adaptive pressures. For example, R. tomentosa has twice as many TPS-a (sesquiterpenoid) genes as P. guajava, which is similar to M. alternifolia [51]. This subgroup in rose myrtle is likely to have had the same significance historically as it had in M. alternifolia, where it contributed to the abundance of aromatic compounds. P. guajava has more TPS-g subgroup genes than M. alternifolia and R. tomentosa, and these results indicate that the distinctive gene subgroups of TPS led to the biosynthesis and accumulation of different aromas. In particular, we found that TPS-a1 subgroup genes were significantly expanded and thus are key potential targets for producing β-caryophyllene in R. tomentosa [14]. The aromatic compounds and essential oils present are a key indicator in determining the economic value of R. tomentosa [54]. Thus, an in-depth understanding of terpene metabolism will help improve the potential application of secondary metabolites.

Conclusion

We presented a high-quality chromosome-level reference genome for R. tomentosa. The genome characteristics, including the genome size, GC content, genome structure, gene number, genome duplication and divergence time from close relatives, were quite consistent with those of a recently reported gap-free R. tomentosa genome. Detailed genomic information on R. tomentosa has illustrated the evolutionary relationships of the TPS gene family associated with terpene accumulation, especially the TPS-a subfamily, which plays an important role in synthesizing the special terpenes. Our study provides a further opportunity to research the potential application of secondary metabolites among Myrtaceae in the future.

Availability of data and materials

The R. tomentosa genome assembly and transcriptome raw reads were submitted to BIG Sub (https://ngdc.cncb.ac.cn/search/specific?db=bioproject&q=PRJCA013967).

Abbreviations

DMAPP: Dimethylallyl diphosphate
IPP: Isopentenyl diphosphate
MVA: Mevalonate
MEP: Methylerythritol phosphate
TPS: Terpene synthase
BUSCO: Benchmarking Universal Single-Copy Orthologs
TPM: Transcripts per million
PSGs: Positively selected genes
KEGG: Kyoto Encyclopedia of Genes and Genomes
WGD: Whole-genome duplication
MRCA: Most recent common ancestor

References

Biffin E, Lucas EJ, Craven LA, Ribeiro da Costa I, Harrington MG, Crisp MD. Evolution of exceptional species richness among lineages of fleshy-fruited Myrtaceae. Ann Bot. 2010;106(1):79–93.
Padovan A, Keszei A, Külheim C, Foley WJ. The evolution of foliar terpene diversity in Myrtaceae. Phytochem Rev. 2014;13:695–716.
Barbosa dMÂA, Celeste dJPF, Oliveira FO, Pompeu VEL, Diniz dNL, Moraes CM, Pereira dSDR, Sandro P, Santana dOM, Helena dAAE. Myrcia paivae O.Berg (Myrtaceae) essential oil, first study of the chemical composition and antioxidant potential. Molecules. 2022;27(17):5460–5469.
Merghni A, Noumi E, Hadded O, Dridi N, Panwar H, Ceylan O, Mastouri M, Snoussi M. Assessment of the antibiofilm and antiquorum sensing activities of Eucalyptus globulus essential oil and its main component 1,8-cineole against methicillin-resistant Staphylococcus aureus strains. Microb Pathogenesis. 2018;118:74–80.
Article CAS Google Scholar
Galan DM, Ezeudu NE, García JV, Geronimo CA, Berry N, Malcolm BJ. Eucalyptol (1,8-cineole): an underutilized ally in respiratory disorders? J Essent Oil Res. 2020;32:103–10.
Article CAS Google Scholar
Kainer D, Padovan A, Degenhardt J, Krause ST, Mondal P, Foley WJ, Külheim C. High marker density GWAS provides novel insights into the genomic architecture of terpene oil yield in Eucalyptus. New phytol. 2019;223(3):1489–504.
Article CAS PubMed Google Scholar
Calvert J, Baten A, Butler JB, Barkla BJ, Shepherd M. Terpene synthase genes in Melaleuca alternifolia: comparative analysis of lineage-specific subfamily variation within Myrtaceae. Plant Syst Evol. 2021;304:111–21.
Article Google Scholar
Corona-Gómez L, Hernández-Andrade L, Mendoza-Elvira SE, Suazo FM, Ricardo-González DI, Quintanar-Guerrero D. In vitro antimicrobial effect of essential tea tree oil (Melaleuca alternifolia), thymol, and carvacrol on microorganisms isolated from cases of bovine clinical mastitis. Int J Vet Sci Med. 2022;10:72–9.
Article PubMed PubMed Central Google Scholar
Hamid HA, Mutazah SR, Yusoff M. Rhodomyrtus tomentosa: A phytochemical and pharmacological review. Asian J Pharm Clin Res. 2017;10(1):10–6.
Article Google Scholar
Vo T-S, Ngo D-H. The health beneficial properties of Rhodomyrtus tomentosa as potential functional food. Biomolecules. 2019;9:76.
Article CAS PubMed PubMed Central Google Scholar
Srisuwan S, Mackin KE, Hocking DM, Lyras D, Bennett-Wood V, Voravuthikunchai SP, Robins-Browne RM. Antibacterial activity of rhodomyrtone on Clostridium difficile vegetative cells and spores in vitro. Int J Antimicrob Agents. 2018;52(5):724–9.
Article CAS PubMed Google Scholar
Zhao ZF, Wu L, Xie J, Feng Y, Tian J, He X, Li B, Wang L, Wang X, Zhang Y et al. Rhodomyrtus tomentosa (Aiton.): A review of phytochemistry, pharmacology and industrial applications research progress. Food Chem. 2020;309:125715–125724.
Pham TN, Le XT, Pham VT, Le HT. Effects of process parameters in microwave-assisted extraction on the anthocyanin-enriched extract from Rhodomyrtus tomentosa (Ait.) Hassk and its storage conditions on the kinetic degradation of anthocyanins in the extract. Heliyon. 2022;8:e09518-e09525.
He SM, Wang X, Yang SC, Dong Y, Zhao QM, Li YJ, Cong K. De novo transcriptome characterization of Rhodomyrtus tomentosa leaves and identification of genes involved in α/β-pinene and β-caryophyllene biosynthesis. Front Plant Sci. 2018;9:1231–42.
Article PubMed PubMed Central Google Scholar
Hiranrat A, Mahabusarakam WJT. New acylphloroglucinols from the leaves of Rhodomyrtus tomentosa. J Asian Nat Prod Res. 2008;64(49):11193–7.
CAS Google Scholar
Kuntorini EM, Nugroho LH. yani, Nuringtyas TR: Anatomical structure, flavonoid content, and antioxidant activity of Rhodomyrtus tomentosa leaves and fruits on different age and maturity level. Biodiversitas. 2019;20:12.
Article Google Scholar
Uddin ABMN, Hossain F, Reza ASMA, Nasrin MS, Alam AHMK. Traditional uses, Pharmacological activities, and phytochemical constituents of the genus Syzygium: a review. Food Sci Nutr. 2022;10(6):1789–819.
Article PubMed PubMed Central Google Scholar
Hu X, Chen Y, Dai J, Yao L, Wang L. Rhodomyrtus tomentosa fruits in two ripening stages: Chemical compositions, antioxidant capacity and digestive enzymes inhibitory activity. Antioxidants. 2022;11:1390.
Article CAS PubMed PubMed Central Google Scholar
Bakkali F, Averbeck S, Averbeck D, Idaomar M. Biological effects of essential oils–a review. Food Chem Toxicol. 2008;46:446–75.
Article CAS PubMed Google Scholar
Singh B, Sharma RA. Plant terpenes: defense responses, phylogenetic analysis, regulation and clinical applications. 3 Biotech. 2014;5:129–151.
Vattekkatte A, Garms S, Brandt W, Boland W. Enhanced structural diversity in terpenoid biosynthesis: enzymes, substrates and cofactors. Org Biomol Chem. 2018;16:348–62.
Article CAS PubMed Google Scholar
Oldfield E, Lin FY. Terpene biosynthesis: modularity rules. Angew Chem Int Ed Engl. 2012;51(5):1124–37.
Article CAS PubMed Google Scholar
Zeng L, Dehesh K. The eukaryotic MEP-pathway genes are evolutionarily conserved and originated from Chlaymidia and cyanobacteria. BMC Genomics. 2021;22(1):137.
Article CAS PubMed PubMed Central Google Scholar
Vranová E, Coman D, Gruissem W. Network analysis of the MVA and MEP pathways for isoprenoid synthesis. Annu Rev Plant Biol. 2013;64:665–700.
Article PubMed Google Scholar
Degenhardt J, Köllner TG, Gershenzon J. Monoterpene and sesquiterpene synthases and the origin of terpene skeletal diversity in plants. Phytochem. 2009;70:1621–37.
Article CAS Google Scholar
Sun P, Schuurink RC, Caissard J-C, Hugueney P, Baudino S. My way: noncanonical biosynthesis pathways for plant volatiles. Trends Plant Sci. 2016;10(21):884–94.
Article Google Scholar
Lanier E, Andersen TB, Hamberger BR. Plant terpene specialized metabolism: Complex networks or simple linear pathways. Plant J. 2023;114(5):1178–201.
Article CAS PubMed Google Scholar
Karunanithi PS, Zerbe P. Terpene synthases as metabolic gatekeepers in the evolution of plant terpenoid chemical diversity. Front Plant Sci. 2019;10:1066.
Article Google Scholar
Boncan DAT, Tsang SSK, Li C, Lee IHT, Lam HM, Chan TF, Hui JHL. Terpenes and terpenoids in plants: interactions with environment and insects. Int J Mol Sci. 2020;21(19):7382.
Article CAS PubMed PubMed Central Google Scholar
Thrimawithana AH, Jones D, Hilario E, Grierson E, Ngo HM, Liachko I, Sullivan S, Bilton TP, Jacobs JME, Bicknell R, et al. A whole genome assembly of Leptospermum scoparium (Myrtaceae) for mānuka research. N Z J Crop Hortic Sci. 2019;47:233–60.
Article CAS Google Scholar
Külheim C, Padovan A, Hefer CA, Krause ST, Köllner TG, Myburg AA, Degenhardt J, Foley WJ. The Eucalyptus terpene synthase gene family. BMC Genomics. 2015;16:450.
Article PubMed PubMed Central Google Scholar
Li FP, Xu SQ, Xiao ZT, Wang JM, Mei Y, Hu HF, Li JY, Liu JY, Hou ZW, Zhao JL et al. Gap-free genome assembly and comparative analysis reveal the evolution and anthocyanin accumulation mechanism of Rhodomyrtus tomentosa. Hortic Res. 2023;10:uhad005.
Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nat Methods. 2019;17:155–8.
Article PubMed PubMed Central Google Scholar
Wagh S, Tople S, Benhamouda F, Kushilevitz E, Mittal P, Rabin T. Falcon: honest-majority maliciously secure framework for private deep learning. Proc Priv Enhanc Technol. 2021;2021:188–208.
Google Scholar
Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. Scaffolding pre-assembled contigs using SSPACE. Bioinform. 2011;27(4):578–9.
Article CAS Google Scholar
Wang YB, Zhang X. Chromosome scaffolding of diploid genomes using ALLHiC. Bio-protoc. 2022;101: e4503.
Google Scholar
Vurture GW, Sedlazeck FJ, Nattestad M, Underwood CJ, Fang H, Gurtowski J, Schatz MC. GenomeScope: fast reference-free genome profiling from short reads. Bioinform. 2017;33:2202–4.
Article CAS Google Scholar
Rhie A, Walenz BP, Koren S, Phillippy AM. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 2020;21(1):245.
Article CAS PubMed PubMed Central Google Scholar
Haas BJ, Papanicolaou A, Yassour M, Grabherr MG, Blood PD, Bowden JC, Couger MB, Eccles DA, Li B, Lieber M, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8:1494–512.
Article CAS PubMed Google Scholar
Langdon WB. Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks. Biodata Min. 2015;8:1.
Article CAS PubMed PubMed Central Google Scholar
Vijayakumar NK, Subramanian D. Cytologia: Cytotaxonomical studies in south indian Myrtaceae. Cytologia. 1985;50:513–20.
Article Google Scholar
Hoff KJ, Stanke M. Predicting genes in single genomes with AUGUSTUS. Curr Protoc Bioinform. 2019;65: e57.
Article Google Scholar
Leskovec J, Sosič R. SNAP: A general-purpose network analysis and graph-mining library. ACM Trans Intell Syst Technol. 2016;8(1):1–20.
Article PubMed PubMed Central Google Scholar
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.
Article CAS PubMed PubMed Central Google Scholar
Mendes FK, Vanderpool D, Fulton B, Hahn MW. CAFE 5 models variation in evolutionary rates among gene families. Bioinform. 2021;36(22–23):5516–8.
Article Google Scholar
Wang YP, Tang HB, DeBarry JD, Tan X, Li JP, Wang XY, Lee T-H, Jin HZ, Marler BS, Guo H, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012;40:e49–e49.
Article CAS PubMed PubMed Central Google Scholar
Chen C, Wu Y, Li J, Wang X, Zeng Z, Xu J, Liu Y, Feng J, Chen H, He Y, et al. TBtools-II: A “One for All, All for One” bioinformatics platform for biological big-data mining. Mol Plant. 2023;16:1733–42.
Article CAS PubMed Google Scholar
Yang ZH. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–91.
Article CAS PubMed Google Scholar
Grattapaglia D, Vaillancourt RE, Shepherd M, Thumma B, Foley WJ, Külheim C, Potts BM, Myburg AA. Progress in Myrtaceae genetics and genomics: Eucalyptus as the pivotal genus. Tree Genet Genomes. 2012;8:463–508.
Article Google Scholar
Detcharoen M, Bumrungsri S, Voravuthikunchai SP. Complete genome of rose myrtle, Rhodomyrtus tomentosa, and its population genetics in Thai Peninsula. Plants. 2023;12:1582.
Article CAS PubMed PubMed Central Google Scholar
Voelker J, Mauleon R, Shepherd M. The terpene synthase genes of Melaleuca alternifolia (tea tree) and comparative gene family analysis among Myrtaceae essential oil crops. Plant Syst Evol. 2023;309:13.
Article CAS Google Scholar
Rathnayaka S, Kaewwongwal A, Laosatit K, Yimram T, Lin Y, Chen X, Nakazono M, Somta P. Tandemly duplicated genes encoding polygalacturonase inhibitors are associated with bruchid (Callosobruchus chinensis) resistance in moth bean (Vigna aconitifolia). Plant Sci. 2022;323: 111402.
Article Google Scholar
Hanada K, Zou C, Lehti-Shiu MD, Shinozaki K, Shiu S-H. Importance of lineage-specific expansion of plant tandem duplicates in the adaptive response to environmental stimuli. Plant Physiol. 2008;148:1003–993.
Article Google Scholar
Salni S, Marisa H. Antibacterial activity of essential oil from rose myrtle leaves (Rhodomyrtus tomentosa (Ait.) Hassk). Molekul. 2020;15:158–165.

Acknowledgements

Not applicable.

Funding

This work was supported by the Guangdong Forestry Science and Technology Innovation Project (No. 2020-KJCX011), National Natural Science Foundation of China (No. 32070340) and Guangzhou Innovation Leading Team Project (No. 202009020004).

Author information

Ling Yang and Jingjing Jin contributed equally to this work.

Authors and Affiliations

Key Laboratory of National Forestry and Grassland Administration On Plant Conservation and Utilization in Southern China & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
Ling Yang, Shanwu Lyu, Huiguang Li & Shulin Deng
University of Chinese Academy of Sciences, Beijing, 100049, China
Ling Yang & Shulin Deng
National Tobacco Gene Center, Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
Jingjing Jin, Peijian Cao & Peng Lu
Guangdong Eco-Engineering Polytechnic, Guangzhou, 510520, China
Fangqiu Zhang & Qiaomei Qin
State Key Laboratory of Conservation and Utilization of Bio-Resources in Yunnan & the Key Laboratory of Medicinal Plant Biology of Yunnan Province, Yunnan Agricultural University, Kunming, 650201, China
Guanghui Zhang
Jiangxi Provincial Key Laboratory of Ex Situ Plant Conservation and Utilization, Lushan Botanical Garden, Chinese Academy of Sciences, Jiujiang, 332900, China
Chen Feng

Contributions

SLD conceived the project. SLD, LY, PJC and FQZ designed the study. LY, JJJ, QMQ, PL, CF, HGL, GHZ, and SWL performed the sampling, experiments, and data analysis. JJJ, YL, and SWL designed and visualized the figures. LY, JJJ and SLD wrote the manuscript. All authors read and approved the final manuscript.

Authors’ information

LY and JJJ contributed equally to this work and share the first co-authorship.

Corresponding author

Correspondence to Shulin Deng.

Ethics declarations

Ethics approval and consent to participate

The plant material used in this study complies with relevant institutional, national, and international guidelines and legislation. The mature R. tomentosa individual was cultivated in the South China National Botanical Garden (Chinese Academy of Sciences, Guangzhou, China). The voucher specimen is kept at the South China Botanical Garden Herbarium (IBSC 0925721).

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

12864_2024_10509_MOESM1_ESM.pdf

Supplementary Material 1: Fig S1. K-mer frequency distribution curve of Illumina short reads for the R. tomentosa genome by GenomeScope. Fig S2. Chromosome karyotype analysis of R. tomentosa. 2n = 2x = 22. Bar = 10 μm. Fig S3. Hi-C contact data mapped to the R. tomentosa chromosomes. Fig S4. Merqury assembly spectrum plots for evaluating k-mer completeness of the R. tomentosa chromosomes. Fig S5. Comparison of gene models between R. tomentosa and other species. Fig S6. Distribution of synonymous substitution levels (Ks) of paralogous (A) and orthologous genes (B). Fig S7. GO analysis of positively selected genes in R. tomentosa. Fig S8. Chromosomal location of TPSs in R. tomentosa.

12864_2024_10509_MOESM2_ESM.xlsx

Supplementary Material 2: Table S1. Summary statistics for the raw sequencing dataset. Table S2. Summary statistics for the final genome assembly of R. tomentosa. Table S3. Evaluation of the genome assembly of R. tomentosa using Benchmarking Universal Single-Copy Orthologs (BUSCO). Table S4. Statistics of the R. tomentosa RNA-Seq data from different tissues. Table S5. Statistics of the repeat annotation results. Table S6. Statistics of gene annotation. Table S7. Statistics of non-coding gene annotation. Table S8. List of plant genome sequences used in the comparative genomic analysis. Table S9. Gene families clustered by OrthoFinder in 14 species. Genes used for OrthoFinder were proteins without splice variants. Table S10. KEGG enrichment analysis of species-specific genes in R. tomentosa. Table S11. GO enrichment analysis of species-specific genes in R. tomentosa. Table S12. KEGG enrichment analysis of significantly expanded genes in R. tomentosa. Table S13. GO enrichment analysis of significantly expanded genes in R. tomentosa. Table S14. KEGG enrichment analysis of significantly contracted genes in R. tomentosa. Table S15. GO enrichment analysis of significantly contracted genes in R. tomentosa. Table S16. Classification of different origins of duplicate genes in R. tomentosa. Table S17. Positively selected genes in R. tomentosa. Table S18. The RtTPS genes in R. tomentosa.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Yang, L., Jin, J., Lyu, S. et al. Genomic analysis based on chromosome-level genome assembly reveals Myrtaceae evolution and terpene biosynthesis of rose myrtle. BMC Genomics 25, 578 (2024). https://doi.org/10.1186/s12864-024-10509-6

Received: 19 December 2023
Accepted: 06 June 2024
Published: 10 June 2024
DOI: https://doi.org/10.1186/s12864-024-10509-6
