Integration of small RNAs, degradome, and transcriptome sequencing provides insights into the differences between Shizhu ginseng and Yuan ginseng

Panax ginseng is one of the most popular herbs which have been used as an important traditional Chinese medicine since ancient times. Yuan ginseng and Shizhu ginseng,which belong to P. ginseng, are widely used as substitutes for wild ginseng in clinical practice. Clinical practice has proved that the clinical efficacy of Shizhu ginseng is better than Yuan ginseng. However, current research cannot completely explain this phenomenon. Considering that small RNA may be one of the pharmacodynamic substances of P. ginseng, it is challenging to investigate differential miRNAs between Shizhu ginseng and Yuan ginseng. In this study, the transcriptome, small RNAome and degradome of P. ginseng were studied by high-throughput sequencing. A total of 63,875 unigenes and 43,950,137 small RNA clean reads were obtained from the roots of P. ginseng. Among 3206 differentially expressed genes, 1190 genes were up-regulated in Yuan ginseng when compared with Shizhu ginseng. 24 known differential miRNAs and 7 novel differential miRNAs were obtained. The 304 targets of 24 differentially expressed miRNA (17 known and 7 novel) families are mainly related to energy metabolism, biotic stress and disease immunity in ginseng itself. Through the association analysis of mRNA and miRNA, our work gives a better understanding of the difference between Yuan ginseng and Shizhu ginseng. Considering the cross-kingdom regulation of plant miRNAs, our results may provide a foundation for understanding the miRNA-dependent clinical efficacy in P. ginseng.


Introduction
Ginseng is the dry root and rhizome of Panax ginseng C.A. Mey, which belongs to the Panax genus in Araliaceae family. It has widely been used as an important traditional medicine for thousands of years in Eastern Asia, especially China. Evidence from clinical trials demonstrated that P. ginseng could play significant roles in many human diseases such as cancer, neurodegenerative diseases (Rajabian et al. 2019) and so on. Owing to its important medicinal properties, P. ginseng has been highly valued and has been called ''the king of herbs''. Because P. ginseng is consumed in large quantities and grows very slowly in natural conditions, wild-grown ginseng cannot meet the growing demand from the medical market. Consequently, P. ginseng was artificially cultivated to meet demand for this precious herb and cultivation of this herb dates back at least as far as the Ming dynasty. According to different growth years, cultivated ginseng can be divided into Shizhu ginseng and Yuan ginseng. Yuan ginseng is a ginseng artificially cultivated in gardens or fields for 6-8 years. Shizhu ginseng is a ginseng cultivated in a special method for 10-20 years, and its medicinal value is comparable to that of wild ginseng.
Although Shizhu ginseng and Yuan ginseng belong to cultivated ginseng, years of medical practice have shown that the clinical efficacy of Shizhu ginseng is significantly better than that of Yuan ginseng. To date, modern pharmacological research has focused on the ginsenosides and polysaccharides, the major bioactive compound of P. ginseng. However, it cannot completely explain the reasons for the huge difference in clinical efficacy between Shizhu ginseng and Yuan ginseng (Chunlu et al. 2016;Keqiang et al. 2013).
MicroRNAs (miRNAs) are a class of endogenous single-stranded RNAs of about 20 nt to 24 nt that can be paired with the 3 0 -untranslated region (UTR) of a target gene in a complete or incomplete manner. It can regulate gene expression at post-transcriptional levels (Dong et al. 2013), which plays a major role in lots of aspects (Yingfang et al. 2019). MiRNA research technology such as small RNA sequencing has been widely used in clinical and basic research which has become an important means for scientists to study the function of miRNAs. The transcriptome reflects the expression of specific cell or tissue genes at a specific development or physiological stage. Usually, differentially expressed genes (DEGs) were combined with differentially expressed miRNAs (DEMs) to analyze the regulatory mechanisms in which miRNAs might be involved. Degradome sequencing, as a highthroughput experimental method for identifying the target transcripts of plant miRNAs , can be employed to verify the miRNA target genes in plants accurately on a large scale (Mathiyalagan et al. 2013).
In this study, high-throughput sequencing technology was used to compare the differences of miRNAs contained in Shizhu ginseng and Yuan ginseng. In order to explore the mechanism of the difference in clinical efficacy of ginseng with different growth years, we began with the biological functions of non-coding miRNAs contained in Shizhu ginseng and Yuan ginseng.

Plant material
The 15-year-old Shizhu ginseng and the 6-year-old Yuan ginseng were all collected from the ginseng market of Changbai Mountain in Wanliang Town, Fusong County, Jilin Province, China. They were identified as Panax ginseng C.A. Mey by the Associate Professor Hongyan Ma from the School of Traditional Chinese Medicine of Guangdong Pharmaceutical University. The main roots of ginseng were cleaned with distilled water, cut into small pieces, and then immediately frozen in liquid nitrogen until further processing.

RNA extraction
The total RNA was extracted by using Trizol reagent (Invitrogen) according to the manufacturer's protocols, and 100 mg of snap-frozen Shizhu ginseng and Yuan ginseng root tissue was used, respectively. The quantity and purity of the total RNA were measured by 1% agarose gel electrophoresis, NanoDrop 2000 spectrophotometer and Agilent 2100 Bioanalyzer (RNA integrity number (RIN) [ 7.0, 28S/18S C 0.7). For each of two samples, at least three individual ginseng roots were collected for RNA isolation. The mixed total RNA from Yuan ginseng or Shizhu ginseng was used for library construction for transcriptome, small RNA, and degradome sequencing. The library construction and sequencing were conducted by OE Biotech (Shanghai, China).
Library construction and high-throughput sequencing for transcriptome, small RNA and degradome The high-quality total RNA was used for transcriptome library construction using Illumina's mRNA-Seq sample preparation kit. Briefly, mRNA was isolated from total RNA using oligo-dT magnetic beads and fragmented, followed by cDNA synthesis. The cDNA fragments were amplified to generate transcriptome libraries which were sequenced on Illumina HiSeq 2500 platform.
Two libraries were constructed using the extracted RNA according to standard procedures for small RNA library construction and sequencing. In brief, high-quality RNA from each sample was separated by polyacrylamide gel electrophoresis. RNA fragments of length 18-30 nt were enriched and were ligated to 3 0 and 5 0 adapter. Then the fragments were t subjected to cDNA synthesis, followed by PCR amplification and sequencing on Illumina HiSeq 2500.
The degradome library construction was performed as previously described (Garg et al. 2019) followed by sequencing.

Analysis of transcriptome sequencing data
The raw data obtained from sequencing all these samples were processed to remove adaptor contamination and lowquality reads using FASTQC (Cock et al. 2010). The Trinity software was used to assemble the unigene sets from the clean reads of Yuan ginseng and Shizhu ginseng, using the default parameters. After short-sequence assembly and splicing, non-all-unigene was obtained. Then it was compared with NCBI non-redundant (NR), EuKaryotic Orthologous Groups (KOG), Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO) and other databases to obtain the protein functional annotation and metabolic pathway information of unigene. The gene expression level for each transcript was calculated by FPKM (reads per kb per million reads). Differentially expressed genes (DEGs) were identified between different samples by using DESeq software (Kvam et al. 2012). A gene was considered to be differentially expressed between two samples if it exhibited fold change (FC) value of C 2 (up-regulation) or B 0.5 (down-regulation) with the p value B 0.05. GO functional enrichment and KEGG pathway analyses were performed, respectively, to understand the function of DEGs.

Analysis of small RNA sequencing data
Adapters were removed and low-quantity sequences were filtered from raw data to obtain clean reads. The clean reads were aligned with the Rfam, Gene, Repbase to exclude rRNA, tRNA, snRNA, snoRNA, and other ncRNA, and repeats. The remaining reads were aligned with miRBase 21.0 databases to map known miRNAs, and the unpaired sequences were used for the prediction of novel miRNAs using miRDeep2 software (Mackowiak 2011). The expression levels of known miRNAs and newly predicted miRNAs were calculated by TPM (transcript per million). Differentially expressed miRNAs (DEMs) were identified with FC [ 2 (up-regulation) or FC \ 0.5 (downregulation) and the threshold of p value \ 0.05. The p value was calculated with Audic Claverie statistic. miRNA targets were predicted from the assembled P. ginseng unigene set by the psRNATarget server using default parameters. GO annotation and KEGG pathway analysis were performed on the target genes of DEMs.

Analysis of degradome sequencing data
Clean tags were obtained by data filtering. After annotation of Rfam and Genbank database and identification of Poly N,functional annotations for clean tags were obtained. The remaining tags were compared with reference genome to obtain cDNA-sense. The tags (cDNA-sense) which positively aligned to the reference genome (unigenes) in the clean tags were selected to predict and analyze the miRNA degradation sites. Cleaveland 3 software was used for identification and annotation of miRNA cleavage sites. The reference gene sequence in which the predicted degradation site is located is aligned with the Nr, KEGG, and GO databases to obtain functional annotation information of the degraded gene.

Validation of DEMs and DEGs expression by qRT-PCR
17 DEGs and 9 DEMs of interest were randomly selected for quantitative real-time PCR (qRT-PCR) analysis. The total RNA was extracted by using the Trizol reagent, then it was determined using a NanoDrop 2000 spectrophotometer and agarose gel electrophoresis stained with ethidium bromide. Quantification was performed with a two-step reaction process: reverse transcription (RT) and PCR. qRT-PCR analysis was carried out on LightCycler Ò 480 II Realtime PCR Instrument (Roche, Swiss) using QuantiFast Ò SYBR Ò Green PCR Master Mix (Qiagen, Germany). Each sample was run in triplicate for analysis. The expression levels of mRNAs and miRNAs were normalized to 18S rRNA and U6 respectively, and were calculated using the 2 -DDCt method.

Total RNA quality, concentration and integrity testing
Agarose gel electrophoresis showed that the 28S and 18S bands of the two samples were clear and complete (Fig. 1), indicating that the extracted RNA was in good integrity. And the total RNA band brightness of the Yuan ginseng was twice as bright as that of the total RNA band of the Shizhu ginseng. This may be due to the long growth period of the Shizhu ginseng that enriched secondary metabolites such as polysaccharide and polyphenols.
The Agilent 2100 Bioanalyzer was used to perform a quality check on the RNA indicators (Table S1). The concentrations of Shizhu ginseng and Yuan ginseng were 909.24 ng lL -1 and 1008.79 ng lL -1 , respectively. RNA integrity number (RIN) of Shizhu ginseng and Yuan ginseng was 9.7 and 8.8, respectively. 28S/18S of Shizhu ginseng and Yuan ginseng was 1.9 and 1.8, respectively. The results showed that the total RNA quality met the criteria for high-throughput database construction and the Illumina HiSeq TM 2500 sequencing.

DEGs between Shizhu ginseng and Yuan ginseng
To identify transcripts that are involved in different growth year of P. ginseng, RNA-seq libraries from Shizhu ginseng and Yuan ginseng were constructed. Approximately 52.5 million raw reads were obtained from each RNA-seq library (Table S2). A total of 104.97 million reads were generated from the paired-end sequencing of these 2 samples. After the stringent quality filters, 94.8% (99.51 million) of the reads representing high-quality reads (Q30) were processed for further analysis. Then the high-quality reads were used for de novo assemble and 63,875 unigenes were obtained.
According to the differential gene screening method in Method, 3216 differentially expressed genes (DEGs) were obtained. 1190 genes were up-regulated in the Yuan ginseng compared to those of Shizhu ginseng, while 2026 genes were down-regulated (Fig. 2, Table S3). To verify DEGs identified by RNA-Seq,17 DEGs were randomly selected for qRT-PCR verification. The results showed the same expression patterns with RNA-Seq data (Fig. S1).  Fig. 2 Volcano map of DEGs. The X-axis is the average of all sample expressions used for comparison after normalization, and the Y-axis is log2 fold change. Red is marked with a difference of significant p value \ 0.05 unigene transcription DNA-templated, monolayer-surrounded lipid storage body, transcription factor activity, etc. The DEGs which down-regulated in the Yuan ginseng were enriched in the unsaturated fatty acid biosynthetic process, apoplast, heme binding, and so on (Fig. 3).

Functional analysis of DEGs
To further understand the function of DEGs between Shizhu ginseng and Yuan ginseng, we performed KEGG analysis. KEGG pathway enrichment analysis of DEGs was done based on hypergeometric model with a significance threshold of p value 0.05. Most of the significantly enriched up-regulated DEGs belong to the MAPK signaling pathway, photosynthesis-antenna proteins, spliceosome and so on. Down-regulation of DEGs was significantly enriched in pathways such as phenylpropanoid biosynthesis, biosynthesis of unsaturated fatty acids and fatty acid metabolism, etc. (Figure 4).

Identification of differentially expressed miRNAs (DEMs) among Shizhu ginseng and Yuan ginseng
The raw data obtained by high-throughput small RNA sequencing was purified to obtain clean reads, and the length distribution of small RNAs with a length of 18-41 nt was statistically analyzed. The small RNA sequences of Shizhu ginseng and Yuan ginseng were enriched at the length of 24-21 nt (51.64% and 66.03%, respectively), indicating that miRNA is the main fragment in the deep sequencing results, which provides a powerful basis for the analysis of miRNA.
Differential expression analysis between Shizhu ginseng and Yuan ginseng was performed. DEMs were selected with FC [ 2 or \ 0.5 and p value B 0.05 by TPM statistic.
For known miRNAs, 24 DEMs were obtained, 13 of which were up-regulated and 11 were down-regulated (Fig. 5, Table S4). A total of 39 novel miRNAs (21 new miRNA hairpin precursors and 18 miRNA* sequences) were obtained through prediction (Table S5). For novel miR-NAs, a total of 7 differential miRNAs were obtained, of which 5 novel miRNAs were up-regulated in the ginseng and 2 were down-regulated (Fig. 6). 7 DEMs were randomly selected for qRT-PCR verification and the results showed consistence with small RNA sequencing data (Fig. S2).

Target gene prediction and analysis of DEMs
Plant miRNAs show perfect or near-perfect complementarities to their targets. It allows an effective prediction of miRNA targets through computation. To predict the targets of DEMs in P. ginseng, we used psRNATarget (Dai et al. 2018), PMRD (Zhang et al. 2010) and TAPIR (Bonnet et al. 2010) and other biological software. We identified a total of 304 unigenes from 63,875 assembled P. ginseng unigenes to be targets of 17 known miRNA families and 7 novel miRNAs (Table 1).
GO and KEGG functional enrichment analysis was performed on 197 target genes of 17 known miRNA families, and a total of 260 GO functions and 36 pathway pathways were annotated. The GO analysis results contain three parts: biological process, cellular component, and molecular function, and all the most strongly enriched elements are shown in Fig. 7a. Within biological process, regulation of transcription, cell differentiation, leaf development and auxin-activated signaling pathway were the Through KEGG Pathway functional enrichment analysis, it is known that differential miRNA target genes are enriched in plant hormone signal transduction, Wnt signaling pathway, ubiquitin mediated proteolysis and so on (Fig. 7c).
For target genes of 7 novel miRNAs, we performed KEGG and GO enrichment analysis. A total of 167 GO functional categories and 37 KEGG pathways were annotated. The results of GO analysis have been showed in Fig. 7b Within biological process, regulation of transcription, transcription (DNA-templated) and root development were the most significantly enriched elements; within cellular component, the most significantly enriched element were nucleus, integral component of membrane and plasmodesma; and within molecular function, binding, catalytic activity and transcription factor activity groups were the most significantly enriched elements. These GO functional categories of novel miRNA target gene were very similar to those predicted by known differential miRNAs.
KEGG Pathway functional enrichment showed that the target genes of novel miRNAs were enriched in plant hormone signal transduction, p53 signaling pathway, metabolism, ubiquitin mediated proteolysis (Fig. 7d). This Fig. 4 The KEGG analysis of differential expressed genes. a The KEGG analysis of up-regulated genes. b The KEGG analysis of downregulated genes

miRNA targets validation by degradome sequencing
To further understand the role of miRNA in P. ginseng, degradome sequencing of P. ginseng was used to identify miRNA targets. The reference gene sequence in which the degraded gene fragment is located is aligned with the NR, KEGG, and GO databases using Cleaveland 3 software to perform functional annotation of the target gene. Two target genes for miR159 were identified, namely GAM1_ORYSI and TCP24_ARATH. GAMYB transcription factor is an important transcription factor in the gibberellin signal transduction pathway, and TCP transcription factor is involved in various signal transduction pathways, connected growth and mediated stress response. miR393 targets AFB2_ARATH and TIR1_ARATH to altering the auxin distribution in plants by negative regulation of TIR1 (transport inhibitor response 1) and AFB2 (protein auxin signaling F-BOX). This was basically consistent with the target gene type in high throughput sequencing of small RNA. MiR396 participates in the growth process of ginseng by targeting the expression of GRF (growth-regulating factor) and protein auxin signaling . The target genes of the three novel miRNAs were detected to be associated with SPX factor and TIR1 disease resistance factor (Table 2).

Discussion
In this study, three high-throughput approaches, namely, transcriptomics, small RNA and degradome sequencing were used for better understanding of differences between Shizhu ginseng and Yuan ginseng. After the stringent quality filters and de novo assemble, a total of 63,875 unigenes were obtained. We have identified 3206 DEGs (1190 up-regulated and 2016 down-regulated). These DEGs mainly enriched in phenylpropanoid biosynthesis, biosynthesis of unsaturated fatty acids, apoplast, etc. Plants are rich in miRNA, which can regulate plant growth and development, nutrient metabolism and stress response. For example, the target gene ICE1 (transcription factor of basic helix-loop-helix (bHLH) DNA-binding superfamily protein) of miR393 plays a role in cold tolerance by controlling cold-responsive genes (Ohta et al. 2018).
The regulation of some miRNA target genes is involved in the production process of plant secondary metabolites, which is conducive to the accumulation of pharmacodynamic substances. The target gene of miR159 is MYB (Transcription factor MYB51), and the MYB transcription factor is involved in the biosynthesis of fat and glucosinolate (Frerigmann and Gigolashvili 2014). Glucosinolate is abiological compound with anti-cancer and anti-oxidant activity, and its hydrolysateisothiocyanate can prevent and treat various cancers, scavenge free radicals, protect cardiovascular and cerebrovascular, and delay aging (Konic-Ristic et al. 2016). The target gene of miR172 encodes the glycosyltransferase (GATLA),which is a member of a multi-gene family of plants that transfer single or multiple activated sugars to a range of plant molecules, resulting in glycosylation of plant compounds. The UDP glucosyltransferase gene P. ginseng is a key enzyme for the  (Lu et al. 2017), which is most likely to participate in triterpenoid synthesis (Fan et al. 2018;Khorolragchaa et al. 2014).
miRNA target genes are evolutionarily conserved among many known species. It was reported that plant miRNAs could transfer and regulate gene expression in a cross-kingdom manner, namely, affecting the organism from which they do not originate. Zhang Chenyu et al. fed fresh rice which contained abundant MIR168a to mice and found that MIR168a could bind to the exon 4 of low density lipoprotein receptor adapter protein 1 (LDLRAP1) mRNA and reduces its protein expression in vitro and in vivo, and consequently regulate the lipid metabolism of the mouse (Zhang et al. 2012). Zhang Chenyu et al. found that miR2911 contained in honeysuckle signifcantly inhibited H1N1-encoded PB2 and NS1 protein expression, and consequently played a direct role in the influenza virus in mice (Zhou et al. 2015). It has been found that the miRNA of dodder (Cuscuta spp.) can as a causative agent to target host mRNA across species to regulate its gene expression and play a role in the process of parasitization (Shahid et al. 2018). Plant miRNAs which were are more enriched in beebread than inroyal jelly can induce larval developed into worker bees by delaying development and decreasing body and ovary size (Zhu et al. 2017). It provides a basis for the cross-kingdom regulation of plant miRNAs.
Wang Yingfang et al. had detected abundant miRNAs in fresh ginseng and ginseng decoction Wenjuan et al. 2017). It is speculated that ginseng miRNA may be one of the material basis for its clinical efficacy. The difference in clinical efficacy of ginseng with different growth years may be caused by the regulation of its intrinsic miRNA. Based on this, the research group carried out this study. Shizhu ginseng and Yuan ginseng are different varieties of Chinese ginseng, and their growth years differ by more than 10 years. The DEMs of Shizhu ginseng and Yuan ginseng were identified by high-throughput sequencing, and some related target genes were annotated.
In the experiment, there were 17 known miRNAs and 7 novel miRNAs with different in Shizhu ginseng and Yuan ginseng. GO and KEGG functional enrichment analysis was performed on targets of known and novel miRNA families. For GO functional enrichment, in the main category of ''biological processes'', the most significant GO terms were ''regulation of transcription'', ''cell differentiation'', and ''auxin-activated signaling pathway''. With respect to the main category of ''cell component'' class, the ''nucleus'', ''integral component of membrane'' and ''cytoplasm'' was the domain term. Under the third main category of ''molecular function'', the dominant terms were ''binding'', ''catalytic activity'', and ''transcription factor activity''. In functional KEGG analysis, target genes are enriched in plant hormone signal transduction, p53 signaling pathway, metabolism, and Ubiquitin mediated proteolysis.
Among them, the target genes of miR396 are LU7L3 (luc7-like protein 3 isoform X2) and TBCD (tubulin-folding cofactor D). LUC7L is defined at the centromeric border of the human alpha-globin domain and inhibits replication of hepatitis B virus by inhibiting enhancer II or basal core promoter activity (Li et al. 2016). TBCD encodes tubulin folding cofactor D, a dynamic cytoskeletal component that coordinates and supports a variety of neuronal processes, including cell division, migration, intracellular trafficking, and signal transduction. Mutations or altered microtubule dynamics can cause neurodevelopmental disorders and neurodegenerative diseases (Flex et al. 2016;Miyake et al. 2016). Studies have shown that TBCD interacts physically with the intracellular domain of Down's syndrome cell adhesion molecule (Dscam), which is important for neural development (Okumura et al. 2015). The novel miRNA comp46638 targets B561A (cytochrome b561, DM13 and DOMON domain-containing protein). B561A is found in various organs and cell types of plants and animals. Physiological functions of CYB561s include stress defense, cell wall modification, iron metabolism, tumor suppression, and various neural processes and memory retention (Asard et al. 2013). TBCD and B561A are involved in the process of neurodevelopment, which may be the target of ginseng miRNAs to play a stimulating role.
The novel miRNA comp65451 target gene is involved in the regulation of TIR NBS-LRR (Toll-interleukin-like receptor, nucleotide-binding site and C-terminal leucinerich repeat), and may be a novel miRNA specific for ginseng species.
The target of miR319 is SR542, which is highly conserved and exists in bacteriaand mammals. Autosomal dominant mutations in SRP54 are key factors in co-translational protein targeting pathways, leading to neutropenia like Shwachman-Diamond syndrome (Carapito et al. 2017).
The target of miR393, NTPCR, encodes a cancer-associated nucleoside phosphatase, a cancer-associated PRUNE2 protein that is nucleotide-associated and highly expressed in mature neural tissue, and its neural tissuespecific and post-developmental expression may help maintain maturity the nervous system (Iwama et al. 2011).
Some target genes indirectly exert anti-tumor effects, such as NTPCR, MYB, FDL3, etc., by proteins encoding cancer-associated proteins and secondary metabolites with anticancer effects. Chin AR et al. found that oral miR159 mimics significantly inhibited the growth of mouse xenograft breast tumors (Chin et al. 2016).
The above analysis shows that this study lays a foundation for further exploring the differences between Shizhu ginseng and Yuan ginseng and revealing the possible regulation of ginseng miRNA and its target genes.