Background

Leaf color is one of the most important characteristics of ornamental plants, and plants with colored foliage were often called “colored-leaf plants” [1, 2]. Some researchers have analyzed and determined systematically the pigments and physiological indexes of the leaves of colored-leaf plants [3, 4]. Result showed that the change of leaf color is closely related to the synthesis and accumulation of anthocyanins in leaves [5]. Anthocyanins are one of the important secondary metabolites of plants and they often have anti-cancer, anti-oxidation and anti-atherosclerosis properties [6]. Anthocyanins confer orange, red, magenta, violet and blue and the biosynthetic pathway leading to floral or pulp pigment accumulation had been well characterized and the genes encoding relevant enzymes and transcriptional factors have been isolated [7, 8]. The molecular mechanisms of the anthocyanin biosynthesis pathway also have been comprehensively reported. However, most of the researches mainly focused on f fruit color [9] and petal color [10,11,12], and anthocyanin biosynthesis in colored-leaf plants has rarely been researched prior to this study. In recent years, some scholars have identified PAL, CHS, CHI, DFR, ANS, F3H, F3’H, F3’5’H [13, 14] and a few related regulatory genes such as MYB, bHLH and WDR in color changing of colored-leaf plants [15, 16]. The process of anthocyanin synthesis and accumulation is relatively complex, and is regulated by multiple enzymes and transcription factors [17], as well as being influenced by external environmental factors such as light [18], water stress [19], and temperature [20]. Thus the mechanism of leaf color change in colored-leaf plants needs to be further studied.

Acer pseudosieboldianum is a small deciduous tree belonging to the Acer genus of the family Aceraceae. Because of its beautiful shape and brilliant leaves, it is an often used autumn leaf ornamental tree species [21]. In addition, it has high economic value, whose woods can be used for making utensils and leaves can be used as dyes [22]. Recently, some scholars have reported and studied the introduction, cultivation, and breeding of A. pseudosieboldianum [23, 24]. However the key genes affecting leaf color change have not been determined yet, and relative information is relatively scarce. This fact means that the molecular regulatory mechanisms related to leaf color formation needs further study.

In recent years, transcriptome high-throughput sequencing technology has been widely used to study the mechanism of leaf color in various plants [25, 26]. In this study, de novo transcriptome sequencing assembly, annotation, and bioinformatic analysis on leaves from A. pseudosieboldianum were performed at different color-changing stages in autumn. The DEGs at different transformation stages were analyzed and validated. At last, combined this data with anthocyanin metabolism analysis data, some genes related to anthocyanin synthesis were identified. This study provides a theoretical basis for studying the molecular mechanism of leaf color in A. pseudosieboldianum.

Results

Contents of anthocyanin in the leaves

In order to explore the mechanism of pigment formation in A. pseudosieboldianum leaves, we carried out qualitative analysis of anthocyanin components in the middle (M) and last stage (A) of leaf color transformation (The anthocyanin content was extremely low in early stage (B), Therefore, only M and A stage were analyzed). According to our UPLC–Q–TOF–MS data, five anthocyanins were identified (Fig. 1). They were Peonidin O-hexoside, Rosinidin O-hexoside, Cyanidin 3-O-glucoside, Cyanidin 3, 5-O-diglucoside, and Pelargonidin 3-O-beta-D-glucoside. The content of five anthocyanin metabolites were different during the middle stage (M) and last stage (A). The contents of Rosinidin O-hexoside and Pelargonidin 3-O-beta-D-glucoside in the leaves were both very low. Peonidin O-hexoside and Cyanidin 3, 5-O-diglucoside, especially Cyanidin 3, 5-O-diglucoside in the leaves were abundantand, and displayed significant differences at two periods, meaning they may be the key substances for the final color of A. pseudosieboldianum.

Fig. 1
figure 1

Anthocyanins components and contents detected in A. pseudosieboldianum

Production statistics of sequencing data

In order to understand the molecular mechanism of color change in A. pseudosieboldianum leaves in autumn, sequencing was performed using the Illumina Hiseq 2500 (Additional file 1: Table S1). A total of 67.47 Gb of clean data was obtained from these sequencing results, and the percentage of Q30 bases was 93.10 % or more. After assembly, 50,501 unigenes were identified. Among these there were 20,706 unigenes over 1 kb in length, and the error rate of sequencing was less than 0.1 %, which indicates that the quality of sequencing data was good and could be used for subsequent analysis.

Statistics of sequencing data assembly results

These recombinant sequence dataset yielded 115,413 transcripts and 50,501 unigenes, among which, the N50 (accounting for 50 % of the maximum length nucleotide sequence of all single genes) was 2267 nt and 1979 nt, respectively. There were 17,366 (34.39 %) unigenes between 300 and 500 nt, 23,580 (46.69 %) unigenes between 500 and 2000 nt, and 9,555 (18.92 %) unigenes longer than 2000 nt (Table 1).

Table 1 Length distributions of the transcripts and unigenes from de novo assembly

Functional annotation and classification

Unigene sequence was then compared with gene sequences in the NR, Swiss-Prot GO, COG, KOG, eggNOG 4.5, and KEGG databases using BLAST software (e < 0.00001). 35,316 unigenes were identified, accounting for 70.01 % of the 50,501 unigenes. 12,984 unigenes were annotated in the COG database, 25,375, 12,487 and 19,460 unigenes were annotated in the GO, KEGG, and KOG databases respectively. 25,226 unigenes were annotated in the Pfam database. 19,796 unigenes and 32,498 unigenes were also annotated in the Swanshot and eggNOG databases respectively (Table 2).

Table 2 Statistics of comparisons with databases

According to NCBI NR database and E-value distribution, the number of unigenes annotated in our dataset was 35,024, of which 71.53 % of these unigenes (E < 10 − 50) had strong homology and 47.87 % of these unigenes (E < 10− 100) had very strong homology (Fig. 2a).Ten popular-related species were also annotated based on the NCBI NR database (Fig. 2b). The highest homology to A. pseudosieboldianum was Citrus sinensis, accounting for 12.25 % homology, followed by Citrus clementina, which accounted for 9.74 % homology.

Fig. 2
figure 2

Characteristics of homology search of A. pseudosieboldianum unigenes. a E-value distribution in the NR database for each unigene. b Species taxonomy based on the NR database

GO databases are divided into three categories: biological process, cellular component and molecular function, which are further divided into 42 functional subgroups. Biological process had the largest number of annotated unigenes, included metabolic process and cellular process with 13,141 (51.78 %) unigenes and 11,546 (45.5 %) unigenes, respectively. The cellular component class mainly included cell and cell part, with 11,886 (46.84 %) unigenes and 11,806 (46.53 %) unigenes, respectively. The molecular function category mainly included catalytic activity and binding, and there were 12,691 (50.01 %) unigenes and 1, 1049 (43.54 %) unigenes (Fig. 3).

Fig. 3
figure 3

Histogram of GO classification of assembled unigenes

In addition, Annotation data about COG and KEGG were found in Additional file 2: Fig. S1 and Additional file 3: Table S2, respecially.

Differentially Expressed Genes (DEGs)

In order to explore the genes related to anthocyanin biosynthesis in A. pseudosieboldianum at different color-changing stages, the differential expression of A. pseudosieboldianum samples at different color-changing stages were then analyzed. The results showed that there were 16,521 DEGs in the three color-changing periods of A. pseudosieboldianum (Fig. 4a). Comparing between the early stage (B) and the middle stage (M), there were 87 significant DEGs, with 52 up-regulated and 35 down-regulated. Between with the early stage (B) and the final stage (A), there were 14,855 DEGs, of which 7984 were up-regulated and 6871 were down-regulated. In a comparison of the middle stage (M) and the final stage (A), there were 12,402 DEGs, 5683 up-regulated and 6719 down-regulated, in A. pseudosieboldianum (Fig. 4b).

Fig. 4
figure 4

Differentially expressed genes at three stages. a The statistics of differentially expressed genes; b Venn Diagram result among three stages

In order to further understand the function of these respective DEGs, we carried out KEGG pathway enrichment analysis in the three stages of A. pseudosieboldianum. Our results showed that there were 16,521 differentially expressed genes in the three stages (B, M and A). The anthocyanin biosynthesis pathways related to leaf tone control were significantly enriched in B vs. M and B vs. A up-regulated genes. Phenylalanine metabolic pathways were significantly enriched in B vs. M and B vs. A up-regulated genes (Additional file 4: Table S3; Additional file 5: Table S4).

Candidate genes involved in the anthocyanin biosynthesis Pathway

Twenty candidate genes were identified that covered almost all known enzymes involved in anthocyanin biosynthesis. Four PAL genes (c118011.graph_c0, c118229.graph_c0, c60818.graph_c0, c97964.graph_c0), one CHS gene was detected (c100615.graph_c0), one CHI gene (c108255.graph_c0), two F3H genes (c114916.graph_c0, c56266.graph_c0) were detected in the upstream phenylalanine pathway, and two F3’H genes (c110935.graph_c0, c108910.graph_c0), one ANS genes, two DFR genes, and six GT genes also detected in the downstream phenylalanine pathway. Combined with contents of metabolites, the gene pathways related to anthocyanin synthesis were analyzed in A. pseudosieboldianum (Fig. 5).

Fig. 5
figure 5

Thermographic analysis of gene pathways related to flavonoid synthesis in A. pseudosieboldianum leaves at B, M and A stages. Early stage: B; mid- stage: M; last stage: A. B, M and A are arranged horizontally at all stages and single genes are listed vertically. The annotations are displayed next to the corresponding genes. All FPKM values of single genes are plotted logarithmically

Screening of different transcription factors for anthocyanin biosynthesis

Transcription factors play an important role in plant development and secondary metabolism. In this experiment, we screened out 31 MYB genes, 15 bHLH genes, and 28 WD40 protein genes from the three DEGs of B, M and A stages of A. pseudosieboldianum. In the 31 MYB genes, 17 were up-regulated and 14 down regulated (Additional file 6: Table S5). In the 15 bHLH genes, 6 were up-regulated and 9 down regulated. In the 28 WD40 protein genes, 25 were up-regulated and 3 down regulated.

qRT-PCR confirmation of RNA-seq data

In order to verify the accuracy of our sequencing data, we selected eight genes involved in anthocyanin biosynthesis, and analyzed the expression level in leaves of different color from these three different stages of A. pseudosieboldianum by qRT- PCR. The results showed that all of these selected genes had similar expression patterns than identified in the RNA sequencing data (Fig. 6). Therefore, the data obtained in our study can be used to analyze the anthocyanin biosynthesis and metabolism gene in A. pseudosieboldianum.

Fig. 6
figure 6

Expression analysis of eight differentially expressed genes related to flavonoid and anthocyanin biosynthesis in A. pseudosieboldianum

Discussion

A. pseudosieboldianum is a wild ornamental maple native to Northeast China. Like A. palmatum Thunb., A. pseudosieboldianum belongs to Sect. Palmata Paxand Ser. Palmata (Pax) Pojark. There were many cultivars of A. palmatum and they had strong ecological adaptability [27]. However, there are few varieties of A. pseudosieboldianum, which was still in the wild state or in scenic forests, and are rarely used in urban greening even if the maple leaves are red and beautiful in autumn and have high ornamental value.

At present, transcriptome sequencing technology has been used to study vegetables color formation [28], flower color mechanisms [10, 29], fruit development [30, 31]. Some scholars have analyzed the color mechanism of the related species in Acer [32]. However, due to the lack of genomic reference sequences, the molecular mechanism of leaf color is difficult to decipher in A. pseudosieboldianum. The change of anthocyanin content in plants was shown to be related to the differential expression of key genes encoding structural enzymes in the anthocyanin biosynthesis pathway [10]. The different genes including PAL, CHS, ANS, UFGT, FLS, C4H, 4CL, DFR and ANR were identified in the flavonoid biosynthesis pathway from the purple bud tea plant by transcriptome sequencing [33]. In this study, we used transcriptome sequencing technology to sequence and compare three different coloring stages of A. pseudosieboldianum leaves in autumn. We detected four PAL, one CHS, one CHI, two F3H, two F3’H, one F3’5’H, two DFR, one ANS, and six UFGT genes in the flavonoid anthocyanin complex related to leaf color in A. pseudosieboldianum. Three GT genes were down regulated in the M vs. A stage, which indicated that the change of leaves from green to red was controlled by multiple single genes.

Both F3’H and F3’5’H belong to the cytochrome P450 superfamily [34]. F3’H is an important intermediate in the synthesis of cyaniding, and F3’5’H is a key enzyme in the synthesis of blue flower anthocyanin. Masukawa T [35] reported that F3’H could make red cyanidin accumulate in purple and red root radishes. F3’5’H mainly accumulated in the blue waterlily [11]. Many important flower crops can’t produce turquoise, meaning they cannot appear blue. In this study, two F3’H and one F3’5’H were detected, but the expression of F3’5’H was very small, which may be caused leaf colour did not appear blue.

DFR is the key enzyme that catalyzes the conversion of dihydroflavonol to corresponding colorless geranium delphinium and cyanidin [36]. The main function of ANS is to oxidize colorless proanthocyanidins to produce colored anthocyanidins, which are the first colored compound in the anthocyanin synthesis pathway [37]. ANS was originally identified in a maize A2 mutant and cloned by the transposon tagging technique [38]. GT is mainly responsible for transforming unstable anthocyanins into stable anthocyanins. Studies have shown that the expression of UFGT is different in different varieties [39]. For example, anthocyanin accumulation in apple was positively correlated with UFGT activity. The change of UFGT activity in grape leads to the change of their phenotype from white to red [40]. In this study, UFGT (c103768.graph_c0) expression in A. pseudosieboldianum leaves first accumulated and then was consumed in the process of leaf color formation, which was consistent with the conclusion that UFGT consumption was needed for paeoniae anthocyanin synthesis.

At present, the research about anthocyanin biosynthesis structural genes has been gradually improved, and the research on transcription factors has become the focus. The transcription factors may also be one of the important indicators of causing A. pseudosieboldianum to turn green and red. Now MYB transcription factors for anthocyanin biosynthesis have been identified and isolated in many plants. Some studies have shown that MYB transcription factor can enhance or inhibit some aspects of regulation [10]. It was found that PqMYB113 was a transcription factor promoting anthocyanin synthesis in the leaves of peony, while PqMYB4 was a transcription factor inhibiting anthocyanin synthesis in the leaves of peony. In this study, we found that ApMYB4 gene was down-regulated in the stage of green to red transformation, which is consistent with the previous research results, indicating that ApMYB4 gene may be a transcription factor promoting anthocyanin synthesis in A. pseudosieboldianum.

It is worth mentioning that Rosinidin O-hexoside was found in A. pseudosieboldianum leaves, although the content was very small. There was no Rosinidin O-hexoside found in Acer in previous studies. The distribution of Rosinidin O-hexoside in plants is very limited and has only been reported in Catharanthus roseus [41] and Primula [12]. In addition, the cyanidin 3-glucoside contents could be used as a quantitative index to determine the color of Acer palmatum ‘atropurpureum’ [42]. It was also found that the leaf color changing from green to red in A. palmatum was the result of the increase of the mass fraction of cyanidin galactoside and the decrease of the mass fraction of chlorophyll [43]. In this study, the contents of differential metabolites were very high about Cyanidin 3, 5-O-diglucoside. The above research results revealed that Cyanidin was important anthocyanin in Acer, and played a key role of leaf color change in autumn. This study provided the basis for molecular breeding theory for ornamental plant leaf color improvement.

Conclusions

In this study, five anthocyanins were detected in the leaves of A. pseudosieboldianum, especially, Cyanidin 3, 5-O-diglucoside played an important role for the final leaf color. A total of 50,501 unigenes were produced at three stages of leaf color changing among which 16,521 DEGs and 64 unigenes were identified as color-related homologous genes. Four PAL, one CHS, one CHI, two F3H, two F3’H, one F3’5’H, two DFR, one ANS and six GT about anthocyanin synthesis pathway were detected. Combined with the detection of metabolites, the gene pathways related to anthocyanin synthesis were conducted. Also, related regulatory genes include MYB, bHLH, and WD40 were found. This study provides a theoretical basis for the formation of leaf color in A. pseudosieboldianum.

Methods

Plant materials and treatments

The materials tested were A. pseudosieboldianum plants that were five years old from Tianchi Square, Yanji City, Jilin Province (The plant materials were identified by Professor Liu Ji-Sheng from Agriculture college, Yanbian University, engaged in dendrological research for many years). Three leaf samples were collected separately at three different stages (early stage: B; mid- stage: M; last stage: A) with three replicate libraries per stage (Fig. 7). Three stages were September 21, 2018 (B) September 30, 2018 (M) and October 11, 2018 (A). All flesh samples were frozen immediately in liquid nitrogen, and then stored in a refrigerator at -80 ℃ for transcriptome sequencing and anthocyanin metabolite analysis.

Fig. 7
figure 7

The three stages of color change in A. pseudosieboldianum leaf. b: Initial stage, the leaves were all green. m: Mid stage, the leaves were both red and green. a: Last stage, the leaves were all red

Extraction identification and data analysis of anthocyanin metabolites

Leaf tissue samples of A. pseudosieboldianum were ground to a powder, and 100 mg of powder was dissolved in 1.0 ml extract solution (70 % methanol aqueous solution). The dissolved sample was placed in a refrigerator overnight at 4 ℃, and then vortexed three times during the period to improve the extraction rate. After centrifugation, the supernatant was reserved and the sample was filtered with a microporous filter membrane, and then stored in a sample bottle for LC-MS / MS analysis. We used multiple reaction monitoring (MRM) for qualitative and quantitative analysis of metabolites by mass spectrometry. Combining single variable statistical analysis and multivariate statistical analysis, we calculated the fold-change, and called a metabolite as a differential metabolite when its value was between a fold change ≥ 2 and a fold change ≤ 0.5. The differential metabolites were then annotated using the KEGG database [44].

RNA isolation library construction and RNA-Seq

Total RNA was extracted from leaf samples using an RNA extraction kit (Beijing Tiangen in China). Agarose electrophoresis and the Agilent 2100 Bioanalyzer were used to determine the concentration, purity and integrity of RNA samples. Then, PolyA mRNA was reverse transcribed into cDNA, and the construction and sequencing of the cDNA library was completed by the BMK Technology company in Beijing. Raw reads were obtained using an Illumina Hiseq 2500 sequencing platform, and after filtering, clean reads were obtained. Contigs were assembled by overlapping information between sequences, transcripts were locally assembled, and unigenes were obtained by homologous clustering and splicing of transcripts with Tgicl and Phrap software, respectively [45].

De novo assembly and functional annotation

After obtaining high quality sequencing data, it was necessary to assemble the genomic sequence of A. pseudosieboldianum. First, Trinity software parsed the sequencing reads into shorter fragments (K-mers), extends these fragments into longer fragments (Contig), and uses the overlap between these fragments to determine the fragment set (Component). Finally, using the dual methods of De Bruijn mapping and sequencing read information analysis, each transcript sequence was identified in each fragment set. The Unigene sequence was compared with the gene sequence in NR [46], Swiss-Prot [47], GO [48], COG [49], KOG [50], eggNOG4.5 [51], KEGG database by Blast software [52] (e < 0.00001). Using KOBAS 2.0 [53], the KEGG orthology result of unigenes from KEGG was obtained, and after predicting the amino acid sequence of each unigene, we used HMMER [54] software to compare with the Pfam [55] database, select unigenes whose BLAST parameter E-values were not greater than 1e− 5 and whose HMMER parameter E-values were not more than 1e− 10, and thus, finally obtained a unigene with annotation information.

Expression and differentially expressed unigene annotation

Bowtie [56] was used to compare the sequenced reads with a unigene library, and RSEM [57] was used to estimate the expression level. The expression abundance of each corresponding unigene was expressed by its FPKM [58] value. It is a common method for estimating gene expression level in transcriptome sequencing data analysis. The use of FPKM values can eliminate the influence of gene length and sequencing on calculations of gene expression. When detecting differentially expressed genes, DESeq2 was used to analyze the differentially expressed genes between the sample groups and the differentially expressed gene sets between two different conditions were identified. In the process of differential expression analysis, the Benjamini−Hochberg method was used to correct the significance p-value of the original hypothesis test, so as to reduce the false positives in independent statistical hypothesis testing for a large number of gene expression values. In the screening process, the criterion was that the FDR (False Discovery Rate) was less than 0.01 and the difference factor FC (Fold Change) was greater than or equal to 2. Between these two factors, the FC represented the ratio of expression between two samples (groups).

Gene validation and expression analysis

In order to validate our differential gene expression analysis, RT-qPCR was used to validate the differentially expressed genes related to anthocyanin biosynthesis [59]. We used a fluorescence quantitative Kit (2×SYBR ® green premix) and an analytikjena-qTOWER 2.2 fluorescence quantitative PCR instrument for quantitative analysis. The primer sequences can be found in Additional file 7: Table S6. The reaction procedure was as follows: 95 ℃ for 3 min, 95 ℃ for 10 s, 58 ℃ for 30 s, for a total of 39 cycles. Melt curve analysis (60 ℃ ~ 95 ℃ +1 ℃ / cycle, holding time 4 s), and carried out centrifugation on PCR plate centrifuge at 4 ℃ 6000 rpm for 30 s. Finally, we put it in quantitative PCR for amplification, using c110191.graph_c0 as an internal reference gene.

Statistical analysis

The procedure was repeated three times for each sample and the relative expressions were calculated using the 2−∆∆Ct method. Excel and GraphPad Prism 5 were used for chart preparation. The R-3.4.2 was used to conduct the heatmap.