Introduction

Chinese cherry [Cerasus pseudocerasus (Lindl.) G.Don] (syn. Prunus pseudocerasus Lindl.), belonging to the Rosaceae family, is an economically important tetraploid fruiting cherry species [1,2,3]. Cherry cultivation has been rapidly developing in China and has increasingly contributed to poverty alleviation and rural revitalization. Chinese cherry exhibits a wide range of fruit colors, including yellow, vermilion on yellow ground, red, purple red, and black purple [4]. The fruit coloration is attributed to the accumulation of anthocyanins, which are controlled by a distinct group of R2R3-MYB transcription factors [5, 6]. In apple, MdMYB10, MdMYB1, and MdMYBA are the main determinants of fruit color variations among cultivars [7,8,9]. In sweet cherry, three PavMYB10.1 alleles determine the fruit skin color: yellow (PavMYB10.1c), blush (homozygous for PavMYB10.1b) and red (at least one intact PavMYB10.1a) [10]. Allelic variation of MYB10 is the major force controlling natural variations in both skin and flesh color in strawberry fruits [11]. Therefore, R2R3-MYB TFs play key roles in regulating the anthocyanin biosynthesis in Rosaceae fruit crops.

Numerous reports have demonstrated that the R2R3-MYB TFs typically form a well-conserved MYB-bHLH-WD40 (MBW) complex to regulate the anthocyanin biosynthesis [12,13,14]. In Arabidopsis, AtMYB11/12/111 independently participate in transcriptional activation of the early biosynthetic genes (EBGs), while AtMYB75/90/113/114 of subfamily S6 activate the late biosynthetic genes (LBGs) by the formation of MBW complex [15]. These functions are conserved in other species, such as apple MdMYB10 [16, 17], strawberry FvMYB10 [11, 18] and FaMYB5 [19], and sweet cherry PavMYB10 and PacMYBA [10, 20, 21]. In addition to MYB activators, it has been reported that R2R3-MYB TFs belonging to subfamily S4 can repress anthocyanin accumulation. Two types of MYB repressors have been identified, one of which is dependent on its own EAR (ERF-associated amphiphilic repression) inhibitory sequence, while the other is independent [22,23,24]. Furthermore, two ways to inhibit anthocyanin biosynthesis have been proposed: the AtMYB4-like type [25] and the FaMYB1-like type [26]. The AtMYB4-like type acts directly on the promoters of target structural genes [27], such as MdMYB6, MdMYB16 in apple [24, 28] and SlMYB7 in tomato [29]. The FaMYB1-like type functions as a corepressor, which is incorporated into or binds MBW complexes to alter the complex activity and transform from activation to inhibition [27], including grape VvMYBC2, VvMYBC2-L3, and apple MdMYB15L [23, 30, 31]. These findings have highlighted the core roles of transcriptional regulations of R2R3-MYB in the control of anthocyanin biosynthesis.

In Chinese cherry, cyanidin and its glycoside derivatives have been identified as the primary anthocyanins responsible for fruit coloration [32,33,34]. Compared with yellow fruits, (dark)-red fruits accumulate significantly higher levels of cyanidin-3-rutinoside but lower levels of flavanol and proanthocyanidin [33]. Structural genes in cyanidin biosynthesis, including EBGs (CpF3H, CpF3’H) and LBGs (CpDFR, CpANS, and CpUFGT), were up-regulated, contributing to the formation of dark-red fruits [33]. In contrast, higher expression of CpLAR was observed in yellow fruits. In addition, eight regulatory genes, including MYB TFs, have been identified as candidate determinants of fruit color in Chinese cherry [33]. However, a genome-wide characterization of the MYB gene family and its regulatory roles in anthocyanin biosynthesis in Chinese cherry has not been previously reported.

To gain further insights into the regulatory network underlying fruit color variation in Chinese cherry, we conducted a genome-wide characterization of the R2R3-MYB gene family in Chinese cherry and related species in the Rosaceae family. Based on transcriptomic profiling and functional verification, we identified the key genes involved in anthocyanin biosynthesis in Chinese cherry fruits. Our objectives were (i) to characterize the R2R3-MYB gene family of Chinese cherry through genome-wide identification and compare it with that of 10 Rosaceae relatives and Arabidopsis thaliana; (ii) to identify key MYB TFs related to anthocyanin biosynthesis in Chinese cherry; and (iii) to preliminarily verify the functions of CpMYB10 and CpMYB4. This study provides a starting point for further analysis of MYB functions in Chinese cherry and establishes a solid foundation for utilizing candidate genes in breeding programs aimed at improving anthocyanin accumulation.

Materials and methods

Plant materials

Three Chinese cherry accessions, namely ‘PZB’, ‘HF’ and ‘HP600’, representing yellow, red and black-purple fruit colors respectively, were used in this study. They were cultivated under field conditions at the Cherry Germplasm Repository of Sichuan Province in Sichuan Agricultural University (Chengdu), China. The fruits from the three accessions were harvested at the full maturity stage. Tissue samples including roots, stems, leaves, flower buds (red), and open flowers (white) were collected from the ‘HF’ accession. All samples were immediately frozen in liquid nitrogen and stored at −80 °C for subsequent analysis, with three biological replicates per sample.

Determination of fruit color and total anthocyanin content

Fruit color parameters (lightness L*, redness a*, and yellowness b*) were measured using a KONICA MINOLTA CM-2600d spectrophotometer (Japan), and the color ratio (a*/b*) was calculated [35]. Ten cherries for each replicate were used, with three biological replicates per sample point.

Total anthocyanin content was determined using the pH differential method described by Lee et al. [36]. Approximately 0.5 g of fruit tissue was extracted with 5 mL of extraction solution (acetone: methanol: water: acetic acid = 2:2:1:0.5) and heated in a water bath at 40 °C. The mixture was centrifuged at 8,000 ×g for 25 min, and the supernatant was used for analysis. Two buffer systems were used: 0.4 M potassium chloride (pH 1.0) and 0.4 M dibasic sodium (pH 4.5). The total anthocyanin content was calculated using the equation A = (A510 − A700)pH1.0 − (A510 − A700)pH4.5 and converted into mg cyanidin 3-glucoside per 1,000 g fresh weight (mg·kg−1 FW). Three independent biological replicates per sample point were analyzed.
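The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the molecular weight (449.2 g/mol) and molar extinction coefficient (26,900 L·mol−1·cm−1) for cyanidin 3-glucoside are the values commonly used with the pH differential method and are assumed here, and the dilution factor and absorbance readings in the usage note are hypothetical.

```python
# Constants for cyanidin 3-glucoside commonly used with the pH differential
# method. They are ASSUMPTIONS: the text does not state which values were used.
MW_C3G = 449.2        # molecular weight, g/mol
EPSILON_C3G = 26900   # molar extinction coefficient, L/(mol*cm)


def differential_absorbance(a510_ph1, a700_ph1, a510_ph45, a700_ph45):
    """A = (A510 - A700) at pH 1.0 minus (A510 - A700) at pH 4.5."""
    return (a510_ph1 - a700_ph1) - (a510_ph45 - a700_ph45)


def anthocyanin_mg_per_kg(a, dilution_factor, extract_volume_ml,
                          sample_mass_g, path_cm=1.0):
    """Convert A into mg cyanidin 3-glucoside equivalents per kg fresh weight."""
    # Concentration in the extract, mg/L
    mg_per_l = a * MW_C3G * dilution_factor * 1000 / (EPSILON_C3G * path_cm)
    # Total pigment recovered in the extract, mg
    mg_in_extract = mg_per_l * extract_volume_ml / 1000
    # Normalize to the fresh weight of the sample, expressed in kg
    return mg_in_extract / (sample_mass_g / 1000)
```

For example, with hypothetical readings, `differential_absorbance(0.80, 0.05, 0.20, 0.05)` gives A of about 0.6, and `anthocyanin_mg_per_kg(0.6, dilution_factor=5, extract_volume_ml=5, sample_mass_g=0.5)` converts it using the 5 mL extract and 0.5 g sample described above.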

Genome-wide identification of R2R3-MYB gene family

A total of 12 genomes from Chinese cherry (unpublished) and related species within Rosaceae family, and Arabidopsis thaliana were selected for analysis (Table 1).

Table 1 The information of reference genomes for Chinese cherry and other eleven species

Putative MYB proteins were identified using a Hidden Markov Model (HMM) profile of the MYB DNA binding domain (PF00249) (http://pfam.sanger.ac.uk/) and conserved domains were confirmed using SMART (http://smart.embl-heidelberg.de/) and Pfam (http://pfam.janelia.org/). The ExPASy proteomics tool (http://web.expasy.org) was utilized to predict the protein characteristics, including the number of amino acids, molecular weight, theoretical isoelectric point (pI), instability index (II), aliphatic index (AI), and grand average of hydropathicity (GRAVY). Then a neighbor-joining (NJ) phylogenetic tree was constructed by aligning the full-length MYB amino acid sequences using MAFFT and MEGAX 11, with 1,000 replicate bootstraps. The resulting tree was visualized using the Chiplot online tool (https://www.chiplot.online/). The classification and biological functions of the R2R3-MYBs were determined based on their phylogenetic relationships with the corresponding AtMYB proteins.

The chromosomal distribution of CpMYB gene family in the Chinese cherry genome was depicted using TBtools [48]. Gene duplication events were analyzed using the MCScanX (http://chibba.pgml.uga.edu/mcscan2). Gene pair collinearity analysis of CpMYBs was illustrated using the Circos software (http://circos.ca/software/download/). Non-synonymous (Ka) and synonymous (Ks) substitutions of gene pairs were calculated with TBtools [48, 49]. Conserved domains and motifs were predicted using MEME Suite v5.4.1 (https://meme-suite.org/meme/tools/meme), and an intron-exon structure diagram in Chinese cherry was generated using TBtools. The subcellular localization was predicted using the Plant-mPLoc website (http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/). Cis-elements in the promoter region in the 2000 bp upstream sequences were obtained from the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/).

Transcriptome analysis and RT-qPCR validation

To examine CpMYB gene expression during fruit development (S1-S5) in yellow and red fruits, transcriptomic data [33] were retrieved from the CNGB database (https://db.cngb.org/cnsa/experiment/page/batch/sub037808/view/) under accession number CNP0003682. A heat map of gene expression was generated using TBtools software. Weighted gene co-expression network analysis (WGCNA) was further performed using the WGCNA R package (v1.4.17) to identify core gene modules and hub genes related to anthocyanin biosynthesis. All differentially expressed genes were imported into the R package and used to construct gene co-expression modules via the automatic module building function. An adjacency matrix was then created to illustrate the correlation strength between the modules and color-related characteristics, including a*, b*, L*, a*/b* ratio, and anthocyanin content. Genes associated with trait-related modules were extracted for further analysis.

RT-qPCR-based expression analysis was carried out using TransStart® Green qPCR SuperMix (TransGen Biotech Co., Ltd, Beijing, China) on a CFX96 Touch™ Real-Time PCR system (Bio-Rad, Hercules, CA, USA). Total RNA was extracted using the AFTSpin Universal Plant Fast RNA Extraction Kit (ABclonal, Wuhan, China). The cDNA was synthesized from RNA using the PrimeScript™ RT-PCR Kit (RR047A; TaKaRa Bio, Kusatsu, Japan). The primers were designed using Primer Premier 5 software (Table S1). The RT-qPCR reaction procedure was as follows: 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s, 60 °C for 30 s, and 72 °C for 30 s. The relative expression was calculated using the 2^−ΔΔCt method, with β-Actin as the internal reference.
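As an illustration of the relative quantification step, the following minimal Python sketch computes the fold change from Ct values. The function name and the Ct values in the usage note are hypothetical, and the sketch assumes equal amplification efficiency for the target and reference genes, as the method itself does.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene relative to a control sample.

    Each target Ct is first normalized to the reference gene
    (beta-Actin in this study), then the sample is compared with the control.
    """
    delta_sample = ct_target_sample - ct_ref_sample      # dCt of the sample
    delta_control = ct_target_control - ct_ref_control   # dCt of the control
    delta_delta = delta_sample - delta_control           # ddCt
    return 2 ** (-delta_delta)
```

With hypothetical Ct values, `relative_expression(22.0, 18.0, 25.0, 18.0)` returns 8.0: the target crosses the threshold three cycles earlier in the sample than in the control after normalization, which corresponds to a 2^3-fold higher expression.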

Subcellular localization

For the subcellular localization vector, the empty vector was linearized by double digestion with XbaI and BamHI. The CDS of CpMYB10/4 were amplified and inserted into pYTSL-16 (modified from pMDC83-35S and pSITE-2NB) using the ClonExpress II One Step Cloning Kit (Vazyme, Nanjing, China). We created a plasmid that encoded a fusion protein of CpMYB10/4 and green fluorescent protein (GFP), driven by the 35S promoter. Subsequently, the plasmid was transformed into Agrobacterium tumefaciens strain GV2301. The empty-GFP vector was used as a negative control. The constructs were transiently expressed in tobacco leaves for subcellular localization analysis. The primers for the eGFP fusion vector are listed in Table S1.

Transcriptional activation assay

The full-length sequence of CpMYB10/4 was cloned and inserted into pGBKT7, with the specific primer sequences listed in Table S1. The positive control (pGBKT7-p53), negative control (pGBKT7-lam), and recombinant vector (pGBKT7-CpMYB10/4) were transformed into Saccharomyces cerevisiae strain Y2H-Gold. The transformed cells were incubated on SD-/Trp and SD-/Trp/Ade/His media supplemented with 40 µg/µL X-α-Gal at 30 °C for 3 ~ 7 days. Transactivation activity was assessed by observing the growth of the transformed cells.

Transient overexpression and virus-induced gene silencing

The full-length CDSs of both CpMYB10 and CpMYB4 were cloned into pCAMBIA2301 and pTRV2 vectors using Takara PrimeSTAR Max DNA Polymerase (Takara, Beijing, China) and specific primers (Table S1). The pCAMBIA2301 fusions CpMYB10 and CpMYB4 were introduced into yellow fruits via Agrobacterium-mediated genetic transformation [50]. The recombinant plasmids pTRV-CpMYB10 and -CpMYB4 were transformed into Agrobacterium tumefaciens strain GV2301 through electroporation. TRV-mediated gene silencing of CpMYB4 and CpMYB10 was performed as previously described with slight modifications [51] for yellow and red fruits, respectively. The fruits selected for infiltration were at the pre-color conversion stage (approximately DAF33 ~ 36). The fruits were infiltrated into the skin surface using needleless syringes. Seven days after infiltration, fruits were harvested to measure color-related characteristics. Both overexpression and silencing assays were performed with 40 transformed fruits per strain from the same tree each time, with three biological replicates using three 8-year-old trees (Table S2).

Data analysis

The experimental data were recorded as mean values ± standard deviation (SD). The statistical differences were analyzed using Prism 9 software. The differences between groups were determined using Student’s t-test. The results with a p value below 0.05 were considered statistically significant.
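The comparison described above was run in Prism 9; the sketch below only illustrates the underlying two-sample Student's t statistic in plain Python. It is an assumption-laden simplification: it uses the pooled-variance (equal variance) form, and instead of an exact p value it compares |t| with 2.776, the two-tailed critical value at the 0.05 level for 4 degrees of freedom (three replicates per group). The input values in the usage note are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t value at alpha = 0.05 for df = 4 (3 + 3 replicates).
T_CRIT_DF4 = 2.776


def student_t(group_a, group_b):
    """Pooled-variance Student's t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


def significant_three_replicates(group_a, group_b):
    """True if two groups of three replicates differ at the 0.05 level."""
    t, df = student_t(group_a, group_b)
    if df != 4:
        raise ValueError("critical value is only valid for 3 replicates per group")
    return abs(t) > T_CRIT_DF4
```

For instance, `student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])` yields t of about −3.67 with 4 degrees of freedom, which exceeds the critical value in absolute terms.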

Results

Comprehensive genome-wide identification of R2R3-MYB gene family

R2R3-MYB gene family in Chinese cherry, 10 Rosaceae relatives and Arabidopsis thaliana

A total of 191 CpMYB genes were identified in Chinese cherry, with the R2R3 type (99 genes) comprising the largest proportion at 51.83% of the total number (Table S3). Additionally, 85 1R-CpMYB genes, 4 3R-CpMYB genes, and 3 4R-CpMYB genes were also obtained. Similar strategies were used to identify R2R3-MYB genes among 10 Rosaceae relatives: 108 in C. avium, 134 in C. yedoensis, 111 in C. serrulata, 118 in P. salicina, 113 in A. vulgaris, 110 in A. persica, 193 in P. pyrifolia, 177 in M. domestica, 103 in F. vesca, and 98 in R. occidentalis (Table S4). The results revealed that the number of R2R3-MYB genes was much greater in tribe Maleae than in the other three tribes and Arabidopsis.
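The counts reported above can be cross-checked with a short Python tally. This is only an arithmetic sketch using the numbers given in the text; the variable names are illustrative, and the grouping of P. pyrifolia and M. domestica as tribe Maleae is the only classification used.

```python
# R2R3-MYB gene counts per Rosaceae species, as reported in the text.
r2r3_counts = {
    "C. pseudocerasus": 99,
    "C. avium": 108,
    "C. yedoensis": 134,
    "C. serrulata": 111,
    "P. salicina": 118,
    "A. vulgaris": 113,
    "A. persica": 110,
    "P. pyrifolia": 193,
    "M. domestica": 177,
    "F. vesca": 103,
    "R. occidentalis": 98,
}

# MYB types identified in Chinese cherry, as reported in the text.
cp_myb_types = {"1R": 85, "R2R3": 99, "3R": 4, "4R": 3}

# Pear and apple belong to tribe Maleae.
maleae = {"P. pyrifolia", "M. domestica"}

total_cp_myb = sum(cp_myb_types.values())                      # 191
r2r3_share = 100 * cp_myb_types["R2R3"] / total_cp_myb         # 51.83 (%)
total_r2r3 = sum(r2r3_counts.values())                         # 1364
min_maleae = min(n for s, n in r2r3_counts.items() if s in maleae)
max_other = max(n for s, n in r2r3_counts.items() if s not in maleae)

print(f"CpMYB total: {total_cp_myb}, R2R3 share: {r2r3_share:.2f}%")
print(f"R2R3-MYBs across 11 Rosaceae species: {total_r2r3}")
print(f"Smallest Maleae count {min_maleae} vs largest non-Maleae count {max_other}")
```

The tally reproduces the 191 CpMYB genes and the 51.83% R2R3 share, and shows that even the smaller Maleae count (177) exceeds the largest count outside the tribe (134).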

These R2R3-MYB proteins were classified into 43 subfamilies (Fig. 1A) based on the NJ tree topology and the subfamily classification in Arabidopsis. They included 42 major subfamilies found in Arabidopsis and an additional subfamily (S45) belonging to the Rosaceae family [6, 52]. Different subfamilies contained varying numbers of genes, ranging from 12 for S10 to 97 for S14 (with only 6, 1, and 1 for S12, S75, and S77, respectively). The sizes of these subfamilies also varied greatly within and between tribes (Fig. 1B). For instance, tribe Maleae had more members than the other tribes, particularly in subfamilies S21, S15, S20, S8, S31, S36, and S1, while a similar number of members was observed for subfamilies S6, S44, S7, and S4. In the majority of subfamilies, such as S21, S6, S4, S30, and S74, Chinese cherry had fewer members than the other three Cerasus species. It is worth noting that subfamilies S19, S13, and S10 exist in all species except Chinese cherry, suggesting that these CpMYB genes may have evolved or been lost during the divergence process. A similar phenomenon is also observed in other species, such as S28 being absent in A. vulgaris and A. persica, S15 and S31 absent in C. serrulata and R. occidentalis, and S74 absent in F. vesca. In addition, subfamily S45 is represented in all Rosaceae species but not in Arabidopsis. The subfamilies S12, S75, and S77 are absent in Chinese cherry and all other Rosaceae species. These results possibly imply species-specific differences for some MYB subfamilies in Chinese cherry and its Rosaceae relatives.

Fig. 1

R2R3-MYB gene family in Chinese cherry and comparative analysis. (A) Classification of R2R3-MYB subfamilies based on phylogenetic relationships. (B) Functional annotation and gene number of R2R3-MYB subfamilies. Cp, Cerasus pseudocerasus; Ca, C. avium; Cy, C. yedoensis; Cs, C. serrulata; Ps, Prunus salicina; Av, Armeniaca vulgaris; Ap, Amygdalus persica; Pp, Pyrus pyrifolia; Md, Malus domestica; Fv, Fragaria vesca; Ro, Rubus occidentalis; At, Arabidopsis thaliana. The 126 R2R3 AtMYBs (excluding the CDC5 like proteins: S29) are clustered into 42 major subfamilies (S1-S28, S30-S33, S36-S38, S44, S46, and S74-S78) [6, 52]. Subfamily S45 is present in Rosaceae but absent in Arabidopsis [6], while S12, S75, and S77 are absent in Rosaceae. The subfamilies S6 and S4, highlighted in red font, are associated with anthocyanin biosynthesis

Overall, the functions of R2R3-MYB gene family in Chinese cherry and its relatives can be categorized into three major processes: development and cell differentiation, specialized metabolism (particularly the phenylpropanoid biosynthesis pathway), and stress responses (biotic and abiotic stresses) (Fig. 1B). Previous studies suggested that the R2R3-MYB genes within the same subfamily may have similar functions [53]. For example, the subfamilies S45 and S5 genes are involved in proanthocyanidin biosynthesis, while S7 members are involved in flavonoid biosynthetic pathway. Equivalent or similar biological functions can be controlled by different subfamilies of MYB TFs. Members from eight subfamilies (S21, S3, S8, S13, S32, S30, S31, and S10) regulate secondary cell wall biosynthesis. Additionally, genes in subfamilies S6 and S4 participate in the regulation of anthocyanin biosynthesis (Fig. 1B).

Characterization of R2R3-MYBs in Chinese cherry

A total of 99 R2R3-CpMYB genes were identified in Chinese cherry, and their detailed information is summarized in Table S5. These CpMYB proteins varied considerably in their characteristics, with protein length ranging from 106 to 1206 amino acids, molecular weight ranging from 12.83 to 134.89 kDa, and theoretical isoelectric point ranging from 4.83 to 10.74. These proteins were predicted to localize in the nucleus (Table S5). Chromosomal localization analysis showed that the CpMYB genes were unevenly distributed across all eight chromosomes (Fig. 2A). Chromosome 1 had the greatest number of CpMYB genes (20 in total), followed by chr3 (19 genes) and chr6 (16 genes). The fewest CpMYB genes were on chr8, with only four genes. Collinearity analysis identified 31 duplication pairs, consisting of 26 segmental duplication pairs and five tandem duplication pairs (Fig. 2B). The Ks value ranged from 0.0004 to 4.3882 (Table S6), showing a wider range than that of the sweet cherry MYB family [54]. Eight duplicated genes showed relatively high Ks values (1.8032 ~ 4.3882), indicating that more ancient duplication events occurred in Chinese cherry than in sweet cherry (Ks: 0.5875 ~ 1.6836, 32.28 ~ 92.50 Mya) [54] and apple (Ks: 1.5 ~ 1.8, ~ 140 Mya) [55]. The Ka/Ks ratios of 26 duplication gene pairs were all less than 0.8, indicating that these R2R3-CpMYB genes have experienced purifying selection during evolution following duplication events.
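The sweet cherry divergence times quoted above are consistent with the standard relation T = Ks / (2λ). The sketch below is a minimal illustration under one loud assumption: the substitution rate λ = 9.1 × 10−9 per site per year is not stated in the text and is inferred here from the sweet cherry figures (Ks 0.5875 ~ 1.6836, 32.28 ~ 92.50 Mya); it may not be the rate used by the authors or in reference [54]. The apple figures quoted above evidently rest on a different rate, so the same constant should not be applied across lineages without checking.

```python
# ASSUMED synonymous substitution rate (per site per year). Inferred from the
# sweet cherry Ks/Mya pairs quoted in the text, not given by the source.
LAMBDA = 9.1e-9


def ks_to_mya(ks, rate=LAMBDA):
    """Estimate divergence time in million years ago from a Ks value.

    Uses T = Ks / (2 * rate): both duplicates accumulate substitutions,
    hence the factor of 2.
    """
    return ks / (2 * rate) / 1e6


def selection_mode(ka, ks):
    """Classify selection on a duplicated pair from its Ka/Ks ratio."""
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"
```

Applying `ks_to_mya` to the sweet cherry bounds 0.5875 and 1.6836 recovers roughly 32.3 and 92.5 Mya. Very high Ks values are prone to saturation, so time estimates derived from them should be treated with caution.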

The amino acid sequences of the R2 and R3 MYB repeats were extracted from R2R3-CpMYB proteins (Fig. S1). In the R2 domain, three Trp residues were spaced by 19 amino acids. In the R3 domain, the Trp residues were separated by 18 amino acids, with the first being replaced by some hydrophobic amino acids. A total of 20 diverse motifs were identified in CpMYB proteins, with each motif ranging in size from 8 to 50 amino acids (Table S7). Generally, genes within the same subfamily exhibited conserved motif patterns (Fig. S2A-B). Across all members, the motifs at the N-terminus were relatively conserved, including motifs 1 ~ 5 and motif 7, which encode the conserved MYB DNA-binding domain. Most subfamilies had exclusive motifs, but some motifs were shared across subfamilies. For example, motifs 16, 13, and 20 were exclusive to subfamilies S6, S22, and S27, respectively, while motif 8 was found in most members of 12 subfamilies. The differences in the number and type of conserved motifs may reflect differences in evolutionary rate. Most genes within the same subfamily displayed common exon/intron organizations (Fig. S2C). The exon numbers ranged from 1 to 15 in Chinese cherry, with 61.62% of genes containing three exons. Genes in subfamily S6 contained 2 introns and 3 exons, while members of subfamily S4 included 1 ~ 2 introns and 2 ~ 4 exons. Additionally, 6 members in subfamily S22 were intronless, and 52 CpMYB genes had no untranslated region (UTR). The presence of identical motifs and exon/intron structures among genes suggested that they may have similar functions.

Fig. 2

Characterization of R2R3-CpMYB genes in Chinese cherry. (A) Chromosomal localization of CpMYBs. Red font represents the genes related to anthocyanin metabolism. Genes connected by right parentheses represent the tandem duplication pairs. (B) Segmental duplication and synteny analysis of CpMYBs. Duplicated gene pairs placed on the different chromosomes linked with colorful lines. Pp_sc: Ppseudocerasus_HiC_scaffold. The same as below

Various cis-regulatory elements were predicted in the promoter sequences, and a total of 56 types of cis-elements were identified (Table S8). These elements were divided into four groups: light responsive (29), plant growth (9), phytohormone responsive (13), and stress responsive (5) (Fig. S3). Light-responsive cis-acting elements were prevalent in the promoter regions, with Box 4 (24.8%) and G-box (23.7%) being the most common. Among the elements related to plant growth, the O2-site accounted for the highest proportion (34.3%), followed by the CAT-box (24.5%). Phytohormone response-related cis-elements such as ABRE (27.8%), ARE (21.0%), CGTCA (13.3%), and TGACG (13.3%) were also observed, which are linked with ABA, anaerobic induction, and MeJA responses. Furthermore, certain elements were widely distributed in the promoter regions of most CpMYBs, with Box 4 being the most frequently distributed, followed by G-box and ABRE. These findings suggested that members of the CpMYB gene family can be induced by light and ABA signals.

Key genes associated with anthocyanin biosynthesis in Chinese cherry fruits by transcriptome profiling

Expression patterns of R2R3-CpMYB genes

An expression heat map of R2R3-CpMYB genes was generated based on the FPKM values at different developmental stages of yellow (‘PZB’) and red (‘HF’) fruits (Fig. 3A, Table S9). The genes exhibited significantly different expression levels across these developmental stages. In most cases, genes within the same subfamily displayed similar expression patterns. For example, genes in subfamilies S23, S21, S37, S45, S32, S36, and S9 were predominantly expressed at the green fruit stage. On the other hand, genes in subfamilies S6 and S3 showed relatively high expression levels at the fruit coloration stage. Duplicated genes with high sequence similarity, such as Ppseudocerasus_HiC_scaffold_1_4299 and 1_4300, exhibited different expression profiles. Notably, in subfamily S6, which promotes anthocyanin biosynthesis, the expression of Ppseudocerasus_HiC_scaffold_3_1741 increased with fruit development and was higher in red than in yellow fruits. In subfamily S4, which inhibits anthocyanin biosynthesis, Ppseudocerasus_HiC_scaffold_1_3492 was strongly expressed in mature yellow fruits. Its expression gradually increased with fruit development in yellow fruits, suggesting that the repressor was active during anthocyanin biosynthesis, providing feedback regulation.

Validation of candidate R2R3-CpMYB genes

To further confirm the genes involved in anthocyanin accumulation in Chinese cherry fruits, we conducted WGCNA based on the normalized expression data for 46,058 genes from all 30 samples. After filtering, 11,515 genes were retained and classified into 22 distinct gene modules (Fig. 3B, Fig. S4). The lightgreen module, consisting of 103 genes, and the greenyellow module, consisting of 226 genes, showed a significant positive correlation with a* (r2 = 0.68, 0.90), anthocyanin content (r2 = 0.96, 0.84), and a*/b* ratio (r2 = 0.97, 0.80). The KEGG pathway enrichment analysis revealed that the genes within the lightgreen module were significantly enriched in “flavonoid biosynthetic process”, “regulation of anthocyanin biosynthetic process”, and “anthocyanin-containing compound biosynthetic process”. Furthermore, Ppseudocerasus_HiC_scaffold_3_1741 was specifically enriched in the “regulation of anthocyanin biosynthetic process” pathway (Fig. 3C). These findings suggested that Ppseudocerasus_HiC_scaffold_3_1741 may be involved in the biosynthesis of anthocyanins.

Ppseudocerasus_HiC_scaffold_3_1741/6_1865 (S6) and Ppseudocerasus_HiC_scaffold_1_3492/3_1684/8_2974 (S4), with normalized FPKM values greater than 1, were selected for RT-qPCR analysis (Fig. 3D). The results showed that the expression level of Ppseudocerasus_HiC_scaffold_3_1741 in (dark)-red fruits was significantly higher (5 ~ 6-fold) than that in yellow fruits, consistent with the anthocyanin content in fruits (Fig. 3E). The expression of Ppseudocerasus_HiC_scaffold_3_1741 was higher in red flower buds than in white open flowers and the other three tissues, suggesting that Ppseudocerasus_HiC_scaffold_3_1741 may also regulate the coloration of flower buds. In contrast, the expression level of Ppseudocerasus_HiC_scaffold_1_3492 showed the opposite trend to anthocyanin accumulation in fruits with different colors. It was expressed at low levels in the roots, stems, leaves, and flowers. Additionally, Ppseudocerasus_HiC_scaffold_6_1865/3_1684/8_2974 exhibited much lower expression in fruits compared to other tissues. Therefore, Ppseudocerasus_HiC_scaffold_3_1741 and 1_3492 are likely to play important roles in regulating anthocyanin biosynthesis in Chinese cherry fruits and are ideal candidate genes for further functional analysis. Based on phylogenetic analysis, the former gene exhibits a high degree of sequence similarity (65.20% ~ 82.00%) with Rosaceae MYB10s (Fig. S5A). The latter belongs to the FaMYB1-like type, exhibiting the highest sequence similarity (98.40%) with peach PpMYB18 and sweet cherry PavMYB4 (Fig. S5B). Therefore, they were renamed CpMYB10 and CpMYB4, respectively.

Fig. 3

Transcriptome profiling and relative expression of candidate CpMYBs associated with anthocyanin biosynthesis in Chinese cherry fruits. (A) Expression heatmap of CpMYBs during fruit development in yellow and red fruits. The color scale represents log2-transformed FPKM values from 0 (blue) to 1 (red). (B) WGCNA based on the gene expression level and phenotypic data. The color scale on the right shows module-trait correlations from −1 (blue) to 1 (red). (C) KEGG enrichment analysis of lightgreen module genes. (D) RT-qPCR validation of candidate genes related to anthocyanin biosynthesis in fruits with different colors and different tissues. (E) Anthocyanin content in mature fruits with different colors. The data represent the means ± SD from three independent replicates. Lowercase letters indicate significant differences at the 0.05 level

Function verification of CpMYB10 and CpMYB4 in Chinese cherry fruits

Subcellular localization and transcriptional activation activity

We cloned the coding sequences of CpMYB10 and CpMYB4 to create plasmid constructs encoding fusion proteins. The empty vector harboring GFP and mCherry was used as a positive control, which showed a diffuse distribution of green and red fluorescence signals throughout the entire cells and nucleus, respectively. The GFP fluorescence signals of CpMYB10::GFP fusion proteins were predominantly localized in the nucleus, whereas the signals of CpMYB4::GFP fusion proteins were predominantly localized in the nucleus and cytoplasm (Fig. 4). The results demonstrated that CpMYB10 is a nuclear-localized protein, while CpMYB4 localizes to both the cytoplasm and the nucleus.

To investigate the transcriptional activity of CpMYB10 and CpMYB4, a transactivation assay was performed in the yeast strain Y2HGold. We found that the yeast cells carrying the positive control and pGBKT7-CpMYB10 survived on SD-Trp/Ade/His/+X-α-Gal media, while the negative control and pGBKT7-CpMYB4 did not survive (Fig. 5). These results indicated that CpMYB10 had transcriptional activation activity in yeast cells, while CpMYB4 was not transcriptionally active.

Fig. 4

Subcellular localization of the CpMYB10 and CpMYB4 proteins. The fusion constructs (35S:CpMYB10-GFP, 35S:CpMYB4-GFP) or an empty vector (35S:GFP) were co-transformed with a nuclear marker gene, VirD2NLS fused to mCherry, in the epidermal cells of Nicotiana benthamiana leaves. GFP: green fluorescence; mCherry: red fluorescence; Bright Field: white light; Merged: combined GFP and mCherry signals. Bar = 50 μm

Fig. 5

Transactivation analysis of CpMYB10 and CpMYB4 using yeast assay. pGBKT7-CpMYB10/4: vector pGBKT7 containing CpMYB10/4. pGBKT7-p53, positive control; pGBKT7-lam, negative control

Transient expression of CpMYB10 and CpMYB4

The OE-CpMYB10 yellow fruits accumulated a significantly higher anthocyanin content, with a 3.81-fold elevation in the skin compared to the control (Fig. 6A-1, A-2). In contrast, the fruit flesh was not highly pigmented (Fig. S6), suggesting a differential regulation of anthocyanin biosynthesis between skin and flesh in Chinese cherry. Intriguingly, co-infiltration of CpMYB10 and CpMYB4 resulted in the disappearance of red pigmentation in yellow fruits (Fig. 6A-1, A-2). RT-qPCR analysis revealed a significant upregulation (6.7 ~ 8.0-fold) of key structural genes, particularly LBGs (CpDFR, CpANS, and CpUFGT), in OE-CpMYB10 fruits. Similarly, CpGSTF12, which regulates anthocyanin transport, also showed upregulated expression (Fig. 6A-3). However, there were no noticeable differences in the expression levels of fruits co-infiltrated with CpMYB10 and CpMYB4 compared to the control (Fig. 6A-3). Conversely, silencing of CpMYB4 in yellow fruits generated a slight blush in the skin (Fig. 6B-1) and a moderate accumulation of anthocyanin (Fig. 6B-2), but the content did not reach the level observed in fully matured yellow fruits (Table S10). Expression analysis revealed a dramatic upregulation of both EBGs and LBGs, especially CpUFGT (51.7-fold) and CpGSTF12 (21.3-fold), upon silencing CpMYB4 (Fig. 6B-3). As expected, silencing of CpMYB10 in red fruits led to a lighter skin color, a significantly lower anthocyanin level (a 6.38-fold decrease), and downregulation of structural genes (Fig. 6C-1, C-2 and C-3). Collectively, these results support the hypothesis that CpMYB10 acts as an activator, while CpMYB4 functions as a repressor in the regulation of anthocyanin biosynthesis in Chinese cherry fruits.

Fig. 6

Function verification of CpMYB10 and CpMYB4 in Chinese cherry fruits with different colors. (A), (B) and (C) Transient overexpression of CpMYB10 and co-infiltration of CpMYB10 and CpMYB4 in yellow fruits, VIGS of CpMYB4 in yellow fruits, and VIGS of CpMYB10 in red fruits, respectively. (1) Fruit phenotypes after infiltration. The arrows indicate the infiltration sites. (2) Anthocyanin content of infiltration sites. (3) Expression levels of CpMYB10 and CpMYB4, anthocyanin biosynthetic pathway genes and transporter gene in infiltration sites. The data represent the means ± SD obtained from RT-qPCR. Lowercase letters indicate the significant difference at 0.05 level

Discussion

Species-specific expansion of R2R3-MYB gene family in Chinese cherry and its Rosaceae relatives

The MYB transcription factor family is the largest family in plants and plays a crucial role in regulating primary and secondary metabolism, growth and development, as well as hormone and stress responses [6]. Since its identification in Arabidopsis [47], whole-genome characterization of MYB TFs has been conducted in various plants, including strawberry [56], apple [57, 58], and sweet cherry [54]. In this study, we identified 1364 R2R3-MYB genes from Chinese cherry and 10 relatives in the Rosaceae family, with the gene counts ranging from 99 to 134 in tribe Amygdaleae and from 177 to 193 in tribe Maleae, with 103 in tribe Potentilleae and 98 in tribe Rubeae. The size of the R2R3-MYB gene family varied considerably among these species, likely due to frequent gene duplications and losses. Notably, tribe Maleae species had nearly twice as many genes as the other tribes, suggesting that a whole-genome duplication event in tribe Maleae might have led to the expansion of the R2R3-MYB gene family. This is also supported by the suggestion that R2R3-MYB genes have undergone rapid expansion during plant evolution via whole-genome duplication (WGD) and small-scale duplication [6]. Additionally, the gene numbers within the same subfamily also varied greatly among the four cherry species, even though they are traditionally classified into the genus Cerasus. This also implies species-specific expansion of the R2R3-MYB gene family within the same genus.

In particular, the evolution of R2R3-MYBs in plants involves specific expansions that give rise to species- or lineage-specific subfamilies [59]. In this study, subfamily S45 contained exclusively R2R3-MYB genes from Chinese cherry and all of its relatives, suggesting a role specific to the Rosaceae family. Additionally, the absence of certain Rosaceae species from subfamilies S28, S15, S19, S13, S74, and S10 may point to specialized functions for these subfamilies. For instance, subfamilies S19, S13, and S10 were absent only in Chinese cherry, suggesting that they were lost during the evolution of Chinese cherry. S74 contained genes from all Rosaceae species except strawberry, suggesting that these genes might play an important role only in woody plants. Furthermore, subfamily S12, found in Arabidopsis, was absent in Rosaceae, similar to the observation in Solanaceae [60]. This further supports the notion that S12, involved in glucosinolate biosynthesis, is specific to the Brassicaceae family [61, 62]. Therefore, these results suggest an obvious expansion and a functional trend toward specialized metabolism in the R2R3-MYB gene family during the evolution of Chinese cherry and its Rosaceae relatives.

CpMYB10 and CpMYB4 are involved in anthocyanin biosynthesis in Chinese cherry

In this study, we identified 99 R2R3-type CpMYB genes in the genome of Chinese cherry. Based on the phylogenetic relationships and RNA-seq results, we selected five TFs from subfamilies S6 and S4 that may be involved in the anthocyanin biosynthesis pathway. The expression level of CpMYB10 increased rapidly during the fruit coloration stage and was higher in red fruits than in yellow fruits. CpMYB10 was also highly expressed in red flower buds. Conversely, the expression of CpMYB4 increased at the coloration stage in yellow fruits, while it decreased in red fruits. Both genes exhibited low expression levels in roots, stems, and leaves, indicating that their expression depends on tissue type and ripening stage, which is similar to the findings in sweet cherry [10, 54]. Transient overexpression of CpMYB10 resulted in a 3.81-fold increase in anthocyanin level in yellow fruits (Fig. 6A-2), while silencing its expression led to a significant 6.38-fold decrease in anthocyanin content in red fruits (Fig. 6C-2). In addition, CpMYB10 shares 81.50% sequence similarity with sweet cherry PavMYB10. Together, these results underscore its pivotal role in regulating anthocyanin biosynthesis in Chinese cherry fruits. In contrast, silencing of CpMYB4 only slightly rescued anthocyanin accumulation in yellow fruits (Fig. 6B-2), and its overexpression did not significantly alter the anthocyanin content or color of red fruits (Fig. S7). This suggests that CpMYB4 may not be the primary factor contributing to the formation of yellow fruits in Chinese cherry.
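To make the fold-change figures above easier to compare, the following minimal Python sketch shows how an "n-fold increase" and an "n-fold decrease" relate to the ratio of treated to control anthocyanin content. It assumes that fold change is defined as a simple ratio of mean contents, which this section does not state explicitly. The function names and the input values are hypothetical and chosen only to reproduce the reported ratios; they are not measured data.

```python
def fold_change(treated: float, control: float) -> float:
    """Ratio of treated to control content (assumed definition of fold change)."""
    if control <= 0 or treated <= 0:
        raise ValueError("contents must be positive")
    return treated / control


def describe(treated: float, control: float) -> str:
    """Express the ratio as an n-fold increase or an n-fold decrease."""
    fc = fold_change(treated, control)
    if fc >= 1:
        return f"{fc:.2f}-fold increase"
    return f"{1 / fc:.2f}-fold decrease"


def remaining_fraction(fold_decrease: float) -> float:
    """Fraction of the control level that remains after an n-fold decrease."""
    if fold_decrease <= 0:
        raise ValueError("fold_decrease must be positive")
    return 1.0 / fold_decrease


if __name__ == "__main__":
    # Hypothetical, unitless values scaled so the control equals 1.0
    print(describe(treated=3.81, control=1.0))   # overexpression case
    print(describe(treated=1.0, control=6.38))   # silencing case
    print(f"{remaining_fraction(6.38):.1%} of control remains")
```

Under this definition, a 6.38-fold decrease leaves roughly 16% of the control anthocyanin content, so the silencing effect of CpMYB10 in red fruits is proportionally larger than the 3.81-fold gain seen on overexpression in yellow fruits.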

Based on these results, we propose a model for the regulation of anthocyanin biosynthesis in Chinese cherry involving CpMYB10 and CpMYB4 (Fig. 7). CpMYB10, a member of subfamily S6, may regulate anthocyanin biosynthesis in the skin by forming MBW complexes to activate the expression of structural genes, particularly the LBGs. Similarly, sweet cherry PavMYB10.1a interacts with PavbHLH and PavWD40 and is selectively recruited to the PavANS and PavUFGT promoter regions to enhance anthocyanin accumulation [10]. FaMYB10 boosts anthocyanin accumulation by upregulating the expression of almost all structural genes in strawberry fruits [63]. Furthermore, the degree of trans-activation and interaction with bHLH partners varies greatly among MYB10 genes in different Rosaceae species [17]. For instance, the efficient induction of anthocyanin biosynthesis by MdMYB10 in apple depends on the co-expression of MdbHLH3 and MdbHLH33 [17, 64]. However, MYB10 TFs perform poorly with bHLH3 in some species such as peach, strawberry, and pear [17], and FaMYB10 does not interact with FabHLH33 in strawberry [63]. Additionally, variations in the promoter region of MYB10 caused by the insertion of transposons or fragments alter gene expression, leading to activation or inactivation, thereby regulating the coloration of Rosaceae fruits [11, 65, 66]. Thus, further study is needed to elucidate the regulatory mechanism by which the CpMYB10 activator controls anthocyanin accumulation in Chinese cherry.

CpMYB4 belongs to the FaMYB1-like type, which may act on the MBW complex to inhibit its activation [27]. Apple MdMYB15L weakens MdbHLH33-induced anthocyanin accumulation by interacting with MdbHLH33 [31]. In peach, PpMYB18-like genes are likely to be induced by anthocyanin-related activators, thus providing negative feedback regulation to MBW complexes and preventing the over-accumulation of anthocyanins [67]. As mentioned above, CpMYB4 may be a minor factor that inhibits anthocyanin biosynthesis in Chinese cherry. The regulatory role and molecular mechanism of CpMYB4 in anthocyanin accumulation also need further exploration. As previously reported, mature yellow fruit in Chinese cherry is not pure in color but exhibits a slight blush [4], suggesting that the anthocyanin biosynthetic pathway is functional in yellow fruit. The formation of yellow fruit skin color might be a complex process that requires further in-depth study. Overall, this study provides a theoretical basis for further understanding the function of MYB family members in Chinese cherry fruit coloration.

Fig. 7

Diagram of CpMYB10 and CpMYB4 involved in Chinese cherry fruit coloring. (a) CpMYB10 may regulate anthocyanin biosynthesis in the skin by forming MBW complexes to activate the expression of structural genes. (b) CpMYB4 may act as a corepressor, which is incorporated into or binds MBW complexes to change the complex activity from activation to inhibition. CHS: chalcone synthase; CHI: chalcone isomerase; F3H: flavanone 3-hydroxylase; DFR: dihydroflavonol 4-reductase; ANS: leucoanthocyanidin dioxygenase; UFGT: flavonoid 3-O-glucosyltransferase

Conclusions

This study presents a comprehensive and systematic analysis of the R2R3-MYB gene family in Chinese cherry and its Rosaceae relatives. A total of 1490 R2R3-MYB genes were identified and divided into 43 subfamilies across Chinese cherry, 10 Rosaceae relatives and Arabidopsis. The variation in gene numbers within the same subfamilies among different species, and the absence of certain subfamilies in some species, suggest species-specific expansion within the MYB gene family in Chinese cherry and its relatives. The expansion of CpMYBs was primarily driven by segmental and tandem duplication events. Phylogenetic relationships and transcript profiling further revealed CpMYB10 and CpMYB4 as regulators involved in anthocyanin biosynthesis in Chinese cherry. Combined expression patterns and functional verification showed that CpMYB10 promotes anthocyanin accumulation in the fruit skin, while CpMYB4 acts as a minor repressor of anthocyanin biosynthesis in Chinese cherry. Further investigation is required to fully understand their precise roles and regulatory mechanisms in Chinese cherry.