Cytochrome P450s (CYPs) are the largest enzyme family involved in NADPH- and/or O2-dependent hydroxylation reactions and are found across all domains of life [1]. P450 enzymes are present in all plant species and play important roles in plant growth, development, and adaptation to the environment [2]. In terrestrial environments, the conserved P450 families support chemical defence mechanisms, and a number of them participate in the biosynthesis and catabolism of hormones [3]. Furthermore, by enhancing the activity of compounds with high antioxidant activity (such as flavonoids), CYPs are also implicated in protecting plants from harsh environmental conditions [4, 5]. Species-specific P450 families are required for the biosynthesis of species-specific metabolites [6]. Each cytochrome P450 is named with the prefix "CYP" followed by the family number and then a letter that designates the subfamily [7]. Their amino acid sequences are extremely diverse, with similarities as low as 16% in some cases, but their structural fold has remained conserved throughout evolution [8].

With the development of next-generation sequencing (NGS) technology, a large number of plant genomes have been published, which has facilitated the identification of gene families [9]. P450s form one of the largest gene superfamilies in plant genomes: more than 300,000 P450 gene sequences have so far been deposited in databases, including more than 16,000 plant P450s [10]. Nonetheless, identifying the members of the P450 gene family is a significant challenge because of their sheer number, which accounts for no less than 1% of the total annotated genes in plant genomes. Consequently, comparatively few P450 gene families have been identified. Arabidopsis thaliana (A. thaliana) has 246 P450 genes, making P450s the third-largest gene family in A. thaliana [11]. The number of P450 genes in other plants is also relatively high, such as 457 in grape (Vitis vinifera), 332 in soybean (Glycine max), 312 in poplar (Populus trichocarpa), 356 in rice (Oryza sativa), 372 in sorghum (Sorghum bicolor) [12], 233 in tomato (Solanum lycopersicum) [13], 174 in mulberry (Morus notabilis) [14], 334 in flax (Linum usitatissimum L.) [15], 263 in tobacco (Nicotiana tabacum) [16], and 258 in Chinese cabbage (Brassica rapa L.) [17]. Therefore, whole-genome analysis and co-expression networks of P450 gene families can help to determine the functions of P450s and to understand the evolution of these multifunctional enzymes.

P450 enzymes are classified into different subfamilies based on their amino acid sequence and function. Plant P450s participate in various biochemical pathways that produce primary and secondary metabolites, such as phenylpropanoids, alkaloids, terpenoids, lipids, cyanoglycosides, and polyols, as well as plant hormones [18]. For example, the gene families CYP90, CYP724, and CYP734 are involved in the biosynthesis of steroidal saponins and sugar alkaloids [19]. P450 enzymes also participate in the regulation of plant growth and development through hormone metabolism [20]: CYP735As are involved in the biosynthesis of cytokinins [21]; CYP707A is involved in the catabolism of abscisic acid [22]; CYP701A, CYP88AC, CYP714A1, CYP714D1, and CYP714A2 are involved in the synthesis and inactivation of gibberellins [23, 24]; CYP85A, CYP90A, CYP90B, CYP90C, CYP90D, CYP724B, and CYP734A are involved in the biosynthesis of brassinosteroids [25,26,27]; and CYP74A, CYP74B, CYP94B3, and CYP94C1 are involved in the synthesis of jasmonic acid [28,29,30].

P450 enzymes have also been shown to play a role in plant stress responses, including responses to abiotic stress (such as drought and extreme temperatures) and biotic stress (such as insect and pathogen attacks) [31, 32]. For instance, after Xanthomonas axonopodis infection, the CYP gene CaCYP1 from Capsicum annuum was found to be involved in the hypersensitive response [33]. The Arabidopsis CYP gene AtCYP76C2 is linked to hypersensitive rapid cell death, a defence mechanism against bacterial canker (Pseudomonas syringae) infection [34]. Such CYP genes are excellent candidates for engineering agricultural species with improved resistance to biotic and abiotic stress. In addition, P450 genes have been found to be involved in the response to heavy metal stress [35]. Overall, the P450 gene family plays a key role in the metabolism of various compounds in plants, and understanding the functions of these enzymes is important for studying plant biology and developing new plant-derived products.

Tea (Camellia sinensis) is one of the most important beverage crops in the world, with significant economic and health benefits. With the publication of the tea genome, over 80 tea gene families have been identified, such as HDAC [36], PMF [37], PLD [38], MAPK [39], as well as transcription factor families NAC, bZIP, TCP, and MYB [40,41,42,43]. However, few P450 genes from tea have been reported and functionally annotated. Moreover, to date, there have been no reports on the whole-genome study of these genes. Therefore, in this study, we identified the members of the P450 gene family in the whole genome of tea using bioinformatics methods, grouped P450 genes with important functions, and analyzed the physicochemical information, structural function, and expression patterns of all members to understand the molecular evolution of P450 genes and provide a reference for functional characterization of important candidate genes. Furthermore, this investigation holds significant implications for the genetic enhancement of tea growth, development, yield, and resistance to pests and diseases through the utilization of this gene family.

Materials and methods

Identification of P450 genes in tea plant genome

In this study, we aimed to identify and characterize P450 genes in the tea plant (C. sinensis) genome. To achieve this, we downloaded the HMM (Hidden Markov Model) file for the typical conserved domain of P450 genes (PF00067) from the Pfam 35.0 protein family database. We then used HMMER 3.0 to search all protein sequences in the tea plant genome database.

To increase the accuracy of our search, we obtained 238 AtP450 protein sequences from the TAIR website and used them as queries in a local BLAST search against the tea plant genome database (with an E-value cutoff of 10^-3). We then removed candidate protein sequences with incomplete domain structures using the NCBI-CDD (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) and SMART domain detection tools, resulting in the identification of the CsP450 protein sequences.
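The candidate-collection step can be illustrated with a minimal sketch. It assumes that BLAST was run with tabular output (`-outfmt 6`, in which column 2 is the subject ID and column 11 is the E-value) and that the HMMER hits are already available as a set of IDs. All sequence IDs below are invented for illustration, and merging the two hit sets as a union before domain verification is an assumption, since the text does not state how the two searches were combined.

```python
E_VALUE_CUTOFF = 1e-3  # cutoff stated in the text


def blast_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return subject IDs whose E-value is at or below the cutoff."""
    hits = set()
    for line in tabular_text.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject_id, e_value = fields[1], float(fields[10])
        if e_value <= cutoff:
            hits.add(subject_id)
    return hits


# Hypothetical rows: query, subject, %identity, length, mismatches, gap opens,
# qstart, qend, sstart, send, evalue, bitscore.
example_blast = (
    "AtQuery1\tCsGene001\t45.2\t480\t250\t5\t10\t490\t12\t495\t1e-120\t420\n"
    "AtQuery1\tCsGene002\t28.0\t60\t40\t1\t5\t65\t100\t160\t0.5\t30\n"
    "AtQuery2\tCsGene003\t51.0\t500\t230\t3\t1\t500\t1\t502\t3e-150\t510\n"
)
hmm_hits = {"CsGene001", "CsGene004"}  # hypothetical hmmsearch hits for PF00067

# Assumption: candidates from either search go forward to CDD/SMART checking.
candidates = blast_hits(example_blast) | hmm_hits
print(sorted(candidates))
```

Taking the intersection instead of the union would be the stricter alternative; either way, the domain check described above remains the final filter.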

To further characterize the identified CsP450 protein sequences, we submitted them to ProtParam to predict their molecular weight, isoelectric point, and amino acid composition [44]. Finally, we used TBtools to locate the CsP450 genes on the tea plant chromosomes and named them according to their chromosomal positions [45].
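Position-based naming can be sketched as follows. The function, the tuple layout, and the example gene IDs are hypothetical; the sketch only assumes that each gene has a chromosome number and a start coordinate, and that names are assigned in ascending order of both.

```python
def name_by_position(genes, prefix="CsP450"):
    """Assign sequential names ordered by chromosome number, then start position.

    genes: list of (gene_id, chromosome_number, start) tuples.
    Returns a dict mapping gene_id to its new name.
    """
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    return {
        gene_id: f"{prefix}{index}"
        for index, (gene_id, _, _) in enumerate(ordered, start=1)
    }


# Hypothetical input: two genes on chromosome 1 and one on chromosome 2.
example_genes = [("geneC", 2, 500), ("geneA", 1, 9000), ("geneB", 1, 150)]
names = name_by_position(example_genes)
print(names)
```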

Phylogenetic analysis of CsP450s

To classify the gene family members, the CsP450 protein sequences were extracted based on their IDs and aligned with the 238 P450 proteins from A. thaliana using Clustal W with the default parameters [46]. The resulting alignment was used to construct an unrooted evolutionary tree with the Neighbor-Joining method in MEGA 7 [47]. The bootstrap value was set to 1000 replicates to assess the robustness of the tree. The resulting tree was further annotated using EvolView to enhance its readability and visual presentation.
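Two ingredients of this procedure, a pairwise distance and bootstrap resampling of alignment columns, can be sketched in a few lines. This is only an illustration: the p-distance is used here as a simple stand-in (the substitution model used in MEGA is not stated in the text), tree building itself is omitted, and the toy alignment is invented.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns) for name, seq in alignment.items()
    }


# Toy aligned sequences (hypothetical).
alignment = {"seqA": "MKT-LLV", "seqB": "MRTALLV", "seqC": "MKSALIV"}
rng = random.Random(0)  # fixed seed so the replicate is reproducible
replicate = bootstrap_alignment(alignment, rng)
print(p_distance(alignment["seqA"], alignment["seqB"]))
```

In a full analysis, a distance matrix and tree would be computed for each of the 1000 replicates, and the support for a branch would be the fraction of replicate trees that contain it.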

Analysis of CsP450s gene structure and cis-acting elements

The CDS and genomic annotation information of the CsP450 gene family was obtained from the tea plant genome database. The Gene Structure Display Server (GSDS) was used to generate a schematic representation of the exon–intron structure of the gene family [48]. The MEME online software was used to analyze the conserved motifs of the CsP450 proteins, with the following parameters: a maximum of 10 motifs and an optimum motif width of 6 to 200 amino acid residues [49]. The evolutionary tree, gene structures, and motif analysis of the gene family were combined in a single figure using TBtools to show the gene structures and evolutionary relationships of the family members.
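Exon counts of the kind reported by a gene-structure analysis can be derived directly from an annotation file. The sketch below assumes GFF3-formatted annotation in which exon features carry a `Parent` attribute; the feature names and IDs in the example are invented, and the actual tea genome annotation may use different conventions.

```python
from collections import Counter


def exon_counts(gff3_text):
    """Count exon features per parent transcript in GFF3-formatted text."""
    counts = Counter()
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attributes = dict(
            item.split("=", 1) for item in fields[8].split(";") if "=" in item
        )
        parent = attributes.get("Parent")
        if parent:
            counts[parent] += 1
    return counts


# Hypothetical annotation: transcript t1 has two exons, t2 has one.
example_gff = "\n".join([
    "##gff-version 3",
    "Chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=g1",
    "Chr1\tsrc\texon\t100\t300\t.\t+\t.\tID=e1;Parent=t1",
    "Chr1\tsrc\texon\t500\t900\t.\t+\t.\tID=e2;Parent=t1",
    "Chr2\tsrc\texon\t50\t700\t.\t-\t.\tID=e3;Parent=t2",
])
counts = exon_counts(example_gff)
intronless = sorted(t for t, n in counts.items() if n == 1)
print(dict(counts), intronless)
```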

To further explore the regulatory elements of the CsP450 gene family, the 2 kb region upstream of the ATG start codon of each CsP450 gene was downloaded from the tea plant database. The PlantCARE online tool was used to predict cis-acting elements in the promoter sequences [50], and the results were visualized using TBtools.
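The coordinates of a 2 kb upstream region depend on the strand of the gene, which is an easy detail to get wrong. The following sketch shows one way to compute them; the function names and the 1-based inclusive coordinate convention are assumptions for illustration, not a description of how the database export works.

```python
def upstream_region(start, end, strand, flank=2000, chrom_length=None):
    """Return 1-based inclusive (start, end) of the region upstream of a CDS.

    start/end are the 1-based CDS coordinates. On the '+' strand the ATG lies
    at `start`; on the '-' strand it lies at `end`, so upstream means higher
    coordinates. The region is clipped at the chromosome ends.
    """
    if strand == "+":
        return max(1, start - flank), start - 1
    if strand == "-":
        region_end = end + flank
        if chrom_length is not None:
            region_end = min(region_end, chrom_length)
        return end + 1, region_end
    raise ValueError("strand must be '+' or '-'")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence (needed for minus-strand promoters)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


print(upstream_region(5000, 6500, "+"))
print(upstream_region(5000, 6500, "-"))
```

For a minus-strand gene, the extracted sequence would then be reverse-complemented so that the promoter reads 5' to 3' toward the start codon.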

Subcellular localization prediction of CsP450s gene

WoLF PSORT was used to predict the subcellular localization of CsP450-encoded proteins. The tool compares the input sequence with a database of known subcellular localization signals and motifs and then assigns a score to each potential subcellular localization site.

Chromosomal localization and genome collinearity analysis of CsP450s gene

To perform chromosome localization analysis of the gene family, we used MapChart. We conducted genome-wide collinearity analysis and gene duplication event analysis using MCScanX with default parameters [51]. KaKs Calculator 2.0 was used to estimate the non-synonymous substitution rate (Ka), the synonymous substitution rate (Ks), and their ratio (Ka/Ks) for each pair of paralogs [52]. In general, Ka/Ks = 1 reflects neutral selection (as in pseudogenes), Ka/Ks < 1 indicates purifying or negative selection, and Ka/Ks > 1 indicates positive selection.
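The interpretation rule for Ka/Ks can be written as a small helper. This is a sketch of the classification only; it assumes that Ka and Ks have already been estimated (for example by KaKs Calculator), and the function names, the tolerance, and the example values are hypothetical.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero or negative and the ratio is undefined."""
    if ks <= 0:
        return None
    return ka / ks


def classify_selection(ka, ks, tolerance=1e-9):
    """Map a Ka/Ks ratio to the selection regime described in the text."""
    ratio = ka_ks_ratio(ka, ks)
    if ratio is None:
        return "undefined"
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical (Ka, Ks) estimates for three paralog pairs.
for ka, ks in [(0.05, 0.50), (0.30, 0.30), (0.80, 0.40)]:
    print(ka, ks, classify_selection(ka, ks))
```

In practice a statistical test, rather than an exact comparison with 1, would be needed to claim neutrality or positive selection for a given pair.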

Protein–protein interaction network analysis of CsP450s

The candidate P450 genes of tea plant were not found in the STRING database. Therefore, we used OrthoVenn2 to search for Arabidopsis homologs of the tea plant P450 genes for further analysis. The protein–protein interaction network was visualized using the Cytoscape network visualization software, where nodes represent proteins and edges represent interactions.
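In such a network, the degree of a node is the number of distinct proteins it interacts with, which is the quantity typically mapped to node size in a network figure. A minimal sketch, using an invented edge list with placeholder protein names, is:

```python
from collections import defaultdict


def node_degrees(edges):
    """Compute the degree of each node in an undirected interaction network.

    edges: iterable of (protein_a, protein_b) pairs. Duplicate edges and
    self-interactions are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


# Hypothetical interactions between placeholder proteins.
example_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4"), ("P2", "P1")]
degrees = node_degrees(example_edges)
print(degrees)
```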

In-silico gene expression analysis of CsP450 genes

The Illumina RNA-sequencing (RNA-seq) data of tea plant were downloaded from the tea plant genome database to examine the relative expression patterns of CsP450s under abiotic stress at several time points (0 h, 24 h, 48 h, and 72 h for PEG treatment; 0 h, 6 h, and 7 d for cold treatment at 4℃) and in different tissues, including apical buds, flowers, fruits, young leaves, mature leaves, old leaves, roots, and stems. The clustering heatmap was drawn using the heatmap tool of the Biotech Cloud Platform, with rows clustered and FPKM selected as the data preprocessing method.
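For reference, FPKM (fragments per kilobase of transcript per million mapped fragments) and a typical row scaling for heatmaps can be sketched as follows. The FPKM formula is standard; the log2(x + 1) transform and the per-gene z-score are assumptions about how a heatmap tool might scale rows, since the exact settings of the platform are not given in the text.

```python
import math
from statistics import mean, pstdev


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def row_zscores(values):
    """Scale one gene's expression values: log2(x + 1), then z-score across samples."""
    logged = [math.log2(v + 1) for v in values]
    mu, sigma = mean(logged), pstdev(logged)
    if sigma == 0:
        return [0.0] * len(logged)  # constant row: no variation to display
    return [(x - mu) / sigma for x in logged]


# Hypothetical gene: 100 fragments, 1 kb transcript, 1 million mapped fragments.
print(fpkm(100, 1000, 1_000_000))
# Hypothetical FPKM values of one gene across four samples.
print(row_zscores([1.0, 3.0, 7.0, 15.0]))
```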

qPCR analysis

Drought stress was induced in tea plants by treating them with 20% PEG6000 for 24 h, 48 h, and 72 h, with the control sample collected at 0 h. Cold stress was imposed by treating tea plants at 4℃ for 6 h and 7 d, with samples collected at 0 h as the control. To investigate the response of CsP450 genes to these stresses, ten CsP450 genes were selected and their expression levels were analyzed using qPCR. Total RNA was extracted from the tea plant samples using the RNAprep Pure Plant Kit (Tianjin, China), and cDNA was synthesized using the PrimeScript® RT reagent kit (Takara, China) according to the manufacturer's instructions. Gene-specific primers were designed using the NCBI database online toolkit and used to amplify the target fragments. The relative expression levels of the selected genes were calculated using the 2^−ΔΔCt method [53]. The expression analysis of CsP450 genes was performed with three biological replicates and three technical replicates for all samples.
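The 2^−ΔΔCt calculation can be made concrete with a short sketch. It assumes a single reference (housekeeping) gene, which is not named in the text, and the Ct values in the example are invented.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference), computed separately for treated and control.
    ddCt = dCt(treated) - dCt(control).
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target amplifies 3 cycles earlier after treatment
# while the reference gene is unchanged.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold_change)
```

A value above 1 indicates upregulation relative to the 0 h control, and a value below 1 indicates downregulation.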

Data analysis

Statistical analysis was performed using IBM SPSS Statistics 22 software (IBM, New York, USA). All values presented in the figures are expressed as the mean ± standard deviation (SD) of biological triplicates, unless otherwise stated. Differences between treatments were assessed by two-way analysis of variance (ANOVA) followed by the least significant difference (LSD) test, with a significance level of p < 0.05.


Identification and physicochemical analysis of CsP450 gene family

After screening the tea plant genome and verifying the candidates with NCBI-CDD and SMART, 273 candidate P450 genes were identified and designated CsP4501 to CsP450273 according to their chromosomal locations (Table 1). The P450 genes were distributed across all chromosomes, from chromosome 1 to chromosome 15. The P450 protein sequences varied greatly in length, ranging from 268 to 612 amino acids, with molecular weights ranging from 30.95 to 68.5 kDa and isoelectric points ranging from 4.93 to 10.17. Subcellular localization analysis showed that these proteins were mainly localized to chloroplasts, plasma membranes, cytoplasm, endoplasmic reticulum, mitochondria, nuclei, and vacuoles.

Table 1 The physiological and biochemical properties of 273 CsP450 proteins in C. sinensis. Plas: plasma membrane; E.R.: endoplasmic reticulum; Mito: mitochondria; Chlo: chloroplast; Extr: extracellular; Cyto: cytoplasm; vacu: vacuole; nucl: nucleus; golg: Golgi apparatus; pero: peroxisome, cysk: cytoskeleton

Phylogenetic analysis of CsP450 gene family

To gain a deeper understanding of the evolutionary relationships among members of the tea P450 gene family, we conducted a multiple sequence alignment of the 273 identified tea P450 proteins with 238 AtP450 protein sequences, followed by cluster analysis to generate a phylogenetic tree (Fig. 1). The phylogenetic tree indicated that the P450 proteins fall into 34 subfamilies: CYP71, CYP72, CYP73, CYP76, CYP77, CYP78, CYP81, CYP82, CYP84, CYP85, CYP86, CYP87, CYP89, CYP90, CYP93, CYP97, CYP98, CYP94, CYP701, CYP702, CYP703, CYP704, CYP705, CYP706, CYP707, CYP708, CYP709, CYP710, CYP711, CYP714, CYP716, CYP734, CYP749, and MAH. The CYP71 subfamily had the most members, containing 31 tea P450 proteins, while the CYP711 subfamily had the fewest, containing only one protein. The CYP702, CYP705, and CYP708 subfamilies contained no tea P450 proteins, and the CYP749 subfamily contained no AtP450 proteins. Notably, all other subfamilies included both tea plant and A. thaliana P450 genes, indicating that the tea plant P450 family shares a common ancestry with the A. thaliana P450 family. This analysis provides insights into the evolutionary relationships among tea P450 genes and lays the foundation for further investigations into the functional characteristics of this gene family.

Fig. 1 Phylogenetic relationships of C. sinensis and A. thaliana P450 proteins. Blue triangles represent AtP450 genes and red asterisks represent CsP450 genes

Gene structure and conserved motif analysis of CsP450s in tea plant

The majority of plant genes are interrupted by one or more introns. Exon–intron configurations can be used to investigate the evolutionary links between members of a gene family, and many earlier investigations have observed a correlation between exon/intron distribution patterns and biological activity [54]. The evolutionary relationships and gene structures of the tea P450 family members were therefore investigated by integrating phylogenetic trees, gene structure diagrams, and motif analysis (Fig. 2A and 2B). The number of exons in tea P450 family genes ranged from 1 to 14, with 27 genes lacking introns and containing a single exon. Using MEME, ten conserved motifs (Motif1-Motif10) were identified in the CsP450 family proteins (Figure S1). The number of conserved motifs per protein varied from 1 to 10, with Motif5 to Motif8 being the most frequently occurring motifs. Furthermore, there were significant differences in the patterns of conserved motifs and gene structures between type A and non-type A P450s. Type A comprises the CYP71 clan, which contains the following subfamilies: CYP71, CYP78, CYP82, CYP89, and CYP736, while the non-type A group comprises all CYPs outside the CYP71 clan. However, similar patterns were observed within the same subfamily, which supports the reliability of the phylogenetic relationships and group classifications.

Fig. 2 Phylogenetic relationship, gene structure, and distribution of conserved motifs of CsP450 proteins in C. sinensis. Different motifs are represented by differently colored, numbered boxes

Analysis of cis-acting elements of CsP450 gene family

Cis regulatory elements (CREs) are a family of non-coding DNA components that regulate gene expression at various developmental stages by influencing the transcription of nearby genes [55]. To investigate the potential response of CsP450 family members to growth and development, stress and other environmental cues, the promoter regions were analyzed using PlantCARE. The results showed that the major cis-acting elements included abscisic acid responsive elements (ABRE), jasmonic acid response elements (CGTCA-motif), low temperature responsive element (LTR), MYB binding site involved in drought-inducibility (MBS), gibberellin-responsive regulatory element (TATC-box), salicylic acid responsive element (TCA-element) and auxin-responsive element (TGA-element) (Figure S2). The predicted results further suggest that the tea CsP450 family plays an important role in regulating growth and development processes, hormone signal transduction, and response to environmental stress.

Chromosomal distribution analysis of CsP450s in tea plant

Based on the genome annotation of the tea plant, we investigated the physical locations of CsP450s on tea plant chromosomes, and the results are presented in Fig. 3. All 15 chromosomes of the tea plant contain P450 genes, but the genes are unevenly distributed among them. Chromosomes 1, 2, 4, 7, and 12 have the most P450 genes, while chromosome 10 has the fewest. In addition, some CsP450 genes are closely linked, and 37 pairs of genes show tandem duplication.

Fig. 3 Chromosomal locations of P450 genes in C. sinensis. Chromosome numbers are shown at the top of each chromosome

Gene duplication relationship and collinearity analysis of CsP450 genes

Gene duplication and amplification provide important evidence for studying the evolution and expansion of gene families. To investigate gene duplication events in the CsP450 gene family, the MCScanX algorithm was used to analyze collinearity and gene duplication in the tea plant genome. The analysis revealed that 37 pairs of genes were involved in tandem duplication and 28 pairs of genes were collinear (Fig. 4, where red lines linking two chromosomal regions represent syntenic regions), providing a driving force for the evolution of the CsP450 family. In addition, duplication was most frequent on chromosomes 2 and 3, which is also the main reason for the higher number of CsP450 genes on these chromosomes. These findings indicate that both tandem duplication and segmental duplication contributed to the expansion of the CsP450 family, with tandem duplication having the more significant impact.
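The criteria used to call tandem pairs are not stated in the text. One commonly used convention treats two homologous genes as a tandem pair when they lie on the same chromosome and are separated by at most one intervening gene. The sketch below applies that convention as an assumption; the function, the input layout, and the example data are hypothetical, and homology between genes is taken as given.

```python
def tandem_pairs(genes, homologous, max_intervening=1):
    """Find homologous gene pairs that sit close together on one chromosome.

    genes: list of (gene_id, chromosome, rank), where rank is the gene's
           order index along its chromosome.
    homologous: set of frozenset({id_a, id_b}) pairs judged to be homologs.
    """
    by_chrom = {}
    for gene_id, chrom, rank in genes:
        by_chrom.setdefault(chrom, []).append((rank, gene_id))

    pairs = []
    for members in by_chrom.values():
        members.sort()
        for i, (rank_a, id_a) in enumerate(members):
            for rank_b, id_b in members[i + 1:]:
                if rank_b - rank_a - 1 > max_intervening:
                    break  # members are sorted, so later genes are farther away
                if frozenset((id_a, id_b)) in homologous:
                    pairs.append((id_a, id_b))
    return pairs


# Hypothetical gene order and homology calls.
example_genes = [
    ("g1", "Chr1", 10), ("g2", "Chr1", 11), ("g3", "Chr1", 20),
    ("g4", "Chr2", 5), ("g5", "Chr2", 7),
]
example_homologs = {
    frozenset(("g1", "g2")), frozenset(("g1", "g3")), frozenset(("g4", "g5")),
}
print(tandem_pairs(example_genes, example_homologs))
```

Here g1 and g3 are homologous but too far apart to count as tandem, so such a pair would instead be a candidate for segmental duplication if it falls in a collinear block.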

Fig. 4 Synteny analysis of P450 genes in C. sinensis. Red lines indicate large segmental duplications between gene family members

Members of a gene family often evolve from a single ancestral gene. Therefore, collinearity analysis between the tea plant and A. thaliana genomes helps to clarify the origin and evolutionary relationships of P450 genes (Fig. 5). The results showed that 41 homologous P450 genes were collinear between tea plants and A. thaliana, with more homologous P450 genes found on chromosomes 1 and 2 of tea plants, while no homologous P450 genes were found on chromosomes 5, 8, and 9. In addition, multiple tea plant P450 genes were identified as homologous to a single AtP450 gene, and multiple AtP450 genes were also homologous to a single tea plant P450 gene. This collinearity relationship suggests that the expansion of this gene family may have occurred before the divergence of tea plants and A. thaliana.

Fig. 5 Synteny analysis of P450 genes between the genomes of C. sinensis and A. thaliana. Cs represents the C. sinensis genome and AT represents the A. thaliana genome. Gray lines show all collinear relationships between the two genomes, and colored lines show collinear relationships among P450 gene family members

Selection pressure analysis of CsP450 genes

Throughout the course of evolution, gene duplication events often lead to the divergence of duplicated genes from their initial specialized functions. This divergence may manifest as non-functionalization, sub-functionalization, or neo-functionalization [56]. We calculated Ka/Ks values from inter and intra genomic/subgenomic combinations of the tea plant in order to study the influence of Darwinian positive selection and the magnitude of selection pressure on divergence of P450 duplicated genes. As the majority of the Ka/Ks values were less than 1, it was assumed that after segmental and whole genome duplication, the CsP450 gene family had undergone strong purifying selection pressure with limited functional divergence (Table S1).

Protein–protein interaction network analysis of CsP450 genes

Analysis of protein–protein interactions (PPI) is a crucial way to understand protein function. Using the protein interaction network of Arabidopsis, we mapped and analyzed the protein interaction network of tea P450 proteins (Fig. 6). A total of 317 interactions were detected in the PPI network. The protein interaction map showed that multiple tea P450 proteins have interacting target proteins, such as the phenylalanine ammonia-lyase PAL1, the flavonoid synthesis protein F3H, and the brassinosteroid synthesis pathway proteins DWF5, DET2, STE1, and BR6OX1, among others. Additionally, there may be protein–protein interactions among tea P450 proteins themselves, such as CsP450107, CsP450108, CsP450116, CsP450145, CsP450231, and CsP450266, among others. Therefore, the protein interaction network analysis supports the hypothesis that tea P450 proteins may participate in multiple physiological pathways through protein interactions.

Fig. 6 Protein interaction network of P450 genes in C. sinensis. Red dots are CsP450 genes; green dots are other genes added based on the STRING database. Dot size represents node degree, and line thickness represents the level of confidence

Tissue-specific expression of CsP450 genes

Understanding the tissue-specific expression patterns of genes is crucial for elucidating their roles in plant growth, development, and responses to environmental stresses [57], because the expression patterns of genes in different tissues are closely related to their biological functions. In this study, we used RNA-Seq data from eight tissues of tea plants (apical buds, flowers, fruits, young leaves, mature leaves, old leaves, roots, and stems) to examine the tissue-specific expression profiles of the P450 gene family. Normalized FPKM expression values were used to construct a digital expression profile heatmap. The CsP450s exhibited diverse expression patterns. In general, P450 genes were highly expressed in roots and stems, while their expression levels were low in mature and old leaves (Fig. 7–1, -2). The clustering results indicated that P450 genes in the same subfamily exhibited similar expression patterns.

Fig. 7 Clustered heatmap of CsP450 gene expression in eight different tissues of C. sinensis. The color blocks represent expression levels, with red representing high gene expression and blue representing low gene expression

Expression analysis of CsP450s in response to drought and cold stress

To investigate the response of the P450 gene family to drought and cold stress in tea plants, transcriptome sequencing data from tea plants subjected to PEG treatment (24 h, 48 h, and 72 h) and cold stress (6 h and 7 d) were analyzed. The results indicated that the expression of CsP450 genes in response to drought stress followed one of three trends: initial upregulation followed by downregulation, sustained upregulation, or continuous downregulation (Fig. 8). Similar expression patterns were also observed under cold stress (Fig. 9). Furthermore, the clustering analysis of the CsP450 gene family revealed that genes from the same subfamily displayed similar expression patterns. These findings demonstrate that the expression of the CsP450 gene family is modulated in response to drought and cold stress in tea plants. These results may provide valuable insights into the molecular mechanisms underlying stress tolerance in tea plants and could facilitate the development of stress-resistant tea cultivars in the future.

Fig. 8 Clustered heatmap of CsP450 gene expression under drought stress in C. sinensis. The color blocks represent expression levels, with red representing high gene expression and blue representing low gene expression

Fig. 9 Clustered heatmap of CsP450 gene expression under cold stress in C. sinensis. The color blocks represent expression levels, with red representing high gene expression and blue representing low gene expression

Expression analysis of CsP450s in response to drought and cold stress

CsP450 genes play important roles in the response of tea plants to abiotic and biotic stresses. To further validate the expression patterns of the selected CsP450 genes under drought and cold stress, quantitative real-time polymerase chain reaction (qPCR) was performed on 12 different CsP450 genes. The qPCR data were generally consistent with the transcriptomic data (Fig. 10). Specifically, under drought stress, CsP450139, CsP450197, and CsP450252 exhibited continuous upregulation, with approximately 8-, 5-, and 3-fold increases, respectively, while CsP450219 showed continuous downregulation compared with the control. In addition, CsP45080, CsP450157, and CsP450181 showed an initial upregulation followed by downregulation (Fig. 10A). However, the transcript level of CsP450240 did not differ significantly from the control at any time point, with its maximum relative expression reaching 1.6 times the control level at 48 h.

Fig. 10 The relative expression levels of selected CsP450 genes under drought and cold treatments, as determined by qPCR. A The expression profiles of genes under drought treatment at different time points. B The expression profiles of genes under cold treatment at different time points. Error bars show standard deviations among three independent biological replicates. * represents p < 0.05

Under cold stress, CsP45080, CsP450157, and CsP450219 exhibited an increase followed by a decrease in expression, while CsP45022, CsP450197, and CsP450252 showed continuous upregulation compared with the control, with approximately 2.5-, 2.4-, and 2.3-fold increases, respectively. Conversely, CsP450171 and CsP450181 showed continuous downregulation compared with the control (Fig. 10B). The transcript levels of CsP4507, CsP450139, and CsP450240 did not differ significantly from the control at any time point. The qPCR results support the expression patterns observed in the transcriptomic data, thereby providing further evidence for the involvement of CsP450 genes in the response to drought and cold stress in tea plants.


Cytochrome P450 enzymes catalyze various reactions involved in growth, development, and the biosynthesis of secondary metabolites [13]. Gene identification and functional classification are essential for studying the function of gene families. As an important gene superfamily, cytochrome P450s have been identified at the genome level in various plants as whole-genome sequences have become available. However, little is known about how these P450 genes respond to biotic and abiotic stresses and how they participate in the growth and development of tea plants. In this study, 273 non-redundant P450 genes were identified from the tea plant genome, and these genes are similar to those found in Arabidopsis. A comprehensive study was then conducted on the phylogenetic relationships, conserved motifs, gene structures, gene duplication events, cis-acting elements, and tissue-specific expression patterns of the members of this gene family in tea plant. In addition, we analyzed expression profiles from RNA-Seq data related to drought and cold stress. The study contributes detailed knowledge of the CsP450 gene family and will help in understanding the functional divergence of P450 genes in tea plants.

Recent genome sequencing revealed a genome size of approximately 3.0 Gb for two representative elite tea plant cultivars [58]. The phylogenetic tree topology of tea plant and Arabidopsis P450s showed similar clustering, indicating a certain degree of conservation of the P450 multi-gene family in plants. In the current phylogenetic classification of plant P450s, the plant P450 family is divided into nine clans: CYP51, CYP71, CYP710, CYP711, CYP72, CYP74, CYP85, CYP86, and CYP97 [11]. The clans present in tea are CYP710, CYP711, CYP71, CYP72, CYP74, CYP85, CYP86, and CYP97. Many plant-specific P450 enzymes that act in the metabolism of secondary products belong to the largest group, CYP71, which also has the most members in tea plants. The CYP71 clan is classified as type A, and the remaining eight clans are classified as non-A type [59]. Most type A genes encode plant-specific enzymes that act on the metabolism of secondary products (such as phenylpropanoids and alkaloids), while non-A type genes are mainly involved in the synthesis of hormones and other compounds [60]. These analyses provide critical information for studying the phylogeny of the cytochrome P450 gene family.

A recent study found that multiple cytochrome P450 (P450) genes induced by both biotic and abiotic stressors contain recognition sites for MYB and MYC transcription factors, ACGT core sequences, TGA-boxes, and W-boxes for WRKY transcription factors [61]. These cis-acting elements are known to be involved in the regulation of plant defense, and the response of each P450 gene to various stressors is strictly controlled [17]. In this study, numerous hormone-induced regulatory elements, such as TATC-box, TCA-element and TGA-element, and cis-acting elements involved in responses to abiotic stress, such as low temperature and drought, were identified in the promoter sequences of tea plant P450 genes.

Although the functions of multiple subfamilies of the P450 family have been extensively explored, research on the molecular basis for the transcriptional activation of many P450 genes by receptor-mediated signaling remains in its early stages [62]. Furthermore, some P450 enzymes may localize to more than one organelle; for example, CsP45052 may function in the plasma membrane, mitochondrial membrane, or endoplasmic reticulum. In particular, many P450-catalyzed reactions in plants yield compounds that may be toxic if released into the cytoplasm [63].

Gene duplication is a major driving force in the evolution of organisms. Tandem duplication (TD) and segmental or whole-genome duplication (S/WGD) are the two basic mechanisms by which gene duplication takes place [63]. In our study, segmental duplication of 28 P450 gene pairs was found in the tea plant, and we assume that the ancient whole-genome triplication was responsible for these pairs. Alongside the segmental duplication events, 37 tandem duplication events were found, suggesting that tandem duplication played a major role in the proliferation of P450 genes in tea plants. These results are in line with observations in citrus and grapevine, where the majority of CYP genes were created through tandem duplication [64, 65].

Previous studies have revealed that plant P450s act in many biochemical pathways and play important roles in multiple biological processes, including development and stress response [66, 67]. The CsP450 PPI network included the phenylpropanoid pathway (PPP), a crucial secondary metabolism pathway implicated in lignin formation, radical scavenging, the production of signalling molecules, and reproduction. In our study, the expression profiles of the CsP450 genes were examined in various tissues as well as in response to drought and cold stresses. The findings suggest that the CsP450 genes can be grouped into clusters based on their expression patterns, and the genes within each cluster might be involved in related functions. Additional research is necessary to uncover the specific roles of individual CsP450 genes in the stress response and to assess their potential for the genetic improvement of tea plants.


In this study, we identified a total of 273 CsP450 genes in the tea plant genome, which fall into 34 subfamilies and can be divided into A and non-A types. We analyzed their structures and functions and found that members of the same subfamily have similar exon–intron structures and motif compositions. In addition, we identified cis-acting elements related to hormone and stress responses. The collinearity and synteny results suggest that both tandem and WGD/segmental duplications contributed to the expansion of the P450 gene family during evolution. Furthermore, our findings suggest that the CsP450 gene family is implicated in the response of tea plants to drought and cold stress. These results offer novel insights into the molecular mechanisms that underlie stress responses in tea plants and could have practical implications for breeding stress-tolerant tea cultivars.