Identification of yellowhorn (Xanthoceras sorbifolium) WRKY transcription factor family and analysis of abiotic stress response model

WRKY transcription factors are widely distributed in higher plants and play important roles in many biological processes, including stress resistance. The recently published genome sequence of yellowhorn, an oil tree with robust resistance to cold, drought, heat, salt and alkali, provides an excellent opportunity to identify and characterize the entire yellowhorn WRKY protein family and a basis for studying the role of the WRKY gene family in abiotic stress resistance in forest species. In the present comprehensive analysis of WRKY transcription factors in yellowhorn, 65 WRKY genes were identified and named based on their location on the chromosomes. According to their structure and phylogenetic relationships, XsWRKY genes clustered into WRKY groups I–III. Segmental duplication events played a significant role in the expansion of the WRKY gene family. Furthermore, transcriptomic data and real-time quantitative PCR analysis showed that the expression of XsWRKY genes responded to salt and drought stresses and a hormone treatment. We also determined structures of the encoded proteins, cis-elements of the promoter regions, and expression patterns. These results provide a foundation for the study of the biological function of WRKY transcription factors in yellowhorn.


Introduction
Transcription factors are proteins that bind specifically to gene promoter regions and thus regulate transcription and gene expression in response to internal and external stimuli. Unique to plants, the WRKY family is ubiquitous and one of the largest transcription factor families in higher plants. WRKY transcription factors are major regulators of many plant biological processes, including abiotic and biotic stress responses, growth and development, carbohydrate synthesis, senescence and secondary metabolite synthesis (Eulgem et al. 2000; Robatzek and Somssich 2002; Sun et al. 2003; Wang et al. 2010a, b). WRKY transcription factors are defined by the conserved WRKY domain of approximately 60 residues, which consists of an N-terminal DNA-binding domain (DBD) and a C-terminal zinc finger motif (Eulgem et al. 2000). The DBD sequence is highly conserved, but the WRKY residues are sometimes replaced by WRRY, WKRY, WSKY, WVKY, or WKKY (Xie et al. 2005). The zinc finger motif of the WRKY family has evolved two forms: C2H2 (C-X4-5-C-X22-23-H-X1-H) and C2HC (C-X7-C-X23-H-X1-C) (Eulgem et al. 2000). According to their amino acid sequences, WRKY transcription factors can be divided into three groups (Eulgem et al. 2000; Zhang and Wang 2005; Rushton et al. 2010, 2012). Proteins in the first group have two DBDs and a C2H2 zinc finger motif (C-X4-5-C-X22-23-H-X1-H), whereas those in the second and third groups have only one DBD and one zinc finger motif. The zinc finger motif is C2H2 (C-X4-5-C-X22-23-H-X1-H) in the second group and C2HC (C-X7-C-X23-H-X1-C) in the third group (Kumar et al. 2016; Gu et al. 2018a, b).
In addition to the WRKY domain and zinc finger motif, some WRKY transcription factors also contain other structures, such as nuclear localization signal, leucine zipper, glutamic acid enrichment region, proline-rich region and kinase domain (Tian et al. 2006;Agarwal et al. 2011).
Numerous studies have shown that WRKY transcription factors are involved in plant responses to various biotic and abiotic stresses (Rushton et al. 2012; Wang et al. 2014, 2016). In Arabidopsis thaliana, AtWRKY22 and AtWRKY29 are induced as part of a defense response to an elicitor from a bacterial plant pathogen, perhaps through activation of a mitogen-activated protein kinase (MAPK) cascade or calcium-dependent protein kinase (CDPK) (Göhre et al. 2012), while overexpression of AtWRKY38 and AtWRKY62 in double mutants of A. thaliana reduces disease resistance (Kim et al. 2008). AtWRKY48 gain-of-function transgenic overexpression lines had impaired resistance to the bacterial pathogen Pseudomonas syringae, and loss-of-function T-DNA insertion mutants had enhanced resistance associated with increased expression of a salicylic acid-regulated defense-related protein (PR1) (Xing et al. 2008). Similarly, WRKY genes such as BdWRKY8, BdWRKY34, BdWRKY50, BdWRKY69, and BdWRKY70 in barley were upregulated after inoculation with Fusarium graminearum, the causal fungus of Fusarium head blight, which can cause diseases such as ear rot, stem rot, stem base rot and root rot and can also infect other plants (Wen et al. 2014). Studies have also shown that various plants overexpressing WRKY genes had strong resistance to salt stress (e.g., TaWRKY10 in wheat (Wang et al. 2013); ZmWRKY23 in Zea mays (Jiang and Yu 2009); OsWRKY45 and OsWRKY72 in Oryza sativa (Uji et al. 2019)). WRKY transcription factors are also essential in signaling pathways mediated by salicylic acid (SA) and by abscisic acid (ABA) (Qiu and Yu 2009) and in plant growth and development. AtWRKY6, AtWRKY22 (Zhou et al. 2011; Zhang et al. 2018), AtWRKY57, AtWRKY53 (Miao et al. 2013), AtWRKY54 and AtWRKY70 (Besseau et al. 2012) are involved in the regulation of senescence.
Also, WRKY transcription factors regulate the synthesis of the secondary metabolites, including terpenes, lignin, flavanol, kaempferol, and quercetin (Yokotani et al. 2013).
In addition to work on the WRKY gene family in model species, studies on WRKY transcription factors in tree species have indicated that, in Populus euphratica, the formation of new WRKY family members and the adaptive evolution of other genes may have played an important role in the evolution of high salt tolerance (Ma et al. 2015). EgWRKY18 and EgWRKY64 in Elaeis guineensis contribute to improved cold resistance, while the induced expression of EgWRKY07 and EgWRKY52 is beneficial to resistance to salt stress (Xiao et al. 2017). SaWRKY1 is involved in SA- and MeJA-mediated abiotic or biotic response pathways, and its overexpression improves the salt tolerance of transgenic A. thaliana (Yan et al. 2019).
So far, studies of WRKY transcription factors have mainly focused on model plants or horticultural plants that are sensitive to stress. Studies of WRKY transcription factors in stress-tolerant plants will therefore help us to understand the molecular mechanisms by which WRKY transcription factors contribute to stress resistance. Yellowhorn (Xanthoceras sorbifolium) has robust resistance to low temperature, drought, heat, salt and alkali stresses, and it also has high potential for use as a source of edible oil and biofuel because of the high oil content in its seeds. Making use of the recently sequenced yellowhorn genome (Bi et al. 2019), we here identified 65 WRKY genes in the yellowhorn genome and explored their composition, gene structure and gene duplication events. Using transcriptomic data for yellowhorn, we also identified some WRKY genes that respond to treatments such as NaCl, cold and ABA, because WRKY genes have been reported to play a vital role in regulating plant tolerance to NaCl and cold stress and in regulating ABA signaling (Wang et al. 2014; Yan et al. 2014; Zhang et al. 2016).

Genome-wide identification of WRKY genes in yellowhorn
The genomic and protein sequences of yellowhorn and the annotated gene models were obtained from the Yellowhorn Genome Project of the Institute of Beijing Academy of Forestry Sciences (Bi et al. 2019). WRKY family proteins from A. thaliana, O. sativa, Malus domestica, Z. mays, and Glycine max were downloaded from PlantTFDB (https://planttfdb.cbi.pku.edu.cn) (Jin et al. 2017), and BLASTP was used to search for homologous WRKY genes. Hits with an E-value less than 1e-5 were retained for the following analysis. The WRKY domain model (PF03106) was obtained from Pfam (https://pfam.xfam.org/) (El-Gebali et al. 2019), and WRKY genes were subsequently identified among the genes of the yellowhorn genome using HMMER 3.1 (Mistry et al. 2013). Target genes were screened at an e-value < 1.0, and the integrity of the WRKY domain (e-value < 0.1) was further confirmed using the online software SMART (https://smart.embl-heidelberg.de/) (Letunic and Bork 2018) and the National Center for Biotechnology Information CD-Search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) (Marchler-Bauer et al. 2017). Finally, repetitive and incomplete sequences were manually removed, and the confirmed genes were designated as the yellowhorn WRKY gene family. The isoelectric point of each XsWRKY protein was calculated using the ExPASy online website (https://web.expasy.org/protparam/) (Gasteiger et al. 2003). MEGA6 (Tamura et al. 2013) was used for multiple sequence alignment of the WRKY domain. The subcellular localization of XsWRKY proteins was predicted using the online prediction tool MBC (https://cello.life.nctu.edu.tw/) (Yu et al. 2006).
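The screening step described above can be illustrated with a minimal Python sketch. It assumes BLASTP was run with the standard 12-column tabular output (known WRKY proteins as queries, yellowhorn proteins as subjects) and that a candidate is kept only when it is supported by both the BLASTP and the HMMER search; the text does not state how the two hit lists were combined, so the intersection is an assumption, and the function names and gene IDs are hypothetical.

```python
def parse_blast_tabular(lines, max_evalue=1e-5):
    """Collect subject IDs from BLAST tabular lines whose E-value is below the cutoff.

    Assumed column order (standard 12-column tabular output):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue < max_evalue:
            hits.add(subject_id)
    return hits


def candidate_wrky(blast_hits, hmmer_hits):
    """Keep candidates found by both searches (assumption: intersection, not union)."""
    return sorted(set(blast_hits) & set(hmmer_hits))


if __name__ == "__main__":
    # Hypothetical example records, not data from the study.
    example = [
        "AtWRKY1\tgeneA\t55.0\t60\t20\t0\t1\t60\t1\t60\t1e-20\t90",
        "AtWRKY1\tgeneB\t30.0\t40\t20\t0\t1\t40\t1\t40\t0.5\t20",
    ]
    blast_hits = parse_blast_tabular(example)
    print(candidate_wrky(blast_hits, {"geneA", "geneC"}))
```

Candidates that pass this filter would still need the domain-integrity check with SMART and CD-Search and the manual curation described in the text.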

Phylogenetic analyses of the WRKY proteins
A phylogenetic tree was constructed from the WRKY domain amino acid sequences of yellowhorn and A. thaliana using the maximum likelihood (ML) method in MEGA6 (Tamura et al. 2013), with the bootstrap test set to 1,000 replications. The tree was visualized and annotated with iTOL (https://itol.embl.de/) (Letunic and Bork 2019).

Gene structure of WRKY genes
Information on the structure of WRKY genes was extracted from the general feature format file of the yellowhorn genome. The TBtools platform (https://github.com/CJ-Chen/TBtools) was used to display the exon/intron structures, conserved motifs, and WRKY domain of each predicted XsWRKY gene, and MEME (https://meme.nbcr.net/meme/intro.html) (Bailey et al. 2009) was used to identify other conserved motifs in the full-length protein sequences of WRKY family members (motif number set to 20; all other settings left at their defaults). The WRKY domain was annotated with the Pfam database (https://pfam.xfam.org/) (El-Gebali et al. 2019).

Promoter analysis
For this study, the 2-kb sequence upstream of the coding region of each WRKY gene was scanned using PlantCARE (https://bioinformatics.psb.ugent.be/webtools/plantcare/html/) (Lescot et al. 2002) to identify transcription factor binding sites. Motifs related to low temperature, defense responses, hormone responses, drought responses, heat stress, and salt responses were selected for analysis of promoter specificity for abiotic stress response. TBtools was used for visual display.
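Extracting the upstream region depends on the strand of the gene, which is easy to get wrong. The sketch below shows one way this could be done in Python, assuming 1-based inclusive coordinates of the coding region as in a GFF file; the function names are hypothetical and this is not the script used in the study.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(chrom_seq, cds_start, cds_end, strand, length=2000):
    """Return up to `length` bases upstream of the coding region.

    cds_start and cds_end are 1-based inclusive coordinates (GFF convention),
    with cds_start <= cds_end regardless of strand. The region is truncated
    at the ends of the chromosome sequence.
    """
    if strand == "+":
        # Upstream lies at lower coordinates, just before the first coding base.
        begin = max(0, cds_start - 1 - length)
        return chrom_seq[begin:cds_start - 1]
    if strand == "-":
        # Upstream lies at higher coordinates and is read on the reverse strand.
        end = min(len(chrom_seq), cds_end + length)
        return reverse_complement(chrom_seq[cds_end:end])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_chromosome = "AAAACCCCGGGGTTTT"  # toy sequence for illustration only
    print(upstream_region(toy_chromosome, 9, 12, "+", length=4))
    print(upstream_region(toy_chromosome, 5, 8, "-", length=8))
```

The returned sequences are always oriented 5' to 3' relative to the gene, so they can be submitted to a motif scanner without further strand handling.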

Chromosome localization and synteny analysis
BLASTP was used for an all-against-all comparison (e-value: 1e−5) to obtain information on gene pairs. Next, MCScan (https://chibba.agtec.uga.edu/duplication/mcscan) was used with default settings to analyze syntenic blocks and gene duplication events from the gene pairs and gff files. Tandemly duplicated WRKY genes were used to estimate the nonsynonymous (Ka) and synonymous (Ks) substitution rates and to calculate their ratio in KaKs_Calculator 2.0 (Wang et al. 2010a, b). Circos (Krzywinski et al. 2009) was used to draw a circular map of the yellowhorn genome according to the gene position and genome collinearity information in the gff file.
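As a reminder of how a Ka/Ks ratio is conventionally read, the following Python sketch classifies a duplicated gene pair from its Ka and Ks estimates. The interpretation (ratio below 1 for purifying selection, near 1 for neutral evolution, above 1 for positive selection) is the standard convention rather than something stated in this section, and the function names and numbers are purely illustrative.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero or negative (ratio undefined)."""
    if ks <= 0:
        return None
    return ka / ks


def selection_mode(ratio, tolerance=1e-9):
    """Map a Ka/Ks ratio onto the conventional interpretation."""
    if ratio is None:
        return "undefined"
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


if __name__ == "__main__":
    # Hypothetical values, not estimates from the study.
    ratio = ka_ks_ratio(0.1, 0.5)
    print(ratio, selection_mode(ratio))
```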

Plant materials and treatments
Yellowhorn seedlings (4 weeks old) were grown in vermiculite-humus (1:3) soil in a growth chamber at 22-24 °C, 60% relative humidity, and a 12 h/12 h photoperiod for 28 days as the non-stress treatment. Seedlings for stress treatments were grown similarly, with the following differences. For the salt treatment, plants were watered with 150 mM NaCl. For the ABA treatment, 100 μM ABA was sprayed onto the foliage. Cold-stressed plants were grown at 4 °C in a growth chamber. For each treatment, three plants were sampled (biological replicates) with three corresponding untreated controls, and leaves were collected at 3, 6, 9, 12, 24, 48 and 72 h. All leaf samples were immediately frozen in liquid nitrogen and stored at −80 °C until analyzed.

Transcriptomics analysis
The expression patterns of stress-related WRKY genes in yellowhorn after treatment with 150 mM NaCl, 4 °C, or 100 μM ABA for 24 h and 48 h were analyzed using RNA-seq transcriptome data generated in our laboratory (unpublished). The RNACocktail pipeline (Sahraeian et al. 2017), with Salmon for quantification, was used to calculate expression levels as transcripts per million (TPM), and DESeq2 (Love et al. 2014) was used to analyze differential expression of genes between the control and treatment groups and to calculate the fold change (FDR ≤ 0.05, fold change ≥ 2). A heatmap was constructed in TBtools from the XsWRKY expression matrix.
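The two quantities used above can be made concrete with a short Python sketch. It computes TPM from read counts and transcript lengths, and applies the stated thresholds (FDR ≤ 0.05, fold change ≥ 2). Two assumptions are made here: plain transcript lengths stand in for the effective lengths that Salmon uses internally, and the fold-change threshold is applied in both directions (|log2 fold change| ≥ 1), since both up- and downregulated genes are reported. Function names and numbers are illustrative only.

```python
import math


def tpm(counts, lengths):
    """Convert read counts to transcripts per million.

    Each count is first divided by its transcript length (in bases); the
    resulting rates are then scaled so that they sum to one million.
    """
    rates = [count / length for count, length in zip(counts, lengths)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


def is_deg(log2_fold_change, fdr, min_fold=2.0, max_fdr=0.05):
    """Apply the FDR and fold-change thresholds (both directions, by assumption)."""
    return fdr <= max_fdr and abs(log2_fold_change) >= math.log2(min_fold)


if __name__ == "__main__":
    # Hypothetical counts and lengths for two transcripts.
    print(tpm([10, 20], [1000, 1000]))
    print(is_deg(1.5, 0.01), is_deg(0.5, 0.01))
```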

Quantitative RT-PCR analysis
Total RNA of yellowhorn was extracted using the Omin-iPlant RNA kit (DNase I) (CWBIO, Beijing, China). RNA quality and concentration were determined using agarose gel electrophoresis and a DHS NanoPro 2010 spectrophotometer (Ding Haoyuan, Beijing, China). Template cDNA was synthesized using 500 ng of high-quality total RNA and the TaKaRa PrimeScript™ RT Reagent Kit with gDNA Eraser (Perfect Real Time) (TaKaRa, Dalian, China) according to the manufacturer's instructions. Quantitative real-time PCR (qRT-PCR) primers were designed from the CDS sequences of the XsWRKY genes with Primer 5 software (Table S1), and primer quality was tested using PCR amplification, agarose gel electrophoresis, and melting curve analysis. qRT-PCR was performed using a Bio-Rad CFX Connect real-time PCR detection system and UltraSYBR mixture (CWBIO, Beijing, China). The relative expression level of each XsWRKY gene was analyzed by the 2^−ΔΔCt method and normalized to the average transcript amount of the XsACTIN gene (EVM0013888.1). The qRT-PCR experiments were done three times.
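The 2^−ΔΔCt calculation can be written out in a few lines of Python. This is a minimal sketch of the standard formula, with the reference gene standing in for XsACTIN; the function name and the Ct values in the example are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference), computed separately for the treated
           sample and the control sample.
    ddCt = dCt(treated) - dCt(control).
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: the target amplifies two cycles earlier after
    # treatment while the reference gene is unchanged.
    print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

A value above 1 indicates higher expression in the treated sample than in the control, and a value below 1 indicates lower expression.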

Identification of WRKY proteins in yellowhorn
In this systematic analysis to identify WRKY proteins in yellowhorn, 67 potential WRKY proteins were identified using BLAST (Mount 2007) and HMMER (Mistry et al. 2013) for protein domain analysis. Two of these 67 (EVM0005841.1, EVM0022387.1) did not contain the complete amino acid sequence of the WRKY domain and were removed from the list. The remaining 65 WRKY genes in the yellowhorn genome (Table 1) were denoted XsWRKY1 through XsWRKY59 based on their order on the chromosomes, and four proteins not mapped to a chromosome were numbered according to their contig numbers. Genes with identical protein sequences share a number and are distinguished by a suffix; for example, XsWRKY38, XsWRKY38.1, XsWRKY38.2, and XsWRKY38.3 are four genes with the same protein sequence, sorted according to their order on the chromosome.
The length of the XsWRKY proteins ranged from 122 (XsWRKY24) to 759 (XsWRKY46) amino acids, with an average length of 372 residues. The molecular weight (MW) of XsWRKY proteins ranged from 14.81 kDa (XsWRKY24) to 83.18 kDa (XsWRKY46), and the isoelectric point (pI) of the XsWRKY proteins ranged from 4.98 (XsWRKY42) to 9.78 (XsWRKY39), with 35 XsWRKYs having a pI < 7 and the rest having a pI > 7 (Table 1). Transcription factors are usually localized in the nucleus, where they bind to the promoter regions of their target genes. Using the prediction tool, we found that all XsWRKYs were predicted to localize to the nucleus, except for XsWRKY14, which is predicted to localize to the plasma membrane (Table 1).

Phylogenetic analysis of the XsWRKY family
For the phylogenetic analysis of XsWRKY transcription factors, we used a multiple sequence alignment of the 60-aa WRKY domain of the XsWRKY proteins; WRKY domains from each WRKY family and subfamily in A. thaliana, Populus trichocarpa, and Z. mays were randomly selected as representatives and aligned with those of XsWRKY. In yellowhorn, we found that 61 XsWRKY proteins have the fully conserved sequence WRKYGQK, while the other four XsWRKY proteins (XsWRKY15, XsWRKY16, XsWRKY21, XsWRKY36) have WRKYGKK (Fig. 1). Subsequently, in a preliminary classification based on the structure of the XsWRKY proteins, 12 XsWRKY proteins with two WRKY domains and a C2H2-type zinc finger motif (C-X4-C-X22-23-HXH) were classified into group I. Among the remaining 53 XsWRKY proteins, 45 with a C2H2-type zinc finger motif were classified as group II, and eight containing the C2HC-type zinc finger were classified into group III (Fig. 1).
To further classify XsWRKY family members, three WRKY domains of each family and subfamily from A. thaliana, P. trichocarpa, and Z. mays were randomly selected to construct a phylogenetic tree together with the XsWRKY domains (Fig. 2). The phylogenetic tree shows that the WRKY domains of the second group can be divided into five subgroups (IIa-IIe). The C2H2-type zinc finger motif of IIa, IIb, IId, and IIe is C-X5-C-X23-HXH. However, the C2H2-type zinc finger motif of 16 XsWRKY proteins in IIc is C-X4-C-X23-HXH, and that of the remaining two is C-X4-C-X16-HXH (Fig. 2).
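The preliminary, structure-based grouping can be sketched with regular expressions in Python. The patterns follow the general motif definitions given in the Introduction (C2H2: C-X4-5-C-X22-23-H-X1-H; C2HC: C-X7-C-X23-H-X1-C) and the two heptapeptide variants reported here (WRKYGQK and WRKYGKK). Counting heptapeptides as a proxy for the number of WRKY domains is a simplifying assumption, the function names are hypothetical, and the sequences in the example are synthetic; atypical motifs such as C-X4-C-X16-HXH are deliberately not covered.

```python
import re

# Heptapeptide variants reported for XsWRKY proteins.
HEPTAPEPTIDE = re.compile(r"WRKYG[QK]K")
# Zinc finger patterns as defined in the Introduction.
C2H2 = re.compile(r"C.{4,5}C.{22,23}H.H")
C2HC = re.compile(r"C.{7}C.{23}H.C")


def heptapeptide_variants(protein_seq):
    """Return every WRKYGQK/WRKYGKK occurrence, in order of appearance."""
    return HEPTAPEPTIDE.findall(protein_seq)


def zinc_finger_type(protein_seq):
    """Return 'C2H2', 'C2HC', or None if neither pattern is found."""
    if C2H2.search(protein_seq):
        return "C2H2"
    if C2HC.search(protein_seq):
        return "C2HC"
    return None


def preliminary_group(protein_seq):
    """Assign group I, II or III from domain count and zinc finger type."""
    n_domains = len(heptapeptide_variants(protein_seq))
    finger = zinc_finger_type(protein_seq)
    if n_domains == 0 or finger is None:
        return None
    if finger == "C2HC":
        return "III"
    return "I" if n_domains >= 2 else "II"


if __name__ == "__main__":
    # Synthetic toy domains (alanine filler), not real XsWRKY sequences.
    domain_c2h2 = "WRKYGQK" + "A" * 10 + "C" + "A" * 4 + "C" + "A" * 23 + "HAH"
    domain_c2hc = "WRKYGKK" + "A" * 10 + "C" + "A" * 7 + "C" + "A" * 23 + "HAC"
    print(preliminary_group(domain_c2h2 + domain_c2h2))
    print(preliminary_group(domain_c2h2))
    print(preliminary_group(domain_c2hc))
```

Subgroups IIa-IIe cannot be separated this way; as the text explains, they are resolved from the phylogenetic tree.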

Gene structure of yellowhorn WRKY gene family and motif composition of WRKY proteins
To further understand the gene structure and motif composition of the yellowhorn WRKY transcription factors, we analyzed the conserved motifs of the XsWRKY proteins using MEME. Twenty motifs were predicted in XsWRKY, and the details of the 20 conserved motifs are shown in the supplementary file (Fig. 3, Table S2). The motifs are approximately 10 to 50 amino acids long, and each protein contains 3-12 motifs. Proteins in the same group have a similar motif composition, and all XsWRKY proteins contain motifs 1 and 2, indicating that these two motifs are important parts of the WRKY domain. The highly similar motifs 3 and 14 are part of the zinc finger motif domain and, together with motifs 2 and 4, form the C2H2-type zinc finger. Motif 9 belongs to the N-terminal WRKY domain of XsWRKY proteins. Motif 10 is unique to IIa and IIb. Motif 8 is specific to IId, and motif 20 is specific to III (Fig. 3). The motifs in different subfamilies may be related to the specific functions of the family members.
Regarding the gene structure of the XsWRKY family, all XsWRKY genes have 2 to 6 exons (Fig. 3). XsWRKY genes in the same group have a similar gene structure. For example, members of IIa and IIb except for XsWRKY19 contain only phase 0 introns, and all members of group I have N-terminal domains that do not contain introns (Fig. 3).

Distribution and synteny analysis of XsWRKY genes in the genome
Regarding the distribution of XsWRKY genes in the yellowhorn genome, we found that, apart from XsWRKY4.1, 12.1, 38.2, 38.3, 56, 57, 58, and 59 (located on contigs), the XsWRKY genes are unevenly distributed across 14 chromosomes, that is, all chromosomes except chromosome 7 (Fig. 4). Chromosome 1 contains the most WRKY genes. Although most XsWRKY genes are distributed on several chromosomes, there is no positive correlation between the number of XsWRKY genes on a chromosome and its length (Fig. 4).
To understand the expansion of the XsWRKY family, we performed a collinearity analysis of yellowhorn with MCScanX and identified two tandem repeat gene pairs, XsWRKY54/55 and XsWRKY19/20 (Table S3), and 23 segmental duplication events.
To explore the evolutionary mechanisms of the XsWRKY family, we studied the collinear relationships between yellowhorn and four dicotyledons (A. thaliana, G. max, P. trichocarpa, M. domestica) and two monocotyledons (O. sativa and Z. mays) (Fig. 5). The largest number of homologous WRKY gene pairs with XsWRKY genes was found in G. max (51), followed by P. trichocarpa (49) XsWRKY47). Interestingly, a collinear block of O. sativa and Z. mays with yellowhorn contained more than two XsWRKY genes, and the span between any two XsWRKY genes in the block was less than 14 genes. This phenomenon does not exist in the collinear blocks of G. max, P. trichocarpa, and M. domestica with yellowhorn. On the other hand, we counted the XsWRKY genes located between the two ends of XsWRKY genes in the collinear blocks between yellowhorn and G. max (40), P. trichocarpa (32), M. domestica (21), A. thaliana (6), O. sativa (2) and Z. mays (0). There are some unique collinear gene pairs between yellowhorn and the dicotyledons (A. thaliana, G. max, P. trichocarpa, and M. domestica), but there are 10 collinear gene pairs common to yellowhorn and these six species.

Legend of the WRKY domain phylogenetic tree: purple triangles represent yellowhorn; fuchsia stars represent Populus trichocarpa; orange-yellow pentagrams represent Zea mays; and yellow-green stars represent A. thaliana. The yellowhorn WRKY proteins are mainly divided into three subfamilies; for group I proteins, the suffix "N" or "C" denotes the N-terminal or C-terminal WRKY domain. The WRKY domain protein sequences were aligned with ClustalX, and the phylogenetic tree was constructed in MEGA using the maximum likelihood method. Bootstrap values are based on 1000 repetitions

Cis-acting regulatory element analysis of the WRKY promoter
To identify the cis-regulatory elements in the XsWRKY promoters, we extracted the 2-kb sequences upstream of the initiation codons of the XsWRKY genes and analyzed them with the PLACE server (Table S5). Numerous cis-acting elements involved in responses to plant hormones and to biotic and abiotic stress were identified in the promoters of the XsWRKY transcription factors. According to the annotations, the elements belong to eight categories: ABA, JA, GA, SA, light, low temperature, drought, and anaerobic induction. The elements related to light response are the most enriched, e.g., Box 4, ATCT-motif, GT1-motif, GATA-motif, and G-box (Fig. 6). Among the other types of cis-elements, the CGTCA-motif and TGACG-motif are MeJA response elements; the GARE-motif, TATC-motif, and P-box are gibberellin response elements; ARE is involved in anaerobic induction; LTR is related to the plant response to low temperature; TC-rich repeats are involved in defense and stress responses; and the TCA-element is involved in the salicylic acid response.
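Tallying elements by functional category, as done for Fig. 6, can be sketched in Python as follows. The mapping contains only the element names and annotations listed in the paragraph above; the function name, the "other" fallback category, and the example input are assumptions made for illustration.

```python
from collections import Counter

# Element-to-category mapping taken from the annotations listed in the text.
ELEMENT_CATEGORY = {
    "Box 4": "light",
    "ATCT-motif": "light",
    "GT1-motif": "light",
    "GATA-motif": "light",
    "G-box": "light",
    "CGTCA-motif": "MeJA",
    "TGACG-motif": "MeJA",
    "GARE-motif": "gibberellin",
    "TATC-motif": "gibberellin",
    "P-box": "gibberellin",
    "ARE": "anaerobic induction",
    "LTR": "low temperature",
    "TC-rich repeats": "defense and stress",
    "TCA-element": "salicylic acid",
}


def summarize_elements(element_names):
    """Count predicted cis-elements per category; unknown names go to 'other'."""
    counts = Counter()
    for name in element_names:
        counts[ELEMENT_CATEGORY.get(name, "other")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical scan result for one promoter, not data from the study.
    promoter_hits = ["G-box", "LTR", "G-box", "TCA-element", "ARE"]
    for category, n in summarize_elements(promoter_hits).most_common():
        print(category, n)
```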

Expression profile of XsWRKY genes after NaCl or cold stress and ABA exposure
To study the role of the XsWRKY genes in response to various environmental stresses and related signal transduction, we analyzed the transcriptional responses of the XsWRKY genes to salt, cold and ABA treatments (Table S6). After treatment with 150 mM NaCl for 24 h, most of the differentially expressed genes (DEGs) were significantly upregulated (e.g., XsWRKY1, XsWRKY24, XsWRKY4), but XsWRKY6 was significantly downregulated at 24 h and significantly upregulated at 48 h. After 48 h, XsWRKY53, XsWRKY24, XsWRKY27, XsWRKY4, XsWRKY5, XsWRKY1, and other genes were significantly upregulated. At 24 h and 48 h of salt treatment, 10 and 16 genes, respectively, were differentially expressed (Fig. 7). To validate the transcriptomic data, we analyzed some XsWRKY genes by qRT-PCR and found that XsWRKY22 was induced by NaCl; its expression peaked at 6 h and then slowly decreased. The expression levels of XsWRKY6 and XsWRKY33 showed a downward trend at the initial stage of induction but began to rise after 6 h, peaked by 24 h, then decreased by 48 h (Fig. 8). Other studies have shown that AtWRKY46 can regulate stomatal closure and enhance plant salt stress resistance (Ding et al. 2015). In cotton, the constitutive expression of GhWRKY17 increases plant tolerance to drought and salt stress while reducing sensitivity to ABA (Yan et al. 2014; Gu et al. 2018a, b). OsWRKY30 and OsWRKY72 are activated by ROS, and their overexpression makes plants more susceptible to salt stress (Shen et al. 2012).

Fig. 3 Phylogenetic tree, conserved motif distribution and gene structure of the XsWRKY gene family. a Phylogenetic tree constructed using MEGA software from the XsWRKY protein sequences; the proteins mainly clustered into seven categories according to family classification and are labeled with different colors. b Conserved motif patterns of the XsWRKY proteins. Different motifs are represented by different colors and numbers (see key on right). The specific sequence information of each motif is given in Supplemental 1. c XsWRKY gene structure. Green: UTR, yellow: CDS, red: WRKY domain, black lines: introns
In conclusion, XsWRKYs are closely related to the salt stress response in yellowhorn.
Most plants native to cold areas have specific cold resistance. However, the mechanism of action of WRKYs in the cold stress response is not well understood. VaWRKY12 localizes to the nucleus and cytoplasm at normal temperature, but only to the nucleus after cold treatment (Zhang et al. 2019). Overexpression of VaWRKY12 enhances the cold tolerance of A. thaliana and grape callus and significantly increases the expression of antioxidant-related genes. Overexpression of CsWRKY46 increased cold tolerance in cucumber and positively regulated cold signaling pathways in an ABA-dependent manner (Zhang et al. 2016). After 24 h of cold treatment in the present study, the expression of nine XsWRKY genes was significantly upregulated, for example, XsWRKY4, XsWRKY24, XsWRKY35, and XsWRKY46 (Fig. 7), but no downregulated genes were found. After 48 h of cold treatment, some genes were somewhat upregulated (XsWRKY4, XsWRKY46, XsWRKY40), while most of the other DEGs were significantly upregulated, and XsWRKY6 and XsWRKY54 were significantly downregulated. The expression of XsWRKY24 as determined by qRT-PCR decreased first and then peaked at 48 h (Fig. 8). The expression of XsWRKY34 increased continuously after cold induction. The expression of XsWRKY6 decreased significantly after 48 h. The expression of XsWRKY54 decreased after cold stress. Overall, the expression level of the WRKY genes was highest at 6 h and lowest after 48 h (Fig. 7).

Legend of the yellowhorn synteny map: gray lines represent all homologous blocks in the yellowhorn genome; red lines indicate duplicated pairs of XsWRKY genes; and purple represents tandem repeats. The chromosome number follows "Chr"
ABA is closely related to plant response to abiotic stress and plays a crucial role in response to environmental challenges. A. thaliana chloroplast protein ABRR interacts with WRKY transcription factors (WRKY40, WRKY18, and WRKY60) to act as ABA signals. WRKYs can also act as negative regulators. For example, WRKY40 inhibits the expression of the ABA response gene ABI5 (Shang et al. 2010). GhWRKY17 in cotton can respond to drought and salt stress through ABA signaling and regulation of cell ROS production in plants exposed to drought and salt stress (Yan et al. 2014). The expression of XsWRKY55 was significantly upregulated, and the expression of XsWRKY8 was downregulated at 24 h after ABA treatment. The expression of XsWRKY55 in the treated group was four times higher than in the control group. The expression of XsWRKY25 was significantly downregulated after 48 h of ABA treatment. qRT-PCR showed that XsWRKY54 was induced by ABA, and its expression increased gradually in the early stage, but began to decrease after 6 h (Fig. 8). The results suggest that these XsWRKY genes may be involved in ABA signaling.
In conclusion, the expression patterns of XsWRKY genes under NaCl and cold conditions indicated that different XsWRKY genes might be involved in different stress responses. The ABA treatment suggested that XsWRKY genes may respond to different stresses through the ABA signal transduction pathway.
In our evaluation of the conserved domain of the XsWRKY proteins, multiple sequence alignment of the different subfamilies showed that the conservation of the WRKY domain differed among the subfamilies; conservation of the domain in the group I N-terminal domains (IN), IIc and III was weaker than in the other groups. The WRKY domain of four members (XsWRKY15, XsWRKY16, XsWRKY21, XsWRKY36) in the IIc subfamily changed from WRKYGQK to WRKYGKK. Most XsWRKY proteins had a binding preference for their cognate cis-acting element, the W-box. Variation in the WRKYGQK motif in the WRKY domain may affect the interaction of the WRKY proteins with downstream target genes, and this mutation also exists in other species such as pineapple (Xie et al. 2018), Caragana intermedia (Wan et al. 2018), and grape (Romero et al. 2019). These four proteins therefore merit further study of their function and binding specificity. Changes in the WRKY domain provide the impetus for the expansion of the WRKY gene family (Xie et al. 2018). The XsWRKY proteins in group I in yellowhorn have two WRKY domains, and those in groups II and III have only one, indicating that XsWRKY proteins evolved different characteristics. Subfamilies II and III apparently arose from the N-terminal deletion in subfamily I, and the C-terminal domain in subfamily I members is more highly conserved than the N-terminal domain (Zhang and Wang 2005). Consistent with results of previous studies (Zhang and Wang 2005), the WRKY domains of subfamilies II and III contain introns consistent with the C-terminal WRKY domain of subfamily I. Subfamily I has the most primitive WRKY gene type, which gave rise to group II through the loss or differentiation of a WRKY domain. Group III may have evolved through a variation in the C-terminal zinc finger. The members of each group have the same or similar intron phase composition, which also supports this evolutionary relationship within the WRKY transcription factor family.

Fig. 6 Phylogenetic tree and distribution of cis-regulatory elements of the WRKY genes in yellowhorn. The cis-regulatory elements were identified in 65 XsWRKY genes using the PLACE server. A Python script was used to extract the 2-kb region upstream of the translation initiation codon of each gene, which is considered to be the proximal promoter sequence; elements are shown according to their position. a Phylogenetic tree of the WRKY gene family. b Distribution of cis-regulatory elements in the promoter regions of the XsWRKY gene family
In the analysis of the XsWRKY genes in the yellowhorn genome, in addition to the 2 tandem duplications and 23 segmental duplications, XsWRKY pairs located on highly similar contigs and chromosomes may be potential tandem or segmental duplicates. In total, 25 duplication events involving 33 XsWRKY genes were identified. These results indicate that some XsWRKY genes were generated by gene duplication and that segmental duplication events were the main driving force of XsWRKY evolution. In addition, in the collinear blocks between yellowhorn and the other six plants analyzed, some XsWRKY gene pairs were spaced 100 or even 200 genes apart in the other species, which may be due to genetic differences between yellowhorn and the other species, genome doubling events, transposon jumping and other factors.
Compared to the genomes of dicotyledons such as M. domestica (Meng et al. 2016), A. thaliana (Dong  (Tuskan et al. 2006), yellowhorn has a relatively small number of WRKY genes. Segmental and tandem duplication events play an important role in the expansion of the XsWRKY gene family, while whole-genome duplication events play a key role in the expansion of gene families. Yellowhorn shares the γ whole-genome duplication event with other dicots (Murat et al. 2017), but based on the distribution of WRKY genes on the chromosomes and the collinearity analysis with other species, the lower number of XsWRKY genes in yellowhorn may be due to genomic structural variation. The collinear XsWRKY gene pairs between the six species and yellowhorn appear simultaneously in multiple species, which may indicate that these homologous genes play an important role in these plants.

Fig. 8 Expression profiles over time of the eight selected XsWRKY genes in response to various abiotic stress treatments. The data were normalized to the ACTIN gene, and vertical bars represent the standard deviation
According to the evolutionary hypothesis of the WRKY transcription factors, the most primitive WRKY proteins contained only a C-terminal WRKY domain; group I was then generated by domain duplication, and group II arose when group I members lost the N-terminal WRKY domain during evolution (Zhang and Wang 2005). Group III WRKY genes are derived from group II. Yellowhorn has fewer group III XsWRKY genes (8) than most other dicotyledons, consistent with it having fewer members of the WRKY family. This may be related to the fact that yellowhorn is the only species in its genus and has not been subjected to lineage-specific duplication.
Yellowhorn is usually grown on impoverished land with scarce water and is resistant to various stresses, including drought, cold, salt, and alkali. It is a pioneer species for greening. Based on the biological functions of WRKY transcription factors in other species, it is reasonable to infer that XsWRKY transcription factors are also involved in physiological processes and stress responses. Combined with public transcriptomic data and phylogenetic analysis, our results suggest that XsWRKY proteins have biological functions similar to those of WRKY proteins in other species. For example, our transcriptomic data showed that the expression of 18 XsWRKY genes was significantly induced under salt stress.
Because AtWRKY33 in A. thaliana responds to salt stress, we selected its homolog in yellowhorn, XsWRKY34, for further study. Consistent with this, the qRT-PCR analysis indicated that expression of XsWRKY34 was strongly induced by 150 mM NaCl. The qRT-PCR results also showed that some XsWRKY genes were upregulated or downregulated at around 6 h. This time point may indicate a new mechanism by which XsWRKY genes participate in conferring resistance to stress in yellowhorn. These results indicate that XsWRKY proteins play a vital role in the response to salt stress.
Yellowhorn is mainly distributed in northern China and is resistant to cold. WRKY transcription factors are a crucial component of the cold stress regulation network. In O. sativa, OsWRKY76 (Yokotani et al. 2013) is induced by low temperature. Yellowhorn XsWRKY20 was also induced by cold stress, indicating that XsWRKY20 has a potential role in the cold response of yellowhorn (Fig. 5). A growing number of studies have shown that WRKY genes are regulated by hormones such as ABA, SA, and JA and play a crucial role in the plant hormone signal transduction network. In the present study, low temperature induced XsWRKY20, and its ortholog OsWRKY76 in rice is likewise induced by low temperature, suggesting that they have similar functions in cold tolerance and the ABA signaling network. Consistent with previous studies in other species, the differential expression of some XsWRKY genes after various abiotic stresses and hormone treatments highlights the extensive involvement of WRKY genes in environmental adaptation.
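The induction levels discussed in this section come from qRT-PCR data normalized to the ACTIN reference gene. A minimal sketch of how such a relative expression value is commonly computed is given below, assuming the widely used 2^-ΔΔCt approach; this section does not state which quantification method was applied, so the method, the function name, and the Ct values are illustrative assumptions and not data from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt approach (assumed method).

    Each argument is a threshold cycle (Ct) value; the reference gene
    (for example ACTIN) is used to normalize the target gene.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the untreated control.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values for illustration only.
fold_change = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(fold_change)  # 8.0
```

A value above 1 corresponds to upregulation relative to the control and a value below 1 to downregulation. The calculation assumes roughly equal amplification efficiency for the target and reference genes.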
Collectively, our findings provide new insights into the potential functions of XsWRKY genes in yellowhorn. This comprehensive analysis will help in selecting candidate WRKY genes for further functional identification and genetic improvement of forest species.

Conclusion
In this analysis of 65 full-length WRKY genes identified from yellowhorn, the XsWRKY transcription factors represented seven subfamilies with high similarity in motif composition and exon-intron structure. Homology analysis and phylogenetic comparison of WRKY genes from several different plant species provided valuable clues on duplication events during the evolution of the WRKY genes. The response of XsWRKY genes is vital to the high tolerance of yellowhorn to stresses and thus to its growth and development. These valuable insights into the function of the WRKY transcription factors highlight their potential for breeding programs to increase tolerance to poor environments.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.