Crystal structures of N-terminal WRKY transcription factors and DNA complexes
Plant-specific WRKY transcription factors (TFs) are among the largest families of TFs in higher plants; they are also found in the unicellular eukaryote Giardia lamblia and the slime mold Dictyostelium discoideum (Ulker and Somssich, 2004), but not in animals. WRKY TFs participate in diverse developmental and physiological processes in plants, such as disease resistance, abiotic stress responses, senescence, seed and trichome development, as well as additional developmental and hormone-controlled processes (Agarwal et al., 2011).
There are 75 WRKY family members identified so far in Arabidopsis and more than 100 in rice (UniProt: http://www.uniprot.org/). The WRKY TFs are named after their DNA-binding domains (DBDs), the WRKY domains, which span approximately 60 conserved amino acids and are required for W-box (5′-TTGAC-C/T-3′) DNA recognition (Eulgem et al., 2000). The WRKY domain contains a highly conserved WRKYGQK motif (forming a β strand) near the N-terminus and a zinc-finger motif at the C-terminus, featuring an atypical C2H2 (CX4–5CX22–23HX1H) or C2HC (CX7CX23HXC) type (Eulgem et al., 2000). The zinc-finger structure is indispensable for DNA binding by WRKY TFs: any substitution of a conserved cysteine or histidine residue eliminates the protein-DNA interaction (Duan et al., 2007). Based on the number of WRKY domains and the zinc-finger structure, WRKY TFs are classified into groups I to III, and each group is further divided into subgroups (Brand et al., 2013). Group I WRKY genes are defined by the presence of two WRKY domains, whereas groups II and III contain only a single WRKY domain (Eulgem et al., 2000). For the two-domain group I WRKY TFs, we refer to the two WRKY domains as WRKY-N and WRKY-C, respectively. Previous experiments indicated that specific binding to the W-box is mediated mainly by the C-terminal WRKY domain, whereas the N-terminal WRKY domain showed weaker binding (Brand et al., 2013) or even no binding to the W-box (Ishiguro and Nakamura, 1994; de Pater et al., 1996; Eulgem et al., 1999). However, recent yeast one-hybrid studies demonstrated that both WRKY domains of AtWRKY1 (Arabidopsis WRKY1 protein) are essential for its transcriptional activity (Qiao et al., 2016). We have previously determined the crystal structure of apo AtWRKY1-C (Arabidopsis WRKY1-C), which comprises a five-stranded β-sheet (Duan et al., 2007).
The solution structures of apo AtWRKY4-C (Arabidopsis WRKY4-C) and of its complex with a W-box DNA were solved by NMR (Yamasaki et al., 2005; Yamasaki et al., 2012). Recently, the crystal structure of dimeric rice WRKY45-DBD, a group III (C2HC zinc-finger) WRKY TF, in complex with two W-box DNAs was reported (Cheng et al., 2019). To date, there is no structural information for any N-terminal WRKY domain.
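The sequence signatures quoted above (the WRKYGQK motif followed by a C2H2 or C2HC zinc finger) can be written as regular expressions. The following is only a minimal sketch: the function name `classify_zinc_finger` is hypothetical, the spacings are transcribed literally from the patterns given in the text, and real WRKY domains may deviate from them.

```python
import re

# Patterns transcribed from the text; "X" (any residue) becomes "." in regex syntax.
WRKY_MOTIF = "WRKYGQK"
C2H2 = re.compile(r"C.{4,5}C.{22,23}H.H")  # CX4-5CX22-23HX1H
C2HC = re.compile(r"C.{7}C.{23}H.C")       # CX7CX23HXC


def classify_zinc_finger(domain: str) -> str:
    """Return 'C2H2', 'C2HC' or 'none' for the region after the WRKYGQK motif.

    Hypothetical helper for illustration; it only checks the literal spacing
    patterns and says nothing about whether a domain actually binds zinc or DNA.
    """
    domain = domain.upper()
    start = domain.find(WRKY_MOTIF)
    if start < 0:
        return "none"
    tail = domain[start + len(WRKY_MOTIF):]
    if C2H2.search(tail):
        return "C2H2"
    if C2HC.search(tail):
        return "C2HC"
    return "none"


# Synthetic placeholder sequences (alanine filler), NOT real WRKY domains.
synthetic_c2h2 = "WRKYGQK" + "A" * 5 + "C" + "A" * 4 + "C" + "A" * 22 + "HAH"
synthetic_c2hc = "WRKYGQK" + "A" * 5 + "C" + "A" * 7 + "C" + "A" * 23 + "HAC"
print(classify_zinc_finger(synthetic_c2h2), classify_zinc_finger(synthetic_c2hc))
```

In this sketch the zinc-finger type separates group III (C2HC) from groups I and II; counting WRKYGQK occurrences would additionally be needed to distinguish the two-domain group I proteins.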
DNA recognition was accomplished by seven base-specific interactions (Watson strand T5, T6, T7 and Crick strand G6’, G7’, T8’, C9’) and by nonspecific interactions with the phosphate backbone (Fig. 2F). To provide a unified description for all WRKY domains with different numbering, we renumbered the highly conserved WRKYGQK sequence as W1R2K3Y4G5Q6K7. Base recognition on the Watson strand occurs mainly through hydrophobic interactions, including: (a) R2 and K3 to T5, (b) K3 and Y4 to T6, (c) K3, G5 and Q6 to T7 (Fig. 2F). In addition, the side chains of R2 and K3 contact the phosphate backbone of C4, T5 and T6 through nonspecific salt bridges and hydrogen bonds (Fig. 2F). On the Crick strand, more interactions are involved in DNA recognition. The hydrophobic interactions include: (a) G5, Y4, K7 and Y133 to T8’, (b) Y4 and G5 to C9’. There are also base-specific H-bonds: (a) the amino group of K7 forms hydrogen bonds with the O6 and N7 atoms of G6’ and G7’, (b) the main-chain carbonyl oxygen of Y4 forms a hydrogen bond with the N4 atom of C9’ (Fig. 2F). Meanwhile, H-bonds and electrostatic interactions between the protein and the DNA phosphate groups strengthen the binding preference for the sequence -G6’G7’T8’C9’- of the Crick strand. These H-bonds include: (a) the guanidyl group of R131 and the hydroxyl group of Y133 to G7’, (b) the hydroxyl group of Y4 and the side chain of K144 to T8’, (c) the guanidyl group of R135 to C9’. Electrostatic interactions are found between Arg or Lys residues and DNA phosphate groups, including R131 to G7’, K144 to T8’ and R135 to C9’ (Fig. 2F).
We also probed the contacts described above by base substitution and evaluated the binding affinities by ITC measurements (Fig. 2G). Substituting G7’C9’ with A7’A9’ weakened binding substantially: the KD between AtWRKY1-N and the A7’A9’ variant is 7.6 µmol/L, whereas the KD between AtWRKY1-N and the original W-box is 0.1 µmol/L, a 76-fold decrease in affinity (Fig. 2G and Table S3), showing the importance of these two bases for specific recognition. In addition, the substitution of T8’ with A8’ decreased the affinity by 12-fold (Fig. 2G and Table S3). However, the substitution of G6’ with T6’ decreased the binding affinity by only 2.5-fold (Fig. S2A and Table S3). Subsequently, we mutated T6 and T7 of the Watson strand to C6 and G7, which decreased the affinity by only 4.5-fold (Fig. 2G and Table S3). Therefore, the TT bases of the Watson strand are not as important to AtWRKY1-N as to AtWRKY4-C (Yamasaki et al., 2012) and OsWRKY45-DBD (Cheng et al., 2019). These results demonstrate that the DNA-protein interaction of AtWRKY1-N is concentrated mainly on the Crick strand, particularly around the sequence “-G7’T8’C9’-”. Substitution of bases outside “-G7’T8’C9’-” impairs binding only slightly (Table S3). Previous work also revealed that the DBDs of AtWRKY11 and AtWRKY50 bind to an invariant ‘GAC’ core consensus (reading from the Watson strand) (Brand et al., 2013), consistent with our results.
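To put the fold changes above on an energy scale, a fold change in KD can be converted into an approximate change in binding free energy via ΔΔG = RT ln(KD,variant / KD,reference). The sketch below uses only the KD values and fold changes quoted in the text; the temperature of 298.15 K is an assumption (the ITC temperature is not stated here), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change(kd_variant_um: float, kd_reference_um: float) -> float:
    """Ratio of dissociation constants (both in the same unit, here umol/L)."""
    return kd_variant_um / kd_reference_um


def ddg_kcal(fold: float, temp_k: float = 298.15) -> float:
    """Approximate binding free-energy penalty; temp_k is an assumed value."""
    return R_KCAL * temp_k * math.log(fold)


KD_WBOX_UM = 0.1  # AtWRKY1-N with the original W-box, from the text

# Fold changes as reported in the text; only the first has both KDs quoted.
folds = {
    "G7'C9' -> A7'A9'": fold_change(7.6, KD_WBOX_UM),
    "T8' -> A8'": 12.0,
    "T6T7 -> C6G7": 4.5,
    "G6' -> T6'": 2.5,
}

for name, fold in folds.items():
    print(f"{name}: {fold:.1f}-fold, ddG ~ {ddg_kcal(fold):.2f} kcal/mol")
```

Under these assumptions the G7’C9’ double substitution costs roughly 2.6 kcal/mol, whereas the Watson-strand TT substitution costs under 1 kcal/mol, which matches the ranking drawn from the ITC data.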
Next, we investigated the residues of AtWRKY1-N involved in DNA binding by site-directed mutagenesis and electrophoretic mobility shift assay (EMSA). The R117A (R2) and K118A (K3) mutants, which affect residues interacting with the TT bases of the Watson strand, could still bind to DNA (Fig. S3A and S3B), whereas the K416A (K3) mutation in AtWRKY4-C eliminated its DNA-binding activity (Yamasaki et al., 2012). The mutants Q121A (Q6), K122A (K7), Y133A, R135A and K144A showed no apparent shifted band and thus appeared not to bind DNA (Fig. S3D, S3E and S3G–I), most notably K122A (K7). This agrees with Y119 (Y4), Q121 (Q6), K122 (K7), Y133, R135 and K144 being in direct contact with G7’, T8’, C9’ and T7 (Fig. 2F). The Y119A mutant still bound to DNA (Fig. S3C), presumably because Y119 (Y4) forms its hydrogen bond with base C9’ via the main-chain oxygen atom (Fig. 2E), a contact that does not depend on the tyrosine side chain. These results are consistent with the complex structures described above and with the ITC results.
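The unified W1R2K3Y4G5Q6K7 numbering used with these mutants can be made explicit with a small lookup. This is a minimal sketch: the W1 positions are inferred solely from the residue labels quoted in the text (R117 = R2 in AtWRKY1-N, K416 = K3 in AtWRKY4-C), and the names `W1_POSITION` and `to_unified` are hypothetical.

```python
from typing import Optional

MOTIF = "WRKYGQK"

# Sequence number of W1 in each domain, inferred from labels quoted in the text.
W1_POSITION = {
    "AtWRKY1-N": 116,  # R117 = R2, K118 = K3, Y119 = Y4, Q121 = Q6, K122 = K7
    "AtWRKY4-C": 414,  # K416 = K3
}


def to_unified(domain: str, residue: str) -> Optional[str]:
    """Map a residue label such as 'K122' to its motif label such as 'K7'.

    Returns None when the residue lies outside the WRKYGQK motif or when the
    amino-acid letter does not match the motif at that position.
    """
    aa, number = residue[0], int(residue[1:])
    index = number - W1_POSITION[domain]  # 0-based position within the motif
    if 0 <= index < len(MOTIF) and MOTIF[index] == aa:
        return f"{aa}{index + 1}"
    return None


for label in ("R117", "K118", "Y119", "Q121", "K122", "Y133"):
    print(label, "->", to_unified("AtWRKY1-N", label))
print("K416 ->", to_unified("AtWRKY4-C", "K416"))
```

Residues such as Y133, R135 and K144 fall outside the heptapeptide and therefore keep their native numbering, as in the text.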
The classical model of a transcription factor searching for its specific site presumes that a positively charged DBD first binds to dsDNA non-specifically and then slides along the DNA in one dimension to find the specific site (Berg et al., 1981). In our case, the residues involved in non-specific contacts with the phosphate groups appear to allow the protein to approach the DNA major groove non-specifically, while K7 contributes to searching for the optimal specific binding site. However, the dynamic process cannot be inferred from the static picture of our crystal structures. Residue K7 is absolutely conserved in all WRKY proteins. We thus propose that K7 is the key amino acid for all WRKY domains to search for and bind to dsDNA specifically. In our three complex structures, K7 interacts with G6’ and G7’ at slightly different but comparable distances (Fig. S4A–C). To understand the role of K7 in different WRKY domains, we mutated it to Ala, Gln and Arg. Only one mutant, K284R of AtWRKY2-N, still produced a faint shifted band with DNA; all other mutants completely lost DNA-binding ability (Fig. S4D–F).
Altogether, we have shown that the N-terminal group I WRKY domains bind to W-box DNA as well as (if not better than) the C-terminal WRKY domains, but with a quite different binding mode (more extensive interaction with the Crick strand and with the ‘GAC’ core sequence). Furthermore, the EMSA and ITC results show that AtWRKY1(101–339) (residues 101–339, comprising both WRKY domains) can bind to two W-box DNAs at the same time (Fig. 2H–I). The KD between AtWRKY1(101–339) and W-box DNA is 0.5 µmol/L with two DNA binding sites (Fig. 2I). The presence of two binding sites on group I WRKY proteins suggests that group I WRKY TFs can interact with and recruit more DNA partners than implied by the previous view of a single WRKY domain binding to one W-box DNA (Fig. S5).
WRKY TFs bind to specific DNA sites in the promoters of target genes to regulate their expression. However, all WRKY TFs bind to the same W-box sequence, raising the question of how specificity is achieved and differentiated between different promoters and WRKY TFs. The differences in their binding site preferences were suggested to depend partly on flanking sequences outside the TTGACY core motif (Ciolkowski et al., 2008). Our study also shows that the interaction of the N-terminal WRKY domain with the W-box is concentrated on a conserved ‘-G’T’C’-’ consensus on the Crick strand (Figs. 2F, 2G and S2A). This indicates some diversity in the binding sequences, since many more sites should match the three-base ‘-G’T’C’-’ consensus (‘GAC’ reading from the Watson strand) than the full W-box. A WRKY gene from Tamarix hispida, ThWRKY4, could bind to two other motifs: a W-box-like sequence (GTCTA) and the RAV1A element (CAACA) (Xu et al., 2017). The former contains the invariant ‘GTC’ motif, whereas the latter is a novel sequence. These studies suggest, in agreement with our results, that WRKY TFs not only recognize the conventional W-box (TTGACC) but can also bind to other DNA sequences.
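The point that a three-base ‘GAC’ core matches many more sites than the full TTGACY W-box can be illustrated by scanning both strands of a sequence for each pattern. This is a minimal sketch with hypothetical helper names; the test sequence is a synthetic placeholder, not a real promoter, and a sequence match does not by itself imply binding.

```python
import re

W_BOX = "TTGAC[CT]"  # W-box consensus 5'-TTGAC-C/T-3'
CORE = "GAC"         # invariant core, read on the Watson strand

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, pattern: str) -> list:
    """Return (strand, start) pairs; start is 0-based on the input strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in re.finditer(f"(?=({pattern}))", s):
            start, length = m.start(1), len(m.group(1))
            if strand == "-":
                # Convert reverse-strand coordinates back to the input strand.
                start = len(seq) - (start + length)
            hits.append((strand, start))
    return hits


# Synthetic placeholder sequence, NOT a real promoter.
promoter = "AATTGACCAAGTCGG"
print("W-box sites:", find_sites(promoter, W_BOX))
print("GAC core sites:", find_sites(promoter, CORE))
```

In this toy sequence the full W-box occurs once, while the core occurs on both strands; on real promoters the gap would be expected to be much larger simply because the core is shorter.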
This work was supported by grants 31670740 and 31270803 from NSFC (the National Natural Science Foundation of China). We thank the Shanghai Synchrotron Radiation Facility (SSRF) for providing us with opportunities to test the crystals and to collect datasets on the BL19U beamline. We thank the KEK Photon Factory and staff members for their assistance in data collection. We thank the National Center for Protein Sciences at Peking University (Beijing) for providing experimental equipment.
X.-D.S. conceived the project. Y.X. and H.X. performed gene construction, protein expression and purification, and crystal screening and optimization. Y.X. and B.W. performed the collection of X-ray diffraction data and structure determination. Y.X. performed the EMSA assays, ITC assays and structure analysis. X.-D.S. and Y.X. wrote the manuscript.
Yong-ping Xu, Hua Xu, Bo Wang and Xiao-Dong Su declare that they have no conflict of interest.
This article does not contain any studies with human or animal subjects performed by any of the authors.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.