139:973

WRKY gene family evolution in Arabidopsis thaliana

  • Qishan Wang
  • Minghui Wang
  • Xiangzhe Zhang
  • Boji Hao
  • S. K. Kaushik
  • Yuchun Pan


Arabidopsis thaliana WRKY proteins are characterized by a 60-amino-acid sequence that includes the WRKY domain. These proteins are well-established regulators of plant-specific physiological programs, including pathogen defense, senescence and responses to environmental stress, which raises the question of how the family evolved. We addressed this question by examining the duplications of these genes that led to their functional diversification. The WRKY sequences available for Arabidopsis thaliana were used to evaluate selection pressure following duplication events. A phylogenetic tree was constructed and the WRKY family was divided into five sub-families. Tests were then conducted to determine whether positive or purifying selection played a key role in these events. Our results suggest that purifying selection played a major role during the evolution of this family. Some amino acid changes were also detected in specific branches of the phylogeny, suggesting that relaxed constraints might also have contributed to functional divergence among sub-families. Sites relaxed from purifying selection were identified and mapped onto the structural and functional regions of the WRKY1 protein. These analyses will enhance our understanding of the precise role played by natural selection in creating functional diversity in the WRKY family.
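The abstract does not spell out how the selection tests were run. The supplementary material mentions likelihood ratio tests (LRTs), so the following is a minimal sketch of the generic LRT calculation, assuming two nested models whose maximized log-likelihoods are already known. The function names and the numeric log-likelihoods are hypothetical and are not taken from the paper.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if stat <= 0:
        return 1.0
    x = stat / 2.0
    # Base cases of the regularized upper incomplete gamma function Q(a, x):
    # Q(1/2, x) = erfc(sqrt(x)) for odd df, Q(1, x) = exp(-x) for even df.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(x))
    else:
        a, q = 1.0, math.exp(-x)
    # Recurrence: Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (statistic, p-value) for a nested-model likelihood ratio test.

    lnl_null and lnl_alt are maximized log-likelihoods; df is the number of
    extra free parameters in the alternative model.
    """
    # The statistic is twice the log-likelihood gain; clamp tiny negative
    # values that can arise from imperfect numerical optimization.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-997.0, df=1)
    print(f"2*deltaLnL = {stat:.3f}, p = {p:.4f}")
```

A small p-value would indicate that the richer model (for example, one that lets selection pressure differ between sub-families) fits significantly better than the constrained one. The test alone does not say whether the difference reflects positive selection or relaxed constraint.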


Keywords: WRKY proteins, Relaxed selection, Functional divergence

Supplementary material

10709_2011_9599_MOESM1_ESM.xls (27 kb)
The set of 66 genes used for the analysis (XLS 27 kb)
10709_2011_9599_MOESM2_ESM.jpg (651 kb)
(JPG 651 kb)
10709_2011_9599_MOESM3_ESM.doc (34 kb)
LRTs results to detect heterogeneous selection regimes among groups for each gene (DOC 33 kb)
10709_2011_9599_MOESM4_ESM.xls (52 kb)
Detection of gene conversion events (XLS 52 kb)



Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  • Qishan Wang (1)
  • Minghui Wang (1)
  • Xiangzhe Zhang (1)
  • Boji Hao (1)
  • S. K. Kaushik (2)
  • Yuchun Pan (1)

  1. School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, People’s Republic of China
  2. Central Potato Research Institute Campus, Meerut, India
