Abstract
Plants undergo extensive changes in gene regulation during abiotic stress, and knowing which genes are affected during the stress response is of great agricultural importance. The genome sequences of a number of plant species have been determined, among them Arabidopsis and Oryza sativa, whose genomes are the most completely annotated to date and which are widely used as experimental systems. This paper applies a statistical algorithm that predicts new stress-induced motifs and genes by analyzing promoter sets co-regulated by abiotic stress in these two species. After identifying characteristic putative regulatory motif sequence pairs (dyads) in the promoters of 125 stress-regulated Arabidopsis genes and 87 O. sativa genes, we used these dyads to screen the entire Arabidopsis and O. sativa promoteromes for related stress-induced genes whose promoters contain a large number of the dyads found by our algorithm. We were able to predict a number of putative dyads that are characteristic of a large number of stress-regulated genes; some of them were newly discovered by our algorithm and serve as putative transcription factor binding sites. Our new motif prediction algorithm is implemented as a stand-alone program and may be used for motif discovery in other species in the future. The more than 1,200 Arabidopsis and 1,700 Oryza sativa genes found by our algorithm are good candidates for further experimental studies of abiotic stress.
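The general idea of enumerating spaced dyads in a stress promoter set and then screening other promoters for them can be illustrated with a minimal Python sketch. This is not the authors' stand-alone program: the function names, the word and spacer lengths, and the enrichment score (stress frequency divided by the sum of stress and background frequencies) are all assumptions chosen for illustration only.

```python
from collections import Counter


def find_dyads(promoter, head_len=3, tail_len=3, min_spacer=0, max_spacer=4):
    """Return the set of (head, spacer_length, tail) dyads present in one promoter.

    Word lengths and spacer range are illustrative assumptions.
    """
    seq = promoter.upper()
    dyads = set()
    for spacer in range(min_spacer, max_spacer + 1):
        span = head_len + spacer + tail_len
        for i in range(len(seq) - span + 1):
            head = seq[i:i + head_len]
            tail = seq[i + head_len + spacer:i + span]
            # Skip windows containing ambiguous bases such as N.
            if set(head + tail) <= set("ACGT"):
                dyads.add((head, spacer, tail))
    return dyads


def dyad_promoter_counts(promoters):
    """Count, for each dyad, how many promoters contain it at least once."""
    counts = Counter()
    for promoter in promoters:
        counts.update(find_dyads(promoter))
    return counts


def select_dyads(stress, background, min_occurrence=5, min_score=0.6):
    """Keep dyads that are frequent in the stress set and rare in the background.

    The score used here is a hypothetical stand-in, not the score from the paper.
    """
    stress_counts = dyad_promoter_counts(stress)
    background_counts = dyad_promoter_counts(background)
    selected = {}
    for dyad, n_stress in stress_counts.items():
        if n_stress < min_occurrence:
            continue
        f_stress = n_stress / len(stress)
        f_background = background_counts.get(dyad, 0) / len(background)
        score = f_stress / (f_stress + f_background)
        if score >= min_score:
            selected[dyad] = score
    return selected


def count_selected_dyads(promoter, selected):
    """Number of distinct selected dyads found in a promoter (the screening step)."""
    return len(find_dyads(promoter) & set(selected))
```

In a screen of this kind, every promoter in the promoterome would be passed to `count_selected_dyads`, and promoters with a high count would be reported as candidates.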
Abbreviations
- ABA: Abscisic acid
- AUC: Area under curve
- PLACE: Plant cis-acting regulatory DNA elements
- REP: Regulatory element pair
- rev comp: Reverse complement
- ROC: Receiver operating characteristic
- TC: Tentative consensus
- TFBS: Transcription factor binding site
- TIGR: The Institute for Genomic Research
Acknowledgments
The first author would like to note that the underlying idea of analyzing the occurrence distributions of putative transcription factor binding site dyads originated from his college thesis work, which was done at the Agricultural Biotechnology Center in Gödöllő, Hungary. This work was funded by the OTKA T046495 grant from the Hungarian National Science Foundation and the Bio-140-KPI grant from the National Office for Research and Technology (NKTH), as well as by grant number 4-065-2004. We would also like to thank William Gruissem for supplying us with data from the Genevestigator Arabidopsis database. The authors thank Maria Sečenji and Krisztina Talpas for their assistance in setting up and carrying out the greenhouse experiments and the QT-PCR work.
Additional information
Communicated by Y. Van de Peer.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Online resource 1: Excel data file for Arabidopsis (SupplementaryArabidopsisData.xls). This file contains all the data produced by our analysis for Arabidopsis. This includes information on the genes whose promoters were used in the analysis, the distribution of the TRANSFAC and PLACE motifs, the dyads found in the stress learning set with a minimum occurrence of 5 and a minimum score of 0.6, and the 81 optimum dyads. Data from the ROC analysis, the promoterome search, and the regulatory network analysis are also included.
Online resource 2: Excel data file for Oryza sativa (SupplementaryRiceData.xls). This file contains all the data produced by our analysis for Oryza sativa. This includes information on the genes whose promoters were used in the stress and non-stress promoter sets, the dyads found in the stress learning set with a minimum occurrence of 5 and a minimum score of 0.5, and the 38 optimum dyads. Data from the ROC analysis, the promoterome search, and the GO term analysis are also included.
438_2011_605_MOESM1_ESM.tif
Supplemental Figure 1: The optimum AUC score in Arabidopsis was calculated using the top 81 dyads found in the two learning promoter sets. These dyads were searched for in the tuning promoter set under a range of parameter settings (minimum dyad score, spacer wobbling, minimum occurrence of the dyad in the stress learning set). In Arabidopsis, the optimum parameter combination is a minimum dyad score of 0.9, a spacer wobble of 2 bp, and a minimum occurrence of 14. The different colors denote different AUC value ranges. (TIFF 168 kb)
438_2011_605_MOESM2_ESM.tif
Supplemental Figure 2: The optimum AUC score in Oryza sativa was calculated using the top 38 dyads found in the two learning promoter sets. These dyads were searched for in the tuning promoter set under a range of parameter settings (minimum dyad score, spacer wobbling, minimum occurrence of the dyad in the stress learning set). In Oryza sativa, the optimum parameter combination is a minimum dyad score of 0.89, a spacer wobble of 0 bp, and a minimum occurrence of 9. (TIFF 126 kb)
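As a rough illustration of the kind of parameter tuning these two figures summarize, the sketch below computes an AUC from per-promoter scores and picks the parameter combination with the highest AUC. It is an assumption-laden sketch, not the procedure used in the paper: `auc`, `best_parameters`, and the `score_fn` callback are hypothetical names, and the AUC is computed by direct pairwise comparison (equivalent to the Mann-Whitney statistic) rather than by tracing a ROC curve.

```python
def auc(stress_scores, background_scores):
    """Probability that a random stress promoter outscores a random
    background promoter, with ties counted as one half."""
    wins = 0.0
    for s in stress_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(stress_scores) * len(background_scores))


def best_parameters(grid, score_fn, stress, background):
    """Return (params, auc_value) for the grid entry with the highest AUC.

    grid     -- iterable of parameter settings (hypothetical, e.g. tuples of
                minimum dyad score, spacer wobble, minimum occurrence)
    score_fn -- hypothetical callback: score_fn(promoter, params) -> number
    """
    best = None
    for params in grid:
        value = auc(
            [score_fn(p, params) for p in stress],
            [score_fn(p, params) for p in background],
        )
        # Keep the first setting that reaches the highest AUC seen so far.
        if best is None or value > best[1]:
            best = (params, value)
    return best
```

An AUC of 0.5 corresponds to no separation between the stress and non-stress tuning sets, and 1.0 to perfect separation.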
Cite this article
Cserháti, M., Turóczy, Z., Zombori, Z. et al. Prediction of new abiotic stress genes in Arabidopsis thaliana and Oryza sativa according to enumeration-based statistical analysis. Mol Genet Genomics 285, 375–391 (2011). https://doi.org/10.1007/s00438-011-0605-4
DOI: https://doi.org/10.1007/s00438-011-0605-4