Abstract
Chloroplasts play critical roles in land plant cells. Despite their importance and the availability of at least 200 sequenced chloroplast genomes, the number of known DNA regulatory sequences in chloroplast genomes is limited. In this paper, we designed computational methods to systematically study putative DNA regulatory sequences in intergenic regions near chloroplast genes in seven plant species and in promoter sequences of nuclear genes in Arabidopsis and rice. We found that −35/−10 elements alone cannot explain the transcriptional regulation of chloroplast genes. We also concluded that motifs shared by the intergenic sequences of most chloroplast genes are unlikely to exist, indicating that these genes are regulated differently. Finally, and surprisingly, we found that five conserved motifs, each of which occurs in no more than six chloroplast intergenic sequences, are significantly shared by promoters of nuclear genes encoding chloroplast proteins. By integrating information from gene function annotation, protein subcellular localization analyses, protein–protein interaction data, and gene expression data, we found further support for the functionality of these conserved motifs. Our study implies the existence of unknown nuclear-encoded transcription factors that regulate both chloroplast genes and nuclear genes encoding chloroplast proteins, which sheds light on the transcriptional regulation of chloroplast genes.
Acknowledgments
This project is supported by two National Science Foundation grants, 1125676 and 1149955 (to HH). We sincerely appreciate the helpful comments from the two reviewers.
Author information
Additional information
Ying Wang and Jun Ding are co-first authors.
About this article
Cite this article
Wang, Y., Ding, J., Daniell, H. et al. Motif analysis unveils the possible co-regulation of chloroplast genes and nuclear genes encoding chloroplast proteins. Plant Mol Biol 80, 177–187 (2012). https://doi.org/10.1007/s11103-012-9938-6
DOI: https://doi.org/10.1007/s11103-012-9938-6