Abstract
FootprintDB is a database and search engine that compiles regulatory sequences from open-access libraries of curated DNA cis-elements and motifs, together with their associated transcription factors (TFs). It systematically annotates the binding interfaces of these TFs by exploiting protein–DNA complexes deposited in the Protein Data Bank. Each entry in footprintDB is thus a DNA motif linked to the protein sequence of the TF(s) known to recognize it and, in most cases, to the set of predicted interface residues involved in specific recognition. This chapter explains step by step how to search for DNA motifs and protein sequences in footprintDB and how to restrict the search to a particular organism. Two real-world examples show how the software was used to analyze transcriptional regulation in plants. The results are described so as to guide users in their interpretation, with special attention to the choices users might face when performing similar analyses.
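The entry structure described above (a DNA motif linked to TF protein sequences and predicted interface residues) can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions: the class `MotifEntry`, its field names, and the toy values are hypothetical and do not reflect footprintDB's actual schema or data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MotifEntry:
    """Hypothetical record; names are illustrative, not footprintDB's schema."""

    motif_id: str
    # Position frequency matrix: one dict of base counts per motif column.
    pfm: List[Dict[str, int]]
    # Protein sequences of the TF(s) known to recognize the motif, by name.
    tf_sequences: Dict[str, str] = field(default_factory=dict)
    # Predicted interface residues per TF, as 1-based sequence positions.
    interface_residues: Dict[str, List[int]] = field(default_factory=dict)

    def consensus(self) -> str:
        """Return the most frequent base per column (ties broken alphabetically)."""
        return "".join(
            min(column, key=lambda base: (-column[base], base))
            for column in self.pfm
        )

    def interface_string(self, tf_name: str) -> str:
        """Return the residues of one TF found at its annotated interface positions."""
        sequence = self.tf_sequences[tf_name]
        positions = self.interface_residues.get(tf_name, [])
        return "".join(sequence[pos - 1] for pos in positions)


# Toy values invented for illustration only; they are not taken from the database.
entry = MotifEntry(
    motif_id="toy_motif",
    pfm=[
        {"A": 0, "C": 9, "G": 1, "T": 0},
        {"A": 8, "C": 0, "G": 1, "T": 1},
        {"A": 0, "C": 10, "G": 0, "T": 0},
        {"A": 0, "C": 0, "G": 10, "T": 0},
        {"A": 0, "C": 0, "G": 0, "T": 10},
        {"A": 0, "C": 0, "G": 10, "T": 0},
    ],
    tf_sequences={"toy_TF": "MKRARNRESA"},
    interface_residues={"toy_TF": [3, 4, 6]},
)

print(entry.consensus())                  # consensus of the toy matrix
print(entry.interface_string("toy_TF"))   # residues at positions 3, 4 and 6
```

Keeping the motif, the protein sequence, and the interface positions in one record mirrors the linkage the abstract describes: a search by motif or by protein can then return the other two pieces of information together.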
Acknowledgments
We would like to thank our colleagues C. Dubos, L. Bülow, N. Saibo, T. Serra, and J. van Helden for past and current collaborations. This work was funded by grant Euroinvestigación EUI2008-03612 under the framework of the Transnational (Germany, France, Spain) Cooperation within the PLANT-KBBE Initiative.
Copyright information
© 2016 Springer Science+Business Media New York
About this protocol
Cite this protocol
Contreras-Moreira, B., Sebastian, A. (2016). FootprintDB: Analysis of Plant Cis-Regulatory Elements, Transcription Factors, and Binding Interfaces. In: Hehl, R. (ed) Plant Synthetic Promoters. Methods in Molecular Biology, vol 1482. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6396-6_17
DOI: https://doi.org/10.1007/978-1-4939-6396-6_17
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6394-2
Online ISBN: 978-1-4939-6396-6
eBook Packages: Springer Protocols