Plant Synthetic Promoters, pp 297-322
RSAT::Plants: Motif Discovery in ChIP-Seq Peaks of Plant Genomes
Abstract
In this protocol, we explain how to run ab initio motif discovery in order to gather putative transcription factor binding motifs (TFBMs) from sets of genomic regions returned by ChIP-seq experiments. The protocol starts from a set of peak coordinates (genomic regions) which can be either downloaded from ChIP-seq databases, or produced by a peak-calling software tool. We provide a concise description of the successive steps to discover motifs, cluster the motifs returned by different motif discovery algorithms, and compare them with reference motif databases. The protocol is documented with detailed notes explaining the rationale underlying the choice of options. The interpretation of the results is illustrated with an example from the model plant Arabidopsis thaliana.
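To make the idea of ab initio motif discovery concrete, the following is a minimal Python sketch, assuming that the peak sequences have already been retrieved from the peak coordinates. It counts k-mers on both strands and ranks them by their frequency in peaks relative to a background set. It does not reproduce the algorithms or statistics used by RSAT; the function names (`count_kmers`, `rank_enriched`), the pseudocount smoothing, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Represent a k-mer and its reverse complement by a single key."""
    return min(kmer, reverse_complement(kmer))


def count_kmers(sequences, k=6):
    """Count strand-insensitive k-mers, skipping windows with non-ACGT letters."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def rank_enriched(peaks, background, k=6, pseudocount=1.0):
    """Rank k-mers by peak frequency over (smoothed) background frequency.

    This ratio is a deliberately simple score, assumed here for illustration;
    it is not a significance test.
    """
    peak_counts = count_kmers(peaks, k)
    bg_counts = count_kmers(background, k)
    peak_total = sum(peak_counts.values())
    bg_total = sum(bg_counts.values())
    ranked = []
    for kmer, n in peak_counts.items():
        peak_freq = n / peak_total
        bg_freq = (bg_counts[kmer] + pseudocount) / (bg_total + pseudocount)
        ranked.append((kmer, n, peak_freq / bg_freq))
    # Highest ratio first; ties are broken alphabetically for reproducibility.
    ranked.sort(key=lambda row: (-row[2], row[0]))
    return ranked


# Invented toy sequences: each "peak" carries the word CACGTG.
toy_peaks = ["TTTCACGTGAAA", "GGACACGTGTCC", "ATCACGTGCATT"]
toy_background = ["ATATATATATAT", "GCGCGCGCGCGC", "TTTTAAAACCCC"]

for kmer, n, ratio in rank_enriched(toy_peaks, toy_background)[:3]:
    print(kmer, n, round(ratio, 2))
```

On real data the peak sets are far larger, and overlapping variants of the same word tend to score similarly, which is one reason the protocol includes a step that clusters the discovered motifs.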
Key words
Chromatin immunoprecipitation DNA-sequencing (ChIP-seq); Transcription factor (TF); Transcription factor binding motifs (TFBM); Transcription factor binding site (TFBS); Gene ontology (GO); Functional enrichment
Notes
Acknowledgements
We thank C. Dubos for feedback on MYB3R proteins. This work was funded in part by Fundación ARAID and by the Enseignants-Chercheurs invités program of Aix-Marseille Université (to B.C.M.). C.R. was supported by the France Génomique National infrastructure, funded as part of the Investissements d’Avenir program managed by the Agence Nationale pour la Recherche (contract ANR-10-INBS-09). The PhD grant of J.C-M. is funded by the Ecole Doctorale des Sciences de la Vie et de la Santé (EDSVS), Aix-Marseille Université.