Abstract
Regulation of gene expression is a fundamental biological process that relies on transcription factors (TFs) recognizing specific cis motifs in the regulatory regions of the genes they control. In most eukaryotic organisms, cis-regulatory elements are significantly enriched around the transcription start site (TSS). However, unlike other genic features, TSSs must be determined experimentally, which makes them important components of genome annotations. One method for experimentally determining TSSs at the genome-wide level is CAGE (cap analysis of gene expression). This chapter describes how to prepare a CAGE library for sequencing, from RNA extraction through library construction and quality controls, before proceeding to sequencing on the Illumina platform. We then describe how to use a computational pipeline to determine the genome-wide location of TSSs from the alignment of CAGE tags, followed by the statistical approaches required to cluster TSSs that operate as transcriptional units and to determine core promoter properties such as shape. The analyses described here focus on maize, whose large and still incompletely annotated genome creates some unique challenges, but with some modifications they can be readily adapted to other organisms.
Acknowledgments
This work was supported by grants IOS-1125620 and IOS-1733633 from the US National Science Foundation to A.I.D. and E.G.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Mejia-Guerra, M.K., Li, W., Doseff, A.I., Grotewold, E. (2018). Genome-Wide TSS Identification in Maize. In: Yamaguchi, N. (ed) Plant Transcription Factors. Methods in Molecular Biology, vol 1830. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8657-6_14
DOI: https://doi.org/10.1007/978-1-4939-8657-6_14
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-8656-9
Online ISBN: 978-1-4939-8657-6
eBook Packages: Springer Protocols