Cap Analysis of Gene Expression (CAGE): A Quantitative and Genome-Wide Assay of Transcription Start Sites

Bioinformatics for Cancer Immunotherapy

Abstract

Cap analysis of gene expression (CAGE) is an approach to identify and monitor the activity (transcription initiation frequency) of transcription start sites (TSSs) at single base-pair resolution across the genome. It has been effectively used to identify active promoter and enhancer regions in cancer cells, with potential utility in identifying key factors for immunotherapy. Here, we give an overview of a series of CAGE protocols and describe the latest protocol, based on the Illumina sequencing platform, in detail: both the experimental steps (see Subheadings 3.1–3.11) and the computational processing steps (see Subheadings 3.12–3.20) are described.
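As a rough illustration of what "activity at single base-pair resolution" means computationally, the sketch below counts the 5' ends of aligned reads at each genomic position and scales the counts to tags per million, one common normalization. It is not the chapter's pipeline: the function names `count_five_prime_ends` and `to_tpm`, the input tuple format, and the toy reads are all assumptions made for this example.

```python
from collections import Counter


def count_five_prime_ends(reads):
    """Count read 5' ends per (chrom, position, strand).

    `reads` is an iterable of (chrom, start, end, strand) tuples in
    BED-style 0-based, half-open coordinates (an assumed input format).
    """
    counts = Counter()
    for chrom, start, end, strand in reads:
        # The 5' end is the leftmost base on '+' and the rightmost base on '-'.
        five_prime = start if strand == "+" else end - 1
        counts[(chrom, five_prime, strand)] += 1
    return counts


def to_tpm(counts):
    """Scale raw counts to tags per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: n * 1_000_000 / total for site, n in counts.items()}


# Toy alignments (hypothetical data, not from the chapter).
reads = [
    ("chr1", 100, 127, "+"),
    ("chr1", 100, 125, "+"),
    ("chr1", 101, 128, "+"),
    ("chr1", 200, 227, "-"),
]
site_counts = count_five_prime_ends(reads)
site_tpm = to_tpm(site_counts)
```

Each key in `site_counts` is a single-base candidate TSS, and its count approximates how often transcription initiated there in the sample. A real analysis would read alignments from BAM files and filter by mapping quality before counting.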

Author information

Corresponding author

Correspondence to Masayoshi Itoh.

Copyright information

© 2020 Springer Science+Business Media, LLC, part of Springer Nature

Cite this protocol

Morioka, M.S. et al. (2020). Cap Analysis of Gene Expression (CAGE): A Quantitative and Genome-Wide Assay of Transcription Start Sites. In: Boegel, S. (eds) Bioinformatics for Cancer Immunotherapy. Methods in Molecular Biology, vol 2120. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0327-7_20

  • DOI: https://doi.org/10.1007/978-1-0716-0327-7_20

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0326-0

  • Online ISBN: 978-1-0716-0327-7

  • eBook Packages: Springer Protocols
