1 Introduction

Cap Analysis of Gene Expression (CAGE) is a method to profile RNA expression and precisely identify promoters and regulatory elements such as enhancers. CAGE can identify not only protein-coding RNAs but also noncoding RNAs, which are capped when produced by RNA polymerase II. CAGE technology has been developed and modified over the years to follow advanced sequencing technologies and biological interests [1,2,3,4,5,6].

Classically, promoters and enhancers are defined as genomic elements that initiate transcription proximally and enhance it distally, respectively. A subset of promoters is highly cell type specific and dynamically associates with nuclear architecture, such as histone modifications and nucleosome-depleted regions, together with enhancer RNAs (eRNAs). Unlike RNAs derived from standard promoters, eRNAs are typically unstable, poorly adenylated transcripts, and are located up to 1 Mb away from the core promoter in the nucleus. However, many transcriptome analyses have shown that distal enhancers might also play roles in promoter activity (reviewed in [7]). We previously analyzed comprehensive cell nuclear CAGE datasets from polyadenylated (poly-A) and non-poly-A RNAs together with chromatin architectural datasets from the ENCODE consortium, which showed that promoters are associated with a complex, three-dimensional, interconnected chromatin network [8].

Enhancers and promoters are known to act as highly cell type-specific regulatory elements [8,9,10], and active enhancers are likely to transcribe eRNAs that interact with promoters through chromatin looping [11]. Further analysis of enhancer–promoter (EP) interactions in ENCODE CAGE datasets also showed that the expression of eRNAs associated with predicted EP interactions is clearly cell type specific [12].

Importantly, together with eRNAs and promoters, long noncoding RNAs (lncRNAs) overlap regions known to be involved in human genetic traits. In particular, these elements overlap expression quantitative trait loci (eQTL) and single nucleotide polymorphisms (SNPs) reported in genome-wide association studies (GWAS). Moreover, lncRNAs corresponding to promoters/enhancers are significantly and specifically co-expressed in cell types that are interrelated in several human diseases [13].

In addition, the single-base resolution of CAGE transcription start site (TSS) mapping has revealed that transcription initiation at thousands of promoters shifts dynamically during specific stages of early zebrafish development, and that these shifts are orchestrated with epigenomic chromatin modifications [14].

To further broaden these analyses, we have developed the high-throughput low quantity single strand CAGE (LQ-ssCAGE) method, which is based on previously developed cap trapper technologies [15]. The method is designed to analyze promoter and enhancer usage at single-nucleotide resolution in large numbers of samples, which may include human patient samples, early developmental stage samples, and specific cell types. An initial study using the LQ-ssCAGE method demonstrated that antisense lncRNA–mRNA pairs have specific expression patterns in cellular compartments during early zebrafish development [16].

The LQ-ssCAGE protocol presented here captures both poly-A and non-poly-A transcripts by using a 15-nucleotide (N15) random primer in the reverse transcriptase (RT) reaction, before the cap trapper procedures (see the workflow of the protocol in Fig. 1). A key advantage of LQ-ssCAGE is that it uses small quantities of RNA yet avoids PCR amplification, which would otherwise lead to biased and noisy quantification of expression. In addition to promoter and enhancer annotations from single CAGE reads, LQ-ssCAGE can resolve the complexity of the promotome-transcript structure from paired-end reads [3].

Fig. 1 The workflow of LQ-ssCAGE library preparation. The section numbers correspond to those in the methods.

Compared to other CAGE methods [4, 5], the LQ-ssCAGE protocol is further simplified, as the material loaded into the sequencer consists simply of single-stranded, cap-selected cDNAs. In summary, the LQ-ssCAGE protocol (1) works with as little as 25 ng of RNA, (2) shortens the preparation time (less than 3 days), and (3) allows multiple libraries to be prepared in parallel in microtiter plates. Such libraries can be efficiently sequenced on various Illumina sequencing platforms by using Illumina Index and original barcode identifiers. With proper organization of the library workflow, an operator can easily prepare 96 samples simultaneously in a 96-well plate.

To demonstrate high reproducibility and to confirm that the method can be used to identify key regulatory elements, including enhancers and lncRNA promoters, we prepared LQ-ssCAGE libraries from human acute monocytic leukemia (THP-1) cells and analyzed regulatory RNAs by comparing our data to FANTOM CAGE-Associated Transcripts (CAT) gene models and annotations [13]. As examples, we identified enhancer- and promoter-derived regulatory RNAs, including a GAS5 promoter-derived lncRNA, previously shown to be associated with the apoptosis pathway in THP-1 cells [17], and a novel bidirectional genomic region transcribing e-lncRNAs.

2 Materials

2.1 Equipment

  1. 0.2 mL Polypropylene PCR Tube Strips and Domed Cap Strips, 8 Tubes/Strip, 8 Domed Caps/Strip, Clear, Nonsterile.
  2. 1.5 mL Maxymum Recovery Snaplock Microcentrifuge Tube, Polypropylene, Clear, Nonsterile.
  3. 16-well Polypropylene PCR Microplate, Clear, Nonsterile.
  4. 96-well Polypropylene PCR Microplate, No Skirt, Clear, Nonsterile.
  5. PCR 1 × 8 Strip Domed Caps, Fit 0.2 mL PCR Tube Strips, Clear, Nonsterile.
  6. X-Pierce Sealing Films, Sterile (EXCEL Scientific, Inc.).
  7. Low binding barrier tips of 10 μL (0.1–10 μL), 20 μL (1–20 μL), 200 μL (1–200 μL), and 1000 μL (100–1000 μL).
  8. PIPETMAN P2, P20, P200, and P1000.
  9. Thermal Cycler.
  10. Centrifuge for plates, PCR tubes, and 1.5 mL tubes.
  11. 8 channel pipettes for 0.5–10 μL and 10–100 μL and 30–300 μL.
  12. Vortex mixer.
  13. miVAC DNA (SP Scientific Genevac).
  14. miVac rotor for micro plate (SP Scientific Genevac).
  15. Dynabeads MPC-S (Magnetic Particle Concentrator) (Thermo Fisher Scientific).
  16. DynaMag-96 Side Skirted Magnet (Thermo Fisher Scientific).

2.2 Commercial Reagents

  1. Agencourt AMPure XP (BECKMAN COULTER).
  2. Agencourt RNAClean XP (BECKMAN COULTER).
  3. KAPA Library Quantification Kits (KAPA BIOSYSTEMS).

  4. 1 M Sodium Chloride (NaCl), Molecular Biology Grade.

  5. RNase One Ribonuclease (Promega).
  6. 8 M Lithium chloride solution, for molecular biology, ≥99%.
  7. 3 M Sodium acetate buffer solution BioXtra, pH 7.0 ± 0.05 (25 °C) for molecular biology, nonsterile; 0.2 μm filtered.
  8. Sodium periodate (NaIO4) powder, ACS Reagent Grade.
  9. DNA Ligation Kit Mighty Mix (Takara Bio Inc).
  10. Ribonuclease H (RNase H) (20–60 U/μL) (Takara Bio Inc).
  11. 10 mM dNTP Mix.
  12. Dynabeads M-270 Streptavidin (Thermo Fisher Scientific).
  13. RNase Decontamination Solution.
  14. SuperScript III Reverse Transcriptase (Thermo Fisher Scientific).
  15. UltraPure 0.5 M EDTA pH 8.0.
  16. UltraPure DNase/RNase-Free Distilled Water.
  17. Biotin (Long Arm) Hydrazide (Vector Laboratories).
  18. 0.5 M EDTA pH 8.0.
  19. 10% w/v Polyoxyethylene (20) Sorbitan Monolaurate Solution.
  20. 1 M Tris–HCl pH 7.0, pH 7.5 and pH 8.5.
  21. 3 M Sodium Acetate pH 5.2.
  22. Dimethyl Sulfoxide.
  23. 70% ethanol.
  24. 2 M NaOH.
  25. Hybridization buffer HT1 (Illumina).

2.3 Homemade Solutions

Water used should be DNase/RNase-Free Distilled Water.

  1. 250 mM NaIO4: Dissolve 1 mg of sodium periodate in 18.7 μL of water and keep in the dark. The solution can be aliquoted to 50 μL and stored at −80 °C.
  2. 100 mM Biotin (long arm) Hydrazide: Dissolve 50 mg of Biotin (long arm) Hydrazide in 1.345 mL of DMSO. The solution can be aliquoted to 50 μL and stored at −80 °C.
  3. LiCl buffer: Mix 35 mL of 8 M Lithium chloride solution, 800 μL of 1 M Tris–HCl pH 7.5, 400 μL of 10% w/v Polyoxyethylene (20) Sorbitan Monolaurate Solution, 160 μL of 0.5 M EDTA pH 8.0, and 3.64 mL of water, and store at room temperature.
  4. TE wash buffer: Mix 39.12 mL of water, 400 μL of 1 M Tris–HCl pH 7.5, 400 μL of 10% w/v Polyoxyethylene (20) Sorbitan Monolaurate Solution, and 80 μL of 0.5 M EDTA pH 8.0, and store at room temperature.
  5. Release buffer: Mix 100 μL of RNase ONE 10 × Reaction Buffer, 1 μL of 10% w/v Polyoxyethylene (20) Sorbitan Monolaurate Solution, and 899 μL of water, and store at room temperature.
  6. 1 × TE buffer: Mix 500 μL of 1 M Tris–HCl pH 8.0, 100 μL of 0.5 M EDTA pH 8.0, and 49.4 mL of water, and store at room temperature.
  7. 0.1 M NaCl/TE buffer: Mix 500 μL of 1 M NaCl and 4.5 mL of 1 × TE buffer, and store at room temperature.
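The stock volumes above follow from mass, molecular weight, and target molarity. The sketch below reproduces that arithmetic so the recipes can be rescaled for other amounts. The function name is illustrative, the molecular weight of Biotin (long arm) Hydrazide is an assumed value (check the supplier's datasheet), and the volume contributed by the solute itself is ignored.

```python
def solvent_volume_ul(mass_mg: float, mol_weight: float, target_molar: float) -> float:
    """Solvent volume (μL) needed to dissolve mass_mg at target_molar (mol/L)."""
    millimoles = mass_mg / mol_weight        # mg / (g/mol) = mmol
    millilitres = millimoles / target_molar  # mmol / (mmol/mL) = mL
    return millilitres * 1000.0              # mL -> μL


NAIO4_MW = 213.89          # g/mol, sodium periodate
BIOTIN_HYDRAZIDE_MW = 371.5  # g/mol, ASSUMED value; confirm on the datasheet

# 1 mg NaIO4 at 250 mM (the recipe above uses 18.7 μL of water)
print(round(solvent_volume_ul(1.0, NAIO4_MW, 0.250), 1), "μL water")

# 50 mg biotin hydrazide at 100 mM (the recipe above uses 1.345 mL of DMSO)
print(round(solvent_volume_ul(50.0, BIOTIN_HYDRAZIDE_MW, 0.100) / 1000.0, 3), "mL DMSO")
```

The same function can be used when a different amount of powder is weighed out, for example 2 mg of sodium periodate instead of 1 mg.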

2.4 Primers and Linker Sequences

See Tables 1, 2, and 3.

Table 1 List of RT primers containing barcode
Table 2 List of 5′ linkers
Table 3 List of 3′ linkers

3 Methods

Water used should be DNase/RNase-Free Distilled Water.

3.1 Reverse Transcription (Timing: 5.5 h)

  1. Mix 4 μL of RNA (50 ng; 12.5 ng/μL) and 1 μL of 1.25 mM RT primer in a 96-well plate by pipetting on ice to generate the RNA-primer mix (see Notes 1 and 2).
  2. Incubate the RNA-primer mix from step 1 of this section at 65 °C for 5 min and immediately place on ice for 2 min.
  3. Mix the following components (enzyme mix) (see Note 3).

     | Reagent | Volume (μL) | Final concentration |
     | 5× First Strand buffer | 2 | |
     | 0.1 M DTT | 0.5 | 0.01 M |
     | 10 mM dNTPs | 0.5 | 1 mM |
     | RNase-free water | 1 | |
     | SuperScript III Reverse Transcriptase | 1 | 200 U |
     | Total volume | 5 | |

  4. Add 5 μL of enzyme mix from step 3 of this section to the RNA-primer mix from step 2 of this section and carefully mix ten times by pipetting on ice.
  5. Incubate at 25 °C for 30 s, followed by 50 °C for 30 min, and keep at 4 °C to generate RNA-cDNA hybrids.
  6. Pool the samples using the following steps (see Note 4). Transfer the 10 μL of RNA-cDNA hybrids from each well from step 5 of this section to one new 1.5 mL tube on ice (total volume is 480 μL).
  7. Add 15 μL of water to the first 8 wells of the 96-well plate from step 5 of this section, wash the wells by pipetting, and transfer the 15 μL of solution to the next 8 wells.
  8. Wash the 8 wells by pipetting and transfer the 15 μL of solution to the next 8 wells.
  9. Repeat step 8 of this section three times until the last 8 wells are reached, and transfer all of the wash solution [total volume is 120 μL (15 μL × 8 wells)] to the 1.5 mL tube from step 6 of this section (final volume is 600 μL).
  10. Mix the 600 μL of solution from step 9 of this section by vortexing, spin down, and aliquot 200 μL into each of three new 1.5 mL tubes on ice (see Note 4).
  11. Add 360 μL (1.8 volumes) of RNAClean XP beads to the 200 μL of pooled RNA-cDNA hybrids (48 samples) in each 1.5 mL tube from step 10 of this section, mix well by pipetting, and then purify and elute the pooled RNA-cDNA hybrids in the following steps at room temperature unless otherwise specified.
  12. Incubate for 10 min, spin down, and set the 1.5 mL tube on a magnetic stand for 5 min.
  13. Discard the supernatant by pipette aspiration.
  14. Wash the beads with 1.2 mL of 70% ethanol.
  15. Place the 1.5 mL tube on the magnetic stand for 5 min.
  16. Discard the supernatant by pipette aspiration.
  17. Repeat steps 14–16 of this section twice.
  18. Discard the 70% ethanol completely by pipette aspiration.
  19. Add 100 μL of water and mix by pipetting extensively (more than 60 times) to elute the RNA-cDNA hybrids.
  20. Incubate at room temperature for 5 min.
  21. Spin down and place the tube on the magnetic stand for 5 min.
  22. Transfer 100 μL of the RNA-cDNA hybrids to a new 1.5 mL tube.
  23. Repeat steps 19–22 of this section twice (final volume is 200 μL in one tube, three tubes in total).
  24. Concentrate the 200 μL of RNA-cDNA hybrid solution in each tube from step 23 of this section to around 40 μL in a SpeedVac vacuum concentrator at 37 °C, combine the solutions from the three 1.5 mL tubes in one 1.5 mL tube (total volume is around 120 μL), and concentrate again to 40 μL at 37 °C. This takes around 2 h (see Note 5).
  25. Check the sample volume several times during the concentration, and adjust the final volume to 40 μL with water if the volume falls below 40 μL.
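When many wells are processed at once, the enzyme mix of step 3 is usually prepared as a single premix. The sketch below scales the per-sample volumes from the table in step 3 by the number of samples and by the 1.1 overage factor suggested in Note 3. The function and variable names are illustrative only.

```python
# Per-sample enzyme mix volumes (μL), copied from the table in step 3.
ENZYME_MIX_UL = {
    "5x First Strand buffer": 2.0,
    "0.1 M DTT": 0.5,
    "10 mM dNTPs": 0.5,
    "RNase-free water": 1.0,
    "SuperScript III Reverse Transcriptase": 1.0,
}


def scale_premix(per_sample_ul, n_samples, overage=1.1):
    """Return premix volumes (μL) for n_samples, including the overage factor."""
    factor = n_samples * overage
    return {reagent: round(vol * factor, 2) for reagent, vol in per_sample_ul.items()}


premix = scale_premix(ENZYME_MIX_UL, n_samples=48)
for reagent, vol in premix.items():
    print(f"{reagent}: {vol} μL")
print("total:", round(sum(premix.values()), 2), "μL")
```

Each well still receives 5 μL of the premix; the overage only covers pipetting losses.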

3.2 Oxidation to Modify Diol Group of Cap Structure (Timing: 10 min)

  1. Mix 40 μL of RNA-cDNA hybrids from Subheading 3.1, 2 μL of 1 M NaOAc pH 4.5, and 2 μL of 250 mM NaIO4 by pipetting ten times on ice.
  2. Incubate for 5 min on ice in the dark (wrap the tube in aluminum foil).
  3. Add 16 μL of 1 M Tris–HCl pH 8.5 to neutralize the solution and mix well by pipetting on ice.

3.3 Purification (Timing: 1 h)

Add 108 μL (1.8 volumes) of RNAClean XP beads to 60 μL of oxidized RNA-cDNA hybrids from Subheading 3.2, mix well by pipetting, and then purify and elute the pooled RNA-cDNA hybrids (48 samples) in the following steps at room temperature.

  1. Incubate for 5 min.
  2. Spin down and set the tube on the magnetic stand for 5 min.
  3. Discard the supernatant by pipette aspiration.
  4. Wash the beads with 200 μL of 70% ethanol.
  5. Discard the 70% ethanol.
  6. Repeat steps 4 and 5 of this section twice and discard the 70% ethanol completely.
  7. Add 42 μL of water and mix by pipetting extensively (more than 60 times) to elute the RNA-cDNA hybrids.
  8. Incubate for 5 min.
  9. Spin down and set the tube on the magnetic stand for 5 min.
  10. Transfer 40 μL of the supernatant to a new tube.
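Every bead purification in this protocol adds a fixed multiple of the sample volume (1.8-fold, or 1.2-fold for the second clean-up after linker ligation). The short sketch below recomputes the bead volumes quoted in Subheadings 3.3 to 3.12 from the sample volumes, which can be handy if a sample volume deviates from the protocol. The helper name and dictionary labels are illustrative.

```python
def bead_volume_ul(sample_ul: float, ratio: float = 1.8) -> float:
    """Bead volume (μL) for a given sample volume and bead-to-sample ratio."""
    return round(sample_ul * ratio, 1)


# (sample volume in μL, ratio) for the purifications described in the protocol
purifications = {
    "3.3 after oxidation": (60.0, 1.8),
    "3.4 after biotinylation": (48.0, 1.8),
    "3.5 after RNase ONE": (45.0, 1.8),
    "3.8 after RNase ONE/RNase H": (70.0, 1.8),
    "3.10 first clean-up (24 μL + 46 μL water)": (70.0, 1.8),
    "3.10 second clean-up": (40.0, 1.2),
}

for step, (sample_ul, ratio) in purifications.items():
    print(f"{step}: {bead_volume_ul(sample_ul, ratio)} μL beads")
```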

3.4 Biotinylation by the Coupling Reaction to the Oxidized RNA-cDNA Hybrids (See Note 6) (Timing: 1.5 h)

  1. Mix 40 μL of purified oxidized RNA-cDNA hybrids from Subheading 3.3, 4 μL of 1 M NaOAc pH 6.3, and 4 μL of 100 mM Biotin (long arm) hydrazide by pipetting ten times (total 48 μL).
  2. Incubate for 30 min at 40 °C.
  3. Add 86.4 μL of RNAClean XP beads (1.8 volumes) to the 48 μL of solution from step 2 of this section and perform purification as described in Subheading 3.3.

3.5 RNaseONE Treatment to Digest RNA of RNA-cDNA Hybrids (See Note 7) (Timing: 1.5 h)

  1. Add 4.5 μL of 10 × RNaseONE buffer and 0.5 μL of RNaseONE to 40 μL of purified biotinylated RNA-cDNA hybrids from Subheading 3.4 and mix by pipetting ten times (total 45 μL).
  2. Incubate for 30 min at 37 °C.
  3. Add 81 μL of RNAClean XP beads (1.8 volumes) to the solution from step 2 of this section and perform purification as described in Subheading 3.3.

3.6 Dynabeads M-270 Streptavidin Beads Preparation (Timing: 0.5 h)

  1. Add 30 μL of Dynabeads M-270 Streptavidin to a new 1.5 mL tube, place on the magnetic stand for 5 min, and discard the supernatant.
  2. Wash the beads with 30 μL of LiCl buffer, set on the magnetic stand for 5 min, and discard the supernatant.
  3. Repeat step 2 of this section twice.
  4. Resuspend the beads in 95 μL of LiCl buffer.

3.7 CapTrap Reaction (Timing: 1 h)

  1. Add 95 μL of beads from Subheading 3.6 to 40 μL of RNA-cDNA hybrids from Subheading 3.5 and mix well by pipetting.
  2. Incubate for 15 min at 37 °C and mix by pipetting after the first 7 min.
  3. Spin down, set the tube on the magnetic stand for 2 min, and discard the supernatant by pipette aspiration.
  4. Add 150 μL of TE wash buffer and mix by pipetting 60 times. Spin down, place the tube on the magnetic stand for 2 min, and discard the supernatant.
  5. Repeat step 4 of this section three more times.
  6. Add 35 μL of release buffer to the beads and mix by pipetting 60 times.
  7. Incubate for 5 min at 95 °C and then on ice for 1 min.
  8. Spin down and place the tube on the magnetic stand for 2 min.
  9. Transfer the 35 μL of supernatant to a new tube.
  10. Add 30 μL of release buffer to the beads and mix well by pipetting. Spin down and place the tube on the magnetic stand for 2 min.
  11. Transfer the 30 μL of supernatant to the tube containing the CapTrapped cDNA from step 9 of this section (total 65 μL).

3.8 RNaseONE and RNaseH Reaction to Remaining RNAs (Timing: 2.5 h)

  1. Add 2.9 μL of release buffer, 2 μL of RNaseONE, and 0.1 μL of RNase H to 65 μL of CapTrapped cDNA from Subheading 3.7 and mix by pipetting ten times (total 70 μL).
  2. Incubate at 37 °C for 30 min.
  3. Add 126 μL of AMPure XP beads (1.8 volumes) to the 70 μL of cDNA from step 2 of this section and perform purification as described in Subheading 3.3.
  4. Dry the 40 μL of purified CapTrapped cDNA in a SpeedVac concentrator at 37 °C for around 75 min.
  5. Add 4 μL of water to the dried pellet.

3.9 5′ Single Strand Linker Ligation (Timing: 16.5 h)

  1. Mix 4 μL of 1 mM 5′ adaptor GN5, 4 μL of 1 mM 5′ adaptor down, 4 μL of 1 M NaCl, and 28 μL of 1 × TE buffer, and carry out the annealing reaction to generate 100 μM GN5 linker as follows:

     | Step | Temperature | Time (min) |
     | Denature | 95 °C | 5 |
     | Annealing | 95 °C → 83 °C in 0.1 °C steps; 1 s per 0.1 °C | |
     | | 83 °C | 5 |
     | | 83 °C → 71 °C in 0.1 °C steps; 1 s per 0.1 °C | |
     | | 71 °C | 5 |
     | | 71 °C → 59 °C in 0.1 °C steps; 1 s per 0.1 °C | |
     | | 59 °C | 5 |
     | | 59 °C → 47 °C in 0.1 °C steps; 1 s per 0.1 °C | |
     | | 47 °C | 5 |
     | | 47 °C → 35 °C in 0.1 °C steps; 1 s per 0.1 °C | |
     | | 35 °C | 5 |
     | | 35 °C → 23 °C in 0.1 °C steps; 1 s per 0.1 °C | |
     | | 23 °C | 5 |
     | | 23 °C → 11 °C in 0.1 °C steps; 1 s per 0.1 °C | |
     | | 11 °C | Pause |

  2. Mix 1 μL of 1 mM 5′ adaptor N6, 1 μL of 1 mM 5′ adaptor down, 1 μL of 1 M NaCl, and 7 μL of 1 × TE buffer and repeat the annealing reaction from step 1 of this section to generate 100 μM N6 linker.
  3. Mix 40 μL of 100 μM GN5 linker from step 1 of this section and 10 μL of 100 μM N6 linker from step 2 of this section.
  4. Dilute the 100 μM mixed 5′ linkers to 2.5 μM in 0.1 M NaCl/TE buffer. For instance, add 2.5 μL of 100 μM mixed linkers from step 3 of this section to 97.5 μL of 0.1 M NaCl/TE buffer for 25 samples. The 2.5 μM and 100 μM mixed 5′ linkers can be stored at −20 °C.
  5. Incubate 4 μL of CapTrapped cDNA from Subheading 3.8 at 95 °C for 5 min and put on ice for 2 min.
  6. Incubate 4 μL of 2.5 μM 5′ linker from step 4 of this section at 55 °C for 5 min and put on ice for 2 min.
  7. Mix 4 μL of CapTrapped cDNA from step 5 of this section, 4 μL of 2.5 μM 5′ linker from step 6 of this section, and 16 μL of Mighty Mix by pipetting (total volume is 24 μL).
  8. Incubate at 16 °C for 16 h (overnight).
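The annealing program in step 1 alternates 5 min holds with slow ramps of 12 °C in 0.1 °C steps at 1 s per step. As a rough planning aid, the sketch below builds that program as a list of segments and sums the programmed time. The function name and tuple layout are illustrative, and any extra time the thermal cycler needs to change temperature is not included.

```python
HOLD_S = 5 * 60     # each plateau is held for 5 min
RAMP_STEP_C = 0.1   # ramp resolution in °C
RAMP_STEP_S = 1     # seconds spent per 0.1 °C step


def annealing_program(start_c=95, end_c=11, plateau_gap_c=12):
    """Return a list of (kind, from_c, to_c, seconds) segments."""
    program = [("hold", start_c, start_c, HOLD_S)]  # denaturation at 95 °C
    temp = start_c
    while temp > end_c:
        nxt = temp - plateau_gap_c
        n_steps = round(plateau_gap_c / RAMP_STEP_C)
        program.append(("ramp", temp, nxt, n_steps * RAMP_STEP_S))
        if nxt > end_c:  # the final 11 °C is an open-ended pause, not a timed hold
            program.append(("hold", nxt, nxt, HOLD_S))
        temp = nxt
    return program


program = annealing_program()
total_s = sum(seconds for _, _, _, seconds in program)
print(f"{len(program)} segments, {total_s / 60:.0f} min before the 11 °C pause")
```

Under these assumptions the program has seven holds and seven ramps, so it fits comfortably within the time budgeted for this section before the overnight ligation.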

3.10 Purification to Remove Excess 5′ Linkers and Linker Dimers (Timing: 2 h)

  1. Add 46 μL of water and 126 μL of AMPure XP beads (1.8 volumes) to the 24 μL of cDNA from Subheading 3.9 and mix well by pipetting.
  2. Perform purification as described in Subheading 3.3.
  3. Incubate the 40 μL of cDNA from step 2 of this section at 95 °C for 5 min and immediately put on ice for 2 min.
  4. Add 48 μL of AMPure XP beads (1.2 volumes) to the 40 μL of cDNA from step 3 of this section and mix well by pipetting.
  5. Perform purification again as described in Subheading 3.3.
  6. Dry the 40 μL of cDNA from step 5 of this section in a SpeedVac concentrator at 80 °C for 35 min.
  7. Add 4 μL of water to the dried pellet.

3.11 3′ Single Strand Linker Ligation (See Note 8) (Timing: 4.5 h)

  1. Mix 1 μL of 1 mM 3′ adaptor up, 1 μL of 1 mM 3′ adaptor down, 1 μL of 1 M NaCl, and 7 μL of 1 × TE buffer and carry out the annealing reaction described in step 1 of Subheading 3.9 to generate 100 μM 3′ linker.
  2. Dilute the 100 μM linker from step 1 of this section to 2.5 μM with 0.1 M NaCl/TE buffer (see details in step 4 of Subheading 3.9).
  3. Incubate 4 μL of cDNA from Subheading 3.10 at 95 °C for 5 min and place on ice for 2 min.
  4. Incubate 4 μL of 2.5 μM 3′ linker from step 2 of this section at 65 °C for 5 min and place on ice for 2 min.
  5. Mix 4 μL of cDNA from step 3 of this section, 4 μL of 3′ linker from step 4 of this section, and 16 μL of Mighty Mix by pipetting (total volume is 24 μL).
  6. Incubate at 30 °C for 4 h.
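Both linker dilutions (step 4 of Subheading 3.9 and step 2 of this section) are plain C1V1 = C2V2 calculations. The sketch below shows the arithmetic for the 100 μM to 2.5 μM dilution and the number of 4 μL ligations that a given working volume supports. The function names are illustrative.

```python
def dilution_volumes(stock_um: float, target_um: float, final_ul: float):
    """Return (stock μL, buffer μL) to reach target_um in final_ul (C1*V1 = C2*V2)."""
    if target_um > stock_um:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_um * final_ul / stock_um
    return stock_ul, final_ul - stock_ul


def ligations_supported(final_ul: float, linker_per_ligation_ul: float = 4.0) -> int:
    """Number of ligations covered, at 4 μL of 2.5 μM linker per ligation."""
    return int(final_ul // linker_per_ligation_ul)


stock_ul, buffer_ul = dilution_volumes(stock_um=100.0, target_um=2.5, final_ul=100.0)
print(f"{stock_ul} μL linker + {buffer_ul} μL 0.1 M NaCl/TE buffer")
print(ligations_supported(100.0), "ligations")
```

This matches the worked example in Subheading 3.9, where 2.5 μL of linker in 97.5 μL of buffer covers 25 ligations.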

3.12 Purification to Remove Excess 3′ Linkers and Linker Dimers (Final Library) (Timing: 1.5 h)

  1. Add 46 μL of water and 126 μL of AMPure XP beads (1.8 volumes) to the 24 μL of cDNA from Subheading 3.11 and mix well by pipetting.
  2. Perform purification as described in Subheading 3.3.
  3. Incubate the 40 μL of cDNA from step 2 of this section at 95 °C for 5 min and immediately put on ice for 2 min.
  4. Add 48 μL of AMPure XP beads (1.2 volumes) to the 40 μL of cDNA from step 3 of this section and mix well by pipetting.
  5. Perform purification again as described in Subheading 3.3.

3.13 Library Quantification (Timing: 2.5 h)

Quantify the library concentration with the KAPA Library Quantification Kit, following the manufacturer’s protocol with a small modification. Briefly, dilute 1 μL of the final library from Subheading 3.12 50-fold and use it for the KAPA assay (see Note 9).

3.14 Sequencing

  1. Mix the following components (see Note 10).

     | Reagent | Volume (μL) | Final concentration |
     | Library (2250 attomole) | x | |
     | 2 N NaOH | 1 | 0.1 N |
     | Water | 19 − x | |
     | Total volume | 20 | |

  2. Denature the library by incubating at room temperature for 5 min.
  3. Put the tube on ice and add 20 μL of pre-chilled 1 M Tris–HCl pH 7.0 to neutralize.
  4. Add 110 μL of HT1 buffer.
  5. Load 150 μL of the denatured and diluted library (final concentration is 15 pM) (see Note 10) onto a HiSeq2500 for 50 bp paired-end sequencing (see Fig. 1 for the library structure). Read 1 contains the cDNA sequence, read 2 contains the barcode sequence of the RT primer, and Index read 1 contains the sequence of the Illumina Index in the 3′ linker.
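The 15 pM loading concentration follows directly from the volumes in this section: 2250 attomole in 20 μL of denaturation mix, plus 20 μL of Tris–HCl and 110 μL of HT1 buffer. The sketch below works through that arithmetic and computes how much library and water go into the denaturation mix for a given KAPA-measured concentration. The 150 pM input is a made-up example value and the function names are illustrative.

```python
TARGET_AMOL = 2250.0    # amount of library loaded (table in step 1)
DENATURE_MIX_UL = 20.0  # library + 2 N NaOH + water
NAOH_UL = 1.0
TRIS_UL = 20.0
HT1_UL = 110.0


def denature_mix(library_pm: float):
    """Return (library μL, water μL) for the 20 μL denaturation mix."""
    library_ul = TARGET_AMOL / library_pm  # 1 pM equals 1 amol/μL
    water_ul = DENATURE_MIX_UL - NAOH_UL - library_ul
    if water_ul < 0:
        raise ValueError("library too dilute: concentrate it first (see Note 9)")
    return library_ul, water_ul


def loading_concentration_pm(amol: float = TARGET_AMOL) -> float:
    """Final concentration (pM) after adding Tris-HCl and HT1 buffer."""
    return amol / (DENATURE_MIX_UL + TRIS_UL + HT1_UL)


library_ul, water_ul = denature_mix(150.0)  # 150 pM is a hypothetical measurement
print(f"library {library_ul} μL, water {water_ul} μL")
print(f"loading concentration {loading_concentration_pm()} pM")
```

If a different flow cell requires another loading concentration (see Note 10), only `TARGET_AMOL` or the buffer volumes need to change.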

3.15 Bioinformatics Analysis and Results

3.15.1 Sequencing Coverage and Reproducibility

We prepared 240 libraries, each from 50 ng of RNA extracted from THP-1 cells, and generated a total of ~134.44 million reads (median of 575,008 reads per library) using a paired-end 50-cycle kit on Illumina HiSeq-2000 and Illumina HiSeq-2500 sequencers in two independent sequencing runs. Reproducibility across all biological replicates was high (Pearson’s correlation coefficients of 0.93–0.98). Examples of correlations for four LQ-ssCAGE libraries with different barcodes selected from two different sequencing runs are shown in Fig. 2.

Fig. 2 Pearson correlations between pairs of four selected biological replicates in THP-1 whole cell showing technical reproducibility of the method.

3.15.2 Annotated Genes Coverage

Using the FANTOM CAT gene annotations [13], we covered (defined as detecting at least one CAGE tag in at least one library) 54,100 out of 124,047 genes, ranging across gene classes from 4403 genes for sense-overlap RNAs to 18,616 genes for protein-coding genes (Fig. 3a). Of these, 6920 genes were annotated as enhancer lncRNA (e-lncRNA), 5817 as promoter-derived intergenic lncRNA, and 1331 as promoter-derived divergent lncRNA (Fig. 3b), when considering a subset of 44,069 genes with their DNase I hypersensitive sites (DHSs) classified as either enhancer or promoter [18]. Notably, the LQ-ssCAGE method has the advantage of capturing many lowly expressed e-lncRNAs (Fig. 3c): here we detected 6920 e-lncRNAs in a single cellular state, which is a large number considering the high cell type specificity of e-lncRNA expression. Selected examples, a novel e-lncRNA (CATG00000112934.1) and a p-lncRNA (GAS5) with average expression of ~22 transcripts per million (TPM) and ~226 TPM, respectively, are shown in Fig. 4.

Fig. 3 Coverage and expression levels of annotated genes in LQ-ssCAGE. (a) Number of genes in a given gene class with at least one CAGE tag in one library as defined in [13]. (b) Number of genes with DNase I hypersensitive sites (DHSs) classified as either enhancer or promoter [18]. (c) Mean expression levels of each epigenetic gene class as in (b).

Fig. 4 Genome browser views of selected examples of ssCAGE results. (a) Example of an e-lncRNA (novel gene). (b) Example of a p-intergenic-lncRNA (GAS5). CAGE tags in green correspond to the sense strand and in violet correspond to the antisense strand.

3.15.3 CAGE Tags Mapping and Gene Expression Quantification

CAGE tags were mapped to the human genome assembly hg38 using STAR (version 2.5.3a). The average mapping rate was 69.5%, with ~500,000 mapped tags obtained on average per sample across all 240 samples. First, the expression of CAGE promoters was estimated by counting the number of mapped CAGE tags falling within the 379,952 promoter regions of the FANTOM 6 CAT gene models (described in [19]). Next, the expression of the corresponding 124,047 genes was estimated by summing the expression values of all promoters assigned to a given gene. We found 54,100 genes to have at least one CAGE tag across all 240 libraries. The CAGE sequencing summary and the raw and summarized gene expression tables are available at https://fantom.gsc.riken.jp/6/suppl/Takahashi_et_al_2020/.
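The promoter-to-gene summation described above can be illustrated with a small sketch. It is not the pipeline used for the published tables: the promoter and gene identifiers are invented, and scaling counts to a per-million value by library size is an assumed normalization shown only to connect counts to TPM-style values.

```python
from collections import defaultdict


def gene_counts(promoter_counts, promoter_to_gene):
    """Sum tag counts of all promoters assigned to each gene."""
    totals = defaultdict(int)
    for promoter, count in promoter_counts.items():
        gene = promoter_to_gene.get(promoter)
        if gene is not None:  # skip promoters without a gene assignment
            totals[gene] += count
    return dict(totals)


def to_tpm(counts):
    """Scale counts to a per-million value (assumed library-size normalization)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: value * 1e6 / total for name, value in counts.items()}


# Hypothetical toy input: tag counts per promoter region in one library.
promoter_counts = {"p1@geneA": 30, "p2@geneA": 10, "p1@geneB": 60}
promoter_to_gene = {"p1@geneA": "geneA", "p2@geneA": "geneA", "p1@geneB": "geneB"}

counts = gene_counts(promoter_counts, promoter_to_gene)
print(counts)
print(to_tpm(counts))
```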

3.15.4 CAGE Library Correlations

Expression values for the 54,100 genes with at least one CAGE tag across all 240 libraries were correlated for all pairs of CAGE libraries using the Pearson correlation from the “cor” function (“stats” R package 4.0.0). Correlations for four libraries chosen from two different sequencing runs and with two different barcodes were plotted using the “plotCorrelation2” function [20] with the “tagCountThreshold = 1” and “applyThresholdBoth = FALSE” parameters.
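For readers who want to see the calculation spelled out, the sketch below computes a Pearson correlation between two libraries in plain Python. The filter is our reading of the two parameters above, taken as an assumption: a gene is kept when its count reaches the threshold in at least one of the two libraries, or in both if `both=True`. It is not the R code used in the study, and the counts are made-up toy values.

```python
import math


def filter_pairs(x, y, threshold=1, both=False):
    """Keep value pairs that pass the tag-count threshold (assumed semantics)."""
    kept = []
    for a, b in zip(x, y):
        if both:
            passed = a >= threshold and b >= threshold
        else:
            passed = a >= threshold or b >= threshold
        if passed:
            kept.append((a, b))
    return kept


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gene-level counts for two libraries.
lib1 = [0, 5, 12, 0, 40, 3]
lib2 = [0, 6, 10, 1, 38, 0]

pairs = filter_pairs(lib1, lib2, threshold=1, both=False)
xs, ys = zip(*pairs)
print(len(pairs), "genes kept, r =", round(pearson(xs, ys), 3))
```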

3.15.5 Genomic and Epigenomic Gene Classes

DHS_type, genomic, and epigenomic classifications of genes were inherited from the FANTOM CAT annotations [13], and the numbers were plotted for all genes with at least one CAGE tag across all 240 libraries and with available annotations. The genomic classes “short_ncRNA”, “uncertain_coding”, “small_RNA”, and “structural_RNA” were grouped together as “other”. Detailed annotations of all genes are available at https://fantom.gsc.riken.jp/6/suppl/Takahashi_et_al_2020/.

3.15.6 Data Availability

Raw sequencing data generated in this study have been deposited at NCBI under the BioProject accession number PRJNA664583 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA664583). Expression data can be accessed through the FANTOM6 project portal (https://fantom.gsc.riken.jp/6/suppl/Takahashi_et_al_2020/) (see Note 11).

4 Notes

  1. The amount of RNA in each tube should be 25–100 ng. After reverse transcription, the samples are barcoded and can then be mixed, using up to 5 μg of starting RNA per tube. For instance, if the starting RNA amount is 50 ng, 100 samples can be mixed in one tube for subsequent analysis. If starting from 100 ng of RNA, process the pooled library in two tubes. The number of mixed samples depends on how many samples the operator needs to process. As an example, we describe here a mix of 48 samples, each from 50 ng of starting RNA. Users can modify these numbers, taking Note 2 into account.
  2. The pooled solution should contain no more than 5 μg of starting RNA; otherwise, the linkers added in Subheadings 3.9 and 3.11 may be insufficient for linker ligation.
  3. To compensate for dead volume, we advise preparing a premix for the number of samples needed multiplied by a factor of 1.1.
  4. The pooling procedure that starts at step 6 of Subheading 3.1 is needed to collect the remaining molecules and avoid losing any RNA-cDNA hybrids in the wells. We recommend aliquoting no more than 200 μL into each 1.5 mL tube at step 10 of Subheading 3.1 because of the volume limitations of the subsequent RNAClean XP purification step.
  5. DO NOT DRY UP the RNA-cDNA hybrid solutions in the concentrator, because RNAs stick very easily to the surface of tubes when dried.
  6. Biotinylation occurs at both the 5′ end of capped RNAs and the 3′ end of RNA (described in Fig. 2 of the published protocol [4]), but RNase treatment removes the 3′ end biotin.
  7. RNase treatment is critical to digest the RNA of hybrids in which cDNA synthesis did not extend to the 5′ end of the capped RNA (described in Fig. 2 of the published protocol [4]).
  8. When generating multiple libraries with the same barcodes in the RT primers, use a different Index in the 3′ linker for each library so that they can be mixed (see Table 3).
  9. The ideal concentration of one library from 48 mixed samples is 20–23 pM in 10 μL (400 attomole), which is from 50 ng RNA. To make sequencing more cost-effective, an operator can mix equimolar amounts of libraries up to a total of 2250 attomole after quantifying each library as in Subheading 3.13. After mixing, adjust the volume to 19 μL in a SpeedVac vacuum concentrator at 37 °C and repeat Subheading 3.13.
  10. The final library concentration of 15 pM was determined from the flow cell specifications of the HiSeq2500 with 50 bp paired-end sequencing. The concentration should be optimized for the specifications of each Illumina flow cell technology.
  11. Only Read 1 was used for our CAGE data analysis and has been made publicly available.
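The pooling limits in Notes 1 and 2 reduce to simple arithmetic: at most 5 μg of starting RNA per pooled tube. The sketch below, with illustrative function names, computes how many barcoded samples fit in one tube and how many tubes a plate needs.

```python
import math

MAX_POOL_NG = 5000  # upper limit of starting RNA per pooled tube (Notes 1 and 2)


def max_samples_per_tube(rna_ng_per_sample: int) -> int:
    """Largest number of samples that keeps the pool at or below 5 μg."""
    if not 25 <= rna_ng_per_sample <= 100:
        raise ValueError("the protocol recommends 25-100 ng of RNA per sample")
    return MAX_POOL_NG // rna_ng_per_sample


def tubes_needed(n_samples: int, rna_ng_per_sample: int) -> int:
    """Number of pooled tubes required for n_samples."""
    return math.ceil(n_samples / max_samples_per_tube(rna_ng_per_sample))


print(max_samples_per_tube(50))  # samples per tube at 50 ng each
print(tubes_needed(96, 100))     # tubes for a full plate at 100 ng each
print(tubes_needed(48, 50))      # the 48-sample example used in this protocol
```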