Statistical Methods for Transcriptome-Wide Analysis of RNA Methylation by Bisulfite Sequencing

  • Brian J. Parker
Part of the Methods in Molecular Biology book series (MIMB, volume 1562)


Bisulfite conversion is one experimental approach to the transcriptome-wide detection and quantification of 5-methylcytosine (m5C) methylation in RNA. In this chapter we discuss statistical methods, and a corresponding computational pipeline, specialized for this assay, to perform transcriptome-wide differential m5C methylation analysis between RNA samples.

Key words

RNA methylation, Differential methylation, 5-methylcytosine, Epitranscriptomics, Bisulfite conversion, High-throughput sequencing



This work was carried out in collaboration with the lab of Thomas Preiss at the John Curtin School of Medical Research, Australian National University.



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. Department of Biology, New York University, New York, USA
