Detection and Quantification of Alternative Splicing Variants Using RNA-seq

  • Douglas W. Bryant Jr
  • Henry D. Priest
  • Todd C. Mockler
Part of the Methods in Molecular Biology book series (MIMB, volume 883)


Next-generation sequencing has enabled genome-wide studies of alternative pre-mRNA splicing, allowing the expressed RNAs in a sample to be empirically determined, characterized, and quantified in toto. As a result, RNA sequencing (RNA-seq) has shown tremendous power to drive biological discoveries. At the same time, RNA-seq has created novel challenges that necessitate increasingly sophisticated computational approaches and bioinformatic tools. Beyond handling massive datasets, these tools must also support the questions and analytical approaches that such rich data make possible. Because high-throughput sequencing (HTS) and RNA-seq are still evolving very rapidly, they are introduced here only in general terms. This chapter focuses mainly on methods for the discovery, detection, and quantification of alternatively spliced transcript variants.

Key words

  • RNA-seq
  • Bioinformatics
  • Next-generation sequencing
  • Transcript abundance
  • Alternative splicing



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Douglas W. Bryant Jr (1, 2)
  • Henry D. Priest (1, 3)
  • Todd C. Mockler (1, 3), corresponding author
  1. The Donald Danforth Plant Science Center, St. Louis, USA
  2. Intuitive Genomics, Inc., St. Louis, USA
  3. Division of Biology and Biomedical Sciences, Washington University, St. Louis, USA
