Skip to main content

Methods to Study Splicing from High-Throughput RNA Sequencing Data

  • Protocol
  • First Online:
Spliceosomal Pre-mRNA Splicing

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1126))

Abstract

The development of novel high-throughput sequencing (HTS) methods for RNA (RNA-Seq) has provided a very powerful mean to study splicing under multiple conditions at unprecedented depth. However, the complexity of the information to be analyzed has turned this into a challenging task. In the last few years, a plethora of tools have been developed, allowing researchers to process RNA-Seq data to study the expression of isoforms and splicing events, and their relative changes under different conditions. We provide an overview of the methods available to study splicing from short RNA-Seq data, which could serve as an entry point for users who need to decide on a suitable tool for a specific analysis. We also attempt to propose a classification of the tools according to the operations they do, to facilitate the comparison and choice of methods.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Protocol
USD 49.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 139.00
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 109.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Djebali S, Davis CA, Merkel A et al (2012) Landscape of transcription in human cells. Nature 489(7414):101–108

    PubMed Central  PubMed  CAS  Google Scholar 

  2. Wang ET, Sandberg R, Luo S et al (2008) Alternative isoform regulation in human tissue transcriptomes. Nature 456(7221):470–476

    PubMed Central  PubMed  CAS  Google Scholar 

  3. Pan Q, Shai O, Lee LJ et al (2008) Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet 40(12):1413–1415

    PubMed  CAS  Google Scholar 

  4. Chen L (2011) Statistical and computational studies on alternative splicing. In: Horng-Shing Lu H et al (eds) Handbook of statistical bioinformatics. Springer, New York. doi:10.1007/978-3-642-16345-6_2

    Google Scholar 

  5. Pachter L (2011) Models for transcript quantification from RNA-Seq. arXiv:1104.3889v2 (http://arxiv.org/abs/1104.3889)

  6. Trapnell C, Pachter L, Salzberg SL (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25(9):1105–1111

    PubMed Central  PubMed  CAS  Google Scholar 

  7. Huang S, Zhang J, Li R et al (2011) SOAPsplice: genome-wide ab initio detection of splice junctions from RNA-Seq data. Front Genet 2(July):46

    PubMed Central  PubMed  Google Scholar 

  8. Zhang Y, Lameijer EW, ‘t Hoen PA et al (2012) PASSion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-Seq data. Bioinformatics 28(4):479–486

    PubMed Central  PubMed  CAS  Google Scholar 

  9. Wang K, Singh D, Zeng Z et al (2010) MapSplice: accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Res 38(18):e178

    PubMed Central  PubMed  Google Scholar 

  10. Au KF, Jiang H, Lin L et al (2010) Detection of splice junctions from paired-end RNA seq data by SpliceMap. Nucleic Acids Res 38(14):4570–4578

    PubMed Central  PubMed  CAS  Google Scholar 

  11. Dimon MT, Sorber K, DeRisi JL (2010) HMMSplicer: a tool for efficient and sensitive discovery of known and novel splice junctions in RNA-Seq data. PloS one 5(11):e13875

    PubMed Central  PubMed  Google Scholar 

  12. Li Y, Li-Byarlay H, Burns P et al (2013) TrueSight: a new algorithm for splice junction detection using RNA-seq. Nucleic Acids Res 41(4):e51

    PubMed Central  PubMed  CAS  Google Scholar 

  13. Marco-Sola S, Sammeth M, Guigó R et al (2012) The GEM mapper: fast, accurate and versatile alignment by filtration. Nat Methods 9(12):1185–1188

    PubMed  CAS  Google Scholar 

  14. Ameur A, Wetterbom A, Feuk L et al (2010) Global and unbiased detection of splice junctions from RNA-seq data. Genome Biol 11(3):R34

    PubMed Central  PubMed  Google Scholar 

  15. Bryant DW, Shen R, Priest HD et al (2010) Supersplat– spliced RNA-seq alignment. Bioinformatics 26(12):1500–1505

    PubMed Central  PubMed  CAS  Google Scholar 

  16. Wang L, Wang X, Wang X et al (2011) Observations on novel splice junctions from RNA sequencing data. Biochem Biophys Res Commun 409(2):299–303

    PubMed  CAS  Google Scholar 

  17. Lou SK, Ni B, Lo LY et al (2011) ABMapper: a suffix array-based tool for multi-location searching and splice-junction mapping. Bioinformatics 27(3):421–422

    PubMed Central  PubMed  CAS  Google Scholar 

  18. Bao H, Xiong Y, Guo H et al (2009) MapNext: a software tool for spliced and unspliced alignments and SNP detection of short sequence reads. BMC Genomics 10(Suppl 3):S13

    PubMed Central  PubMed  Google Scholar 

  19. Dobin A, Davis CA, Schlesinger F et al (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29(1):15–21

    PubMed Central  PubMed  CAS  Google Scholar 

  20. Wu TD, Nacu S (2010) Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics 26(7):873–881

    PubMed Central  PubMed  CAS  Google Scholar 

  21. De Bona F, Ossowski S, Schneeberger K et al (2008) Optimal spliced alignments of short sequence reads. Bioinformatics 24(16):i174–i180

    PubMed  Google Scholar 

  22. Jean G, Kahles A, Sreedharan VT et al. (2010) RNA-Seq read alignments with PALMapper. Curr Protoc Bioinformat Chapter 11:Unit 11.6

    Google Scholar 

  23. Philippe N, Salson M, Commes T et al (2013) CRAC: an integrated approach to the analysis of RNA-seq reads. Genome Biol 14(3):R30

    PubMed  Google Scholar 

  24. Wu J, Anczuków O, Krainer AR et al (2013) OLego: fast and sensitive mapping of spliced mRNA-Seq reads using small seeds. Nucl Acids Res 41(10):5149–5163

    PubMed Central  PubMed  CAS  Google Scholar 

  25. Liao Y, Smyth GK, Shi W (2013) The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res 41(10):e108

    PubMed Central  PubMed  Google Scholar 

  26. Hu J, Ge H, Newman M, Liu K (2012) OSA: a fast and accurate alignment tool for RNA-Seq. Bioinformatics 28(14):1933–1934

    PubMed  CAS  Google Scholar 

  27. Wood DL, Xu Q, Pearson JV et al (2011) X-MATE: a flexible system for mapping short read data. Bioinformatics 27(4):580–581

    PubMed Central  PubMed  CAS  Google Scholar 

  28. Chen LY, Wei KC, Huang AC et al (2012) RNASEQR—a streamlined and accurate RNA-seq sequence analysis program. Nucleic Acids Res 40(6):e42

    PubMed Central  PubMed  CAS  Google Scholar 

  29. Labaj PP, Linggi BE, Wiley HS et al (2012) Improving RNA-Seq Precision with MapAl. Front Genet 3:28

    PubMed Central  PubMed  Google Scholar 

  30. Xu G, Deng N, Zhao Z et al (2011) SAMMate: a GUI tool for processing short read alignments in SAM/BAM format. Source Code Biol Med 6(1):2

    PubMed Central  PubMed  Google Scholar 

  31. Kim H, Bi Y, Pal S et al (2011) IsoformEx: isoform level gene expression estimation using weighted non-negative least squares from mRNA-seq data. BMC Bioinforma 12:305

    CAS  Google Scholar 

  32. Grant GR, Farkas MH, Pizarro AD et al (2011) Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioinformatics 27(18):2518–2528

    PubMed Central  PubMed  CAS  Google Scholar 

  33. Ryan MC, Cleland J, Kim R et al (2012) SpliceSeq: a resource for analysis and visualization of RNA-Seq data on alternative splicing and its functional impacts. Bioinformatics 28(18):2385–2387

    PubMed Central  PubMed  CAS  Google Scholar 

  34. Tang S, Riva A (2013) PASTA: splice junction identification from RNA-Sequencing data. BMC Bioinforma 14(1):116

    Google Scholar 

  35. Bonfert T, Csaba G, Zimmer R et al (2012) A context-based approach to identify the most likely mapping for RNA-seq experiments. BMC Bioinforma 13(Suppl 6):S9

    Google Scholar 

  36. Wang L, Xi Y, Yu J et al (2010) A statistical method for the detection of alternative splicing using RNA-seq. PLoS one 5(1):e8529

    PubMed Central  PubMed  Google Scholar 

  37. Wu J, Akerman M, Sun S et al (2011) SpliceTrap: a method to quantify alternative splicing under single cellular conditions. Bioinformatics 27:3010–3016

    PubMed Central  PubMed  CAS  Google Scholar 

  38. Katz Y, Wang ET, Airoldi EM et al (2010) Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat Methods 7(12):1009–1015

    PubMed Central  PubMed  CAS  Google Scholar 

  39. Griffith M, Griffith OL, Mwenifumbo J et al (2010) Alternative expression analysis by RNA sequencing. Nat Methods 7(10):843–847

    PubMed  CAS  Google Scholar 

  40. Richard H, Schulz MH, Sultan M et al (2010) Prediction of alternative isoforms from exon expression levels in RNA-Seq experiments. Nucl Acids Res 38(10):e112

    PubMed Central  PubMed  Google Scholar 

  41. Mortazavi A, Williams BA, Mccue K et al (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5(7):1–8

    Google Scholar 

  42. Jiang H, Wong WH (2009) Statistical inferences for isoform expression in RNA-Seq. Bioinformatics 25(8):1026–1032

    PubMed Central  PubMed  CAS  Google Scholar 

  43. Bohnert R, Behr J, Rätsch G (2009) Transcript quantification with RNA-Seq data. BMC Bioinforma 10(Suppl 13):P5

    Google Scholar 

  44. Montgomery SB, Sammeth M, Gutierrez-Arcelus M et al (2010) Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464(7289):773–777

    PubMed  CAS  Google Scholar 

  45. Du J, Leng J, Habegger L et al (2012) IQSeq: integrated isoform quantification analysis based on next-generation sequencing. PLoS One 7(1):e29175

    PubMed Central  PubMed  CAS  Google Scholar 

  46. Trapnell C, Williams BA, Pertea G et al (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28(5):511–515

    PubMed Central  PubMed  CAS  Google Scholar 

  47. Rossell D, Attolini CSO, Kroiss M et al. (2012) Quantifying alternative splicing from paired-end RNA-sequencing data. COBRA Preprint Series. Working Paper 97 http://biostats.bepress.com/cobra/art97

  48. Li W, Jiang T (2012) Transcriptome assembly and isoform expression level estimation from biased RNA-Seq reads. Bioinformatics 28(22):2914–2921

    PubMed Central  PubMed  CAS  Google Scholar 

  49. Li W, Feng J, Jiang T (2011) IsoLasso: a LASSO regression approach to RNA-Seq based transcriptome assembly. J Comput Biol 18(11):1693–1707

    PubMed Central  PubMed  Google Scholar 

  50. Feng J, Li W, Jiang T (2010) Inference of isoforms from short sequence reads. In: Berger B (ed) Research in computational molecular biology, lecture notes in computer science, vol 6044. Springer, Heidelberg, pp 138–157

    Google Scholar 

  51. Li JJ, Jiang CR, Brown JB et al (2011) Sparse linear modeling of next-generation mRNA sequencing (RNA-Seq) data for isoform discovery and abundance estimation. PNAS 108(50):19867–19872

    PubMed Central  PubMed  CAS  Google Scholar 

  52. Roberts A, Pimentel H, Trapnell C et al (2011) Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics 27(17):2325–2329

    PubMed  CAS  Google Scholar 

  53. Mangul S, Caciula A, Glebova O et al (2012) Improved transcriptome quantification and reconstruction from RNA-Seq reads using partial annotations. Silico Biol 11(5):251–261

    CAS  Google Scholar 

  54. Mezlini AM, Smith EJ, Fiume M et al (2013) iReckon: simultaneous isoform discovery and abundance estimation from RNA-seq data. Genome Res 23(3):519–529

    PubMed Central  PubMed  CAS  Google Scholar 

  55. Li B, Dewey CN (2011) RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinforma 12:323

    CAS  Google Scholar 

  56. Nicolae N, Mangul S, Mandoiu I et al (2011) Estimation of alternative splicing isoform frequencies from RNA-seq data. Algorithms Mol Biol 6:9

    PubMed Central  PubMed  Google Scholar 

  57. Lee S, Seo CH, Lim B et al (2011) Accurate quantification of transcriptome from RNA-seq data by effective length normalization. Nucleic Acids Res 39(2):e9

    PubMed Central  PubMed  Google Scholar 

  58. Glaus P, Honkela A, Rattray M (2012) Identifying differentially expressed transcripts from RNA-seq data with biological variation. Bioinformatics 28(13):1721–1728

    PubMed Central  PubMed  CAS  Google Scholar 

  59. Turro E, Su SY, Gonçalves  et al (2011) Haplotype and isoform specific expression estimation using multi-mapping RNA-seq reads. Genome Biol 12(2):R13

    PubMed Central  PubMed  CAS  Google Scholar 

  60. Roberts A, Pachter L (2013) Streaming fragment assignment for real-time analysis of sequencing experiments. Nat Methods 10(1):71–73

    PubMed  CAS  Google Scholar 

  61. Denoeud F, Aury JM, Da Silva C et al (2008) Annotating genomes with massive-scale RNA sequencing. Genome Biol 9(12):R175

    PubMed Central  PubMed  Google Scholar 

  62. Zhao Z, Nguyen T, Deng N et al. (2011) SPATA: a seeding and patching algorithm for de novo transcriptome assembly. 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshop (IEEE BIBMW’11) pp. 26–33

    Google Scholar 

  63. Filichkin S, Priest H, Givan S et al (2010) Genome-wide mapping of alternative splicing in Arabidopsis thaliana. Genome Res 20(1):45–58

    PubMed Central  PubMed  CAS  Google Scholar 

  64. Guttman M, Garber M, Levin JZ et al (2010) Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nat Biotechnol 28(5):503–510

    PubMed Central  PubMed  CAS  Google Scholar 

  65. Hiller D, Wong WH (2012) Simultaneous isoform discovery and quantification from RNA-Seq. Stat Biosci 5(1):100–118

    Google Scholar 

  66. Xia Z, Wen J, Chang CC et al (2011) NSMAP: a method for spliced isoforms identification and quantification from RNA-Seq. BMC Bioinforma 12:162

    CAS  Google Scholar 

  67. Rogers MF, Thomas J, Reddy AS et al (2012) SpliceGrapher: detecting patterns of alternative splicing from RNA-Seq data in the context of gene models and EST data. Genome Biol 13(1):R4

    PubMed Central  PubMed  CAS  Google Scholar 

  68. Seok J, Xu W, Jiang H et al (2012) Knowledge-based reconstruction of mRNA transcripts with short sequencing reads for transcriptome research. PLoS ONE 7(2):e31440

    PubMed Central  PubMed  CAS  Google Scholar 

  69. Behr J, Bohnert R, Zeller G et al (2010) Next generation genome annotation with mGene.ngs. BMC Bioinforma 11(Suppl 10):O8

    Google Scholar 

  70. Stanke M, Schöffmann O, Morgenstern B et al (2006) Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinforma 7:62

    Google Scholar 

  71. Howe KL, Chothia T, Durbin R (2002) GAZE: a generic framework for the integration of gene-prediction data by dynamic programming. Genome Res 12(9):1418–1427

    PubMed Central  PubMed  CAS  Google Scholar 

  72. Allen JE, Salzberg SL (2005) JIGSAW: integration of multiple sources of evidence for gene prediction. Bioinformatics 21(18):3596–3603

    PubMed  CAS  Google Scholar 

  73. Haas BJ, Salzberg SL, Zhu W et al (2008) Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol 9(1):R7

    PubMed Central  PubMed  Google Scholar 

  74. Liu Q, Mackey AJ, Roos DS et al (2008) Evigan: a hidden variable model for integrating gene evidence for eukaryotic gene prediction. Bioinformatics 24(5):597–605

    PubMed  CAS  Google Scholar 

  75. Martin J, Bruno VM, Fang Z et al (2010) Rnnotator: an automated de novo transcriptome assembly pipeline from stranded RNA-Seq reads. BMC Genomics 11:663

    PubMed Central  PubMed  CAS  Google Scholar 

  76. Surget-Groba Y, Montoya-Burgos J (2010) Optimization of de novo transcriptome assembly from next-generation sequencing data. Genome Res 20(10):1432–1440

    PubMed Central  PubMed  CAS  Google Scholar 

  77. Schulz MH, Zerbino DR, Vingron M et al (2012) Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 28(8):1086–1092

    PubMed Central  PubMed  CAS  Google Scholar 

  78. Xie Y, Wu G, Tang J et al. (2013) SOAPdenovo-Trans: De novo transcriptome assembly with short RNA-Seq reads. arXiv:1305.6760 [q-bio.GN] (http://arxiv.org/abs/1305.6760)

  79. Robertson G, Schein J, Chiu R et al (2010) De novo assembly and analysis of RNA-seq data. Nat Methods 7(11):909–912

    PubMed  CAS  Google Scholar 

  80. Grabherr MG, Haas BJ, Yassour M et al (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29(7):644–652

    PubMed Central  PubMed  CAS  Google Scholar 

  81. Sacomoto GA, Kielbassa J, Chikhi R et al (2012) KISSPLICE: de-novo calling alternative splicing events from RNA-seq data. BMC Bioinforma 13(Suppl 6):S5

    Google Scholar 

  82. Anders S, Reyes A, Huber W (2012) Detecting differential usage of exons from RNA-seq data. Genome Res 22(10):2008–2017

    PubMed Central  PubMed  CAS  Google Scholar 

  83. Wang W, Qin Z, Feng Z et al (2013) Identifying differentially spliced genes from two groups of RNA-seq samples. Gene 518(1):164–170

    PubMed  CAS  Google Scholar 

  84. Srivastava S, Chen L (2010) A two-parameter generalized Poisson model to improve the analysis of RNA-seq data. Nucleic Acids Res 38(17):e170

    PubMed Central  PubMed  Google Scholar 

  85. Shen S, Park JW, Huang J et al (2012) MATS: a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data. Nucleic Acids Res 40(8):e61

    PubMed Central  PubMed  CAS  Google Scholar 

  86. Brooks AN, Yang L, Duff MO et al (2011) Conservation of an RNA regulatory map between Drosophila and mammals. Genome Res 21(2):193–202

    PubMed Central  PubMed  CAS  Google Scholar 

  87. Seok J, Xu W, Gao H et al (2012) JETTA: junction and exon toolkits for transcriptome analysis. Bioinformatics 28(9):1274–1275

    PubMed Central  PubMed  CAS  Google Scholar 

  88. Aschoff M, Hotz-Wagenblatt A, Glatting KH et al (2013) SplicingCompass: differential splicing detection using RNA-Seq data. Bioinformatics 29(9):1141–1148

    PubMed  CAS  Google Scholar 

  89. Hu Y, Huang Y, Du Y et al (2013) DiffSplice: the genome-wide detection of differential splicing events with RNA-seq. Nucleic Acids Res 41(2):e39

    PubMed Central  PubMed  CAS  Google Scholar 

  90. Singh D, Orellana CF, Hu Y et al (2011) FDM: a graph-based statistical method to detect differential transcription using RNA-seq data. Bioinformatics 27(19):2633–2640

    PubMed Central  PubMed  CAS  Google Scholar 

  91. Drewe P, Stegle O, Hartmann L et al (2013) Accurate detection of differential RNA processing. Nucl Acids Res 41(10):5189–5198

    PubMed Central  PubMed  CAS  Google Scholar 

  92. Zheng S, Chen L (2009) A hierarchical Bayesian model for comparing transcriptomes at the individual transcript isoform level. Nucleic Acids Res 37(10):e75

    PubMed Central  PubMed  Google Scholar 

  93. Trapnell C, Hendrickson DG, Sauvageau M et al (2013) Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol 31(1):46–53

    PubMed  CAS  Google Scholar 

  94. Leng N, Dawson JA, Thomson JA et al (2013) EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics 29(8):1035–1043

    PubMed  CAS  PubMed Central  Google Scholar 

  95. Fiume M, Williams V, Brook A et al (2010) Savant: genome browser for high-throughput sequencing data. Bioinformatics 26(16):1938–1944

    PubMed Central  PubMed  CAS  Google Scholar 

  96. Liu Q, Chen C, Shen E et al (2012) Detection, annotation and visualization of alternative splicing from RNA-Seq data with SplicingViewer. Genomics 99(3):178–182

    PubMed  CAS  Google Scholar 

  97. Slater GS, Birney E (2005) Automated generation of heuristics for biological sequence comparison. BMC Bioinforma 6:31

    Google Scholar 

  98. Kent WJ (2002) BLAT–the BLAST-like alignment tool. Genome Res 12(4):656–664

    PubMed Central  PubMed  CAS  Google Scholar 

  99. Wu TD, Watanabe CK (2005) GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21(9):1859–1875

    PubMed  CAS  Google Scholar 

  100. Fonseca NA, Rung J, Brazma A et al (2012) Tools for mapping high-throughput sequencing data. Bioinformatics 28(24):3169–3177

    PubMed  CAS  Google Scholar 

  101. Garber M, Grabherr MG, Guttman M et al (2011) Computational methods for transcriptome annotation and quantification using RNA-seq. Nat Methods 8(6):469–477

    PubMed  CAS  Google Scholar 

  102. Schneeberger K, Hagmann J, Ossowski S et al (2009) Simultaneous alignment of short reads against multiple genomes. Genome Biol 10(9):R98

    PubMed Central  PubMed  Google Scholar 

  103. Langmead B, Trapnell C, Pop M et al (2009) Ultrafast and memory efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25

    PubMed Central  PubMed  Google Scholar 

  104. Clark TA, Sugnet CW, Ares M Jr (2002) Genome wide analysis of mRNA processing in yeast using splicing-specific microarrays. Science 296(5569):907–910

    PubMed  CAS  Google Scholar 

  105. Sultan M, Schulz MH, Richard H et al (2008) A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321(5891):956–960

    PubMed  CAS  Google Scholar 

  106. Cloonan N, Forrest ARR, Kolle G et al (2008) Stem cell transcriptome profiling via massive scale mRNA sequencing. Nat Methods 5(7):613–619

    PubMed  CAS  Google Scholar 

  107. Cloonan N, Xu Q, Faulkner GJ et al (2009) RNA-MATE: a recursive mapping strategy for high-throughput RNA-sequencing data. Bioinformatics 25(19):2615–2616

    PubMed Central  PubMed  CAS  Google Scholar 

  108. Tang F, Barbacioru C, Wang Y et al (2009) mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods 6(5):377–382

    PubMed  CAS  Google Scholar 

  109. Chen L (2012) Statistical and computational methods for high-throughput sequencing data analysis of alternative splicing. Stat Biosci 5(1):138–155

    Google Scholar 

  110. Venables JP, Klinck R, Bramard A et al (2008) Identification of alternative splicing markers for breast cancer. Cancer Res 68(22):9525–9531

    PubMed  CAS  Google Scholar 

  111. Li R, Yu C, Li Y et al (2009) SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25(15):1966–1967

    PubMed  CAS  Google Scholar 

  112. Deng N, Puetter A, Zhang K et al (2011) Isoform-level microRNA-155 target prediction using RNA-seq. Nucleic Acids Res 39(9):e61

    PubMed Central  PubMed  CAS  Google Scholar 

  113. Nguyen TC, Deng N, Zhu D (2013) SASeq: a selective and adaptive shrinkage approach to detect and quantify active transcripts using RNA-Seq. arXiv:1208.3619v2 [q-bio.QM] (http://arxiv.org/abs/1208.3619v2)

  114. Heber S, Alekseyev M, Sze SH et al (2002) Splicing graphs and EST assembly problem. Bioinformatics 18(Suppl 1):S181–S188

    PubMed  Google Scholar 

  115. Haas BJ, Delcher AL, Mount SM et al (2003) Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res 31:5654–5666

    PubMed Central  PubMed  CAS  Google Scholar 

  116. Xing Y, Resch A, Lee C (2004) The multiassembly problem: reconstructing multiple transcript isoforms from EST fragment mixtures. Genome Res 14(3):426–441

    PubMed Central  PubMed  CAS  Google Scholar 

  117. Xing Y, Yu T, Wu YN et al (2006) An expectation-maximization algorithm for probabilistic reconstructions of full-length isoforms from splice graphs. Nucleic Acids Res 34(10):3150–3160

    PubMed Central  PubMed  CAS  Google Scholar 

  118. Nagaraj SH, Gasser RB, Ranganathan S (2007) A hitchhiker’s guide to expressed sequence tag (EST) analysis. Brief Bioinform 8(1):6–21

    PubMed  CAS  Google Scholar 

  119. Salzman J, Jiang H, Wong WH (2011) Statistical modeling of RNA-Seq data. Stat Sci 26(1):62–83

    Google Scholar 

  120. Li B, Ruotti V, Stewart R et al (2010) RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26(4):493–500

    PubMed Central  PubMed  Google Scholar 

  121. Sonnenburg S, Schweikert G, Philips P et al (2007) Accurate splice site prediction using support vector machines. BMC Bioinforma 8(Suppl 10):S7

    Google Scholar 

  122. Stanke M, Keller O, Gunduz I et al (2006) AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res 34(Web Server issue):W435–W439

    PubMed Central  PubMed  CAS  Google Scholar 

  123. Guigó R, Flicek P, Abril JF et al (2006) EGASP: the human ENCODE genome annotation assessment project. Genome Biol 7(Suppl 1):S2.1–31

    Google Scholar 

  124. Pontius JU, Wagner L, Schuler GD (2003) UniGene: a unified view of the transcriptome. In: The NCBI Handbook. Bethesda (MD): National Center for Biotechnology Information http://www.ncbi.nlm.nih.gov/books/NBK21083/

  125. Zhao QY, Wang Y, Kong YM et al (2011) Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study. BMC Bioinforma 12(Suppl 14):S2

    CAS  Google Scholar 

  126. Jackson B, Schnable P, Aluru S (2009) Parallel short sequence assembly of transcriptomes. BMC Bioinforma 10(Suppl 1):S14

    Google Scholar 

  127. Vijay N, Poelstra JW, Künstner A et al (2013) Challenges and strategies in transcriptome assembly and differential gene expression quantification. A comprehensive in silico assessment of RNA-seq experiments. Mol Ecol 22(3):620–634

    PubMed  CAS  Google Scholar 

  128. Stegle O, Drewe P, Bohnert R et al (2010) Statistical tests for detecting differential rna-transcript expression from read counts. Nat Preced. doi:10.1038/npre.2010.4437.1

    Google Scholar 

  129. Kakaradov B, Xiong HY, Lee LJ et al (2012) Challenges in estimating percent inclusion of alternatively spliced junctions from RNA-seq data. BMC Bioinforma 13(Suppl 6):S11

    CAS  Google Scholar 

  130. Jiang H, Wong WH (2008) SeqMap: mapping massive amount of oligonucleotides to the genome. Bioinformatics 24(20):2395–2396

    PubMed Central  PubMed  CAS  Google Scholar 

  131. Borgwardt KM, Gretton A, Rasch MJ et al (2006) Integrating structured biological data by Kernel Maximum Mean Discrepancy. Bioinformatics 22(14):e49–e57

    PubMed  CAS  Google Scholar 

  132. Hansen KD, Wu Z, Irizarry RA et al (2011) Sequencing technology does not eliminate biological variability. Nat Biotechnol 29:572–573

    PubMed Central  PubMed  CAS  Google Scholar 

  133. Oshlack A, Robinson MD, Young MD (2010) From RNA-seq reads to differential expression results. Genome Biol 11(12):220. doi:10.1186/gb-2010-11-12-220

    PubMed Central  PubMed  CAS  Google Scholar 

  134. Bhasi A, Philip P, Sreedharan VT et al (2009) AspAlt: A tool for inter-database, inter-genomic and user-specific comparative analysis of alternative transcription and alternative splicing in 46 eukaryotes. Genomics 94(1):48–54

    PubMed  CAS  Google Scholar 

  135. Martelli PL, D’Antonio M, Bonizzoni P et al (2011) ASPicDB: a database of annotated transcript and protein variants generated by alternative splicing. Nucleic Acids Res 39(Database issue):D80–D85

    PubMed Central  PubMed  CAS  Google Scholar 

  136. Karolchik D, Hinrichs AS, Kent WJ (2012) The UCSC Genome Browser. Curr Protoc Bioinformatics Chapter 1:Unit1.4

    Google Scholar 

  137. Donlin MJ. (2009) Using the Generic Genome Browser (GBrowse). Curr Protoc Bioinformatics, Chapter 9:Unit 9.9

    Google Scholar 

  138. Lee E, Harris N, Gibson M et al (2009) Apollo: a community resource for genome annotation editing. Bioinformatics 25:1836–1837

    PubMed  Google Scholar 

  139. Pyrkosz AB, Cheng H, Brown CT. (2013) RNA-Seq Mapping Errors When Using Incomplete Reference Transcriptomes of Vertebrates. arXiv:1303.2411 [q-bio.GN] (http://arxiv.org/abs/1303.2411)

  140. Birzele F, Schaub J, Rust W et al (2010) Into the unknown: expression profiling without genome sequence information in CHO by next generation sequencing. Nucleic Acids Res 38(12):3999–4010

    PubMed Central  PubMed  CAS  Google Scholar 

  141. MacManes MD, Eisen MB (2013) Improving transcriptome assembly through error correction of high-throughput sequence reads. arXiv:1304.0817 [q-bio.GN] (http://arxiv.org/abs/1304.0817) (3/April/2013)

  142. Eyras E, Caccamo M, Curwen V et al (2004) ESTGenes: alternative splicing from ESTs in Ensembl. Genome Res 14(5):976–987

    PubMed Central  PubMed  CAS  Google Scholar 

  143. Lovén J, Orlando DA, Sigova AA et al (2012) Revisiting global gene expression analysis. Cell 151(3):476–482

    PubMed Central  PubMed  Google Scholar 

Download references

Acknowledgements

We thank Y. Xing, K. Hertel, J.R. González, M. Kreitzman, and P. Drewe for comments and suggestions. This work was supported by the Spanish Ministry of Science with grants BIO2011-23920 and CSD2009-00080 and by Sandra Ibarra Foundation for Cancer with grant FSI 2011-035.

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Alamancos, G.P., Agirre, E., Eyras, E. (2014). Methods to Study Splicing from High-Throughput RNA Sequencing Data. In: Hertel, K. (eds) Spliceosomal Pre-mRNA Splicing. Methods in Molecular Biology, vol 1126. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-980-2_26

Download citation

  • DOI: https://doi.org/10.1007/978-1-62703-980-2_26

  • Published:

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-979-6

  • Online ISBN: 978-1-62703-980-2

  • eBook Packages: Springer Protocols

Publish with us

Policies and ethics