Splicing Code Modeling

Part of the Advances in Experimental Medicine and Biology book series (AEMB, volume 825)


How do the cis and trans elements involved in pre-mRNA splicing come together to form a splicing “code”? This question has driven much of the research on RNA biogenesis. The variability of splicing outcomes across developmental stages and between tissues, coupled with the association of splicing defects with numerous diseases, highlights the importance of such a code. However, the sheer number of elements involved in splicing regulation and the context-specific manner in which they operate have made deriving such a code challenging. Recently, machine learning-based methods have been developed to infer computational models of a splicing code. These methods use high-throughput experiments that measure mRNA expression at exonic resolution and the binding locations of RNA-binding proteins (RBPs) to infer which regulatory elements control the inclusion of a given pre-mRNA segment. The inferred regulatory models can then be applied to genomic sequences or experimental conditions that have not been measured in order to predict splicing outcome. Moreover, the models themselves can be interrogated to identify new regulatory mechanisms, which can subsequently be tested experimentally. In this chapter, we survey the current state of this technology and illustrate, using the AVISPA web tool, how researchers who are not computational or RNA splicing experts can apply it to study the regulation of specific exons.
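The general idea of learning a predictive model from sequence features can be sketched in a few lines of Python. This is a toy illustration only, not the method used by AVISPA or by any published splicing code: the motif list, the training sequences, their labels, and all function names are hypothetical, and a plain logistic regression stands in for the richer models discussed in the chapter.

```python
import math

# Hypothetical motif list, chosen only for illustration.
MOTIFS = ["TGCATG", "CTCTCT"]

def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    return sum(1 for i in range(len(seq) - len(motif) + 1)
               if seq[i:i + len(motif)] == motif)

def featurize(seq):
    """Map a sequence to a vector of motif counts (toy cis features)."""
    return [float(count_overlapping(seq, m)) for m in MOTIFS]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, seq):
    """Predicted probability that the exon is included."""
    z = bias + sum(w * x for w, x in zip(weights, featurize(seq)))
    return sigmoid(z)

def train(examples, epochs=2000, lr=0.1):
    """Fit logistic regression by batch gradient descent on (seq, label) pairs."""
    weights = [0.0] * len(MOTIFS)
    bias = 0.0
    data = [(featurize(s), y) for s, y in examples]
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * len(MOTIFS)
        grad_b = 0.0
        for x, y in data:
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Made-up training data: label 1 = exon included, 0 = exon skipped.
TRAIN = [
    ("AATGCATGAATGCATGAA", 1),
    ("GGTGCATGCC", 1),
    ("AACTCTCTAACTCTCTAA", 0),
    ("GGCTCTCTCC", 0),
]

weights, bias = train(TRAIN)

# Apply the fitted model to a sequence that was not in the training set.
p_unseen = predict(weights, bias, "CCTGCATGTT")
```

In this sketch, calling `predict` on an unmeasured sequence corresponds to applying an inferred model to new genomic sequence, and inspecting the sign and size of each entry in `weights` is a crude analogue of interrogating a model for candidate regulatory elements.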


Splicing code; Posttranscriptional regulation; Alternative splicing; Machine learning; Computational biology



The authors would like to thank Matthew Gazzara and Alex Amlie-Wolf for helpful comments and suggestions regarding the manuscript.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Genetics, University of Pennsylvania, University Park, USA
  2. Department of Computer and Information Science, University of Pennsylvania, University Park, USA
