Support Vector Machine Approach for Retained Introns Prediction Using Sequence Features

  • Huiyu Xia
  • Jianning Bi
  • Yanda Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3973)


It is estimated that 40-60% of human genes undergo alternative splicing. Currently, expressed sequence tag (EST) alignment and microarray analysis are the most efficient methods for large-scale detection of alternative splicing events. Because of the inherent limitations of these methods, retained introns are hard to detect with them. It is therefore highly desirable to predict retained introns from their sequence information alone. In this paper, a support vector machine (SVM) is introduced to predict retained introns based solely on their own sequences. It can achieve a total accuracy of 98.54%. No other data, such as ESTs, are required for the prediction. The results indicate that the SVM can achieve acceptable prediction performance for retained introns while effectively rejecting constitutive introns.
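As an illustration of what sequence-only classification of introns can look like, the following minimal sketch trains a linear SVM on dinucleotide frequencies. It is not the authors' implementation: the abstract does not state which sequence features or SVM software were used, so the k-mer features, the Pegasos-style training loop, the function names, and the four toy sequences are all assumptions made for this example, and the sequences are invented and carry no biological meaning.

```python
from itertools import product
import random

ALPHABET = "ACGT"


def kmer_features(seq, k=2):
    """Return normalised k-mer frequencies of a DNA sequence as a list.

    Assumed feature set for illustration only; not taken from the paper.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with N or other symbols
            counts[window] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style stochastic sub-gradient descent on the hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    t = 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1 - eta * lam) * wj for wj in w]  # regularisation shrink
            if margin < 1:  # example violates the margin
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 (retained) or -1 (constitutive) for a feature vector."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: +1 = retained intron, -1 = constitutive intron.
toy_introns = [
    ("GTAAGTGCGCGCCGCGGCAG", 1),
    ("GTGAGCGGCCGCGCGCCCAG", 1),
    ("GTAAGTATTTTATATTTTAG", -1),
    ("GTATGTTTATATTAATTCAG", -1),
]
X = [kmer_features(seq) for seq, _ in toy_introns]
y = [label for _, label in toy_introns]

w, b = train_linear_svm(X, y)
label = predict(w, b, kmer_features("GTAAGTGCGGCGCCGCGCAG"))
print("predicted label:", label)
```

A real experiment would need annotated retained and constitutive introns, a held-out test set, and feature and kernel choices validated against the data; the sketch only shows the overall shape of the pipeline.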


Keywords: Support Vector Machine; Alternative Splice; Prediction Performance; Support Vector Machine Classifier; Alternative Splice Event
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Huiyu Xia (1)
  • Jianning Bi (1)
  • Yanda Li (1)
  1. MOE Key Laboratory of Bioinformatics / Department of Automation, Tsinghua University, Beijing, China
