Splicing-site recognition of rice (Oryza sativa L.) DNA sequences by support vector machines

  • Mechanics & Control Technology
Journal of Zhejiang University-SCIENCE A

Abstract

Motivation: High-accuracy recognition of splicing sites in rice (Oryza sativa L.) DNA sequences has proved especially difficult. We describe a new method for recognizing splicing sites in rice DNA sequences. Method: Relying on the GT-AG rule, to which introns in eukaryotic organisms conform, we used support vector machines (SVM) to predict the splicing sites. We built a model by machine learning and evaluated it on a test data set of true and pseudo splicing sites. Results: The prediction accuracy was 87.53% at the true 5′ end splicing sites and 87.37% at the true 3′ end splicing sites. These results suggest that the SVM approach can achieve higher accuracy than previous approaches.
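The GT-AG rule described above implies that every GT dinucleotide is a candidate 5′ (donor) site and every AG dinucleotide is a candidate 3′ (acceptor) site, with a classifier left to separate true sites from pseudo sites. The Python sketch below illustrates only that candidate-enumeration and feature-encoding step. The abstract does not state the window size, the encoding, or the SVM kernel used by the authors, so the flank length, the one-hot encoding, and all function names here are assumptions chosen for illustration.

```python
from typing import List, Tuple

BASES = "ACGT"


def candidate_sites(seq: str, motif: str) -> List[int]:
    """Return 0-based start positions of every occurrence of motif ('GT' or 'AG')."""
    seq = seq.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


def window(seq: str, pos: int, flank: int) -> str:
    """Return the dinucleotide at pos plus `flank` bases on each side.

    Returns an empty string when the window would run off either end.
    """
    start, end = pos - flank, pos + 2 + flank
    if start < 0 or end > len(seq):
        return ""
    return seq[start:end].upper()


def one_hot(win: str) -> List[int]:
    """Encode each base as four binary features; unknown bases (e.g. N) become zeros."""
    vec: List[int] = []
    for base in win:
        vec.extend(1 if base == b else 0 for b in BASES)
    return vec


def encode_candidates(seq: str, motif: str, flank: int) -> List[Tuple[int, List[int]]]:
    """Pair each in-bounds candidate position with its one-hot feature vector."""
    encoded = []
    for pos in candidate_sites(seq, motif):
        win = window(seq, pos, flank)
        if win:
            encoded.append((pos, one_hot(win)))
    return encoded


# Toy sequence (hypothetical, not rice data); flank=3 is an arbitrary choice.
seq = "ACGGTAAGTCCAGGT"
donors = encode_candidates(seq, "GT", flank=3)
acceptors = encode_candidates(seq, "AG", flank=3)
```

Each feature vector, labelled as a true or pseudo site from an annotated gene set, could then be passed to any SVM implementation for training. That training step is omitted here because the abstract gives no details about it.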

Additional information

Project partially supported by the Start-up Funding of Zhejiang University to Chen Liang-biao.

About this article

Cite this article

Si-hua, P., Long-jiang, F., Xiao-ning, P. et al. Splicing-site recognition of rice (Oryza sativa L.) DNA sequences by support vector machines. J. Zhejiang Univ. Sci. A 4, 573–577 (2003). https://doi.org/10.1631/jzus.2003.0573

  • DOI: https://doi.org/10.1631/jzus.2003.0573