An SVM-Based Approach to Discover MicroRNA Precursors in Plant Genomes

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7104)

Abstract

MicroRNAs (miRNAs) are noncoding RNAs of ~22 nucleotides that play versatile regulatory roles in multicellular organisms. Because cloning methods for miRNA identification are biased towards abundant miRNAs, computational approaches provide a useful complement for identifying miRNAs whose expression is highly tissue- and time-specific. In this paper, we propose a novel Support Vector Machine (SVM) based detector, named MiR-PD, to identify pre-miRNAs in plants. The classifier is built on twelve features of pre-miRNAs: five global features and seven sub-structure features. Trained on 790 plant pre-miRNAs and 7,900 pseudo pre-miRNAs, MiR-PD achieves 96.43% five-fold cross-validation accuracy. Tested on 441 newly identified plant pre-miRNAs and 62,883 pseudo pre-miRNAs, MiR-PD reports an accuracy of 99.71% with 77.55% sensitivity and 99.87% specificity, suggesting that this miRNA detector can feasibly be applied genome-wide to identify novel miRNAs (especially species-specific miRNAs) in plants without relying on phylogenetic conservation.
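
The reported test figures can be cross-checked with a little arithmetic. The sketch below is not taken from the paper: the confusion-matrix counts are back-calculated from the stated percentages and test-set sizes, so they are an assumption, and the function and variable names are illustrative only.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # fraction of real pre-miRNAs recovered
    specificity = tn / (tn + fp)                 # fraction of pseudo pre-miRNAs rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all test sequences classified correctly
    return sensitivity, specificity, accuracy


# Test-set sizes stated in the abstract.
N_POS = 441       # newly identified plant pre-miRNAs
N_NEG = 62_883    # pseudo pre-miRNAs

# ASSUMPTION: the abstract gives only percentages, so these counts are
# reconstructed by rounding; the true counts may differ slightly.
tp = round(0.7755 * N_POS)   # true positives implied by 77.55% sensitivity
tn = round(0.9987 * N_NEG)   # true negatives implied by 99.87% specificity
fn = N_POS - tp
fp = N_NEG - tn

sens, spec, acc = confusion_metrics(tp, fn, tn, fp)
print(f"TP={tp} FN={fn} TN={tn} FP={fp}")
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

Under these reconstructed counts the three percentages agree with one another. Because pseudo pre-miRNAs outnumber real ones by more than 140 to 1 in the test set, overall accuracy is dominated by specificity, which is why it can sit near 99.7% while sensitivity is about 78%.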

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Y., Jin, C., Zhou, M., Zhou, A. (2012). An SVM-Based Approach to Discover MicroRNA Precursors in Plant Genomes. In: Cao, L., Huang, J.Z., Bailey, J., Koh, Y.S., Luo, J. (eds) New Frontiers in Applied Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 7104. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28320-8_26

  • DOI: https://doi.org/10.1007/978-3-642-28320-8_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28319-2

  • Online ISBN: 978-3-642-28320-8

  • eBook Packages: Computer Science, Computer Science (R0)
