Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 294)

Abstract

MicroRNAs (miRNAs) are short (~22 nucleotides), endogenously initiated non-coding RNAs that control gene expression post-transcriptionally, either by degrading target mRNAs or by inhibiting protein translation. Predicting miRNA genes is a challenging step toward understanding post-transcriptional gene regulation. This paper develops a computational method for identifying miRNA precursors.

We propose a machine learning algorithm based on Random Forests (RF) for miRNA prediction. The algorithm relies on a feature set that combines known features with others introduced here for the first time, and it performs better than most well-known miRNA classifiers. When tested on real data, the method achieves 91.3% accuracy, 86% F-measure, 97.2% specificity, 93.4% precision and 79.6% sensitivity. It obtains better results than MiPred (the best RF algorithm currently known in the literature), Triplet-SVM, Virgo and EumiR.
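The five metrics reported above all derive from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The class `ConfusionCounts`, the function names, and the example counts are hypothetical and chosen only for illustration. It assumes the positive class is "real miRNA precursor" and that no denominator is zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive = real miRNA precursor)."""
    tp: int  # real precursors predicted as real
    fp: int  # pseudo precursors predicted as real
    tn: int  # pseudo precursors predicted as pseudo
    fn: int  # real precursors predicted as pseudo


def f_measure(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def metrics(c: ConfusionCounts) -> dict:
    """Standard definitions of the five metrics named in the abstract."""
    precision = c.tp / (c.tp + c.fp)
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    return {
        "accuracy": accuracy,
        "f_measure": f_measure(precision, sensitivity),
        "specificity": specificity,
        "precision": precision,
        "sensitivity": sensitivity,
    }


# Hypothetical counts, NOT taken from the paper.
example = metrics(ConfusionCounts(tp=80, fp=5, tn=95, fn=20))

# Consistency check on the reported figures: the F-measure implied by
# 93.4% precision and 79.6% sensitivity.
implied_f = f_measure(0.934, 0.796)
```

Under these definitions, the reported precision and sensitivity imply an F-measure of about 0.86, which agrees with the 86% stated in the abstract.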

The results indicate that Random Forests are a better alternative to Support Vector Machines (SVM) for miRNA prediction, particularly in terms of accuracy and F-measure.

References

  1. Altschul, S., Gish, W., Miller, W., Myers, E., Lipman, D., et al.: Basic local alignment search tool. Journal of Molecular Biology 215, 403–410 (1990)

  2. Jiang, P., Wu, H., Wang, W., Ma, W., Sun, X., Lu, Z.: Mipred: classification of real and pseudo microrna precursors using random forest prediction model with combined features. Nucleic Acids Research 35, W339–W344 (2007)

  3. Lim, L., Lau, N., Weinstein, E., Abdelhakim, A., Yekta, S., Rhoades, M., Burge, C., Bartel, D.: The micrornas of caenorhabditis elegans. Genes & Development 17, 991 (2003)

  4. Lai, E., Tomancak, P., Williams, R., Rubin, G.: Computational identification of drosophila microrna genes. Genome Biology 4 (2003)

  5. Bonnet, E., Wuyts, J., Rouzé, P., Van de Peer, Y.: Detection of 91 potential conserved plant micrornas in arabidopsis thaliana and oryza sativa identifies important target genes. Proc. Natl. Acad. Sci. USA 101, 11511–11516 (2004)

  6. Jones-Rhoades, M., Bartel, D.: Computational identification of plant micrornas and their targets, including a stress-induced mirna. Molecular Cell 14, 787–799 (2004)

  7. Ng, K., Mishra, S.: De novo svm classification of precursor micrornas from genomic pseudo hairpins using global and intrinsic folding measures. Bioinformatics 23, 1321–1330 (2007)

  8. Sewer, A., Paul, N., Landgraf, P., Aravin, A., Pfeffer, S., Brownstein, M., Tuschl, T., van Nimwegen, E., Zavolan, M.: Identification of clustered micrornas using an ab initio prediction method. BMC Bioinformatics 6 (2005)

  9. Xue, C., Li, F., He, T., Liu, G., Li, Y., Zhang, X.: Classification of real and pseudo microrna precursors using local structure-sequence features and support vector machine. BMC Bioinformatics 6 (2005)

  10. Zheng, Y., Hsu, W., Li Lee, M., Wong, L.: Exploring essential attributes for detecting microRNA precursors from background sequences. In: Dalkilic, M.M., Kim, S., Yang, J. (eds.) VDMB 2006. LNCS (LNBI), vol. 4316, pp. 131–145. Springer, Heidelberg (2006)

  11. Batuwita, R., Palade, V.: Micropred: effective classification of pre-mirnas for human mirna gene prediction. Bioinformatics 25, 989–995 (2009)

  12. Pasaila, D., Mohorianu, I., Sucila, A., Pantiru, S., Ciortuz, L.: Yet another svm for mirna recognition: yasmir. Technical report, Citeseer (2010)

  13. Shiva, K., Faraz, A., Vinod, S.: Prediction of viral microrna precursors based on human microrna precursor sequence and structural features. Virology Journal 6 (2009)

  14. Hofacker, I., Fontana, W., Stadler, P., Bonhoeffer, L., Tacker, M., Schuster, P.: Fast folding and comparison of rna secondary structures. Monatshefte für Chemie/Chemical Monthly 125, 167–188 (1994)

  15. Griffiths-Jones, S.: The microrna registry. Nucleic Acids Research 32, D109–D111 (2004)

  16. Pruitt, K., Maglott, D.: Refseq and locuslink: Ncbi gene-centered resources. Nucleic Acids Research 29, 137–140 (2001)

  17. Bonnet, E., Wuyts, J., Rouzé, P., Van de Peer, Y.: Evidence that microrna precursors, unlike other non-coding rnas, have lower folding free energies than random sequences. Bioinformatics 20, 2911–2917 (2004)

  18. Freyhult, E., Gardner, P.P., Moulton, V.: A comparison of rna folding measures. BMC Bioinformatics 6, 241 (2005)

  19. Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review 5, 3–55 (2001)

  20. van der Burgt, A., Fiers, M.W., Nap, J.P., van Ham, R.C.: In silico mirna prediction in metazoan genomes: balancing between sensitivity and specificity. BMC Genomics 10, 204 (2009)

  21. Loong, S.N.K., Mishra, S.K.: Unique folding of precursor micrornas: Quantitative evidence and implications for de novo identification. Rna 13, 170–187 (2007)

  22. Breiman, L.: Random forests. Machine Learning 45, 5–32 (2001)

  23. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. ACM SIGKDD Explorations Newsletter 11, 10–18 (2009)

  24. Griffiths-Jones, S., Grocock, R., Van Dongen, S., Bateman, A., Enright, A.: mirbase: microrna sequences, targets and gene nomenclature. Nucleic Acids Research 34, D140–D144 (2006)

  25. Chang, C.C., Lin, C.J.: LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2, 27:1–27:27 (2011), Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

Author information

Corresponding author

Correspondence to Sherin M. ElGokhy.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

ElGokhy, S.M., Shibuya, T., Shoukry, A. (2014). Improving miRNA Classification Using an Exhaustive Set of Features. In: Saez-Rodriguez, J., Rocha, M., Fdez-Riverola, F., De Paz Santana, J. (eds) 8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014). Advances in Intelligent Systems and Computing, vol 294. Springer, Cham. https://doi.org/10.1007/978-3-319-07581-5_4

  • DOI: https://doi.org/10.1007/978-3-319-07581-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07580-8

  • Online ISBN: 978-3-319-07581-5

  • eBook Packages: Engineering, Engineering (R0)
