MicroRNAs (miRNAs) are small endogenous non-coding RNAs that post-transcriptionally regulate gene expression in a broad range of organisms. Since the discovery of the first miRNAs, lin-4 and let-7, computational methods have been indispensable complements to experimental approaches for understanding miRNA biology. In this article, we introduce miRHunter, a web-based computational tool that identifies potential miRNA precursors (pre-miRNAs) in genomic sequences using a combined computational method that couples an ab initio method with homology-based and hairpin structure-based methods. miRHunter consists of five modules: 1) a preprocessing module, 2) an evolutionary conservation filter module, 3) a hairpin structure filter module, 4) a support vector machine module that evaluates the preliminary pre-miRNA candidates passed by the two preceding filter modules, and 5) a post-processing module. In testing, miRHunter achieved an average sensitivity (Sn)/specificity (Sp) of 96.16%/93.23% for animals, 96.00%/94.68% for plants, and 95.87%/93.57% overall. miRHunter can complement experimental methods, allowing wet-lab researchers to screen long sequences for putative miRNAs and to pre-test miRNAs of interest. Microarray profiling experiments have shown that clusters of proximal miRNA pairs are generally coexpressed; in future work, we will therefore use clustering or spatial localization information to improve the accuracy of the system. miRHunter is available at http://www.bioinfoworld.com/.
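For readers unfamiliar with the reported metrics, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from binary predictions: Sn = TP / (TP + FN) and Sp = TN / (TN + FP). The function name, the label encoding (1 = real pre-miRNA, 0 = pseudo hairpin), and the toy labels are assumptions made for illustration; they are not taken from miRHunter itself.

```python
import math


def sensitivity_specificity(y_true, y_pred):
    """Return (Sn, Sp) for binary labels.

    Assumed encoding (illustrative only): 1 = real pre-miRNA, 0 = pseudo hairpin.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    # Guard against an empty positive or negative class.
    sn = tp / (tp + fn) if (tp + fn) else math.nan
    sp = tn / (tn + fp) if (tn + fp) else math.nan
    return sn, sp


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
    sn, sp = sensitivity_specificity(y_true, y_pred)
    print(f"Sn = {sn:.2%}, Sp = {sp:.2%}")
```

Reporting both values matters for pre-miRNA prediction because genomic pseudo hairpins typically far outnumber real precursors, so a classifier can score well on one metric while performing poorly on the other.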
Cite this article
Koh, I., Kim, KB. miRHunter: A tool for predicting microRNA precursors based on combined computational method. BioChip J 11, 164–171 (2017). https://doi.org/10.1007/s13206-017-1210-3
- Gene expression
- Combined computational method
- Support vector machine