Abstract
Recently, Next-Generation Sequencing (NGS) has emerged as a revolutionary technique in ‘-omics’ research. The Cancer Genome Atlas (TCGA) is a prominent example, providing a massive amount of sequencing data for miRNA and mRNA. Analysing these data could yield new biological insight. Moreover, a prognostic system based on this newly available sequencing data would be of great help in cancer diagnosis. Hence, in this article, we analyse miRNA sequencing data for the accurate prediction of breast cancer. miRNAs are small non-coding RNAs that have been shown to participate in several carcinogenic processes, acting either as tumor suppressors or as oncogenes. This is why the clinical treatment of breast cancer patients has changed in recent years. Thus, it is of interest to understand the role of miRNAs in the prediction of breast cancer. In this regard, we have developed a technique that uses the Gravitational Search Algorithm to optimize the classification performance of an underlying Support Vector Machine. The proposed technique selects the potential features, in this case miRNAs, in order to achieve better prediction accuracy. In this study, we achieved a classification accuracy of up to 95.29% using automatically selected miRNAs that make up \(\simeq\)1.5% of the whole dataset. The selected miRNAs are then ranked to produce a list. Among the top 15 miRNAs in this list, 6 are associated with breast cancer, 5 are associated with other cancer types, and 4 are unknown miRNAs. The performance of the proposed technique is compared with that of seven other state-of-the-art techniques. Finally, the results are supported by a statistical test and by a biological significance analysis of the selected miRNAs.
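The wrapper idea summarised above (a gravitational search over feature subsets, scored by classifier accuracy) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary variant of the search in which velocities are mapped to bit-flip probabilities through |tanh(v)|, it substitutes a leave-one-out nearest-centroid classifier for the Support Vector Machine so that the example needs only the standard library, and it runs on synthetic data. All function names, parameter values (`g0`, `alpha`, population size) and the toy data are hypothetical.

```python
import math
import random


def centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on masked features.

    Stand-in fitness: the paper scores subsets with an SVM instead.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue  # hold sample i out
            s = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                s[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1

        def dist(label):
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)


def binary_gsa(fitness, n_features, n_agents=10, n_iter=20,
               g0=10.0, alpha=5.0, seed=0):
    """Binary gravitational search; returns the best mask evaluated and its fitness."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_agents)]
    vel = [[0.0] * n_features for _ in range(n_agents)]
    best_mask, best_fit = None, -1.0
    for t in range(n_iter):
        fits = [fitness(p) for p in pos]
        for p, f in zip(pos, fits):
            if f > best_fit:
                best_mask, best_fit = list(p), f
        # Masses: fitness rescaled to [0, 1], then normalised to sum to 1.
        lo, hi = min(fits), max(fits)
        raw = [(f - lo) / (hi - lo) if hi > lo else 1.0 for f in fits]
        total = sum(raw)
        mass = [r / total for r in raw]
        g = g0 * math.exp(-alpha * t / n_iter)  # decaying gravitational constant
        # Only the k fittest agents attract; k shrinks from n_agents towards 1.
        k = max(1, round(n_agents * (1 - t / n_iter)))
        elite = sorted(range(n_agents), key=lambda a: fits[a], reverse=True)[:k]
        new_pos = []
        for i in range(n_agents):
            acc = [0.0] * n_features
            for j in elite:
                if j == i:
                    continue
                r = sum(a != b for a, b in zip(pos[i], pos[j]))  # Hamming distance
                for d in range(n_features):
                    acc[d] += (rng.random() * g * mass[j]
                               * (pos[j][d] - pos[i][d]) / (r + 1e-9))
            row = list(pos[i])
            for d in range(n_features):
                vel[i][d] = rng.random() * vel[i][d] + acc[d]
                if rng.random() < abs(math.tanh(vel[i][d])):
                    row[d] = 1 - row[d]  # flip the bit with probability |tanh(v)|
            new_pos.append(row)
        pos = new_pos
    return best_mask, best_fit


def make_toy_data(n_samples=40, n_features=20, seed=1):
    """Synthetic two-class data in which only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 3.0 * label
        row[1] -= 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
best_mask, best_fit = binary_gsa(lambda m: centroid_accuracy(X, y, m), len(X[0]))
selected = [j for j, keep in enumerate(best_mask) if keep]
print("selected features:", selected, "accuracy:", best_fit)
```

In a setting closer to the one described, the fitness function would be replaced by the cross-validated accuracy of an SVM trained on the selected miRNA columns, and a penalty on subset size could be added if a small subset is desired.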
I. Saha and S.S. Bhowmick are joint first authors and contributed equally.
Acknowledgment
This work was carried out during the tenure of an ERCIM ‘Alain Bensoussan’ Fellowship Programme and was partially supported by the Polish National Science Centre (grant numbers UMO-2013/09/B/NZ2/00121 and 2014/15/B/ST6/05082) and by the COST BM1405 and BM1408 EU actions.
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Saha, I. et al. (2016). Analysis of Next-Generation Sequencing Data of miRNA for the Prediction of Breast Cancer. In: Panigrahi, B., Suganthan, P., Das, S., Satapathy, S. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2015. Lecture Notes in Computer Science, vol 9873. Springer, Cham. https://doi.org/10.1007/978-3-319-48959-9_11
DOI: https://doi.org/10.1007/978-3-319-48959-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48958-2
Online ISBN: 978-3-319-48959-9
eBook Packages: Computer Science, Computer Science (R0)