Abstract
Background: MicroRNAs (miRNAs) are a class of \(\sim\)22-nucleotide endogenous non-coding RNAs that play critical roles across various biological processes. They also play a meaningful role in tumorigenesis. Therefore, the identification of differentially expressed miRNAs is a growing challenge. In this regard, our paper presents an approach for selecting miRNA markers from high-throughput sequencing data using the Least Absolute Shrinkage and Selection Operator (LASSO), the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), and inner classifiers. Results: The proposed method, named after its underlying methods, selects features using regression analysis, evolutionary optimization, and inner classifiers. LASSO is used as the dimensionality reduction method. The CMA-ES optimizer searches the space of miRNA subsets for the subset that yields the best performance with respect to the inner classifiers. We have investigated several inner classifiers to choose the best-performing one for the given objective. Conclusions: The proposed miRNA selection method is applied to real next-generation sequencing data from a United States-based consortium, The Cancer Genome Atlas (TCGA), covering miRNA expression levels in healthy and malignant tissues. Emphasis is placed on determining the miRNAs with differential expression patterns. The performance of the proposed method is compared with other up-to-date methods in terms of classification accuracy. We conclude by analyzing the selected miRNAs that are most relevant for differentiating between sample types. These selected miRNAs have also been validated using different biological significance analyses. Our method reduces the number of miRNAs from several hundred to a few, thereby facilitating more target-oriented experimental validation.
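The two-stage idea in the abstract (a LASSO pre-filter followed by an evolutionary search over feature subsets scored by an inner classifier) can be illustrated with a minimal, standard-library sketch. Everything below is an assumption made for illustration and not the chapter's implementation: the data are synthetic, the inner classifier is a simple nearest-centroid rule, the subset-size penalty and the thresholding of real-valued vectors into subsets are invented, and a plain (1+λ) evolution strategy with a fixed step size stands in for CMA-ES (it performs no covariance or step-size adaptation). All function names are hypothetical.

```python
import random

def make_data(n=60, p=12, informative=(0, 1), seed=0):
    """Synthetic stand-in for an expression matrix (rows: samples, columns: miRNAs)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in informative:
            row[j] += 4.0 * label  # shift informative columns in class 1
        X.append(row)
        y.append(label)
    return X, y

def lasso_cd(X, y, lam=0.1, n_iter=50):
    """LASSO weights by cyclic coordinate descent on standardised columns."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - mean) / sd for v in col])
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    w = [0.0] * p
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            # Correlation of column j with the residual that excludes its own term.
            rho = sum(c * r for c, r in zip(col, resid)) / n + w[j]
            new = max(abs(rho) - lam, 0.0) * (1 if rho > 0 else -1)  # soft threshold
            if new != w[j]:
                delta = new - w[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                w[j] = new
    return w

def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen columns."""
    if not cols:
        return 0.0
    sums = {0: [0.0] * len(cols), 1: [0.0] * len(cols)}
    counts = {0: 0, 1: 0}
    for row, label in zip(X, y):
        counts[label] += 1
        for k, j in enumerate(cols):
            sums[label][k] += row[j]
    correct = 0
    for row, label in zip(X, y):
        dist = {}
        for c in (0, 1):
            held = 1 if c == label else 0  # drop the held-out sample from its own class
            dist[c] = sum(
                (row[j] - (sums[c][k] - held * row[j]) / (counts[c] - held)) ** 2
                for k, j in enumerate(cols)
            )
        correct += (dist[1] < dist[0]) == (label == 1)
    return correct / len(X)

def evolve_subset(X, y, kept, generations=30, offspring=8, sigma=1.0,
                  penalty=0.01, seed=1):
    """(1+lambda) evolution strategy over real vectors; entries > 0 select a column."""
    rng = random.Random(seed)

    def decode(z):
        return [j for j, v in zip(kept, z) if v > 0.0]

    def fitness(z):
        cols = decode(z)
        return loo_accuracy(X, y, cols) - penalty * len(cols)

    parent = [1.0] * len(kept)  # start with every LASSO survivor switched on
    best = fitness(parent)
    for _ in range(generations):
        children = [[v + sigma * rng.gauss(0.0, 1.0) for v in parent]
                    for _ in range(offspring)]
        top_fit, top = max(((fitness(c), c) for c in children), key=lambda t: t[0])
        if top_fit >= best:  # elitist: never accept a worse parent
            best, parent = top_fit, top
    return decode(parent)

X, y = make_data()
weights = lasso_cd(X, y, lam=0.1)
kept = [j for j, w in enumerate(weights) if w != 0.0]  # stage 1: LASSO survivors
selected = evolve_subset(X, y, kept)                   # stage 2: wrapper search
accuracy = loo_accuracy(X, y, selected)
print("LASSO survivors:", kept)
print("Selected subset:", selected)
print("Leave-one-out accuracy:", accuracy)
```

In this sketch the search only ever sees columns that survived the LASSO stage, which is what keeps the wrapper's search space small. Swapping in a different inner classifier only requires replacing `loo_accuracy` with another scoring function of the same signature.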
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Bhowmick, S., Bhattacharjee, D. (2021). MicroRNA-Based Cancer Classification Using Feature Selection Wrapper. In: Chaki, R., Chaki, N., Cortesi, A., Saeed, K. (eds) Advanced Computing and Systems for Security: Volume 14. Lecture Notes in Networks and Systems, vol 242. Springer, Singapore. https://doi.org/10.1007/978-981-16-4294-4_13
DOI: https://doi.org/10.1007/978-981-16-4294-4_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-4293-7
Online ISBN: 978-981-16-4294-4
eBook Packages: Intelligent Technologies and Robotics (R0)