Cancer miRNA biomarkers classification using a new representation algorithm and evolutionary deep learning

Bagheri Khoulenjani, Niousha; Saniee Abadeh, Mohammad; Sarbazi-Azad, Saeed; Jaddi, Najmeh Sadat

doi:10.1007/s00500-020-05366-w

Methodologies and Application
Published: 12 October 2020

Volume 25, pages 3113–3129, (2021)

Soft Computing

Niousha Bagheri Khoulenjani¹,
Mohammad Saniee Abadeh ORCID: orcid.org/0000-0002-7100-2295¹,
Saeed Sarbazi-Azad¹ &
Najmeh Sadat Jaddi¹

Abstract

Cancer diagnosis is undergoing a paradigm shift toward diagnostic panels based on molecular biomarkers. MicroRNA (miRNA) data are among the most important genomic datasets. Because several studies have shown a relationship between miRNAs and cancers, data mining and machine learning methods can be used to extract knowledge from cancer genomic datasets. Although previous work on identifying cancers from miRNAs has made diagnosis possible, the accuracy for some classes is not satisfactory. This research therefore proposes a three-phase method that combines a super-class (meta-label) approach with deep learning to diagnose cancers from miRNAs. The first phase, named Representation learning, consists of partitioning the data into super-classes, creating meta-data, and classifying the super-classes. This phase splits the data into subsets to improve classification accuracy. In other words, the first phase groups labels into meta-labels based on the separability of the classes, and a multi-label learner is then built to predict these meta-labels. In the second phase, feature selection is applied to each super-class to reduce the dimensionality of the problem and to focus the induction algorithm on the features that matter most for predicting the target concept. In the third phase, an evolutionary deep neural network classifies the labels within each super-class. The last two phases are carried out separately for each subset; with five super-classes, five deep neural networks are trained. The experimental results show that the proposed method achieves better results than 19 recent machine learning methods. Although a dataset covering 29 types of cancer makes the convolutional neural network harder to train, the method performs noticeably better than existing methods. A further benefit is a significant reduction in running time compared with other methods.
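The three-phase structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the feature selection criterion or the network architecture, so the sketch assumes a Fisher-style score for phase two and substitutes a plain nearest-centroid classifier for both the super-class learner and the evolutionary deep neural network. The names `fit`, `predict`, `fisher_scores`, `super_of` and the toy data are all hypothetical.

```python
from collections import defaultdict

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest(centroids, x, feats=None):
    """Key of the centroid closest to x, measured on the chosen feature indices."""
    idx = range(len(x)) if feats is None else feats
    return min(centroids,
               key=lambda k: sum((x[i] - centroids[k][i]) ** 2 for i in idx))

def fisher_scores(X, y):
    """Assumed criterion: between-class over within-class variance per feature."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    means = {label: centroid(rows) for label, rows in groups.items()}
    overall = centroid(X)
    scores = []
    for j in range(len(X[0])):
        between = sum(len(groups[c]) * (means[c][j] - overall[j]) ** 2
                      for c in groups)
        within = sum((row[j] - means[c][j]) ** 2
                     for c in groups for row in groups[c])
        scores.append(between / (within + 1e-12))
    return scores

def fit(X, y, super_of, k):
    """Phase 1: super-class router. Phases 2-3: per-super-class features and classifier."""
    meta = [super_of[label] for label in y]
    by_meta = defaultdict(list)
    for row, m in zip(X, meta):
        by_meta[m].append(row)
    router = {m: centroid(rows) for m, rows in by_meta.items()}

    experts = {}
    for m in router:
        Xm = [row for row, mm in zip(X, meta) if mm == m]
        ym = [label for label, mm in zip(y, meta) if mm == m]
        scores = fisher_scores(Xm, ym)
        # Keep the k highest-scoring feature indices for this super-class only.
        feats = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
        by_label = defaultdict(list)
        for row, label in zip(Xm, ym):
            by_label[label].append(row)
        # Stand-in for the per-super-class deep network described in the abstract.
        experts[m] = (feats, {c: centroid(rows) for c, rows in by_label.items()})
    return router, experts

def predict(model, x):
    """Route x to a super-class, then classify it with that super-class's expert."""
    router, experts = model
    feats, label_centroids = experts[nearest(router, x)]
    return nearest(label_centroids, x, feats)

# Toy data (hypothetical): feature 0 separates the super-classes,
# feature 1 separates labels inside "A", feature 2 separates labels inside "B".
X = [[0.0, 0.0, 0.5], [0.1, 0.1, 0.4], [0.0, 1.0, 0.5], [0.1, 0.9, 0.4],
     [5.0, 0.5, 0.0], [5.1, 0.4, 0.1], [5.0, 0.5, 1.0], [5.1, 0.4, 0.9]]
y = ["A1", "A1", "A2", "A2", "B1", "B1", "B2", "B2"]
super_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

model = fit(X, y, super_of, k=1)
print(predict(model, [0.05, 0.8, 0.45]))
print(predict(model, [5.0, 0.5, 0.2]))
```

The point of the sketch is the data flow: each super-class gets its own feature subset and its own classifier, so a sample is scored only on the features that separate the labels inside the subset it was routed to. Here the mapping `super_of` is supplied by hand, whereas the abstract describes deriving the grouping from class separability.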

Notes

http://cancergenome.nih.gov/.

Author information

Authors and Affiliations

Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Jalal Al-e Ahmad, P.O. Box: 14115-194, Tehran, Iran
Niousha Bagheri Khoulenjani, Mohammad Saniee Abadeh, Saeed Sarbazi-Azad & Najmeh Sadat Jaddi

Corresponding author

Correspondence to Mohammad Saniee Abadeh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest regarding this manuscript.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bagheri Khoulenjani, N., Saniee Abadeh, M., Sarbazi-Azad, S. et al. Cancer miRNA biomarkers classification using a new representation algorithm and evolutionary deep learning. Soft Comput 25, 3113–3129 (2021). https://doi.org/10.1007/s00500-020-05366-w

Published: 12 October 2020
Issue Date: February 2021
DOI: https://doi.org/10.1007/s00500-020-05366-w
