
A fine-tuning deep learning with multi-objective-based feature selection approach for the classification of text

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Document classification is becoming increasingly essential given the vast number of documents available in digital libraries, emails, the Internet, etc. Textual records frequently contain non-discriminative (noisy and irrelevant) terms and are also high-dimensional, resulting in higher computing costs and poorer learning performance in Text Classification (TC). Feature selection (FS), which aims to discover discriminative terms or features in the textual data, is one of the most effective remedies for this issue. This paper introduces a novel multi-stage, term-weighting-scheme-based FS model designed for single-label TC that obtains an optimal set of features. We have also developed a hybrid deep learning fine-tuning network based on Bidirectional Long Short-Term Memory (BiLSTM) and a Convolutional Neural Network (CNN) for the classification stage. The FS approach operates in two stages: a filter model in the first stage, and a multi-objective wrapper model, an upgraded version of the Whale Optimization Algorithm (WOA) combined with Particle Swarm Optimization (PSO), in the second. The objective function of the wrapper model follows a tri-objective principle and uses the Pareto-front technique to discover the optimal set of features. In the wrapper model, a novel selection strategy replaces the random choice of a whale. The proposed work is evaluated on four popular benchmark text corpora, two binary-class and two multi-class. The suggested FS technique is compared against classic Machine Learning (ML) and deep learning classifiers, and the experimental results reveal that it is more effective than the competing approaches.
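The paper's implementation is not reproduced on this page; as a rough illustration of the tri-objective Pareto-front step described in the abstract, the sketch below evaluates candidate binary feature masks on three minimised objectives and keeps the non-dominated ones. The concrete objectives used here (classification error, relative subset size, and a correlation-based redundancy term) and all function names are assumptions for illustration, not the authors' exact formulation.

```python
# Minimal sketch of the Pareto-front step in a tri-objective wrapper.
# Assumptions (not from the paper): the three minimised objectives are
# classification error, relative subset size, and mean inter-feature
# correlation as a stand-in redundancy measure.
import numpy as np

def objectives(mask, X, y, clf_error):
    """Evaluate one binary feature mask -> (error, size_ratio, redundancy)."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return (1.0, 0.0, 1.0)          # degenerate empty subset: worst error
    err = clf_error(X[:, idx], y)       # wrapper part: train/validate a classifier
    size = idx.size / mask.size         # fraction of features kept
    if idx.size > 1:
        corr = np.corrcoef(X[:, idx], rowvar=False)
        # mean absolute off-diagonal correlation among selected features
        red = (np.abs(corr).sum() - idx.size) / (idx.size * (idx.size - 1))
    else:
        red = 0.0
    return (err, size, red)

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all <=, at least one <)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(population, scores):
    """Return the non-dominated masks from the current population of whales."""
    return [population[i] for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]
```

In a WOA-PSO hybrid of this kind, each whale encodes a feature mask, the front computed above guides which whales survive, and the paper's novel selection strategy (rather than picking a random whale) would choose the leader used in the position update.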
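Similarly, the abstract names a BiLSTM followed by a CNN as the fine-tuning classifier but does not give layer sizes here. The following Keras sketch shows one plausible arrangement of that hybrid; the vocabulary size, embedding dimension, and all layer widths are assumptions for the sketch, not the paper's reported configuration.

```python
# Illustrative BiLSTM + CNN hybrid text classifier (hyper-parameters assumed).
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB, EMB, N_CLASSES = 20000, 128, 4   # assumed, not from the paper

model = models.Sequential([
    layers.Embedding(VOCAB, EMB),                                   # token embeddings
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),   # contextual encoding
    layers.Conv1D(64, kernel_size=3, activation="relu"),            # local n-gram features
    layers.GlobalMaxPooling1D(),                                    # strongest feature per filter
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Usage: model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=5)
# where x_* are integer token-id sequences padded to a common length.
```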




Availability of Data and Material

The authors confirm that the data supporting the findings of this study are available within the article.

Notes

  1. http://ai.stanford.edu/~amaas/data/sentiment/.

  2. https://www.kaggle.com/kazanova/sentiment140.

  3. http://groups.di.unipi.it/~gulli/AG_corpus_of_news_articles.html.

  4. https://www.kaggle.com/soumikrakshit/yahoo-answers-dataset.


Author information


Corresponding author

Correspondence to Pradip Dhal.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Dhal, P., Azad, C. A fine-tuning deep learning with multi-objective-based feature selection approach for the classification of text. Neural Comput & Applic 36, 3525–3553 (2024). https://doi.org/10.1007/s00521-023-09225-1

