FANS: a framework for feature selection in sentiment classification using a modified Firefly algorithm

  • Research Paper
Evolutionary Intelligence

Abstract

Sentiment classification is a prevalent text mining task in which a text is classified as positive, negative, or neutral. It is an essential input to decision-making for people, companies, and other organizations. Feature selection is the most influential stage in sentiment classification. Because the problem is NP-hard and the volume of existing texts is huge, traditional feature selection techniques, such as statistical techniques, generate sub-optimal solutions. Swarm intelligence algorithms are widely applied to optimization problems. These algorithms select features that increase classification performance while decreasing computational complexity and feature set size. In this study, the authors propose a framework based on a modified multi-objective Firefly algorithm, named FANS (Firefly Algorithm Naïve Bayes Sentiment). The two objectives are to decrease the classification error of the naïve Bayes classifier and that of the k-nearest neighbor classifier. A neural network is used as the final classifier. Three datasets from the movie review and Twitter domains are used to evaluate FANS. FANS outperforms its counterparts in precision, accuracy, and recall, yielding 96.88% precision, 97.65% accuracy, and 96.54% recall.
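The abstract does not spell out how the Firefly algorithm is modified, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the standard firefly movement rule (attractiveness decaying with squared distance), a sigmoid threshold to turn a continuous position into a binary feature mask, and Pareto dominance over two error objectives standing in for the naïve Bayes and k-nearest neighbor errors. All function names, parameter values, and error figures are hypothetical.

```python
import math
import random


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (all minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def move_firefly(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.2, rng=random):
    """Move firefly i toward a brighter firefly j (standard firefly update).

    beta0, gamma, and alpha are illustrative defaults, not values from the paper.
    """
    r2 = sum((a - b) ** 2 for a, b in zip(x_i, x_j))  # squared distance
    beta = beta0 * math.exp(-gamma * r2)              # attractiveness
    return [
        a + beta * (b - a) + alpha * (rng.random() - 0.5)
        for a, b in zip(x_i, x_j)
    ]


def to_mask(position, threshold=0.5):
    """Binarize a position with a sigmoid: 1 keeps the feature, 0 drops it."""
    return [1 if 1.0 / (1.0 + math.exp(-p)) > threshold else 0 for p in position]


# Placeholder objective vectors: (naive Bayes error, k-NN error).
errors_i = (0.20, 0.25)
errors_j = (0.15, 0.25)

x_i = [0.0, 1.0, -1.0]
x_j = [2.0, -1.0, 0.5]
if dominates(errors_j, errors_i):  # j is "brighter", so i moves toward j
    x_i = move_firefly(x_i, x_j)
mask = to_mask(x_i)
```

In a full wrapper method, each mask would be scored by training the two classifiers on the selected features; here the error tuples are fixed placeholders so the sketch stays self-contained.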

Data availability

All datasets are available from: https://www.cs.cornell.edu/people/pabo/movie-review-data/ and https://www.tensorflow.org/datasets/catalog/sentiment140.

Acknowledgements

The authors thank the Aghigh Institute of Higher Education, Shahinshahr, for supporting this study under Grant #40004.

Funding

Not applicable.

Author information

Contributions

RA implemented the work. RA and AM structured the article and carried out its final review.

Corresponding author

Correspondence to Razieh Asgarnezhad.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Asgarnezhad, R., Monajemi, A. FANS: a framework for feature selection in sentiment classification using a modified Firefly algorithm. Evol. Intel. (2023). https://doi.org/10.1007/s12065-023-00887-3

  • DOI: https://doi.org/10.1007/s12065-023-00887-3
