Hybrid Filter–Wrapper Feature Selection Method for Sentiment Classification

  • Research Article - Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

Feature selection (FS) has recently emerged as a challenge in sentiment classification. Filter- and wrapper-based FS methods are applied in this domain to reduce the size of the feature set and to increase classifier accuracy. In this paper, a hybrid filter–wrapper method for selecting relevant features is proposed. A feature subset is first selected from the original feature set using computationally fast rank-based FS methods. The selected features are then refined using two wrapper approaches. In the first approach, recursive feature elimination is applied to select the optimal feature set; in the second, an evolutionary method based on binary particle swarm optimization is applied to finalize the feature subset. The two proposed techniques are compared on five datasets from different domains that are used in sentiment analysis. Simple and efficient machine learning algorithms (Naïve Bayes, support vector machine and logistic regression) are used to evaluate the performance of the hybrid FS techniques. Finally, the proposed hybrid FS technique is assessed by comparing its results with state-of-the-art methods. The results show that the proposed method achieves better accuracy with fewer features.
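The two-stage idea described in the abstract can be illustrated with a minimal sketch, which should not be read as the authors' implementation. It assumes binary term-presence features, a chi-square score as the rank-based filter, and an intercept-free logistic regression as the estimator whose weights drive recursive feature elimination. The toy reviews, the function names (`chi2_scores`, `filter_top_k`, `fit_logreg`, `rfe`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def chi2_scores(docs, labels):
    """Chi-square score of each term's presence against a binary label (1/0)."""
    n, n_pos = len(docs), sum(labels)
    scores = {}
    for term in sorted({t for doc in docs for t in doc}):
        a = sum(1 for doc, y in zip(docs, labels) if term in doc and y == 1)
        b = sum(1 for doc, y in zip(docs, labels) if term in doc and y == 0)
        c, d = n_pos - a, (n - n_pos) - b  # term absent in positive / negative docs
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def filter_top_k(docs, labels, k):
    """Filter stage: keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi2_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def fit_logreg(docs, labels, terms, lr=0.1, epochs=200):
    """Intercept-free logistic regression on term presence (batch gradient ascent)."""
    w = {t: 0.0 for t in terms}
    for _ in range(epochs):
        grad = {t: 0.0 for t in terms}
        for doc, y in zip(docs, labels):
            z = sum(w[t] for t in terms if t in doc)
            err = y - 1.0 / (1.0 + math.exp(-z))  # label minus predicted probability
            for t in terms:
                if t in doc:
                    grad[t] += err
        for t in terms:
            w[t] += lr * grad[t]
    return w

def rfe(docs, labels, terms, n_keep):
    """Wrapper stage: refit, drop the term with the smallest |weight|, repeat."""
    terms = list(terms)
    while len(terms) > n_keep:
        w = fit_logreg(docs, labels, terms)
        terms.remove(min(terms, key=lambda t: abs(w[t])))
    return terms

# Hypothetical toy corpus: each review is a set of tokens; 1 = positive, 0 = negative.
docs = [
    {"great", "movie", "plot"}, {"great", "acting", "film"},
    {"good", "movie", "film"}, {"great", "good", "plot"},
    {"bad", "movie", "plot"}, {"bad", "acting", "film"},
    {"awful", "movie", "film"}, {"bad", "awful", "plot"},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

candidates = filter_top_k(docs, labels, k=4)         # cheap rank-based filter
selected = rfe(docs, labels, candidates, n_keep=2)   # costlier wrapper refinement
print(candidates, selected)
```

The filter stage discards terms whose presence is independent of the label, so the wrapper only has to refit the model over a small candidate set. The second wrapper named in the abstract, binary particle swarm optimization, would replace `rfe` with a search over bit masks of `candidates`; it is not sketched here.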

Corresponding author

Correspondence to Gunjan Ansari.

Cite this article

Ansari, G., Ahmad, T. & Doja, M.N. Hybrid Filter–Wrapper Feature Selection Method for Sentiment Classification. Arab J Sci Eng 44, 9191–9208 (2019). https://doi.org/10.1007/s13369-019-04064-6

  • DOI: https://doi.org/10.1007/s13369-019-04064-6
