Abstract
Sentiment analysis is a field of study that focuses on identifying the opinions expressed in text and classifying them into positive, negative, and neutral polarities. Feature selection is a crucial step in machine learning. In this paper, we study the performance of different feature selection techniques for sentiment analysis. Term Frequency-Inverse Document Frequency (TF-IDF) is used as the feature extraction technique to create the feature vocabulary. Various feature selection (FS) techniques are evaluated to select the best set of features from this vocabulary. The selected features are used to train different machine learning classifiers: Logistic Regression (LR), Support Vector Machines (SVM), Decision Tree (DT), and Naive Bayes (NB). The ensemble techniques Bagging and Random Subspace are applied to these classifiers to improve their performance on sentiment analysis. We show that classifiers trained with ensemble methods on the features chosen by the best FS techniques achieve remarkable results on sentiment analysis. We also compare FS methods trained using Bagging and Random Subspace against a range of neural network architectures. We show that FS techniques trained using ensemble classifiers outperform the neural networks while requiring significantly less training time and fewer parameters, thereby eliminating the need for extensive hyper-parameter tuning.
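To make the pipeline described above more concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation. The abstract does not say which FS techniques were used, so chi-square ranking on term presence is an assumption chosen for illustration, as are the smoothed IDF formula, the toy documents, and all function names. Only the sampling steps of Bagging (rows drawn with replacement) and Random Subspace (a random subset of features) are shown; training a base classifier on each sample is omitted.

```python
import math
import random
from collections import Counter


def tfidf_matrix(docs):
    """Build a sorted vocabulary and a dense TF-IDF matrix (smoothed IDF, an assumption)."""
    tokens = [doc.lower().split() for doc in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    n_docs = len(tokens)
    doc_freq = Counter(t for doc in tokens for t in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    matrix = []
    for doc in tokens:
        counts = Counter(doc)
        matrix.append([counts[t] * idf[t] for t in vocab])
    return vocab, matrix


def chi2_scores(matrix, labels):
    """Chi-square statistic of term presence against a binary label (illustrative FS choice)."""
    n = len(matrix)
    n_pos = sum(labels)
    scores = []
    for j in range(len(matrix[0])):
        # 2x2 contingency table: term present/absent vs. positive/negative class
        a = sum(1 for row, y in zip(matrix, labels) if row[j] > 0 and y == 1)
        b = sum(1 for row, y in zip(matrix, labels) if row[j] > 0 and y == 0)
        c, d = n_pos - a, (n - n_pos) - b
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores.append(n * (a * d - b * c) ** 2 / denom if denom else 0.0)
    return scores


def select_top_k(vocab, scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(zip(vocab, scores), key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in ranked[:k]]


def bagging_sample(n_rows, rng):
    """Bagging: draw n_rows training indices with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_subspace_sample(n_features, fraction, rng):
    """Random Subspace: draw a feature subset without replacement."""
    size = max(1, int(n_features * fraction))
    return sorted(rng.sample(range(n_features), size))


# Hypothetical toy corpus: 1 = positive, 0 = negative
docs = [
    "great movie loved it",
    "great acting loved the plot",
    "terrible movie hated it",
    "terrible plot hated the acting",
]
labels = [1, 1, 0, 0]

vocab, matrix = tfidf_matrix(docs)
scores = chi2_scores(matrix, labels)
selected = select_top_k(vocab, scores, k=4)

rng = random.Random(0)
row_idx = bagging_sample(len(docs), rng)                    # rows for one bagged learner
col_idx = random_subspace_sample(len(selected), 0.5, rng)   # features for one subspace learner
print(selected)
```

On this toy corpus the four selected terms are the ones that occur in only one class, while terms shared across classes (such as "movie" or "plot") score zero. In an ensemble, each base classifier would be trained on its own `row_idx` or `col_idx` sample and the predictions combined by voting.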
Notes
Code will be available at the repository https://github.com/avinashsai/MTAP
Train and test splits can be found at https://github.com/avinashsai/Cross-domain-sentiment-analysis/tree/master/Dataset/Actualdata
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Cite this article
Madasu, A., Elango, S. Efficient feature selection techniques for sentiment analysis. Multimed Tools Appl 79, 6313–6335 (2020). https://doi.org/10.1007/s11042-019-08409-z
DOI: https://doi.org/10.1007/s11042-019-08409-z