Abstract
With the rapid growth of web and mobile technology, social networking services such as Twitter are widely used, and large amounts of data are generated on them daily. Efficient sentiment analysis of such data is important for a range of applications, and improving the accuracy of sentiment detection is the main aim of this research. This report examines the combination of a Chi-Squared feature selection algorithm, k-means clustering, and TF-IDF attribute weighting, built on Naïve Bayes, for classifying text and sentiment in communications generated on Twitter. This approach is compared with other approaches based on Naïve Bayes to give an account of their relative strengths and weaknesses. In experiments on multi-domain Twitter datasets, results indicate that the proposed method shows superior performance across a range of. Specifically, the research aims to enhance the performance of the Naïve Bayes classifier using a feature selection technique.
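To illustrate the Chi-Squared feature selection step named in the abstract, the following is a minimal sketch only. The abstract does not give the authors' exact formulation, so the sketch assumes the standard chi-squared statistic over a 2x2 term/class contingency table for a two-class problem. The function names `chi_square_scores` and `select_top_k` and the toy tweets are hypothetical. The k-means, TF-IDF, and Naïve Bayes stages are not shown.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term with the chi-squared statistic of its 2x2 term/class table.

    docs   : list of token lists (one per tweet)
    labels : list of class labels, exactly two distinct values (assumption)
    """
    n = len(docs)
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    pos = classes[1]
    n_pos = sum(1 for y in labels if y == pos)
    n_neg = n - n_pos

    # Document frequency of each term within each class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (df_pos if y == pos else df_neg).update(set(tokens))

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # docs of class `pos` containing the term
        b = df_neg[term]   # docs of the other class containing the term
        c = n_pos - a      # docs of class `pos` without the term
        d = n_neg - b      # docs of the other class without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    docs = [
        ["love", "this", "phone"], ["love", "the", "camera"], ["love", "it"],
        ["hate", "this", "phone"], ["hate", "the", "battery"], ["hate", "it"],
    ]
    labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
    print(select_top_k(chi_square_scores(docs, labels), 2))
```

In this toy data, terms that occur equally often in both classes (such as "this") score zero, while terms confined to one class score highest. Only the selected terms would then be passed on to weighting and classification.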
Acknowledgement
We are grateful to Mrs. Angelika Maag for proofreading and correcting this article. Without her support, it would not have been possible to submit it in its current form.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Paudel, S., Prasad, P.W.C., Alsadoon, A., Islam, M.R., Elchouemi, A. (2019). Feature Selection Approach for Twitter Sentiment Analysis and Text Classification Based on Chi-Square and Naïve Bayes. In: Abawajy, J., Choo, KK., Islam, R., Xu, Z., Atiquzzaman, M. (eds) International Conference on Applications and Techniques in Cyber Security and Intelligence ATCI 2018. ATCI 2018. Advances in Intelligent Systems and Computing, vol 842. Springer, Cham. https://doi.org/10.1007/978-3-319-98776-7_30
DOI: https://doi.org/10.1007/978-3-319-98776-7_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98775-0
Online ISBN: 978-3-319-98776-7
eBook Packages: Intelligent Technologies and Robotics (R0)