Feature Selection Approach for Twitter Sentiment Analysis and Text Classification Based on Chi-Square and Naïve Bayes

  • Conference paper
International Conference on Applications and Techniques in Cyber Security and Intelligence ATCI 2018 (ATCI 2018)

Abstract

With the rapid growth of web and mobile technology, social networking services such as Twitter are widely used, and large amounts of data are generated on them daily. Efficient sentiment analysis of such data is important for a range of applications, and improving the accuracy of sentiment detection is the main aim of this research. This report examines the combination of a Chi-Squared feature selection algorithm, k-means clustering, and TF-IDF attribute weighting with a Naïve Bayes classifier for classifying text and sentiment in communications generated on Twitter. The approach is compared with other Naïve Bayes-based approaches to give an account of their relative strengths and weaknesses. In experiments on multi-domain Twitter datasets, results indicate that the proposed method shows superior performance across a range of. Specifically, the research aims to enhance the performance of the Naïve Bayes classifier using a feature selection technique.
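As a rough illustration of the Chi-Squared feature selection step named in the abstract, the sketch below scores each term against one sentiment class using the standard 2×2 contingency statistic, χ² = N(AD − BC)² / ((A + B)(C + D)(A + C)(B + D)), and keeps the highest-scoring terms. It is not the authors' implementation: the function names, the whitespace tokenization, and the four toy tweets are assumptions made for this example, and the k-means, TF-IDF, and Naïve Bayes stages are not shown.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive):
    """Score each term by the 2x2 chi-square statistic against one class."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos

    # Document frequencies per class (a term counts once per document).
    df_pos, df_neg = Counter(), Counter()
    for text, y in zip(docs, labels):
        terms = set(text.lower().split())  # assumed tokenizer: whitespace split
        (df_pos if y == positive else df_neg).update(terms)

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # positive docs containing the term
        b = df_neg[term]   # negative docs containing the term
        c = n_pos - a      # positive docs without the term
        d = n_neg - b      # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A zero denominator means the term carries no class information.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Hypothetical toy data, not taken from the paper's datasets.
docs = ["good great phone", "great battery", "bad phone", "bad awful battery"]
labels = ["pos", "pos", "neg", "neg"]

scores = chi_square_scores(docs, labels, positive="pos")
selected = select_top_k(scores, 2)
print(selected)
```

On this toy corpus, terms that appear equally often in both classes (such as "phone") score zero, while terms confined to one class rank highest, which is the behaviour a feature selection step relies on before the classifier is trained.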



Acknowledgement

We are grateful to Mrs. Angelika Maag for proofreading and making corrections to this article. Without her support, it would not have been possible to submit it in its current form.

Author information

Corresponding author

Correspondence to S. Paudel.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Paudel, S., Prasad, P.W.C., Alsadoon, A., Islam, M.R., Elchouemi, A. (2019). Feature Selection Approach for Twitter Sentiment Analysis and Text Classification Based on Chi-Square and Naïve Bayes. In: Abawajy, J., Choo, KK., Islam, R., Xu, Z., Atiquzzaman, M. (eds) International Conference on Applications and Techniques in Cyber Security and Intelligence ATCI 2018. ATCI 2018. Advances in Intelligent Systems and Computing, vol 842. Springer, Cham. https://doi.org/10.1007/978-3-319-98776-7_30
