An Innovative ‘Cluster-then-Predict’ Approach for Improved Sentiment Prediction

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 452)

Abstract

Sentiment analysis is a field related to data mining in which subjective information is extracted from source materials such as Twitter, blogs, and newspaper articles. Twitter data gives companies an opportunity to analyze the sentiment that customers or potential users hold towards their products. This paper presents an innovative sentiment prediction approach in which the data are first clustered using K-means clustering, and the CART algorithm is then applied to each cluster to classify the tweets as positive or negative. The results of this ‘cluster-then-predict’ approach point towards improved overall prediction accuracy, with a larger collected sample leading to better clustering and improved classification within each cluster. Clustering the data also provides useful insights that help companies gauge consumer sentiment. For the purpose of this paper, tweets related to ‘Windows 10’, which Microsoft launched on July 29, 2015, were extracted.
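The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is written in Python using only the standard library, the toy tweets and labels are invented for illustration, the function names are hypothetical, and the choices of k = 2 clusters and a maximum tree depth of 3 are assumptions. The tree builder is a simplified CART-style learner (binary splits chosen by Gini impurity, no pruning).

```python
import random
from collections import Counter

def bag_of_words(texts, vocab=None):
    """Turn each text into a word-count vector over a shared vocabulary."""
    docs = [Counter(t.lower().split()) for t in texts]
    if vocab is None:
        vocab = sorted({w for d in docs for w in d})
    return [[d[w] for w in vocab] for d in docs], vocab

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, v):
    """Index of the centroid closest to vector v."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], v))

def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns the k centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(centroids, v)].append(v)
        for i, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, depth=0, max_depth=3):
    """Simplified CART: a leaf is a label, a node is (feature, threshold, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == max_depth or len(set(y)) == 1:
        return majority
    best = None
    for j in range(len(X[0])):
        # Dropping the largest value guarantees both sides of the split are non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None or best[0] >= gini(y):
        return majority  # no split reduces impurity
    _, j, t = best
    lo = [(r, lab) for r, lab in zip(X, y) if r[j] <= t]
    hi = [(r, lab) for r, lab in zip(X, y) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in lo], [lab for _, lab in lo], depth + 1, max_depth),
            build_tree([r for r, _ in hi], [lab for _, lab in hi], depth + 1, max_depth))

def predict_tree(tree, row):
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def fit(texts, labels, k=2):
    """Cluster the tweets, then train one tree per cluster."""
    X, vocab = bag_of_words(texts)
    centroids = kmeans(X, k)
    trees = {}
    for c in range(k):
        idx = [i for i, v in enumerate(X) if nearest(centroids, v) == c]
        if idx:
            trees[c] = build_tree([X[i] for i in idx], [labels[i] for i in idx])
    fallback = Counter(labels).most_common(1)[0][0]
    return vocab, centroids, trees, fallback

def predict(model, texts):
    """Route each tweet to its nearest cluster and use that cluster's tree."""
    vocab, centroids, trees, fallback = model
    X, _ = bag_of_words(texts, vocab)
    return [predict_tree(trees[c], v) if c in trees else fallback
            for v in X for c in [nearest(centroids, v)]]

# Invented example tweets; not data from the paper.
train_texts = ["love the new start menu", "upgrade was smooth and fast",
               "update broke my drivers", "hate the forced update",
               "great free upgrade", "install failed again"]
train_labels = ["positive", "positive", "negative", "negative",
                "positive", "negative"]
model = fit(train_texts, train_labels, k=2)
print(predict(model, ["smooth upgrade", "drivers failed"]))
```

Training a separate classifier per cluster lets each tree specialize on tweets that share vocabulary, which is the intuition behind the accuracy gain the abstract reports; with a sample this small the sketch only demonstrates the mechanics, not that gain.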

Author information

Corresponding author

Correspondence to Rishabh Soni.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Soni, R., James Mathai, K. (2016). An Innovative ‘Cluster-then-Predict’ Approach for Improved Sentiment Prediction. In: Choudhary, R., Mandal, J., Auluck, N., Nagarajaram, H. (eds) Advanced Computing and Communication Technologies. Advances in Intelligent Systems and Computing, vol 452. Springer, Singapore. https://doi.org/10.1007/978-981-10-1023-1_13

  • DOI: https://doi.org/10.1007/978-981-10-1023-1_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-1021-7

  • Online ISBN: 978-981-10-1023-1

  • eBook Packages: Engineering, Engineering (R0)
