Abstract
Sentiment analysis is a field related to data mining in which subjective information is extracted from source materials such as tweets, blogs, and newspaper articles. Twitter data gives companies an opportunity to analyze the sentiment that customers or potential users have towards their products. This paper presents an innovative sentiment prediction approach in which the data is first clustered using K-means clustering and the CART algorithm is then applied to each cluster to classify the tweets as positive or negative. The results of this ‘cluster-then-predict’ approach point towards improved overall prediction accuracy, with a larger collected sample leading to better clustering and improved classification within each cluster. Clustering the data also provides useful insights that help companies gauge consumer sentiment. For the purpose of this paper, tweets related to ‘Windows 10’, which Microsoft launched on July 29, 2015, have been extracted.
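The cluster-then-predict idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it is a pure-Python toy that assumes a bag-of-words representation, a plain Lloyd-style K-means, and a small Gini-based binary tree in the spirit of CART. The example tweets, the choice of `k=2`, the depth limit `max_depth=3`, and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

# Hypothetical toy tweets (NOT the paper's data); label 1 = positive, 0 = negative.
TWEETS = [
    ("love the new start menu", 1),
    ("the start menu is great", 1),
    ("start menu looks bad", 0),
    ("hate the forced update", 0),
    ("update broke my drivers", 0),
    ("update was fast and smooth", 1),
    ("cortana is great love it", 1),
    ("cortana is useless hate it", 0),
]

def vectorize(text, vocab):
    """Bag-of-words count vector for one text."""
    words = text.split()
    return [words.count(w) for w in vocab]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(row, centers):
    """Index of the closest cluster centre."""
    return min(range(len(centers)), key=lambda c: sq_dist(row, centers[c]))

def kmeans(rows, k, iters=20, seed=0):
    """Plain Lloyd-style K-means; returns the k cluster centres."""
    centers = random.Random(seed).sample(rows, k)
    for _ in range(iters):
        assign = [nearest(r, centers) for r in rows]
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:  # keep the old centre if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, depth=0, max_depth=3):
    """Greedy binary tree: pick the split with the lowest weighted Gini impurity."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: a plain class label
    best = None
    for f in range(len(rows[0])):
        # Dropping the largest value guarantees both sides of a split are non-empty.
        for thr in sorted({r[f] for r in rows})[:-1]:
            left = [i for i, r in enumerate(rows) if r[f] <= thr]
            right = [i for i, r in enumerate(rows) if r[f] > thr]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, thr, left, right)
    if best is None:  # all rows identical: nothing to split on
        return majority
    _, f, thr, left, right = best
    return (f, thr,
            build_tree([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth))

def predict_tree(tree, row):
    """Walk from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree

def fit(tweets, k=2):
    """Cluster the tweets, then train one tree per non-empty cluster."""
    texts = [t for t, _ in tweets]
    labels = [y for _, y in tweets]
    vocab = sorted({w for t in texts for w in t.split()})
    rows = [vectorize(t, vocab) for t in texts]
    centers = kmeans(rows, k)
    assign = [nearest(r, centers) for r in rows]
    trees = {}
    for c in set(assign):
        idx = [i for i, a in enumerate(assign) if a == c]
        trees[c] = build_tree([rows[i] for i in idx], [labels[i] for i in idx])
    default = Counter(labels).most_common(1)[0][0]  # fallback for an empty cluster
    return vocab, centers, trees, default

def predict(model, text):
    """Route a new tweet to its nearest cluster and apply that cluster's tree."""
    vocab, centers, trees, default = model
    row = vectorize(text, vocab)
    return predict_tree(trees.get(nearest(row, centers), default), row)

if __name__ == "__main__":
    model = fit(TWEETS, k=2)
    for text, label in TWEETS:
        print(text, "| true:", label, "| predicted:", predict(model, text))
```

Calling `fit(TWEETS, k=2)` followed by `predict(model, "some new tweet")` returns 0 or 1. Words outside the training vocabulary are ignored by `vectorize`, which is a simplification of this sketch rather than something the abstract specifies.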
© 2016 Springer Science+Business Media Singapore
About this paper
Cite this paper
Soni, R., James Mathai, K. (2016). An Innovative ‘Cluster-then-Predict’ Approach for Improved Sentiment Prediction. In: Choudhary, R., Mandal, J., Auluck, N., Nagarajaram, H. (eds) Advanced Computing and Communication Technologies. Advances in Intelligent Systems and Computing, vol 452. Springer, Singapore. https://doi.org/10.1007/978-981-10-1023-1_13
DOI: https://doi.org/10.1007/978-981-10-1023-1_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-1021-7
Online ISBN: 978-981-10-1023-1
eBook Packages: Engineering, Engineering (R0)