Abstract
This paper analyses Twitter data to detect the political leaning of a profile by extracting and classifying the sentiments expressed in its tweets. The work combines natural language processing with sentiment analysis algorithms and machine learning techniques to classify specific keywords. The proposed methodology first pre-processes the data, then applies multi-aspect sentiment analysis to compute a sentiment score for each extracted keyword, and finally classifies users into clusters based on their similarity score with respect to a sample user in each cluster. The technique also predicts the sentiment of a profile towards unknown keywords and gauges the bias of an unidentified user towards political events or social issues. It was tested on a Twitter dataset of 1.72 million tweets taken from over 10,000 profiles, where it identified the political leniency of the user profiles at a 99% confidence level. It was also tested on a synthetic dataset of 2500 tweets, where the prediction accuracy and F1 score were 0.99 and 0.985 respectively, and 0.97 and 0.975 when neutral users were also considered for classification. The technique could also identify the impact of political decisions on the clusters by analysing the shift in the number of users belonging to each cluster.
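The pipeline outlined in the abstract (pre-processing, per-keyword sentiment scoring, then assignment to the cluster whose sample user is most similar) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy lexicon, the keyword list, the use of cosine similarity as the similarity score, the "neutral" fallback, and all function and cluster names are assumptions made purely for illustration.

```python
import math
import re
from collections import defaultdict

# Hypothetical toy lexicon and keyword list (assumptions, not from the paper).
LEXICON = {"great": 1.0, "good": 0.5, "support": 0.5,
           "bad": -0.5, "oppose": -0.5, "terrible": -1.0}
KEYWORDS = ["taxes", "healthcare"]


def preprocess(tweet):
    """Lowercase the tweet, drop URLs and @mentions, and return word tokens."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z']+", cleaned)


def keyword_scores(tweets):
    """Average lexicon sentiment of the tweets that mention each keyword."""
    totals, counts = defaultdict(float), defaultdict(int)
    for tweet in tweets:
        tokens = preprocess(tweet)
        score = sum(LEXICON.get(tok, 0.0) for tok in tokens)
        for kw in KEYWORDS:
            if kw in tokens:
                totals[kw] += score
                counts[kw] += 1
    return [totals[kw] / counts[kw] if counts[kw] else 0.0 for kw in KEYWORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(tweets, sample_users):
    """Return the cluster whose sample user has the most similar score vector.

    A user with no sentiment on any keyword is labelled "neutral" (assumption).
    """
    vec = keyword_scores(tweets)
    if not any(vec):
        return "neutral"
    return max(sample_users, key=lambda name: cosine(vec, sample_users[name]))


# One hypothetical sample user per cluster, as a keyword-score vector.
sample_users = {"cluster_a": [1.0, -1.0], "cluster_b": [-1.0, 1.0]}
user_tweets = ["Lower taxes are great @someone https://t.co/x",
               "This healthcare plan is terrible"]
print(classify(user_tweets, sample_users))
```

In this sketch each user is reduced to one averaged score per keyword, so the comparison with a sample user is a single vector similarity; a production system would replace the toy lexicon with a full sentiment model and a much larger keyword set.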
Code Availability
The experimental code is available on request.
Acknowledgements
Not Applicable
Funding
The authors have no relevant financial or non-financial interests to disclose.
Author information
Contributions
All authors contributed equally to this work.
Ethics declarations
Statement of Responsibility
All authors reviewed the manuscript and contributed equally to this work.
Conflicts of interest
The authors declare that they have no conflict of interest.
Ethical Approval
Not Applicable.
About this article
Cite this article
Kowsik, V.V.S., Yashwanth, L., Harish, S. et al. Sentiment analysis of twitter data to detect and predict political leniency using natural language processing. J Intell Inf Syst (2024). https://doi.org/10.1007/s10844-024-00842-3
DOI: https://doi.org/10.1007/s10844-024-00842-3