Abstract
Twitter has become one of the major sources of information, and its hashtag feature draws particular attention from users. Because anyone can post their thoughts on Twitter at any moment, the volume of irrelevant content on the platform has grown rapidly. Recently, the hashtag “#whitelivesmatter” was used as a counter to the hashtag “#blacklivesmatter”. Many anti-government protests and other violent activities were carried out, recorded, and posted on Twitter under this hashtag. Many K-pop fans then took over the hashtag and flooded Twitter with unrelated content. As a result, the important content about the protests was drowned out by these irrelevant tweets, which made it extremely hard for officials to find that content and enforce law and order. This paper aims to build a model that determines how relevant the text of a tweet is to its hashtag, specifically #whitelivesmatter. Supervised data analysis techniques, namely text classification, are used to produce the required output.
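To make the idea of supervised text classification for tweet relevance concrete, here is a minimal sketch. It is not the authors' model: the abstract does not name a classifier, so a multinomial Naive Bayes with add-one smoothing is assumed purely for illustration. The class name `NaiveBayesRelevance`, the labels `relevant` and `irrelevant`, and the six training tweets are all hypothetical and made up for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (the '#' symbol is dropped)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesRelevance:
    """Multinomial Naive Bayes with add-one smoothing (an assumed stand-in classifier)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of tweets per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                # Smoothed log likelihood of the token under this label.
                score += math.log(
                    (self.word_counts[label][t] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training tweets; a real study would use labeled data.
train_texts = [
    "protest march downtown police blocked the road #whitelivesmatter",
    "crowd gathering at city hall rally tonight #whitelivesmatter",
    "police report clashes at the protest #whitelivesmatter",
    "stream the new album now fancam #whitelivesmatter",
    "my favourite idol dance fancam video #whitelivesmatter",
    "vote for our group comeback album #whitelivesmatter",
]
train_labels = ["relevant"] * 3 + ["irrelevant"] * 3

model = NaiveBayesRelevance().fit(train_texts, train_labels)
print(model.predict("police at the rally downtown"))
print(model.predict("new album fancam video"))
```

Because every training tweet carries the same hashtag, the hashtag token contributes equally to both labels, and the decision rests on the remaining words. That mirrors the problem described above: the hashtag alone cannot separate protest-related tweets from unrelated ones.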
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kodali, J., Kandikatla, V., Nagati, P., Nerendla, V., Sreedevi, M. (2022). Irrelevant Racist Tweets Identification Using Data Mining Techniques. In: Smys, S., Bestak, R., Palanisamy, R., Kotuliak, I. (eds) Computer Networks and Inventive Communication Technologies. Lecture Notes on Data Engineering and Communications Technologies, vol 75. Springer, Singapore. https://doi.org/10.1007/978-981-16-3728-5_15
DOI: https://doi.org/10.1007/978-981-16-3728-5_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-3727-8
Online ISBN: 978-981-16-3728-5
eBook Packages: Engineering, Engineering (R0)