Abstract
Spam text messages are unwanted messages sent to large numbers of mobile phone users by telemarketers and companies to advertise their products and services, and they are often a trap set by a scammer. These junk messages can install malware on a phone if the user engages with them, and they can also be an attempt to steal the user's private information. It is therefore necessary to detect and classify these messages in order to protect users from such traps and to prevent identity theft. Spam text messages can be detected by building a corpus of the words that occur in text messages and identifying the words that are common in spam. To build the corpus, the data must first be cleaned. Features can then be extracted from the cleaned data using methods such as term frequency, TF-IDF, Word2Vec, and GloVe. In this paper, we classify spam text in messages using a Naïve Bayes classifier, Logistic Regression, Long Short-Term Memory (LSTM), and a Convolutional Neural Network (CNN). We then present a comparative analysis of these models in terms of accuracy, precision, recall, and F1 score.
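The pipeline summarized in the abstract (clean the text, extract term-frequency features, classify, then score) can be illustrated with a minimal sketch. It assumes a multinomial Naïve Bayes model with add-one smoothing over raw term counts, which is only one of the feature and classifier combinations named above and is not claimed to be the authors' implementation. The function names (`clean`, `train_naive_bayes`, `predict`, `evaluate`) and the four training messages are hypothetical and invented for illustration; they are not taken from the paper's dataset.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase a message and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train_naive_bayes(messages, labels):
    """Collect per-class term frequencies and class counts (the 'corpus')."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(clean(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior for one message."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, doc_count in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(doc_count / total_docs)  # log prior
        for token in clean(text):
            if token in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids log(0).
                likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
                score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


def evaluate(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy training data, invented for illustration only.
train_messages = [
    "Win a free prize now, claim your cash",
    "Free entry: text WIN to claim your prize",
    "Are we still meeting for lunch today",
    "Can you call me when you get home",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(train_messages, train_labels)
test_messages = ["Claim your free cash prize", "See you at lunch"]
test_labels = ["spam", "ham"]
predictions = [predict(model, m) for m in test_messages]
print(predictions)
print(evaluate(test_labels, predictions))
```

A real comparison would train on a labeled SMS corpus with a held-out test split, and TF-IDF weights or Word2Vec/GloVe embeddings could replace the raw counts as inputs to the other classifiers listed in the abstract.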
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mundra, S., Mundra, A., Saigal, A., Gupta, P., Agarwal, J., Goyal, M.K. (2022). Machine Learning Approaches for the Classification of Spammed Text in Messages. In: Somani, A.K., Mundra, A., Doss, R., Bhattacharya, S. (eds) Smart Systems: Innovations in Computing. Smart Innovation, Systems and Technologies, vol 235. Springer, Singapore. https://doi.org/10.1007/978-981-16-2877-1_56
DOI: https://doi.org/10.1007/978-981-16-2877-1_56
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2876-4
Online ISBN: 978-981-16-2877-1
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)