Machine Learning Approaches for the Classification of Spammed Text in Messages

  • Conference paper
Smart Systems: Innovations in Computing

Part of the book series: Smart Innovation, Systems and Technologies ((SIST,volume 235))

Abstract

Spam text messages are unwanted messages sent to large numbers of mobile phone users by telemarketers and companies advertising their products and services, and they are often traps set by scammers. Such junk messages can install malware on a phone if the user engages with them, and they may also be attempts to steal the user's private information. It is therefore necessary to detect and classify these messages to protect users from such traps and from identity theft. Spam text messages can be detected by building a corpus of the words that occur in text messages and identifying the words common to spam. Before a corpus can be built, the data must be cleaned. Features can then be extracted from the cleaned data using methods such as term frequency, TF-IDF, Word2Vec, and GloVe. In this paper, we classify spam text messages using a Naïve Bayes classifier, Logistic Regression, LSTM, and a Convolutional Neural Network (CNN). We then present a comparative analysis of these models in terms of accuracy, precision, recall, and F1 score.
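The pipeline outlined in the abstract (cleaning, feature extraction, classification, evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes term-frequency features only, a multinomial Naïve Bayes classifier with Laplace smoothing, and a handful of invented toy messages. The names `clean`, `NaiveBayes`, and `evaluate` are hypothetical, and the TF-IDF, Word2Vec, GloVe, Logistic Regression, LSTM, and CNN variants are not shown.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase the text and keep alphanumeric tokens (a simple stand-in for data cleaning)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over term-frequency counts, with Laplace smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        # Per-class term frequencies: how often each word occurs in that class.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(clean(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)  # log prior
            for tok in clean(text):
                if tok in self.vocab:  # words never seen in training are skipped
                    count = self.word_counts[c][tok]
                    score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


def evaluate(y_true, y_pred, positive="spam"):
    """Accuracy, precision, recall and F1, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented toy data, for illustration only (not from the paper's data set).
train = [
    ("Win a free prize now, claim your cash", "spam"),
    ("Free entry: text WIN to claim your reward", "spam"),
    ("Urgent! Your account won a cash prize", "spam"),
    ("Are we still meeting for lunch today", "ham"),
    ("Can you call me when you get home", "ham"),
    ("See you at the meeting tomorrow morning", "ham"),
]
test = [("Claim your free cash prize", "spam"), ("Call me after the meeting", "ham")]

model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
predictions = [model.predict(t) for t, _ in test]
metrics = evaluate([y for _, y in test], predictions)
print(predictions, metrics)
```

On a realistic corpus the four metrics usually differ from one another, which is why a comparison across classifiers reports all of them rather than accuracy alone.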

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Mundra, S., Mundra, A., Saigal, A., Gupta, P., Agarwal, J., Goyal, M.K. (2022). Machine Learning Approaches for the Classification of Spammed Text in Messages. In: Somani, A.K., Mundra, A., Doss, R., Bhattacharya, S. (eds) Smart Systems: Innovations in Computing. Smart Innovation, Systems and Technologies, vol 235. Springer, Singapore. https://doi.org/10.1007/978-981-16-2877-1_56
