Optimizing semantic LSTM for spam detection

  • Gauri Jain
  • Manisha Sharma
  • Basant Agarwal
Original Research


Spam classification is an active research topic in natural language processing, particularly as use of the Internet for social networking grows. That growth has been accompanied by a rise in spam from senders who seek commercial or non-commercial advantage. In this paper, we apply deep learning to the problem. A special architecture known as Long Short-Term Memory (LSTM), a variant of the recurrent neural network (RNN), is used for spam classification. Unlike traditional classifiers, which rely on hand-crafted features, an LSTM can learn abstract features from the data. Before the LSTM is used for classification, the text is converted into semantic word vectors with the help of word2vec, WordNet and ConceptNet. The classification results are compared with benchmark classifiers such as SVM, Naïve Bayes, ANN, k-NN and Random Forest. Two corpora are used for the comparison: the SMS Spam Collection dataset and a Twitter dataset. The results are evaluated using metrics such as accuracy and F-measure. The evaluation shows that the LSTM outperforms the traditional machine learning methods at spam detection by a considerable margin.
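As an illustration of the two evaluation metrics named above, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names, the choice of label 1 for spam, and the toy labels are assumptions made for the example, and the F-measure is taken here with beta = 1 (the harmonic mean of precision and recall), which the abstract does not specify.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of messages whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f_measure(y_true: Sequence[int], y_pred: Sequence[int],
              positive: int = 1, beta: float = 1.0) -> float:
    """F-beta score for the `positive` class (assumed here to be spam = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy labels, invented for illustration only: 1 = spam, 0 = legitimate.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("F-measure:", f_measure(y_true, y_pred))
```

Reporting the F-measure alongside accuracy matters for spam corpora because the classes are usually imbalanced: a classifier that labels everything as legitimate can still score a high accuracy while its F-measure for the spam class is zero.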


Spam detection; Machine learning; Recurrent neural network (RNN); Long Short-Term Memory (LSTM); Deep learning; Twitter spam; SMS spam



Copyright information

© Bharati Vidyapeeth's Institute of Computer Applications and Management 2018

Authors and Affiliations

  1. Department of Computer Science, Banasthali Vidyapith, Banasthali, India
  2. Department of Computer Science, Banasthali Vidyapith, Jaipur, India
  3. Department of Computer Science and Engineering, SKIT, Rajasthan Technical University, Jaipur, India
