Frontiers of Computer Science

Volume 12, Issue 1, pp 135–145

A multi-level approach to highly efficient recognition of Chinese spam short messages

Research Article

Abstract

Recognizing spam short messages (SMS) involves many aspects of natural language processing. A good solution not only improves the quality of people's mobile experience, but also promotes the analysis of short texts in current mobile applications, such as Webchat and microblogs. Because spam SMSes are characterized by sparsity, transformation, and a real-time nature, we propose three methods at different levels: recognition based on symbolic features, recognition based on text similarity, and recognition based on pattern matching. Combining these methods yields a multi-level approach to spam SMS recognition. To enrich the pattern base while reducing manual labor and time, we propose a quasi-pattern learning method that uses the quasi-pattern matching results produced during pattern matching. The method can learn many interesting new patterns from the SMS corpus. Finally, a comprehensive analysis indicates that our spam SMS recognition approach achieves a precision of 95.18% and a recall of 95.51%.
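
The three levels named in the abstract can be pictured as a cascade in which a message is passed to the next level only if the previous one does not flag it. The following is a minimal sketch under stated assumptions, not the authors' implementation: the symbol set, the sample spam message, the regular expression, both thresholds, the use of `difflib.SequenceMatcher` as the similarity measure, and the order in which the levels are applied are all hypothetical choices made for illustration. English strings stand in for Chinese messages to keep the example readable.

```python
import re
from difflib import SequenceMatcher

# Everything below is an illustrative assumption, not data from the paper.
SUSPICIOUS_SYMBOLS = set("$¥★☆【】")                       # hypothetical symbol set
KNOWN_SPAM = ["win a free prize now call 12345"]           # hypothetical spam sample
SPAM_PATTERNS = [re.compile(r"free\s+\w+\s+call\s+\d+")]   # hypothetical pattern base


def symbol_level(msg, max_ratio=0.2):
    """Level 1: flag messages whose share of suspicious symbols exceeds max_ratio."""
    if not msg:
        return False
    hits = sum(ch in SUSPICIOUS_SYMBOLS for ch in msg)
    return hits / len(msg) > max_ratio


def similarity_level(msg, threshold=0.8):
    """Level 2: flag messages that closely resemble a known spam sample."""
    return any(
        SequenceMatcher(None, msg, sample).ratio() >= threshold
        for sample in KNOWN_SPAM
    )


def pattern_level(msg):
    """Level 3: flag messages that match any pattern in the pattern base."""
    return any(p.search(msg) for p in SPAM_PATTERNS)


def classify(msg):
    """Return the name of the first level that flags msg, or None if none does."""
    levels = (
        ("symbol", symbol_level),
        ("similarity", similarity_level),
        ("pattern", pattern_level),
    )
    for name, check in levels:
        if check(msg):
            return name
    return None
```

Calling `classify` on a message returns which level caught it, which makes it easy to see how much each level contributes when the cascade is run over a labelled corpus.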

Keywords

spam short message, spam recognition, similarity computing, pattern learning

Supplementary material

11704_2016_5415_MOESM1_ESM.ppt (188 kb)

Copyright information

© Higher Education Press and Springer-Verlag Berlin Heidelberg 2018

Authors and Affiliations

  1. School of Computer Science & Engineering, Jiangsu University of Science and Technology, Jiangsu, China
  2. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing, China
