Detecting Malicious Spam Mails: An Online Machine Learning Approach

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 8836)

Abstract

Malicious spam is one of the major problems on the Internet today. It causes financial damage to companies and poses security threats to governments and organizations. Most recent spam emails contain URLs that redirect recipients to malicious Web servers. In this paper, we propose a malicious spam email detection system based on online machine learning. Each spam email is represented as a feature vector using a term-weighting scheme, and these feature vectors are used as the input to the classifier. Learning is performed periodically to update the classifier, so that the system can adapt to spam emails whose contents change over time. A real data set is labeled by the SPIKE system, which is developed by NICT. Evaluation experiments show that the detection system identifies malicious spam emails efficiently and accurately.
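The abstract does not state the exact term-weighting formula or name the classifier. As a rough illustration only, the sketch below assumes a smoothed tf-idf weighting (tf-idf is listed among the keywords) and a simple perceptron that is updated batch by batch. The class name, the whitespace tokenizer, and the sample emails are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace tokenization, so each URL stays a single term.
    return text.lower().split()


class OnlineTfidfPerceptron:
    """Toy online detector: tf-idf features plus a perceptron (both assumed)."""

    def __init__(self):
        self.n_docs = 0
        self.doc_freq = Counter()          # term -> number of emails containing it
        self.weights = defaultdict(float)  # term -> perceptron weight
        self.bias = 0.0

    def _vectorize(self, text):
        # Sparse tf-idf vector computed from the statistics seen so far.
        counts = Counter(tokenize(text))
        total = sum(counts.values())
        vec = {}
        for term, count in counts.items():
            tf = count / total
            idf = math.log((1 + self.n_docs) / (1 + self.doc_freq[term])) + 1.0
            vec[term] = tf * idf
        return vec

    def score(self, text):
        vec = self._vectorize(text)
        return self.bias + sum(self.weights.get(t, 0.0) * v for t, v in vec.items())

    def predict(self, text):
        # 1 = malicious, 0 = not malicious
        return 1 if self.score(text) > 0 else 0

    def partial_fit(self, batch):
        """Periodic update from a batch of (text, label) pairs."""
        # First refresh the document frequencies with the new emails.
        for text, _ in batch:
            self.n_docs += 1
            self.doc_freq.update(set(tokenize(text)))
        # Then apply a perceptron update for every misclassified email.
        for text, label in batch:
            if self.predict(text) != label:
                step = 1.0 if label == 1 else -1.0
                for term, value in self._vectorize(text).items():
                    self.weights[term] += step * value
                self.bias += step


# Hypothetical labeled batches arriving at two different times.
model = OnlineTfidfPerceptron()
model.partial_fit([
    ("click http://bad.example/login now", 1),
    ("meeting notes attached", 0),
])
model.partial_fit([
    ("verify account http://bad.example/pay", 1),
    ("lunch meeting tomorrow", 0),
])
print(model.predict("verify account now"), model.predict("meeting tomorrow"))
```

Calling `partial_fit` once per batch mirrors the periodic update described in the abstract: the document frequencies and the weights both move as new labeled emails arrive, so terms that only appear in later spam campaigns still gain weight without retraining from scratch.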

Keywords

  • malicious spam detection
  • online learning
  • tf-idf
  • vector space model

Chapter length: 8 pages

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Dai, Y., Tada, S., Ban, T., Nakazato, J., Shimamura, J., Ozawa, S. (2014). Detecting Malicious Spam Mails: An Online Machine Learning Approach. In: Loo, C.K., Yap, K.S., Wong, K.W., Beng Jin, A.T., Huang, K. (eds) Neural Information Processing. ICONIP 2014. Lecture Notes in Computer Science, vol 8836. Springer, Cham. https://doi.org/10.1007/978-3-319-12643-2_45

  • DOI: https://doi.org/10.1007/978-3-319-12643-2_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12642-5

  • Online ISBN: 978-3-319-12643-2

  • eBook Packages: Computer Science, Computer Science (R0)