Detecting Malicious Spam Mails: An Online Machine Learning Approach

Dai, Yuli; Tada, Shunsuke; Ban, Tao; Nakazato, Junji; Shimamura, Jumpei; Ozawa, Seiichi

doi:10.1007/978-3-319-12643-2_45

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8836)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Malicious spam is one of the major problems on the Internet today. It causes financial damage to companies and poses security threats to governments and organizations. Most recent spam emails contain URLs that redirect recipients to malicious Web servers. In this paper, we propose a malicious spam email detection system based on online machine learning. Each spam email is represented as a feature vector by a term-weighting scheme, and these feature vectors are used as the input to the classifier. Learning is performed periodically to update the classifier, so that the system adapts to spam emails whose contents change over time. A real data set is labeled by the SPIKE system, which is developed by NICT. Evaluation experiments show that the detection system identifies malicious spam emails efficiently and accurately.
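
The pipeline outlined in the abstract (term weighting, a classifier, and periodic updates) can be illustrated with a minimal sketch. The abstract does not name the weighting scheme or the classifier, so everything below is an assumption made for illustration: TF-IDF stands in for the term-weighting scheme, a perceptron stands in for the online classifier, and the names `TfIdfWeighter`, `OnlinePerceptron`, and `update_on_batch` are hypothetical. Labels are +1 for malicious and -1 for non-malicious spam.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (assumed; the paper's preprocessing is not given)."""
    return text.lower().split()


class TfIdfWeighter:
    """TF-IDF term weighting whose document frequencies grow with each batch."""

    def __init__(self):
        self.doc_freq = Counter()  # number of emails containing each term
        self.num_docs = 0

    def update(self, emails):
        """Fold a new batch of emails into the document-frequency statistics."""
        for text in emails:
            self.num_docs += 1
            self.doc_freq.update(set(tokenize(text)))

    def transform(self, text):
        """Return a sparse feature vector {term: weight} for one email."""
        counts = Counter(tokenize(text))
        total = sum(counts.values()) or 1
        vec = {}
        for term, count in counts.items():
            # Smoothed IDF, so terms never seen before still get a finite weight.
            idf = math.log((1 + self.num_docs) / (1 + self.doc_freq[term])) + 1.0
            vec[term] = (count / total) * idf
        return vec


class OnlinePerceptron:
    """Linear classifier updated incrementally (a stand-in for the paper's classifier)."""

    def __init__(self):
        self.weights = {}
        self.bias = 0.0

    def score(self, vec):
        return self.bias + sum(self.weights.get(t, 0.0) * v for t, v in vec.items())

    def predict(self, vec):
        """Return +1 (malicious) or -1 (non-malicious)."""
        return 1 if self.score(vec) > 0 else -1

    def partial_fit(self, vectors, labels, epochs=5):
        """Perceptron rule: adjust weights only on misclassified examples."""
        for _ in range(epochs):
            for vec, y in zip(vectors, labels):
                if y * self.score(vec) <= 0:
                    for t, v in vec.items():
                        self.weights[t] = self.weights.get(t, 0.0) + y * v
                    self.bias += y


def update_on_batch(weighter, clf, emails, labels):
    """One periodic update: refresh term statistics, then refit on the new batch."""
    weighter.update(emails)
    clf.partial_fit([weighter.transform(e) for e in emails], labels)
```

Calling `update_on_batch` once per newly labeled batch mimics the periodic learning step: the classifier keeps its earlier weights and shifts them only where the new emails are misclassified, which is one simple way to follow content that changes over time.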

Author information

Authors and Affiliations

Graduate School of Engineering, Kobe University, 1-1 Rokko-dai, Nada-ku, Kobe, 657-8501, Japan
Yuli Dai, Shunsuke Tada & Seiichi Ozawa
National Institute of Information and Communications Technology (NICT), Koganei, Tokyo, 184-8795, Japan
Tao Ban & Junji Nakazato
Clwit Inc., Tokyo, Japan
Jumpei Shimamura

Editor information

Editors and Affiliations

Department of Artificial Intelligence, Faculty of Computer Science and Information Technology Building, University of Malaya, 50603, Kuala Lumpur, Malaysia
Chu Kiong Loo
Department of Electronics and Communication Engineering, College of Engineering, Jalan IKRAM-UNITEN, Universiti Tenaga Nasional, 43009, Kajang, Selangor, Malaysia
Keem Siah Yap
School of Engineering and Information Technology, Murdoch University, 6150, South St, Murdoch, Western Australia, Australia
Kok Wai Wong
Department of Electrical and Electronics Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, 120-749, Seoul, South Korea
Andrew Teoh Beng Jin
Department of Electrical and Electronic Engineering, Xi’an Jiaotong-Liverpool University, Ren’ai Road 111, SIP 215123, Suzhou, Jiangsu Province, China
Kaizhu Huang

About this paper

Cite this paper

Dai, Y., Tada, S., Ban, T., Nakazato, J., Shimamura, J., Ozawa, S. (2014). Detecting Malicious Spam Mails: An Online Machine Learning Approach. In: Loo, C.K., Yap, K.S., Wong, K.W., Beng Jin, A.T., Huang, K. (eds) Neural Information Processing. ICONIP 2014. Lecture Notes in Computer Science, vol 8836. Springer, Cham. https://doi.org/10.1007/978-3-319-12643-2_45

DOI: https://doi.org/10.1007/978-3-319-12643-2_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12642-5
Online ISBN: 978-3-319-12643-2
eBook Packages: Computer Science, Computer Science (R0)
