
An Adaptive Concentration Selection Model for Spam Detection

  • Conference paper
Advances in Swarm Intelligence (ICSI 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8794)

Abstract

The concentration-based feature construction (CFC) approach has been proposed for spam detection. In the CFC approach, global concentration (GC) and local concentration (LC) are used independently to convert emails into 2-dimensional or 2n-dimensional feature vectors, respectively. In this paper, we propose a novel model that adaptively selects the concentration construction method according to how well a testing sample matches each kind of concentration feature. Once the method suited to the current sample has been determined, the email is transformed into the corresponding concentration feature vector, which a classifier then uses to assign a class. In the experiments, the k-nearest neighbor method is used to evaluate the proposed concentration selection model on the standard corpora PU1, PU2, PU3 and PUA. The results show that the model performs better than GC or LC used separately, which supports its effectiveness and suggests its applicability to real-world spam filtering.
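The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the selection criterion, so the rule in `select_features` (use GC when the self and non-self concentrations differ by at least a hypothetical `margin`, otherwise fall back to LC) is an assumption made purely for illustration. The gene libraries, the equal-sized windows, and all function names are likewise hypothetical.

```python
import math
from collections import Counter


def concentration(tokens, self_genes, nonself_genes):
    """Share of distinct tokens found in the self and non-self gene libraries."""
    distinct = set(tokens)
    if not distinct:
        return (0.0, 0.0)
    n = len(distinct)
    return (len(distinct & self_genes) / n, len(distinct & nonself_genes) / n)


def global_concentration(tokens, self_genes, nonself_genes):
    """GC: one (self, non-self) pair for the whole email, i.e. 2 dimensions."""
    return list(concentration(tokens, self_genes, nonself_genes))


def local_concentration(tokens, self_genes, nonself_genes, n_windows):
    """LC: one (self, non-self) pair per window, i.e. 2n dimensions.

    Assumption: windows are contiguous chunks of roughly equal size.
    """
    size = math.ceil(len(tokens) / n_windows) if tokens else 1
    vector = []
    for i in range(n_windows):
        window = tokens[i * size:(i + 1) * size]
        vector.extend(concentration(window, self_genes, nonself_genes))
    return vector


def select_features(tokens, self_genes, nonself_genes, n_windows=3, margin=0.2):
    """Hypothetical selection rule, NOT taken from the paper.

    Use GC when the two global concentrations are clearly separated;
    otherwise use the finer-grained LC vector.
    """
    gc = global_concentration(tokens, self_genes, nonself_genes)
    if abs(gc[0] - gc[1]) >= margin:
        return "GC", gc
    return "LC", local_concentration(tokens, self_genes, nonself_genes, n_windows)


def knn_predict(query, training, k=3):
    """Plain k-nearest-neighbor majority vote with Euclidean distance.

    `training` is a list of (vector, label) pairs of the same dimension as `query`.
    """
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hand-picked toy gene libraries (illustrative only).
SELF_GENES = {"meeting", "report", "thanks"}
NONSELF_GENES = {"free", "winner", "prize"}

kind, vector = select_features(["free", "prize", "winner", "click"],
                               SELF_GENES, NONSELF_GENES)
toy_training = [([0.0, 0.8], "spam"), ([0.1, 0.7], "spam"), ([0.9, 0.0], "ham")]
label = knn_predict(vector, toy_training, k=3)
```

In this sketch each feature type needs its own training set, because GC and LC vectors have different dimensions; a sample routed to LC would be compared only against LC training vectors.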

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Gao, Y., Mi, G., Tan, Y. (2014). An Adaptive Concentration Selection Model for Spam Detection. In: Tan, Y., Shi, Y., Coello, C.A.C. (eds) Advances in Swarm Intelligence. ICSI 2014. Lecture Notes in Computer Science, vol 8794. Springer, Cham. https://doi.org/10.1007/978-3-319-11857-4_26

  • DOI: https://doi.org/10.1007/978-3-319-11857-4_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11856-7

  • Online ISBN: 978-3-319-11857-4

  • eBook Packages: Computer Science, Computer Science (R0)
