Combined Classifiers with Neural Fuser for Spam Detection

  • Marcin Zmyślony
  • Bartosz Krawczyk
  • Michał Woźniak
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 189)

Abstract

The combined approach to classification is currently one of the most promising directions in pattern recognition. An ensemble of classifiers can use many decision-making methods. This work focuses on fuser design to improve spam detection. We assume that a pool of diverse individual classifiers is at our disposal and that it can grow as the spam model changes. We propose training the fusion block with an algorithm that originates in the neural approach; the details and evaluations of this method were presented in the authors' previous works. This work presents the results of computer experiments carried out on an exemplary unbalanced spam dataset. They confirm that the proposed compound classifier is a further step in email security.

Keywords

combined classifiers, neural networks, fuser design, spam detection, concept drift, imbalanced data

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Marcin Zmyślony (1)
  • Bartosz Krawczyk (1)
  • Michał Woźniak (1)
  1. Department of Systems and Computer Networks, Wroclaw University of Technology, Wroclaw, Poland
