A Neural Model in Anti-spam Systems

  • Otávio A. S. Carpinteiro
  • Isaías Lima
  • João M. C. Assis
  • Antonio C. Zambroni de Souza
  • Edmilson M. Moreira
  • Carlos A. M. Pinheiro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4132)


The paper proposes applying the multilayer perceptron model to the problem of detecting ham and spam e-mail patterns. It also proposes intensive use of data pre-processing and feature selection methods to simplify the multilayer perceptron's task of classifying ham and spam e-mails. The multilayer perceptron is trained and assessed on patterns extracted from the SpamAssassin Public Corpus, and it is required to classify novel types of ham and spam patterns. The results are presented and evaluated in the paper.
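The abstract does not give the paper's actual pre-processing steps, feature selection criteria, network size, or training algorithm. The following is therefore only a minimal sketch of the general idea (select a few discriminative tokens, then train a small multilayer perceptron on the reduced vectors). The toy e-mails, the document-frequency-difference score, the names `select_features`, `vectorize`, and `TinyMLP`, and the plain stochastic gradient descent training are all assumptions made for illustration, not the authors' method.

```python
import math
import random

# Toy corpus invented for illustration (NOT taken from the SpamAssassin Public Corpus).
# Label 1 = spam, label 0 = ham.
EMAILS = [
    ("win free money now", 1),
    ("free prize claim now", 1),
    ("cheap pills free offer", 1),
    ("win cheap prize money", 1),
    ("meeting agenda for monday", 0),
    ("project report attached", 0),
    ("lunch on monday", 0),
    ("please review the report", 0),
]

def select_features(emails, k):
    """Keep the k tokens whose document frequency differs most between classes.

    This simple score is an assumed stand-in for whatever criteria the paper uses.
    """
    spam_df, ham_df = {}, {}
    for text, label in emails:
        target = spam_df if label == 1 else ham_df
        for tok in set(text.split()):
            target[tok] = target.get(tok, 0) + 1
    vocab = set(spam_df) | set(ham_df)

    def score(tok):
        return abs(spam_df.get(tok, 0) - ham_df.get(tok, 0))

    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-score(t), t))[:k]

def vectorize(text, features):
    """Binary bag-of-words vector restricted to the selected features."""
    toks = set(text.split())
    return [1.0 if f in toks else 0.0 for f in features]

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

class TinyMLP:
    """One hidden layer of sigmoid units and a single sigmoid output unit."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xin = x + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xin))) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, y

    def train(self, data, epochs=1000, lr=0.5):
        for _ in range(epochs):
            for x, t in data:
                h, y = self.forward(x)
                delta_out = y - t  # gradient of cross-entropy loss w.r.t. output pre-activation
                xin, hin = x + [1.0], h + [1.0]
                # Update hidden weights first, using the not-yet-updated output weights.
                for j, row in enumerate(self.w1):
                    delta_h = delta_out * self.w2[j] * h[j] * (1.0 - h[j])
                    for i in range(len(row)):
                        row[i] -= lr * delta_h * xin[i]
                for j in range(len(self.w2)):
                    self.w2[j] -= lr * delta_out * hin[j]

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0

features = select_features(EMAILS, k=6)
data = [(vectorize(text, features), label) for text, label in EMAILS]
model = TinyMLP(n_in=len(features), n_hidden=3)
model.train(data)
print(features)
print(model.predict(vectorize("claim your free prize", features)))
```

Reducing the vocabulary to a handful of tokens before training keeps the input layer small, which is the sense in which feature selection can simplify the classifier's task. A real evaluation would use held-out messages rather than this eight-message toy set.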


Feature Selection Method; Multilayer Perceptron; Hidden Unit; Neural Model; Output Unit
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Otávio A. S. Carpinteiro (1)
  • Isaías Lima (1)
  • João M. C. Assis (1)
  • Antonio C. Zambroni de Souza (1)
  • Edmilson M. Moreira (1)
  • Carlos A. M. Pinheiro (1)

  1. Research Group on Computer Networks and Software Engineering, Federal University of Itajubá, Itajubá, Brazil
