Performance Analysis of Naïve Bayes Classification, Support Vector Machines and Neural Networks for Spam Categorization
Spam mail recognition is a new and growing field that brings together natural language processing and machine learning, since it is in essence a two-class classification of natural language texts. An important feature of spam recognition is that it is a cost-sensitive classification task: misclassifying a non-spam mail as spam is generally a more severe error than misclassifying a spam mail as non-spam. To be comparable, the methods applied in this field should all be evaluated on the same corpus and within the same cost-sensitive framework. In this paper, the performance of Support Vector Machines (SVM), Neural Networks (NN) and Naïve Bayes (NB) techniques is compared on a publicly available corpus (LINGSPAM) under different cost scenarios. The training time complexities of the methods are also evaluated. The results show that NN performs significantly better than the other two methods while having acceptable training times. NB gives better results than SVM when the cost of misclassifying non-spam as spam is extremely high, while in all other cases SVM outperforms NB.
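To make the cost-sensitive framing concrete, the following is a minimal sketch of a cost-weighted accuracy measure, in which every non-spam message counts `cost` times as much as a spam message. The abstract does not state the exact evaluation metric, so this formulation, the function name `weighted_accuracy`, the label strings, and the cost values 1, 9 and 999 are all assumptions chosen for illustration, not the paper's method.

```python
# Illustrative sketch only: the metric, names, labels and cost values below
# are assumptions, not taken from the paper.

def weighted_accuracy(y_true, y_pred, cost=1.0):
    """Accuracy in which each non-spam ("legit") message is weighted `cost` times."""
    pairs = list(zip(y_true, y_pred))
    legit_total = sum(1 for t, _ in pairs if t == "legit")
    spam_total = sum(1 for t, _ in pairs if t == "spam")
    legit_ok = sum(1 for t, p in pairs if t == "legit" and p == "legit")
    spam_ok = sum(1 for t, p in pairs if t == "spam" and p == "spam")
    denominator = cost * legit_total + spam_total
    if denominator == 0:
        raise ValueError("no labelled messages to evaluate")
    return (cost * legit_ok + spam_ok) / denominator


# Toy data: four legitimate messages and two spam messages.
y_true = ["legit", "legit", "legit", "legit", "spam", "spam"]
# Hypothetical filter A catches all spam but blocks one legitimate message.
pred_a = ["legit", "legit", "legit", "spam", "spam", "spam"]
# Hypothetical filter B blocks no legitimate mail but lets one spam through.
pred_b = ["legit", "legit", "legit", "legit", "spam", "legit"]

for cost in (1, 9, 999):
    a = weighted_accuracy(y_true, pred_a, cost)
    b = weighted_accuracy(y_true, pred_b, cost)
    print(f"cost={cost}: filter A={a:.3f}, filter B={b:.3f}")
```

With equal costs the two toy filters are indistinguishable, since each makes exactly one mistake. As the cost grows, the filter that blocks a legitimate message is penalised far more heavily, which is why a ranking of classifiers can change from one cost scenario to another.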
Keywords: Support Vector Machine; Attribute Size; Sequential Minimal Optimization; Natural Language Text; Cost Scenario