Evolutionary Multi-objective Scheduling for Anti-Spam Filtering Throughput Optimization

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10334)

Abstract

This paper presents an evolutionary multi-objective optimization formulation of the anti-spam filtering problem that addresses both classification quality (False Positive and False Negative error rates) and email message classification time, which is to be minimized. This approach is compared with single-objective problem formulations found in the literature, and its advantages for decision support and flexible/adaptive anti-spam filter configuration are demonstrated. A study is performed using the Wirebrush4SPAM anti-spam filtering framework and the SpamAssassin email dataset. The NSGA-II evolutionary multi-objective optimization algorithm was applied to validate and demonstrate this novel approach to the anti-spam filtering optimization problem. The results of the experiments show that this optimization strategy allows the decision maker (the anti-spam filtering system administrator) to select among a set of optimal and flexible filter configuration alternatives with respect to classification quality and classification efficiency.
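The idea of choosing among a set of optimal alternatives rests on Pareto dominance, the ordering relation that NSGA-II uses to rank candidate solutions. The following is a minimal sketch of that relation only, not the authors' implementation: the `FilterConfig` type, the helper names, and all numeric values are hypothetical and chosen purely for illustration. Each candidate filter configuration is scored on three objectives to minimize (False Positive rate, False Negative rate, and classification time), and only the non-dominated candidates are kept.

```python
from typing import List, NamedTuple


class FilterConfig(NamedTuple):
    """Hypothetical record for one candidate filter configuration."""
    name: str
    fp_rate: float   # False Positive rate (minimize)
    fn_rate: float   # False Negative rate (minimize)
    time_ms: float   # classification time per message (minimize)


def objectives(c: FilterConfig) -> tuple:
    """Return the three objective values, all to be minimized."""
    return (c.fp_rate, c.fn_rate, c.time_ms)


def dominates(a: FilterConfig, b: FilterConfig) -> bool:
    """a dominates b if a is no worse on every objective and strictly
    better on at least one."""
    pairs = list(zip(objectives(a), objectives(b)))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def pareto_front(configs: List[FilterConfig]) -> List[FilterConfig]:
    """Keep only the configurations that no other configuration dominates."""
    return [
        c for c in configs
        if not any(dominates(other, c) for other in configs if other is not c)
    ]


# Made-up example values, not taken from the paper's experiments.
candidates = [
    FilterConfig("A", fp_rate=0.01, fn_rate=0.10, time_ms=40.0),
    FilterConfig("B", fp_rate=0.02, fn_rate=0.05, time_ms=35.0),
    FilterConfig("C", fp_rate=0.02, fn_rate=0.10, time_ms=45.0),  # dominated by A and B
    FilterConfig("D", fp_rate=0.00, fn_rate=0.20, time_ms=60.0),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy data set, configurations A, B, and D each win on a different objective, so none can be discarded without a preference from the administrator. A single-objective formulation would instead collapse the three criteria into one score and return a single configuration.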

Acknowledgements

SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) from University of Vigo for hosting its IT infrastructure.

Funding: This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia) and FEDER (European Union). This work was partially supported by the project Platform of integration of intelligent techniques for analysis of biomedical information (TIN2013-47153-C3-3-R) from the Spanish Ministry of Economy and Competitiveness.

Author information

Corresponding author

Correspondence to José Ramón Méndez.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Ruano-Ordás, D., Basto-Fernandes, V., Yevseyeva, I., Méndez, J.R. (2017). Evolutionary Multi-objective Scheduling for Anti-Spam Filtering Throughput Optimization. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science, vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_12

  • DOI: https://doi.org/10.1007/978-3-319-59650-1_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59649-5

  • Online ISBN: 978-3-319-59650-1

  • eBook Packages: Computer Science, Computer Science (R0)
