Quadcriteria Optimization of Binary Classifiers: Error Rates, Coverage, and Complexity

  • Vitor Basto-Fernandes
  • Iryna Yevseyeva
  • David Ruano-Ordás
  • Jiaqi Zhao
  • Florentino Fdez-Riverola
  • José Ramón Méndez
  • Michael T. M. Emmerich
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 674)

Abstract

This paper presents a 4-objective evolutionary multiobjective optimization study for optimizing the error rates (false positives, false negatives), reliability, and complexity of binary classifiers. The example taken is the email anti-spam filtering problem.

The two major goals of the optimization are to minimize the error rates, that is, the false negative rate and the false positive rate. Our approach considers three-way classification: the binary classifier may also decline to classify an instance when there is not enough evidence to assign it to one of the two classes. In this case the instance is marked as suspicious but still presented to the user. The number of unclassified (suspicious) instances should be minimized, as long as this does not lead to errors. This will be termed the coverage objective. The size of the set (ensemble) of rules needed for the anti-spam filter to operate in optimal conditions is addressed as a fourth objective. The objectives stated above generally conflict with each other, which is why we address the problem as a 4-objective (quadcriteria) optimization problem. We assess the performance of a set of state-of-the-art evolutionary multiobjective optimization algorithms: NSGA-II, SPEA2, and the hypervolume indicator-based SMS-EMOA. Focusing on anti-spam filter optimization, statistical comparisons of algorithm performance are provided for several benchmarks and a range of performance indicators. Moreover, the resulting 4-D Pareto hyper-surface is discussed in the context of binary classifier optimization.
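The four objectives and the notion of Pareto optimality described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two-threshold decision rule, the exact rate definitions, the use of a rule count as the complexity measure, and all names (`four_objectives`, `dominates`, `pareto_front`, `lower`, `upper`) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# (false positive rate, false negative rate, unclassified rate, rule count);
# all four values are to be minimized.
Objectives = Tuple[float, float, float, int]


def four_objectives(scores: Sequence[float], is_spam: Sequence[bool],
                    lower: float, upper: float, n_rules: int) -> Objectives:
    """Evaluate a hypothetical three-way filter on labelled messages.

    Assumed decision rule: score >= upper -> spam, score <= lower -> ham,
    anything in between is left unclassified (marked as suspicious).
    """
    n_spam = sum(1 for s in is_spam if s)
    n_ham = len(is_spam) - n_spam
    fp = fn = unclassified = 0
    for score, spam in zip(scores, is_spam):
        if score >= upper:          # classified as spam
            fp += 0 if spam else 1  # legitimate mail flagged as spam
        elif score <= lower:        # classified as legitimate
            fn += 1 if spam else 0  # spam let through
        else:
            unclassified += 1       # not enough evidence either way
    fpr = fp / n_ham if n_ham else 0.0
    fnr = fn / n_spam if n_spam else 0.0
    unclassified_rate = unclassified / len(scores) if len(scores) else 0.0
    return (fpr, fnr, unclassified_rate, n_rules)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Under these assumptions, each candidate filter configuration maps to one 4-tuple, and `pareto_front` applied to the tuples of all candidates returns the non-dominated trade-offs. The quadratic-time filter is chosen for readability; algorithms such as NSGA-II use faster non-dominated sorting.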

Keywords

Binary classification; Three-way classification; Parsimony; Evolutionary multi-objective optimization; Parallel coordinates

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Vitor Basto-Fernandes (1, 2)
  • Iryna Yevseyeva (3)
  • David Ruano-Ordás (4)
  • Jiaqi Zhao (5)
  • Florentino Fdez-Riverola (4)
  • José Ramón Méndez (4)
  • Michael T. M. Emmerich (6)

  1. Instituto Universitario de Lisboa (ISCTE-IUL), University Institute of Lisbon, ISTAR-IUL, Lisboa, Portugal
  2. School of Technology and Management, Computer Science and Communications Research Centre, Polytechnic Institute of Leiria, Leiria, Portugal
  3. School of Computer Science and Informatics, Faculty of Technology, Cyber Technology Institute, De Montfort University, Leicester, UK
  4. Informatics Engineering School, University of Vigo, Ourense, Spain
  5. The School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, P.R. China
  6. Multicriteria Optimization, Design, and Analytics Group, LIACS, Leiden University, Leiden, The Netherlands
