Deep F-Measure Maximization in Multi-label Classification: A Comparative Study

  • Stijn Decubber
  • Thomas Mortier
  • Krzysztof Dembczyński
  • Willem Waegeman
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)

Abstract

In recent years, several novel algorithms have been developed for maximizing the instance-wise F\(_\beta \)-measure in multi-label classification problems. However, so far, such algorithms have only been tested in tandem with shallow base learners. In the deep learning landscape, simple thresholding approaches are usually implemented, even though such approaches are expected to be suboptimal. In this article we introduce extensions of utility maximization and decision-theoretic methods that can optimize the F\(_\beta \)-measure with (convolutional) neural networks. We discuss the pros and cons of the different methods and present experimental results on several image classification datasets. The results illustrate that decision-theoretic inference algorithms are worth the investment: although they are more difficult to implement than thresholding strategies, they lead to better predictive performance. Overall, a decision-theoretic inference algorithm based on proportional odds models outperforms the other methods. Code related to this paper is available at: https://github.com/sdcubber/f-measure.
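To illustrate the contrast the abstract draws between thresholding and decision-theoretic inference, the following is a minimal sketch of the decision-theoretic idea for a single instance: given per-label marginal probabilities from a network, predict the label set that maximizes the *expected* F\(_\beta \)-measure. This sketch assumes independent labels and uses brute-force enumeration over label vectors (exponential in the number of labels, so only feasible for small label spaces); it is not the paper's proportional-odds algorithm or its exact inference procedure, and all function names are illustrative.

```python
import itertools
import numpy as np

def expected_fbeta(pred, p, beta=1.0):
    """Exact expected F_beta of a 0/1 prediction vector `pred` under
    independent Bernoulli labels with marginals `p` (brute force, 2^m terms)."""
    m = len(p)
    b2 = beta ** 2
    total = 0.0
    for y in itertools.product([0, 1], repeat=m):
        # probability of this ground-truth label vector under independence
        prob = np.prod([p[i] if y[i] else 1.0 - p[i] for i in range(m)])
        tp = sum(pred[i] * y[i] for i in range(m))
        denom = b2 * sum(y) + sum(pred)
        # convention: F_beta = 1 when prediction and ground truth are both empty
        f = (1 + b2) * tp / denom if denom > 0 else 1.0
        total += prob * f
    return total

def best_topk_prediction(p, beta=1.0):
    """The expected-F_beta maximizer always predicts some top-k set of labels,
    so it suffices to search over k = 0..m."""
    m = len(p)
    order = np.argsort(p)[::-1]  # labels sorted by decreasing marginal
    best_pred = np.zeros(m, dtype=int)
    best_val = expected_fbeta(best_pred, p, beta)  # k = 0 (empty prediction)
    for k in range(1, m + 1):
        pred = np.zeros(m, dtype=int)
        pred[order[:k]] = 1
        val = expected_fbeta(pred, p, beta)
        if val > best_val:
            best_pred, best_val = pred, val
    return best_pred, best_val

# Example: marginals for a 4-label instance
p = np.array([0.9, 0.6, 0.3, 0.1])
pred, val = best_topk_prediction(p)
```

A fixed threshold (e.g. 0.5) ignores the interaction between precision and recall in the expectation, which is why the paper's decision-theoretic methods can outperform thresholding; practical algorithms replace the exponential enumeration here with tractable exact inference.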

Keywords

F\(_\beta \)-measure · Bayes optimal classification · Multi-label image classification · Convolutional neural networks

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Stijn Decubber (1, 2)
  • Thomas Mortier (2), email author
  • Krzysztof Dembczyński (3)
  • Willem Waegeman (2)
  1. ML6, Ghent, Belgium
  2. Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
  3. Institute of Computing Science, Poznań University of Technology, Poznań, Poland