A Blended Metric for Multi-label Optimisation and Evaluation

  • Laurence A. F. Park
  • Jesse Read
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)


In multi-label classification, a large number of evaluation metrics exist, for example Hamming loss, exact match, and Jaccard similarity, among many others. There remains uncertainty in the multi-label literature about which metrics should be considered, and about when and how to optimise them. This has given rise to a proliferation of metrics, with some papers carrying out empirical evaluations under 10 or more different metrics in order to analyse method performance. We argue that further understanding of the underlying mechanisms is necessary. In this paper we tackle the challenge of obtaining a clearer view of evaluation strategies. We present a blended loss function that allows us to evaluate under the properties of several major loss functions with a single parameterisation. Furthermore, we demonstrate the successful use of this metric as a surrogate loss for other metrics. We offer experimental investigation and theoretical backing to demonstrate that optimising this surrogate loss offers better results for several different metrics than optimising the metrics directly. It simplifies, and provides insight into, the task of evaluating multi-label prediction methodologies. Data related to this paper are available at:
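
For orientation, the three metrics named above can be computed on a toy example. The sketch below uses their standard textbook definitions only; it is not the blended metric proposed in the paper, whose form is not given here. The function names, the toy label matrices, and the convention of scoring an empty-versus-empty prediction as a Jaccard similarity of 1 are all assumptions made for illustration.

```python
from typing import Sequence

Labels = Sequence[Sequence[int]]  # one row of 0/1 label indicators per example


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual label positions that are predicted incorrectly."""
    wrong = sum(t != p for yt, yp in zip(y_true, y_pred) for t, p in zip(yt, yp))
    return wrong / (len(y_true) * len(y_true[0]))


def exact_match(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of examples whose whole label vector is predicted correctly."""
    hits = sum(list(yt) == list(yp) for yt, yp in zip(y_true, y_pred))
    return hits / len(y_true)


def jaccard_similarity(y_true: Labels, y_pred: Labels) -> float:
    """Mean over examples of |intersection| / |union| of the relevant label sets."""
    total = 0.0
    for yt, yp in zip(y_true, y_pred):
        inter = sum(1 for t, p in zip(yt, yp) if t == 1 and p == 1)
        union = sum(1 for t, p in zip(yt, yp) if t == 1 or p == 1)
        # Assumed convention: two empty label sets count as a perfect match.
        total += inter / union if union else 1.0
    return total / len(y_true)


# Hypothetical toy data: 3 examples, 4 labels each.
Y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
Y_pred = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 0, 1]]

print("Hamming loss:      ", hamming_loss(Y_true, Y_pred))
print("Exact match:       ", exact_match(Y_true, Y_pred))
print("Jaccard similarity:", jaccard_similarity(Y_true, Y_pred))
```

On this data the three scores disagree about how good the predictions are: two of the three examples miss a single label, which costs little under Hamming loss, a moderate amount under Jaccard similarity, and the whole example under exact match.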



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Computing, Engineering and Mathematics, Western Sydney University, Sydney, Australia
  2. DaSciM team, LIX Laboratory, École Polytechnique, Palaiseau, France
