Binary relevance for multi-label learning: an overview

  • Review Article
  • Frontiers of Computer Science

Abstract

Multi-label learning deals with problems where each example is represented by a single instance while being associated with multiple class labels simultaneously. Binary relevance is arguably the most intuitive solution for learning from multi-label examples. It works by decomposing the multi-label learning task into a number of independent binary learning tasks (one per class label). In view of its potential weakness in ignoring correlations between labels, many correlation-enabling extensions to binary relevance have been proposed in the past decade. In this paper, we aim to review the state of the art of binary relevance from three perspectives. First, basic settings for multi-label learning and binary relevance solutions are briefly summarized. Second, representative strategies to provide binary relevance with label correlation exploitation abilities are discussed. Third, some of our recent studies on binary relevance aimed at issues other than label correlation exploitation are introduced. In conclusion, we provide suggestions on future research directions.
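
To make the decomposition concrete, the following is a minimal sketch and not the authors' implementation. The base learner (a nearest-centroid rule), the class names `NearestCentroidBinary` and `BinaryRelevance`, and the toy data are all assumptions chosen for illustration; any binary classifier could be substituted. The sketch trains one independent binary model per class label and predicts each label separately, so no label correlation is used.

```python
from typing import List, Optional, Sequence

Vector = Sequence[float]


def _mean(points: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def _sqdist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


class NearestCentroidBinary:
    """Toy binary base learner (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.pos_centroid: Optional[List[float]] = None
        self.neg_centroid: Optional[List[float]] = None

    def fit(self, X: Sequence[Vector], y: Sequence[int]) -> "NearestCentroidBinary":
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.pos_centroid = _mean(pos) if pos else None
        self.neg_centroid = _mean(neg) if neg else None
        return self

    def predict_one(self, x: Vector) -> int:
        # Fall back to a constant prediction if one class was never seen.
        if self.pos_centroid is None:
            return 0
        if self.neg_centroid is None:
            return 1
        closer_to_pos = _sqdist(x, self.pos_centroid) < _sqdist(x, self.neg_centroid)
        return int(closer_to_pos)


class BinaryRelevance:
    """One independent binary model per class label."""

    def __init__(self) -> None:
        self.models: List[NearestCentroidBinary] = []

    def fit(self, X: Sequence[Vector], Y: Sequence[Sequence[int]]) -> "BinaryRelevance":
        n_labels = len(Y[0])
        # Column j of the label matrix Y defines the j-th binary task.
        self.models = [
            NearestCentroidBinary().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, x: Vector) -> List[int]:
        # Each label is predicted on its own; label correlations are ignored.
        return [m.predict_one(x) for m in self.models]


# Toy data (assumed): label 0 tracks the first feature, label 1 the second.
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y_train = [[0, 0], [0, 1], [1, 0], [1, 1]]

model = BinaryRelevance().fit(X_train, Y_train)
print(model.predict([0.9, 0.1]))  # prints [1, 0]
```

Because each binary task only sees its own label column, the per-label models can be trained in parallel, which is the simplicity the abstract refers to; the same independence is also the weakness that correlation-enabling extensions aim to address.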

Acknowledgements

The authors would like to thank the associate editor and anonymous reviewers for their helpful comments and suggestions. This work was supported by the National Natural Science Foundation of China (Grant Nos. 61573104, 61622203), the Natural Science Foundation of Jiangsu Province (BK20141340), the Fundamental Research Funds for the Central Universities (2242017K40140), and partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization.

Author information

Corresponding author

Correspondence to Min-Ling Zhang.

Additional information

Min-Ling Zhang received the BS, MS, and PhD degrees in computer science from Nanjing University, China in 2001, 2004 and 2007, respectively. Currently, he is a professor at the School of Computer Science and Engineering, Southeast University, China. In recent years, he has served as Program Co-Chair of ACML’17, CCFAI’17, and PRICAI’16, and as Senior PC member or Area Chair of AAAI’18/’17, IJCAI’17/’15, ICDM’17/’16, PAKDD’16/’15, etc. He is also on the editorial boards of Frontiers of Computer Science, ACM Transactions on Intelligent Systems and Technology, and Neural Networks. He is the secretary-general of the CAAI (Chinese Association of Artificial Intelligence) Machine Learning Society and a standing committee member of the CCF (China Computer Federation) Artificial Intelligence & Pattern Recognition Society. He is an awardee of the NSFC Excellent Young Scholars Program in 2012.

Yu-Kun Li received the BS and MS degrees in computer science from Southeast University, China in 2012 and 2015, respectively. Currently, he is an R&D engineer at Baidu Inc. His main research interests include machine learning and data mining, especially learning from multi-label data.

Xu-Ying Liu received the BS degree from Nanjing University of Aeronautics and Astronautics, China, and the MS and PhD degrees from Nanjing University, China in 2006 and 2010, respectively. She is now an assistant professor at the PALM Group, School of Computer Science and Engineering, Southeast University, China. Her research interests mainly include machine learning and data mining, especially cost-sensitive learning and class imbalance learning.

Xin Geng is currently a professor and the director of the PALM lab of Southeast University, China. He received the BS (2001) and MS (2004) degrees in computer science from Nanjing University, China, and the PhD (2008) degree in computer science from Deakin University, Australia. His research interests include pattern recognition, machine learning, and computer vision. He has published more than 50 refereed papers in these areas, including those published in prestigious journals and top international conferences.

About this article

Cite this article

Zhang, ML., Li, YK., Liu, XY. et al. Binary relevance for multi-label learning: an overview. Front. Comput. Sci. 12, 191–202 (2018). https://doi.org/10.1007/s11704-017-7031-7

  • DOI: https://doi.org/10.1007/s11704-017-7031-7
