On the Trade-Off Between Consistency and Coverage in Multi-label Rule Learning Heuristics

  • Michael Rapp
  • Eneldo Loza Mencía
  • Johannes Fürnkranz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11828)


Recently, several authors have advocated the use of rule learning algorithms to model multi-label data, as rules are interpretable and can be comprehended, analyzed, or qualitatively evaluated by domain experts. Many rule learning algorithms employ a heuristic-guided search for rules that model regularities contained in the training data, and it is commonly accepted that the choice of heuristic has a significant impact on the predictive performance of the learner. Whereas the properties of rule learning heuristics have been studied in the realm of single-label classification, there is no such work that takes into account the particularities of multi-label classification. This is surprising, as the quality of multi-label predictions is usually assessed in terms of a variety of different, potentially competing, performance measures that cannot all be optimized by a single learner at the same time. In this work, we show empirically that it is crucial to trade off the consistency and coverage of rules differently, depending on which multi-label measure a model should optimize. Based on these findings, we emphasize the need for configurable learners that can flexibly use different heuristics. As our experiments reveal, the choice of heuristic is not straightforward, because a search for rules that optimize a measure locally usually does not result in a model that maximizes that measure globally.
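The trade-off between consistency and coverage can be illustrated with a small sketch. It assumes, purely for illustration, that consistency is measured by a rule's precision, coverage by its recall, and that the two are combined by an F-measure with a parameter beta; the abstract itself does not name the heuristics used, and the two rules and their counts below are hypothetical.

```python
def precision(tp, fp):
    """Consistency: share of covered examples that the rule gets right."""
    return tp / (tp + fp) if tp + fp > 0 else 0.0


def recall(tp, fn):
    """Coverage: share of positive examples that the rule covers."""
    return tp / (tp + fn) if tp + fn > 0 else 0.0


def f_measure(tp, fp, fn, beta):
    """F-measure; small beta favors consistency, large beta favors coverage."""
    p, r = precision(tp, fp), recall(tp, fn)
    denom = beta ** 2 * p + r
    return (1 + beta ** 2) * p * r / denom if denom > 0 else 0.0


# Hypothetical candidate rules, described by their confusion-matrix counts.
RULES = {
    "specific": {"tp": 10, "fp": 0, "fn": 90},   # fully consistent, low coverage
    "general": {"tp": 60, "fp": 40, "fn": 40},   # less consistent, higher coverage
}


def preferred(beta):
    """Return the name of the rule that the heuristic ranks highest."""
    return max(RULES, key=lambda name: f_measure(beta=beta, **RULES[name]))


if __name__ == "__main__":
    for beta in (0.1, 1.0):
        print(beta, preferred(beta))
```

Under these assumptions, a small beta ranks the fully consistent but narrow rule first, whereas beta = 1 ranks the broader, less consistent rule first. The example only shows that the parameter changes which rule a search would select; it makes no claim about which setting suits which multi-label measure.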


Multi-label classification, Rule learning, Heuristics



This research was supported by the German Research Foundation (DFG) (grant number FU 580/11).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Michael Rapp (1)
  • Eneldo Loza Mencía (1)
  • Johannes Fürnkranz (1)

  1. Knowledge Engineering Group, TU Darmstadt, Darmstadt, Germany
