Machine Learning, Volume 73, Issue 2, pp 133–153

Multilabel classification via calibrated label ranking

  • Johannes Fürnkranz
  • Eyke Hüllermeier
  • Eneldo Loza Mencía
  • Klaus Brinker


Label ranking studies the problem of learning a mapping from instances to rankings over a predefined set of labels. Existing approaches to label ranking implicitly operate on an underlying (utility) scale which is not calibrated, in the sense that it lacks a natural zero point. We propose an extension of label ranking that incorporates the calibrated scenario and substantially extends the expressive power of these approaches. In particular, our extension suggests a conceptually novel technique for extending the common learning by pairwise comparison approach to the multilabel scenario, a setting that was previously not amenable to the pairwise decomposition technique. The key idea of the approach is to introduce an artificial calibration label that, in each example, separates the relevant from the irrelevant labels. We show that this technique can be viewed as a combination of pairwise preference learning and the conventional relevance classification technique, where a separate classifier is trained to predict whether a label is relevant or not. Empirical results in the areas of text categorization, image classification and gene analysis underscore the merits of the calibrated model in comparison to state-of-the-art multilabel learning methods.
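The role of the calibration label can be illustrated with a minimal sketch. It is not the authors' implementation: the name `CALIBRATION`, the helper functions, the stand-in `prefers` callable (which takes the place of trained pairwise classifiers), and the tie-breaking rule are all assumptions made for illustration. The sketch derives the pairwise preferences implied by one multilabel example and then predicts a relevant set by voting and cutting the ranking at the calibration label.

```python
from itertools import combinations

# Hypothetical name for the artificial calibration label (an assumption).
CALIBRATION = "__calibration__"


def pairwise_preferences(relevant, labels):
    """Return the (preferred, other) pairs implied by one multilabel example."""
    relevant = set(relevant)
    irrelevant = set(labels) - relevant
    # Every relevant label is preferred over every irrelevant label.
    prefs = {(r, i) for r in relevant for i in irrelevant}
    # Every relevant label is preferred over the calibration label ...
    prefs |= {(r, CALIBRATION) for r in relevant}
    # ... which in turn is preferred over every irrelevant label.
    prefs |= {(CALIBRATION, i) for i in irrelevant}
    return prefs


def predict_relevant(labels, prefers):
    """Rank labels by pairwise votes and cut the ranking at the calibration label.

    `prefers(a, b)` is an assumed callable standing in for a trained pairwise
    model; it returns True when `a` is predicted to be preferred over `b`.
    Ties in the vote count keep the input order (an assumption of this sketch).
    """
    candidates = list(labels) + [CALIBRATION]
    votes = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        votes[a if prefers(a, b) else b] += 1
    ranking = sorted(candidates, key=lambda c: -votes[c])
    cut = ranking.index(CALIBRATION)
    return ranking[:cut], ranking


# Usage: an oracle that simply replays the preferences of one training example.
labels = ["a", "b", "c", "d"]
prefs = pairwise_preferences({"a", "b"}, labels)
predicted, ranking = predict_relevant(labels, lambda x, y: (x, y) in prefs)
```

The preferences involving the calibration label correspond to the conventional relevance classification task (is this label relevant or not?), while the remaining pairs are ordinary pairwise preferences between labels, which is the combination the abstract describes.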


Keywords: Multi-label classification, Preference learning, Ranking



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Johannes Fürnkranz (1)
  • Eyke Hüllermeier (2)
  • Eneldo Loza Mencía (1)
  • Klaus Brinker (2)

  1. TU Darmstadt, Darmstadt, Germany
  2. Philipps-Universität Marburg, Marburg, Germany
