On Active Learning in Multi-label Classification

Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


In conventional multiclass classification learning, we seek to induce a prediction function from the domain of input patterns to a set of mutually exclusive class labels. As a straightforward generalization of this category of learning problems, so-called multi-label classification allows input patterns to be associated with multiple class labels simultaneously. Text categorization is a particularly relevant domain that can be viewed as an instance of this setting. While labeling input patterns to generate training sets is already a major issue in conventional classification learning, it becomes an even more substantial concern in the more complex multi-label classification setting. We propose a novel active learning strategy for reducing the labeling effort and conduct an experimental study on the well-known Reuters-21578 text categorization benchmark dataset to demonstrate the efficiency of our approach.
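
As a rough illustration of the setting only, the sketch below shows how a multi-label target can be encoded as a 0/1 vector and how a pool-based active learner might pick the next pattern to label. It is not the strategy proposed in the paper, which the abstract does not detail. The label names, the scores, and the functions `to_indicator` and `select_query` are hypothetical, and the selection rule (smallest absolute per-label score) is a generic uncertainty-sampling heuristic assumed for illustration.

```python
from typing import Dict, List, Sequence, Set

# Hypothetical topic names; in multi-label classification a pattern may
# carry any subset of them, not exactly one as in the multiclass case.
LABELS = ["earn", "acq", "grain"]


def to_indicator(label_set: Set[str], labels: Sequence[str] = LABELS) -> List[int]:
    """Encode a set of labels as a 0/1 vector with one entry per label."""
    return [1 if lab in label_set else 0 for lab in labels]


def select_query(pool_scores: Dict[str, List[float]]) -> str:
    """Return the id of the unlabeled pattern with the least certain label.

    pool_scores maps a pattern id to one real-valued score per label,
    e.g. signed outputs of per-label binary classifiers (assumed given).
    A score near 0 is treated as an uncertain prediction for that label.
    """
    return min(pool_scores, key=lambda pid: min(abs(s) for s in pool_scores[pid]))


# Assumed classifier scores for three unlabeled documents (made-up numbers).
pool = {
    "doc_a": [2.1, -1.8, -2.5],
    "doc_b": [0.9, -0.1, -1.7],
    "doc_c": [-1.2, 1.4, 0.6],
}
query = select_query(pool)
```

Here `doc_b` would be queried, because its score for the second label (-0.1) is the closest to zero in the pool. Other aggregation rules over the per-label scores are equally plausible.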


Support Vector Machine; Active Learning; Class Label; Input Pattern; Kernel Machine
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Berlin · Heidelberg 2006

Authors and Affiliations

  1. Data and Knowledge Engineering, Faculty of Computer Science, Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
