In conventional multiclass classification learning, we seek to induce a prediction function from the domain of input patterns to a set of mutually exclusive class labels. As a straightforward generalization of this category of learning problems, multi-label classification allows input patterns to be associated with multiple class labels simultaneously. Text categorization is a particularly relevant domain that can be viewed as an instance of this setting. While labeling input patterns to generate training sets is already a major issue in conventional classification learning, it becomes even more substantial in the more complex multi-label setting. We propose a novel active learning strategy for reducing the labeling effort and conduct an experimental study on the well-known Reuters-21578 text categorization benchmark dataset to demonstrate the efficiency of our approach.
- Support Vector Machine
- Active Learning
- Class Label
- Input Pattern
- Kernel Machine
© 2006 Springer Berlin · Heidelberg
About this paper
Cite this paper
Brinker, K. (2006). On Active Learning in Multi-label Classification. In: Spiliopoulou, M., Kruse, R., Borgelt, C., Nürnberger, A., Gaul, W. (eds) From Data and Information Analysis to Knowledge Engineering. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31314-1_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31313-7
Online ISBN: 978-3-540-31314-4