Active Learning Algorithms for Multi-label Data
Active learning is an iterative supervised learning task in which the learning algorithm can actively query an oracle, i.e. a human annotator who understands the nature of the problem, for labels. Because the learner is allowed to interactively choose the data from which it learns, it is expected to perform better with less training. The active learning approach is appropriate for machine learning applications where training labels are costly to obtain but unlabeled data is abundant. Although active learning has been widely studied for single-label learning, this is not the case for multi-label learning, where objects can have more than one class label and a multi-label learner is trained to assign multiple labels simultaneously to an object. We discuss the key issues that need to be considered in pool-based multi-label active learning and how existing solutions in the literature deal with each of these issues. We further empirically study the performance of the existing solutions, after implementing them in a common framework, on two multi-label datasets with different characteristics and under two different application settings (transductive and inductive). We find interesting results that we attribute mainly to the properties of the datasets and secondarily to the application settings.
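The pool-based querying loop described above can be illustrated with a minimal sketch. It is not the framework or any of the specific strategies evaluated in this work: the function names, the `train` and `oracle` callables, the per-label uncertainty measure, and the average/max aggregation across labels are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, Hashable, List, Sequence

# All names below are hypothetical; this is an illustrative sketch only.
Predictor = Callable[[Hashable], Sequence[float]]


def label_uncertainty(p: float) -> float:
    """Uncertainty of one binary label decision: 1.0 at p = 0.5, 0.0 at p = 0 or 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def instance_score(probs: Sequence[float], aggregate: str = "avg") -> float:
    """Combine per-label uncertainties into a single score for one instance."""
    scores = [label_uncertainty(p) for p in probs]
    if aggregate == "avg":
        return sum(scores) / len(scores)
    if aggregate == "max":
        return max(scores)
    raise ValueError(f"unknown aggregation: {aggregate!r}")


def select_query(pool_probs: Dict[Hashable, Sequence[float]],
                 aggregate: str = "avg") -> Hashable:
    """Return the pool instance with the highest aggregated uncertainty."""
    return max(pool_probs, key=lambda x: instance_score(pool_probs[x], aggregate))


def active_learning(pool: List[Hashable],
                    train: Callable[[Dict[Hashable, List[int]]], Predictor],
                    oracle: Callable[[Hashable], List[int]],
                    budget: int,
                    aggregate: str = "avg") -> Dict[Hashable, List[int]]:
    """Pool-based loop: retrain, score the pool, query the oracle, repeat."""
    pool = list(pool)          # copy so the caller's pool is not modified
    labeled: Dict[Hashable, List[int]] = {}
    for _ in range(budget):
        if not pool:
            break
        predict = train(labeled)                  # assumed: returns a predictor
        pool_probs = {x: predict(x) for x in pool}
        query = select_query(pool_probs, aggregate)
        labeled[query] = oracle(query)            # assumed: full label vector
        pool.remove(query)
    return labeled
```

In this sketch the oracle returns the complete label vector of the queried instance, and the aggregation choice determines whether an instance is favored for being uncertain on average across labels or for having a single highly uncertain label.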
Keywords: Supervised learning, Multi-label learning, Active learning, Pool-based strategies
This research was supported by the São Paulo Research Foundation (FAPESP), grants 2010/15992-0 and 2011/21723-5, and the Brazilian National Council for Scientific and Technological Development (CNPq), grant 644963.